How Mocap Works: Trajectorization and Labeling

So far in the series, we’ve started in the middle at reconstruction.  Then we took a step back and talked about reflectivity and markers.  Now, we’re going to move forward again, into the steps after reconstruction.

This article will be a little different from the previous ones, in that it’s more theoretical than practical.  That is to say, it’s the theory of how these kinds of things are done, not necessarily how it’s done in Arena or in Vicon’s IQ.  Both systems are really closed boxes when it comes to a lot of this.  I can say that the theory explained here is the basis for a series of operators in Kinearx, my "in development" mocap software.  And most of the theory is used in some form or another in Arena and IQ as well.  It just may not work exactly as I’m describing it.  Also, it’s entirely possible I’m overlooking some other techniques.  It would be good if this post spurred some discussion of alternate techniques.

So, to review, the mocap system has triangulated the markers in 3d space for each frame.  However, it has no idea which marker is which.  They are not strung together in time.  Each frame simply contains a bunch of 3d points that are separate from the 3d points in the previous and next frames. I’ll term this "raw point cloud data."

Simple Distance Based Trajectorization

Theory:  Each point in a given frame can be compared to each point in the previous frame.  If the current point is closer than a given distance to a point in the previous frame, there’s a good chance it’s the same marker, just moved a little.

Caveats:  The initial temptation here will be to turn up the threshold, so that a moving marker still registers as being close enough.  The problem is that the distance one would expect markers to be from one another on a medium to small object is close to the distance they would be expected to travel between frames if the object were moved at a medium speed.  It’s the same order of magnitude.  Therefore, with a high threshold, there’s a good chance the system will link a point to a neighboring marker by mistake.

Recommendation:  This can be a useful heuristic.  However, the threshold must be kept low.  The result will be trajectorization of markers that are moving slowly or are mostly still.  Faster movement will very quickly exceed the threshold, so moving markers will not be trajectorized.  This technique could be useful for creating a baseline or starting point.  However, it should probably be ignored if another, more reliable heuristic disagrees with it.
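
To make the idea concrete, here is a minimal Python sketch of distance based linking.  It’s an illustration only, not how Arena, IQ or Kinearx actually do it.  The function name link_by_distance, the tuple format for points and the threshold value are all assumptions made up for the example.  It refuses to link when more than one previous point falls inside the threshold, which is one way to keep a low confidence heuristic from guessing.  It does not check whether two current points claim the same previous point.

```python
import math

def link_by_distance(prev_points, curr_points, threshold):
    """Map each current-frame index to a previous-frame index, or None.

    A link is made only when exactly one previous point lies within
    `threshold`; ambiguous or distant points are left unlinked.
    """
    links = {}
    for i, p in enumerate(curr_points):
        near = [j for j, q in enumerate(prev_points)
                if math.dist(p, q) <= threshold]
        links[i] = near[0] if len(near) == 1 else None
    return links

# Hypothetical data: the first marker barely moves, the second moves quickly.
prev_frame = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
curr_frame = [(1.0, 0.0, 0.0), (130.0, 0.0, 0.0)]
links = link_by_distance(prev_frame, curr_frame, threshold=5.0)
# links -> {0: 0, 1: None}
```

The slow marker is linked and the fast one is left alone, which matches the behavior described above: a low threshold only trajectorizes markers that are still or nearly still.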

Trajectorization Based on Velocity

Theory:  When looking at an already trajectorized frame, one can use the velocity of a trajectory to predict the location of a point in the next frame.  Comparing every point in the new frame against the predicted location, using a small distance threshold, should yield a good match.  Since we are tracking real world objects that actually have real world momentum, this should be a valid assumption.  This technique can also be run in reverse.  It can be augmented further by measuring acceleration and using it to modify the prediction.

Caveats:  Since there is often a lot of noise involved in raw mocap data, a simple two frame velocity calculation could be WAY off.  A more robust velocity calculation taking multiple samples into consideration can help, but it increases the likelihood that the data samples are from too far back in time to be relevant to the current velocity and acceleration of the marker (by now, maybe the muscle has engaged and is pushing the marker in a different direction entirely).  An elastic collision will totally throw this algorithm off.  Since the orientation of the surfaces that are colliding is unknown to the system, it’s not realistic for it to be able to predict direction.  And since most collisions are only partially elastic, the distance cannot be predicted.  Therefore, an elastic collision will almost always result in a break of the trajectory.

Recommendation:  This heuristic is far more trustworthy than the simple distance calculation.  The threshold can be kept much lower, and should be an order of magnitude smaller than the distance a moving marker travels between frames.  It can also be run multiple times with different velocity calculations and thresholds.  The results should be biased appropriately, but in general, confidence in this technique should be high.
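
Here is a rough Python sketch of the velocity idea, using the noisy two frame velocity mentioned in the caveats.  The names predict_next and match_by_velocity, the point format and the numbers are assumptions for illustration, not anything taken from Arena, IQ or Kinearx.  A more robust version would average velocity over several frames or add an acceleration term.

```python
import math

def predict_next(trajectory):
    """Constant-velocity prediction from the last two points of a trajectory."""
    (x0, y0, z0), (x1, y1, z1) = trajectory[-2], trajectory[-1]
    return (2 * x1 - x0, 2 * y1 - y0, 2 * z1 - z0)

def match_by_velocity(trajectory, candidates, threshold):
    """Return the index of the candidate nearest the prediction, or None
    when no candidate lies within `threshold` of the predicted location."""
    predicted = predict_next(trajectory)
    best, best_dist = None, threshold
    for i, c in enumerate(candidates):
        d = math.dist(predicted, c)
        if d <= best_dist:
            best, best_dist = i, d
    return best

# Hypothetical data: a marker moving 30 units per frame along x.
track = [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
next_frame = [(31.0, 0.0, 0.0), (60.5, 0.0, 0.0)]
match = match_by_velocity(track, next_frame, threshold=2.0)
# match -> 1
```

Note that a plain distance test would have picked the first candidate, since it sits right next to the marker’s last known position.  The prediction picks the second, which is where the momentum says the marker should be.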

Manual Trajectorization

Theory: You, the human, can do the work yourself.  You are trustworthy.  And it’s your own fault if you’re not.

Caveats:  Who has time to click click click every point in every frame to do this?

Recommendation:  Manual trajectorization should be reserved for small, extremely difficult sections of mocap, and for sparse seeding of the data with factual information.  Confidence in a manual trajectory should be extremely high, however.

Labeling Enforces Trajectorization

Theory:  If the labeling of two points says they’re the same label, then they should be part of the same trajectory.

Caveats:  Better hope that labeling is right.

Recommendation:  We’re about to get into labeling in a bit.  So you might think of this as a bit of a circular argument.  The points are not labeled yet.  And they’re trajectorized before we get to labeling.  So it’s too late, right?  Or too early?  Not necessarily.  I can only really speak for Kinearx here, not Arena or IQ.  However, Kinearx will approach the labeling and trajectorization problems in parallel.  So in a robust pipeline, there will be labeling data and trajectorization data available.  The deeper into the pipeline, the more data will be available.  So, assuming you limit a trajectorization decision to labeling data that is highly trusted, this technique can also be highly trusted.
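
As a sketch of what "limit the decision to highly trusted labels" could look like, here is a small Python example.  The fragment format (start frame, end frame, label, confidence), the function name merge_by_label and the confidence numbers are all hypothetical.  This is not Kinearx code.  It also skips checks a real system would need, such as making sure the fragments don’t overlap in time.

```python
def merge_by_label(fragments, min_confidence):
    """Group trajectory fragments that share a trusted label.

    Each fragment is (start_frame, end_frame, label, confidence).
    Unlabeled or low-confidence fragments are left out of the result.
    """
    merged = {}
    for start, end, label, conf in sorted(fragments, key=lambda f: f[0]):
        if label is None or conf < min_confidence:
            continue
        merged.setdefault(label, []).append((start, end))
    return merged

# Hypothetical fragments; "LWRA" is a made-up marker name.
fragments = [
    (55, 90, "LWRA", 0.90),
    (0, 40, "LWRA", 0.95),
    (41, 54, "LWRA", 0.30),   # label not trusted enough to act on
    (0, 90, None, 0.0),       # not labeled at all
]
joined = merge_by_label(fragments, min_confidence=0.8)
# joined -> {'LWRA': [(0, 40), (55, 90)]}
```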

Trajectorization Enforces Labeling

Theory: If a string of points in time is trajectorized, and one of those points is labeled, all the points in the trajectory can be labeled the same.

Caveats: Better hope that trajectorization is right.

Recommendation:  Similar to the previous technique, this one is based on execution order.  IQ uses this very clearly.  You can see it operate when you start manually labeling trajectories.  The degree to which Arena uses it is unknown, but I suspect it’s in there.  Kinearx will make this part of its parallel solving system.  It will also likely split trajectories based on labeling, if conflicting labels exist on a single trajectory.  I prefer to rely on this quite a bit.  I prefer to spot label the data with highly trusted labeling techniques, erring on the side of not labeling when I’m not sure, and let this technique fill in the blanks.
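
The "fill in the blanks" step can be sketched in a few lines of Python.  The function name propagate_labels and the list of per point labels are assumptions for the example.  Returning None on a conflict is just a stand in for whatever a real system does there, such as splitting the trajectory.

```python
def propagate_labels(labels):
    """Spread a single known label across every point of one trajectory.

    `labels` holds one entry per point, with None for unlabeled points.
    Returns the filled list, the unchanged list if nothing is labeled,
    or None if the trajectory carries conflicting labels.
    """
    known = {lab for lab in labels if lab is not None}
    if len(known) > 1:
        return None              # conflict: split or review instead
    if not known:
        return list(labels)      # nothing to propagate
    label = next(iter(known))
    return [label] * len(labels)

# Hypothetical marker names.
filled = propagate_labels([None, "LELB", None, None])
conflict = propagate_labels(["LELB", None, "RELB"])
# filled   -> ['LELB', 'LELB', 'LELB', 'LELB']
# conflict -> None
```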

Manual Labeling

Theory: You, the human, can do the work yourself.  You are trustworthy.  And it’s your own fault if you’re not.

Caveats:  Who has time to click click click every point in every frame to do this?

Recommendation:  Manual labeling should be reserved for extremely difficult sections of mocap, and for sparse seeding of the data with factual information.  Confidence in a manual label should be extremely high, however.  When I use IQ, I take an iterative approach to the process and have the system do an automatic labeling pass, to see where it’s having trouble on its own.  I then step back to before the automatic labeling pass and seed the trouble areas with some manual labeling.  Then I save and set off the automatic labeling again.  Iterating this process, adding more manual labeling data, eventually results in a mostly correct solve.  Kinearx will make sure to allow a similar workflow, as I’ve found it to be the most reliable to date.

Simple Rigid Body Distance Based Labeling

Theory:  If you know a certain number of markers to move together because they are attached to the same object, you can inform the system of that fact.  It can measure their distances from one another (calibrate the rigid body) and then use that information to identify them on subsequent frames.

Caveats:  Isosceles triangles and equilateral triangles cause issues here.  There is a lot of inaccuracy and noise involved in optical mocap and therefore, the distances between markers will vary to a point.  When it comes to the human body, there is a lot of give and stretch.  Even though you might want to treat the forearm as a single rigid body, the fact is, it twists along its length and markers spread out over the forearm will move relative to one another.

Recommendation:  This is still the single best hope for automatic marker recognition.  When putting markers on objects, it’s important to play to the strengths and weaknesses of this technique.  So, make sure you vary the distances between markers.  Avoid making equilateral and isosceles triangles with your markers.  Always look for a scalene triangle setup.  When markering similar or identical objects, make sure to vary the marker locations so they can be individually identified by the system (this includes left and right sides of the human body).  If this is difficult, consider adding an additional superfluous marker on the objects in a different location on each, simply for identification purposes.  On deforming objects (such as the human body), try to keep the markers in an area with less deformation (closer to bone and farther from flesh).  Make good use of slack factors to forgive deformation and inaccuracy.  Know the resolution of your volume.  Don’t place markers so close that your volume resolution will get in the way of an accurate identification.
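
The scalene versus isosceles point is easy to see in code.  Below is a brute force Python sketch, assuming a small rigid body and a single slack value.  The names calibrate and label_rigid, the marker names and the coordinates are all made up for the example, and trying every permutation would be far too slow for a full marker set.  It is not how any of the systems above are known to work.

```python
import itertools
import math

def calibrate(labeled_points):
    """Record the distance between every pair of labeled markers."""
    names = sorted(labeled_points)
    return {(a, b): math.dist(labeled_points[a], labeled_points[b])
            for a, b in itertools.combinations(names, 2)}

def label_rigid(points, calibration, slack):
    """Return every label-to-point-index assignment whose pairwise
    distances all agree with the calibration to within `slack`."""
    names = sorted({name for pair in calibration for name in pair})
    matches = []
    for perm in itertools.permutations(range(len(points)), len(names)):
        assignment = dict(zip(names, perm))
        if all(abs(math.dist(points[assignment[a]], points[assignment[b]]) - d) <= slack
               for (a, b), d in calibration.items()):
            matches.append(assignment)
    return matches

# Scalene triangle (sides 3, 4, 5), seen later moved and in shuffled order.
scalene = calibrate({"A": (0, 0, 0), "B": (3, 0, 0), "C": (0, 4, 0)})
frame = [(10, 4, 0), (10, 0, 0), (13, 0, 0)]
unique = label_rigid(frame, scalene, slack=0.1)
# unique -> [{'A': 1, 'B': 2, 'C': 0}]

# Isosceles triangle (sides 6, 5, 5): A and B cannot be told apart.
isosceles = calibrate({"A": (-3, 0, 0), "B": (3, 0, 0), "C": (0, 4, 0)})
ambiguous = label_rigid([(-3, 0, 0), (3, 0, 0), (0, 4, 0)], isosceles, slack=0.1)
# len(ambiguous) -> 2
```

The scalene layout gives exactly one answer.  The isosceles layout gives two equally good answers, which is the ambiguity the caveat warns about.  Raising the slack forgives more noise and deformation, but it also makes near isosceles layouts ambiguous.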

Articulated Rigid Body Distance and Range of Motion Based Labeling

Theory:  This is an expansion of the previous technique, to include the concept of connected, jointed or articulated rigid body systems.  If two rigids are connected by a joint (humerus to radius in a human arm for example) the joint location can be considered an extra temporary marker for distance based identification on either rigid.  Therefore, if one rigid is labeled enough to find the location of the joint, the joint can be used to help label the other rigid.  Furthermore, information regarding the range of motion of the joint can help cull misidentifications.

Caveats:  It’s possible that the limits on a joint’s rotation could be too restrictive compared with the reality of the subject, and cull valid labels.

Recommendation:  This is perhaps the most powerful technique of all.  It’s nonlinear and therefore somewhat recursive in nature.  However, most importantly, it has a concept of structure and pose and therefore can be a lot more intelligent about what it’s doing than other more generic methods.  It won’t help you track a bunch of marbles or a swarm of ants, but anything that can be abstracted to an articulated jointed system (most things you’d want to mocap) is greatly assisted by this technique.  You can also go so far as to check the pose of the system from previous frames against the current solution to throw out labeling that would create too much discontinuity from frame to frame.
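
Here is a toy Python sketch of the two ideas in the theory: the joint acting as an extra marker for a distance check, and range of motion culling.  It assumes the upper arm has already been solved well enough to give shoulder and elbow positions.  The function names, the single hinge angle, the bone length and the limits are all assumptions for illustration.  A real joint would need proper rotational limits, not just one angle.

```python
import math

def angle_at_joint(a, joint, b):
    """Angle in degrees at `joint` between the directions to `a` and `b`."""
    u = [ai - ji for ai, ji in zip(a, joint)]
    v = [bi - ji for bi, ji in zip(b, joint)]
    dot = sum(x * y for x, y in zip(u, v))
    cos_angle = dot / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def cull_candidates(parent, joint, candidates, bone_length, slack,
                    min_deg, max_deg):
    """Keep candidates that sit one bone length from the joint (within
    `slack`) and whose joint angle lies inside the allowed range."""
    kept = []
    for c in candidates:
        if abs(math.dist(joint, c) - bone_length) > slack:
            continue  # wrong distance from the joint
        if min_deg <= angle_at_joint(parent, joint, c) <= max_deg:
            kept.append(c)
    return kept

# Hypothetical arm: shoulder above the elbow, forearm 25 units long.
shoulder, elbow = (0.0, 30.0, 0.0), (0.0, 0.0, 0.0)
wrist_candidates = [
    (25.0, 0.0, 0.0),    # elbow bent to 90 degrees: plausible
    (0.0, 25.0, 0.0),    # forearm folded back through the upper arm
    (0.0, -40.0, 0.0),   # too far from the elbow to be the wrist
]
kept = cull_candidates(shoulder, elbow, wrist_candidates,
                       bone_length=25.0, slack=1.0,
                       min_deg=20.0, max_deg=180.0)
# kept -> [(25.0, 0.0, 0.0)]
```

If min_deg were set too high for a particularly flexible subject, the valid candidate would be culled too, which is exactly the caveat above.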


These techniques get you what you need to trajectorize and label your data.  However, there are plenty of places to go from here.  These steps serve multiple purposes.  They’ll be executed for realtime feedback.  They’ll be the first steps in a cleanup process.  Their results may also be exported to a 3rd party app such as MotionBuilder.  Later steps may include:

  • more cleanup
  • export
  • tracking of skeletons and rigids
  • retargeting
  • motion editing

IQ, Arena, Blade and Kinearx may or may not support all of those paths.  For example, currently, Arena will allow more cleanup.  It will track skeletons and rigids.  It will stream data into MotionBuilder.  It will export data to MotionBuilder.  It will not retarget.  It will not get into motion editing.  MotionBuilder can retarget and motion edit, and it also has some cleanup functionality.  IQ will allow more cleanup, export and tracking.  It does not perform retargeting or motion editing.  Blade supports all of this.  Kinearx will likely support some retargeting but will stay clear of too much motion editing in favor of a separate product that will be integrated into an animator’s favorite 3d package (Maya or XSI for example).

The next topic will likely be tracking of skeletons and rigids.  You might notice that we’ve kind of gotten into this a bit with the labeling of articulated rigid systems.  And you’d be correct in making that identification.  A lot of code would be shared between the labeler and the tracker.  However, what’s best for labeling may not be best for tracking.  So the implementation is usually different at a higher level because the goals are different.
