Professional Software Forums

This is a rant.  If you do not like rants, move on.

So, I’ve frequented a number of professional level media creation software forums in my time.  I even worked with the creator of XSIBase for a couple of years (professionally in animation, not on XSIBase itself). I’ve also spent quite a bit of time in forums for MMOGs (Massively Multiplayer Online Games) both as a member and as a moderator.

As you might imagine, the MMOG forums are cesspools of childish behavior, ego jockeying and all around nastiness. They’re also a lot of fun.  And no one really expects them to be anything other than a waste of time.

Professional forums, however, are a different matter.  What I’m talking about here are forums for software like Maya, XSI, Avid, Final Cut, etc.  The expectations for these forums are different.  They’re a lot of things to a lot of people and I’d like to rant about some of the typical forum diseases I see a lot.

Top Dog Syndrome

This syndrome is common.  Users who play into this syndrome have their egos attached to an image of being a top dog within the forum.  They feel that they are seen as having a superior technical or professional ability and they have to keep that image up.  When this is practiced to a minor degree, it’s beneficial, as it provides a push toward answering questions.  But it gets out of hand more often than not and has seriously detrimental effects.

Now, I KNOW I have a tendency toward falling into this syndrome myself.  And I try to keep a check on it.  That being said, it’s difficult.  Specifically because I actually am much more learned and experienced on most of these matters than the general professional community.  My usual position within an animation team is that of "Technical Director", which is defined as being the guy/gal with all the answers on technical and technique matters.  So I’m paid to be top dog.

So, what I use to define a healthy top dog versus an unhealthy one is whether the pressure to answer questions results in answers that should never have been given.  When a person on a forum starts skimming questions and responding with erroneous information, it’s a problem.  And more importantly, some forums are rife with it, to the point that it’s the norm.

For example, I logged into the central forum for a very high end video editing application recently, because I was having a problem with a feature.  After searching the forum for answers and reading the documentation, I was unsure whether what I was seeing was user error, a bug, or a design limitation.  I did have enough evidence to be fairly certain it was user error or a bug, as I had been able to force the software to work correctly under some very specific settings that were, unfortunately, not good enough to let me work in the general case.  Anyhow, I posted a good detailed explanation of what I was seeing, what I thought I should be seeing, forum threads that had talked about similar problems, and what resulted when I tried to implement the recommendations in those threads.  I asked if anyone had any experience or ideas related to what I was seeing.  In short, I wrote what I believe to be any professional forum’s dream post.

What did I get?  I got a guy with over 20,000 posts responding almost immediately with a suggestion that I was doing everything wrong and should change my entire workflow.  He also recommended that I read the manual.

Now, I’ve been using non-linear editing software for over 12 years at this point.  I was using it professionally for broadcast before a single system cost less than $200,000.  I’m currently acting as a combo post production supervisor and visual effects supervisor on a feature film and I know more about video compression and post production workflow than most people with the title of "post production supervisor" on this planet. I work with the software.  I can write the software.  I developed those skills in a professional environment as the technology developed over the past decade.

So as you might imagine, when a 20,000 post top dog fails to actually read my post and comprehend it, and gives me a canned line for amateurs, I don’t say "thank you" and throw away my post production pipeline because he said so.  Instead, I posted that while the advice was appreciated, it was not valid due to reasons a, b, c, d and e that were explained in the initial post.  I also added that I’d prefer it if the thread remained focused on the features I was having an issue with, rather than commenting on the general post production workflow.  This, again, is a professional way of dealing with the issue.  Keep the thread on target.  And if the problem is not solved, don’t let the thread die, if for no other reason than that others will find the thread when they run into the problem, and they’re as entitled to a reasonable conclusion as you are.

So, a 2,000 post user then came to his defense and reaffirmed that I was doing everything wrong and made even more suggestions that were immediately invalidated by the information in the original post (he didn’t read it).

So what’s really going on here?  It’s top dog syndrome.  They are not actually interested in the problem.  They’re interested in being seen answering my question, especially since my tone and technical explanation indicate that I’m a threat to their top dog status.  By composing an initial post that’s very high level, I’ve put myself in their line of fire.  I’m a threat and they have to respond.

There was a little back and forth while I refuted their claims with tests and information to the contrary.  They continued to tell me everything I was doing was wrong.  I made an extreme effort to not make personal attacks and stop at the level of suggesting the topic was steering off course.

Eventually I gave up, frustrated and angry.  I posted a quick rant at the end of the thread where I declared the forum useless due to a focus on ego polishing and rampant misinformation.

I then proceeded to investigate the problem further myself until I was convinced I understood the behavior enough to classify it as a bug or design flaw.  Either way, at that point it should actually be submitted to the developer in the form of a bug report.  It’s also clear at that point that you won’t get any relief from it.  Possibly ever.  Just because you can isolate a bug, give full repro steps and get it into a developer’s system doesn’t mean that it’s ever going to be fixed.  In fact, it often won’t be if the developer is large enough.  Internal politics and bureaucracy almost always get in the way.  So at that point, if the functionality is important, you have to find another solution (a workaround).  And that’s what the forums are really for at this level.  They allow exchange of information on bugs and software misbehaviors.  More importantly, they provide workarounds and ideas.  But this particular forum was not serving that purpose and probably never will.  All because of the rampant top dog syndrome.  The results of my attempts to combat it in just my one little post, because I really needed someone to take the problem seriously?  I was belittled and attacked.  Some forums are beyond help.

In the past, I’ve been able to overcome the top dog issue with the approach I tried here.  Generally, repeated appeals to fact and reason force the top dogs to actually deal with the problem in order to be seen ultimately solving it, or at least to be part of the confirmation of the problem.  However, that only seems to work if the larger forum community is technical enough to see those facts and reasons for what they are, even if they can’t provide an answer.  If the top dogs feel the general community is smart enough to see them messing up, they’ll try to save their skin.  Film and video editors working in a generally Macintosh community do not meet that threshold, and therefore the top dogs in that community had no fear of being seen playing ego games when it’s clear to a technically inclined individual that there’s something wrong going on.  So it didn’t work.  And I declared the forum a lost cause.

Professionals vs Amateurs vs Prosumers

These forums tend to be populated by users at varying levels of usage.  Amateurs tend to be looking for training and answers to questions that require a certain level of expertise to research oneself.  These users drive professionals mad, because they are often asking to do incredibly complex or difficult things without studying and training enough to even understand what they’re asking.  Add to that, they often belittle the fact that it does take a lot of training and dedication.  They often feel a certain entitlement to these more difficult techniques, and the answer that it requires time and experience doesn’t strike them as fair, and that breeds anger.

Prosumers are people who see themselves as professional but are actually unaware of what the professional level actually means.  For example, animators who work on projects of 30 to 300 seconds with teams of fewer than 10 people.  They don’t comprehend the issues involved with projects of 20 or more minutes with teams of 50 – 500 people.  They think it’s just a matter of hiring more people and being organized.  So their responses and approaches to issues are often not scalable and would bring a full scale production to a halt.  But they and their peers don’t understand that and therefore are unable to evaluate or comprehend it.  These users make up the majority of the user base.  These types of users are frustrating to professionals but often not infuriating.  They’re frustrating for a number of reasons.  Firstly, because they often spurn the advice of professionals, not fully understanding it and seeing it as overly complex.  Secondly, because the software is usually written for prosumers and not professionals.  The developers often confuse the prosumers for the professionals and cater to them, often creating features that are useless in a professional environment at the expense of professional level features or functionality.  Thirdly, because it is the prosumer user base that professionals recruit from, and it’s frustrating to see the prosumer base become accustomed to working in a non-scalable manner, because you know you’re going to have to retrain them when you eventually recruit them.  Both they and you would be better off if they’d just listen and try to understand… but well, that won’t happen.  So you just let it go and move on.  But the chorus of prosumer voices completely overpowers the professional voice.

What’s the solution?  The forum moderators need to categorize their forums.  Create subforums.  Create beginners forums.  Create topic forums.  This keeps everyone from getting in everyone else’s way.  This is the way XSIBase is organized, actually.  And it’s a good approach.  You’ll find most of the professional level users who are concerned with scalable solutions in the "scripting" and "programming" forums.

WingIDE for Python

Thought I’d just put in a quick shout out for my favorite Python coding tool, ever.  WingIDE from WingWare. WingIDE is by far the best Python coding environment I’ve ever used.

I know the first question that comes to mind when looking at the price tag:  "With all the free Python IDEs and script editors, why bother buying one?  They’re all about the same."  Well, that’s mostly true.  Most Python script editors I’ve used are about the same.  They provide some mediocre code completion and code folding.  Not bad… just not as good as it could be.

For me, it’s all about code completion.  Smart code completion.  The kind that reads APIs on the fly, knows what kind of object you’re working with, and tells you what is possible with that object.  It’s sort of a combination of code completion and an object browser.  Visual Studio is renowned for its ability to do this on the fly.  Most good Python script editors attempt this level of completion but they’re confounded by Python’s dynamic typing.  I’ll give an example.

[code] 

import xml.dom.minidom

def myFunction(doc, element):
    # doc and element could be anything; the editor has no way to know
    pass

[/code]

So, here’s the question.  Since Python is dynamically typed (as opposed to statically typed), when I try to code with the objects doc and element, how is the editor to know what types of objects they are so it can tell me what I can do with them?  It might be able to look at the code that is calling the function, but that’s backwards.  A function can be called multiple times from anywhere, perhaps with completely different object types, and it’s possible both of those calls could be valid.  The same problem shows up when trying to figure out what type of object a function returned.  There’s no rule that a function always has to return the same kind of object.  So how could the system know?

It’s at this point that most script editors give up.  Code completion stops working the moment you get outside the scope of objects you create yourself within a single function.

With WingIDE, you can hint the system and get your code completion back.  All you have to do is put in a particular type of assert statement.  For example:

[code]

import FIE.Constraints

def myFunction(obj, const):
    assert isinstance(const, FIE.Constraints.ParentConstraint)  # hints the editor: const is a ParentConstraint

[/code]

From the assert statement on down, code completion works again.  There’s also an added benefit, in that the script will throw an exception should the assertion fail.  Without that assert check, my script could go on for another 20 lines working on the wrong type of object and give me a vague error; the assert cuts straight to the heart of the matter.
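
The same trick applies to the other case mentioned above: figuring out what a function returned.  Here’s a minimal sketch (the function and variable names are just made up for illustration) of hinting the type of an object that came back from a library call:

[code]

import xml.dom.minidom

def loadSettings(path):
    # parse() could return anything as far as the editor can tell
    doc = xml.dom.minidom.parse(path)
    # hint the editor (and get a runtime type check for free)
    assert isinstance(doc, xml.dom.minidom.Document)
    return doc.documentElement

[/code]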

WingIDE will also parse source files for documentation and display it for you as you code, eliminating the need to constantly look up the API docs yourself.

Now, I know there’s a hardcore base of programmers out there who say all they need is a text editor and be damned with all these fancy IDEs and their crutches.  Well, I simply disagree.  I’m sure if you are a coder who has maybe 2 APIs to work with on a regular basis, perhaps that is all you need.  But in my job, I am required to learn a new API within a few hours and repeat that as often as necessary.  That can sometimes be 2-3 APIs a day.  Do I know the full API?  No.  I know enough to get the job done.  And that’s what I’m paid to do.  For that kind of coding (and scripting, I think, lends itself to that kind of coding more than development does) there is no better tool than WingIDE.  Call me a weak coder if you wish.  I’ll just keep coding, getting the job done faster and better, and keep getting paid to do it.  I have a job to do.


How Mocap Works: Trajectorization and Labeling

So far in the series, we’ve started in the middle at reconstruction.  Then we took a step back and talked about reflectivity and markers.  Now, we’re going to move forward again, into the steps after reconstruction.

This article will be a little different than the previous ones, in that it’s more theoretical than practical.  That is to say, it’s the theory of how these kinds of things are done, not necessarily how it’s done in Arena or in Vicon’s IQ.  Both systems are really closed boxes when it comes to a lot of this.  I can say that the theory explained here is the basis for a series of operators in Kinearx, my "in development" mocap software.  And most of the theory is used in some form or another in Arena and IQ as well.  It just may not work exactly as I’m describing it.  Also, it’s entirely possible I’m overlooking some other techniques.  It would be good if this post spurred some discussion of alternate techniques.

So, to review, the mocap system has triangulated the markers in 3d space for each frame.  However, it has no idea which marker is which.  They are not strung together in time.  Each frame simply contains a bunch of 3d points that are separate from the 3d points in the previous and next frames. I’ll term this "raw point cloud data."
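
To make that concrete, here’s a rough sketch of what raw point cloud data amounts to.  The layout and names are mine, not Arena’s or IQ’s:

[code]

# Hypothetical layout for raw point cloud data: one list of unlabeled
# (x, y, z) tuples per frame, with no identity carried between frames.
rawFrames = [
    [(0.12, 1.43, 0.98), (0.55, 1.10, 0.72), (0.60, 0.31, 1.05)],  # frame 0
    [(0.13, 1.44, 0.97), (0.54, 1.12, 0.70), (0.61, 0.30, 1.06)],  # frame 1
    # ...one entry per captured frame
]

[/code]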

Simple Distance Based Trajectorization

Theory:  Each point in a given frame can be compared to each point in the previous frame.  If the current point is closer than a given distance to a point in the previous frame, there’s a good chance it’s the same marker, just moved a little.

Caveats:  The initial desire here will be to turn up the threshold, so that when the marker is moving, it registers as being close enough.  The problem is that the distance one would expect markers to be from one another on a medium to small object is close to the distance they would be expected to travel if the object were moved at a medium speed.  It’s the same order of magnitude.  Therefore, there’s a good chance that it will make mistakes.

Recommendation:  This can be a useful heuristic.  However, the threshold must be kept low.  What will result is trajectorization of markers that are moving slowly, or are mostly still.  Movement will very quickly pass over the threshold and keep moving markers from being trajectorized.  This technique could be useful for creating a baseline or starting point.  However, it should probably be ignored if another more reliable heuristic disagrees with it.
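
Here’s a minimal sketch of the heuristic in Python.  This is not how Arena or IQ implement it, and the threshold and units are arbitrary; it simply pairs each point in the current frame with the nearest point in the previous frame, if one is close enough:

[code]

import math

def distance(a, b):
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))

def matchByDistance(prevFrame, currFrame, threshold=0.01):
    """Pair each point in currFrame with the nearest point in prevFrame,
    but only if it falls inside the (deliberately small) threshold."""
    matches = {}
    for ci, cp in enumerate(currFrame):
        best, bestDist = None, threshold
        for pi, pp in enumerate(prevFrame):
            d = distance(cp, pp)
            if d < bestDist:
                best, bestDist = pi, d
        if best is not None:
            matches[ci] = best  # probably the same marker, moved a little
    return matches

[/code]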

Trajectorization Based on Velocity

Theory:  When looking at an already trajectorized frame, one can use the velocity of a trajectory to predict the location of a point in the next frame.  Comparing every point in the new frame against the predicted location, with a small distance threshold, should yield a good match.  Since we are tracking real world objects that actually have real world momentum, this should be a valid assumption.  This technique can also be run in reverse.  It can be augmented further by measuring acceleration and using it to modify the prediction.

Caveats:  Since there is often a lot of noise involved in raw mocap data, a simple two frame velocity calculation could be WAY off.  A more robust velocity calculation taking multiple samples into consideration can help, but it increases the likelihood that the data samples are from too far back in time to be relevant to the current velocity and acceleration of the marker (by now, maybe the muscle has engaged and is pushing the marker in a different direction entirely).  An elastic collision will totally throw this algorithm off.  Since the orientation of the surfaces that are colliding is unknown to the system, it’s not realistic for it to be able to predict direction.  And since most collisions are only partially elastic, the distance cannot be predicted.  Therefore, an elastic collision will almost always result in a break of the trajectory.

Recommendation:  This heuristic is way more trustworthy than the simple distance calculation.  The threshold can be left much lower and should be an order of magnitude smaller than the velocity of a moving marker.  It can also be run multiple times with different velocity calculations and thresholds.  The results should be biased appropriately, but in general, confidence in this technique should be high.
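
A sketch of the velocity version, reusing the distance() helper from the previous sketch.  Again, the names and threshold are made up; the prediction here is simply the last position plus the last frame-to-frame velocity:

[code]

def predictNext(trajectory):
    """Predict the next position of a trajectory from its last two samples."""
    (x0, y0, z0), (x1, y1, z1) = trajectory[-2], trajectory[-1]
    return (2 * x1 - x0, 2 * y1 - y0, 2 * z1 - z0)  # p1 + (p1 - p0)

def matchByVelocity(trajectories, currFrame, threshold=0.005):
    """Extend each trajectory with the point closest to its predicted position."""
    matches = {}
    for ti, traj in enumerate(trajectories):
        if len(traj) < 2:
            continue  # need at least two samples to compute a velocity
        predicted = predictNext(traj)
        best, bestDist = None, threshold
        for ci, cp in enumerate(currFrame):
            d = distance(cp, predicted)  # distance() from the previous sketch
            if d < bestDist:
                best, bestDist = ci, d
        if best is not None:
            matches[ti] = best
    return matches

[/code]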

Manual Trajectorization

Theory: You, the human, can do the work yourself.  You are trustworthy.  And it’s your own fault if you’re not.

Caveats:  Who has time to click click click every point in every frame to do this?

Recommendation:  Manual trajectorization should be reserved for extremely difficult small sections of mocap, and for sparse seeding of the data with factual information.  Confidence in a manual trajectory should be extremely high however.

Labeling enforces Trajectorization

Theory:  If the labeling of two points says they’re the same label, then they should be part of the same trajectory.

Caveats:  Better hope that labeling is right.

Recommendation:  We’re about to get into labeling in a bit.  So you might think of this as a bit of a circular argument.  The points are not labeled yet.  And they’re trajectorized before we get to labeling.  So it’s too late, right?  Or too early?  Not necessarily.  I can only really speak for Kinearx here, not Arena or IQ.  However, Kinearx will approach the labeling and trajectorization problems in parallel.  So in a robust pipeline, there will be both labeling data and trajectorization data available.  The deeper into the pipeline, the more data will be available.  So, assuming you limit a trajectorization decision to labeling data that is highly trusted, this technique can also be highly trusted.
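
As a sketch of the rule (the data layout here is hypothetical): if two trajectory fragments carry the same trusted label, they belong to the same marker and can be merged into one trajectory:

[code]

def mergeByLabel(trajectories, trustedLabels):
    """trustedLabels maps a trajectory index to a marker name, only for
    labels we already trust.  Fragments sharing a label are merged; this
    assumes the fragments don't overlap in time and arrive in frame order."""
    merged = {}
    for ti, traj in enumerate(trajectories):
        label = trustedLabels.get(ti)
        if label is None:
            merged[("unlabeled", ti)] = list(traj)
        else:
            merged.setdefault(label, []).extend(traj)
    return merged

[/code]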

Trajectorization enforces Labeling

Theory: If a string of points in time is trajectorized, and one of those points is labeled, all the points in the trajectory can be labeled the same.

Caveats: Better hope that trajectorization is right.

Recommendation:  Similar to the previous technique, this one is based on execution order.  IQ uses this very clearly.  You can see it operate when you start manually labeling trajectories.  The degree to which Arena uses it is unknown, but I suspect it’s in there.  Kinearx will make this part of its parallel solving system.  It will also likely split trajectories based on labeling, if conflicting labels exist on a single trajectory.  I prefer to rely on this quite a bit.  I prefer to spot label the data with highly trusted labeling techniques, erring on the side of not labeling if you’re not sure, and have this technique fill in the blanks.
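
A sketch of the propagation rule, again with a made-up data layout: any trusted point label gets pushed out to the whole trajectory, and conflicting labels flag the trajectory for a split:

[code]

def propagateLabels(trajectories, pointLabels):
    """Here a trajectory is a list of (frameIndex, pointIndex) ids, and
    pointLabels maps (frameIndex, pointIndex) to a marker name for the few
    points labeled so far.  A single trusted label names the whole
    trajectory; conflicting labels mark it for a split."""
    trajectoryLabels = {}
    for ti, traj in enumerate(trajectories):
        found = set(pointLabels[p] for p in traj if p in pointLabels)
        if len(found) == 1:
            trajectoryLabels[ti] = found.pop()
        elif len(found) > 1:
            trajectoryLabels[ti] = None  # conflict: split this trajectory
    return trajectoryLabels

[/code]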

Manual Labeling

Theory: You, the human, can do the work yourself.  You are trustworthy.  And it’s your own fault if you’re not.

Caveats:  Who has time to click click click every point in every frame to do this?

Recommendation:  Manual labeling should be reserved for extremely difficult sections of mocap, and for sparse seeding of the data with factual information.  Confidence in a manual label should be extremely high, however.  When I use IQ, I take an iterative approach to the process and have the system do an automatic labeling pass, to see where it’s having trouble on its own.  I then step back to before the automatic labeling pass and seed the trouble areas with some manual labeling.  Then I save and set off the automatic labeling again.  Iterating this process, adding more manual labeling data, eventually results in a mostly correct solve.  Kinearx will make sure to allow a similar workflow, as I’ve found it to be the most reliable to date.

Simple Rigid Body Distance Based Labeling

Theory:  If you know a certain number of markers move together because they are attached to the same object, you can inform the system of that fact.  It can measure their distances from one another (calibrate the rigid body) and then use that information to identify them on subsequent frames.

Caveats:  Isosceles and equilateral triangles cause issues here.  There is a lot of inaccuracy and noise involved in optical mocap, and therefore the distances between markers will vary to a point.  When it comes to the human body, there is a lot of give and stretch.  Even though you might want to treat the forearm as a single rigid body, the fact is, it twists along its length, and markers spread out over the forearm will move relative to one another.

Recommendation:  This is still the single best hope for automatic marker recognition.  When putting markers on objects, it’s important to play to the strengths and weaknesses of this technique.  So, make sure you vary the distances between markers.  Avoid making equilateral and isosceles triangles with your markers.  Always look for a scalene triangle setup.  When markering similar or identical objects, make sure to vary the marker locations so they can be individually identified by the system (this includes left and right sides of the human body).  If this is difficult, consider adding an additional superfluous marker on the objects in a different location on each, simply for identification purposes.  On deforming objects (such as the human body), try to keep the markers in an area with less deformation (closer to bone and farther from flesh).  Make good use of slack factors to forgive deformation and inaccuracy.  Know the resolution of your volume.  Don’t place markers so close that your volume resolution will get in the way of an accurate identification.
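
A rough sketch of the calibrate-then-match idea, reusing the distance() helper from earlier.  The names and slack value are made up, and a real implementation would be much less brute force:

[code]

from itertools import combinations, permutations

def calibrateRigid(labeledPoints):
    """labeledPoints maps a marker name to its (x, y, z) on a calibration frame.
    Returns the expected distance between every pair of markers on the rigid."""
    return {(a, b): distance(labeledPoints[a], labeledPoints[b])
            for a, b in combinations(sorted(labeledPoints), 2)}

def matchRigid(calibration, candidatePoints, slack=0.01):
    """Try to assign candidate points to the calibrated markers by checking
    every pairwise distance against the calibration, give or take some slack."""
    names = sorted({name for pair in calibration for name in pair})
    if len(candidatePoints) != len(names):
        return None
    # Brute force every assignment; fine for rigids with a handful of markers.
    for perm in permutations(candidatePoints):
        assignment = dict(zip(names, perm))
        if all(abs(distance(assignment[a], assignment[b]) - expected) <= slack
               for (a, b), expected in calibration.items()):
            return assignment  # maps marker name -> (x, y, z)
    return None

[/code]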

Articulated Rigid Body Distance and Range of Motion Based Labeling

Theory:  This is an expansion of the previous technique to include the concept of connected, jointed or articulated rigid body systems.  If two rigids are connected by a joint (humerus to radius in a human arm, for example), the joint location can be considered an extra temporary marker for distance based identification on either rigid.  Therefore, if one rigid is labeled enough to find the location of the joint, the joint can be used to help label the other rigid.  Furthermore, information regarding the range of motion of the joint can help cull misidentifications.

Caveats:  It’s possible that the limits on a joint’s rotation could be too restrictive compared with the reality of the subject, and cull valid labels.

Recommendation:  This is perhaps the most powerful technique of all.  It’s nonlinear and therefore somewhat recursive in nature.  However, most importantly, it has a concept of structure and pose and therefore can be a lot more intelligent about what it’s doing than other, more generic methods.  It won’t help you track a bunch of marbles or a swarm of ants, but anything that can be abstracted to an articulated jointed system (most things you’d want to mocap) is greatly assisted by this technique.  You can also go so far as to check the pose of the system from previous frames against the current solution, to throw out labeling that would create too much discontinuity from frame to frame.
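
One small piece of this, sketched with made-up names and limits: culling a candidate labeling because the joint angle it implies falls outside the joint’s range of motion:

[code]

import math

def angleBetween(a, b):
    """Angle in degrees between two 3d vectors."""
    dot = sum(a[i] * b[i] for i in range(3))
    lengthA = math.sqrt(sum(c * c for c in a))
    lengthB = math.sqrt(sum(c * c for c in b))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / (lengthA * lengthB)))))

def withinRangeOfMotion(parentAxis, childAxis, minAngle=0.0, maxAngle=150.0):
    """Reject a candidate labeling if the implied joint angle is outside the
    joint's allowed range (e.g. an elbow that would have to bend backwards)."""
    return minAngle <= angleBetween(parentAxis, childAxis) <= maxAngle

[/code]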

Conclusion

These techniques get you what you need to trajectorize and label your data.  However, there are plenty of places to go from here.  These steps serve multiple purposes.  They’ll be executed for realtime feedback.  They’ll be the first steps in a cleanup process.  They may be used and their results exported to a 3rd party app such as motion builder.  Later steps may include:

  • more cleanup
  • export
  • tracking of skeletons and rigids
  • retargeting
  • motion editing

IQ, Arena, Blade and Kinearx may or may not support all of those paths.  For example, currently, Arena will allow more cleanup.  It will track skeletons and rigids.  It will stream data into motion builder.  It will export data to motion builder.  It will not retarget.  It will not get into motion editing.  Motion builder can retarget and motion edit, and it also has some cleanup functionality.  IQ will allow more cleanup, export and tracking.  It does not perform retargeting or motion editing.  Blade supports all of this.  Kinearx will likely support some retargeting but will stay clear of too much motion editing in favor of a separate product that will be integrated into an animator’s favorite 3d package (Maya or XSI, for example).

The next topic will likely be tracking of skeletons and rigids.  You might notice that we’ve kind of gotten into this a bit with the labeling of articulated rigid systems.  And you’d be correct in making that identification.  A lot of code would be shared between the labeler and the tracker.  However, what’s best for labeling may not be best for tracking.  So the implementation is usually different at a higher level, because the goals are different.