
The video always looks a little funky when they are showing crowds of running people.
It reminds me of early-to-mid-'90s video games with FMV (full motion video) graphics. The people are flat images, pre-recorded (or pre-rendered if they are CGI) as individual sprites.
The video is then overlaid using chroma keying and sometimes touched up.
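For anyone who hasn't seen chroma keying up close, here is a rough sketch of the idea in Python. It is only an illustration under my own assumptions: the function name, the pure-green key colour, and the tolerance value are made up for the example, and real compositing tools are much more careful around edges.

```python
def chroma_key(foreground, background, key=(0, 255, 0), tolerance=60):
    """Composite foreground over background, treating pixels near `key` as transparent.

    Images are lists of rows; each row is a list of (r, g, b) tuples.
    The key colour and tolerance are illustrative assumptions.
    """
    out = []
    for fg_row, bg_row in zip(foreground, background):
        row = []
        for fg, bg in zip(fg_row, bg_row):
            # Squared Euclidean distance from the key colour in RGB space.
            dist2 = sum((c - k) ** 2 for c, k in zip(fg, key))
            # Close to the key colour: show the background instead.
            row.append(bg if dist2 <= tolerance ** 2 else fg)
        out.append(row)
    return out


GREEN = (0, 255, 0)
# A tiny 2x2 "actor on green screen" frame and a 2x2 backdrop.
actor = [[GREEN, (200, 150, 120)],
         [(10, 240, 5), (40, 40, 90)]]
scene = [[(90, 90, 90), (90, 90, 90)],
         [(30, 30, 30), (30, 30, 30)]]
composite = chroma_key(actor, scene)
```

The hard in-or-out threshold is what leaves the jagged or fringed outlines you often see around keyed-in figures.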
Observe these silly takes from the 1997 classic Lands of Lore II:
http://www.youtube.com/watch?v=f8L-RjZ6 ... re=related
http://www.youtube.com/watch?v=wIkXycH5 ... re=related
Without a reference for how to place a 2D sprite in a 3D environment, the lazy creators might conceivably drop in the FMV without comparing it to real people and just try to "feel out" how the crowds should appear.
And now see how such a video is inserted into the actual 2.5-D (not fully 2D or 3D) gameplay. (Sorry again for the ridiculous fantasy theme. It kinda fits the "fantasy world" of the 9/11 videos though.)
http://www.youtube.com/watch?v=qvGpOExd ... re=related
When 2D characters are encountered in a 3D landscape filled with 2D sprites, you get some really warped-looking graphics. I believe "2-and-a-half-D" is a term that programmers and game designers from the mid-'90s are familiar with and can recognize in a great deal of the wonky 9/11 graphics. I think that, for ease of rendering, the military employed this cheap and quick kind of software as opposed to attempting to render their 3D model in full or "on the fly".
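To show what I mean by 2.5D, here is a minimal sketch of how an engine of that era might size and position a flat sprite from its distance to the camera. The function name, focal length, and screen width are my own assumptions for illustration, not taken from any particular game.

```python
def project_sprite(world_x, depth, sprite_height, focal_length=300.0, screen_width=640):
    """Return (screen_x, on_screen_height) for a flat sprite facing the camera.

    world_x is the sideways offset from the camera axis and depth is the
    distance in front of the camera. Returns None if the sprite is behind
    the camera. All default values are illustrative assumptions.
    """
    if depth <= 0:
        return None
    # Simple pinhole perspective: things shrink in proportion to distance.
    scale = focal_length / depth
    screen_x = screen_width / 2 + world_x * scale
    return screen_x, sprite_height * scale


near = project_sprite(10, 150, 64)   # close sprite, drawn large
far = project_sprite(10, 600, 64)    # same sprite, four times farther away
```

The whole sprite gets one uniform scale factor, so a flat figure never foreshortens the way a real body would, which is part of why these scenes look off.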
Another example is the 2000 hit Diablo II, where the characters are 2D sprites pre-rendered from 3D models.
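A common trick with pre-rendered sprites is to render the 3D model from a fixed set of angles and then pick whichever frame is closest to the direction the character is facing. Here is a small sketch of that lookup; the function name and the choice of 8 directions are assumptions for the example, not a claim about how any specific game does it.

```python
import math


def facing_frame(dx, dy, directions=8):
    """Pick which pre-rendered frame to show for a movement vector (dx, dy).

    Frame 0 points along +x and frames count counter-clockwise.
    The number of directions is an illustrative assumption.
    """
    angle = math.atan2(dy, dx) % (2 * math.pi)   # normalise to 0..2*pi
    step = 2 * math.pi / directions              # angular width of one frame
    # Shift by half a step so each frame covers the angles nearest to it.
    return int((angle + step / 2) // step) % directions
```

With only a handful of angles available, a figure snaps from one frame to the next instead of turning smoothly.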

Just speculation (we obviously don't know), but what if they had an amazing 3D model of New York City ... that they simply couldn't get to render their "controlled area" properly, so they chopped up the scenery and created 2D sprites to futz around with, just like a mid-'90s video game? Perhaps even some photographs of the buildings could be rendered onto the building models for a 'Google Earth' type effect.