I hit upon a similar idea back in about 2000... the key at the time was that video encoding hardware designed for MPEG compression already computes motion vectors, so pulling them from the encoded stream could have given realtime optical flow way back then. Unfortunately I was nowhere near good enough as a programmer to take advantage of it for real projects. A team at MIT had a paper about this that I was able to find.
Had a page about it on robots.net, but it seems to be down now
In our case it's a hardware h264 encoder in an embedded product, so our method wouldn't be of any use for you in a more general case. We sussed out an undocumented debug API to get the chip to DMA the motion vectors to a debug buffer, and then we extract and use them from there. I think those APIs were originally intended for when a customer is having issues with the product, so the vendor's engineers can help troubleshoot.
We were originally looking at analyzing the h264 bitstream in software and getting the vectors out that way. Honestly I don't think that would be extraordinarily difficult: there's already code in ffmpeg to draw the motion vectors as arrows on the decoded frame, so that would be a good place to look if there aren't any preexisting APIs or tools to extract them (a sketch of that route follows below).
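For what it's worth, a minimal sketch of the ffmpeg route (just a sketch, not our production code): FFmpeg's decoder can export motion vectors as frame side data via -flags2 +export_mvs, and the codecview filter draws them as arrows. This just shells out to an ffmpeg binary assumed to be on PATH; the file names are placeholders.

    import subprocess

    # Ask the decoder to attach motion vectors to decoded frames, then let
    # the codecview filter draw them as arrows. mv=pf+bf+bb selects the
    # forward-predicted MVs of P-frames plus forward/backward MVs of B-frames.
    subprocess.run([
        "ffmpeg",
        "-flags2", "+export_mvs",        # decoder flag: export motion vectors
        "-i", "input.mp4",               # placeholder input
        "-vf", "codecview=mv=pf+bf+bb",  # overlay MVs as arrows
        "motion_vectors.mp4",            # placeholder output
    ], check=True)

To get the raw vectors rather than a visualization, the same export_mvs side data can be read programmatically; FFmpeg ships an example of exactly that in doc/examples/extract_mvs.c.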
Most real-time video processing systems and techniques only address relationships between objects within the same frame, disregarding time information. Optical flow captures this temporal relationship between frames.
Advances in optical flow have changed the game in object tracking and human activity recognition in videos.
This article explains the fundamentals and gives you the code to try it out for yourself.
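For anyone who wants the one-minute version before reading the article (a generic OpenCV recipe, not necessarily the article's own code): dense optical flow between consecutive webcam frames with the Farneback method.

    import cv2

    cap = cv2.VideoCapture(0)  # webcam; swap in a video file path if you prefer
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # flow[y, x] = (dx, dy): how far that pixel moved between frames
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, gray, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
        cv2.imshow("flow magnitude", mag / (mag.max() + 1e-6))
        prev_gray = gray
        if cv2.waitKey(1) == 27:  # Esc to quit
            break

The parameters (pyramid scale, window size, etc.) are the commonly used defaults; they're also exactly the knobs that tend to need per-scene tuning.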
I've dabbled with optical flow for a hobby side project (using the Windows ShaderEffect class, of all things). I worked on the Kinect back in the day; while I was primarily on audio, I had a fascination with applying DSP to two- and three-dimensional temporal data.
I've always felt that it was a missed opportunity to tap into temporal information for entity recognition. I'm excited to see this take hold!
@Darkphibre, wonderful background! Yes, optical flow combined with recent advances in deep learning is certainly exciting to see flourish! Action recognition seems like a promising research area (https://research.google.com/ava/, CVPR '18).
Back in 2008 there was an amazing demo of realtime dense optical flow on a GPU [0], but all the links are dead now. I've searched hard for a comparable implementation since then but have had no luck.
Does anyone have a hint on what technique they might have used?
Thomas Brox’s lab had a ton of these around 2008-2012 as well, such as [0]. I believe Brox had some freely available early CUDA program to calculate optical flow that was sort of SOTA for many years.
Hi @haxiomic, I'm the author of the article - thanks for reading! Several more recent approaches that compute optical flow with CNNs, such as FlowNet2 (CVPR '17) (https://github.com/lmb-freiburg/flownet2), should also be runnable on video or a live webcam feed.
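Not FlowNet2 itself, but if you just want a pretrained CNN flow estimator with minimal setup, torchvision ships RAFT (a newer learned method). A rough sketch of running it on a frame pair, per the torchvision optical-flow API (frame dimensions must be divisible by 8):

    import torch
    from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

    weights = Raft_Large_Weights.DEFAULT
    model = raft_large(weights=weights).eval()

    # Two frame batches of shape (N, 3, H, W); random data as a stand-in.
    img1 = torch.rand(1, 3, 240, 320)
    img2 = torch.rand(1, 3, 240, 320)
    img1, img2 = weights.transforms()(img1, img2)  # RAFT's expected normalization

    with torch.no_grad():
        # RAFT refines its estimate iteratively; the last prediction is the best.
        flow = model(img1, img2)[-1]  # (N, 2, H, W): per-pixel (dx, dy)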
(Hierarchical) optical flow is slow and prone to needing its constants manually adjusted for each scene (not really universal). Did you think about using 3D convolutions and attention with deep learning instead?
Hi @bitL, I'm the author of the article. Yes, I think deep learning is a promising solution that removes the need for manual fine-tuning, and it is certainly driving momentum in optical flow research. Something you may want to look into is sequence-to-sequence (seq2seq) learning (https://google.github.io/seq2seq/).
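For anyone unfamiliar with the 3D-convolution idea floated above: the kernel convolves over time as well as space, so each activation sees a short clip rather than a single frame, which lets the network pick up motion without an explicit flow field. A minimal PyTorch sketch with illustrative shapes:

    import torch
    import torch.nn as nn

    # A clip batch laid out as (batch, channels, time, height, width).
    clip = torch.rand(1, 3, 16, 112, 112)

    # kernel_size=(3, 3, 3): each output mixes 3 frames x 3x3 pixels,
    # so temporal context is baked into the features themselves.
    conv3d = nn.Conv3d(in_channels=3, out_channels=64,
                       kernel_size=(3, 3, 3), padding=1)
    features = conv3d(clip)  # (1, 64, 16, 112, 112)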
Yes, they are. In fact, I think my Samsung TV from circa 2009 actually calls it "Motion Estimation". It makes everything look like it was filmed on a Sony Handycam from the 80s. I don't understand why anyone would want to turn it on.
Hi @adzm, I'm the author of the article - thanks for reading! I think many of these implementations use video processing techniques such as motion-compensated frame interpolation; this paper seems interesting (https://www.mia.uni-saarland.de/Publications/raket-isvc12.pd...).
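To give a rough sense of what motion-compensated frame interpolation does (a deliberately crude sketch, not the method from that paper): estimate flow between two frames, then sample halfway along the flow to synthesize an in-between frame. Real interpolators also handle occlusions, bidirectional flow, and failure cases.

    import cv2
    import numpy as np

    def midpoint_frame(f1, f2):
        # Crude motion-compensated interpolation: sample f2 halfway along
        # the f1->f2 flow. Ignores occlusions and bad flow estimates.
        g1 = cv2.cvtColor(f1, cv2.COLOR_BGR2GRAY)
        g2 = cv2.cvtColor(f2, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(g1, g2, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        h, w = g1.shape
        xs, ys = np.meshgrid(np.arange(w), np.arange(h))
        map_x = (xs + 0.5 * flow[..., 0]).astype(np.float32)
        map_y = (ys + 0.5 * flow[..., 1]).astype(np.float32)
        return cv2.remap(f2, map_x, map_y, cv2.INTER_LINEAR)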