I hit upon a similar idea back in about 2000... the key at the time was that video encoding hardware designed for MPEG compression already computes motion vectors, so pulling them from the encoded stream could have given realtime optical flow way back then. Unfortunately I was nowhere near good enough as a programmer to take advantage of it for real projects. A team at MIT had a paper about this that I was able to find.
Had a page about it on robots.net, but it seems to be down now
In our case it's a hardware h264 encoder in an embedded product, so our method wouldn't be of any use for you in a more general case. We sussed out an undocumented debug API to get the chip to DMA the motion vectors to a debug buffer, and then we extract and use them from there. I think those APIs were originally intended for when a customer is having issues with the product, so the vendor's engineers can help troubleshoot.
We were originally looking at analyzing the h264 bitstream in software and getting the vectors out that way. Honestly I don't think that would be extraordinarily difficult: there's already code in ffmpeg to draw the motion vectors as arrows on the decoded frame, so that would be a good place to look if there aren't any preexisting APIs or tools to extract them (a sketch of that route follows below).
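For what it's worth, a minimal sketch of the ffmpeg route (just a sketch, not our production code): FFmpeg's decoder can export motion vectors as frame side data via -flags2 +export_mvs, and the codecview filter draws them as arrows. This just shells out to an ffmpeg binary assumed to be on PATH; the file names are placeholders.

    import subprocess

    # Ask the decoder to attach motion vectors to decoded frames, then let
    # the codecview filter draw them as arrows. mv=pf+bf+bb selects the
    # forward-predicted MVs of P-frames plus forward/backward MVs of B-frames.
    subprocess.run([
        "ffmpeg",
        "-flags2", "+export_mvs",        # decoder flag: export motion vectors
        "-i", "input.mp4",               # placeholder input
        "-vf", "codecview=mv=pf+bf+bb",  # overlay MVs as arrows
        "motion_vectors.mp4",            # placeholder output
    ], check=True)

To get the raw vectors rather than a visualization, the same export_mvs side data can be read programmatically; FFmpeg ships an example of exactly that in doc/examples/extract_mvs.c.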
Most real-time video processing systems and techniques only address relationships between objects within the same frame, disregarding time information. Optical flow captures this temporal relationship between frames.
Advances in optical flow have changed the game in object tracking and human activity recognition in videos.
This article explains the fundamentals and gives you the code to try it out for yourself.
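For anyone who wants the one-minute version before reading the article (a generic OpenCV recipe, not necessarily the article's own code): dense optical flow between consecutive webcam frames with the Farneback method.

    import cv2

    cap = cv2.VideoCapture(0)  # webcam; swap in a video file path if you prefer
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # flow[y, x] = (dx, dy): how far that pixel moved between frames
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, gray, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
        cv2.imshow("flow magnitude", mag / (mag.max() + 1e-6))
        prev_gray = gray
        if cv2.waitKey(1) == 27:  # Esc to quit
            break

The parameters (pyramid scale, window size, etc.) are the commonly used defaults; they're also exactly the knobs that tend to need per-scene tuning.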
I've dabbled with optical flow for a hobby side project (using the Windows ShaderEffect class, of all things). I worked on the Kinect back in the day; while I was primarily on audio, I had a fascination with applying DSP to two- and three-dimensional temporal data.
I've always felt that it was a missed opportunity to tap into temporal information for entity recognition. I'm excited to see this take hold!
@Darkphibre, wonderful background! Yes, optical flow combined with recent advances in deep learning is certainly exciting to see flourish! Action recognition seems like a promising research area (https://research.google.com/ava/, CVPR '18).
Back in 2008 there was an amazing demo of realtime dense optical flow on a GPU [0], but all the links are dead now. I've searched hard for a comparable implementation since then but have had no luck.
Does anyone have a hint on what technique they might have used?
Thomas Brox’s lab had a ton of these around 2008-2012 as well, such as [0]. I believe Brox had some freely available early CUDA program to calculate optical flow that was sort of SOTA for many years.
Hi @haxiomic, I'm the author of the article - thanks for reading! Several more recent approaches that compute optical flow with CNNs, such as FlowNet2 (CVPR '17) (https://github.com/lmb-freiburg/flownet2), should also be runnable on video or a live webcam feed.
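Not FlowNet2 itself, but if you just want a pretrained CNN flow estimator with minimal setup, torchvision ships RAFT (a newer learned method). A rough sketch of running it on a frame pair, per the torchvision optical-flow API (frame dimensions must be divisible by 8):

    import torch
    from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

    weights = Raft_Large_Weights.DEFAULT
    model = raft_large(weights=weights).eval()

    # Two frame batches of shape (N, 3, H, W); random data as a stand-in.
    img1 = torch.rand(1, 3, 240, 320)
    img2 = torch.rand(1, 3, 240, 320)
    img1, img2 = weights.transforms()(img1, img2)  # RAFT's expected normalization

    with torch.no_grad():
        # RAFT refines its estimate iteratively; the last prediction is the best.
        flow = model(img1, img2)[-1]  # (N, 2, H, W): per-pixel (dx, dy)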
(Hierarchical) optical flow is slow and prone to needing its constants manually adjusted for each scene (not really universal). Did you think about using 3D convolutions and attention with deep learning instead?
Hi @bitL, I'm the author of the article. Yes, I think deep learning is a promising solution that removes the need for manual fine-tuning, and it is certainly driving momentum in optical flow research. Something you may want to look into is sequence-to-sequence (seq2seq) learning (https://google.github.io/seq2seq/).
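For anyone unfamiliar with the 3D-convolution idea floated above: the kernel convolves over time as well as space, so each activation sees a short clip rather than a single frame, which lets the network pick up motion without an explicit flow field. A minimal PyTorch sketch with illustrative shapes:

    import torch
    import torch.nn as nn

    # A clip batch laid out as (batch, channels, time, height, width).
    clip = torch.rand(1, 3, 16, 112, 112)

    # kernel_size=(3, 3, 3): each output mixes 3 frames x 3x3 pixels,
    # so temporal context is baked into the features themselves.
    conv3d = nn.Conv3d(in_channels=3, out_channels=64,
                       kernel_size=(3, 3, 3), padding=1)
    features = conv3d(clip)  # (1, 64, 16, 112, 112)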
Yes, they are. In fact, I think my Samsung TV from circa 2009 actually calls it "Motion Estimation". It makes everything look like it was filmed on a Sony Handycam from the 80s. I don't understand why anyone would want to turn it on.
Hi @adzm, I'm the author of the article - thanks for reading! I think many of these implementations use video processing techniques such as motion-compensated frame interpolation; this paper seems interesting (https://www.mia.uni-saarland.de/Publications/raket-isvc12.pd...).
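To give a rough sense of what motion-compensated frame interpolation does (a deliberately crude sketch, not the method from that paper): estimate flow between two frames, then sample halfway along the flow to synthesize an in-between frame. Real interpolators also handle occlusions, bidirectional flow, and failure cases.

    import cv2
    import numpy as np

    def midpoint_frame(f1, f2):
        # Crude motion-compensated interpolation: sample f2 halfway along
        # the f1->f2 flow. Ignores occlusions and bad flow estimates.
        g1 = cv2.cvtColor(f1, cv2.COLOR_BGR2GRAY)
        g2 = cv2.cvtColor(f2, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(g1, g2, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        h, w = g1.shape
        xs, ys = np.meshgrid(np.arange(w), np.arange(h))
        map_x = (xs + 0.5 * flow[..., 0]).astype(np.float32)
        map_y = (ys + 0.5 * flow[..., 1]).astype(np.float32)
        return cv2.remap(f2, map_x, map_y, cv2.INTER_LINEAR)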