Sense: A New Open-Source Video Understanding Framework for Deep Learning

ilaksh · on Feb 8, 2021

How powerful!

Not having intermediate pose estimation is interesting to me. It just seems like it would need something similar to that to at least be automatically learned. Maybe it effectively recognizes different image patches and their relevant movements? And so the full pose is just extra information that makes things less efficient?

The third dimension of the CNN is for time?

I wonder if it could be useful to have a pose estimation that would compliment the pixels. So not throwing out the pixels after the pose estimation. That way if the pose wasn't perfectly detected or there was other relevant information in the scene then it would all be considered. But correct pose estimation could still be useful to the network in those cases.

monkeydust · on Feb 8, 2021

Virtual personal trainer anyone? Looks impressive.

suyash · on Feb 9, 2021

weird double licensing makes it no go for me