A great article. This is what Hacker News should be about: identifying a problem, researching solutions, and publishing the results for use. Thank you for a fantastic post.
This is great.
I wonder if a crowdsourced solution, with humans selecting the billboard, would also have been cost-effective and delivered the benefits in less time.
This is actually something we considered (we originally built this tool to import data from one of the major vendors). Their data was consistent enough that we were able to achieve a high enough hit rate that it made sense to do it this way (OpenCV is an amazing framework for this type of task). However, for most of our vendors, the boards themselves are not outlined/blocked out. In those cases, the CV algorithms fall apart fairly quickly, since the quality of the input photos is quite variable (most are rather low resolution, and you'd be surprised at the number of images we get with trees or other objects blocking a large portion of the board). The crowdsourced version is definitely on our fun tasks pile, since we've got a lot of inventory where we can't identify the bounds automatically.
I'd like to see how it works on the third image from the top, with the gas station sign, etc.
It is fairly straightforward to detect a billboard in a random nature scene: look for straight lines and their intersections. Most (all?) billboards are also horizontal.
It is quite another thing to detect billboards in an urban setting.
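For what it's worth, the "straight lines and their intersections" heuristic can be sketched roughly like this. It assumes line segments have already been extracted from the photo by some edge/line detector (that step is not shown), and every name here (`candidate_corners`, `tol_deg`, and so on) is made up for illustration. It only proposes corner candidates; it says nothing about whether they belong to a billboard, which is exactly where the urban case gets hard.

```python
import math


def angle_deg(seg):
    """Orientation of a segment in degrees, folded into [0, 180)."""
    (x1, y1), (x2, y2) = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def is_near(angle, target, tol):
    """True if two orientations differ by at most tol degrees (mod 180)."""
    d = abs(angle - target) % 180.0
    return min(d, 180.0 - d) <= tol


def intersection(s1, s2):
    """Intersection of the infinite lines through two segments, or None if parallel."""
    (x1, y1), (x2, y2) = s1
    (x3, y3), (x4, y4) = s2
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-9:
        return None
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    px = (a * (x3 - x4) - (x1 - x2) * b) / denom
    py = (a * (y3 - y4) - (y1 - y2) * b) / denom
    return (px, py)


def candidate_corners(segments, tol_deg=10.0):
    """Keep near-horizontal and near-vertical segments, then intersect each pair.

    tol_deg is an assumed tolerance, not a tuned value.
    """
    horizontals = [s for s in segments if is_near(angle_deg(s), 0.0, tol_deg)]
    verticals = [s for s in segments if is_near(angle_deg(s), 90.0, tol_deg)]
    corners = []
    for h in horizontals:
        for v in verticals:
            p = intersection(h, v)
            if p is not None:
                corners.append(p)
    return corners
```

In a nature scene few segments survive the orientation filter, so the surviving intersections are strong hints. In a city, windows, signs, and building edges all pass the same filter, so the candidate list balloons and something else is needed to tell them apart.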