I hesitate to say I'm working on a startup, but I've been working on a piece of software for a few years now. One of the key components is a scraper, so I have a pretty serious interest in this topic.
It looks like they have thought things through pretty well, but I looked around and didn't find interesting data or useful code.
Screen scraping is when a bot pulls the visual data from the screen and analyzes it. A bunch of tutorials talk about web scraping and call it screen scraping.
By the way, if you're interested in scrapers/crawlers, also have a look at 80legs.com, a crawling SaaS with some custom code capabilities too.
Or, if Python is your bag, there's a scraping lib called Scrapy that was open-sourced last year that's OK.
On ScraperWiki: they need source control because the code actually runs on their servers. I'd guess it's a Bespin-type implementation, so you put all your code on the wiki, where it can get augmented by others, hence the need for SCM. At least, that's the case for the moment. It seems they have plans to release their API engine, or at least calls to it, in which case you'd end up writing the code locally and could use your own SCM.
Their confusing naming could be explained by the fact that they're not pitching this at coders, but rather at journalists, to try to get them to use the vast data resources out on the net.
Screen scraping is when a bot pulls the visual data from the screen and analyzes it. A bunch of tutorials talk about web scraping and call it screen scraping.
A more useful definition would be "extracting structured data from human interfaces", except that it seems to imply pulling data from interfaces other than the screen, so screen scraping is a subset of UI scraping. I still see this as distinct from scraping data from web pages, because that's basically an entity extraction problem, which lies underneath the UI scraping problem.
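To make the "web scraping" side of that distinction concrete, here's a rough sketch of pulling structured records out of markup rather than off the screen. It only uses Python's standard-library `html.parser`. The `LinkExtractor` class, the `extract_links` helper, and the `SAMPLE` markup are all made up for illustration and aren't taken from ScraperWiki, Scrapy, or 80legs.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Hypothetical helper: collects (href, text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) records
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an <a> that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) tuples found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Illustrative markup only; a real page would be fetched first.
SAMPLE = (
    '<ul><li><a href="/a">First</a></li>'
    '<li><a href="/b">Second <b>item</b></a></li></ul>'
)

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE):
        print(href, text)
```

Note that nothing here ever looks at rendered pixels: it works on the document structure, which is why I'd file it under extraction rather than screen scraping proper.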
# Built in source control
really???