When I've searched for content that falls under a DMCA takedown, there is usually a link to the original DMCA notice at Chilling Effects at the bottom of the Google results page.
In theory, shouldn't every takedown notice include the URL that was de-listed? So how hard would it be to take a search term, run it through Google, look at all the linked DMCA notices at Chilling Effects, parse out the original offending URL and result title, then reconstruct something close to what would have been the pre-DMCA search result?
Is googlewithoutdmca.com available?
I'm not going to pursue this. If anyone wants to pick it up, knock yourself out.
I just Googled "hobbit torrent" -- there were 5 DMCA notices at the bottom of the page -- so far, so good.
The problem is that a single DMCA complaint contains thousands of URLs for many movies (ex: http://www.chillingeffects.org/notice.cgi?sID=709810). Reconstructing the page with these results would require Google to reveal which of the thousands of URLs were actually withheld.
I think it's possible, just a bit too hard to make it a quick "piss off the RIAA" project. Instead of merely parsing out the URLs it would require linking the infringing titles at the top to the URLs below.
I would do a first pass that tries to match each URL string against the infringing content titles. The second pass would cURL all of the not-yet-identified URLs and then parse each page for content that would link it back to one of the infringing titles. I'd actually expect the page titles to contain the infringing content title most of the time (also there's probably a small number of sites that account for 90% of the takedown notices).
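A rough sketch of those two passes in Python, in case it helps anyone who picks this up. Everything here is an assumption on my part: the function names (`match_by_url`, `match_by_page_title`) are made up, the matching rule is a naive "every word of the title appears" check, and the second pass takes HTML you have already fetched rather than doing the fetch itself. The real notice format and the real matching logic would need more care.

```python
import re
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


def tokens(text):
    """Lowercase alphanumeric tokens, e.g. 'The.Hobbit-2012' -> {'the', 'hobbit', '2012'}."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match_by_url(url, titles):
    """First pass: titles whose every token appears in the URL path or query."""
    parsed = urlparse(url)
    url_tokens = tokens(unquote(parsed.path + " " + parsed.query))
    return [t for t in titles if tokens(t) and tokens(t) <= url_tokens]


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def match_by_page_title(html, titles):
    """Second pass: titles whose every token appears in the page's <title> text."""
    parser = TitleParser()
    parser.feed(html)
    page_tokens = tokens(parser.title)
    return [t for t in titles if tokens(t) and tokens(t) <= page_tokens]
```

You would run `match_by_url` over every URL in the notice first, then fetch only the URLs that matched nothing and hand the HTML to `match_by_page_title`. URLs with opaque IDs in the path will only ever be caught by the second pass, if at all.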
This all leads me to another thought: what's to stop someone from parsing all the DMCA takedown URLs for the content they contain and then building a search engine for them? The legality would be doubtful at best for US citizens, but not all of us are from the US.
And yet Google is still the easiest way to search for pirated content. I think that the most interesting part of this article is that the biggest alleged infringer, FilesTube, doesn't actually host any infringing files! It's simply a search engine which uses hyperlinks and iframes to embed results.
At the rate the requests are growing there is no way to see if the requests are legitimate. This allows the DMCA to be used as a weapon against sites that are not infringing anything.
There is a column which shows which requests have URLs that were rejected. You have to go back pretty far to find requests with rejected URLs (since there are a lot of requests and URLs are rejected only rarely).
A third party is free to go through Google's transparency reports and their takedown requests, and use the engine itself to validate what got taken down. As far as I am aware, Google is the only party devoting the resources necessary to go through their DMCA requests.
They will grow, and not in a good way. These are automated requests, and the RIAA and the others don't really care if a percentage of them are not correct. The problem is that if your site is on these lists and you actually have a fair-use case, your complaint will be processed manually, so it will take a lot of time.
BTW, the site list[1] looks like a great index for file sharing sites; somehow it defeats the purpose.
For anyone interested in pirating content on the internet, it is already pretty easy (even without Google search, or this page in particular). So it doesn't really change anything.
Right. Try a Twitter search for putlocker.com or sockshare.com.
#whac-a-mole
I never understood how Megavideo got taken down when YouTube had a much larger library of pirated content. Still does. Just watched Dredd 3D on it last week.
Or, you could try Bing instead. Last May, MS sent Google a takedown notice for a Bing search result, which was still working[1].
[1]http://www.techdirt.com/articles/20121008/03500520637/micros...