Peerwiki: all of Wikipedia on BitTorrent (github.com/mafintosh)
88 points by galapago on March 5, 2015 | 30 comments



Author here. If anyone is interested I presented this at jsconf.eu last year as part of a BitTorrent talk, https://www.youtube.com/watch?v=BTCsSwCpGP8 (slides: http://mafintosh.github.io/slides/jsconf-2014/jsconf-eu-2014...)


What are the obstacles to running this client-side in the browser? That would lower the barrier to entry for users by a lot.

Edit: I just finished watching your talk where you mention that https://github.com/feross/webtorrent is already doing that.


The idea is to showcase how well a large dataset can be shared without central servers, here using BitTorrent.

mafintosh showed how Wikipedia could be shared without a central server, relying instead on a network of peers.

substack did something similar with peermaps, which showcases how you can share geo data over BitTorrent. Imagine a Google Maps without Google's servers. https://github.com/substack/peermaps

Of course there are many unsolved questions, like "how do you update?", "how do you manage the data?", etc. But the examples are pretty solid.


This README is a bit short on telling what it actually does. Anyone?


Pretty sure it works like this: when you open a Wikipedia article, it requests the file from other peers in the network using the BitTorrent protocol. It's a copy of Wikipedia placed on BitTorrent, presumably with some semantics for article updates.
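
A rough sketch of what "requesting an article from peers" could come down to, assuming the client already knows the article's byte offset and length inside the dump. This is not taken from Peerwiki's code; the function name, the offsets, and the 256 KiB piece size are all made up for illustration.

```python
def pieces_for_range(offset: int, length: int, piece_size: int) -> range:
    """Return the indices of the torrent pieces that cover a byte range."""
    if offset < 0 or length <= 0 or piece_size <= 0:
        raise ValueError("offset must be >= 0; length and piece_size must be > 0")
    first = offset // piece_size                  # piece holding the first byte
    last = (offset + length - 1) // piece_size    # piece holding the last byte
    return range(first, last + 1)


# Hypothetical article: 40,000 bytes starting at byte 1,040,000,
# in a torrent with 256 KiB (262,144 byte) pieces.
needed = list(pieces_for_range(1_040_000, 40_000, 262_144))
```

Here `needed` is `[3, 4]`, so only two pieces would have to be fetched from peers rather than the whole dump.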


It looks like it uses a single Wikipedia dump from over a year ago. Would be cool if it supported deltas somehow, so the network doesn't split when a new dump is used.


Bittorrent doesn't have support for that, but you could publish torrents that just host a full version of all the modified pages.
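
A minimal sketch of how the contents of such an "update torrent" might be picked, assuming each dump is available as a mapping from page title to page text. The function name and the in-memory dict representation are assumptions for illustration only; a real dump would be far too large to handle this way.

```python
import hashlib


def changed_pages(old_dump: dict[str, str], new_dump: dict[str, str]) -> dict[str, str]:
    """Return the full text of every page that is new or modified in new_dump."""

    def digest(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    old_hashes = {title: digest(text) for title, text in old_dump.items()}
    return {
        title: text
        for title, text in new_dump.items()
        if old_hashes.get(title) != digest(text)
    }


# Toy dumps (hypothetical data).
old = {"Cat": "Cats are mammals.", "Dog": "Dogs bark."}
new = {"Cat": "Cats are mammals.", "Dog": "Dogs bark loudly.", "Emu": "Emus run."}
update = changed_pages(old, new)
```

`update` holds the full new text of "Dog" and "Emu" only. Deleted pages are not covered by this sketch and would need a separate list.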


Found out recently that there's a cool project underway to make a distributed wiki - http://wardcunningham.github.io/.


this breaks the bittorrent.


Can an observer watch what pages people read?


Is this like Popcorn Time for Wikipedia?

I think Bittorrent is also working on a similar project for the whole web, called Maelstrom:

http://blog.bittorrent.com/2014/12/10/project-maelstrom-the-...


Closed source?


Seems like the BitTorrent foundation has been doing a lot of closed source work ever since BitTorrent came to be used across the world and they got very little money out of it.

Just my drive-by judgement.


I think so, yes.


I wouldn't want something like this to be closed source. It's very important that the security of the decentralization can be validated.

And in these NSA times it's even more important.


If the concept works well, an open source variant is sure to be created precisely for this reason.


A Wikipedia hosted in a decentralized manner (i.e. a DHT on running computers) that could still be updated in a distributed fashion would really help us maintain that knowledge for the future while not relying on Wikimedia's servers to keep running.


Would a blockchain-like solution work here? Where new edits piled up on top of the existing data, constantly sharing it across all servers?


It could, but it would be an unnecessary hurdle, because there is no need for a global consensus on a single version of the encyclopedia. Think of it as a git tree, and check out the branch you like. With a currency, it's imperative that everyone refers to the same branch all the time; not so with an encyclopedia.

In practice there would be a few "popular" branches, and one would likely dominate, so that it would be trivial to identify it by relying on a social consensus.

Using a blockchain when what you need is a distributed database is overkill.


If you consider malicious actors (http://en.wikipedia.org/wiki/Wikipedia:Long-term_abuse), some form of local consensus becomes necessary. In the real Wikipedia this is handled by admin actions right now. And that's really the only workable system I can come up with right now.

You could have different branches with different admin teams, but in a sense we have that technology already: anybody can download an XML dump of the Wikipedia database and set up their own Wikipedia clone with minimal effort.


Yes, we do have that technology, yes it's Wikipedia, yes it can be distributed, no that doesn't require a blockchain nor particularly benefit from one.


It's not imperative that everyone refer to the same branch all the time for currency either. If User A gives Bitcoin to User B, and User C gives Bitcoin to User D, there's no need for them to be on the same branch as the two transactions are unrelated.

It's when you need to merge branches back together that things tend to get messy - with currency or an encyclopedia, and everyone being on the same branch all the time is better, for both cases.

I can just imagine what the 'popular' branch for an encyclopedia would look like if 4chan set their sights on causing problems for it.


1) Sure, some operations are commutative and compatible, and you can import them from one branch to another, but in the general case, they aren't.

2) Same thing that happens currently on Wikipedia. There's a set of rules that determine which edits are valid and which aren't, and a local client can apply those rules (or not) to determine the HEAD version. There's nothing that prevents a Sybil attack on Wikipedia, but there doesn't seem to be the need for it.
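
To make both points concrete, here is a toy sketch of a client replaying an edit log under its own rules to arrive at a HEAD. Everything in it (the `Edit` type, the `head` function, the two rules, the sample edits) is hypothetical and has nothing to do with how Wikipedia or Peerwiki actually store edits.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Edit:
    article: str
    author: str
    new_text: str


Rule = Callable[[Edit], bool]


def head(edits: list[Edit], rules: list[Rule]) -> dict[str, str]:
    """Replay edits in order, skipping any edit that fails a rule.

    Returns the latest accepted text for each article.
    """
    state: dict[str, str] = {}
    for edit in edits:
        if all(rule(edit) for rule in rules):
            state[edit.article] = edit.new_text
    return state


# Hypothetical local rules: ignore a blocked author and empty pages.
blocked = {"vandal"}
rules: list[Rule] = [
    lambda e: e.author not in blocked,
    lambda e: e.new_text.strip() != "",
]

cat = Edit("Cat", "alice", "Cats are mammals.")
dog = Edit("Dog", "bob", "Dogs bark.")
cat2 = Edit("Cat", "carol", "Cats are small mammals.")
spam = Edit("Cat", "vandal", "lol")

# Edits to different articles commute: either order gives the same HEAD.
same = head([cat, dog], rules) == head([dog, cat], rules)

# Edits to the same article do not: the order decides which text wins.
differs = head([cat, cat2], rules) != head([cat2, cat], rules)

# The blocked author's edit is dropped by this client's rules.
filtered = head([cat, spam], rules)
```

A client with a different rule set would compute a different HEAD from the same log, which is the "branch you like" idea from upthread.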


Why? Specifically - why would I choose to browse this way over just accessing wikipedia?


You wouldn't ask if you lived in China. Though bittorrent traffic may need to be disguised.

Decentralising the internet is generally a great idea.


An interesting point here is that Wikipedia actually works in China; only some articles are censored. I tried to bypass this by typing https://www.wikipedia.org into the browser, but it loaded the http version regardless.


Then take a look at Freenet. :P


This is just one answer. You know how Wikipedia is always asking for donations (and rightfully so)? Part of those donations go to paying server cost.

Well, this cuts the server cost by decentralizing their content.


I don't think it is so much a question of why. More that it is a pretty cool idea.


Interesting project.



