Hacker News new | past | comments | ask | show | jobs | submit login
How the Bitcoin protocol actually works (michaelnielsen.org)
524 points by zootar on Dec 6, 2013 | hide | past | favorite | 55 comments



Absolutely love this quote from the article: "Money is like gas in the car – you need to pay attention or you’ll end up on the side of the road – but a well-lived life is not a tour of gas stations!" - Tim O'Reilly


It would be better to just be explicit.

Money can be an incredibly useful thing to have, but getting happiness is a lot more complicated than just getting a lot of money.


A platitude is only interesting if the language used to convey it has some poetry to it. Otherwise it's just a platitude.


What I said isn't a platitude. It is straight to the point.


"a remark or statement, esp. one with a moral content, that has been used too often to be interesting or thoughtful."

Your statement /was/ to the point. that doesn't make it not a platitude.


A platitude would be, "Money doesn't buy happiness," or the equivalent. I specifically did not say that.

I don't want to talk about this any further. We have misunderstood one another to the point that this isn't productive.


>I don't want to talk about this any further. We have misunderstood one another to the point that this isn't productive.

Oh man. A hacker news comment 'This isn't productive' Now you have me laughing. Seriously, I need to block news.ycombinator.com at my router again. I didn't even get one server up and ready today (though I dealt with several hard drives, a co-lo emergency and some social/business bullshit... I wasn't completely unproductive, but HN... did not help with that.)


It's still just a platitude. Poetry is just a pointless distraction.

You can put lipstick on a pig...


>It's still just a platitude. Poetry is just a pointless distraction.

and a painting is just some oil with some impurities on a cotton backing.

The fact that words sometimes convey practical meaning does not preclude them from also being art.


>>Poetry is just a pointless distraction.

I need to start a website called "The cold-hearted comments of HN".


Quintessential HN.


So what's life like without the ability to acknowledge art?


> It's still just a platitude. Poetry is just a pointless distraction.

> You can put lipstick on a pig...

Was this comment the spoken word in disguise? It sounded a little poetic to me.


Poetry, art is our only umbrella in the shitstorm that we call life, how can you say it is a pointless distraction?


Great article. There is a lot in the bitcoin protocol that is possible that the vast majority of people have never heard of, some of which this article touches on.

For an idea of some of what could be done, see the Contracts section on the bitcoin wiki:

https://en.bitcoin.it/wiki/Contracts

In the near future, we may start to see things like multi signature transactions* and the like. In theory, it could open up some very interesting options.

*Technically, I believe there exist a few of these transactions already on the blockchain.


There's even a web service for easy multisig: https://www.bitrated.com/


From the article: "I don’t understand why double spending can’t be prevented in a simpler manner using two-phase commit. What drawbacks and advantages does it have compared to the full Bitcoin protocol? uppose Alice tries to double spend an infocoin with both Bob and Charlie. The idea is that Bob and Charlie would each broadcast their respective messages to the Infocoin network, along with a request: “Should I accept this?” They’d then wait some period – perhaps ten minutes – to hear any naysayers who could prove that Alice was trying to double spend. If no such nays are heard (and provided there are no signs of attempts to disrupt the network), they’d then accept the transaction."

The problem with 2PC is that a malicious node could stop somebody from being able to spend their coins by always sending nays out to the network when ever the victim sent a transaction. To prevent this, you would need to be able to detect when a node is faulty/malicious which would require implementing a costly Byzantine Consensus Protocol [1]. In practical systems that can withstand Byzantine faults, the number of messages required to agree on a log entry (e.g. a transaction) would be O(n^2) in the number of nodes in the network, which would greatly limit scalability.

The genius of Bitcoin's Proof of Work protocol is that:

1) it is more resilient than Byzantine agreement. Byzantine agreement with 3f+1 nodes can handle at most f faulty nodes, while Bitcoin can tolerate 49% of the network hashing power being malicious. (though this is being debated currently due to theoretical strategies like selfish-mining)

2) it is much more efficient than Byzantine agreement in the number of messages sent, which has allowed the network to scale to thousands of nodes, although they have been running into issues with the blocksize/block limit, but their are currently research efforts [4] underway to remedy this.

The main problem with Bitcoin's proof of work scheme it is that it is extremely expensive in terms of CPU cycles, but this is solved by compensating miners for their efforts through coin generation and transaction fees.

[1] - http://www.cs.cornell.edu/courses/cs614/2004sp/papers/lsp82.... [2] - http://pmg.csail.mit.edu/papers/osdi99.pdf [3] - http://arxiv.org/pdf/1311.0243v5.pdf [4] - http://www.cs.huji.ac.il/~avivz/pubs/13/btc_scalability_full...


I don't think this exactly right -- an evil node can't send a false nay because it would have to be able to forge the conflicting transaction, the signature prevents that.

Bitcoin already essentially does the 2PC that the author is asking about for unconfirmed transactions. The problem being solved is that the Byzantine Generals problem is unsolvable for anonymous actors, as a malicious participant can create a majority of evil voters, winning any dispute resolution by 'Sybil attack'.

The proof of work system allows a newly joined node to determine the current consensus even when it's disputed, without having any idea who is on the network, so long as it has at least one link to the true hashing-power consensus. It also acts as a commitment protocol -- once you've signed your winning block to the network, it's nonrepudiable even by you.

With an alternate source of identification, a pseudo-anonymous 'Infocoin' ledger should be able to function and scale just fine without all the PoW expenditure--or in other words, you must have a system of making identities expensive, and Bitcoin's is Proof of Work.


Original post author here. Thanks -- your comment pretty much nails the answer to my question.

Edit: A problem with this is that it would be necessary for a naysayer to exhibit the other transaction (proof of double spending). But they couldn't forge such a transaction, since they don't have the private key necessary to generate the signature. So I'm still a bit puzzled by this.

Edit 2: And it appears that bcoates is making the same point.


Btw, Ripple is basically a two-phase commit protocol. The problem is trust - ripple has basically decided to trust a centralized list of nodes, whereas bitcoin does not require this trust.


The problem is network disruption. Your system would work if you could guarantee all nodes in the network could communicate with each other every 10 minutes. What happens if this is not possible?

This scenario is called a netsplit in Bitcoin parlance. It has happened before, and it will happen again. In bitcoin, healing a netsplit is relatively easy, you pick the longest chain and you allow all the transactions from the smaller chains to be re-added to blocks in the new official chain.

In your system, if you had a double-spend where both sides ended up being accepted due to a netsplit, which one would win during the healing process? In bitcoin, the winner is the one that happens to be in the longest chain.

Secondly, how would you store the infocoin ledger in your system? The advantage of the bitcoin blockchain is that it provides an official source for all transactions that is easy to verify. A new node can come in, download the block chain and be certain they have all the information available about bitcoin ownership, with no holes. In your system, without a blockchain, how do you guarantee you have a record of all transactions, with no gaps?


"In your system, if you had a double-spend where both sides ended up being accepted due to a netsplit, which one would win during the healing process? In bitcoin, the winner is the one that happens to be in the longest chain."

That doesn't seem to be unanswerable, and it's not like "happens to be on the longest chain" is more deserving or anything, it just happens to be the way it works in bitcoin.


The smallest of nits to pick, but Readability has issues with the "images as text" behavior of the variables and numbers in the crypto parts of the post.

Otherwise, I'm very sorry that I didn't find this until after Thanksgiving. As many of you are likely similarly tagged, I'm the "computer" guy in the family, and so whenever something happens in the news that's got anything to do with computers, I'm the guy people bug for answers.

Bitcoin was the topic de jour this Thanksgiving, and my explanations would have benefitted greatly from this plain-English run-through.

Fortunately, after having read this, I now will be able to give people the deluxe explanation, should they make the fatal mistake of asking. :) Thanks.


Yeah, and I find they're more difficult to read. Anyone have suggestions for a good solution, which still allows for image fallback?


Amazing post. I thought I had a pretty good understanding of Bitcoin, yet this is the first time I've even heard of an associated scripting language. I can't wait for the post detailing that. Here's the wiki article for now: https://en.bitcoin.it/wiki/Script


The scripting language is the reason I'm betting on bitcoin in the long run. The scripting language lets you do so many interesting things with it (escrow, wills, and surely lots of things I can't envision).

Check out this video about bitcoin contracts which talks about the scripting language: https://www.youtube.com/watch?feature=player_embedded&v=mD4L...

Unfortunately, if I understand correctly, non-standard transactions are not accepted by most miners currently, which means the interesting things in the previously linked contracts video wouldn't work. This should improve over time as new transactions are proposed and added to the list of standard transactions. The bitcointalk forums, while having a lot of noise, also has interesting discussions about new transaction types.

Bitcoin combined with its scripting language is just too good of an idea to go away, and I think we'll start to see more interesting transactions within a couple years.

edit: Forgot to mention that everyone who is manic about bitcoin never even mentions the good ideas in the actual protocol, they only focus on the crazy price swings. Learning about the backing technology is the best way to feel confident about bitcoin's future.


Someone is running a competition right now. To win, you have to find a collision using various hash functions. https://bitcointalk.org/index.php?topic=293382.0 The rules of the competition are encoded in the bitcoin transaction using Script. For example:

  OP_2DUP OP_EQUAL OP_NOT OP_VERIFY OP_SHA1 OP_SWAP OP_SHA1 OP_EQUAL
which means SHA1(x) == SHA1(y) AND NOT (x == y)


Here's two dumb questions you might want to answer in a follow-up (and I encourage you to write more articles).

1. If it's a principle that "everybody" has a copy of the block chain, and the block chain records every transaction ever, isn't that going to be a storage problem? How big is one person's copy of the block chain now, and how many petabytes might it grow to be in a few years?

2. The way to spend a fractional coin is to have one input and two outputs, one back to yourself with the unspent change. But doesn't one's wallet begin to be cluttered with lots and lots of little fractional items? Like having a pocketful of small coins... So is there an automated way to every so often, do a many-small-input, one-big-output payment to yourself to consolidate the change?


Trying to answer: 1. https://en.bitcoin.it/wiki/Scalability#Storage 2. This is actually implemented by the client and is not visible to the user, the bitcoin-qt client creates hidden change addresses that are not shown to the user, so you should see it as a "backend implementation" and not something you'll see in your transaction list.


1. This is something that has come up in discussion a fair amount, as the blockchain is already GBs in size and takes a fair amount of time to even sync with peers to download it all. There are a number of proposals for "thin" clients that will hopefully address this:

https://en.bitcoin.it/wiki/Thin_Client_Security

2. I'm not sure what you are asking here. Transactions are almost always fractional as is - you can send as little as 0.00000001BTC at a time. I think I am missing what you are asking.


In the article is the example of paying somebody 0.15 of a bitcoin with the input of 0.20 and two outputs, 0.15 to the payee and 0.05 back to oneself. I am picturing having my wallet cluttered with lots of such little fractional bits, the "change" from transactions. How to tidy up? But maybe the answer is, next time I want to pay somebody 0.10 I just give them two my 0.05 "nickel" pieces... Well, I said it was a dumb question...


If you want to, you could create a transaction that lets you spend all of those "loose change" transactions into a bigger transaction.

You have to do this for assurance contracts [0], for example, where the transaction cannot have any change (a transaction back to yourself).

[0] - https://en.bitcoin.it/wiki/Contracts#Example_3:_Assurance_co...


2. Unspent transaction output (UTXO) bloat exists but today it doesn't really cause any problems. Maybe once ultimate blockchain compression is implemented there will be an incentive for people to merge their UTXOs.


In answer to the question:

"Suppose Bitcoin mining software always explored nonces starting with x = 0, then x = 1, x = 2,\ldots. If this is done by all (or even just a substantial fraction) of Bitcoin miners then it creates a vulnerability."

The reward for mining the block includes the miner's account, so each miner is hashing a unique block.


This explanation is pure genius. Instead of explaining something very complicated, explain how to arrive at the complicated solution by evolving the simple solution and eliminating its flaws. Wish more explanations were like this. Someone do git.


> Of course, any still-pending transactions in A will still be pending in the queues of the miners working on fork B, and so all transactions will eventually be validated.

I don't understand this. So you have fork A and fork B, and once fork B wins, what happens to the transactions being done on fork A? Discarded? Do transactions sometimes not ever get verified?

From what I've quoted, I'm guessing no, transactions in fork A somehow get verified by work being done in fork B. How? Do these transactions exist in both forks?

Also, in general, how do p2p networks facilitate the finding of nodes? The links provided [0] describes how a node figures out what to tell other nodes about itself [0], and how to talk to other nodes once they've been discovered [1], but how does a node actually find other nodes, to begin with? Best I can tell it's just a text file with a bunch of "starter" nodes.

    [0] - https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery
    [1] - https://en.bitcoin.it/wiki/Network


I'm pretty sure that for normal operation, a transaction gets broadcast to the network, and all miners see it and try to include it in their blocks. Each miner has a queue of unverified transactions to include in the blocks it's trying to build. When the miner gets a valid block from another miner, then it removes all of the transactions in that block from it's queue.

In a legit fork with no attempt at malicious action, there's no reason for the transactions in the forks to be any different, since all of the miners are seeing the same transactions broadcast to the network and have essentially the same transaction queue.

For an attempted double-spend attack, miners already check that there isn't a double-spend in the nodes in their block, so they would drop the second transaction they got trying to spend the same coins.

Since a double-spend can't happen in a single block, the double-spend attack relies on 2 miners working on different blocks, where the attacker-controlled miner is semi-isolated from the network and does not include the original spend transaction, but instead the attacker's second spend transaction. If the attack miner solves the block before any other miner on the network and broadcasts it, then the rest of the miners will accept it and reject the original spend transaction as a double-spend, leaving whoever was going to receive it with no coins.


The non incompatible transactions will be present in both forks, and they will get validated and fixed whatever fork wins. This includes most of the transactions that are not related to the double spending.

In the case of a double spending, each transaction can live in only one of the forks, and the decedents [1] transactions can be validated only in the same fork. When a fork wins, all the transactions that are incompatible can’t be added and they disappear.

[1] Sopuse that X sends the same bitcoins to A and B. Then A send these bitcoins to A1, that sends these bitcoins to A2. During the same time B send these bitcoins to B1, that sends these bitcoins to B2. Then you have two forks, one with the A, A1 and A2 transactions and another with the B, B1 and B2 transactions. Only these transactions have problems and one half of them will disappear, all the other transactions will apear in both forks.


> Early in the section I mentioned that there is a natural way of reducing the variance in time required to validate a block of transactions. If that variance is reduced too much, then it creates an interesting attack possibility...

You've described a 51% attack. The security model of bitcoin assumes these are economically infeasible to mount.

> Suppose Bitcoin mining software always explored nonces starting with x = 0, then x = 1, x = 2,\ldots. If this is done by all (or even just a substantial fraction) of Bitcoin miners then it creates a vulnerability. Namely, it’s possible for someone to improve their odds of solving the proof-of-work merely by starting with some other (much larger) nonce...

No, this does not improve their odds. No miners are scanning the same ranges, because the block header is different for each miner (due to different coinbase transactions and timestamps, if for no other reason).


> You've described a 51% attack.

No, this isn't related to 51% attacks.

Your other point I agree with; this is also discussed in the post comments.


If Alice is able to maintain a fork at equal length to the main chain, that's a 51% attack.


Two questions:

Mining bitcoins is about being the first one to generate the hash that satisfies having however many 0s. Whoever gets there first wins the prize, cool.

Does that mean that you can only get new bitcoins in blocks of 25? If my mining program gets unlucky, and never gets a hit, do I get nothing for it? Or are there schemes where I am helping, and I get a portion?

Further, is there something in the data being hashed that includes the previous block chain?

Otherwise, couldn't I make myself some free bitcoins by pre-computing the hash for a transaction I'm going to do in the future, and verifying it first? Sure it might take some computing power, but with no competition I just need to make sure that my costs are less than 25 bitcoins- and at today's prices, that's a lot cash to work with.


> If my mining program gets unlucky, and never gets a hit, do I get nothing for it? Or are there schemes where I am helping, and I get a portion?

Correct, if you are mining alone, it is all-or-nothing. One way to distribute this risk is pooled mining, where you join a group of other miners and split the rewards.


> Further, is there something in the data being hashed that includes the previous block chain?

Yes, the block chain structure includes the previous block; otherwise it would just be a block set :)


To be precise, each block contains the hash of its previous block.


A bit shorter explanation from me: http://perlalchemy.blogspot.com/2011/05/bitcoin-protocol-hig...

Not so detailed - but I think it covers the main mechanics.


Nice start. Another dumb question: how do individual transactions get to the miners for addition to the blockchain? Is there some P2P-type auto discovery going on? A central server shouting out miner IPs?


There's an article on that here: https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery

TL;DR: The peer discovery process is bootstrapped using a few domain names hard coded in the client. https://github.com/bitcoin/bitcoin/blob/master/src/chainpara...


Thanks. Is there anything on the protocol for passing messages around once you've found some peers?


Geez. That seems...vulnerable.


How many transactions end up in a block? How big are blocks?


As many as will fit, and maximum 1MB.


Recently, a new patch is going to enter in the bitcoin-qt client that increases the default size of the block for miners from 50k to 300k to accomodate higher transaction volume.


There's some controversy around what to do about this problem. See, for example:

http://keepbitcoinfree.org/





Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: