Hacker News new | past | comments | ask | show | jobs | submit login
An open-source fully homomorphic encryption library (github.com/shaih)
127 points by cirwin on April 30, 2013 | hide | past | favorite | 69 comments



"Fully homomorphic" means "a computation can be done on two encrypted values without decrypting them." It is an extension of partially homomorphic systems, which allow some computation which cannot be extended to arbitrary computations.

A good usage case for partially homomorphic systems is public elections. Suppose given E(x) and E(y) you can efficiently compute E(x + y). Now imagine that you arrange a set of bit-fields as:

    1000 1100  0111 0111    0100 1110     0010 1001
    (random)  (check bits) (votes Alice)  (votes Bob)
In other words, the current vote tally is being represented as a number E(0x8C774E29). You can now create two ballots, E(0xBF010100) and E(0xBF010001). You can add either of those ballots to the vote tally even though you cannot read the vote tally. We can make all of these numbers public knowledge without disclosing your vote -- i.e. a public database can say "Alice's vote was E(0xBF010100)" and "Bob's vote was E(0xDB010001)" and once that encryption is performed, third parties cannot actually verify that Alice didn't vote for Bob or vice versa. So any member of the population can take the public database and confirm that the arithmetic was done properly on the encrypted vote tallies, without figuring out what the actual votes were. Then at the highest level, the tally can be decrypted to find out that Alice won the vote, without disclosing exactly who voted for her. The "check bits" form here a sum of votes as a quick check that someone didn't just add some random ballot in there to screw with the system.

Of course, you need more bits as you want to have more candidates and more voters; and there are problems with confirming that a number looks like 0x010100 and not 0x050500. But fundamentally you can get these really cool cryptographic voting systems which preserve the anonymity of your vote at the lowest level, but can be audited at the lowest level (i.e. we can potentially remove votes, say, from people who were not alive at the time of the election), the votes have perfectly auditable mathematics up to the regional level due to the public databases of encrypted votes; and then once the population is large enough to suitably anonymize the vote, you can decrypt and tally votes publicly too.

Now that's if you just have some function esum such that esum(E(x), E(y)) = E(x + y).

If you have enough to make a full set of logical gates, you could in principle do an entire computation on a set of inputs which you couldn't possibly know, so that the "cloud" could "compute" with your data without ever actually knowing it.


This is a phenomenally good comment. You take the core concept, homomorphic, which is possibly opaque to someone not skilled in the art (of encryption algorithms) and explain the main advantage of this approach using a real-world example that is both complete for the purpose of illustration and also relevant for 'why should I care'.

Encryption has always been something that I care about as a user, but your comment indicates that it might be a practical application for the group theory that I've been reading up on as a hobby. Bravo, good parent.


Operations which preserve structure aren't just confined to group theory. Everyone in a quantitative field should understand the core idea.


In theory if you have arbitrary addition and arbitrary multiplication (i.e. a ring structure) you can perform any transform on the cyphertext that you could on the (finite) plaintext.

As a practical matter conplext transforms tend to produce enormous cyphertexts that can't be evaluated in any reasonable time. Which is why a lot of the cutting edge work is in renormalizing, even at the cost of restricting operations (thus somewhat rather than fully homomorphic).


> can't be evaluated in any reasonable time

I gave the same statement to a coworker who inquired about fully homomorphic encryption, but I didn't have the data to back it up. Can someone help me find performance numbers relevant to this library, or at least a recent comparable implementation?


I don't see benchmarks for this particular implimentation yet, but the readme references the Gentry-Halevi-Smart optimizations as an inspiration. That paper Homomorphic Evaluation of the AES Circuit (http://eprint.iacr.org/2012/099.pdf), discusses three FHE methods of computing AES-128. The fastest disclosed implementation has an amortized run time (on a beast of a machine -- 18MB cache, 256GB RAM) of around 19 seconds per byte. In contrast some CPU have native instructions that can run AES-128 on the order of cycles per byte. So we are roughly talking a 10 billion fold slowdown.


I thought the Gentry method solved the problem of enormous cyphertexts (via 'bootstrapping') while still preserving addition and multiplication operations. Is that not true?


Sure but that method just trades time for space. Later work did reduce the overall complexity by substituting different "hard problems" as the cryptographic primitive, but it's still in the realm of unusable scaling characteristics.

On the other hand, so far as I know, no one has published any papers that suggest a fundamental barrier to improvement. Here's hoping!


I never understand how one could call anything that is fully homomorphic encryption.

Let's say I have access to E(N), the encryption of some secret integer N, and want to know N.

Because of the "fully homomorphic" part, given E(x) and E(y) I can compute E(x+y), E(x-y), E(x/y), etc.

So from E(N), assuming N != 0, I can compute E(N/N) = E(1). Using addition, I can compute E(2), E(3), from there.

From there, I can compute E(N mod 2), E(N mod 3), etc. Comparing them with E(1), E(2), etc, I can recover (N mod 2), (N mod 3), etc.

Once I have those for numbers whose LCM is known to be greater than N, I can use the Chinese remainder theorem to recover N. I think this is fairly efficient in terms of number of computations. If not, other methods exist. For example, one could compute E(1 << b) and from there E(N & (1 << b)) for successive values of b, thus recovering successive bits of N.

What do I overlook here?


To build off of the comments below, this attack is foiled by the fact that one message can be encrypted to one of a large number of ciphertexts.

As mentioned below, this is true for ElGamal. It is also true for all other styles of FHE scheme I'm aware of (including Ring-Learning With Errors based schemes, like this one).

In fact, if a message can only encrypt to one of a small number of ciphertexts, even more direct brute force attacks are often possible. For example, many FHE systems publish a public key. In this case, one can just encrypt E(1), E(2), etc oneself.


Well, ideally you wouldn't be using tiny numbers. If you're using huge numbers, then what you're describing is the equivalent of a rainbow table technique. It's going to take an extraordinary amount of time, computations, and comparisons to be able to figure out what a value was originally, and once you do you've only done it for that one key, whose value you still don't know.

Besides, the big foreseen uses of HE is in applications where the confidential data is going to be changing constantly, so once you've finally figured out what one value was, it'll be worthless to have it because computation has continued.


"If you're using huge numbers, then what you're describing is the equivalent of a rainbow table technique"

Please reread what I wrote; it is not. Let's do the second variant: to determine whether N is odd or even, compute:

  e1 = E(N/N) (equals E(1)) ; 1, encrypted
  b1 = E(N & 1)             ; equals e1 iff N is odd
                            ; (least significant bit of N)
  e2 = E(1+1)               ; 2, encrypted
  b2 = E(N & 2)             ; equals e2 iff second bit is set
  e4 = E(2+2)               ; 4, encrypted
  e4 = E(N & 4)             ; equals e4 iff third bit is set
  e8 = E(4+4)               ; 8, encrypted
  Etc.
Each line is one homomorphic operation. You need ceil(2 log(N)) lines to recover N.


This is a great question! I freely confess that I'm not totally sure about the fully homomorphic systems, but you are making the assumption that you can test for equality "outside" the cryptosystem, and for at least one partially homomorphic case I know of, that's not true (viz. ElGamal).

So you're thinking of a private key cryptosystem, where the ciphertext and the plaintext have the same length and E(.) is bijective, but in the case of ElGamal and public-key encryption in general, E(.) is not actually a function in the mathematical sense and instead encrypts one-to-many. (In fact if memory serves, n plaintext bits become 2n ciphertext bits in ElGamal.)

If that's the case in general then you could indeed compute E(1) := e_div(E(N), E(N)) and e_and(E(N), E(1)), but you could not say that E(N & 1) === E(1) if N is odd, because even though N & 1 = 1 and we've got a nice property of mathematical functions which says that if a = b then f(a) = f(b), this isn't a mathematical function and the one-to-many nature foils precisely this sort of naive comparison.


Thanks for the reply. I indeed was thinking of just comparing the encrypted bits.

That requires that encrypting X always produces the same bits. Reading on ElGamal on Wikipedia learnt me that it does not guarantee that; it also doubles messages in size, as you thought. That will thwart this attack for any reasonable size of the encoded messages.

One could do the comparison inside the crypto system, but that would not help either, as one cannot check the outcome of such a comparison.


Would everyone who voted Alice have the same vote representation? Because that would make it fairly trivial to work out how everyone voted, surely?


That's the point of the random bits.


Of course! Thank you.


Can this be computed with map reduce?

Also related, it might be an interesting way to encode Ricardian Contracts http://iang.org/papers/ricardian_contract.html


This is not terribly on-topic, but, if you're interested in cryptography, Dan Boneh (Craig Gentry's advisor) is currently giving a Coursera course on it:

https://class.coursera.org/crypto-006/class

I'd highly recommend it, it's amazingly interesting, and Boneh is very good at explaining things. I used to think that cryptography was hopelessly impenetrable, but it turns out most algorithms (well, mostly symmetric cryptography) are simple to understand. We're currently at asymmetric crypto and the math's starting.


Be aware that you will need 1 hour/day for the videos, and a couple of hours/week to read additional material and exercises.

I joined the course but couldn't keep the pace.

That being said you can still access the material without doing the exams to do the course at your own rhythm.


Hmm, each week of lessons took me roughly 3 hours a week for the whole course. Doing them at your own pace is also very good, though, as you're pretty much getting the same benefit, but not as much motivation.


Yes you are right, i should clarify that this is my personal experience, with zero previous knowledge on cryptography, and pausing the video continuously to investigate the topics and do some calculations.


I didn't have any prior knowledge either, I guess this is due to each person's method of studying. I generally don't go in very much depth.


I'm taking this course too. It's the first "MOOC" (massive open online course) I've actually stuck with more than a couple weeks.

Matasano's crypto challenge complements it nicely too, if you want to go a bit deeper.


I emailed them about those, but they never got back to me.


Mail me directly! We got a flood of people doing them, far more than we expected, and there have definitely been people falling through the cracks. Please feel free to bug us directly if that happens to you. We're trying our best and are happy to find ways to do better.


Alright, I emailed the address a few hours ago again, but I'll email you directly now, thanks.


First set should be in your email shortly.


Just arrived, thank you!


The README has lots of really cool words and names and things. But it doesn't say what homomorphic encryption is or what it is good for. Could someone enlighten me?


TL;DR: It allows you to edit encrypted stuff without decrypting it.

For example, if you had a book of everyone's salaries, you can ask Bob to add $5000 to it. Normally you'd have to give him the key, but with homomorphic encryption you can just perform operations without decrypting it and having Bob know everyone's salaries.

This becomes really powerful for stuff that you need to work with lots of data without knowing the contents of - cloud computing, for example.


Think Google without the privacy issues.


"Homomorphic encryption is a form of encryption which allows specific types of computations to be carried out on ciphertext and obtain an encrypted result which decrypted matches the result of operations performed on the plaintext. For instance, one person could add two encrypted numbers and then another person could decrypt the result, without either of them being able to find the value of the individual numbers."

From: http://en.wikipedia.org/wiki/Homomorphic_encryption

I don't know much about it either.

For a recent project, I'd be interested in searching and sorting of encrypted data; I wonder if this is up to that. The tricky bit with search is that in many cases, you want to be able to search for substrings, not just an exact match.


The most approachable layman's summary of homomorphic encryption I've seen (with illustrations):

http://www.americanscientist.org/libraries/documents/2012861... [PDF]

Try this before Wikipedia.


Great article, thanks.


http://en.wikipedia.org/wiki/Homomorphic_encryption

Basically encryption dchemes that allow operations to be done on the ciphertext that give a result that can be decrypted and is correct plaintext.


A worthwhile example is the problem of encrypted spam. In theory, spammers could use public keys to evade spam filters, and systems like PGP and S/MIME do not give you any ability to prevent that. On the other hand, an FHE system would allow Google to perform spam filtering on your encrypted email, and so that you receive both the email itself and an encrypted "spam or not spam" bit from Google. You can imagine this sort of thing be applied in other situations -- advertising, options modeling on EC2, etc.

Unfortunately, FHE is nowhere near practical enough to do that sort of thing. Maybe in a decade or two we will see FHE implementations used outside the research community.


It's like regular encryption except you only need to encrypt 1 in 10,000 bits for it to be effective.


Why troll?


Since when is making inoffensive jokes considered trolling? I was amused by the similarity between the words 'homomorphic' and 'homeopathic' so I thought other people might be too.

The number of downvotes my attempt at humour received would suggest that I was wrong.


Oh, HN has a rather strict line on idle jokes. It's not so much that we mind them, but there's a slippery slope into a low signal to noise ratio.

Please pardon my comment about the trolling. Sometimes it's also hard to determine intentions via text-only communication.


I skimmed some docs, and I can't find the list of operation that can be done with that. Is it numerical stuff like addition/multiplication? Is it text editing?


Any algorithm.


It's basically this, http://eurocrypt2010rump.cr.yp.to/9854ad3cab48983f7c2c5a2258... , I think. I wonder what the performance numbers are.


This is not quite true: the mathematical problems upon which they base their security, though related, have some important differences. Most significantly, the problem upon which this new implementation is based (Ring-Learning with errors) is theoretically much more time and space efficient than the problem upon which your linked implementation is based (approximate GCD).


While I greatly applaud the effort of implementing such a complex crypto scheme, I'm afraid I will have to wait years before using something like this. Who wants to be the early adopter of a cryptographic library?


Holomorphic encryption is a very interesting concept especially in a cloud setting. One could perform some operations on encrypted client data. But I don't know if it's mature enough.


I doubt it. A quick look at the papers referenced shows a huge cost factor.


Yes, in general. Special cases can be fast enough.


For those trying to compile it, you'll need to install the last version (6.0.0) of NTL (the Ubuntu package from 13.04 does not provide this version).


could you use something like this to make an anonymous bitcoin protocol?


Check out Zerocoin, though I don't think it uses homomorphic encryption https://en.bitcoin.it/wiki/Zerocoin


I thought homomorphic encryption wasn't secure or complete due to the fiaso that happened on SE earlier this month?


The SE thing was someone claiming a company was doing homomorphic encryption when they really weren't.

(Here's a link for others: http://crypto.stackexchange.com/questions/3645/how-is-cipher...)


OK, I read this as homeopathic. I need some sleep


GPL? That doesn't encourage too many uses of this code. Especially for commercial purposes.


Welcome to Hacker News.

There are some guidelines here: http://ycombinator.com/newsguidelines.html

In particular: Please avoid introducing classic flamewar topics unless you have something genuinely new to say about them.


Feel free to write your own version and release it under a more permissive license then.


While it seems ridiculous to complain about which FOSS license is being used, it's a legitimate observation since there are certain app stores that disallow GPL'd code entirely.

It'd be nice if the author was willing to change the license but it's not something one can reasonably demand.


It would also be nice (actually nicer) if those app stores would allow GPL code.


It's often the other way around, the GPL disallows distributing the software under the very restrictive terms of those app stores.


Sure, but it would be entirely possible for Appstores to make their terms less restrictive, I don't blame people for using licenses like the GPL which are designed to push people in that direction.


I don't think it's very likely app stores will change their terms in the direction of complying with the GPL.

I don't blame people using the GPL either (except possibly people that use it for libraries), nor people that publish apps on the various stores, I was just clarifying a point.


Or even less restrictive specifically for GPL apps...


That is basically the point of the GPL: to fight proprietary software ecosystems.


The moral point of RMS & GPL is that if an ecosystem (such as an appstore) disallows even a possibility for users being in full control of their hardware & software, then such a system is evil, morally repugnant and should change their ways or be shunned by everyone - that 'peaceful coexistance' with them hurts society and should not happen.

YMMV.


Nothing's stopping you (or a potential commercial user) from negotiating a commercial, non-GPL license, you know.


Normally i contact the authors of GPL code and ask them nicely to move to LGPL. It has been working so far.


It uses NTL, and NTL is GPL, so the choice is forced.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: