Why, yes, it's eminently doable. Even a single leaked byte lets you discard 255 out of 256 candidates, and as you kindly point out, 4 bytes would be enough to uniquely identify almost any word in the standard dictionary.
Feel free to peruse some SHA-256 hashes with slightly more than six bytes of fixed prefix - several orders of magnitude harder to find than a mere 32-bit partial collision: http://blockexplorer.com/
Now you can argue about how likely it is for an attacker to actually bother to find and exploit such a weakness, and how a salt would mitigate the severity and so forth, but the takeaway here is that this attack can be trivially and permanently defeated by using a timing-independent comparison function.
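For reference, a timing-independent comparison is only a few lines. Here is a minimal Python sketch (the function name is mine; modern Python ships hmac.compare_digest, which is the better choice where available):

    def constant_time_equal(a, b):
        # a and b are byte strings, e.g. two digests. Digests have fixed
        # length, so returning early on a length mismatch leaks nothing useful.
        if len(a) != len(b):
            return False
        result = 0
        for x, y in zip(a, b):
            # Accumulate differences over every byte instead of returning at
            # the first mismatch, so the running time does not depend on how
            # long the matching prefix is.
            result |= x ^ y
        return result == 0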
The "timing leak" here does not uniquely identify any word in the dictionary; the attacker is on the opposite side of the problem. This attack is implausible.
Here I model the server as a simple function that takes a password, hashes it, and compares it to a known digest with a stand-in for '=='. The function returns true or false, but also leaks, via a simulated timing channel, how long the matching prefix was.
This lets me identify the dictionary word that was used as a password using just 23 login attempts, instead of the expected 19,300 from a brute-force attack.
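Roughly, the model is this kind of thing; the hash, the password, and the per-character sleep below are placeholders I picked to make the leak visible, not the exact code I ran:

    import hashlib
    import time

    STORED = hashlib.sha1(b"correct horse").hexdigest()  # the known digest

    def leaky_login(password):
        # Stand-in for '==': bail out at the first mismatching character,
        # sleeping per matched character to simulate the timing leak.
        guess = hashlib.sha1(password).hexdigest()  # password is bytes
        for g, s in zip(guess, STORED):
            if g != s:
                return False
            time.sleep(0.001)  # exaggerated per-character leak
        return True

    def time_attempt(password):
        # Attacker's view: the true/false result plus how long it took.
        start = time.perf_counter()
        ok = leaky_login(password)
        return ok, time.perf_counter() - start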
In effect, you're giving the attacker the ability to mount a nearly offline attack, as if they possessed the password hash, because the leak hands them as much of the hash prefix as they need.
I don't see how this would not be a security problem, if you acknowledge that testing an HMAC with == is.
Again, this is trivially preventable by using a timing-independent comparison.
No, I understand the attack you're proposing. I just don't think it works, or, for that matter, buys you much of anything.
Some things a reader of this thread would want to know, to make sense of it:
* The timing difference from C memcmp (which is what Python uses) is below the known measurement floor for remote timing.
* With (many) repeated trials and statistical filtering, the floor is (IIRC, but just Google [crosby wallach timing]) hundreds of nanoseconds on a LAN, tens of microseconds on a WAN; the sketch after this list shows what that filtering amounts to.
* The difference between 1 and 2 bytes of matched SHA1 hash is certainly not a millisecond.
* Even if it were a millisecond, which it isn't, you'd come to that conclusion only after many repeated trials.
* To make this attack appear worthwhile, you introduced an artificial rate limit of 1 attempt per second, but obviously ignored that limit when trying to measure the (very noisy) timing of each hash byte.
* Obviously, you can't run this against a salted (randomized) hash, because you don't know enough to generate candidate password hashes to match against.
* The 4-byte shared prefix you're looking for is probably not present in the set of all valid passwords; put differently, you'd have better luck just guessing likely passwords through repeated trials than making repeated trials to find the next prefix byte.
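To be concrete about what that filtering involves, it's roughly the following; send_attempt here is a hypothetical one-round-trip call, and the low-percentile filter is the flavor of thing the Crosby/Wallach paper describes:

    import time

    def filtered_latency(send_attempt, candidate, trials=1000):
        # Time the same candidate over and over; network jitter only ever
        # adds noise, so keep a low percentile of the samples as the estimate.
        samples = []
        for _ in range(trials):
            start = time.perf_counter()
            send_attempt(candidate)
            samples.append(time.perf_counter() - start)
        samples.sort()
        return samples[len(samples) // 100]  # roughly the 1st percentile

Even with that filtering, the differences in question here are single memcmp steps, well under the floors above.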
I'm not disputing that there's information of some sort "leaked" in a password hash comparison; I'm just disputing that it's valuable in any way to a real world attacker.
A more effective way to try to launch this attack would be to try to dump the set of all users known to the system. This attack, far more straightforward than the one you proposed, also doesn't work because of measurement difficulties, but it at least has value in theory, doesn't depend on arbitrary shifting obstacles for the attacker, and nobody in the history of web app development has ever tried to stop it.
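For illustration, that username leak typically looks something like this schematic (no particular framework; with bcrypt or the like in place of the toy hash, the gap between the two paths is far larger):

    import hashlib

    USERS = {"alice": hashlib.sha256(b"salt" + b"hunter2").hexdigest()}

    def login(username, password):
        stored = USERS.get(username)
        if stored is None:
            return False  # fast path: unknown user, no hashing at all
        # Slow path: the password hash only runs for real accounts, so
        # response time alone tells you whether the username exists.
        guess = hashlib.sha256(b"salt" + password.encode()).hexdigest()
        return guess == stored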
It was a toy example; it was not meant to capture the exact details of a real attack.
I didn't ignore the rate limit. As long as online login attempts are much slower than offline brute-forcing, all that matters is whether we need fewer attempts in total. Since the leak lets you cut your attempts by some fraction at a corresponding constant cost, it always wins out for a large enough number of candidate passwords. In the toy example it's about a wash if you assume 1000 samples per 'attempt' (23 timed attempts at 1000 samples each is roughly 23,000 measurements, versus the expected 19,300 straight guesses).
I won't examine each of your plausibility objections, but I'll note that the case is identical when checking an HMAC. You shouldn't rely on those assumptions holding if you can avoid it.
Simply put, using a timing-independent comparison is best practice for checking both MACs and password hashes, and I think it is wrong to dissuade people from doing so.
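As a concrete example on the MAC side, a safe check in Python looks roughly like this (names are illustrative):

    import hashlib
    import hmac

    def verify_token(key, message, received_mac):
        # key and message are bytes; received_mac is the hex digest the
        # client sent along with the message.
        expected = hmac.new(key, message, hashlib.sha256).hexdigest()
        # compare_digest takes time independent of where the strings differ,
        # so an attacker can't grow a forged MAC one byte at a time.
        return hmac.compare_digest(expected, received_mac)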
Then tell people to use secure_compare for everything, because virtually all web applications have much worse leaks than the time it takes to compare password hashes. The username example is far more potent in a real-world attack.