2009-09-29 11:27 - Programming
Quite a long time ago, I read a generally good article on Coding Horror called Rainbow Hash Cracking. It's an introduction into a particular security vulnerability (rainbow tables) and the method for solving it (salting a hash). I won't go into too much detail, as that article does a fine job of explaining it.
A slightly more interesting point is this quote from Jeff (the site's author), in a comment: "You want hashes to be a prefix, not a suffix." I remembered reading this for quite some time, and only much later (after quite a bit of work) did I dig up that original reference. I went so far as to send an email to Jeff to ask why. Why should a hash be a prefix rather than a suffix? How does it matter?
Jeff admitted, shamefully, that he couldn't find his original reference either, and couldn't see why it would matter whether the salt is applied as a prefix or a suffix. At the time, I agreed.
Much later, a cloud hosting company called Engineyard hosted a programming contest, where the goal was to come up with a phrase (within particular limitations) that had the closest SHA hash to the target phrase. Once the contest was over, the winners described how they did it. A particular pair of quotes there prove very interesting: "SHA-1 is a hash function that operates on blocks of 64 bytes at a time" and "We are not aware of any weakness of SHA-1 that would help [so] we reused the first block as often as possible, i.e. for fixed first 64 bytes vary the end".
MD5 also operates on 512 bit chunks (512 bit = 64 byte), thus it has this same limitation. Assume a secret password, phrase, document being hashed and salted which meets or exceeds a 64 byte length. Putting the salt on the end (suffix) means that the hash state up to the salt can be pre-calculated, leaving a smaller operation to calculate the remaining iterations for only the varying salt part. If, on the other hand, the salt is a prefix, then the entire hash must be run over the entire input message, for every possible salt.
An admittedly small difference. And of even less significance for small input messages (like passwords) where even message+salt is likely to fit inside 64 bytes. But those two things I read bounced around inside my skull until I came up with this explanation, and I couldn't resist writing it down.