Plagiarized from another site, then adapted and significantly expanded for this discussion. Writing something entirely original would have taken too long.
Passwords have to be stored somehow on the server. You could store them as plaintext:
| Code:: |
<users>
<user name='Alice' password='7&y2si(V1dX'/>
<user name='Bob' password='mary'/>
<user name='Fred' password='mary'/>
</users> |
After implementing something like this, you'll likely feel rather uncomfortable that all those passwords are sitting there in one file, in the clear. If you don't feel uncomfortable, you should! This makes it way too easy for an attacker who compromises your system to walk away with user passwords without even breaking a sweat. And, if this happens, it's not just your site that could feel the repercussions—most people use the same password for multiple sites. A stolen password is a privacy violation for the user, and frankly, if you didn't do anything to protect those passwords, you're to blame.
The first approach you might take to protect these passwords is to encrypt them. That's better than nothing, but it's not the best solution either. In order to validate a user's password, you need the encryption key, which means it needs to be available on the machine where the passwords are processed. While this does raise the bar a bit because the attacker must find the key, there's a better solution that doesn't require any key at all: a one-way function.
A cryptographic hash algorithm like SHA-1 or MD5 is a sophisticated one-way function that takes some input and produces a hash value as output, like a checksum, but more resistant to collisions (modulo what's already been discussed about recent developments in cryptography, but set that aside for now). This means that it's incredibly unlikely that you'd find two messages that hash to the same value. In any case, because a hash is a one-way function, it can't be reversed, so there is no key that you need to bury. So let's imagine you hash the password before storing it in the database:
| Code:: |
<users>
<user name='Alice' password='D16E9B18FA038...'/>
<user name='Bob' password='5665331B9B819...'/>
<user name='Fred' password='5665331B9B819...'/>
</users> |
Now when you receive the cleartext password and need to verify it, you don't decrypt the stored password for comparison. Instead, you hash the password provided by the user and compare the result with your stored hash. If an attacker manages to steal your password database, he won't immediately be able to use the passwords, as they can't be reversed back into cleartext. But look closely at Bob and Fred's hashed passwords. If the attacker happened to be Fred, he now knows that Bob uses the same password he does. What luck! Even without this sort of luck, a bad guy can perform a dictionary attack against the hashed passwords to find matches.
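To make the store-and-compare step concrete, here is a minimal sketch in Python. It assumes SHA-1 (only because it is one of the algorithms named above) and an in-memory dictionary standing in for the XML file; the function names are made up for illustration, and none of this is Dragonfly's actual code.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the uppercase hex SHA-1 digest of a password (no salt)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def verify_password(candidate: str, stored_hash: str) -> bool:
    """Hash the submitted password and compare it with the stored hash."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(hash_password(candidate), stored_hash)


# Hypothetical stand-in for the password database shown above.
users = {
    "Alice": hash_password("7&y2si(V1dX"),
    "Bob": hash_password("mary"),
    "Fred": hash_password("mary"),
}

print(verify_password("mary", users["Bob"]))   # correct password
print(verify_password("guess", users["Bob"]))  # wrong password
print(users["Bob"] == users["Fred"])           # same password, same hash
```

The last line is the weakness just described: identical passwords produce identical stored values, so anyone who can read the file can spot the shared password.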
The usual way a dictionary attack is performed is to get a list of commonly used passwords, like the lists you'll find at coast.cs.purdue.edu/pu...wordlists, and calculate the hash for each. Now the attacker can compare the hash values of his dictionary with those in the password database. Once he finds a match, he looks up the corresponding password.
More sophisticated dictionary attacks precompute a large database of hashes, storing each string and its precalculated hash as a pair. Such dictionaries are bounded only by storage space. If you hash every possible string (and can store the results in a large disk array), you merely need to search for the matching hash, then look at the string it was created from. This is still a brute-force attack, but the time cost is paid only once: the time it takes to hash and store all possible values (up to a certain length; eventually you'll run out of practical storage space).
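A rough sketch of the precomputed-dictionary idea, again assuming unsalted SHA-1. The four-word list and the "stolen" table are invented for the example; a real attacker's dictionary would be vastly larger.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Hex SHA-1 digest of a string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical, tiny word list standing in for a real password dictionary.
wordlist = ["password", "letmein", "mary", "qwerty"]

# Built once, reusable against any database of unsalted hashes: hash -> word.
precomputed = {sha1_hex(word): word for word in wordlist}

# Hypothetical stolen database of unsalted hashes.
stolen = {
    "Alice": sha1_hex("7&y2si(V1dX"),
    "Bob": sha1_hex("mary"),
    "Fred": sha1_hex("mary"),
}

# Cracking is now a simple lookup per account, with no hashing at attack time.
cracked = {user: precomputed[h] for user, h in stolen.items() if h in precomputed}
print(cracked)  # {'Bob': 'mary', 'Fred': 'mary'}
```

Alice's password isn't in the word list, so she survives this pass; Bob and Fred fall to a single dictionary lookup each.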
To slow down the attack, salt is used. Salt is a way to season the passwords before hashing them, making the attacker's precomputed dictionary useless. Here's how it's done. Whenever you add an entry to the database, you generate a random string to be used as salt and store it in the password database alongside the hashed value of the password. When you want to calculate the hash of Alice's password, you look up the salt value for Alice's account, prepend it to the password, and hash them together. The resulting database looks like this:
| Code:: |
<users>
<user name='Alice' salt='Tu72*&' password='6DB80AE7...'/>
<user name='Bob' salt='N5sb#X' password='096B1085...'/>
<user name='Fred' salt='q-V3bi' password='9118812E...'/>
</users> |
Now there is no way to tell that Bob and Fred are using the same password. The salt itself isn't a secret; it's stored with the hashed user password in the password database on the server. The important thing is that it's different for each user account.
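The salting procedure described above might look something like the following sketch. The helper names and the salt format (16 hex characters from Python's `secrets` module) are my own assumptions, and SHA-1 is used only because it is named earlier in this post.

```python
import hashlib
import hmac
import secrets


def new_salt() -> str:
    """Generate a random per-account salt (assumed format: 16 hex characters)."""
    return secrets.token_hex(8)


def hash_with_salt(salt: str, password: str) -> str:
    """Prepend the salt to the password and hash the two together."""
    return hashlib.sha1((salt + password).encode("utf-8")).hexdigest().upper()


def create_user(password: str) -> dict:
    """Build a database record: the salt is stored in the clear beside the hash."""
    salt = new_salt()
    return {"salt": salt, "password": hash_with_salt(salt, password)}


def verify(record: dict, candidate: str) -> bool:
    """Rehash the submitted password with the account's own salt and compare."""
    return hmac.compare_digest(
        hash_with_salt(record["salt"], candidate), record["password"]
    )


users = {"Bob": create_user("mary"), "Fred": create_user("mary")}

print(verify(users["Bob"], "mary"))                          # login still works
print(users["Bob"]["password"] == users["Fred"]["password"])  # hashes now differ
```

Verification needs nothing beyond what is already in the record, which is why the salt doesn't have to be secret.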
Once you decide to store hashed passwords, you'll realize there's no way to e-mail the user their password if they forget what it is, because the server doesn't really know what the password is, only the hashed value of the password.
With a salted password database, the attacker can't use a prehashed dictionary. But he can still perform a dictionary attack using the salt for each account to rehash his dictionary. You can further slow this attack by requiring a certain level of complexity for passwords, including a minimum length. You can also require that users use a combination of uppercase and lowercase letters, digits, and punctuation. Of course, if passwords are too hard to remember, users may write them down. It's a difficult balancing act.
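Here is a sketch of the per-account rehashing attack mentioned at the start of the paragraph above. The salts are the ones from the example database; the hashes are computed here rather than copied from the truncated values shown, and the word list is invented.

```python
import hashlib


def hash_with_salt(salt: str, password: str) -> str:
    """Salt-then-hash, matching the scheme described above (SHA-1 assumed)."""
    return hashlib.sha1((salt + password).encode("utf-8")).hexdigest()


# Hypothetical attacker word list.
wordlist = ["password", "letmein", "mary", "qwerty"]

# Hypothetical stolen salted database; the salts are visible to the attacker.
stolen = {
    "Alice": {"salt": "Tu72*&", "password": hash_with_salt("Tu72*&", "7&y2si(V1dX")},
    "Bob": {"salt": "N5sb#X", "password": hash_with_salt("N5sb#X", "mary")},
    "Fred": {"salt": "q-V3bi", "password": hash_with_salt("q-V3bi", "mary")},
}


def crack(record: dict, words: list) -> "str | None":
    """Rehash the whole word list with this one account's salt."""
    for word in words:
        if hash_with_salt(record["salt"], word) == record["password"]:
            return word
    return None


# The full dictionary must be rehashed once per account: no shared lookup table.
cracked = {user: crack(record, wordlist) for user, record in stolen.items()}
print(cracked)  # {'Alice': None, 'Bob': 'mary', 'Fred': 'mary'}
```

The weak password still falls, but the attacker pays the cost of hashing the whole dictionary for every account, and the complex password isn't in the dictionary at all.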
Salt isn't a silver bullet. You should react immediately if your password database has been compromised, even if it uses salted hashes. But salt buys you a wee bit of extra time. It just might give you enough time to discover the attack and disable the affected accounts until the users can change their passwords.
Note that hashing passwords only buys you time (the time it takes to brute-force any or all of the password hashes in the database) in the case where your user password database is compromised. The hash collision flaws found in MD5 and SHA-1 at most reduce that average time margin; they do not invalidate the mechanism or concept. However, there are other threats as well. Hashing does not protect your password in transit. When you log into Dragonfly, your username and password are POSTed to the server in the clear. If someone can observe the HTTP request packet as it transits the network (usually by sniffing the network at your computer or at the server), they can just pull your password out of the ether when you first log in.
The server does the password hashing on the other side, verifies that the hash of the password you sent matches the stored hash, and then returns a session cookie with your login credentials in a hash. Afterwards, you present your credentials as that hashed cookie. If someone steals your session credentials, they can impersonate you (until those credentials expire or are revoked), but they cannot tell what your password is (without brute-forcing it).
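As a very loose illustration of that login flow (not Dragonfly's actual implementation), the sketch below verifies the hash and then hands back a session credential. Note the swap: it uses a random server-side token rather than a cookie carrying hashed credentials, simply because that is the easiest way to show the impersonation and revocation points. All names are hypothetical.

```python
import hashlib
import hmac
import secrets


def sha1_hex(text: str) -> str:
    """Hex SHA-1 digest of a string (unsalted, as in the scheme discussed)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical server state: stored password hashes and live sessions.
password_db = {"Bob": sha1_hex("mary")}
sessions = {}  # session token -> username


def login(username: str, password: str) -> "str | None":
    """Hash the submitted password, compare, and issue a session token."""
    stored = password_db.get(username)
    if stored is None or not hmac.compare_digest(sha1_hex(password), stored):
        return None
    token = secrets.token_hex(16)  # random; reveals nothing about the password
    sessions[token] = username
    return token


def whoami(token: str) -> "str | None":
    """Whoever presents a valid token is treated as that user."""
    return sessions.get(token)


def logout(token: str) -> None:
    """Revoking the token ends the session (and any impersonation with it)."""
    sessions.pop(token, None)


token = login("Bob", "mary")
print(whoami(token))  # Bob
logout(token)
print(whoami(token))  # None
```

A thief holding the token can act as Bob until `logout` (or an expiry, not modelled here) removes it, but the token gives no way back to the password.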
Some systems use password hashing to protect the password while in transit by implementing the hashing algorithm on the client. The client takes input from the user, hashes the password, and sends it over the network to the server. The server then merely compares the hashed password to the stored hash in the password database. However, such schemes are not generally practical for web applications, which is why secure web applications use transport (SSL) or link (VPN, etc.) encryption to protect data in transit. Sites like eBay have you log in during an SSL session to protect the POSTed plaintext password on its way to the server, then drop down to an unencrypted session to hand you your session cookie and continue the session. Banking sites just start you in an SSL session and leave you there (as I'm sure we all prefer).
Because password hashing addresses a very narrow threat, it is less important to worry about the strength of the hash unless every other threat in the sequence of events has likewise been mitigated. Threat mitigation isn't dealt with in absolutes: you buy average time to failure with dollars (time to implement is also money, as we all know time==money), and you never spend $$$$$$$$$$ when your data is worth 0.$. Which is to say, you "spend" security in proportion to the value of what you are protecting, and you don't buy up a single aspect of security disproportionately, as it's the weak link that will take you down.
In this particular case, password hashing, even without salt, even considering the known issues with cryptographic hashes, really does address the concerns that this particular threat mitigation method needs to address for the operating environment that Dragonfly was designed for. I'm sure the devs would have preferred using salts, but in order to provide an upgrade path from all the *Nuke and other CMSes that don't use them, you'd have to throw out your user databases and have everyone either sign up from scratch or change their passwords. Mass password changes are a disaster from both a security standpoint (you don't want accounts with pending password changes hanging out there indefinitely because a user never logs in or has abandoned their account) and a logistical one (many, many, many problems and lost password requests, and even that is assuming the password change mechanism isn't buggy).