Secure Password Storage for Websites

Secure password storage is extremely important in today’s on-line world and are often the only thing standing between malicious individuals and our private data. It is vital that your applications do everything they can to store passwords as securely as possible or risk losing the trust and confidence of your users.

So what is secure password storage? We want to have some way of storing users passwords that makes it difficult / impossible for an attacker to recover them. I believe that it is the responsibility of every site that stores a user password, no matter how small to keep that password safe at all costs. People today often use the same password over multiple sites and having your site reveal users passwords is guaranteed to lose their trust. The passwords should also be stored in such a way that even the site administrators cannot recover it, giving users confidence that you yourself are not looking at their private credentials.

Encryption

One, often misguided, way of storing user passwords is to use a symmetric encryption algorithm such as AES. When the user creates their account, they supply a password. The password is encrypted using the servers secret key and stored in the database. AES is currently believed to be unbreakable and is used in many popular technologies today such as HTTPS, so you would think this arrangement would be pretty secure? No, The problem is the passwords are still recoverable and this means that there is always a risk they can be stolen by hackers or browsed by nosy administrators.

Hashing

To truly make the passwords unrecoverable, they must be hashed. Hashing is a one way algorithm, it takes a password as input and outputs the hash. There is no possible function that will accept a hash and return a password. When a user creates an account, their password is hashed and stored in the database. When it comes time for them to login, they again supply their password which is passed through the same hashing algorithm. This new hash is then compared against the one stored in the database to check for a match.

One often raised argument for the use of encryption rather than hashing, is how do we recover the users password if they have forgotten it. The answer is, we don’t. The purpose of a password is to allow the user to prove to you who they are (Authenticate). If they do not have this, you must authenticate them in some different way, perhaps by sending them a password reset link with a time limited token attached. Sending them their own password in an email is dangerous as it leaves their password open to eavesdropping. The key point here is that we should never need to tell a user their password. They either know it and authenticate or we go through some other process to authenticate them and allow them to set a new password.

salt

One problem with hashing functions is that the same password when passed through the hash function will always result in the same hash. This makes sense, but think for a moment, if someone were to create a database of all possible passwords and hashes, you would be able to find the password that matches the hash instantly. Indeed, attackers have been using this for many years, or rather a variant of this idea.

Storing all possible hashes and passwords using a lookup table would quickly gobble up a large amount of memory. Instead, an alternative method known as a rainbow table is used, where an initial value is passed to a series of hashing and retry functions to reconstruct an entire chain of hashes and their corresponding passwords. This allows some memory to be saved at the expense of more CPU time to find the matching hash. An explanation of how these tables work can be found here (http://kestas.kuliukas.com/RainbowTables/)

To defeat rainbow tables, we use a special value known as a salt. The salt is a random piece of data whose purpose is to add additional randomness. Salts are combined with the passwords during the hashing process. The resulting hash is then stored in the database with the salt also being stored alongside it. One common mistake is to go to great lengths to keep the salt a secret. Although it won’t do any harm, this is not the purpose of the salt and trying to keep it secret is almost always a wasted effort. The salt must be different for each user and must be random. The point of this salt is that now, when we put the same password and salt through the hashing function, we will get different values because the salt will be different for each user.

Rainbow tables are becoming less prevalent in favour of incredibly fast dictionary / brute force attacks. To keep passwords safe in the modern age, we need to need to go further.

Key stretching

Although hashes are indeed one way, if someone managed to get hold of the password database, they may be able to find out what the original passwords are by guessing. An attacker could perform the same calculation used to generate the hash for a list of words in a dictionary or for every possible combination of letters and characters. This can be done extremely quickly on modern machines (3367 Million per second on a moderately powerful GPU.) It is therefore quite clear that simply hashing the password is not sufficient.

The way to resolve this issue, or at least make it infeasible, is to increase the amount of work required to compute the hash. One simple method is to take the output from the hash function and pass it back into the hash function, generating a second, new hash. You can continue this process for example 10000 and store the final hash.  This is known as key stretching. Each time a user logs in, their password goes through the same 10000 hash cycle and the final hash compared against that stored in the database. This means that for each guess an attacker makes, they will have to calculate the hash 10000 times making it far slower for them to rapidly guess passwords.

Existing password systems

Storing passwords securely is a tricky business and if the previous two issues I’ve been through haven’t scared you off rolling your own, there are many other problems that can befall a poor password hashing scheme.

Thankfully, there are algorithms out there already that prevent us from having to roll our own password hashing system. Two of the most popular algorithms are PBKDF2 and BCrypt. These algorithms take care of combining the salt and password in a secure manner and also allow you to specify a work factor that increases the number of iterations performed, slowing down the process.

PBKDF2 (Password Based Password Derivation Function 2) is an algorithm that is recommended by NIST. It has a built-in implementation in Microsoft’s .NET framework (see Rfc2898DeriveBytes). Unfortunately, this uses the SHA1 hashing algorithm internally which is now considered to be obsolete by NIST themselves. Using PBKDF2 with a stronger hashing algorithm such as SHA256 or SHA512 is advisable.

BCrypt is another key derivation function based on the blowfish cipher. Bcrypt is preferrable to PBKDF2 in some ways due to the fact that PBKDF2 (Especially using SHA1 internally) is extremely efficient to compute on a graphics card. Normal PBKDF2 hashes will be computed on the CPU but in order to make it as difficult as possible for attackers using custom built password cracking machines (look at this one from 2012), a very high iteration count may need to be used, reducing performance for legitimate users. Bcrypt is much less efficient on a GPU because of its memory access requirements meaning it can be run faster for legitimate users while still proving difficult for attackers to brute force.

A newer algorithm named Scrypt is looking very promising in replacing Bcrypt in the future, it exploits the GPU’s memory access weakness even further by requiring more memory to compute. It also makes it more difficult to compute hashes using Field Programmable Gate Arrays (FPGAs) which are currently being used more and more to attack Bcrypt hashes. Scrypt has been around since 2009 and although it hasn’t “stood the test of time” yet, it will have pretty soon.

The conclusion is that you should use one of the pre-existing systems, PBKDF2, Bcrypt or if you want to go bleeding edge, Scrypt.

Summary

DO:

  • Use a key derivation function, preferably Bcrypt or alternatively PBKDF2 or Scrypt.
  • Use as large a number of iterations (work factor) as possible without affecting legitimate user login performance greatly.

DON’T:

  • Store simply the hash of the password
  • Have a single salt for all users
  • Roll your own hashing algorithm
  • Encrypt passwords

Leave a comment