A password is meant to secure an asset against unauthorized access by an attacker.
To prevent someone from gaining access, the password must be hard to guess, which means it must be strong enough to withstand guessing-based attacks (such as dictionary and brute-force attacks).
Some heuristics for avoiding a weak password are to combine:
- special characters
- upper and lower case characters
- a minimum length of 8 characters
Statistics show that the most common passwords chosen by users are “password” itself and “123456” [Reference]. A weak password can be used as an entry point by unauthorized users.
In other words, a website could be utterly secure in the way it stores passwords, but if a user chooses a weak password like “123456” or “password”, there is nothing that secure storage can do about it.
This article discusses how passwords should be stored on a server, and how they should not.
To begin with, if you can avoid storing passwords on your server, do so. Let others do that job for you (Google, Facebook or Twitter). This is recommended if you are not a security expert. Maybe you are starting a new online project and want to let your users log in with a social network profile. By doing this you no longer need to store passwords on your server, and even though all of those sites have been hacked at some point, I am pretty sure that they have a better understanding of secure password storage.
Storing passwords in plaintext
This may sound silly, but there are websites that store their users’ passwords in plaintext, without any kind of encryption.
This is certainly the worst practice, yet a lot of big corporations still use this naive approach. You can tell because, for example, when you forget your password and ask them to help you recover it, they are kind enough to send it to you in plaintext.
One recent example of this dreadful practice was a Russian dating website (RussianCupid.com) that exposed 42 million passwords to hackers.
If you are doing this, please change it as soon as possible and follow some of the recommendations in this article.
Another practice is to store passwords using hash functions. The most common ones nowadays are MD5 (avoid it), SHA-1 (avoid it) and SHA-2.
These are one-way mapping algorithms that cannot be reversed: once a plaintext goes into the hash function, there is no way to get it back from the hash. It is like converting a mouse into an elephant; try to reverse that!
Basically, the procedure would be:
1. The user types a password.
2. The system takes that password and hashes it using a hash function like SHA-2.
3. The generated hash is stored in the database.
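The steps above can be sketched in a few lines of Python. This is only an illustration of plain, unsalted hashing, assuming SHA-256 as the SHA-2 variant; the `hash_password` helper and the in-memory `database` dictionary are hypothetical names, not part of any real system.

```python
import hashlib


def hash_password(password: str) -> str:
    """Return the hex-encoded SHA-256 digest of the password (no salt)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Simulated user table: only the hash is stored, never the plaintext.
database = {
    "alice": hash_password("123456"),
    "bob": hash_password("123456"),
}

# Two users with the same password end up with the same stored hash.
same_hash = database["alice"] == database["bob"]
```

Note that `same_hash` is true here: identical passwords always produce identical hashes, which is exactly the weakness that precomputed tables exploit.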
This is obviously a better approach than using plaintext. First of all, if there is a security breach in your system and attackers gain access to the database, they will not learn the passwords themselves. Another advantage is that you will not know them either, and it is better that way. A password is something personal: it is the key the user has to access the system, and it must not be compromised under any circumstance.
Although one-way hash functions cannot be reversed, there are techniques such as brute-force attacks and rainbow tables that can help crack the passwords. The first technique could take a hacker days, months or even years, depending on the strength of the password.
Nonetheless, rainbow tables can really be a nightmare. They are extremely fast and can retrieve a possible password from a hash in seconds. How?
In short, a rainbow table is a precomputed table for reversing cryptographic hash functions. Imagine you are a hacker and have just stolen 38 million hashes from Adobe (wait, that really happened). You can look those hashes up in a rainbow table and, if one matches, the table also holds the string it was hashed from and returns it to you. Easy-peasy, right?
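To make the idea concrete, here is a deliberately simplified sketch. It builds a plain precomputed lookup table (hash to plaintext) rather than a true rainbow table, which uses chains of hashes and reduction functions to save space; the attack principle of "compute once, look up many times" is the same. The password list and the `crack` helper are hypothetical.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hex-encoded SHA-256 digest of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical list of common passwords an attacker hashes ahead of time.
common_passwords = ["password", "123456", "qwerty", "letmein"]

# Precomputed once, reusable against any leaked database: hash -> plaintext.
lookup_table = {sha256_hex(p): p for p in common_passwords}


def crack(leaked_hash: str):
    """Return the plaintext if the hash was precomputed, else None."""
    return lookup_table.get(leaked_hash)


leaked = sha256_hex("123456")  # what the attacker finds in the database
recovered = crack(leaked)      # a single dictionary lookup
unknown = crack(sha256_hex("k7#Vq!x92-long-and-unusual"))
```

Here `recovered` is the original weak password, while `unknown` is `None` because that password was never in the precomputed list.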
One way to fight against this is using “hashing and salting”.
Hashing and Salting
This technique is considered one of the most secure nowadays. It consists of adding “something” (a salt) to the user’s password and hashing the two together.
What is a salt? A salt is a random string (8 bytes minimum) that is generated for each user when they register on your website.
In a nutshell, the procedure would be:
1. The user types a password.
2. The system takes that password, generates a random unique salt for that user and hashes the concatenation of both using a hash function such as SHA-2.
3. The generated hash is stored in the database along with the salt used. The salt can even be stored in plaintext next to it.
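A minimal sketch of this procedure follows, assuming SHA-256, a 16-byte salt from `os.urandom`, and the salt placed before the password in the concatenation (the order is an arbitrary choice made for this example). The `register` and `verify` helpers are hypothetical names.

```python
import hashlib
import hmac
import os


def hash_with_salt(password: str, salt: bytes) -> bytes:
    """Hash the concatenation of salt and password with SHA-256."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def register(password: str):
    """Generate a fresh random salt and return (salt, hash) for storage."""
    salt = os.urandom(16)  # 16 random bytes, above the 8-byte minimum
    return salt, hash_with_salt(password, salt)


def verify(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    return hmac.compare_digest(hash_with_salt(password, salt), stored_hash)


# Two users choose the same weak password but get different salts.
salt_a, hash_a = register("123456")
salt_b, hash_b = register("123456")
```

Because each user gets a different salt, `hash_a` and `hash_b` differ even though the passwords are identical, so a single precomputed table no longer covers both.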
But… knowing the salt, can a hacker retrieve the password?
In practice, the answer is no. You could hash the salt if you want, but knowing the salt does not compromise the password at all. To mount a rainbow table attack, an attacker would need to build a new table for each salt, which makes no sense at all: it is faster to brute-force each password individually.
Although hashing and salting is a far better approach than simply hashing, it is still vulnerable to dictionary and brute-force attacks. Why? Hash functions like SHA-2 are way too fast, and here that is a downside: an attacker can try a huge number of guesses per second. What is a better approach, then? Adding an iteration count.
Basically, the iteration count technique consists of hashing what has already been hashed, n times over. (Choose a suitable number of iterations; the minimum recommended is 1000.)
1. The user types the password.
2. The system takes that password, generates a random unique salt for that user and hashes the concatenation of both with SHA-2.
3. The system takes the generated hash, concatenates the salt again and hashes the result; this step is repeated n times.
4. When the iterations finish, the result is stored in the database along with the salt.
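The loop described in these steps could look roughly like the following. This is a hand-rolled illustration only, assuming SHA-256 and a 16-byte salt; `stretched_hash` is a hypothetical helper, and for real systems a vetted key derivation function is preferable to writing your own loop.

```python
import hashlib
import os

ITERATIONS = 1000  # the minimum mentioned in the text; tune for your hardware


def stretched_hash(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Hash password+salt once, then re-hash the result with the salt n times."""
    digest = hashlib.sha256(password.encode("utf-8") + salt).digest()
    for _ in range(iterations):
        # Feed the previous digest back in, together with the salt.
        digest = hashlib.sha256(digest + salt).digest()
    return digest


salt = os.urandom(16)
stored = stretched_hash("123456", salt)  # value to store next to the salt
```

Every guess an attacker makes now costs roughly `ITERATIONS` hash computations instead of one.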
This adds extra security to the basic “hashing and salting” technique. Obviously it takes more time (depending on the number of iterations), but while the extra cost of hashing n times instead of once is negligible for a single login, it causes a real computational headache for an attacker who has to try millions of candidate passwords.
An example of this technique is PBKDF2, which stands for Password-Based Key Derivation Function 2. It is a key derivation function that applies a pseudo-random function to the input password along with a salt value, and repeats the process n times to produce a derived key.
PBKDF2 is defined as follows:
DK = PBKDF2(PRF, Password, Salt, c, dkLen)
- PRF is a pseudo-random function
- Password is the user’s password
- Salt is the random salt generated for that user
- c is the number of iterations desired
- dkLen is the desired length of the derived key
- DK is the generated derived key
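In Python, the standard library exposes this function as `hashlib.pbkdf2_hmac`. The sketch below maps each parameter of the definition onto that call; the password, the 16-byte salt and the iteration count of 100,000 are illustrative assumptions, not recommendations from this article.

```python
import hashlib
import os

password = b"correct horse battery staple"  # Password (example value)
salt = os.urandom(16)                        # Salt: random, stored per user
iterations = 100_000                         # c: illustrative value only
dk_len = 32                                  # dkLen, in bytes

# PRF is HMAC-SHA-256, selected by the "sha256" argument.
dk = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=dk_len)

# Store dk together with salt and iterations so the hash can be recomputed
# and compared at login time.
record = {"salt": salt, "iterations": iterations, "dk": dk}
```

Storing the iteration count alongside the salt makes it possible to raise the count later without invalidating existing records.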
As previously mentioned, this kind of technique is much slower than a common hash function such as raw SHA-2. This is an advantage when someone tries to brute-force the password, but it must still be fast enough not to cause a noticeable delay for the user.
To achieve this, the number of iterations must be balanced against the capabilities of the hardware.
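One possible way to find that balance is to time a small probe run on the target machine and scale it up. The sketch below is an assumption-laden illustration: the `calibrate_iterations` helper, the 0.1 second target and the probe size are all hypothetical choices, and a real deployment would measure on production hardware under realistic load.

```python
import hashlib
import os
import time


def calibrate_iterations(target_seconds: float = 0.1,
                         probe_iterations: int = 20_000,
                         minimum: int = 1000) -> int:
    """Estimate how many PBKDF2 iterations fit in target_seconds here."""
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"benchmark-password", salt, probe_iterations)
    elapsed = time.perf_counter() - start
    # Cost grows linearly with the iteration count, so scale the probe result.
    estimate = int(probe_iterations * target_seconds / max(elapsed, 1e-9))
    # Never go below the chosen floor, however slow the machine is.
    return max(estimate, minimum)


iterations = calibrate_iterations()
```

The result varies from machine to machine, which is the point: faster hardware should be given a higher iteration count.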
The pseudo-random function (PRF) typically used with PBKDF2 is HMAC, built on top of one of the hash functions mentioned earlier (MD5, SHA-1 or SHA-2).
HMAC stands for Hash-Based Message Authentication Code and, although it is primarily meant for verifying integrity, it can also be used as a “keyed hash function”.
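The following sketch shows HMAC used as a keyed hash with Python's standard `hmac` module, and then how it relates to PBKDF2: in `hashlib.pbkdf2_hmac` the password acts as the HMAC key. The key, message, password and salt values are made up for the example.

```python
import hashlib
import hmac

key = b"secret-key"       # hypothetical key
message = b"hello world"  # hypothetical message

# HMAC-SHA-256: a hash of the message that also depends on the key.
tag = hmac.new(key, message, hashlib.sha256).digest()
other_tag = hmac.new(b"another-key", message, hashlib.sha256).digest()

# Inside PBKDF2 the password plays the role of the HMAC key. With a single
# iteration and the default 32-byte output, the derived key is one HMAC call
# over the salt followed by the 4-byte big-endian block index 1.
password, salt = b"123456", b"example-salt"
one_round = hashlib.pbkdf2_hmac("sha256", password, salt, 1)
by_hand = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
```

The same message yields different tags under different keys, and `one_round` matches `by_hand`, which shows that PBKDF2 is built from repeated HMAC calls rather than from the bare hash function.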
As you can see, there are many flavours to choose from when storing passwords.
Using only hash functions with nothing else added is not recommended, due to rainbow table attacks.
A good, but not completely secure, approach is a hash function plus a salt.
Finally, using PBKDF2 with HMAC over SHA-2 would be a great choice.
A more powerful alternative to PBKDF2 is bcrypt, which is slower and therefore more resistant to brute-force attacks.
In this article we described some bad techniques used to store passwords and some good alternatives for storing them more securely.
This is what is used today; in the future these techniques may (and will) stop being secure.
Be responsible: users’ data is the most valuable asset a company has.
Secure it wisely!