MySQL 4.1.x authentication internals
While reviewing the deployment plans for a new MySQL server yesterday, I started wondering about the security of the authentication protocol it uses. Most of the users of the new database would be accessing it via the LAN, without the benefit of SSL encryption. It's well known that the MySQL 3.x authentication protocol is vulnerable to sniffing attacks, so MySQL 4.1 introduced a new authentication scheme designed to be more secure even over a plaintext session. But was it good enough?
I first spent some time to understand exactly how the protocol works. Thanks to some kind folks in the #mysql channel on irc.freenode.net, I located the MySQL internals documentation page the describes the authentication protocol. Unfortunately, this page doesn't really describe what's going on in the 4.1 protocol, so I turned to the source code for the definitive scoop.
Just so no one else has to do this again, here's a better description of how the protocol works, based on the 5.0.19 release. Look in libmysql/password.c at the scramble() and check_scramble() functions if you want to check the source code for yourself.
The first thing to know is that the MySQL server stores password information in the "Password" field of the "mysql.user" table. This isn't actually the plaintext password; it's a SHA1 hash of a SHA1 hash of the password, like this: SHA1(SHA1(password)).
When a client tries to connect, the server will first generate a random string (the "seed" or "salt", depending on which document you're reading) and send it to the client. More on this later.
Now the client must ask for the user's password and use it to compute two hashes. The first hash is a simple hash of the user's password: SHA1(password). This is referred to in the code as the "stage1_hash". The client re-hashes stage1 to obtain the "stage2_hash": SHA1(stage1_hash). Note that the stage2 hash is the same as the hash the server stored in the database.
Next, the client hashes the combination of the seed and the stage2 hash like so: SHA1(seed + stage2_hash). This temporary result is then XORed with the stage1 hash and transmitted to the server.
Now the server has all the information it needs to know whether the user supplied the correct password. Remember that the server stores the stage2_hash in the database, and it originally created the seed, so it knows both of those. What it wants to recover is the stage1_hash. Fortunately, the data transmitted from the client is simply the seed and the stage2 hash, XORed with the stage1 hash. To recover the stage1 hash, the server simply has to recompute the hash SHA1(seed + stage2_hash), then XOR it with the data provided by the client. The result will be the stage1 hash. Since the stage2_hash is simply SHA1(stage1_hash), the server can simply compute a new stage2_hash and compare it to the hash in the database. If the two hashes match, the user provided the proper password.
Ok, I admit that was probably a little confusing, but here's where I think it gets really interesting. There is a flaw in this protocol. The security depends on keeping two pieces of information a secret from attackers: the stage1 and stage2 hashes. If the attacker knows both hashes, they can construct a login request that will be accepted by the server.
Stage2 hashes are protected by keeping the database files away from prying eyes, but if an attacker were to steal the database backup tapes, for example, the hashes for all database accounts would be in his possession.
Still, just having the stage2 hashes is not enough. The final piece of the client authentication credential is XORed with the stage1 hash, yet the server doesn't know the stage1 hash. Therefore, the protocol provides it with all the information it needs to know in order to derive the stage1 hash. Unfortunately, that means that the same information could be derived by anyone who could sniff a valid authentication transaction off the wire and who also possessed a copy of the stage2 hash.
So the weakness is that if an attacker has the stage2_hash stored in the db, and can observe a single successful authentication transaction, they can recover the stage1_hash for that account. Using both hashes, they can then create their own successful login request, even without knowing the user's actual password.
I've confirmed this with the MySQL team, who say that this is a known issue. It is difficult to exploit, I think, since you need a fair amount of information in order to make the attack successful. Frankly, if you already have the password hashes, you probably also already have the rest of the info in the database as well, so this might be overkill. Still, I had fun working through this last night, so I thought others might enjoy it as well. At the least, maybe I'll save someone else the time I spent understanding how the protocol works.
If this vulnerability really concerns you, you can protect the stage1 hash information by using MySQL's built-in SSL session encryption support. You should probably also be encrypting your backup tapes to protect the stage2 hash and all the data.