MD5 in PHP works exactly as it should (don’t believe the hype!)
Hey guys,
I’m afraid today I have to confess to an almighty amount of stupidity, but first - please allow me to set the record straight so that nobody ever makes that mistake again:
The PHP MD5 function produces the same hashes as any other MD5 function written in any other language ever devised.
Please don’t pay any heed whatsoever to the whisperings that abound on the wonderful interweb. Hmmm… Maybe ‘abound’ isn’t the right word, which is exactly what should have sent the rumour-alarm clanging away in my head. Truth is, there is a small smattering of references to this problem, but often people (like me) work out that it was a problem with their own code, which they had attributed to the ‘encoding problem’ because it’s easier to ‘blame their tools’.
As was pointed out to me:
md5 always takes the argument as a bit vector rather than a string of letters, i.e. no encoding matters. If your script is written in ISO-8859-15 and you passed an embedded string literal to md5(), the result is the hash of an ISO-8859-15 string.
Y’know what? It’s true! When I did a bit more debugging I found that I was inserting invisible whitespace into the string I tried hashing. Whitespace is as visible as any other character to the MD5 function - the hash of ‘ hashtext’ (notice the leading space) will therefore be different to the hash of ‘hashtext’. Nothing to do with utf7 or utf8!
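The effect is easy to reproduce. The sketch below uses Python's standard `hashlib` rather than PHP, purely as an illustration; the helper name `md5_hex` is made up for this example and the choice of UTF-8 is an assumption. It hashes the two strings mentioned above and shows that printing the `repr` of the input makes stray whitespace visible.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 hex digest of text encoded as UTF-8 (hypothetical helper)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

clean = "hashtext"
padded = " hashtext"  # note the leading space

# The two inputs differ by one byte, so the digests differ.
print(md5_hex(clean) == md5_hex(padded))          # False

# repr() shows the exact characters, which helps spot invisible whitespace.
print(repr(padded))                               # ' hashtext'

# Trimming before hashing makes the digests agree again.
print(md5_hex(clean) == md5_hex(padded.strip()))  # True
```

Dumping the exact input on both sides of a mismatch is usually the quickest way to find out whether the data, rather than the hash function, is at fault.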
And guess what? It’s not just me… On Experts Exchange I found a user with a similar problem in Java. He later explained that in his case a string wasn’t being lowercased prior to hashing:
Hello. Have got access to the php code now and can see that the php programmer did not actually follow the specification (did not make all chars to lowercase before md5…) Sorry to have bothered u with this, was extremely painful to sort out the bug when I could not see the php code.
But this sort of response is never publicised in the same way. These answers, these non-problems, are always buried as apologetic admissions of bad development practice. I want to put an end to this, and by publishing this post I hope to nip this slowly spreading rumour in the bud.
Go forth and spread the good news - The PHP MD5 is not dead. Long live (urm) the PHP MD5 function!
Tom x
August 4th, 2008 at 2:45 pm
I’m not sure anyone claimed MD5 was dead in PHP or even that it was producing incorrect results. It was purely stated that the results were inconsistent with another language (in my case .NET).
PHP MD5 might not take any consideration of encoding, but the .NET MD5 function clearly does. For that reason, if you want them to match, you need to make sure your string is encoded correctly in .NET to give a consistent result. Neither the PHP nor the .NET MD5 is broken, and I suspect the Java one works correctly as well!
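To make the encoding point concrete, here is a minimal sketch in Python (not PHP or .NET) using the standard `hashlib`. The sample word and the three encodings are chosen only for illustration; `utf-16-le` is included because it is the byte layout that .NET's `Encoding.Unicode` produces. MD5 only ever sees bytes, so the same text encoded differently gives different digests.

```python
import hashlib

text = "caf\u00e9"  # "café": one non-ASCII character

encodings = ["utf-8", "iso-8859-15", "utf-16-le"]

digests = {}
for name in encodings:
    data = text.encode(name)  # the bytes MD5 will actually see
    digests[name] = hashlib.md5(data).hexdigest()
    print(name, len(data), digests[name])

# Pure ASCII text has identical bytes in UTF-8 and ISO-8859-15,
# so those two digests agree and the difference goes unnoticed.
ascii_same = (
    hashlib.md5("cafe".encode("utf-8")).hexdigest()
    == hashlib.md5("cafe".encode("iso-8859-15")).hexdigest()
)
print(ascii_same)  # True
```

This is also why such mismatches often only surface once non-English text enters the system.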
P.S. I also love Tea
August 4th, 2008 at 3:00 pm
No-one claimed that MD5 was dead in PHP - in fact it’s easy to think of tons of instances when MD5 (or any hashing) is useful without escaping a particular language.
As someone who has never used .NET, I couldn’t possibly comment on how the MD5 function works there. Furthermore, as you clearly are someone who uses .NET, I’m prepared to take your word on this issue.
The key reason for me posting this entry wasn’t to bash your entry though, more that yours was a great example of how an innocuous post can start bizarre rumours which send clumsy programmers (myself included) on wild goose chases.
I’d like to think that the next time someone searches for this sort of problem they won’t just find your post and a collection of misguided forum posts, but that maybe they’ll find mine and examine their code a little more closely.
August 4th, 2008 at 4:12 pm
I’m not entirely sure what your post is saying though. From what I can gather, your point is this:
“Check the data you are hashing is correct.”
There are plenty of people around the web saying this already, as I had to wade through them all to get to the real problem.
August 4th, 2008 at 4:42 pm
In terms of this specific example - you’ve pretty much nailed it. “Check the data you are hashing is correct.” is a pretty good summary of how to avoid the problem of inconsistent hashes. The only thing I’d add to that would be that it’s aimed at a very specific audience; those who think their inconsistent hashes are due to encoding problems.
In terms of the general situation though, my post is meant as an example of solution sharing rather than problem sharing. We see a lot of ‘problem sharing’ on the internet, but the trouble with this is that often the problem is only very loosely connected to the solution. This quickly seeds an avalanche of forum posts which each associate a problem with this new cause.
Sorry the post wasn’t very clear, I normally use this space just to collect my notes…
As for tea… I really ought to put something on that page soon!
August 4th, 2008 at 4:53 pm
As long as everyone’s hashes are correct I guess we’ve collectively achieved something
September 8th, 2008 at 12:21 am
I had the problem where the md5 hashes generated in PHP and Delphi were different. Turns out it wasn’t any sort of error with the input data. I found the solution on PHP.net in the comments - this worked for me: http://nz2.php.net/manual/en/function.md5.php#77030
The hash generated by PHP is correct but represented differently to how it is in Delphi. The PHP hash being generated was 32 chars long, and in Delphi only 16.
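One plausible reading of the 32 versus 16 difference, assuming the Delphi side returned the raw digest: an MD5 digest is 128 bits, which is 16 raw bytes or 32 hexadecimal characters. PHP's `md5()` returns the hex form by default and the raw 16 bytes when its optional second argument is true. The sketch below shows the same two forms using Python's `hashlib`, purely as an illustration.

```python
import hashlib

h = hashlib.md5(b"hashtext")

raw = h.digest()          # 16 raw bytes (128 bits)
hex_text = h.hexdigest()  # the same 128 bits as 32 hexadecimal characters

print(len(raw))       # 16
print(len(hex_text))  # 32

# Converting one form into the other shows they carry the same value.
print(raw.hex() == hex_text)           # True
print(bytes.fromhex(hex_text) == raw)  # True
```

When comparing hashes across languages, it helps to convert both sides to the same representation (and the same letter case for hex) before deciding they differ.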
December 14th, 2008 at 7:59 pm
I am using Arabic (I think the problem will be there for other languages too).
I am trying to build a login based on vBulletin. It’s working for English, but for Arabic it’s not working. I tried all types of encoding
(it’s with the Arabic Windows encoding “windows-1256”).
The following code compares the DB password with the hashed password; they should match:
// "y)@" is the salt
string password = "كودلاب";
foreach (EncodingInfo enc1 in Encoding.GetEncodings())
    foreach (EncodingInfo enc2 in Encoding.GetEncodings())
        //foreach (EncodingInfo enc3 in list3)
        if (Md5Hash(Md5Hash(password, enc1.GetEncoding()) + "y)@", enc2.GetEncoding()) == "acfae8024d61fe3697203fbf0fc6e6ed")
            MessageBox.Show(enc1.Name + " - " + enc2.Name);
Can anybody help?