30 June 2017 Crypto, Hash Robert Muehsig

Creating hashes is a common way to check whether content X has changed without looking at the whole content of X. Git, for example, uses SHA-1 hashes for each commit. SHA-1 itself is a pretty old cryptographic hash function, but in the case of Git there might have been better alternatives available, because the “to-be-hashed” content is not crypto relevant - it’s just a content marker. Well… in the case of Git the current standard is SHA-1, which works, but a ‘better’ way would be to use non-cryptographic functions for non-crypto purposes.
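To make the "has X changed?" idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the helper name `fingerprint` and the sample byte strings are made up for this example, and this is not how Git computes its object IDs internally.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a short, fixed-size marker (hex SHA-1 digest) for the content."""
    return hashlib.sha1(content).hexdigest()


# Hypothetical sample data: two versions of the "same" content.
old = fingerprint(b"hello world")
new = fingerprint(b"hello world!")

# Comparing the two 40-character markers is enough to detect the change;
# the full contents never have to be compared byte by byte.
changed = old != new
```

Only the stored marker needs to be kept around, which is why this pattern shows up in caches, sync tools and version control.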

Why you should not use crypto hashes for non-crypto purposes

I discovered this topic via a Twitter conversation, which started with this tweet:

Clemens Vasters then joined in and pointed out why it would be better to use non-crypto hash functions:

The reason makes perfect sense to me - next step: what other choices are available?

Non-cryptographic hash functions in .NET

If you google around, you will find many different hashing algorithms, such as Jenkins or MurmurHash.
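To give a feeling for how small such an algorithm can be, here is a sketch of 32-bit FNV-1a in Python. FNV-1a is not one of the algorithms named above; it is swapped in here only because it fits in a few lines. The function name `fnv1a_32` is made up for this example, and this is not the API of any .NET library.

```python
# Standard parameters of the 32-bit FNV-1a hash.
FNV_OFFSET_BASIS_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """Compute the 32-bit FNV-1a hash of a byte string."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte                             # mix in the next byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF   # multiply, keep the low 32 bits
    return h


# Same input gives the same marker, different input (very likely) a different one.
same = fnv1a_32(b"content X") == fnv1a_32(b"content X")
differs = fnv1a_32(b"content X") != fnv1a_32(b"content Y")
```

One XOR and one multiplication per byte is all it does, which hints at why such functions can be fast. The flip side is that they offer no protection against someone deliberately constructing collisions, so they are only suitable when the hash is a content marker and nothing more.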

Sean Feldman, who more or less started the Twitter discussion, mentioned a very good library for .NET developers:

The author of this awesome package is Brandon Dahler, who created .NET versions of the best-known algorithms and published them as NuGet packages.

The source and everything can be found on GitHub.

Lessons learned

If you want to hash something and it is not crypto relevant, then it would be better to look at one of the Data.HashFunction packages - some are pretty crazy fast.

I’m not sure which one is ‘better’ - if you have some opinions please let me know. Brandon created a small description of each algorithm on the Data.HashFunction documentation page.

(my blogging backlog is quite long, so I needed 6 months to write this down ;) )

Written by Robert Muehsig
