Your guide to cryptographic hash functions


Cryptographic hash functions play essential roles in the security landscape. On their own, they can verify the integrity of data, and they are also crucial building blocks of hash-based message authentication codes (HMACs) and digital signatures. HMACs allow us to verify the integrity and authenticity of data, while digital signatures let us verify integrity and authenticity and also provide non-repudiation.

What are cryptographic hash functions?

Cryptographic hash functions can take inputs of any length while always outputting a fixed-length string, which is known as the hash. These functions are deterministic, which means that the output for a given input is always the same. Cryptographic hash functions are also one-way, which means that it’s infeasible to figure out the original input with only the output. SHA-256 is one of the most common cryptographic hash functions, but there are a number of other useful cryptographic hash functions, including the SHA-3 family.

If we want our systems to be cryptographically secure, then we need to use appropriate cryptographic hash functions. We can’t just use the ordinary hash functions that we use in contexts like data storage and retrieval, nor can we use insecure cryptographic hash functions like MD5. For a cryptographic hash function to be secure and useful, it must feature each of the following properties:

  • It must be able to accept variable-length inputs, whether they are just a few bits or a large file.
  • It must always output fixed-length strings. As an example, SHA-256 always has an output of 256 bits, no matter what the size of the input.
  • It must be quick to compute.
  • It must be deterministic.
  • It must be collision resistant. This means that it needs to be infeasible to find any two different inputs that produce the same hash.
  • It must be preimage resistant. This is the one-way property, which means that it should be fast and easy to compute a hash from an input, but it should be infeasible to take a hash and then compute what the original input must have been.
  • It must be second preimage resistant. This means that if we start with a specific message, it should be infeasible to find another message that also produces the same hash.
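
As a rough illustration of why an ordinary checksum does not meet these requirements, the sketch below compares a deliberately weak, hypothetical function (`toy_checksum`, which is not from any real library) with SHA-256 from Python's standard `hashlib` module. The two messages are made up for the example. Because the toy function only adds up bytes, reordering the bytes of a message yields a second message with the same output, which breaks second preimage resistance.

```python
import hashlib


def toy_checksum(data: bytes) -> int:
    """Hypothetical weak hash for illustration: sum of all bytes modulo 256."""
    return sum(data) % 256


# Two made-up messages containing the same bytes in a different order.
original = b"pay alice 10"
forged = b"pay alice 01"

# The toy checksum cannot tell the two messages apart.
weak_collision = toy_checksum(original) == toy_checksum(forged)

# SHA-256 produces different digests for the two messages.
strong_collision = (
    hashlib.sha256(original).digest() == hashlib.sha256(forged).digest()
)

print(weak_collision, strong_collision)  # prints "True False"
```

The toy function is deterministic, quick to compute and has a fixed-length output, so it satisfies the first four properties. It is the last three properties that separate a cryptographic hash function from an ordinary one.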

How do cryptographic hash functions work?

It can be difficult to understand cryptographic hash functions through these abstract properties, so let’s run through some examples of how they work with an online tool. If we run the number ‘1’ through the SHA-256 algorithm it gives us the following 256-bit hash as the output:

6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b
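
We don't need an online tool to reproduce this result. The minimal sketch below uses Python's standard `hashlib` module and assumes that the input is the single ASCII character '1' with no trailing newline, which is how such tools typically treat typed text.

```python
import hashlib

# Hash the single character '1', encoded as one ASCII byte.
digest = hashlib.sha256(b"1").hexdigest()
print(digest)
# 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b

# Hashing the same input again gives the same result.
again = hashlib.sha256(b"1").hexdigest()
print(digest == again)  # True

# 64 hexadecimal characters at 4 bits each is 256 bits.
print(len(digest) * 4)  # 256
```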

Every single time we run a ‘1’ through the SHA-256 algorithm, it will give us this specific hash, which shows that cryptographic hash functions are deterministic. If we run a sentence, ‘What are cryptographic hash functions?’, through the algorithm, we get a different hash, but it’s still 256 bits long:

14634df8f61ee6e44680ea308e712566ee4191bab8e5ffd6cf2f1a21f08c8086

This helps to show how cryptographic hash functions accept variable-length inputs and always output fixed-length strings. We could put an entire encyclopedia through the algorithm and still end up with a 256-bit string, just as a single digit did. Let’s go a step further and change just a single letter of our sentence. The altered sentence gives us:

5bb87dc9fc6749e66d3f79cc0010ede3e3c0cb10bbccbe93dcb6d1beaf5507fd

With just this subtle one-letter change, we end up with a completely different hash that bears no resemblance to the prior one. This behavior, often called the avalanche effect, is a very deliberate part of the algorithm’s design. If a subtle change resulted in a similar hash, it is unlikely that the hash function would meet the collision resistance, preimage resistance and second preimage resistance requirements.
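
One way to put a number on "completely different" is to count how many of the 256 output bits differ between two hashes. The sketch below does this with Python's standard `hashlib` module. The helper name `sha256_bits_differing` and the altered sentence are assumptions made for illustration, and the altered sentence is not necessarily the one used to produce the hash above.

```python
import hashlib


def sha256_bits_differing(a: bytes, b: bytes) -> int:
    """Count the bit positions in which the SHA-256 hashes of a and b differ."""
    hash_a = int.from_bytes(hashlib.sha256(a).digest(), "big")
    hash_b = int.from_bytes(hashlib.sha256(b).digest(), "big")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(hash_a ^ hash_b).count("1")


sentence = b"What are cryptographic hash functions?"
# Hypothetical one-letter change, chosen only for this example.
altered = b"What are cryptographic hash junctions?"

diff = sha256_bits_differing(sentence, altered)
print(f"{diff} of 256 bits differ")

# Identical inputs differ in zero bits, because the function is deterministic.
print(sha256_bits_differing(sentence, sentence))  # 0
```

For a well-designed hash function we would expect roughly half of the bits, somewhere around 128, to differ for any pair of distinct inputs, no matter how similar the inputs are.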

Using cryptographic hash functions

The properties of hash functions make them useful for verifying data integrity. One example is in software distribution. We can write our software, and then run the code through a cryptographic hash algorithm to produce a hash. If we display the hash on our website, users can download the code, run it through the same hash function, and then compare the result with the one displayed on our website. If the two results match, then the code users downloaded is the same as the code that we uploaded, and an attacker hasn’t meddled with it to spread malware.
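
The check on the user's side could look something like the sketch below. It assumes the publisher uses SHA-256 and publishes the hash as a hexadecimal string. The function names, the file name and the placeholder hash in the usage note are hypothetical.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hash of a file as a hexadecimal string."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so that large downloads don't have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare a file's SHA-256 hash with the hash shown by the publisher."""
    actual = sha256_of_file(path)
    # Normalize whitespace and letter case before comparing the hex strings.
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

A user would call something like `matches_published_hash("installer.bin", "<hash copied from the website>")` and only run the download if the result is `True`.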

When we use hash functions to build HMACs and digital signatures, they open up a variety of interesting ways to secure our communications. But that’ll have to wait for some other time.


