SHA-3 Step-by-Step Visualization

Github
(–) (+)

What is a hash function?

A hash function is a simple function that takes in a number and returns a number. The input is a number that can be as large as desired (for example from -∞ to +∞). The output is a number that must be bounded (for example from 0 to 10). Any function that satisfies these properties is called a hash function.

As an example, let's consider a hash function called f. Can you guess the rule of this hash function?

f(5) = 5

f(15) = 5

f(24005) = 5

f(11113) = 3

f(111222333444555666) = 6

As seen in the examples above, f can take in integers as large as desired, and outputs numbers from 0 to 10 only. The rule is that f simply returns the last digit of the number. Thus the general formula of f would be:

f(x) = x mod 10

(–) (+)

What is a cryptographic hash function?

A hash function is called cryptographic or secure when it's hard to find collisions in the hash function. A collision is found when we discover two inputs that result in the same output. Let's take the f function that we described previously. f(5) and f(15) both result in 5, so f is said to have collisions. This makes f unsecure. For cryptographic applications, we desire our hash functions to be secure.

Now let's introduce another function, called g, which is secure. This function outputs numbers from 0 to 10, just like f. Can you guess the rule of this hash function?

g(5) = 2

g(15) = 3

g(24005) = 6

g(11113) = 1

g(111222333444555666) = 7

Unlike f, this function doesn't have an apparent rule. Say we want to find collisions in g by evaluating the function with strategically chosen inputs. Since we don't know the rule, we evaluate g with random inputs. At best, on the very first try we'll find a collision. At worst, after 5 more tries we'll find a collision. In general, if a hash function outputs numbers from 0 to N, if it takes N tries to find a collision, that function is considered secure.

(–) (+)

What is SHA-3?

SHA-3, or Secure Hashing Algorithm 3, is a group of hashing functions that are proven to be secure, and are used in cryptographic applications like Ethereum. National Institute of Standards and Technology (NIST) organized a public competition in 2006 to discover the best hashing functions. In 2012, Keccak Algorithm was selected as the winner of the competition. In 2015, Keccak was standardized under the name SHA-3.

There are 6 variants of the function family in SHA-3: sha3-224, sha3-256, sha3-384, sha3-512, shake128, and shake256. The difference is in the upper bound of the hash. For example, sha3-224 outputs 224-bit hashes, that is, numbers from 0 to 2^224. The number is so large that it's practically impossible to find a collision.

Shake128 and Shake256 are also known as Extended Output Functions (XOF). The upper bound of these functions is a parameter that is up to the user.

(–) (+)

What are other useful resources about SHA-3?

SHA-3 Explained in Plain English (Another explainer)

NIST FIPS 202 (The standard itself)

I want to calculate a hash that is bits long. The input will be .

Obtaining the bit array

Normally, SHA-3 is defined as a function whose input and output are an array of bits. Since our input is a string, first we must translate the string into an array of bits. There are many different encodings, but the most widely used one is UTF-8. First we get the bytes of each letter:

Next, we combine the bytes such that the LSB bits of each byte (rightmost ones) comes first.

Let's first turn the nibbles (0-f) and bits to binary. Then combine everything together.

Note that normally bytes are combined such that LSB bits of each byte (rightmost one) comes first, but here we combine nibbles MSB-first, which produces different results.

Files are like a sequence of bytes. First, we get all the bytes in the file, and write them in terms of bits.

Next, we combine the bytes such that the LSB bits of each byte (rightmost ones) comes first.

Note that it's okay to have obtained no bits at this point. This happens when an empty string is input, or a file is empty.

Padding the bits and splitting in blocks

Padding is the process of adding redundant bits to an array to make it satisfy a certain condition. For SHA-3, the general procedure is as follows:

bits + (01 or 1111) + 1 + n zeros + 1

SHA-3 variant r (rate)
sha3-224 1152 bits
sha3-256 1088 bits
sha3-384 832 bits
sha3-512 576 bits
shake128 1344 bits
shake256 1088 bits

In order to find out how many zeros to put, we do

length(bits) + length(01 or 1111) + length(1) + length(n zeros) + length(1)
= + + 1 + n + 1

= + n

Now we should find the smallest n that satisfies the following:

( + n ) is a multiple of

n = zeros

Once we do the padding as explained above, we end up with a bit array whose length is divisible by r=. Now we can split the array in r-bit-long blocks. The blocks are listed below. Padding bits are shown in green.

Introduction to the State

Other than the input blocks, we also have the state. The state is a series of 1600 bits that we will use as our scratchpad to calculate the hash. Initially, all 1600 bits are set to 0. Throughout this guide, the states will be shown as 25 lanes by 64 slices. The initial state:

    Absorbing Phase

    Now that we have our input block(s) and our state (initially all zeros), we can proceed to the absorbing phase of the calculations. Absorbing phase can be summarized as:

    Tip: Hover over each bit in the resulting state to see how it's calculated.

    Input block , step .

    In the Xor step, we xor bits of the current input block and the state, from the first bit onwards. The state is always longer than the input block, so the last few bits will remain unchanged. The input block:


    The state before:

    The state after:

    In the Theta step, 5 bits from the same slice, 5 bits from the neighboring slice and the original bit are all xor'ed together. An easier way to reason about this is to sum up the highlighted bits, and see if the result is odd.

    The state before:

    The state after:

    In the Rho step, the selected lane is rotated to the right by a certain amount, which depends on which lane it is. The amount of rotation is 0, 1, 62, 28, 27, 36, 44, 6, 55, 20, 3, 10, 43, 25, 39, 41, 45, 15, 21, 8, 18, 2, 61, 56, 14, respectively.

    The state before:

    The state after:

    In the Pi step, lanes are reordered. The new orders of the lanes are 1, 7, 13, 19, 25, 4, 10, 11, 17, 23, 2, 8, 14, 20, 21, 5, 6, 12, 18, 24, 3, 9, 15, 16, 22, respectively. That is, the first lane remains in its spot, while the second lane becomes the 7th lane, and so on.

    The state before:

    The state after:

    In the Chi step, first we divide the state into 5-lane groups, then pick the selected bit and two bits beneath it in the same group. The result is calculated according to green xor ( (not yellow) and red ).

    The state before:

    The state after:

    In the Iota step, only the first lane is changed, while the rest is untouched. The bits of the first lane are xor'ed with a constant, which is given below. Note that the constant that's used depends on the round.


      The state before:

      The state after:

      Squeezing Phase

      For non-shake variants, the squeezing part is pretty simple. First, take the first d bits out of the final state. Group the bits as 8-bit bytes. The values of d are given below:

      SHA-3 variant d (output)
      sha3-224 224 bits
      sha3-256 256 bits
      sha3-384 384 bits
      sha3-512 512 bits

      The final state:

        Finally, reverse the bits in each 8-bit group, so that the leftmost bits become rightmost ones, and vice versa.

        We can also write it in hex notation:

        And that's our final result!

        Squeezing Phase

        For the shake128 and shake256 variants, we do the following:

        The length of the output is fixed for non-shake variants of SHA-3, whereas for shake128 and shake256, the output length (d) is determined by the user. In our case, you chose it to be up here. The value of r is fixed and is defined as:

        SHA-3 variant r (rate)
        shake128 1344 bits
        shake256 1088 bits

        Output block , step .

        In the Take step, we take the first r bits (for it's ) and append them to the output array. The size of output array has grown from to . The state remains unchanged.

        The size of the output array hasn't reached our desired size yet (d= bits), so we proceed with the Theta-Rho-Pi-Chi-Iota steps.

        The size of the output array has reached and passed our desired size (d= bits), so we don't need to do the Theta-Rho-Pi-Chi-Iota steps anymore. We remove the extra -= bits.

        The state:

        The state after:

        In the Theta step, 5 bits from the same slice, 5 bits from the neighboring slice and the original bit are all xor'ed together. An easier way to reason about this is to sum up the highlighted bits, and see if the result is odd.

        The state before:

        The state after:

        In the Rho step, the selected lane is rotated to the right by a certain amount, which depends on which lane it is. The amount of rotation is 0, 1, 62, 28, 27, 36, 44, 6, 55, 20, 3, 10, 43, 25, 39, 41, 45, 15, 21, 8, 18, 2, 61, 56, 14, respectively.

        The state before:

        The state after:

        In the Pi step, lanes are reordered. The new orders of the lanes are 1, 7, 13, 19, 25, 4, 10, 11, 17, 23, 2, 8, 14, 20, 21, 5, 6, 12, 18, 24, 3, 9, 15, 16, 22, respectively. That is, the first lane remains in its spot, while the second lane becomes the 7th lane, and so on.

        The state before:

        The state after:

        In the Chi step, first we divide the state into 5-lane groups, then pick the selected bit and two bits beneath it in the same group. The result is calculated according to green xor ( (not yellow) and red ).

        The state before:

        The state after:

        In the Iota step, only the first lane is changed, while the rest is untouched. The bits of the first lane are xor'ed with a constant, which is given below. Note that the constant that's used depends on the round.


          The state before:

          The state after:

          Here are all the bits we obtained in the last step:

          Finally, reverse the bits in each 8-bit group, so that the leftmost bits become rightmost ones, and vice versa.

          We can also write it in hex notation:

          And that's our final result!