CRC32: Maths and programming

This page presents the CRC32 algorithm and describes the mathematics behind it. I am not a mathematician nor an informatics specialist, so this is what I have I learned from various sources (freeware CRC calculators -I liked Mark Adler's zlib and its commentation- and articles). I recommend reading Sarwate D.V, Computation Of Cyclic Redundancy Checks Via Table Look-Up, Communications of the ACM, v.31 n.8, p.1008-1013, Aug. 1988, though I trust there are many more articles on the subject.

Intoduction: Bits and polynomials

A polynomial with coefficients 0 and 1 can be represented as a bit stream, and vice versa:

Let us consider a polynomial a0 + a1x + a2x2 + ... + anxn belonging to Z2[x]. This means that ai belongs to {0,1}. The coefficients a0, a1, a2, ..., an can be represented in binary format. For example, 1 + x2 + x4 + x7 is actually 1 + 0x + x2 + 0x3 + x4 + 0x5 + 0x6 + x7, its coefficients are 1,0,1,0,1,0,0,1 and can be represented as the binary number 10101001 (= A9h). Note that the highest power (x7) is represented in the right-most (least significant) digit. Therefore, every bit stream may be considered as a polynomial of Z2[x], and vice versa.

Polynomial addition is performed by XOR:

Let us define first addition modulo m. Remember that a mod b is the remainder of the division a/b [r=a mod b means that a=kb+r, where r is less than b]. Now to add h and g modulo m, we take the remainder of the division (h+k) / m. For example, the addition of 35 and 42 modulo 50 is (35+42) mod 50=77 mod 50=27.

That settled, we go back to Z2, and see what happens when we apply addition modulo 2. 0+0 = 0, 0+1 = 1, 1+0 = 1, but 1+1 = 0 because (1+1) mod 2 = 2 mod 2 = 0. This is exactly the behavior of eXclusive-OR in binary digits. So, we can add polynomials in Z2(x) simply by performing XOR operations. Here is an example, where we add 1 + x2 + x5, and x + x2 + x7 to get 1 + x + x5 + x7. Note that x2 + x2 = 0. This is XOR addition.

1 + 0x + 1x2 + 0x3 + 0x4 + 1x5 + 0x6 + 0x7 = 10100100, add

0 + 1x + 1x2 + 0x3 + 0x4 + 0x5 + 0x6 + 1x7 = 01100001, equals

1 + 1x + 0x2 + 0x3 + 0x4 + 1x5 + 0x6 + 1x7 = 11000101

Multiplication by x is performed by SHR:

How about multiplication by x? This means that 1 + x will become x + x2, or that 1 + x + 0x2 becomes 0 + x + 1x2. If we use 8 bits, this is represented as 11000000 which becomes 01100000. Actually, we just shifted right by 1 position, which can be done easily by the shr command.

What we have up to now: Every bit stream can be considered a polynomial, XOR can be used for addition, SHR can be used for multiplication.

What is CRC32

CRC is the remainder of the division of our data (seen as a polynomial) by the polynomial c(x):

CRC32 is a 32-bit Cyclic Redundancy Check code, used mainly as an error detection method during data transmission. If the computed CRC bits are different from the original (transmitted) CRC bits, then there has been an error in the transmission. If they are identical, we can assume that no error occured (there is a chance 1 in 4 billion that two different bit streams have the same CRC32). The idea is that the data bits are treated as a data polynomial and the CRC bits represent the remainder of the division of the data polynomial by a fixed, known polynomial (called the CRC polynomial).

The CRC32 polynomial is c(x) = 1 + x + x2 + x4 + x5 + x7 + x8 + x10 + x11 + x12 + x16 + x22 + x23 + x26 + x32

We need to find a b(x) so that d(x) = a(x)c(x) + b(x). d(x) is our data, b(x) is our CRC32 and we don't care about a(x). Or, in other words, b(x) = d(x) mod c(x).

How we do it:

We perform our calculation byte after byte.

In our first byte, we have a polynomial with a degree of 7 (at most). For example, 4Ah = 01001010 which is x + x4 + x6. We multiply this by x32 in order to get a polynomial to divide with c(x) [which has a degree of 32].

We procceed with the division as usual. That is, we multiply the divisor [c(x)] with a power of x, so that we get a polynomial of the same degree as the dividend, and we substract this from the dividend. The remainder is our new dividend and we DON'T care about the quotient.

Actually, we use s(x) = x32 mod c(x) to make the calculations easier. s(x) = 1 + x + x2 + x4 + x5 + x7 + x8 + x10 + x11 + x12 + x16 + x22 + x23 + x26.

When we meet the first 1 bit, we set the remainder variable to s(x).

That's because we consider it to be a x32 polynomial and so the remainder x32 mod c(x) is indeed s(x).

For every new bit, we multiply our remainder by x. We perform SHR 1 to our variable:

Now, (what a suprise!) a new bit follows. That means that our previous polynomial was a degree higher. No problem at all! We multiply our remainder variable with x, and the remainder will be now x33 mod c(x).

Make sure you understand this well. We had d(x)=a(x)c(x) + b(x). But we see that d(x) was actually x*d(x). Therefore, x*d(x)=x*a(x)c(x) + x*b(x). We care only about the remainder, so this isn't b(x), its x*b(x).

The bit operation is shr 1.

When we meet a 0 bit, we do nothing.

When we meet a 1 bit we XOR add s(x) to the remainder:

Why that? Well, we have d(x)=a(x)c(x) + b(x). Now we see that we have d(x) + x32. That means that, d(x) + x32=a(x)c(x) + b(x) + x32. Our remainder is x32 + b(x). However, that is not acceptable, as this is a polynomial of a degree equal to our divisor, and as such, can be divided. Now, [x32 + b(x)] mod c(x) = [x32 mod c(x)] + [b(x) mod c(x)] = s(x) + b(x). So, we need to XOR add s(x).

If the discarded bit (after shr 1) is 1, we XOR add s(x):

While multiplying by x (SHR 1) we may step into a remainder of a degree equal to 32. That's when we discard a 1 bit off our (32-bit length) variable. We need to calculate its modulo c(x). So, we have x32 + b(x) [our variable represents b(x) and it doesn't contain x32]. As we have seen above, we only need to XOR add s(x).

Let's see an example: 4Ah = 01001010. This stream is coming. We start from the right:

-Bit- ----Action--- ---------------Remainder----------------------
0 Nothing 0000 0000 0000 0000 0000 0000 0000 0000
Next SHR 1 0000 0000 0000 0000 0000 0000 0000 0000
1 XOR s(x) 1110 1101 1011 1000 1000 0011 0010 0000
Next SHR 1 0111 0110 1101 1100 0100 0001 1001 0000 Discard:0
0 Nothing 0111 0110 1101 1100 0100 0001 1001 0000
Next SHR 1 0011 1011 0110 1110 0010 0000 1100 1000 Discard:0
1 XOR s(x) 1101 0110 1101 0110 1010 0011 1110 1000
Next SHR 1 0110 1011 0110 1011 0101 0001 1111 0100 Discard:0
0 Nothing 0110 1011 0110 1011 0101 0001 1111 0100
Next SHR 1 0011 0101 1011 0101 1010 1000 1111 1010 Discard:0
0 Nothing 0011 0101 1011 0101 1010 1000 1111 1010
Next SHR 1 0001 1010 1101 1010 1101 0100 0111 1101 Discard:0
1 XOR s(x) 1111 0111 0110 0010 0101 0111 0101 1101
Next SHR 1 0111 1011 1011 0001 0010 1011 1010 1110 Discard:1
XOR s(x) 1001 0110 0000 1001 1010 1000 1000 1110
0 Nothing 1001 0110 0000 1001 1010 1000 1000 1110

Or, 9609A88Eh. I have this feeling that you don't believe me, so here is the *painful* method: 01001010 represents x + x4 + x6. Let's calculate the remainder of x32(x + x4 + x6) divided by 1 + x + x2 + x4 + x5 + x7 + x8 + x10 + x11 + x12 + x16 + x22 + x23 + x26 + x32.

We have x32(x + x4 + x6) = x33 + x36 + x38 = x38 + x36 + x33 [= dividend]

c(x)=x32 + x26 + x23 + x22 + x16 + x12 + x11 + x10 + x8 + x7 + x5 + x4 + x2 + x1 + x0 [= divisor]

Now we substract (xor addition) x6c(x):

x6c(x)=x38 + x32 + x29 + x28 + x22 + x18 + x17 + x16 + x14 + x13 + x11 + x10 + x8 + x7 + x6

We get (x38 + x38) + x36 + x33 + x32 + x29 + x28 + x22 + x18 + x17 + x16 + x14 + x13 + x11 + x10 + x8 + x7 + x6

However, x38 + x38 = 0, so we get this:

x36 + x33 + x32 + x29 + x28 + x22 + x18 + x17 + x16 + x14 + x13 + x11 + x10 + x8 + x7 + x6

Same thing: xor add x4c(x):

x4c(x)=x36 + x30 + x27 + x26 + x20 + x16 + x15 + x14 + x12 + x11 + x9 + x8 + x6 + x5 + x4

Result: x33 + x32 + x30 + x29 + x28 + x27 + x26 + x22 + x20 + x18 + x17 + x15 + x13 + x12 + x10 + x9 + x7 + x5 + x4

xor add xc(x):

xc(x)=x33 + x27 + x24 + x23 + x17 + x13 + x12 + x11 + x9 + x8 + x6 + x5 + x3 + x2 + x1

Result: x32 + x30 + x29 + x28 + x26 + x24 + x23 + x22 + x20 + x18 + x15 + x11 + x10 + x8 + x7 + x6 + x4 + x3 + x2 + x1

And the last one: xor add c(x) (only result displayed)

Result: x30 + x29 + x28 + x24 + x20 + x18 + x16 + x15 + x12 + x6 + x5 + x3 + x0

Or, in binary, 1001 0110 0000 1001 1010 1000 1000 1110 = 9609A88Eh

You may check this out using a7x7 + a6x6 + a5x5 + a4x4 + a3x3 + a2x2 + a1x + a0 and proove that these two operations do exactly the same (or even use anxn + an-1xn-1 + ...). I'm not going to.

Enough! I believe you! Now what?

We are going now to create a table of all possible 8-bit combinations (that is all possible bytes) and use it as reference, so that we can process the signal byte-by-byte. This is easy, we just use the algorithm described for all values from 0000000 to 11111111 (that is 0 to 255). The table holds the remainder of each value divided by c(x).

The table is quite small (256 32-bit values) for modern computers and programs, so you may as well get it ready in your program and you won't have to calculate it everytime. Now, let's see what happens as the stream comes. We hold CRC in a 32-bit variable.

When the first byte comes, crc=table[byte]:

That's easy, we check it out from the table. So, crc = table[byte].

We add the next byte to the 8 rightmost bits of crc, and use them as dividend:

This new byte represents a polynomial 8 degrees lower than the one we used as a dividend (the previous byte). Now, our remainder (of the previous calculation) is of the same degree with the polynomial this new byte represents. So, we add the byte to our variable and we use the result as the new dividend. Actually, we use only the 8 rightmost bits (we consider them a polynomial of a degree higher than 32).

The remainder is found in the table.

We add the remainder from the table with the other 24 bits:

We have another 24 bits in our variable representing the remainder of the previous division. We correct their position (shr 8) and add them to the result from the table.

Let's get this straight: new crc = table [(crc and 255) xor byte] xor (crc shr 8). We use binary AND 255 (which is AND 11111111) to get the 8 bits. We repeat this procedure for every byte.

In CRC32 we start our crc with FFFFFFFFh (all bits to 1) and when we finish we invert the bits (binary operation NOT).

WE ARE DONE! GOD I CAN'T BELIEVE IT! I'VE BEEN WRITING THIS STUFF FOR HOURS!!!

I'd like to thank my family for their immense support, my director and the academy... Hey! Where's my little Oscar?

I must also thank Bjorn (he knows who he is..) for asking me to clear some things out and thus "forcing" me to rewrite some parts and make them more precise and easier to understand (well, I hope...)

Hey, don't forget. Any comments directly to hibaman@hotmail.com.

Last modified:
This page has been viewed 21073 times since 22/10/2003.