Cyclic redundancy check

A cyclic redundancy check (CRC) is a type of hash function used to produce a checksum – a small, fixed number of bits – against a block of data, such as a packet of network traffic or a block of a computer file. The checksum is used to detect errors after transmission or storage. A CRC is computed and appended before transmission or storage, and verified afterwards by the recipient to confirm that no changes occurred on transit. CRCs are popular because they are simple to implement in binary hardware, are easy to analyze mathematically, and are particularly good at detecting common errors caused by noise in transmission channels.

Introduction

A CRC "checksum" is the remainder of a binary division with no bit carry (XOR used instead of subtraction), of the message bit stream, by a predefined (short) bit stream of length $n$ , which represents the coefficients of a polynomial. Before the division, $n$ zeros are appended to the message stream.

CRCs are based on division in the ring of polynomials over the finite field GF(2) (the integers modulo 2). In simpler terms, this is the set of polynomials where each coefficient is either zero or one (a single binary bit), and arithmetic operations wrap around. For example:

(x^{3}+x)+(x+1)=x^{3}+2x+1\equiv x^{3}+1{\pmod {2}}

Note that $2x$ becomes zero in the above equation because addition of coefficients is performed modulo 2:

2x=x+x=x\times (1+1)=x\times 0\equiv 0{\pmod {2}}

Multiplication is similar:

(x^{2}+x)(x+1)=x^{3}+2x^{2}+x\equiv x^{3}+x{\pmod {2}}

We can also divide polynomials mod 2 and find the quotient and remainder. For example, suppose we're dividing x³ + x² + x by x + 1. We would find that

{\frac {(x^{3}+x^{2}+x)}{(x+1)}}=(x^{2}+1)-{\frac {1}{(x+1)}}

In other words,

(x^{3}+x^{2}+x)=(x^{2}+1)(x+1)-1\equiv (x^{2}+1)(x+1)+1{\pmod {2}}

The division yields a quotient of x² + 1 with a remainder of -1, which, since it is odd, has a last bit of 1.

Any string of bits can be interpreted as the coefficients of a polynomial of this sort, and to find the CRC, we divide by another fixed polynomial, after first multiplying by $x^{n}$ where $n$ is the degree of the fixed polynomial. The coefficients of the remainder polynomial are the CRC.

In the above equations, $x^{2}+x+1$ represents the original message bits 111, $x+1$ is the generator polynomial, and the remainder $1$ (equivalently, $x^{0}$ ) is the CRC. The degree of the generator polynomial is 1, so we first multiplied the message by $x^{1}$ to get $x^{3}+x^{2}+x$ .

In general form:

M(x)\cdot x^{n}=Q(x)\cdot G(x)+R(x)

Here M(x) is the original message polynomial and G(x) is the degree- $n$ generator polynomial. The bits of $M(x)\cdot x^{n}$ are the original message with $n$ zeros added at the end. R(x) is the remainder polynomial, which is the CRC 'checksum'. The quotient polynomial Q(x) is uninteresting. In communication, the sender attaches the $n$ bits of R after the original message bits of M and sends them out (in place of the zeros). The receiver takes M and R and checks whether $M(x)\cdot x^{n}-R(x)$ is divisible by $G(x)$ . If it is, then the receiver assumes the received message bits are correct. Note that $M(x)\cdot x^{n}-R(x)$ is exactly the string of bits the sender sent; this string is called the codeword.

A CRC is a checksum in a strict mathematical sense, as it can be expressed as the weighted modulo-2 sum of per-bit syndromes, but that word is generally reserved more specifically for sums computed using larger moduli, such as 10, 256, or 65535.

CRCs can also be used as part of error-correcting codes, which allow not only the detection of transmission errors, but the reconstruction of the correct message. These codes are based on closely related mathematical principles.

Endianness

When implemented in bit serial hardware, the generator polynomial uniquely describes the bit assignment; the first bit transmitted is always the coefficient of the highest power of $x$ , and the last $n$ bits transmitted are the CRC remainder $R(x)$ , starting with the coefficient of $x^{n-1}$ and ending with the coefficient of $x^{0}$ , a.k.a. the coefficient of 1.

However, when bits are processed a byte at a time, such as when using parallel transmission, byte framing as in 8B/10B encoding or RS-232-style asynchronous serial communication, or when implementing a CRC in software, it is necessary to specify the endianness of the data; which bit in each byte is considered "first" and will be the coefficient of the higher power of $x$ .

If the data is destined for serial communication, it is best to use the bit ordering the data will ultimately be sent in. This is because a CRC's ability to detect burst errors is based on proximity in the message polynomial $M(x)$ ; if adjacent polynomial terms are not transmitted sequentially, a physical error burst of one length may be seen as a longer burst due to the rearrangement of bits.

For example, both IEEE 802 (ethernet) and RS-232 standards specify little-endian (least-significant bit first) transmission, so a software CRC implementation to protect should map the least significant bits in each byte to coefficients of the highest powers of $x$ . On the other hand, floppy disks and most hard drives write the most significant bit of each byte first.

The little-endian CRC is slightly simpler to implement in software, so is somewhat more commonly seen, but many programmers find the big-endian bit ordering easier to follow. Thus, for example, the XMODEM-CRC extension, an early use of CRCs in software, uses a big-endian CRC.

Example

Suppose that we are trying to compute an 8-bit CRC of an 8-bit message made of the ASCII character "W", which is decimal 87₁₀ or hexadecimal 57₁₆. For illustration, we will use the CRC-8-ATM polymomial $x^{8}+x^{2}+x+1$ . Writing the first bit transmitted (the coefficient of the highest power of $x$ ) on the left, this corresponds to the 9-bit string "100000111".

The byte value 57₁₆ can transmitted in two different orders, depending on the endianness convention used. Each one generates a different message polynomial $M(x)$ . Big-endian, this is $x^{6}+x^{4}+x^{2}+x+1$ = 01010111, while little-endian, it is $x^{7}+x^{6}+x^{5}+x^{3}+x$ = 11101010. These can then be multiplied by $x^{8}$ to produce two 16-bit message polynomials $x^{8}M(x)$ .

Computing the remainder then consists of subtracting multiples of the generator polynomial $G(x)$ . This is just like decimal long division, but even simpler because the only possible multiples at each step are 0 and 1. And because we do not care about the quotient, there is no need to record it.

Big-endian example

Little-endian example

	0	1	0	1	0	1	1	1	0	0	0	0	0	0	0	0
-	0	0	0	0	0	0	0	0	0
=	0	1	0	1	0	1	1	1	0	0	0	0	0	0	0	0
	-	1	0	0	0	0	0	1	1	1
=	0	0	0	1	0	1	1	0	1	1	0	0	0	0	0	0
		-	0	0	0	0	0	0	0	0	0
=	0	0	0	1	0	1	1	0	1	1	0	0	0	0	0	0
			-	1	0	0	0	0	0	1	1	1
=	0	0	0	0	0	1	1	0	1	0	1	1	0	0	0	0
				-	0	0	0	0	0	0	0	0	0
=	0	0	0	0	0	1	1	0	1	0	1	1	0	0	0	0
					-	1	0	0	0	0	0	1	1	1
=	0	0	0	0	0	0	1	0	1	0	1	0	1	1	0	0
						-	1	0	0	0	0	0	1	1	1
=	0	0	0	0	0	0	0	0	1	0	1	0	0	0	1	0
							-	0	0	0	0	0	0	0	0	0
=	0	0	0	0	0	0	0	0	1	0	1	0	0	0	1	0

	1	1	1	0	1	0	1	0	0	0	0	0	0	0	0	0
-	1	0	0	0	0	0	1	1	1
=	0	1	1	0	1	0	0	1	1	0	0	0	0	0	0	0
	-	1	0	0	0	0	0	1	1	1
=	0	0	1	0	1	0	0	0	0	1	0	0	0	0	0	0
		-	1	0	0	0	0	0	1	1	1
=	0	0	0	0	1	0	0	0	1	0	1	0	0	0	0	0
			-	0	0	0	0	0	0	0	0	0
=	0	0	0	0	1	0	0	0	1	0	1	0	0	0	0	0
				-	1	0	0	0	0	0	1	1	1
=	0	0	0	0	0	0	0	0	1	0	0	1	1	0	0	0
					-	0	0	0	0	0	0	0	0	0
=	0	0	0	0	0	0	0	0	1	0	0	1	1	0	0	0
						-	0	0	0	0	0	0	0	0	0
=	0	0	0	0	0	0	0	0	1	0	0	1	1	0	0	0
							-	0	0	0	0	0	0	0	0	0
=	0	0	0	0	0	0	0	0	1	0	0	1	1	0	0	0

Observe that after each subtraction, the bits are divided into three groups: at the beginning, a group which is all zero, at the end, a group which is unchanged from the original, and a shaded group in the middle which is "interesting". The "interesting" group is 8 bits long, matching the degree of the polynomial. Every step, the zero group becomes one bit longer and the unchanged group becomes one bit shorter, until only the final remainder is left.

In the big-endian example, the remainder polynomial is $x^{7}+x^{5}+x$ . Converting to a binary number using a big-endian convention, this is A2₁₆. In the little-endian example, the remainder is $x^{7}+x^{4}+x^{3}$ . Converting to binary using a little-endian convention, this is 19₁₆.

Implementation

Writing out the full message at each step, as done in the example above, is very tedious. Efficient implementations use an $n$ -bit register to hold only the interesting bits. Here is a first draft of some pseudocode for computing an n-bit CRC:

 function crc(bit array bitString[1..len+n], int len) {
     remainderPolynomial := polynomialForm(bitString[1..n])   // First n bits of the message
     // A popular variant complements remainderPolynomial here
     for i from 1 to len {
         remainderPolynomial := remainderPolynomial * x + bitString[i+n] * x⁰
         if coefficient of xⁿ of remainderPolynomial = 1 {
             remainderPolynomial := remainderPolynomial xor generatorPolynomial
         }
     }
     // A popular variant complements remainderPolynomial here
     return remainderPolynomial
 }

Note that this example code refrains from specifying an endianness preference; the input bitString is already in the form of a bit array, and the remainderPolynomial is manipulated in terms of polynomial operations; the multiplication by $x$ could be a left or right shift, and the addition of bitString[i+n] is done to the $x^{0}$ coefficient, which could be the right or left end of the register.

This code has two disadvantages. First, it actually requires an n+1-bit register to hold the remainderPolynomial so that the $x^{n}$ coefficient can be tested, but more significantly, it requires the bitString to be padded with n zero bits.

The first problem can be solved by testing the $x^{n-1}$ coefficient of the remainderPolynomial before it is multiplied by $x$ .

The second problem could be solved by doing the last n iterations differently, but there is a more subtle optimzation which is used universally, in both hardware and software implementations.

Because the XOR operation used to subtract the generator polynomial from the message is commutative and associative, it does not matter in what order the various inputs are combined into the remainderPolynomial. And specifically, a given bit of the bitString does not need to be added to the remainderPolynomial until the very last instant when it is tested to determine whether to xor with the generatorPolynomial.

This eliminates the need to preload the remainderPolynomial with the first n bits of the message, as well:

 function crc(bit array bitString[1..len], int len) {
     remainderPolynomial := 0
     // A popular variant complements remainderPolynomial here
     for i from 1 to len {
         if (coefficient of x^n-1 of remainderPolynomial xor bitString[i]) = 1 {
             remainderPolynomial := (remainderPolynomial * x) xor generatorPolynomial
         } else {
             remainderPolynomial := (remainderPolynomial * x)
         }
     }
     // A popular variant complements remainderPolynomial here
     return remainderPolynomial
 }

This is the standard bit-at-a-time hardware CRC implementation, and is well worthy of study; once you understand why this computes exactly the same result as the first version, the remaining optimizations are quite straightforward. If remainderPolynomial is only n bits long, then the $x^{n}$ coefficients of it and of generatorPolynomial are simply discarded. This is the reason that you will usually see CRC polynomials written in binary with the leading coefficient omitted.

In software, it is convenient to note that while one may delay the xor of each bit until the very last moment, it is also possible to do it earlier. It is usually convenient to perform the xor a byte at a time, even in a bit-at-a-time implementation like this:

 function crc(byte array string[1..len], int len) {
     remainderPolynomial := 0
     // A popular variant complements remainderPolynomial here
     for i from 1 to len {
         remainderPolynomial := remainderPolynomial xor polynomialForm(string[i])
         for j from 1 to 8 {    // Assuming 8 bits per byte
             if coefficient of x^n-1 of remainderPolynomial = 1 {
                 remainderPolynomial := (remainderPolynomial * x) xor generatorPolynomial
             } else {
                 remainderPolynomial := (remainderPolynomial * x)
             }
         }
     }
     // A popular variant complements remainderPolynomial here
     return remainderPolynomial
 }

This is usually the most compact software implementation, used in microcontrollers when space is at a premium over speed.

So far, the pseudocode has avoided specifying an endianness by describing shifts in the pseudocode as multiplications by $x$ and writing explicit conversions from binary to polynomial form. In practice, the CRC is held in a standard binary register using a particular endianness. In big-endian form, the most significant binary bits contain the higher-order polynomial coefficients (sent first), while in little-endian form, the least-significant binary bits contain the higher-order coefficients. The above pseudocode can be written in both forms. For concreteness, this uses the 16-bit CRC-16-CCITT polynomial $x^{16}+x^{12}+x^{5}+1$ :

 // Big-endian, x^16+x^12+x^5+1 = (1) 0001 0000 0010 0001 = 0x1021
 function crc(byte array string[1..len], int len) {
     rem := 0
     // A popular variant complements rem here
     for i from 1 to len {
         rem := rem xor string[i] leftShift (n-8)   // n = 16 in this example
         for j from 1 to 8 {    // Assuming 8 bits per byte
             if leftmost (most significant) bit of rem is set {
                 rem := (rem leftShift 1) xor 0x1021
             } else {
                 rem := rem leftShift 1
             }
         }
     }
     // A popular variant complements rem here
     return rem
 }

 // Little-endian, x^16+x^12+x^5+1 = 1000 0100 0000 1000 (1) = 0x8408
 function crc(byte array string[1..len], int len) {
     rem := 0
     // A popular variant complements rem here
     for i from 1 to len {
         rem := rem xor string[i]
         for j from 1 to 8 {    // Assuming 8 bits per byte
             if rightmost (least significant) bit of rem is set {
                 rem := (rem rightShift 1) xor 0x8408
             } else {
                 rem := rem rightShift 1
             }
         }
     }
     // A popular variant complements rem here
     return rem
 }

Note that the little-endian form avoids the need to shift string[i] before the xor. Also, when converting rem to bytes and appending it to the message, follow the chosen endianness convention.

Another common optimization uses a lookup table indexed by highest order coefficients of rem to perform the inner loop over 8 bits in fewer steps. A 256-entry lookup table is a particularly common choice, although using a 16-entry table twice per byte is very compact and still faster than the bit at a time version. This replaces the inner loop over j with

         // Big-endian
         rem = (rem leftShift 8) xor big_endian_table[(leftmost 8 bits of rem) rightShift (n-8)]

         // Little-endian
         rem = (rem rightShift 8) xor little_endian_table[rightmost 8 bits of rem]

One of the most commonly encountered CRC algorithms is known as CRC-32, used by (among others) Ethernet, FDDI, ZIP and other archive formats, and PNG image format. Its polynomial can be written big-endian as 0x04C11DB7, or little-endian as 0xEDB88320. The W3C webpage on PNG includes an appendix with a short and simple table-driven implementation in C of CRC-32. You will note that the code corresponds to the little-endian byte-at-a-time pseudocode presented here, and the table is generated using the bit-at-a-time code.

Despite its popular acclaim, the polynomial in question was arbitrarily chosen and generally exhibits very poor error detection properties (in terms of Hamming distance per given block size) compared to polynomials selected by algorithmic means^[1] ^[2].

Variations

There are several variations on CRCs

To check the CRC, instead of calculating the CRC on the message and comparing it to the CRC, a CRC calculation may be run on the entire codeword. If the result is zero, the check passes. This works because the codeword is $M(x)\cdot x^{n}-R(x)=Q(x)\cdot G(x)$ , which is always divisible by $G(x)$ .

The shift register may be initialized with ones instead of zeroes. Equivalently, the first $n$ bits of the message may be inverted before feeding them into the algorithm. The reason to do this is because an unmodified CRC does not distinguish between two messages which differ only in the number of leading zeros. When this inversion is done, the CRC does distinguish between such messages.

The CRC may be inverted before being appended to the message stream. This is because an unmodified CRC does not distinguish between two messages that differ only in the number of trailing zeros. When this is done, the CRC does distinguish. With this modification, the CRC of a full codeword that already includes a CRC is no longer zero. The CRC may be checked either by the straightforward method of computing the CRC on the message, inverting it, and comparing with the CRC in the message stream, or by calculating the CRC on the entire codeword and comparing it with an expected fixed value called the check polynomial C(x). This result is sometimes called the corresponding magic number, and may be computed as $C(x)=\sum _{i=n}^{2n-1}x^{i}{\bmod {G}}(x)$ .

When using the CRC-32 polynomial and the CRC-16-CCITT polynomial, by convention both inversions are performed. The check polynomial for CRC-32 is $C(x)=x^{31}+x^{30}+x^{26}+x^{25}+x^{24}+x^{18}+x^{15}+x^{14}+x^{12}+x^{11}+x^{10}+x^{8}+x^{6}+x^{5}+x^{4}+x^{3}+x+1$ .

Reversed representations and reciprocal polynomials

Polynomial representations

All the well-known CRC generator polynomials of degree $n$ have two common hexadecimal representations:

The big-endian representation is a hexadecimal number with $n$ bits, the least significant bit of which is always 1. The most significant bit represents the coefficient of $x^{n-1}$ and the least significant bit represents the coefficient of $x^{0}$ .

The little-endian representation is a hexadecimal number with $n$ bits, the most significant bit of which is always 1. The most significant bit represents the coefficient of $x^{0}$ and the least significant bit represents the coefficient of $x^{n-1}$ .

The big-endian form is referred to in the literature as the normal representation, while the little-endian is called the reversed representation, because it is the bitwise reverse of the normal representation. t is essential to use the correct form when implementing a CRC. If the coefficient of $x^{n-1}$ is zero, the forms can be distinguished at a glance by seeing which end has the bit set.

In both these representations the $x^{n}$ coefficient is omitted and understood to be 1.

To further confuse the matter, the excellent paper by P. Koopman and T. Chakravarty ^[2]^[3] converts CRC generator polynomials to hexadecimal numbers in yet another way: big-endian, but including the $x^{n}$ coefficient and omitting the $x^{0}$ coefficient. This "Koopman" representation has the advantage that the degree can be determined from the hexadecimal form and the coefficients are easy to read off in left-to-right order. However, it is not used anywhere else and is not recommended due to the risk of confusion.

Reciprocal polynomials

A reciprocal polynomial is created by assigning the $x^{n}$ through $x^{0}$ coefficients of one polynomial to the $x^{0}$ through $x^{n}$ coefficients of a new polynomial. That is, the reciprocal of the $n$ bit polynomial $G(x)$ is $x^{n}G(x^{-1})$ . Example: The reverse of $x^{16}+x^{15}+x^{2}+1$ is $x^{16}+x^{14}+x^{1}+1$ . The most interesting property of reciprocal polynomials, when used in CRCs, is that they have the exact same error-detecting strength as the polynomials they are reciprocals of. The reciprocal of a polynomial generates the same codewords — that is, the bit strings consisting of a valid CRC appended to a message — as the original polynomial, only bit reversed. But the reciprocal polynomial is not the same as the original polynomial, and the CRCs generated using it are not the same (even modulo bit reversal) as those generated by the original polynomial.

Another interesting property of reciprocal polynomials is related to their representation. Consider again the CRC-16-IBM polynomial $x^{16}+x^{15}+x^{2}+1$ and its reciprocal. These are its representations

Normal original: 0x8005
Reversed original: 0xA001
Koopman original: 0xC002
Normal reciprocal: 0x4003
Reversed reciprocal: 0xC002
Koopman reciprocal: 0xA001

Note that the Koopman representation of the reciprocal polynomial is the same as the reversed representation of the original polynomial. This means that if you mistakenly use the Koopman representation of a polynomial in a right-shift algorithm, you get a CRC that is just as strong as (but different from) the one you intended. This holds only for polynomials with a degree which is a multiple of 4.

Error detection strength

The error-detection ability of a CRC depends on the degree of its key polynomial and on the specific key polynomial used. The "error polynomial" $E(x)$ is the exclusive-or of the received message codeword and the correct message codeword. An error will go undetected by a CRC algorithm if and only if the error polynomial is divisible by the CRC polynomial.

Because a CRC is based on division, no polynomial can detect errors consisting of a string of zeros prepended to the data, or of missing leading zeros. However, see Variations.
All single bit errors will be detected by any polynomial with at least two terms with non-zero coefficients. The error polynomial is $x^{k}$ , and $x^{k}$ is divisible only by polynomials $x^{i}$ where $i\leq k$ .
All two bit errors separated by a distance less than the order of the polynomial will be detected. The error polynomial in the two bit case is $E(x)=x^{i}+x^{k}=x^{k}\cdot (x^{i-k}+1),\;i>k$ . As noted above, the $x^{k}$ term will not be divisible by the CRC polynomial, which leaves the $x^{i-k}+1$ term. By definition, the smallest value of ${i-k}$ such that a polynomial divides $x^{i-k}+1$ is the polynomial's order. The polynomials with the largest order are called primitive polynomials, and for polynomials of degree $n$ with binary coefficients, have order $2^{n}-1$ .
All errors in an odd number of bits will be detected by a polynomial which is a multiple of $x+1$ . This is equivalent to the polynomial having an even number of terms with non-zero coefficients.
All burst errors of length $n$ will be detected by any polynomial of degree $n$ or greater which has a non-zero $x^{0}$ term.

(as an aside, there is never reason to use a polynomial with a zero $x^{0}$ term. Recall that a CRC is the remainder of the message polynomial times x^n divided by the CRC polynomial. A polynomial with a zero $x^{0}$ term always has $x$ as a factor. So if $K(x)$ is the original CRC polynomial and $K(x)=x\cdot K'(x)$ , then

$M(x)\cdot x^{n-1}=Q(x)\cdot K'(x)+R(x)$

$M(x)\cdot x^{n}=Q(x)\cdot x\cdot K'(x^{n})+x\cdot R(x)$

$M(x)\cdot x^{n}=Q(x)\cdot K(x^{n})+x\cdot R(x)$

That is, the CRC of any message with the $K(x)$ polynomial is the same as that of the same message with the $K'(x)$ polynomial with a zero appended. It is just a waste of a bit.)

The combination of these factors means that good CRC polynomials are often primitive polynomials (which have the best 2-bit error detection) or polynomials whose factors are a primitive polynomial of degree $n-1$ and $x+1$ (which detects all odd numbers of bit errors, and has half the two-bit error detection ability of a primitive polynomial of degree $n$ )^[2].

Error detection strength (Bitfilters)

Analysis Technique using bitfilters (see weblink "CRC-Analysis with Bitfilters") allows to determine quite more efficient the properties of a given generator polynomial. The results are the following:

All burst errors (but one) with length no longer than the generator polynomial can be detected by any generator polynomial $1+...+X^{n}$ . This includes 1-bit errors (burst of length 1). The maximum length is $n+1$ , when $n$ is the degree of the generator polynomial (which itself has a length of $n+1$ ). The exception to this result is a bit pattern the same as that of the generator polynomial.
All uneven bit errors are detected by generator polynomials with even number of terms.
2-Bit errors in a (multiple) distance of the longest bitfilter of even parity to a generator polynomial are not detected; all others are detected. For degrees up to 32 there is an optimal generator polynomial with that degree and even number of terms; in this case the period mentioned above is $2^{n-1}-1$ . For $n=16$ this means that blocks of 32767 bits length do not contain undiscoverd 2-Bit errors. For uneven number of terms in the generator polynomial there can be a period of $2^{n}-1$ ; however, these generator polynomials (with odd number of terms) do not discover all odd number of errors, so they should be avoided. A list of the corresponding generators with even number of terms can be found in the link mentioned at the beginning of this section.
All single bit errors within the bitfilter period mentioned above (for even terms in the generator polynomial) can be identified uniquely by their residual. So CRC method can be used for error correction as well (within those limits, e.g. 32767 bits with optimal generator polynomials of degree 16). Since all odd errors leave an odd residual, all even an even residual, 1-bit errors and 2-bit errors can be distinguished, however not necessarily 1-bit errors and 3-bit errors. Thus bit error correction might be erroneous itself and produce more errors.

CRC polynomial specifications

The following table below omits "Initial value", "Reflected value" and "Final XOR value".

These hex values are important for some more complicated checksums (like most forms of CRC-32 and CRC-64). CRCs less than CRC-16 do not tend to use these values.
Very often custom versions of checksums are created by changing these values, as it does not alter the mechanics of the checksum algorithm.

CRC standardization issues

The definition of CRC-12 is disputed, as there are 3 forms of CRC-12 in common use.
Both forms of CRC-8 in use have notable weaknesses mathematically.
It is assumed that at least 10 forms of CRC-16 and CRC-32 exist, but no form of CRC-16 or CRC-32 is mathematically optimal.
CCITT CRCs differ from ITU CRCs (of the same size), as the same entity has standardized checksums more than once but in different eras.

CRC polynomials in common use (in ITU-IEEE syntax)

Name	Polynomial	Representations: Normal or Reverse (Normal of Reciprocal)
CRC-1	$x+1$ (use: hardware; also known as parity bit)	0x1 or 0x1 (0x1)
CRC-5-CCITT	$x^{5}+x^{3}+x+1$ (ITU G.704 standard)	0x0B or 0x15 (0x??)
CRC-5-USB	$x^{5}+x^{2}+1$ (use: USB token packets)	0x05 or 0x14 (0x9)
CRC-7	$x^{7}+x^{3}+1$ (use: telecom systems, MMC)	0x09 or 0x48 (0x11)
CRC-8-ATM	$x^{8}+x^{2}+x+1$ (use: ATM HEC)	0x07 or 0xE0 (0xC1)
CRC-8-CCITT	$x^{8}+x^{7}+x^{3}+x^{2}+1$ (use: 1-Wire bus)	0x87 or 0xE1
CRC-8-Dallas/Maxim	$x^{8}+x^{5}+x^{4}+1$ (use: 1-Wire bus)	0x31 or 0x8C
CRC-8	$x^{8}+x^{7}+x^{6}+x^{4}+x^{2}+1$	0xD3 or 0xCB (0x??)
CRC-10	$x^{10}+x^{9}+x^{5}+x^{4}+x+1\|\|0x233or0x331(0x????)\|-\|CRC-12\|\|<math>x^{12}+x^{11}+x^{3}+x^{2}+x+1$ (use: telecom systems)	0x80F or 0xF01 (0xE03)
CRC-15-CAN	$x^{15}+x^{14}+x^{10}+x^{8}+x^{7}+x^{4}+x^{3}+1$	0x4599 or 0x4CD1
CRC-16-Fletcher	Not a CRC; see Fletcher's checksum	Used in Adler-32 A & B CRCs
CRC-16-CCITT	$x^{16}+x^{12}+x^{5}+1$ (X.25, V.41, Bluetooth, PPP, IrDA; known as "CRC-CCITT")	0x1021 or 0x8408 (0x0811)
CRC-16-IBM	$x^{16}+x^{15}+x^{2}+1$ (XMODEM, USB, many others; also known as "CRC-16")	0x8005 or 0xA001 (0x4003)
CRC-32-Adler	Not a CRC; see Adler-32	See Adler-32
CRC-32-MPEG2	$x^{32}+x^{26}+x^{23}+x^{22}+x^{16}+x^{12}+x^{11}+x^{10}+x^{8}+x^{7}+x^{5}+x^{4}+x^{2}+x+1$	0x04C11DB7 or 0xEDB88320 Also used in IEEE 802.3
CRC-32-IEEE 802.3	$x^{32}+x^{26}+x^{23}+x^{22}+x^{16}+x^{12}+x^{11}+x^{10}+x^{8}+x^{7}+x^{5}+x^{4}+x^{2}+x+1$ (V.42)	0x04C11DB7 or 0xEDB88320 (0xDB710641)
CRC-32C (Castagnoli)^[1]	$x^{32}+x^{28}+x^{27}+x^{26}+x^{25}+x^{23}+x^{22}+x^{20}+x^{19}+x^{18}+x^{14}+x^{13}+x^{11}+x^{10}+x^{9}+x^{8}+x^{6}+1$	0x1EDC6F41 or 0x82F63B78 (0x05EC76F1)
CRC-64-ISO	$x^{64}+x^{4}+x^{3}+x+1$ (use: ISO 3309)	0x000000000000001B or 0xD800000000000000 (0xB000000000000001)
CRC-64-ECMA-182	$x^{64}+x^{62}+x^{57}+x^{55}+x^{54}+x^{53}+x^{52}+x^{47}+x^{46}+x^{45}+x^{40}+x^{39}+x^{38}+x^{37}+x^{35}+x^{33}+x^{32}+x^{31}+x^{29}+x^{27}+x^{24}+x^{23}+x^{22}+x^{21}+x^{19}+x^{17}+x^{13}+x^{12}+x^{10}+x^{9}+x^{7}+x^{4}+x+1$ (as described in ECMA-182 p.63)	0x42F0E1EBA9EA3693 or 0xC96C5795D7870F42 (0x92D8AF2BAF0E1E85)
CRC-128	IEEE-ITU Standard. Use superseded by hashes MD5 & SHA-1
CRC-160	IEEE-ITU Standard. Use superseded by hashes MD5 & SHA-1

CRCs and data integrity

While useful for error detection, CRCs cannot be safely relied upon to verify data integrity (that no changes whatsoever have occurred), since, because of the linear structure of CRC polynomials, it is extremely easy to intentionally change data without modifying its CRC, as demonstrated by CRC and how to Reverse it. Message authentication codes can be used to verify data integrity.

When CRCs collide

As with all hash functions, CRCs tend to have collision rates nearing 100% after a certain number of collision tests. Each additional bit added to a CRC (ie: CRC-20 vs CRC-21) cuts the number of collisions down by nearly 50%. The polynomial has an intrinsic repeat rate (i.e. when viewed as a random number generator, after a certain number of cycles it will repeat itself). The polynomial repeat rate of a CRC-8 is around 100 bits, but the CCITT-16 repeats at around 4k bytes.

The theoretical collision rate for CRC64 is around one collision every 18×10¹⁸ CRCs.
A properly designed CRC (using fewer bits) can be as effective as a poorly designed CRC using more bits due to the irreducible polynomial property of CRCs. In this case CRC-32 can perform nearly as well as CRC-40.

Designing CRC polynomials

The selection of generator polynomial is the most important part of implementing the CRC algorithm. The polynomial must be chosen to maximize the error detecting capabilities while minimizing overall collision probabilites. The most important attribute of the polynomial is its length (the number of the highest nonzero coefficient), because of its direct influence of the length of the computed checksum.

The most commonly used polynomial lengths are

9 bits (CRC-8)
17 bits (CRC-16)
33 bits (CRC-32)
65 bits (CRC-64)

When creating a new CRC polynomial or improving an existing CRC the general mathematical advice is to use an irreducible polynomial that satisfies all polynomical irreducibility constraints from modular arithmetics.

Irreducibility in this case means that the polynomial cannot be divided by any polynomial except itself and 1 with zero remainder.

The properties of the generator polynomial can be derived from the algorithm definition

CRCs with more than one nonzero coefficients are able to detect all single bit errors in the input message.
CRCs can be used to detect all double bit errors in the input message shorter than 2k, where k is the length of the longest irreducible part of the polynomial.
If the CRC polynomial is divided by x + 1 then no polynomial with odd number of nonzero coefficients can be divided by it. Hence, it can be used to detect odd number of errors in the input message (like single bit parity function).
CRC polynomials detect (single) burst errors shorter than the number of the position of the highest polynomial coefficient.

References

^ ^a ^b Castagnoli, G. and Braeuer, S. and Herrman, M. (1993). "Optimization of Cyclic Redundancy-Check Codes with 24 and 32 Parity Bits". IEEE Transactions on Communications. 41 (6). {{cite journal}}: Unknown parameter |month= ignored (help)CS1 maint: multiple names: authors list (link) - Castagnoli's et al. work on algorithmic selection of CRC polynomials
^ ^a ^b ^c Koopman, P. (2002). "32-Bit Cyclic Redundancy Codes for Internet Applications". The International Conference on Dependable Systems and Networks. {{cite journal}}: Unknown parameter |month= ignored (help) - verification of Castagnoli's results by exhaustive search and some new good polynomials
^ Koopman, P. and Chakravarty, T. (2004). "Cyclic Redundancy Code (CRC) Polynomial Selection For Embedded Networks". The International Conference on Dependable Systems and Networks.{{cite journal}}: CS1 maint: multiple names: authors list (link) - analysis of short CRC polynomials for embedded applications

External links

Tutorial and C++ implementation of CRC
Cyclic redundancy check - a simple guide to what it means for your data, CD and DVD discs. http://www.softwarepatch.com/tips/cyclic-redundancy.html
The CRC Pitstop
Williams, R. (1993-09) A Painless Guide to CRC Error Detection Algorithms
Understanding Cyclic Redundancy Check
Black, R. (1994-02) Fast CRC32 in Software — Algorithm 4 is used in Linux and info-zip's zip and unzip.
(1999-11, 1999-12, 2000-01) checksums, CRCs, and their source code. Embedded Systems Programming
CRC32: Generating a checksum for a file, C++ implementation by Brian Friesen
Online Tool to compute common CRCs (8/16/32/64) from strings
Online CRC calculator
Online CRC Tool: Generator of synthesizable CRC functions
Online Char (ASCII), HEX, Binary, Base64, etc... Encoder/Decoder with MD2, MD4, MD5, SHA1+2, CRC, etc. hashing algorithms
CRC16 to CRC64 collision research
Reversing CRC – Theory and Practice.
'CRC-Analysis with Bitfilters'.

[cast93-1] Castagnoli, G. and Braeuer, S. and Herrman, M. (1993). "Optimization of Cyclic Redundancy-Check Codes with 24 and 32 Parity Bits". IEEE Transactions on Communications. 41 (6). {{cite journal}}: Unknown parameter |month= ignored (help)CS1 maint: multiple names: authors list (link) - Castagnoli's et al. work on algorithmic selection of CRC polynomials

[koop02-2] Koopman, P. (2002). "32-Bit Cyclic Redundancy Codes for Internet Applications". The International Conference on Dependable Systems and Networks. {{cite journal}}: Unknown parameter |month= ignored (help) - verification of Castagnoli's results by exhaustive search and some new good polynomials

[koop04-3] Koopman, P. and Chakravarty, T. (2004). "Cyclic Redundancy Code (CRC) Polynomial Selection For Embedded Networks". The International Conference on Dependable Systems and Networks.{{cite journal}}: CS1 maint: multiple names: authors list (link) - analysis of short CRC polynomials for embedded applications

[1]

[2]

[3]