€•g      Œsphinx.addnodes”Œdocument”“”)”}”(Œ	rawsource”Œ ”Œchildren”]”(Œtranslations”ŒLanguagesNode”“”)”}”(hhh]”(h Œpending_xref”“”)”}”(hhh]”Œdocutils.nodes”ŒText”“”ŒChinese (Simplified)”…””}”Œparent”hsbaŒ
attributes”}”(Œids”]”Œclasses”]”Œnames”]”Œdupnames”]”Œbackrefs”]”Œ	refdomain”Œstd”Œreftype”Œdoc”Œ	reftarget”Œ!/translations/zh_CN/staging/crc32”Œmodname”NŒ	classname”NŒrefexplicit”ˆuŒtagname”hhhubh)”}”(hhh]”hŒChinese (Traditional)”…””}”hh2sbah}”(h]”h ]”h"]”h$]”h&]”Œ	refdomain”h)Œreftype”h+Œ	reftarget”Œ!/translations/zh_TW/staging/crc32”Œmodname”NŒ	classname”NŒrefexplicit”ˆuh1hhhubh)”}”(hhh]”hŒItalian”…””}”hhFsbah}”(h]”h ]”h"]”h$]”h&]”Œ	refdomain”h)Œreftype”h+Œ	reftarget”Œ!/translations/it_IT/staging/crc32”Œmodname”NŒ	classname”NŒrefexplicit”ˆuh1hhhubh)”}”(hhh]”hŒJapanese”…””}”hhZsbah}”(h]”h ]”h"]”h$]”h&]”Œ	refdomain”h)Œreftype”h+Œ	reftarget”Œ!/translations/ja_JP/staging/crc32”Œmodname”NŒ	classname”NŒrefexplicit”ˆuh1hhhubh)”}”(hhh]”hŒKorean”…””}”hhnsbah}”(h]”h ]”h"]”h$]”h&]”Œ	refdomain”h)Œreftype”h+Œ	reftarget”Œ!/translations/ko_KR/staging/crc32”Œmodname”NŒ	classname”NŒrefexplicit”ˆuh1hhhubh)”}”(hhh]”hŒSpanish”…””}”hh‚sbah}”(h]”h ]”h"]”h$]”h&]”Œ	refdomain”h)Œreftype”h+Œ	reftarget”Œ!/translations/sp_SP/staging/crc32”Œmodname”NŒ	classname”NŒrefexplicit”ˆuh1hhhubeh}”(h]”h ]”h"]”h$]”h&]”Œcurrent_language”ŒEnglish”uh1h
hhŒ	_document”hŒsource”NŒline”NubhŒsection”“”)”}”(hhh]”(hŒtitle”“”)”}”(hŒ!Brief tutorial on CRC computation”h]”hŒ!Brief tutorial on CRC computation”…””}”(hh¨hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h¦hh£hžhhŸŒ;/var/lib/git/docbuild/linux/Documentation/staging/crc32.rst”h KubhŒ	paragraph”“”)”}”(hX¶  A CRC is a long-division remainder.  You add the CRC to the message,
and the whole thing (message+CRC) is a multiple of the given
CRC polynomial.  To check the CRC, you can either check that the
CRC matches the recomputed value, *or* you can check that the
remainder computed on the message+CRC is 0.  This latter approach
is used by a lot of hardware implementations, and is why so many
protocols put the end-of-frame flag after the CRC.”h]”(hŒåA CRC is a long-division remainder.  You add the CRC to the message,
and the whole thing (message+CRC) is a multiple of the given
CRC polynomial.  To check the CRC, you can either check that the
CRC matches the recomputed value, ”…””}”(hh¹hžhhŸNh NubhŒemphasis”“”)”}”(hŒ*or*”h]”hŒor”…””}”(hhÃhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÁhh¹ubhŒÍ you can check that the
remainder computed on the message+CRC is 0.  This latter approach
is used by a lot of hardware implementations, and is why so many
protocols put the end-of-frame flag after the CRC.”…””}”(hh¹hžhhŸNh Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¸)”}”(hŒHIt's actually the same long division you learned in school, except that:”h]”hŒJItâ€™s actually the same long division you learned in school, except that:”…””}”(hhÛhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubhŒbullet_list”“”)”}”(hhh]”(hŒ	list_item”“”)”}”(hŒ<We're working in binary, so the digits are only 0 and 1, and”h]”h¸)”}”(hhòh]”hŒ>Weâ€™re working in binary, so the digits are only 0 and 1, and”…””}”(hhôhžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khhðubah}”(h]”h ]”h"]”h$]”h&]”uh1hîhhëhžhhŸh¶h Nubhï)”}”(hŒµWhen dividing polynomials, there are no carries.  Rather than add and
subtract, we just xor.  Thus, we tend to get a bit sloppy about
the difference between adding and subtracting.
”h]”h¸)”}”(hŒ´When dividing polynomials, there are no carries.  Rather than add and
subtract, we just xor.  Thus, we tend to get a bit sloppy about
the difference between adding and subtracting.”h]”hŒ´When dividing polynomials, there are no carries.  Rather than add and
subtract, we just xor.  Thus, we tend to get a bit sloppy about
the difference between adding and subtracting.”…””}”(hj  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khj  ubah}”(h]”h ]”h"]”h$]”h&]”uh1hîhhëhžhhŸh¶h Nubeh}”(h]”h ]”h"]”h$]”h&]”Œbullet”Œ-”uh1héhŸh¶h Khh£hžhubh¸)”}”(hXh  Like all division, the remainder is always smaller than the divisor.
To produce a 32-bit CRC, the divisor is actually a 33-bit CRC polynomial.
Since it's 33 bits long, bit 32 is always going to be set, so usually the
CRC is written in hex with the most significant bit omitted.  (If you're
familiar with the IEEE 754 floating-point format, it's the same idea.)”h]”hXn  Like all division, the remainder is always smaller than the divisor.
To produce a 32-bit CRC, the divisor is actually a 33-bit CRC polynomial.
Since itâ€™s 33 bits long, bit 32 is always going to be set, so usually the
CRC is written in hex with the most significant bit omitted.  (If youâ€™re
familiar with the IEEE 754 floating-point format, itâ€™s the same idea.)”…””}”(hj'  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¸)”}”(hXÇ  Note that a CRC is computed over a string of *bits*, so you have
to decide on the endianness of the bits within each byte.  To get
the best error-detecting properties, this should correspond to the
order they're actually sent.  For example, standard RS-232 serial is
little-endian; the most significant bit (sometimes used for parity)
is sent last.  And when appending a CRC word to a message, you should
do it in the right order, matching the endianness.”h]”(hŒ-Note that a CRC is computed over a string of ”…””}”(hj5  hžhhŸNh NubhÂ)”}”(hŒ*bits*”h]”hŒbits”…””}”(hj=  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÁhj5  ubhX–  , so you have
to decide on the endianness of the bits within each byte.  To get
the best error-detecting properties, this should correspond to the
order theyâ€™re actually sent.  For example, standard RS-232 serial is
little-endian; the most significant bit (sometimes used for parity)
is sent last.  And when appending a CRC word to a message, you should
do it in the right order, matching the endianness.”…””}”(hj5  hžhhŸNh Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¸)”}”(hX©  Just like with ordinary division, you proceed one digit (bit) at a time.
Each step of the division you take one more digit (bit) of the dividend
and append it to the current remainder.  Then you figure out the
appropriate multiple of the divisor to subtract to bring the remainder
back into range.  In binary, this is easy - it has to be either 0 or 1,
and to make the XOR cancel, it's just a copy of bit 32 of the remainder.”h]”hX«  Just like with ordinary division, you proceed one digit (bit) at a time.
Each step of the division you take one more digit (bit) of the dividend
and append it to the current remainder.  Then you figure out the
appropriate multiple of the divisor to subtract to bring the remainder
back into range.  In binary, this is easy - it has to be either 0 or 1,
and to make the XOR cancel, itâ€™s just a copy of bit 32 of the remainder.”…””}”(hjU  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K"hh£hžhubh¸)”}”(hŒìWhen computing a CRC, we don't care about the quotient, so we can
throw the quotient bit away, but subtract the appropriate multiple of
the polynomial from the remainder and we're back to where we started,
ready to process the next bit.”h]”hŒðWhen computing a CRC, we donâ€™t care about the quotient, so we can
throw the quotient bit away, but subtract the appropriate multiple of
the polynomial from the remainder and weâ€™re back to where we started,
ready to process the next bit.”…””}”(hjc  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K)hh£hžhubh¸)”}”(hŒ7A big-endian CRC written this way would be coded like::”h]”hŒ6A big-endian CRC written this way would be coded like:”…””}”(hjq  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K.hh£hžhubhŒliteral_block”“”)”}”(hŒ¡for (i = 0; i < input_bits; i++) {
        multiple = remainder & 0x80000000 ? CRCPOLY : 0;
        remainder = (remainder << 1 | next_input_bit()) ^ multiple;
}”h]”hŒ¡for (i = 0; i < input_bits; i++) {
        multiple = remainder & 0x80000000 ? CRCPOLY : 0;
        remainder = (remainder << 1 | next_input_bit()) ^ multiple;
}”…””}”hj  sbah}”(h]”h ]”h"]”h$]”h&]”Œ	xml:space”Œpreserve”uh1j  hŸh¶h K0hh£hžhubh¸)”}”(hŒoNotice how, to get at bit 32 of the shifted remainder, we look
at bit 31 of the remainder *before* shifting it.”h]”(hŒZNotice how, to get at bit 32 of the shifted remainder, we look
at bit 31 of the remainder ”…””}”(hj‘  hžhhŸNh NubhÂ)”}”(hŒ*before*”h]”hŒbefore”…””}”(hj™  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÁhj‘  ubhŒ shifting it.”…””}”(hj‘  hžhhŸNh Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K5hh£hžhubh¸)”}”(hXf  But also notice how the next_input_bit() bits we're shifting into
the remainder don't actually affect any decision-making until
32 bits later.  Thus, the first 32 cycles of this are pretty boring.
Also, to add the CRC to a message, we need a 32-bit-long hole for it at
the end, so we have to add 32 extra cycles shifting in zeros at the
end of every message.”h]”hXj  But also notice how the next_input_bit() bits weâ€™re shifting into
the remainder donâ€™t actually affect any decision-making until
32 bits later.  Thus, the first 32 cycles of this are pretty boring.
Also, to add the CRC to a message, we need a 32-bit-long hole for it at
the end, so we have to add 32 extra cycles shifting in zeros at the
end of every message.”…””}”(hj±  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K8hh£hžhubh¸)”}”(hX  These details lead to a standard trick: rearrange merging in the
next_input_bit() until the moment it's needed.  Then the first 32 cycles
can be precomputed, and merging in the final 32 zero bits to make room
for the CRC can be skipped entirely.  This changes the code to::”h]”hX  These details lead to a standard trick: rearrange merging in the
next_input_bit() until the moment itâ€™s needed.  Then the first 32 cycles
can be precomputed, and merging in the final 32 zero bits to make room
for the CRC can be skipped entirely.  This changes the code to:”…””}”(hj¿  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K?hh£hžhubj€  )”}”(hŒ½for (i = 0; i < input_bits; i++) {
        remainder ^= next_input_bit() << 31;
        multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
        remainder = (remainder << 1) ^ multiple;
}”h]”hŒ½for (i = 0; i < input_bits; i++) {
        remainder ^= next_input_bit() << 31;
        multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
        remainder = (remainder << 1) ^ multiple;
}”…””}”hjÍ  sbah}”(h]”h ]”h"]”h$]”h&]”j  j  uh1j  hŸh¶h KDhh£hžhubh¸)”}”(hŒGWith this optimization, the little-endian code is particularly simple::”h]”hŒFWith this optimization, the little-endian code is particularly simple:”…””}”(hjÛ  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KJhh£hžhubj€  )”}”(hŒ®for (i = 0; i < input_bits; i++) {
        remainder ^= next_input_bit();
        multiple = (remainder & 1) ? CRCPOLY : 0;
        remainder = (remainder >> 1) ^ multiple;
}”h]”hŒ®for (i = 0; i < input_bits; i++) {
        remainder ^= next_input_bit();
        multiple = (remainder & 1) ? CRCPOLY : 0;
        remainder = (remainder >> 1) ^ multiple;
}”…””}”hjé  sbah}”(h]”h ]”h"]”h$]”h&]”j  j  uh1j  hŸh¶h KLhh£hžhubh¸)”}”(hŒöThe most significant coefficient of the remainder polynomial is stored
in the least significant bit of the binary "remainder" variable.
The other details of endianness have been hidden in CRCPOLY (which must
be bit-reversed) and next_input_bit().”h]”hŒúThe most significant coefficient of the remainder polynomial is stored
in the least significant bit of the binary â€œremainderâ€ variable.
The other details of endianness have been hidden in CRCPOLY (which must
be bit-reversed) and next_input_bit().”…””}”(hj÷  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KRhh£hžhubh¸)”}”(hŒÔAs long as next_input_bit is returning the bits in a sensible order, we don't
*have* to wait until the last possible moment to merge in additional bits.
We can do it 8 bits at a time rather than 1 bit at a time::”h]”(hŒPAs long as next_input_bit is returning the bits in a sensible order, we donâ€™t
”…””}”(hj  hžhhŸNh NubhÂ)”}”(hŒ*have*”h]”hŒhave”…””}”(hj  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1hÁhj  ubhŒ to wait until the last possible moment to merge in additional bits.
We can do it 8 bits at a time rather than 1 bit at a time:”…””}”(hj  hžhhŸNh Nubeh}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KWhh£hžhubj€  )”}”(hŒûfor (i = 0; i < input_bytes; i++) {
        remainder ^= next_input_byte() << 24;
        for (j = 0; j < 8; j++) {
                multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
                remainder = (remainder << 1) ^ multiple;
        }
}”h]”hŒûfor (i = 0; i < input_bytes; i++) {
        remainder ^= next_input_byte() << 24;
        for (j = 0; j < 8; j++) {
                multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
                remainder = (remainder << 1) ^ multiple;
        }
}”…””}”hj%  sbah}”(h]”h ]”h"]”h$]”h&]”j  j  uh1j  hŸh¶h K[hh£hžhubh¸)”}”(hŒOr in little-endian::”h]”hŒOr in little-endian:”…””}”(hj3  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Kchh£hžhubj€  )”}”(hŒìfor (i = 0; i < input_bytes; i++) {
        remainder ^= next_input_byte();
        for (j = 0; j < 8; j++) {
                multiple = (remainder & 1) ? CRCPOLY : 0;
                remainder = (remainder >> 1) ^ multiple;
        }
}”h]”hŒìfor (i = 0; i < input_bytes; i++) {
        remainder ^= next_input_byte();
        for (j = 0; j < 8; j++) {
                multiple = (remainder & 1) ? CRCPOLY : 0;
                remainder = (remainder >> 1) ^ multiple;
        }
}”…””}”hjA  sbah}”(h]”h ]”h"]”h$]”h&]”j  j  uh1j  hŸh¶h Kehh£hžhubh¸)”}”(hŒ{If the input is a multiple of 32 bits, you can even XOR in a 32-bit
word at a time and increase the inner loop count to 32.”h]”hŒ{If the input is a multiple of 32 bits, you can even XOR in a 32-bit
word at a time and increase the inner loop count to 32.”…””}”(hjO  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Kmhh£hžhubh¸)”}”(hŒ¯You can also mix and match the two loop styles, for example doing the
bulk of a message byte-at-a-time and adding bit-at-a-time processing
for any fractional bytes at the end.”h]”hŒ¯You can also mix and match the two loop styles, for example doing the
bulk of a message byte-at-a-time and adding bit-at-a-time processing
for any fractional bytes at the end.”…””}”(hj]  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Kphh£hžhubh¸)”}”(hŒóTo reduce the number of conditional branches, software commonly uses
the byte-at-a-time table method, popularized by Dilip V. Sarwate,
"Computation of Cyclic Redundancy Checks via Table Look-Up", Comm. ACM
v.31 no.8 (August 1988) p. 1008-1013.”h]”hŒ÷To reduce the number of conditional branches, software commonly uses
the byte-at-a-time table method, popularized by Dilip V. Sarwate,
â€œComputation of Cyclic Redundancy Checks via Table Look-Upâ€, Comm. ACM
v.31 no.8 (August 1988) p. 1008-1013.”…””}”(hjk  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Kthh£hžhubh¸)”}”(hXG  Here, rather than just shifting one bit of the remainder to decide
in the correct multiple to subtract, we can shift a byte at a time.
This produces a 40-bit (rather than a 33-bit) intermediate remainder,
and the correct multiple of the polynomial to subtract is found using
a 256-entry lookup table indexed by the high 8 bits.”h]”hXG  Here, rather than just shifting one bit of the remainder to decide
in the correct multiple to subtract, we can shift a byte at a time.
This produces a 40-bit (rather than a 33-bit) intermediate remainder,
and the correct multiple of the polynomial to subtract is found using
a 256-entry lookup table indexed by the high 8 bits.”…””}”(hjy  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Kyhh£hžhubh¸)”}”(hŒI(The table entries are simply the CRC-32 of the given one-byte messages.)”h]”hŒI(The table entries are simply the CRC-32 of the given one-byte messages.)”…””}”(hj‡  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¸)”}”(hŒ{When space is more constrained, smaller tables can be used, e.g. two
4-bit shifts followed by a lookup in a 16-entry table.”h]”hŒ{When space is more constrained, smaller tables can be used, e.g. two
4-bit shifts followed by a lookup in a 16-entry table.”…””}”(hj•  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¸)”}”(hŒÀIt is not practical to process much more than 8 bits at a time using this
technique, because tables larger than 256 entries use too much memory and,
more importantly, too much of the L1 cache.”h]”hŒÀIt is not practical to process much more than 8 bits at a time using this
technique, because tables larger than 256 entries use too much memory and,
more importantly, too much of the L1 cache.”…””}”(hj£  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K„hh£hžhubh¸)”}”(hŒÚTo get higher software performance, a "slicing" technique can be used.
See "High Octane CRC Generation with the Intel Slicing-by-8 Algorithm",
ftp://download.intel.com/technology/comms/perfnet/download/slicing-by-8.pdf”h]”(hŒ—To get higher software performance, a â€œslicingâ€ technique can be used.
See â€œHigh Octane CRC Generation with the Intel Slicing-by-8 Algorithmâ€,
”…””}”(hj±  hžhhŸNh NubhŒ	reference”“”)”}”(hŒKftp://download.intel.com/technology/comms/perfnet/download/slicing-by-8.pdf”h]”hŒKftp://download.intel.com/technology/comms/perfnet/download/slicing-by-8.pdf”…””}”(hj»  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”Œrefuri”j½  uh1j¹  hj±  ubeh}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Kˆhh£hžhubh¸)”}”(hŒËThis does not change the number of table lookups, but does increase
the parallelism.  With the classic Sarwate algorithm, each table lookup
must be completed before the index of the next can be computed.”h]”hŒËThis does not change the number of table lookups, but does increase
the parallelism.  With the classic Sarwate algorithm, each table lookup
must be completed before the index of the next can be computed.”…””}”(hjÐ  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h KŒhh£hžhubh¸)”}”(hXâ  A "slicing by 2" technique would shift the remainder 16 bits at a time,
producing a 48-bit intermediate remainder.  Rather than doing a single
lookup in a 65536-entry table, the two high bytes are looked up in
two different 256-entry tables.  Each contains the remainder required
to cancel out the corresponding byte.  The tables are different because the
polynomials to cancel are different.  One has non-zero coefficients from
x^32 to x^39, while the other goes from x^40 to x^47.”h]”hXæ  A â€œslicing by 2â€ technique would shift the remainder 16 bits at a time,
producing a 48-bit intermediate remainder.  Rather than doing a single
lookup in a 65536-entry table, the two high bytes are looked up in
two different 256-entry tables.  Each contains the remainder required
to cancel out the corresponding byte.  The tables are different because the
polynomials to cancel are different.  One has non-zero coefficients from
x^32 to x^39, while the other goes from x^40 to x^47.”…””}”(hjÞ  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Khh£hžhubh¸)”}”(hŒ¿Since modern processors can handle many parallel memory operations, this
takes barely longer than a single table look-up and thus performs almost
twice as fast as the basic Sarwate algorithm.”h]”hŒ¿Since modern processors can handle many parallel memory operations, this
takes barely longer than a single table look-up and thus performs almost
twice as fast as the basic Sarwate algorithm.”…””}”(hjì  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K˜hh£hžhubh¸)”}”(hXJ  This can be extended to "slicing by 4" using 4 256-entry tables.
Each step, 32 bits of data is fetched, XORed with the CRC, and the result
broken into bytes and looked up in the tables.  Because the 32-bit shift
leaves the low-order bits of the intermediate remainder zero, the
final CRC is simply the XOR of the 4 table look-ups.”h]”hXN  This can be extended to â€œslicing by 4â€ using 4 256-entry tables.
Each step, 32 bits of data is fetched, XORed with the CRC, and the result
broken into bytes and looked up in the tables.  Because the 32-bit shift
leaves the low-order bits of the intermediate remainder zero, the
final CRC is simply the XOR of the 4 table look-ups.”…””}”(hjú  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Kœhh£hžhubh¸)”}”(hŒÙBut this still enforces sequential execution: a second group of table
look-ups cannot begin until the previous groups 4 table look-ups have all
been completed.  Thus, the processor's load/store unit is sometimes idle.”h]”hŒÛBut this still enforces sequential execution: a second group of table
look-ups cannot begin until the previous groups 4 table look-ups have all
been completed.  Thus, the processorâ€™s load/store unit is sometimes idle.”…””}”(hj  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K¢hh£hžhubh¸)”}”(hXœ  To make maximum use of the processor, "slicing by 8" performs 8 look-ups
in parallel.  Each step, the 32-bit CRC is shifted 64 bits and XORed
with 64 bits of input data.  What is important to note is that 4 of
those 8 bytes are simply copies of the input data; they do not depend
on the previous CRC at all.  Thus, those 4 table look-ups may commence
immediately, without waiting for the previous loop iteration.”h]”hX   To make maximum use of the processor, â€œslicing by 8â€ performs 8 look-ups
in parallel.  Each step, the 32-bit CRC is shifted 64 bits and XORed
with 64 bits of input data.  What is important to note is that 4 of
those 8 bytes are simply copies of the input data; they do not depend
on the previous CRC at all.  Thus, those 4 table look-ups may commence
immediately, without waiting for the previous loop iteration.”…””}”(hj  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K¦hh£hžhubh¸)”}”(hŒvBy always having 4 loads in flight, a modern superscalar processor can
be kept busy and make full use of its L1 cache.”h]”hŒvBy always having 4 loads in flight, a modern superscalar processor can
be kept busy and make full use of its L1 cache.”…””}”(hj$  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K­hh£hžhubh¸)”}”(hŒ<Two more details about CRC implementation in the real world:”h]”hŒ<Two more details about CRC implementation in the real world:”…””}”(hj2  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K°hh£hžhubh¸)”}”(hX¸  Normally, appending zero bits to a message which is already a multiple
of a polynomial produces a larger multiple of that polynomial.  Thus,
a basic CRC will not detect appended zero bits (or bytes).  To enable
a CRC to detect this condition, it's common to invert the CRC before
appending it.  This makes the remainder of the message+crc come out not
as zero, but some fixed non-zero value.  (The CRC of the inversion
pattern, 0xffffffff.)”h]”hXº  Normally, appending zero bits to a message which is already a multiple
of a polynomial produces a larger multiple of that polynomial.  Thus,
a basic CRC will not detect appended zero bits (or bytes).  To enable
a CRC to detect this condition, itâ€™s common to invert the CRC before
appending it.  This makes the remainder of the message+crc come out not
as zero, but some fixed non-zero value.  (The CRC of the inversion
pattern, 0xffffffff.)”…””}”(hj@  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h K²hh£hžhubh¸)”}”(hX  The same problem applies to zero bits prepended to the message, and a
similar solution is used.  Instead of starting the CRC computation with
a remainder of 0, an initial remainder of all ones is used.  As long as
you start the same way on decoding, it doesn't make a difference.”h]”hX  The same problem applies to zero bits prepended to the message, and a
similar solution is used.  Instead of starting the CRC computation with
a remainder of 0, an initial remainder of all ones is used.  As long as
you start the same way on decoding, it doesnâ€™t make a difference.”…””}”(hjN  hžhhŸNh Nubah}”(h]”h ]”h"]”h$]”h&]”uh1h·hŸh¶h Kºhh£hžhubeh}”(h]”Œ!brief-tutorial-on-crc-computation”ah ]”h"]”Œ!brief tutorial on crc computation”ah$]”h&]”uh1h¡hhhžhhŸh¶h Kubeh}”(h]”h ]”h"]”h$]”h&]”Œsource”h¶uh1hŒcurrent_source”NŒcurrent_line”NŒsettings”Œdocutils.frontend”ŒValues”“”)”}”(h¦NŒ	generator”NŒ	datestamp”NŒsource_link”NŒ
source_url”NŒtoc_backlinks”Œentry”Œfootnote_backlinks”KŒsectnum_xform”KŒstrip_comments”NŒstrip_elements_with_classes”NŒstrip_classes”NŒreport_level”KŒ
halt_level”KŒexit_status_level”KŒdebug”NŒwarning_stream”NŒ	traceback”ˆŒinput_encoding”Œ	utf-8-sig”Œinput_encoding_error_handler”Œstrict”Œoutput_encoding”Œutf-8”Œoutput_encoding_error_handler”j‡  Œerror_encoding”Œutf-8”Œerror_encoding_error_handler”Œbackslashreplace”Œlanguage_code”Œen”Œrecord_dependencies”NŒconfig”NŒ	id_prefix”hŒauto_id_prefix”Œid”Œdump_settings”NŒdump_internals”NŒdump_transforms”NŒdump_pseudo_xml”NŒexpose_internals”NŒstrict_visitor”NŒ_disable_config”NŒ_source”h¶Œ_destination”NŒ_config_files”]”Œ7/var/lib/git/docbuild/linux/Documentation/docutils.conf”aŒfile_insertion_enabled”ˆŒraw_enabled”KŒline_length_limit”M'Œpep_references”NŒpep_base_url”Œhttps://peps.python.org/”Œpep_file_url_template”Œpep-%04d”Œrfc_references”NŒrfc_base_url”Œ&https://datatracker.ietf.org/doc/html/”Œ	tab_width”KŒtrim_footnote_reference_space”‰Œsyntax_highlight”Œlong”Œsmart_quotes”ˆŒsmartquotes_locales”]”Œcharacter_level_inline_markup”‰Œdoctitle_xform”‰Œdocinfo_xform”KŒsectsubtitle_xform”‰Œimage_loading”Œlink”Œembed_stylesheet”‰Œcloak_email_addresses”ˆŒsection_self_link”‰Œenv”NubŒreporter”NŒindirect_targets”]”Œsubstitution_defs”}”Œsubstitution_names”}”Œrefnames”}”Œrefids”}”Œnameids”}”ja  j^  sŒ	nametypes”}”ja  ‰sh}”j^  h£sŒfootnote_refs”}”Œcitation_refs”}”Œautofootnotes”]”Œautofootnote_refs”]”Œsymbol_footnotes”]”Œsymbol_footnote_refs”]”Œ	footnotes”]”Œ	citations”]”Œautofootnote_start”KŒsymbol_footnote_start”K Œ
id_counter”Œcollections”ŒCounter”“”}”…”R”Œparse_messages”]”Œtransform_messages”]”Œtransformer”NŒinclude_log”]”Œ
decoration”Nhžhub.