Appendix 2: How to detect the internal audio error correction ability of a CD-ROM drive
1 Introduction
The performance of CD-ROM drives and CD players reading scratched or worn CDs depends on the quality
of their components, but also on the error correction performed by their
chipset. The audio data coming from the surface of the CD is converted into
binary information. It is divided into frames of 24 bytes each, that is, 6 stereo
samples. These frames pass through two layers of error correction. The first
layer is called C1. Thanks to the presence of 4 bytes of error correction
information, the chipset can correct up to two wrong bytes there. If there
are more than two wrong bytes at the C1 stage, the data is de-interleaved so
that the wrong bytes are scattered over different frames, and passed to the
C2 stage. There, three kinds of strategies can be used to correct errors: it is
possible to correct up to two wrong bytes in the C2 stage, up to three, or
up to four. When there are more wrong bytes, the chipset can't
correct the errors, so the wrong samples are replaced by a guess of their
value, interpolated from the neighboring samples, provided these are themselves correct.
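These three capacities are consistent with the standard Reed-Solomon decoding bound, which the following lines state for reference (textbook coding theory, not a measurement made here):

# Standard Reed-Solomon bound: with e wrong bytes at unknown positions and
# f wrong bytes at known (flagged) positions, decoding succeeds when
# 2*e + f <= number of parity bytes. Both CIRC codes carry 4 parity bytes:
# C1 is a (32, 28) code and C2 a (28, 24) code.
def correctable(e, f, parity=4):
    return 2 * e + f <= parity

print(correctable(e=2, f=0))  # True : up to two wrong bytes, positions unknown
print(correctable(e=0, f=3))  # True : three wrong bytes at flagged positions
print(correctable(e=0, f=4))  # True : four wrong bytes at flagged positions
print(correctable(e=3, f=0))  # False: three unknown wrong bytes are too many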
There is another possible difference between the error correction strategies
used. When the errors present in the C1 stage can't be corrected, it is
possible to flag all 24 audio bytes in the frame as wrong, because the C1
error correction is unable to sort the right bytes from the wrong ones if too much
information is lost. But it is also possible to keep track of the EFM decoding that
occurred just before, so that only the bytes that were already unreadable at
the EFM stage are flagged wrong. The other bytes, properly read by
the EFM decoder, are assumed to be correct. There is one chance out of 64
that they are wrong all the same, but the C2 stage will detect and correct
them anyway.
Last, it is possible to come back to C1 again once the C2 error
correction is done, in order to take advantage of the information restored in the C2
stage for further error correction [5].
According to [7], the more wrong bytes we try to
correct, the less secure the error correction becomes. So when there are few
errors, it is better to use a weak error correction strategy, because it
ensures that errors are detected with more confidence. But when there are too
many errors, it is better to switch to a stronger strategy, because it may
still be possible to correct nearly all of them. Better to be a bit less sure
that everything is correct, but still get a perfect result most of the time,
than to be always sure, interpolate all errors, and get an average result
every time.
We are now going to determine the kind of strategy used by a drive to correct a big burst error.
2 The experiment
Analysing the C2 results
of the Memorex DVDMaxx 1648 on the DAEquality test CD, I came across the following
graph (see the C2 results with the DAEquality test CD):

1 - Errors and undetected errors for each second in
the black mark range on the CD, without the peaks, sorted by number of errors.
Legend: errors per second, for every second (pink curve);
undetected errors per second x100 (here: 0, 1, 2, or 3)
The first thing that
seems strange is the profile of the pink curve. It is expected to rise
evenly, but instead, it rises step by step. Here's a close-up:

2 - Close-up on the first three steps of figure 1
The errors recorded each
second are mostly multiples of 60, sometimes a little less. Why?
I subtracted the
extracted wave from the reference one, in order to look at the errors in a
wave editor. Looking at the resulting waveform, it was clear that the errors
always came in isolated bursts of 60, and each burst had the errors at exactly the
same places. Sometimes one or two are missing: since the differences
are interpolated, their amplitude is very small, and some of them, by chance,
are equal to zero. Thus they are not visible.
Click here to
view a picture of the complete pattern for a 60-sample burst error. It is
543 samples wide from the first to the last error. Blue bars show the
differences between the extracted wave and the reference one; each one marks
a wrong value in the copy. Right values are all the null samples in between:
being the subtraction of the copy from the original, they are zero wherever the
copy matches the original.
The horizontal axis marks are samples, and the vertical ones are elementary 16-bit
steps (maximum zoom).
The steps in the error
graph don't show up when reading a bad CD-R:

3 - Errors in a bad CDR, same settings as the pink curve of figure
2
The difference between
the test CD and a bad CD-R is the distribution of the errors on the CD. On a
bad CD-R, the errors occur when the drive is unable to properly read the
data. Since the CD-R is evenly worn over all its surface, the errors occur
randomly. With the test CD, on the other hand, errors come from a little
black mark drawn on the surface, which makes about one millimeter of groove
completely unreadable, while the rest is perfect.
The CD is read at the linear speed of 1.2 meters per second, and the black
mark begins 31 millimeters from the center of the CD.
One rotation is
31 x 2 x Pi = 195 mm.
Its duration is 195/1200 = 0.16 seconds, that is 7166 samples. So, at the
beginning of the errors, the elementary patterns should be spaced by at least
7000 samples, more if the errors begin further into the black mark,
farther from the center of the CD.
In the wave, the first errors are 301000 samples apart, then
164000, 284000... There is no point trying to divide these numbers by 7000, since more
than 20 rotations separate two of them: as the theoretical delay increases,
the inaccuracy becomes too big.
Further into the wave, the errors are closer. An elementary spacing appears
between the bursts, of about 9243 samples. That is 0.210 seconds, that
is about 250 millimeters, that is one rotation at 40 mm from the center of
the CD. This is still in the black mark, which is 2 mm wide at this place. So it
lasts 0.0017 seconds, that is 73.5 samples, that is 12 audio frames.
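These figures can be checked with a few lines of Python (the constants are the ones given above; the small difference with 7166 comes from the text rounding the circumference to 195 mm):

# Checking the geometry above (constants from the text: 44100 stereo samples
# per second, linear speed 1.2 m/s, mark radius 31 mm, 2 mm mark width at 40 mm).
from math import pi

SAMPLE_RATE = 44100        # stereo samples per second
LINEAR_SPEED = 1200        # disc speed in mm per second (1.2 m/s)

def rotation_samples(radius_mm):
    """Stereo samples elapsed during one disc rotation at the given radius."""
    return 2 * pi * radius_mm / LINEAR_SPEED * SAMPLE_RATE

print(rotation_samples(31))   # ~7158 (rounding the circumference to 195 mm gives 7166)
print(rotation_samples(40))   # ~9236, close to the observed 9243-sample spacing

# Width of the mark where it is 2 mm wide, in stereo samples and 6-sample frames
width = 2 / LINEAR_SPEED * SAMPLE_RATE
print(width, width / 6)       # 73.5 samples, ~12 frames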
So each time the black
mark is encountered, an error can occur that is always exactly the same. It
must match the pattern of a given burst error on the CD, after CIRC decoding.
CIRC is the name of the
method used to encode PCM data, as it appears in wav files (44100
stereo samples of 16 bits per second), onto audio CDs. I found two resources
on the web explaining CIRC in detail, which allowed me to reconstruct the
observed pattern: Kelin Kuhn's paper [1] and the ECMA-130 specification from ecma.ch [2]. It is advisable to read one of them in order
to understand the following.
Kuhn's paper is better explained, but beware that in his text, C1 and C2 are
inverted!
In order to go on, you
will need the CIRC decoder diagram at hand, Fig. C2 in ECMA-130. Beware that
the "delay of one byte" and "delay of two bytes" are in
fact delays of one and two frames respectively. What ECMA-130 calls
"F1 frame times" are in fact full frames, be they 24, 28, or 32
bytes. At the top left, "32 8-bit bytes" means "thirty-two 8-bit bytes".
3 Image of a 16-frame EFM burst error with no EFM error detection and
4-byte C2 correction
3.1
Error pattern in the C1 decoder
A CD player or a CD-ROM
drive can correct up to two, three, or four wrong symbols in a C2 frame,
according to the type of strategy used. In the four wrong symbols
case, it means that if four bytes are changed in a C2 frame, this can always be
detected thanks to the additional information provided by the four parity
bytes, and as long as no more than four bytes are affected in this frame, the
chipset can always calculate their original value and correct the error, even
if they are completely damaged.
As the errors we get come in clusters of 60 and not 5, it is possible that
the chipset doesn't use EFM information for error detection, because as soon
as a small part of a C1 frame is affected (3 or 4 bytes), 24 audio
bytes will be lost at once at the C1 stage. Maybe this will account for the
60 errors. Let's see what can happen.
The black mark masks
several consecutive frames of audio on the CD, at the extreme left of the
CIRC decoder diagram (which stands for one frame only). I will call this
"CD" data "EFM", because the previous step that could be
shown on the left of the CIRC diagram is the EFM decoder.
These lost data, coming from the left of the diagram, pass through the first
delay line before reaching the C1 decoder. There, if more than 2 bytes are
wrong, generally, the whole frame is marked as wrong [4].
For an error to occur at the output, at least five bytes must be affected in
a C2 frame. But if only one C1 frame is destroyed, we can see that it will
affect only one byte in each of 28 different C2 frames. If four consecutive C1 frames
are destroyed, they will affect 1 byte in 112 consecutive C2 frames, and still
generate no error, since no C2 frame has more than one wrong byte. Therefore
our original EFM error, which is supposed to generate errors at the output,
must run longer than 2 frames, and thus includes at least one full
frame (and certainly more, but let's take it step by step).
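This spreading is easy to check numerically. Below is a minimal sketch assuming the delay rule shown in the decoder diagram (byte k of a C1 frame waits 4 x (27 - k) frames before reaching the C2 decoder, so byte 0 is delayed 108 frames, as detailed further down):

from collections import Counter

def c2_hits(destroyed_c1_frames):
    """Count wrong bytes per C2 frame when every byte of the given C1 frames
    is wrong; byte k is delayed 4*(27 - k) frames before the C2 decoder."""
    hits = Counter()
    for f in destroyed_c1_frames:
        for k in range(28):
            hits[f + 4 * (27 - k)] += 1
    return hits

print(len(c2_hits([0])), max(c2_hits([0]).values()))   # 28 frames hit, 1 byte each
h = c2_hits(range(4))
print(len(h), max(h.values()))                         # 112 frames hit, 1 byte each
print(max(c2_hits(range(17)).values()))                # 5 -> uncorrectable with 4-byte C2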
If it begins before byte number 27 in a given frame, the
corresponding C1 frame will have at least bytes 26, 28 and 30 affected, and
will be flagged as all wrong. If it begins on byte 27 or later, the
corresponding C1 frame will have all its even audio bytes, from 0 to 26,
correct. It can be left all correct, or be flagged all wrong because
bytes 28 and 30 are wrong [4]. The burst error being bigger than two
frames, the next C1 frame will have at least all its even bytes wrong (from
the next EFM frame), and be flagged all wrong.
Then, as the EFM error runs, all the subsequent C1 frames will be flagged
wrong.
The C1 frame corresponding to the EFM frame where the burst error stops will
have all its odd bytes wrong and will be flagged wrong. The next one can have
byte number 1 wrong, or, less likely, 1 and 3, if the EFM burst error stops after
byte 0 and before byte 5.
Thus 95% of the burst errors (in the unlikely cases above, 90%) will affect an
integer number of consecutive C1 frames, depending on where exactly the error stops.
3.2
Minimal size of the error
We just saw that a burst
error such as the one on the DAEquality test CD, if EFM is not used for error
detection, will most of the time turn all bytes wrong in a given number of
consecutive C1 frames. Except for little glitches here and there in the data
returned by the Memorex drive, the smallest common error is a burst of 60
samples. Let's find the smallest number of wrong C1 frames capable of
creating uncorrectable errors.
It is easier to start from the CIRC encoder diagram (ECMA-130 fig C1) this
time. The C1 and C2 stages are the columns labeled "Generation of four parity
bytes". Between them there is just one delay line, which spaces consecutive C2
bytes four C1 frames apart in the encoding process.
If we consider a situation in which up to four wrong bytes per
C2 frame can be corrected, then for an uncorrectable error to occur, at least one C2 frame must
have five wrong bytes. These bytes must all come from inside the affected C1
frames. For the C1 error to be the shortest, the 5 C2 bytes must be
consecutive. This way, they are scattered across only 17 C1 frames:
if the first byte is in C1 frame 1, the second will be in C1 frame 5, the
next in frame 9, then 13, and 17.
Thus at least 17 C1 frames are wrong. This is the smallest common error.
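In code form, with the same 4-frame step per byte position as above, the minimal C1 span for each C2 strategy comes out directly:

def min_c1_span(wrong_bytes_needed):
    """Smallest number of consecutive all-wrong C1 frames that can put
    `wrong_bytes_needed` consecutive bytes into a single C2 frame, given
    the 4-frame delay step between adjacent byte positions."""
    return 4 * (wrong_bytes_needed - 1) + 1

for n in (3, 4, 5):
    print(n, "wrong C2 bytes ->", min_c1_span(n), "C1 frames")
# 3 -> 9, 4 -> 13, 5 -> 17, matching the patterns computed below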

4 - Gathering of five bytes from the C1 burst error in a C2 frame
3.3
Image of the error after decoding
A 17-frame C1 burst
error will not only put 5 wrong bytes in one C2 frame: any series of
consecutive bytes from frames 1, 5, 9, 13 and 17 will gather in a C2 frame
and generate an uncorrectable error.
In order to compute the pattern of the resulting wrong samples, let's use
a Microsoft Excel document again. In the first column, write all the byte
sequences that will find themselves in the same C2 frame. The first one is 0,
1, 2, 3, 4. The next sequence of wrong bytes is 1, 2, 3, 4, 5, then 2, 3, 4,
5, 6, etc., until 23, 24, 25, 26, 27.
Then, in the next five columns, write the delay affecting
each of the bytes listed in the first column, in mono samples, according to
the interleaving pattern between the C2 decoder and the final 24-byte frame
in the CIRC decoder diagram. One mono sample is 2 bytes; two frames are 24
mono samples. For example, for bytes 0, 1, 2, 3, and 4, counting the first
sample as zero, it will be 0, 0, 4, 4 and 8.
In any sequence, the
first and last bytes are in the first and last frames of the burst error.
Therefore the C2 frame in which they gather can be positioned relative to
the first frame of the C1 burst error by looking at the delay affecting the
first byte of each sequence. For example, in the sequence 0, 1, 2, 3, 4,
byte 0, according to the CIRC decoder figure, is delayed by 27D = 27 x 4 = 108
frames between the C1 and the C2 stage. Write this delay, in mono samples, in
the next column.
Then, in the next five
columns, add the sample positions to the frame delays. The 24 x 5 = 120
resulting numbers give the relative position of each wrong sample in the wave
file.
Paste all of them into another Excel file, sort them in ascending order and delete
all duplicates. Then, in the next column, add an even constant so that the
first sample is 0 or 1, and divide by two in order to get the delays in
stereo samples. Since left and right samples are stored alternately, integer
numbers stand for the left channel, and fractional numbers for the right channel.
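For readers who prefer code to spreadsheets, here is a sketch of the same computation in Python. The POST_C2_DELAY table is deliberately incomplete: only the five example values given above (0, 0, 4, 4, 8 mono samples for bytes 0 to 4) are filled in, and the remaining entries must be read off the CIRC decoder diagram before the full 60-sample pattern comes out.

# Sketch of the Excel procedure above. POST_C2_DELAY is the per-byte delay, in
# mono samples, between the C2 decoder and the final 24-byte frame. Replace the
# Nones with the values read off ECMA-130 fig C2 to get the complete pattern.
POST_C2_DELAY = [0, 0, 4, 4, 8] + [None] * 23

MONO_PER_FRAME = 12      # one frame = 24 bytes = 12 mono samples

positions = set()        # the set removes duplicates, like the manual sort-and-delete
for start in range(24):                      # sequences 0..4, 1..5, ..., 23..27
    # the C2 frame gathering this sequence is delayed 4*(27 - start) frames
    # relative to the first C1 frame of the burst
    frame_delay_mono = 4 * (27 - start) * MONO_PER_FRAME
    for b in range(start, start + 5):
        if POST_C2_DELAY[b] is None:         # unknown until read off the diagram
            continue
        positions.add(frame_delay_mono + POST_C2_DELAY[b])

# normalise so the pattern starts near zero (the text adds an even constant),
# then convert to stereo samples: integers = left channel, x.5 = right channel
base = min(positions)
base -= base % 2
print(sorted((p - base) / 2 for p in positions))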
Here are the resulting
Excel files for two, three, and four C2 bytes correction:
Delays for a 17 frames C1 burst error (with five wrong C2 bytes)
Error pattern for a 17 frames C1 burst error (with five wrong C2 bytes)
Delays for a 13 frames C1 burst error (with four wrong C2 bytes)
Error pattern for a 13 frames C1 burst error (with four wrong C2 bytes)
Delays for a 9 frames C1 burst error (with three wrong C2 bytes)
Error pattern for a 9 frames C1 burst error (with three wrong C2 bytes)
The Memorex burst error
matches exactly the one predicted for the four wrong C2 bytes strategy. Since
the pattern features 60 samples spread over a range of about 1000 mono
samples, this result can't be the effect of chance. This drive corrected up
to four wrong bytes at the C2 stage of error correction, and most C1 frames
turned out flagged either all right or all wrong.
We saw that the black mark covered about 12 frames. It means that the drive
didn't catch the groove again immediately after the black mark, but a little
further on. Besides, we also saw that the first errors are spaced by about 20
rotations; therefore most of the time (19 times out of 20), the groove is
caught again before 17 C1 frames are lost, and all errors are corrected.
The error can also span more than 17 frames, but this occurs further along in
the wave file, where the black mark is wider on the CD. There the error
clusters get more and more complex, as the elementary patterns get mixed
together.
In conclusion, without
using EFM for error detection and correcting up to four wrong bytes at the C2
stage, the elementary burst error is 60 samples distributed like this:
0.5 2.5 4.5 24.5 26.5 28.5 48.5 50.5 52 72.5 74.5 76 96.5 98 100 120.5 122
124 144 146 148 168 170 172 192 194 216 218 240 264 279.5 303.5 325.5 327.5
349.5 351.5 371.5 373.5 375.5 395.5 397.5 399.5 419.5 421.5 423 443.5 445.5
447 467.5 469 471 491.5 493 495 515 517 519 539 541 543
Picture / Wave file
Without EFM error detection
and correcting up to three wrong bytes at the C2 stage, the elementary burst
error is 52 samples distributed like this:
0.5 2.5 22.5 24.5 26.5 46.5 48.5 70.5 72.5 74 94.5 98 118.5 120 122 144 146
166 168 170 190 192 214 216 238 262 301.5 325.5 347.5 349.5 371.5 373.5 393.5
395.5 397.5 417.5 419.5 441.5 443.5 445 465.5 469 489.5 491 493 515 517 537
539 541 561 563
Picture / Wave file
Without EFM error
detection and correcting up to two wrong bytes at the C2 stage, the
elementary burst error is 44 samples distributed like this:
0.5 2.5 24.5 26.5 46.5 48.5 70.5 72.5 94.5 98 118.5 122 144 146 168 170 190
192 214 216 238 262 325.5 349.5 371.5 373.5 395.5 397.5 417.5 419.5 441.5
443.5 465.5 469 489.5 493 515 517 539 541 561 563 585 587
Picture / Wave file
4 Using EFM information
Before
the CIRC decoder, the data coming from the CD passes through the EFM
decoder. There, each valid 14-bit symbol is converted into the matching 8-bit
byte, according to the EFM table [2]. Since there are 2^8 = 256 possible
8-bit symbols and 2^14 = 16384 possible 14-bit symbols, only one 14-bit
symbol out of 64 is used.
Thus, when an error occurs, most of the time the EFM decoder can't find an 8-bit
byte matching the erroneous 14-bit symbol. It is possible to take
advantage of this: when three or four bytes are wrong in a given frame,
the C1 decoder can correct them thanks to their positions,
given by the EFM failures, whereas using the parity bytes alone, no
more than two wrong bytes can be corrected at the C1 stage. Furthermore, when
there are more than four errors in a frame, instead of marking all 28 bytes wrong,
it is possible to mark only the ones that the EFM decoder couldn't translate.
As there is one chance out of 64 of generating a valid EFM symbol by
accident, we are not sure that all wrong bytes are marked wrong, but the C2
layer detects the missed errors. The advantage is not to overload the C2
decoder with suspicious bytes: when all bytes of a C1 frame are marked wrong,
valid bytes that could have been used for C2 error correction are considered
wrong all the same. If one frame has only four wrong bytes, for example, the
corresponding C2 frame gets five bytes marked wrong instead of four, and the
error correction is given up.
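As a rough illustration of that 1-in-64 figure, here is a toy simulation (the frame is modeled as a bare count of unreadable bytes, and the 6-byte error count is an arbitrary example):

import random

VALID_FRACTION = 2**8 / 2**14    # 1/64: chance a corrupted 14-bit word is still valid EFM

def simulate_frame(n_bad=6, trials=100_000):
    """Toy model: in a C1 frame with n_bad unreadable bytes, estimate how often
    the EFM decoder misses at least one of them (the corrupted word happens to
    be a valid EFM symbol), leaving it for the C2 layer to catch."""
    missed = 0
    for _ in range(trials):
        if any(random.random() < VALID_FRACTION for _ in range(n_bad)):
            missed += 1
    return missed / trials

print(simulate_frame())   # ~ 1 - (63/64)**6, about 0.09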
If a burst error
affecting N frames occurs on the CD, the C1 frames will keep track of most of
the errors detected by the EFM decoder. Because of the delay line between EFM
and C1, each C1 frame keeps its even bytes, but receives the delayed odd
bytes from the previous frame. Thus the first C1 frame resulting from the burst error
will have wrong even bytes and right odd bytes, all following C1 frames until
the Nth will have all bytes wrong, and frame number N+1 will only have
the odd bytes wrong. Such C1 frames with every other byte wrong will be
referred to as "combed frames".
But the error no longer needs to affect an integer number of frames.
For example, if it stops at the first quarter of a frame, the generated C1
frame will have all its odd bytes wrong, because the previous EFM frame from
which they come was all wrong, plus the first quarter of its even bytes also
wrong. The next frame will just have the first quarter of its odd bytes
affected. Thus the burst error, once passed to the C1 level, will have a
combed pattern that is one frame long at each end, made of 16 even bytes
at the beginning, and 16 odd bytes at the end.
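Here is a small sketch of this combing, under the same assumption as above (even bytes taken from the current EFM frame, odd bytes delayed from the previous one):

def c1_wrong_bytes(burst_start, burst_len, n_frames=8):
    """Which bytes of each C1 frame are wrong after the first delay line, for
    an EFM burst covering frames [burst_start, burst_start + burst_len):
    even bytes come from the current EFM frame, odd bytes from the previous one."""
    burst = range(burst_start, burst_start + burst_len)
    result = {}
    for f in range(n_frames):
        result[f] = [b for b in range(32)
                     if (f if b % 2 == 0 else f - 1) in burst]
    return result

for f, wrong in c1_wrong_bytes(burst_start=1, burst_len=3).items():
    tag = ("clean" if not wrong else
           "all wrong" if len(wrong) == 32 else "combed")
    print(f, tag, len(wrong))
# frame 1: 16 even bytes wrong (combed), frames 2-3: all 32 wrong,
# frame 4: 16 odd bytes wrong (combed)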
This changes the patterns computed above for strategies not using EFM info.
Without EFM info, once two or three bytes were wrong in a C1 frame, the whole
frame was marked wrong; that's why, as the burst error widens, the resulting
number of wrong samples at the output grew step by step: 60 when 17 C1
frames were affected, 120 when 18 were affected, 180 for 19, etc. Now, the
number of errors at the output is going to rise evenly as the burst error
widens, because the number of wrong C1 bytes grows byte by byte instead
of frame by frame. This makes it easy to detect whether EFM information is used
for error detection.
In fig. 4 above, we saw
that a C2 uncorrectable error comes from C1 frames spaced every 4 frames (for
our little burst error). The byte sequences taking part in the C2 error are
3, 4, or 5 bytes long according to the C2 error correction strategy, with
byte numbers following each other. Thus the first and last byte numbers will
be of the same parity for 3- and 5-byte sequences, and of opposite parities
for 4-byte sequences. Those bytes are at the beginning and at the end of the
error. Thus, if we consider an EFM error with the smallest number of frames
necessary to generate a C2 error, those bytes will fall into the combed
parts at the C1 stage. EFM errors of 8, 12, and 16 frames will generate C1
errors of 9, 13, and 17 frames, but with the first and the last ones combed.
Since the last erroneous C1 frame will have its odd bytes affected while the
first will have its even ones, the result will depend on the parity of the
first and last byte in the sequence generating the C2 error.
If the parity is the same (sequences of 3 or 5 bytes), when the first byte of
the sequence is wrong (even position), the last is right (an even position
again, but in the end frame, where only odd bytes are wrong), and conversely.
Thus no C2 error is generated. Let's see what
happens if one more frame is wrong. Every sequence beginning with an even
byte is wrong, because now the frame holding the last byte of the sequence
is completely wrong. But this is also true for sequences beginning in the
second frame and finishing with an odd byte.
Let's add another column to the Excel file, with an additional delay of 12
mono samples for sequences beginning with an odd byte. Here are the results:
Delays for a 9 frames EFM burst error (with three wrong C2 bytes and using EFM info)
Error pattern for a 9 frames EFM burst error (with three wrong C2 bytes and using EFM info)
Delays for a 17 frames EFM burst error (with five wrong C2 bytes and using EFM info)
Error pattern for a 17 frames EFM burst error (with five wrong C2 bytes and using EFM info)
We can see that they look
like the ones obtained without the use of EFM info, but with every other group of
three wrong samples shifted one frame (6 stereo samples) to the left.
The case with four wrong bytes is different because, the first and last bytes
of each sequence having opposite parities, when only 12 EFM frames are
affected, the wrong byte sequences beginning in the first combed C1 frame
will also finish on wrong bytes in the 13th frame, which is also
combed. Thus, in this case, we get half of the sequences that are wrong
without using EFM info. Here is the resulting pattern:
Delays for a 12 frames EFM burst error (with four wrong C2 bytes and using EFM info)
Error pattern for a 12 frames EFM burst error (with four wrong C2 bytes and using EFM info)
Unlike the previous
patterns, obtained when EFM info is not used, these patterns won't often happen
exactly as given. The burst errors actually occurring in the output file will be
shorter at the beginning, showing only small parts of the patterns linked
above, because the burst error won't reach the total number of needed frames
at once. The first frame that generates errors (the 9th, 12th,
or 17th) will be progressively filled, byte after byte, generating
bigger and bigger error patterns, until the burst reaches the size matching the
given patterns, and it will keep growing after that. It is even quite
possible for these patterns never to appear exactly, since the burst error
can jump from just under the needed number of frames to just over it.
This holds if the
burst error starts and stops at frame boundaries. What will happen if it is
offset by half a frame, for example? The first C1 frame will have its
second half combed, which will match the second half of the previous last C1
frame. About half of the pattern above will appear (the first half, since the
delays diminish as the byte number increases). The next C1 frame, which was
fully affected but did not take part in any C2 error, will have its first
half combed, and a new C1 frame after the previous last frame will
also have its first half combed, so about the second half of the pattern
will appear too, but offset 6 stereo samples to the right.
5 Performing double pass
There
may be another strategy to perform CIRC decoding: double pass [5].
Once the C2 error correction is done, the data goes back into the delay line
and the C1 and C2 error corrections take place again. This makes it possible
to correct errors that were uncorrectable in the first pass.
For example, imagine a random error consisting of five C1 frames, spaced 4
frames apart, each of them having just the lowest uncorrectable number of wrong
bytes, e.g. three, for a total of 5 x 3 = 15 wrong bytes, instead of being all wrong.
Now imagine that 5 of them, one from each frame, all gather in the same C2
frame, while the others are scattered into other frames. In the single pass
strategy, this C2 frame has an uncorrectable error, with at least 3 samples
affected. But if the data is sent back to C1, all the other bytes, having been
sent into other C2 frames, have been corrected in the meantime. Now the only
wrong bytes left in the second C1 pass are the five that were in the
same C2 frame, and each of them is back in its own C1 frame, where it
is the only bad byte left. Each of these C1 frames having just one wrong byte,
they can now be corrected, and nothing wrong is left at all. The C2
error doesn't occur anymore in the second C2 pass, since the C1 level no longer
sends wrong bytes. In this case, performing a double pass allowed an
uncorrectable C2 error to be corrected.
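This scenario is simple enough to check with a toy model. In the sketch below, bytes are mere wrong/right flags, C1 corrects a frame with at most 2 wrong bytes, C2 a frame with at most 4 flagged bytes, and the byte positions are chosen (arbitrarily) so that one byte of each of the five C1 frames meets in the same C2 frame:

def c2_frame_of(c1_frame, byte):
    """Decoder delay: byte k waits 4*(27 - k) frames before the C2 decoder."""
    return c1_frame + 4 * (27 - byte)

# Five C1 frames (0, 4, 8, 12, 16), three wrong bytes each. Byte k of frame 4*k
# lands in C2 frame 108 for all five; the other ten bytes scatter into distinct
# C2 frames (68 - 4k and 48 - 4k).
wrong = {(4 * k, b) for k in range(5) for b in (k, 10 + 2 * k, 15 + 2 * k)}

def c1_pass(wrong):
    """C1 corrects any frame with at most 2 wrong bytes."""
    fixed = set(wrong)
    for f in {f for f, _ in wrong}:
        frame_bytes = {(g, b) for g, b in wrong if g == f}
        if len(frame_bytes) <= 2:
            fixed -= frame_bytes
    return fixed

def c2_pass(wrong, max_erasures=4):
    """C2 corrects any frame with at most max_erasures flagged bytes."""
    fixed = set(wrong)
    frames = {}
    for f, b in wrong:
        frames.setdefault(c2_frame_of(f, b), set()).add((f, b))
    for members in frames.values():
        if len(members) <= max_erasures:
            fixed -= members
    return fixed

after_pass1 = c2_pass(c1_pass(wrong))
print(len(after_pass1))                  # 5: the five bytes that met in one C2 frame
after_pass2 = c2_pass(c1_pass(after_pass1))
print(len(after_pass2))                  # 0: the second pass corrects everything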
In the case of a burst
error, it will be difficult to see the difference from single pass. If the
second C1 pass is the same as the first, nothing will change, so the system
only helps when the second C1 pass can correct C1 frames that were
previously uncorrectable, and only the first C2 pass can make the difference.
Therefore, the C1 frame must generate at most 1, 2, or 4 uncorrectable C2
errors, according to the error correction strategy. Otherwise, more wrong
bytes will be left after the first C2 pass, and the C1 frame won't be
corrected in the second pass, leading to the same results as the first time.
The bytes corrected in the first C2 pass will just be put back in their
positions in the second C2 pass, changing nothing from the first C2 pass.
Thus the difference can only exist for C1 frames involved in fewer than five
uncorrectable C2 errors.
5.1
Effect if EFM info is not used
Without EFM info, with
the minimal burst error studied above, each C1 frame involved in C2 errors
generates 24, 25, or 26 of them, according to the C2 error correction
ability. If the burst error is bigger, this number can rise to 28, if every
byte is involved in a C2 error. This number being far above the
required 1 or 2, the double pass strategy changes nothing about burst
error handling in this case.
5.2
Effect using EFM info
Using EFM information,
there can be some C1 frames involved in fewer than five C2 errors. In these
C1 frames, after the first C2 pass, all bytes will be corrected except the
ones that were in the C2 errors. Since every C1 byte goes into a different C2
frame, there can be no more than four bytes still affected in such a C1 frame. Thus,
granted, as [6] suggests, that 4 wrong bytes can be
corrected in a C1 frame when their positions are known, the frame can be corrected,
and generates no more C2 errors.
In the case of minimal C2 errors (no more than 3, 4, or 5 wrong bytes in a C2
frame), no C1 frame can end up generating between 1 and 4 C2 errors after
the two passes: either none, or 5 and more. As a C1 frame can usually generate 1 to 28 C2 errors, with
the burst error boundary falling anywhere inside a frame, the effect on the error
patterns will be hardly visible, lost among the patterns of various sizes that
the regular one-pass mode already generates. To see it, I think it
would be necessary to compute all possible error patterns for small burst
errors (that is, for a burst error size increasing byte by byte), list their
sizes, and point out the sizes that should be missing with the double pass
strategy. Then the effect might be visible in a graph such as figure
2 above, which stands for no use of EFM and correction of 4 C2 bytes.
6 References
[1] Kelin Kuhn, Audio Compact Disk - An Introduction.
[2] ECMA-130, Compact Disc standard, ecma.ch.
[3] Info about chipsets, provided by BobHere.
[4] Info about C1 error detection, provided by BobHere and Spath.
[5] Existence of the double pass CIRC strategy, mentioned by BobHere.
[6] C1 error correction, Writing Quality article, page 9, CDRinfo.com.
[7] Why switch between different CIRC strategies?, by BobHere, after Pohlmann.
Version 4 by Pio2001, updated February 6, 2003.
Version 1 created December 2002.