Etienne Vogt
2010-04-11 21:07:00 UTC
Hi there!
I think I have just found a well-hidden bug in the OS 3.9 FastFileSystem
(possibly also present in earlier versions).
This weekend, I took my A3000 out of storage to check and recharge
its RTC battery and found that one of the internal hard drives was nearly
dead. When it finally started up, I quickly copied its contents to another
old spare SCSI drive. Then I noticed something strange: on every boot,
the new Work partition went into validation. The validation completed
successfully, leaving the partition in read/write state, but started again
each time the volume was remounted (even the diskchange command triggered
a validation). When I ran a disk checking utility, it complained that
the volume bitmap was invalid: the last bitmap block was apparently
missing.
So I took a look at the filesystem code and it appears that the calculation
of the number of blocks required to hold the volume bitmap is faulty.
The assembler code looks like this:
MOVE.L MyBlockSize(A5),D0 ; blocksize in bytes
SUBQ.L #4,D0 ; for the checksum
LSL.L #3,D0 ; 8 bits per byte
MOVE.L D0,BitsPerBMBlock(A5)
SUBQ.L #1,D0 ; Bug here !
ADD.L MyHighKey(A5),D0
SUB.L MyReserved(A5),D0
MOVE.L BitsPerBMBlock(A5),D1
BSR UDivMod32
MOVE.L D0,NumBMBlocks(A5)
HighKey is the number of blocks minus 1, so the SUBQ.L #1,D0 shouldn't be there!
(The -1 needed to round the division up is cancelled by the ADDQ.L #1,D0 that
would otherwise have to follow the addition of HighKey.)
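To make the off-by-one concrete, here is a small Python model of the calculation above. It is only a sketch, not the FFS code: the function names are made up for illustration, `high_key` and `reserved` stand in for MyHighKey and MyReserved, and I assume UDivMod32 returns the quotient in D0, as the listing implies.

```python
def bits_per_bm_block(block_size):
    # (block size - 4 bytes of checksum) * 8 bits per byte
    return (block_size - 4) << 3

def num_bm_blocks_buggy(block_size, high_key, reserved):
    bits = bits_per_bm_block(block_size)
    # Mirrors the listing: bits - 1 + HighKey - Reserved, then integer divide
    return (bits - 1 + high_key - reserved) // bits

def num_bm_blocks_fixed(block_size, high_key, reserved):
    bits = bits_per_bm_block(block_size)
    # Usable blocks = HighKey + 1 - Reserved, rounded up to whole bitmap blocks
    usable = high_key + 1 - reserved
    return (usable + bits - 1) // bits

# Hypothetical geometry: 2 reserved blocks and 3 * 4064 + 1 usable blocks
high_key = 3 * 4064 + 1 + 2 - 1
buggy = num_bm_blocks_buggy(512, high_key, 2)
fixed = num_bm_blocks_fixed(512, high_key, 2)
```

With that made-up geometry the buggy version yields one bitmap block fewer than the corrected one; with exactly 3 * 4064 usable blocks both agree.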
The bug goes unnoticed most of the time, unless the number of usable blocks
(total minus reserved) modulo BitsPerBMBlock (giving the number of bits used
in the last bitmap block) happens to be 1: in that case the incorrect
subtraction changes it to 0, and the last bitmap block is not allocated.
With a 512-byte block size, BitsPerBMBlock is 4064, so the odds of being
hit by the bug are quite low (and lower still for bigger block sizes).
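As a rough check of that claim, the following Python sketch lists which usable-block counts make the two calculations disagree. It is an illustration only: the helper names and the scan limit of 20000 are arbitrary choices of mine, and the "buggy" branch simply models the extra subtraction described above.

```python
def bitmap_blocks(usable, bits, buggy):
    # The buggy variant has one too few in the numerator before dividing
    numerator = usable + bits - 1 - (1 if buggy else 0)
    return numerator // bits

def affected_counts(block_size, max_usable):
    bits = (block_size - 4) * 8
    # Collect every usable-block count where the results differ
    return [n for n in range(1, max_usable + 1)
            if bitmap_blocks(n, bits, True) != bitmap_blocks(n, bits, False)]

hits_512 = affected_counts(512, 20000)
hits_1024 = affected_counts(1024, 20000)
print(hits_512)
print(hits_1024)
```

Only counts that leave a remainder of 1 show up, so about one partition size in 4064 is affected with 512-byte blocks and one in 8160 with 1024-byte blocks.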
Anyway, this seems quite easy to fix, and I should be able to release
a patch soon (if anybody still cares).