block header
My objective now is to pull out some of the concepts from AgileWiki to create a very fast and reliable database.
I find myself inspired by the work done at Sun Microsystems on a reliable file system, in particular the idea of doing copy-on-write and having a checksum in each block. I also like the ability to update a backup file quickly.
The database would use fixed-size blocks which are moderately large. When I tuned AgileWiki, I found that a block size of 32K seemed to work best with the algorithms I was using, so let's use that number as the default block size, at least for now.
Here then is a proposal for the block header:
- Checksum of the Block
- Timestamp
- Data Length
- Data
The checksum would cover the timestamp, the data length and the (variable length) data. Let's use something like java.util.zip.Adler32, which is a reasonably fast checksum. The timestamp would record the last time the block was updated. This layout should make it reasonably easy to update a backup copy of the database, copying only the blocks with a timestamp greater than that of the previous backup. We can also detect corrupted data when we read a block, and it becomes easy to scan a database for bad blocks.
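To make the layout concrete, here is a minimal sketch in Java. The field widths (a 4-byte checksum, an 8-byte timestamp and a 4-byte data length) and the names BlockCodec, encode, isValid, timestampOf and needsBackup are my own assumptions for illustration; nothing above fixes them yet.

```java
import java.nio.ByteBuffer;
import java.util.zip.Adler32;

/** Sketch of the proposed layout: checksum | timestamp | data length | data. */
final class BlockCodec {
    static final int BLOCK_SIZE = 32 * 1024;  // default block size proposed above

    // Assumed field widths: 4-byte checksum, 8-byte timestamp, 4-byte length.
    static final int CHECKSUM_BYTES = 4;
    static final int TIMESTAMP_BYTES = 8;
    static final int LENGTH_BYTES = 4;
    static final int HEADER_BYTES = CHECKSUM_BYTES + TIMESTAMP_BYTES + LENGTH_BYTES;
    static final int MAX_DATA = BLOCK_SIZE - HEADER_BYTES;

    /** Builds a full-size block; unused space after the data stays zeroed. */
    static byte[] encode(long timestamp, byte[] data) {
        if (data.length > MAX_DATA) {
            throw new IllegalArgumentException("data too large for one block");
        }
        ByteBuffer buf = ByteBuffer.allocate(BLOCK_SIZE);
        buf.position(CHECKSUM_BYTES);               // leave room for the checksum
        buf.putLong(timestamp).putInt(data.length).put(data);
        buf.putInt(0, (int) checksum(buf.array(), data.length));
        return buf.array();
    }

    /** True if the stored checksum matches the timestamp, length and data. */
    static boolean isValid(byte[] block) {
        if (block.length != BLOCK_SIZE) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(block);
        int length = buf.getInt(CHECKSUM_BYTES + TIMESTAMP_BYTES);
        if (length < 0 || length > MAX_DATA) {
            return false;                           // a corrupted length field
        }
        return buf.getInt(0) == (int) checksum(block, length);
    }

    /** Reads the last-updated timestamp from the header. */
    static long timestampOf(byte[] block) {
        return ByteBuffer.wrap(block).getLong(CHECKSUM_BYTES);
    }

    /** A block needs copying if it changed after the previous backup. */
    static boolean needsBackup(byte[] block, long lastBackupTime) {
        return timestampOf(block) > lastBackupTime;
    }

    /** Adler32 over everything after the checksum field, up to the end of the data. */
    private static long checksum(byte[] block, int dataLength) {
        Adler32 adler = new Adler32();
        adler.update(block, CHECKSUM_BYTES, TIMESTAMP_BYTES + LENGTH_BYTES + dataLength);
        return adler.getValue();
    }
}
```

With this sketch, a backup pass would read each block, skip those where needsBackup returns false, and could call isValid on the rest before copying them. Adler32 only yields a 32-bit value, so a 4-byte checksum field is enough here; a stronger checksum would need a wider field.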
More about copy-on-write next time.