Friday, November 30, 2007

API changes!

The 0.7 release of cowdb went out on 28 November: 4 downloads to date. I was calling it a beta release, but it looks now like it was more of an alpha release. I'm working on the threading model and find I'm making a lot of changes to the user-level API.

When working on the API previously, I gave very little thought to multi-threading. I'm now moving to a thread-safe DbProcessor class and a thread-safe BranchProcessor class. Each DbProcessor will be unique to a db file and similarly, BranchProcessor objects will be unique to a branch. Selecting a branch will now be a matter of calling getBranchProcessor on the DbProcessor object. Overall, I think the new API will be a lot more developer-friendly. And release 0.8 will definitely be a beta release.
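
Roughly, here is the sort of shape the new API might take. The names and signatures below are illustrative guesses only, not the final code, and may well change before 0.8:

    import java.util.concurrent.Callable;

    // Hypothetical sketch of the planned classes; the real method names and
    // signatures may end up quite different.
    interface DbProcessor {
        // One DbProcessor per db file, shared by every thread using that file.
        BranchProcessor getBranchProcessor(String branchName) throws Exception;
        void close() throws Exception;
    }

    interface BranchProcessor {
        // One BranchProcessor per branch: a single writer at a time, or multiple readers.
        void update(Runnable transaction) throws Exception;      // exclusive access
        <T> T query(Callable<T> readOnlyQuery) throws Exception; // shared access
    }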

Now what is going to take time, I suspect, will be developing the tests for multi-threading. That, and perhaps a bit of debugging.

Tuesday, November 27, 2007

new home page for AgileWiki

http://agilewiki.wiki.sourceforge.net/

Enjoy!

Monday, November 26, 2007

threading model for the COW DB

With branches done and working, the threading model has resurfaced as an issue. At the moment, except for the cache shared by all the databases, nothing is thread safe. But you can have a different thread for each database. That's a bit primitive and needs changing.

Here's what I'm planning:
For each database, there can be no more than a single writer. And for each branch, either a single writer or multiple readers.

Now you can support a long-running query--just create a branch for it to run against and then delete the branch when it is done. A nice enhancement then would be to have temporary branches which are deleted when the program restarts--this would handle the problem of a crash in the middle of a long-running query.
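
One simple way to picture the single-writer/multiple-readers rule is a read-write lock per branch. The sketch below is just an illustration of the rule, not the cowdb implementation; a long-running query would instead get its own temporary branch as described above.

    import java.util.concurrent.Callable;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // Illustration of the planned rule only, not the cowdb implementation:
    // per branch, either one writer or any number of readers.
    class BranchGuard {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        void write(Runnable transaction) {
            lock.writeLock().lock();              // excludes readers and other writers
            try {
                transaction.run();
            } finally {
                lock.writeLock().unlock();
            }
        }

        <T> T read(Callable<T> query) throws Exception {
            lock.readLock().lock();               // many readers may hold this at once
            try {
                return query.call();
            } finally {
                lock.readLock().unlock();
            }
        }
    }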

No, I'm not rushing to do this. Rather, I want to spend the time working on web pages for now.

Sunday, November 25, 2007

The COW database now has branches

A busy weekend--I implemented branches. And at this point, everything works. Branches are like virtual copies. At any time, you can create a branch of either the trunk (the default branch) or any other branch. You can then apply transactions to either the original branch or the new branch. It is as if you made a copy of the whole database.



Creating a branch is very quick--the cost does not depend on the size of the branch you are copying. Deleting a branch is slower--it depends on the number of changes made to that branch since it was created, or since the last still-existing branch was created from it.
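
To give a feel for why creating a branch is cheap, here is a toy sketch--not cowdb code--in which a "branch" is nothing more than a second reference to the current root, and writes copy before they modify. A real copy-on-write store copies only the blocks it touches, not the whole map as this toy does.

    import java.util.TreeMap;

    // Toy illustration only -- a branch is just another reference to the root;
    // it is the write, not the branch, that pays for copying.
    class CowToy {
        static TreeMap<String, String> write(TreeMap<String, String> root,
                                             String key, String value) {
            TreeMap<String, String> copy = new TreeMap<>(root); // copy before writing
            copy.put(key, value);
            return copy;                                        // new root; old root untouched
        }

        public static void main(String[] args) {
            TreeMap<String, String> trunk = new TreeMap<>();
            trunk = write(trunk, "a", "1");

            TreeMap<String, String> branch = trunk;             // "creating a branch" is constant time
            branch = write(branch, "a", "changed on branch");

            System.out.println(trunk.get("a"));                 // still "1"
            System.out.println(branch.get("a"));                // "changed on branch"
        }
    }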



Remember that this is also a crash-proof transactional database. It is also quite fast for an object-oriented database, though objects do need to supply their own serialization methods to achieve that speed. (Object serialization and XML data binding are avoided--it uses plain old DataInput and DataOutput.)
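
As an illustration of the DataInput/DataOutput style, here is a minimal sketch; the actual cowdb interfaces and method names may differ.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    // Hand-written serialization sketch: plain primitives, no Java
    // serialization and no XML.
    class Point {
        int x, y;

        void writeTo(DataOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        void readFrom(DataInput in) throws IOException {
            x = in.readInt();
            y = in.readInt();
        }
    }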



I would call this beta version 0.7, and it is the first version I would call beta. It consists of about 70 *.java files. The next step is to update the AgileWiki web site. Yes, I've got reasonable javadocs, but I want something more. :-)



And no, I didn't get to implementing hashes yet.

Tuesday, November 20, 2007

an mru cache

I've now implemented a most recently updated object cache. Remember, the database already has a most recently referenced cache for retaining the N most recently referenced objects. The new cache writes dirty objects to disk when there are too many of them--and it always chooses the dirty object that has not been updated for the longest time. This solves the problem of long-running transactions that update more objects than can be held in memory.
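
The idea, roughly, in a self-contained sketch (the real cowdb code differs in the details):

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Keep dirty objects in update order; when there are too many, flush the
    // one that has gone the longest without being updated.
    class DirtyCache<K, V> {
        private final int maxDirty;
        private final LinkedHashMap<K, V> dirty = new LinkedHashMap<>();

        DirtyCache(int maxDirty) {
            this.maxDirty = maxDirty;
        }

        void markDirty(K key, V value) {
            dirty.remove(key);          // re-inserting moves the entry to the "newest" end
            dirty.put(key, value);
            if (dirty.size() > maxDirty) {
                Map.Entry<K, V> eldest = dirty.entrySet().iterator().next();
                writeToDisk(eldest.getKey(), eldest.getValue());
                dirty.remove(eldest.getKey());
            }
        }

        void writeToDisk(K key, V value) {
            // placeholder: a real database would serialize the object here
            System.out.println("flushing " + key);
        }
    }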

I've also been thinking that the database can have some of the functionality of a version control system--it can support both tags (snapshots) and branches. But I want a bit more flexibility in the control structures than can be directly implemented in a copy-on-write (COW) database.

Seems to me like a good time to implement a hash table, though I don't think having one big area of disk space dedicated to this purpose would fit very well with a COW database. But we could have a hierarchy of hash tables--that would be a good fit.

Once we have a hash table, we can create identifiers which reference an object. And then we can have multiple references to the same object. This will add a lot of flexibility, as the handle we currently use to reference an object must be the sole reference to that object. The hash table, then, would be used to map an identifier (a name) to a handle or object.

Note that we can use the table object that has already been implemented as the leaf nodes of the tree. So we only need to define the interior nodes (hash nodes) to implement our hierarchical hash.
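
To make the hierarchy concrete, here is an in-memory sketch of an interior hash node routing keys to children by their hash, with plain maps standing in for the existing table objects. The fanout and splitting policy here are assumptions for illustration only.

    import java.util.TreeMap;

    // Illustrative sketch only: an interior "hash node" routes a key to one of
    // N children by its hash; leaves are ordinary tables (maps).
    class HashNode {
        private static final int FANOUT = 16;
        private final Object[] children = new Object[FANOUT]; // HashNode or TreeMap leaf

        @SuppressWarnings("unchecked")
        void put(String name, Object handle, int depth) {
            int slot = (name.hashCode() >>> (depth * 4)) & (FANOUT - 1);
            Object child = children[slot];
            if (child == null) {
                child = new TreeMap<String, Object>();       // new leaf table
                children[slot] = child;
            }
            if (child instanceof HashNode) {
                ((HashNode) child).put(name, handle, depth + 1);
            } else {
                ((TreeMap<String, Object>) child).put(name, handle);
                // a real implementation would split an over-full leaf into another HashNode
            }
        }
    }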

Sunday, November 18, 2007

zipping long blocks, queries, boolean

I've been giving some thought to snapshots and a threading model which supports queries while the database is being updated, but it will take some time to complete. A first step is support for loading objects as read-only and support for queries (as distinct from transactions, which update the database).

I've also reinstated compression, but now only for serialized objects over a given limit--so you can tune for speed and still use compression. Boolean persistent objects are now also supported and a table can now have boolean values.
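
The size-threshold idea looks something like this. The limit and the use of GZIP below are illustrative assumptions, not necessarily what cowdb does:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.zip.GZIPOutputStream;

    // Only zip serialized objects over a given size, so small objects stay fast.
    class MaybeCompress {
        static final int LIMIT = 512;   // tune for speed vs. space

        static byte[] pack(byte[] serialized) throws IOException {
            if (serialized.length <= LIMIT) {
                return serialized;                   // small: store as-is
            }
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (GZIPOutputStream zip = new GZIPOutputStream(bytes)) {
                zip.write(serialized);
            }
            // a real format would also record whether the data was zipped
            return bytes.toByteArray();
        }
    }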

One bit of cleanup that I want to address before diving into snapshots is the problem with long-running transactions. Dirty (updated) objects are currently kept in memory until the transaction completes, and that is not a very nice restriction. Gotta fix it.

Thursday, November 15, 2007

faster and faster

Well, I've done some tuning now.
  • Null transactions (e.g. queries): 10,000,000/sec
  • Aborts: 4,000+/sec
  • Small transactions: 100+/sec

To get this, I've stopped zipping the data and only calculate checksums on the root. I'm also using rwd as the mode on the random access file instead of doing a sync when writing the root.
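
For reference, the rwd trick looks like this (the file name and layout below are made up for the example). Opening a RandomAccessFile in "rwd" mode makes each content write go through to the device, so no separate sync call is needed:

    import java.io.IOException;
    import java.io.RandomAccessFile;

    // Minimal sketch: write the root block through an "rwd" mode file.
    class RootWriter {
        public static void main(String[] args) throws IOException {
            try (RandomAccessFile file = new RandomAccessFile("demo.cowdb", "rwd")) {
                byte[] root = new byte[8 * 1024];
                file.seek(0);          // the root block lives at a fixed position
                file.write(root);      // no explicit file.getFD().sync() needed in "rwd" mode
            }
        }
    }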

I'm also looking at fully implementing the copy-on-write technology, which allows you to create snapshots of the current state of the database with virtually no overhead. I see two major advantages here:

  • Backups can be done while updating the database and
  • Queries can run while updating the database.

Both of these are significant. The first allows a database to be up 24x7. The second means that the response time for queries will always be low.

Sunday, November 11, 2007

timing tests

I've added an object cache now. Unlike everything else, the cache is thread safe. The idea is that you can have multiple databases, each one single threaded, and have a shared cache.

I've also added some very basic tests and find that I'm getting 33 transactions/second on my laptop. (I suspect that it is disk bound, so it should be much faster on a desktop.)

Saturday, November 10, 2007

COWDB: A functional mini-database

We now have a functional mini-database built on Copy-On-Write (COW) technology, beta version 0.3.

It is very, very lightweight, and crash-proof. Indeed, there is no way to take it down gracefully--just pull the plug when you're finished.

It is built using constructor-based Inversion-Of-Control (IOC) and the factory pattern, so it is easy to modify or extend.

There are a number of primitive types defined, and you can just use them or add your own:
  • PersistentInteger
  • PersistentLong
  • PersistentString
  • PersistentTable (Keys are Strings, values are other PersistentObjects. Objects are nested within a table.)
  • PersistentHandle (used to access persistent objects without nesting.)
The thing to note is that only tree-like structures are currently supported. So it is good for holding a directory structure of data that you want to access/update from a crash-proof program.
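
Here is a toy picture of that data model, with plain Java objects standing in for the persistent classes--this is not cowdb code, and all the names and values are made up. Everything nests under the root in a tree, and a handle is just an indirection to something stored outside that nesting.

    import java.util.TreeMap;

    // Toy data-model illustration: nested tables keyed by strings, plus a
    // handle-like reference that does not nest its target.
    class DataModelToy {
        public static void main(String[] args) {
            TreeMap<String, Object> root = new TreeMap<>();
            TreeMap<String, Object> users = new TreeMap<>();
            root.put("users", users);          // nested table
            users.put("count", 42);            // nested integer
            users.put("motd", "hello");        // nested string

            long handle = 17L;                 // stands in for a handle to another object
            root.put("bigReport", handle);     // a reference, not nesting
            System.out.println(root);
        }
    }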

This is a Java library, not a standalone program. And it should be blazing fast--both XML and object serialization are studiously avoided.

What remains now is a bit of polishing and a lot of testing.

Friday, November 09, 2007

the database is functional, but...

We are now at alpha version 0.2. You can create tables within tables within tables. You can create handles to reference other objects, so we are no longer constrained to having everything in the root object. Everything is just fine, but...

The allocation of space in the database is rather dysfunctional. Disk space is never reused--all space is allocated from the end of the file. Gotta fix that!

Thursday, November 08, 2007

The COW DB is now working

The Copy-On-Write DataBase now works with minimal functionality. Let's call it version 0.1.

We have 3 kinds of persistent objects:
  1. String,
  2. Table and
  3. Root.

Persistent objects can be nested, and Table is indeed a container for persistent objects with (non-persistent/regular) strings used as keys.

Root subclasses Table, so the root can hold multiple persistent objects--so long as the fixed size of the root is not exceeded. (Default size is 8K.)
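
In outline, the relationship looks something like the sketch below--my own illustration, not the actual cowdb classes. The point is just that Root behaves like a Table with a hard limit on its serialized size.

    // Sketch only; the real cowdb classes differ in the details.
    class Table {
        protected final java.util.TreeMap<String, Object> entries = new java.util.TreeMap<>();

        void put(String key, Object value) {
            entries.put(key, value);
        }
    }

    class Root extends Table {
        static final int MAX_BYTES = 8 * 1024;   // the root block has a fixed size (8K default)

        @Override
        void put(String key, Object value) {
            super.put(key, value);
            if (approximateSize() > MAX_BYTES)
                throw new IllegalStateException("root block overflow");
        }

        int approximateSize() {
            // placeholder: a real implementation would measure the serialized form
            return entries.toString().length();
        }
    }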

The database is crash-proof, as the root is never written to the same location it was just read from. There are also a number of integrity checks in the root.

The root is serialized and compressed when written to disk; it is decompressed and deserialized when read from disk. However, Java serialization and XML are avoided for performance reasons.

That was the up side. The down side is that there is no space management for the disk file. And there is no way for a persistent object to have a reference to another. These two items comprise the current todo list.

Now, why am I doing all this? My experience has been that databases are just too slow for many applications, and it is difficult to write crash-proof code otherwise.

Tuesday, November 06, 2007

variable length blocks, typing, TreeMap

Well, I've now defined a simple root block (with no content) and the logic for determining its location and where to write it to next. I've also finished the physical layer of the database.

I'll note that the root block is the only block with a maximum size (default is 8K). Everything else is variable in size, and that is a big advantage of a copy-on-write database--handling variable-length blocks is easy! Of course, this means we can easily use compression on the data. :-)

It is time to add support for multiple logical block types by having a type string as the first data element for all logical blocks.

It is also time to define a logical block type which holds data. TreeMap is my preferred choice. Root can then subclass the TreeMap logical block.
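
Something like the following is what I have in mind for the on-disk shape--a sketch only, with the type string first and the map entries after it; the actual field layout is still to be decided.

    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.Map;
    import java.util.TreeMap;

    // Sketch of the "type string first" layout; the real block format may differ.
    class TreeMapBlock {
        static final String TYPE = "treemap";
        final TreeMap<String, String> data = new TreeMap<>();

        void write(DataOutput out) throws IOException {
            out.writeUTF(TYPE);                              // type string comes first
            out.writeInt(data.size());
            for (Map.Entry<String, String> entry : data.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
            }
        }
    }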

Monday, November 05, 2007

Copy-On-Write: the root logical block

Well, I finished drafting the PhysicalBlock class, as well as a bunch of supporting classes. Seems like a lot of code for a simple idea, but it usually works out that way.

Now the idea behind copy-on-write is that you never overwrite a block while processing a transaction. This means that each time you update a logical block of data, it is moved to a new physical block number. Ouch! How do you keep track of where things are?

Let's start with the root logical block. The location of every other logical block can be determined starting from the root logical block. This block is updated with every transaction and is updated last. Writing the root logical block to disk is the transaction's commit operation.

We will reserve physical block 0 and physical block 1 for holding the root logical block. And as with other physical blocks, we never update both block 0 and block 1 in the same transaction.

When starting COWDB, we will read and validate both block 0 and block 1. If only one block is valid, we will take the root logical block from it. Otherwise we will use the more recently updated block.
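
That startup rule is small enough to sketch directly. How a block is validated (checksums and so on) is elided here; assume it just means the block read back cleanly.

    // Sketch of the startup rule: pick which of block 0 and block 1 holds the
    // current root logical block.
    class RootChooser {
        static int chooseRootBlock(boolean block0Valid, long block0Timestamp,
                                   boolean block1Valid, long block1Timestamp) {
            if (!block0Valid && !block1Valid)
                throw new IllegalStateException("no valid root block");
            if (block0Valid && !block1Valid) return 0;
            if (block1Valid && !block0Valid) return 1;
            // both are valid (the normal case): take the more recently written one
            return block0Timestamp >= block1Timestamp ? 0 : 1;
        }
    }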

Note that if we follow these few simple rules, we end up with a crash-proof database. And it should be very fast as well--there's no need for any additional supporting files, and never any need to recover anything.

Well, that was brief. Now it is back to writing code to implement it.

org.agilewiki.cowdb.physical.PhysicalBlock

I've just started hacking out the code for the PhysicalBlock class.

Saturday, November 03, 2007

block header

My objective now is to pull out some of the concepts from AgileWiki to create a very fast and reliable database.

I find myself inspired by the work done at Sun Microsystems on a reliable file system, in particular the idea of doing copy-on-write and having a checksum in each block. I also like the ability to update a backup file quickly.

The database would use fixed-size blocks which are moderately large. When I tuned AgileWiki, I found that a block size of 32K seemed to work best with the algorithms I was using, so let's use that number as the default block size, at least for now.

Here then is a proposal for the block header:
  • Checksum of the Block
  • Timestamp
  • Data Length
  • Data

The checksum would cover the timestamp, the data length and the (variable-length) data. Let's use something like java.util.zip.Adler32, which is a reasonably fast checksum. The timestamp would record the last time the block was updated. This layout should make it reasonably easy to update a backup copy of the database, copying only the blocks with a timestamp greater than the time of the previous update. We can also detect corrupted data when we read a block, and it becomes easy to check whether a database contains any bad blocks.
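
As a quick sketch of how a block might be assembled under that proposal (the field widths below are my own assumptions):

    import java.nio.ByteBuffer;
    import java.util.zip.Adler32;

    // Sketch of the proposed header: checksum, timestamp, data length, data.
    class BlockHeader {
        static byte[] encode(byte[] data) {
            long timestamp = System.currentTimeMillis();

            Adler32 checksum = new Adler32();              // checksum covers the
            checksum.update(longBytes(timestamp));         // timestamp,
            checksum.update(intBytes(data.length));        // the data length,
            checksum.update(data);                         // and the data itself

            ByteBuffer block = ByteBuffer.allocate(8 + 8 + 4 + data.length);
            block.putLong(checksum.getValue());
            block.putLong(timestamp);
            block.putInt(data.length);
            block.put(data);
            return block.array();
        }

        static byte[] longBytes(long v) { return ByteBuffer.allocate(8).putLong(v).array(); }
        static byte[] intBytes(int v)   { return ByteBuffer.allocate(4).putInt(v).array(); }
    }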

More about copy-on-write next time.

Friday, November 02, 2007

A Year at Fiorano

I've been with Fiorano now just a few days short of a year. It sure keeps me busy--I feel like my life has been put on hold. On the flip side, I've learned a great deal about managing multiple development teams and real product development. So the balance is still in favor of Fiorano.

Two months ago Rupali and I moved to a new apartment building, very close to ITPL in Bangalore--where one of my offices is located. (The other is in Hyderabad.) No phone service! There have been a lot of new apartment buildings going up in this area, none of the phone companies have laid any cables lately, and they are all maxed out. So I had no internet access until recently--I've finally gone wireless. I miss broadband, but this service is manageable and reasonably cheap at about $20 for 1GB of downloads each month.

I am still very interested in the ideas behind AgileWiki, but my time is still constrained. I'd like to do some very basic things--I've been quite disappointed with the fact that AW would crash occasionally, and I very much want to play in that area to begin with. Also, the high-speed b-tree implementation in the lower levels really gives competing databases a good run for their money, and it is well worth extracting--provided we can come up with a crash-proof implementation!