HAMMER Update 13-June-2008 - HEADS UP: MEDIA CHANGED AGAIN
Matthew Dillon
dillon at apollo.backplane.com
Mon Jun 16 07:58:33 PDT 2008
:Maybe the question is why B+ trees are so popular in current file
:systems. I am certainly not expert enough to see all the implications
:of the selection of the tree structure on the implementation and
:performance of a file system, but what is the main reason why you
:chose B- trees?
:
:Riggs
B-Trees (generically, + or -) have very good lookup characteristics
for random-access devices. They first found wide-spread use in
relational databases something like 20 years ago.
Basically the idea is to use a B-Tree with a very large radix. In
the case of HAMMER, I am using a radix of 64 and could go all the
way to 256 (the maximum that would fit in a 16K HAMMER buffer) if
I wanted to. Use of a large radix greatly reduces the number of
discrete I/O's (and related seeks) needed to locate an object or
file record.
For example, a lookup of a filename in a directory containing
one hundred million files would only require 6 discrete I/O's.
The caching characteristics of B-Trees are also really, really good.
Not only can one cache pointers into the B-Tree and avoid starting
searches from the root of the tree, but the higher layers of the B-Tree
tend to be very well represented in system (memory) caches and,
in particular, the node paths going the root to the 'vicinity' of
the object or record you want to locate are very easy to cache.
-Matt
Matthew Dillon
<dillon at backplane.com>
More information about the Kernel
mailing list