Initial filesystem design synopsis.
Thomas E. Spanjaard
tgen at netphreax.net
Wed Feb 21 15:55:20 PST 2007
Matthew Dillon wrote:
Segmentation:
The physical storage backing a filesystem is broken up into large
1MB-4GB segments (64MB is a typical value). Each segment is
self-identifying and contains its own header, data table, and record
table. The operating system glues together filesystems and determines
availability based on the segments it finds.
I think the more common term for this kind of thing is 'allocation group'.
- The data table consists of pure data, laid out linearly in the forward
direction within the segment. Data blocks are variable-sized entities
containing pure data, with no other identifying information, suitable
for direct DMA. The segment header has a simple append index for
the data table.
And 'extent' for the variable-sized entities :).
- The record table consists of fixed-sized records and a reference to
data in the data table. The record table is built backwards from
the end of the segment.
Doesn't this prepending stuff incur a significant performance penalty
for operations that walk the record table in a chronological/otherwise
'fifo' ordered fashion?
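
To make the layout in question concrete, here is a minimal C sketch of a segment whose data table grows forward and whose record table grows backward from the end. Only the two indices come from the description above. The struct names, field names, helper functions, and the tiny SEG_SIZE are all assumptions made up for illustration, not the proposed on-disk format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SEG_SIZE 4096u  /* tiny stand-in; the post suggests 1MB-4GB */

/* Hypothetical fixed-size record: a reference into the data table. */
struct seg_record {
    uint32_t data_off;  /* offset of the payload in the data table */
    uint32_t data_len;  /* payload length in bytes */
};

/* Hypothetical segment header holding the two indices. */
struct seg_header {
    uint32_t data_append;  /* next free byte; data table grows forward */
    uint32_t rec_prepend;  /* lowest record offset; table grows backward */
};

struct segment {
    struct seg_header hdr;
    uint8_t bytes[SEG_SIZE];
};

static void seg_init(struct segment *s)
{
    s->hdr.data_append = 0;
    s->hdr.rec_prepend = SEG_SIZE;
}

/* Append a payload and prepend its record; 0 on success, -1 if full. */
static int seg_add(struct segment *s, const void *buf, uint32_t len)
{
    uint32_t free_bytes = s->hdr.rec_prepend - s->hdr.data_append;
    struct seg_record r;

    if (len > free_bytes || free_bytes - len < sizeof r)
        return -1;
    r.data_off = s->hdr.data_append;
    r.data_len = len;
    memcpy(s->bytes + s->hdr.data_append, buf, len);
    s->hdr.data_append += len;
    s->hdr.rec_prepend -= (uint32_t)sizeof r;
    memcpy(s->bytes + s->hdr.rec_prepend, &r, sizeof r);
    return 0;
}

static uint32_t seg_count(const struct segment *s)
{
    return (SEG_SIZE - s->hdr.rec_prepend) / (uint32_t)sizeof(struct seg_record);
}

/* Fetch the i-th record in creation order (0 = oldest). The oldest
 * record sits at the highest address, so this walks downward. */
static struct seg_record seg_nth(const struct segment *s, uint32_t i)
{
    struct seg_record r;

    assert(i < seg_count(s));
    memcpy(&r, s->bytes + SEG_SIZE - (i + 1) * sizeof r, sizeof r);
    return r;
}
```

In this sketch a chronological walk calls seg_nth() with increasing i, which touches descending addresses. Whether that costs anything in practice depends on how the record table is read and cached, which the sketch does not model.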
Record destruction creates holes in both the data table and the record
table. Any holes adjacent to the data table append point or the record
table prepend point are immediately recovered by adjusting the
appropriate indices in the segment header. The operating system may
cache a record of non-adjacent holes (in memory) and reuse the space,
and can also generate an in-memory index of available holes on the
fly when space is very tight (which requires scanning the record table),
but otherwise the recovery of any space not adjacent to the data table
append point requires a performance reorganization of the segment.
I think these lists/trees should be kept sorted, at least on-disk for
performance reasons (random reads/writes on rotational media are a bummer
given current seek times).
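
As a rough illustration of the hole handling described above combined with a sorted list, the following C sketch keeps an in-memory cache of data-table holes ordered by offset and recovers space immediately when a freed range touches the append point. The names (struct hole, struct data_space, data_free, MAX_HOLES) are hypothetical, and the fixed-size array is just the simplest sorted container for a sketch. A real implementation would likely use a tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HOLES 64  /* arbitrary cache size, an assumption */

struct hole {
    uint32_t off;
    uint32_t len;
};

/* Hypothetical in-memory view of one segment's data table. */
struct data_space {
    uint32_t append;               /* data table append point */
    struct hole holes[MAX_HOLES];  /* cached holes, sorted by offset */
    size_t nholes;
};

/* Free the range [off, off+len). Returns 0 if the space was recovered
 * or cached, -1 if the cache is full and the hole goes untracked. */
static int data_free(struct data_space *d, uint32_t off, uint32_t len)
{
    size_t i;

    assert(len > 0 && off + len <= d->append);

    if (off + len == d->append) {
        /* Adjacent to the append point: recover by moving the index. */
        d->append = off;
        /* Cached holes that now touch the append point go as well. */
        while (d->nholes > 0) {
            struct hole *h = &d->holes[d->nholes - 1];
            if (h->off + h->len != d->append)
                break;
            d->append = h->off;
            d->nholes--;
        }
        return 0;
    }

    if (d->nholes == MAX_HOLES)
        return -1;

    /* Insertion that keeps the cache sorted by offset. */
    i = d->nholes;
    while (i > 0 && d->holes[i - 1].off > off) {
        d->holes[i] = d->holes[i - 1];
        i--;
    }
    d->holes[i].off = off;
    d->holes[i].len = len;
    d->nholes++;
    return 0;
}
```

Keeping the cache sorted means the hole nearest the append point is always the last entry, so the recovery loop only ever inspects the tail. The record table's prepend point could be handled the same way with the direction reversed.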
Generally, I can't help but feel that the clustering/replication stuff
needs to be separate from the 'actual on-disk' filesystem.
Cheers,
--
Thomas E. Spanjaard
tgen at netphreax.net