Upcoming BUF/BIO work
Matthew Dillon
dillon at apollo.backplane.com
Tue Jul 13 19:36:28 PDT 2004
Hiten, with my help, is starting to work on the BUF/BIO infrastructure
changes needed to get rid of the required KVA mappings for I/O requests.
Things are going to look ugly before they look better but we've broken
them up into (hopefully) manageable and testable stages. I'm guessing
two-three months to completion.
Basically, this work involves:
* Moving the I/O subsystem from KVA mapped buffers to page lists,
using XIO to manage the page lists.
* Making KVA mappings optional in the buffer cache and I/O subsystem.
* The target device becomes responsible for mapping (but can use the
MSFBUF code to do it in one simple call), with the idea that most
target devices will be using BUSDMA and not require an actual KVA
mapping. (Being able to avoid KVA mappings is a BIG deal for
performance and overhead).
* Using a chain of BIO's under a BUF to cache block translations
(e.g. file offset -> filesystem block number -> partition block number),
which will allow us to layer block devices completely arbitrarily.
* Converting I/O block numbers to 64 bit offsets across the board.
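As a rough userland illustration of the translation-caching idea in the points above, the toy below walks a chain of per-layer records and computes each layer's offset only once. Everything in it is an assumption made for illustration: struct xlate, fs_map(), part_map(), xlate_resolve() and the fixed 8-sector and 63-sector displacements are hypothetical and are not the planned kernel structures (a real filesystem layer would consult its block map rather than add a constant).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-layer record; one is chained in for each translation. */
struct xlate {
    struct xlate *next;             /* next lower layer, NULL at the bottom */
    int64_t offset;                 /* cached 64-bit byte offset at this layer */
    int valid;                      /* nonzero once offset has been computed */
    int64_t (*map)(int64_t upper);  /* maps the upper layer's offset to this one */
};

int xlate_map_calls;                /* counts translations actually performed */

/* Toy mapping: pretend file data starts 8 sectors into the filesystem. */
int64_t fs_map(int64_t file_off)
{
    xlate_map_calls++;
    return file_off + 8 * 512;
}

/* Toy mapping: pretend the partition starts at sector 63 of the disk. */
int64_t part_map(int64_t fs_off)
{
    xlate_map_calls++;
    return fs_off + 63 * 512;
}

/* Walk the chain top-down, filling in any layer not yet translated. */
int64_t xlate_resolve(struct xlate *top)
{
    struct xlate *x;
    int64_t off;

    assert(top != NULL && top->valid);
    off = top->offset;
    for (x = top->next; x != NULL; x = x->next) {
        if (!x->valid) {
            x->offset = x->map(off);
            x->valid = 1;
        }
        off = x->offset;
    }
    return off;
}

/* Resolve the same chain twice; the second pass reuses the cached values. */
int64_t xlate_demo(void)
{
    struct xlate part = { NULL, 0, 0, part_map };
    struct xlate fs = { &part, 0, 0, fs_map };
    struct xlate file = { &fs, 4096, 1, NULL };
    int64_t first = xlate_resolve(&file);
    int64_t second = xlate_resolve(&file);

    assert(first == second);
    return first;
}
```

Because each layer only knows how to map the offset handed down from the layer above it, another block device could be slotted into the chain by adding one more record, which is the arbitrary layering the last two points are after.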
The stages are:
1. Replace the primary page array and page count with an XIO structure.
(done)
2. Remove b_driver2 and b_caller2. We are either very close
to or have achieved the removal of these fields and they should
be easy to remove from the source.
3. Create a driver encapsulation for the actual I/O, one instance of which
will be embedded in the struct buf and further instances of which will
be chained.
The struct bio starts out like this:
    struct bio {
        struct bio *bio_next;
        struct bio *bio_prev;
        struct buf *bio_buf;
        int bio_resid;
        int bio_error;
        off_t bio_blkno;    /* block number (offset in later stage) */
        dev_t bio_dev;
        void (*bio_done)(struct bio *);
        void *bio_driver;   /* private use by driver */
        void *bio_caller;
    };
And will replace the following fields in struct buf:
b_resid, b_error, b_blkno, b_lblkno, b_pblkno, b_dev, b_done,
b_driver, b_caller.
FOR THE FIRST STAGE OF THE BIO INTEGRATION: Each buf will require two
additional BIO structures to be allocated and chained in, one for the
filesystem level blkno, and one for the device level pblkno.
b_blkno and b_pblkno will be #define's for the first stage into these
chained bio's.
4. Change the entire I/O system to pass a BIO to represent an I/O instead
of a BUF.
This also involves removing the macros for blkno and pblkno and passing the
chained bio to the lower layer.
You are going to HATE this step.
5. The reference to bio_buf will be removed. The buf's embedded bio will
be initialized so bio_done points to biodone() and bio_driver (e.g.) points
to the struct buf.
A struct msfbuf * pointer will be associated with the bio, allowing
drivers (or the buffer cache) to map the data if they need a mapping.
Note that XIO has no mapping facility (on purpose), only a page list.
6. Change the bio_blkno field to bio_offset (64 bit offset). The entire
I/O subsystem must be changed to operate using offsets instead of block
numbers.
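The first-stage macros from stage 3 and the offset conversion from stage 6 can be sketched together in userland C. This is only one reading of the plan, under stated assumptions: struct bio is trimmed to the fields the sketch needs, int64_t stands in for off_t, the embedded field name b_bio is hypothetical, the macros assume the filesystem-level bio is chained first and the device-level bio second, and SKETCH_BSIZE assumes 512-byte sectors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BSIZE 512            /* assumed sector size in bytes */

/* Trimmed stand-in for struct bio; only the fields used here. */
struct bio {
    struct bio *bio_next;
    struct bio *bio_prev;
    int64_t bio_blkno;              /* would become bio_offset in stage 6 */
};

/* Trimmed stand-in for struct buf; b_bio is a hypothetical field name. */
struct buf {
    struct bio b_bio;               /* the embedded bio */
};

/* Stage 3: the old field names become macros into the chained bios. */
#define b_blkno  b_bio.bio_next->bio_blkno
#define b_pblkno b_bio.bio_next->bio_next->bio_blkno

/* Chain the two additional bios under the buf's embedded bio. */
void buf_chain_init(struct buf *bp, struct bio *fs_bio, struct bio *dev_bio)
{
    bp->b_bio.bio_prev = NULL;
    bp->b_bio.bio_next = fs_bio;
    bp->b_bio.bio_blkno = 0;
    fs_bio->bio_prev = &bp->b_bio;
    fs_bio->bio_next = dev_bio;
    fs_bio->bio_blkno = 0;
    dev_bio->bio_prev = fs_bio;
    dev_bio->bio_next = NULL;
    dev_bio->bio_blkno = 0;
}

/* Stage 6: turn a block number into a 64-bit byte offset. */
int64_t blkno_to_offset(int64_t blkno)
{
    return blkno * SKETCH_BSIZE;
}

/* Old-style code keeps compiling while the data lives in the chain. */
int64_t bio_sketch_demo(void)
{
    struct buf b;
    struct bio fs, dev;

    buf_chain_init(&b, &fs, &dev);
    b.b_blkno = 16;                 /* stored in fs.bio_blkno */
    b.b_pblkno = 79;                /* stored in dev.bio_blkno */
    assert(fs.bio_blkno == 16 && dev.bio_blkno == 79);
    return blkno_to_offset(b.b_pblkno);
}
```

With 512-byte sectors, a block number held in a signed 32-bit type could address at most 2^31 * 512 bytes, or 1 TiB, which is one reason a 64-bit offset is the more comfortable representation.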
-Matt
Matthew Dillon
<dillon at xxxxxxxxxxxxx>