device cloning on open
Toby karyadi
toby.karyadi at gmail.com
Thu Apr 5 14:07:02 PDT 2007
Matthew Dillon wrote:
>
> :Have there been any changes to device cloning support in DFBSD,
> :specifically related to this email:
> :http://leaf.dragonflybsd.org/mailarchive/kernel/2006-01/msg00058.html
> :
> :I'm porting the kqemu module based on the FreeBSD version of it and it
> :works now but there can only be one qemu process using the /dev/kqemu
> :device. I guess I can just have a list of association between a process
> :and the module/device private data for each open() that's managed in the
> :module. It's not optimal but it will work until I figure out the best way
> :to keep per open data.
> :
> :In FreeBSD >= 5 the per open data is associated to a clone of the device
> :via the si_drv1 field. A clone event handler also needs to be registered
> :to do the cloning - it's kind of convoluted.
> :
> :In Linux and NetBSD, if I read the code right, the file pointer is passed
> :in to the various ops, read, ioctl, write, etc of the driver. I think
> :this makes more sense semantically. In Linux the file* is passed in,
> :while in NetBSD the module can decide to call falloc and fdclone.
> :
> :I'm not sure if I want to / can take on the task of adding the per open
> :vnode as suggested by Matt, but I'd be interested in hearing what people
> :have to say about this.
> :
> :Thanks for all of the great work so far.
> :
> :Cheers,
> :Toby
>
> At the moment DragonFly passes the file pointer to the VOP open
> function, allowing the VOP open function to change elements of
> the file pointer (for example, to install a different vnode). The
> file functions typically are left pointing at the specfs VFS which
> does the translation between a vnode operation and a device operation.
> The file pointer's fp->f_type field is left indicating a 'vnode' type.
>
> This means that I/O operations running through specfs run through
> the uncloned vnode at the moment and the actual device is picked
> out of the vnode structure. The vnode represents the filesystem
> rendezvous (typically /dev/<blahblah>). Hence why I originally
> suggested cloning the vnode in order to support a cloned device.
>
> There is another way we could clone the device, and that would be
> to NOT retain the vnode type in the file pointer but instead to
> create a wholly new file type and wholly new f_ops operations
> set for the file pointer, then point f_data at a cloning structure
> of some sort:
>
>     struct dev_data {
>         cdev_t *dd_dev;            /* possibly cloned device */
>         struct vnode *dd_orig_vp;  /* original uncloned vnode */
>     };
>
> fp->f_ops = &dev_ops;
> fp->f_type = DTYPE_DEV;
> fp->f_data = (pointer to allocated dev_data structure)
>
> In other words, to create a completely new device abstraction that
> completely bypasses the original vnode and provides storage
> (dd_dev) for us to clone the device. No more specfs, no more
> indirection through the vnode operations vector. A far more
> direct device access mechanism that happens to also make cloning
> trivial.
>
> The dd_orig_vp field would remain only to give the new device ops
> the ability to update the vnode's access and modified timestamps
> (if we even care about doing that for cloned entities).
>
> --
>
> If you or someone would like to take on this task, I think it would
> be an excellent (and clean) solution to a long-standing problem.
>
> -Matt
> Matthew Dillon
> <dillon at backplane.com>

Do you suppose adding a new D_MAKECLONE bit to si_flags makes sense?
That way devices can be explicit about whether they can/want to be cloned.

Unless VOP_OPEN is changed we'll still hit specfs, but only for spec_open().
When an fp is passed in, spec_open() should be modified so that fp->f_data
is set up with the dev_data struct. If this is an initial open (hmm, how do
I figure that out?), then dd_orig_vp->v_rdev is copied to dd_dev. Otherwise
create a clone of dd_orig_vp->v_rdev (using make_sub_dev() maybe?).
spec_open() should also set fp->f_ops to a new set of fileops that
use the dev_d*() functions directly. Am I correct so far?
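
To make the flow above concrete, here is a rough userland sketch of it. It
is not kernel code: every mock_* type and function, the MOCK_* constants,
the open counter used to detect an "initial open", and the unit numbering
are all made-up stand-ins, and the real spec_open(), struct file and cdev
look different. It only shows the idea of hanging a dev_data off f_data and
switching f_ops on open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical stand-ins, not real kernel definitions. */
#define MOCK_D_MAKECLONE 0x1   /* mirrors the proposed D_MAKECLONE bit */
#define MOCK_DTYPE_VNODE 1
#define MOCK_DTYPE_DEV   2     /* mirrors the proposed DTYPE_DEV */

struct mock_dev {
    int unit;    /* which device instance this is */
    int flags;   /* MOCK_D_MAKECLONE if the driver wants per-open clones */
    int opens;   /* assumption: an open count tells us about "initial open" */
};

struct mock_vnode { struct mock_dev *v_rdev; };

struct mock_file;
struct mock_fileops { int (*fo_unit)(struct mock_file *fp); };

struct mock_file {
    int f_type;
    const struct mock_fileops *f_ops;
    void *f_data;
};

struct dev_data {
    struct mock_dev *dd_dev;        /* possibly cloned device */
    struct mock_vnode *dd_orig_vp;  /* original uncloned vnode */
};

static int next_clone_unit = 1;     /* naive unit allocator for clones */

/* New-style file op: goes straight to dd_dev, never through the vnode. */
static int mock_dev_unit(struct mock_file *fp)
{
    struct dev_data *dd = fp->f_data;
    return dd->dd_dev->unit;
}

static const struct mock_fileops mock_dev_ops = { mock_dev_unit };

/* Stand-in for the modified spec_open(). Returns 0 on success, -1 on error. */
int mock_spec_open(struct mock_vnode *vp, struct mock_file *fp)
{
    struct mock_dev *base = vp->v_rdev;
    struct dev_data *dd;

    if (fp == NULL) {               /* no fp: keep using the shared device */
        base->opens++;
        return 0;
    }
    dd = malloc(sizeof(*dd));
    if (dd == NULL)
        return -1;
    dd->dd_orig_vp = vp;
    dd->dd_dev = base;              /* initial open: reuse v_rdev as-is */

    if ((base->flags & MOCK_D_MAKECLONE) && base->opens > 0) {
        struct mock_dev *clone = malloc(sizeof(*clone));
        if (clone == NULL) {
            free(dd);
            return -1;
        }
        clone->unit = next_clone_unit++;
        clone->flags = base->flags;
        clone->opens = 0;
        dd->dd_dev = clone;         /* later opens get a private clone */
    }
    base->opens++;
    fp->f_type = MOCK_DTYPE_DEV;
    fp->f_ops = &mock_dev_ops;
    fp->f_data = dd;
    return 0;
}

/* Undo mock_spec_open() for an fp that had a dev_data installed. */
void mock_dev_close(struct mock_file *fp)
{
    struct dev_data *dd = fp->f_data;
    struct mock_dev *base = dd->dd_orig_vp->v_rdev;

    if (dd->dd_dev != base)
        free(dd->dd_dev);
    base->opens--;
    free(dd);
    fp->f_data = NULL;
}
```

With something like this, two opens of the same node end up with different
dd_dev pointers, so per-open driver state has somewhere to live. Whether an
open count is a sound way to detect the initial open is exactly the part I
am unsure about.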

Now, this might be a stupid question: does a higher-level VFS layer, like
a filesystem, call the VOP_OPEN of the lower layer, e.g. of the disk
partitions? I'm just trying to figure out potential problems downstream.

Well, let me take a stab at it and see how that goes. Hmm, maybe I can use
the vkernel...

Cheers,
Toby