cache_lookup() work this week.
Matthew Dillon
dillon at apollo.backplane.com
Thu Sep 4 11:28:30 PDT 2003
:...
:
:Consider also:
: cd /a/b/c/d ; ln /a/outside
:Time passes, and someone else types:
: cd /a/b/c ; rm -Rf *
:
:The person will think that they are just safely removing the
:directory and everything below it, but now they could be
:removing much more than that. We'd have to do something to
:guard against that problem too. (and these hard links could
:be created by users with nefarious purposes in mind, so the
:person doing the 'rm' would have no reason to suspect that
:this would be an issue).
This is a good one, and easy to solve... rm would just unlink() or
rmdir() the directory first, whether it is empty or not. If the unlink
succeeds then rm considers its work done.
The last instance of the directory would not be unlinkable... rm -rf
would have to recurse through and delete the underlying files first.
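A rough userland sketch of that behaviour, assuming a kernel which lets
rmdir() succeed on a non-empty directory as long as another directory
link to it remains (this is not how rm is actually written today):

#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void
remove_tree(const char *path)
{
        DIR *dp;
        struct dirent *de;
        char sub[1024];

        /*
         * Try to drop this directory link first.  If another directory
         * hard link still exists, the (hypothetical) kernel accepts this
         * even though the directory is not empty, and we are done.
         */
        if (rmdir(path) == 0)
                return;

        /*
         * Last link: recurse and delete the underlying files first,
         * then remove the now-empty directory.
         */
        if ((dp = opendir(path)) == NULL)
                return;
        while ((de = readdir(dp)) != NULL) {
                if (strcmp(de->d_name, ".") == 0 ||
                    strcmp(de->d_name, "..") == 0)
                        continue;
                snprintf(sub, sizeof(sub), "%s/%s", path, de->d_name);
                if (de->d_type == DT_DIR)
                        remove_tree(sub);
                else
                        unlink(sub);
        }
        closedir(dp);
        rmdir(path);
}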
:> Only the (rdev,inode#) for the elements representing
:> the current path need to be remembered so the memory
:> use is small.
:
:If we're going to have real hard-links, then it would probably be
:important to add a "number-of-hard-dirlinks" field to stat().
:This would be a separate value from the st_nlink field, in that
:it would only count the number of directory hard-links. Maybe
:call it st_ndirlink. Then any program which wants to do this
:will only have to remember (rdev, inode) for those directories
:where this value is > 1. That makes the overhead even less...
Right.
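A minimal sketch of that bookkeeping, assuming the proposed st_ndirlink
field existed in struct stat (it does not today; table growth handling
is omitted):

#include <sys/types.h>
#include <sys/stat.h>

struct seen {
        dev_t   s_dev;
        ino_t   s_ino;
};

static struct seen seen_tab[1024];      /* fixed size for illustration */
static int seen_cnt;

/*
 * Returns 1 if this directory was already visited (a loop), 0 otherwise.
 * Only directories with more than one directory hard link are recorded,
 * so the memory use stays small.
 */
static int
enter_dir(const char *path)
{
        struct stat st;
        int i;

        if (stat(path, &st) < 0)
                return (0);
        if (st.st_ndirlink <= 1)        /* hypothetical field */
                return (0);
        for (i = 0; i < seen_cnt; ++i) {
                if (seen_tab[i].s_dev == st.st_dev &&
                    seen_tab[i].s_ino == st.st_ino)
                        return (1);
        }
        if (seen_cnt < 1024) {
                seen_tab[seen_cnt].s_dev = st.st_dev;
                seen_tab[seen_cnt].s_ino = st.st_ino;
                ++seen_cnt;
        }
        return (0);
}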
:
:This field might also provide a way to address the 'rm' problem
:mentioned above. If st_ndirlink > 1, then just destroy the hard
:link and do *not* remove the files underneath the hard link.
:
:But I'm still very uneasy with the idea of real hard links on
:directories. I think there's too much potential for trouble
:without enough of a benefit.
Right.
Oh, don't get me wrong, I am both uneasy and thrilled about the prospect.
I think it is worth having for precisely that reason :-)
:>:> and (B) it will be possible to implement semi-hard
:>:> links, basically softlinks that *look* like hardlinks
:>:
:>:What will this be like? Ie, what will be the difference
:>:between hard-links and semi-hard links? Will this be
:>:something like the way OpenAFS handles volume-mounting?
:
:On this question, I just curious in the lower-level details,
:so it's different than my hard-links question. I definitely
:like this idea, I was just wondering how it'd be implemented.
:From other messages it does sound like you intend to implement
:this in about the same way that OpenAFS does volume-mounting,
:which is what I was wondering. Thanks.
:
:--
:Garance Alistair Drosehn = gad at xxxxxxxxxxxxxxxxxxxx
It would be implemented as a softlink from the point of view of the
filesystem, but namei() would interpret a flag on it to mean that the
namecache should keep a separate chain through the link rather than
'jump' through the link.
Maybe this will help. This is the new namecache structure I am
using:
struct namecache {
        LIST_ENTRY(namecache) nc_hash;          /* hash chain (parentvp,name) */
        TAILQ_ENTRY(namecache) nc_entry;        /* scan via nc_parent->nc_list */
        TAILQ_ENTRY(namecache) nc_vnode;        /* scan via vnode->v_namecache */
        TAILQ_HEAD(, namecache) nc_list;        /* list of children */
        struct namecache *nc_parent;            /* namecache entry for parent */
        struct vnode *nc_vp;                    /* vnode representing name or NULL */
        int nc_refs;                            /* ref count prevents deletion */
        u_char nc_flag;
        u_char nc_nlen;                         /* The length of the name, 255 max */
        char nc_name[0];                        /* The segment name (embedded) */
};
And in the vnode:
        TAILQ_HEAD(namecache_list, namecache) v_namecache;
What we had before was that the vnode served as the central coordinating
point for the namecache entries, with namecache entries (parent
directories) feeding into the vnode and namecache entries (children in
the directory) going out of it.
What we have above is that the namecache entries now serve as the central
coordinating point, and the vnode merely heads the list of namecache
entries associated with it in particular.
With the new scheme it is possible to maintain completely independent
naming topologies that contain some 'shared' vnodes. In the old scheme
you could do that, but you would lose track of which naming topology
was used to locate the vnode. In the new scheme the handle *IS* the
namecache structure, and thus the topology used to locate the vnode is
known even if the vnode is shared amongst several topologies.
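For example, given any namecache entry, the path that was actually used
can be reconstructed by chasing nc_parent back to the root. A trimmed-down
userland sketch, with a hypothetical ncp_fullpath() helper and only the
fields it touches:

#include <sys/types.h>
#include <string.h>

struct namecache {
        struct namecache *nc_parent;    /* namecache entry for parent */
        char nc_name[256];              /* segment name */
};

/*
 * Build the path used to reach 'ncp' by walking nc_parent to the root.
 * This works even when the underlying vnode is reachable through
 * several naming topologies, because the entry itself is the handle.
 */
static void
ncp_fullpath(struct namecache *ncp, char *buf, size_t len)
{
        if (ncp == NULL || ncp->nc_parent == NULL) {
                strlcpy(buf, "/", len);         /* the root has no parent */
                return;
        }
        ncp_fullpath(ncp->nc_parent, buf, len);
        if (buf[strlen(buf) - 1] != '/')
                strlcat(buf, "/", len);
        strlcat(buf, ncp->nc_name, len);
}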
All I have to do, which is what I am working on right now, is change all
the directory references in the codebase from vnodes to namecache pointers.
For example, fd_cdir, fd_rdir, and fd_jdir in sys/filedesc.h need to
be changed from vnode pointers to namecache pointers, and all the VOP_*
functions which take 'directory vnodes' as arguments would now instead
take namecache pointers as arguments. The namei-related functions which
take and return directory vnodes would now have to take and return
namecache pointers. For that matter, these functions would have to take
and return namecache pointers for everything, including file vnodes.
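As a rough illustration of the filedesc piece of that change (the new
field names here are placeholders, not committed code):

struct namecache;

struct filedesc {
        /* ... */
        struct namecache *fd_ncdir;     /* was: struct vnode *fd_cdir */
        struct namecache *fd_nrdir;     /* was: struct vnode *fd_rdir */
        struct namecache *fd_njdir;     /* was: struct vnode *fd_jdir */
        /* ... */
};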
This in turn will allow the lookup functions to gain holds on directories,
files, and non-existent files (placeholders for create, rename) without
having to obtain any vnode locks, which in turn allows us to completely
get rid of the race to root problem as well as other common stalls
associated with blocking I/O during directory lookup operations.
-Matt
Matthew Dillon
<dillon at xxxxxxxxxxxxx>