Patch for inode SLIST conversion
Matthew Dillon
dillon at apollo.backplane.com
Sat Jan 19 11:08:12 PST 2008
:If I understand serialization tokens correctly, they lead in case of the
:two functions above to mutual exclusive execution, because none of them
:voluntarily blocks or calls a blocking method. Is this assumption correct?
Correct.
: /*
: * We must check to see if the inode has been ripped
: * out from under us after blocking.
: */
: INOFIND(ip, dev, inum);
: if (ip == NULL || ITOV(ip) != vp) {
: vput(vp);
:..
:
:See the comment on "ripped out from under us". This can only happen
:because vget might block, right?
Correct, though once you start calling random procedures which are
not specifically documented as being non-blocking, it can get dangerous
because the assumption might not hold if the code is changed in the
future.
But here... yes, correct.
:Now, ufs_ihashget doesn't modify the hashtable at all. So why not store
:a "transaction id" in each hash table bucket, which gets increased by
:ufs_ihashins and ufs_ihashrem. Then in ufs_ihashget I read in this id
:*before* calling vget and check afterwards if it has changed. If it is
:unchanged, I can omit a second pass over the linked list.
:
:It's actually quite easy to implement and if all my assumptions apply,
:the correct way of doing it. Please correct me if I'm wrong.
:
:
:Regards,
:
: Michael
Ok, three things. First, the transaction id will be fairly granular,
meaning that if it does not match you have to loop. So the code
is going to wind up being very slightly more complex then it is now.
Second, the transaction id idea has been used in BSD code before,
particular with regards to dealing with vm_map's. It will be
higher-performing, but not by a lot, and it has the problem of
obfuscating the code a bit (meaning you have to carefully document
the algorithm in comments) and it is also somewhat algorithmically
fragile because there is no direct check against duplicate nodes.
Third, the hash table is sized such that it has or should have very
short chain lengths. The relookup will all be in the cpu's L1 cache
and I doubt you'd notice any improvement in performance.
I would personally like to keep the code as is and not make any major
changes to it, simply because we know it works and it is extremely
fragile. If you really want to redo the code, you can, but I would
like to wait until after the 2.0 release to commit it.
-Matt
Matthew Dillon
<dillon at backplane.com>
More information about the Kernel
mailing list