HEADS UP - major structure size changes in HEAD

Aggelos Economopoulos aoiko at cc.ece.ntua.gr
Wed Jun 9 04:53:30 PDT 2010


On 09/06/2010 08:00 AM, Matthew Dillon wrote:
[...]
     I am going to start tying in the new global tokens this week.  I may
     be able to do the whole thing but if not the rules for tying in the
     tokens are pretty easy... every global procedure that needs a
     particular token simply acquires it unconditionally for the duration
     of the procedure.  I have optimized the code paths dealing with
     recursive tokens to reduce overhead significantly so extra recursive
     acquisitions and releases do not cost us much in terms of efficiency.
     I'll solicit additional help once I've determined that my methodology
     is sound.  I think I can get the VM path completely locked up this
     weekend.  These are still global tokens but we will at least be making
     progress.
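
Just so we're talking about the same pattern, here is a minimal
sketch of what "acquire unconditionally for the duration of the
procedure" looks like with lwkt_gettoken()/lwkt_reltoken(); the
procedure itself is made up, vm_token is just the obvious example:

/*
 * Hold the global token for the whole procedure.  A nested call
 * that takes vm_token again hits the cheap recursive case Matt
 * mentions above.
 */
static void
some_vm_procedure(void)
{
	lwkt_gettoken(&vm_token);
	/* ... operate on VM structures ... */
	lwkt_reltoken(&vm_token);
}
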
I fully agree that tokens are a more convenient API than, say, lockmgr 
locks, but AFAIK nobody has investigated their behavior under heavy 
contention (I suspect you have, Matt, so please share any information 
you might have collected).

Right now, the lwkt_switch() routine is responsible for getting all 
tokens that a thread needs to run. If it can't get them (because some 
token is owned by a thread running on a different cpu), it moves on to 
the next runnable thread.
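
Conceptually, something like this (not the actual lwkt_switch() code;
the run queue iteration and the helper names are made up):

thread_t
next_thread_sketch(void)
{
	thread_t td;

	TAILQ_FOREACH(td, &runq, td_threadq) {
		if (lwkt_getalltokens(td))
			return (td);	/* got every token, run it */
		/* some token is held on another cpu; skip this thread */
	}
	return (idletd);	/* nothing else is runnable */
}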

However, lwkt_getalltokens() tries to get the tokens in the same order 
that the thread originally acquired them, which means that if you lose 
a race to another thread for your Nth token, you have just wasted the 
time spent taking the previous (N - 1) tokens, possibly obstructing 
other threads that could have used those tokens to make progress. I 
admit that in practice most threads will take the same tokens in the 
same order, so I don't expect this to be a big issue. Some concrete 
data would be nice though :)
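
Roughly this (again a sketch, not the real lwkt_getalltokens();
try_token()/drop_token() are stand-ins for the per-token
acquire/release):

static int
getalltokens_sketch(lwkt_tokref_t refs, int ntoks)
{
	int i;

	for (i = 0; i < ntoks; ++i) {
		if (try_token(&refs[i]))
			continue;
		/*
		 * Lost the race on token i: back out tokens 0..i-1.
		 * The i acquisitions already made are wasted work,
		 * and other cpus may have stalled on those tokens
		 * in the meantime.
		 */
		while (--i >= 0)
			drop_token(&refs[i]);
		return (0);
	}
	return (1);
}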

Another thing that makes me sceptical about token scalability is that, 
AFAICT, the scheduler needs to retry the acquisition *every time*. So 
if you have M threads on a cpu all waiting for, say, the VM token, then 
while it is held on another cpu you'll try (and fail) to get that token 
M times on every lwkt_switch(). This is time you could have spent 
running a task that potentially *could* get some work done (never mind 
the cost of the failed atomic compare-and-swap operations). That is in 
contrast to sleep locks, where you only become runnable when a lock you 
wanted is released.
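
For contrast, the sleep lock discipline looks like this (illustrative;
DragonFly's lockmgr() takes a lock and a flags argument):

struct lock lk;			/* a lockmgr sleep lock */

lockinit(&lk, "sketch", 0, 0);
lockmgr(&lk, LK_EXCLUSIVE);	/* blocks, no polling on each switch */
/* ... critical section ... */
lockmgr(&lk, LK_RELEASE);	/* waiter wakes up exactly here */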

I'm sure you've thought about all this; I suppose you see tokens as a 
good trade-off between simplicity and scalability?

Thanks,
Aggelos