git: Bring in a simple event tracing library and POC utility

Aggelos Economopoulos aoiko at
Wed May 26 07:14:28 PDT 2010

I'm not trimming any of the context because it's been a long time since 
the original mail. I'll just add new information inline.

On 16/02/2010 07:24 PM, Aggelos Economopoulos wrote:
Aggelos Economopoulos wrote:
Aggelos Economopoulos wrote:
commit e7c0dbbaa9d5a2de52e8882628668615903e9132
Author: Aggelos Economopoulos<aoiko at>
Date:   Mon Feb 8 19:43:33 2010 +0200
     Bring in a simple event tracing library and POC utility
There are many things we could do with this infrastructure in place, maybe
I should put together a mail outlining possible future directions.
One thing I'd like to add is a 'stats' (and maybe a 'plot') command. In
the past I've had to write a set of scripts to parse ktrdump output, to
do things such as calculate the time spent in an interrupt handler and
the used descriptors in a network interface tx ring. Now, adding
something like
evtranalyze -f /path stats evpair -b "begin intr" -e "end intr"

(where "begin intr" and "end intr" are the respective ktr format
strings; they could also be regular expressions) that produces a
mean/stddev is easy, but I'd like to go a bit further. Paired events are
a pattern that appears throughout the kernel and we could add
first-class support for it.
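
To make the 'stats evpair' idea concrete, here is a minimal C sketch of
pairing begin/end events and accumulating the mean and variance of their
durations (the stddev is the square root of the variance). The
struct sample_event type, the evpair_stats() function and the exact-match
rule are assumptions made for illustration; they are not libevtr's API,
and regular-expression matching is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified event record; NOT libevtr's struct evtr_event. */
struct sample_event {
	double ts;		/* timestamp, in seconds */
	const char *fmt;	/* ktr format string of the event */
};

struct pair_stats {
	size_t n;		/* number of completed begin/end pairs */
	double mean;		/* mean duration of a pair */
	double variance;	/* population variance of the durations */
};

/*
 * Walk the events in order. An "end" event closes the most recent
 * unmatched "begin" event; an "end" with no open "begin" is ignored.
 * Mean and variance are accumulated with Welford's online algorithm,
 * so the durations never need to be stored.
 */
struct pair_stats
evpair_stats(const struct sample_event *ev, size_t nev,
    const char *begin, const char *end)
{
	struct pair_stats st = { 0, 0.0, 0.0 };
	double m2 = 0.0, start = 0.0;
	int open = 0;
	size_t i;

	assert(ev != NULL || nev == 0);
	for (i = 0; i < nev; ++i) {
		if (strcmp(ev[i].fmt, begin) == 0) {
			start = ev[i].ts;
			open = 1;
		} else if (open && strcmp(ev[i].fmt, end) == 0) {
			double d = ev[i].ts - start;
			double delta = d - st.mean;

			st.n++;
			st.mean += delta / (double)st.n;
			m2 += delta * (d - st.mean);
			open = 0;
		}
	}
	if (st.n > 0)
		st.variance = m2 / (double)st.n;
	return st;
}
```

A per-cpu or per-device variant would need one open/start slot per key
rather than the single slot used here.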
So what I want to do is use a standardized and trivial to parse format
string. For example,
"#compl:INT BEG DEV=%p"

where '#' is the special character that instructs libevtr to parse the
format string. In this case, libevtr would generate a special
EVTR_TYPE_COMPLETION event, with special fields:
struct evtr_event {
	/* ... */
	union {
		struct completion {
			const char *name;	/* "INT" */
			void *obj;		/* kernel pointer */
			/* or maybe const char *obj_name; see below */
		} compl;
		/* ... */
	};
};
With a bit of behind-the-scene magic we could provide the device name as
well (or maybe recording just the device name from the start would be
simpler, depends on whether we'd like to expose other device properties).
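
As a rough illustration of how a '#'-prefixed format string such as
"#compl:INT BEG DEV=%p" might be split into its parts in userspace, here
is a small C sketch. The parse_special_fmt() function, the
struct parsed_fmt layout and the fixed "type:NAME PHASE" grammar are
assumptions for this example only; the mail does not specify the actual
parser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result of parsing the head of a special format string. */
struct parsed_fmt {
	char type[16];	/* e.g. "compl" */
	char name[16];	/* e.g. "INT" */
	char phase[8];	/* e.g. "BEG" */
};

/*
 * Return 1 and fill *out if fmt looks like "#type:NAME PHASE ...",
 * return 0 otherwise (including for ordinary ktr format strings,
 * which do not start with '#').
 */
int
parse_special_fmt(const char *fmt, struct parsed_fmt *out)
{
	assert(fmt != NULL && out != NULL);
	if (fmt[0] != '#')
		return 0;
	/* Field widths keep sscanf from overflowing the fixed buffers. */
	if (sscanf(fmt + 1, "%15[^:]:%15s %7s",
	    out->type, out->name, out->phase) != 3)
		return 0;
	return 1;
}
```

The remaining "DEV=%p" part would still have to be matched against the
event's arguments to recover the kernel pointer; that step is omitted.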
Then, one could use something like

evtranalyze stats duration "INT DEV=devname"
evtranalyze stats qlen "TXPKT NETIF=ifname"
evtranalyze stats qlen "DISKIO DEV=devname"
and the respective plot commands, e.g.

evtranalyze plothist duration "INT DEV=devname"

to create a histogram[0] of the distribution of interrupt times, and


evtranalyze plottime qlen "DISKIO DEV=devname"

to get a qlen-time graph. Obviously one would need to make sure the trace
starts with no pending completion events for the absolute numbers to be
meaningful, so I'm not sure if this approach makes a lot of sense. One
could just as well record "var:TXQLEN DEV=%p VAL=%d" events to trivially
get a reliable plot.
This is all a bit blue-sky, but I'm mainly throwing the idea out there
to see what people think of the approach. I think standardizing format
strings a bit is a good idea in any case.
In fact this syntax was a bit ugly and suboptimal from an implementation 
POV. I've added support for hashes, so that now (for instance) at thread 
switch time we just do

cpu[%d].td = %p

and at thread creation time we do

threads[%p].name = %s

Another thing we can do (I haven't added the kernel instrumentation yet) is:

devnames["ad5"] = 0xc2b06678

This will be done at attach time and will map the device name ("ad5") to 
the 'struct device' pointer. Then you can use the pointer to index in 
the devices hash and assign stuff to it.

devices[0xc2b06678].queue = 0
devices[0xc2b06678].queue = 10
Notice that foo.field gets rewritten into foo["field"] so this is 
actually a hash, not a record. It gets created automatically if it 
doesn't exist.
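
The foo.field to foo["field"] rewrite can be sketched as a plain string
transformation. The rewrite_fields() function below is hypothetical and
only shows the idea; it assumes field names are C-style identifiers and
says nothing about how libevtr actually parses these expressions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy expr into out, turning every ".ident" into ["ident"].
 * Returns 0 on success, -1 if out (of size outsz) is too small.
 */
int
rewrite_fields(const char *expr, char *out, size_t outsz)
{
	const char *p = expr;
	size_t o = 0;

	assert(expr != NULL && out != NULL);
	while (*p != '\0') {
		if (*p == '.' &&
		    (isalpha((unsigned char)p[1]) || p[1] == '_')) {
			const char *q = p + 1;
			size_t len;

			while (isalnum((unsigned char)*q) || *q == '_')
				++q;
			len = (size_t)(q - (p + 1));
			/* 4 extra bytes for [" and "], plus room for NUL */
			if (o + len + 4 >= outsz)
				return -1;
			out[o++] = '[';
			out[o++] = '"';
			memcpy(out + o, p + 1, len);
			o += len;
			out[o++] = '"';
			out[o++] = ']';
			p = q;
		} else {
			if (o + 1 >= outsz)
				return -1;
			out[o++] = *p++;
		}
	}
	if (o >= outsz)
		return -1;
	out[o] = '\0';
	return 0;
}
```

With this sketch, devices[0xc2b06678].queue becomes
devices[0xc2b06678]["queue"], so every level is an ordinary hash lookup
by key.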

Our evtranalyze now can do statistical analysis on integer variables; 
for instance, this works:

evtranalyze -f blah.evtr stats 'devices[devnames.ad5].queue'
median for variable devices[devnames.ad5].queue is 5.000000
Adding more statistics and plotting is very straightforward.
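
For reference, a median over the recorded values of an integer variable
could be computed as in the following C sketch. The median_of() helper is
hypothetical, and averaging the two middle values for an even number of
samples is an assumption; the mail does not say how evtranalyze handles
that case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparison callback for long values, ascending order */
static int
cmp_long(const void *a, const void *b)
{
	long x = *(const long *)a;
	long y = *(const long *)b;

	return (x > y) - (x < y);
}

/*
 * Return the median of vals[0..n-1]; n must be at least 1.
 * The array is sorted in place. For an even n the two middle
 * values are averaged.
 */
double
median_of(long *vals, size_t n)
{
	assert(vals != NULL && n > 0);
	qsort(vals, n, sizeof(vals[0]), cmp_long);
	if (n % 2 == 1)
		return (double)vals[n / 2];
	return ((double)vals[n / 2 - 1] + (double)vals[n / 2]) / 2.0;
}
```

Fed the two queue assignments shown earlier (0 and 10), this yields 5.0,
which is consistent with the sample output above, although the actual
trace may of course have contained other values.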

I'm currently assuming a static device hierarchy, which ktrdump retrieves 
via libdevinfo and dumps at the start of the event stream as pseudo 

I'm not sure how to elegantly handle completion events yet. I might have 
to add support for a simple object type system. We might want to do 
pattern matching, so I may consider a Haskell-like approach. Haven't 
really thought about it yet. Is it too late to import lua? ;-)

Keep in mind that all this needs absolutely no kernel support (this has 
been an objective from the start) other than modifying the ktr format 
strings. If a format string starts with '#' then it gets interpreted by 
libevtr, in userspace. And I dare say the changes make the format 
strings more readable even for ktrdump users.

All feedback is welcome,
