FreeBSD 7, DragonFly's status
Kris Kennaway
kris at obsecurity.org
Mon Mar 10 17:03:25 PDT 2008
Dave Hayes wrote:
Does an objective metric of stability actually exist? ( If you say
"uptime" I'll take that as a "no" ;) ) If it does, I would really like
to learn what that metric is. Do you know of any current
low-project-bias work that has been done in this area?
Thanks in advance. :)
It's easiest to define "stability" as the lack of instability, i.e.
"the system does not crash no matter what you do to it".
The best method I know to evaluate this is by brute force (other
techniques like static code analysis and formal model checking can
help). You have to try really hard to put the system through all kinds
of bizarre contortions in the workloads you care about (which is
"everything" for a general OS developer) until you find something that
breaks. Then fix it and try again.
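As a rough illustration of the brute-force idea, here is a minimal sketch of a randomized stress loop. It is not stress2's code or method; the names random_file_op, worker and stress are hypothetical, and the workload (a few threads racing to write, read, rename and unlink the same handful of files) is just an assumed example of a "bizarre contortion". A real kernel panic would take the whole machine down rather than show up in the returned list, which only records unexpected errors seen from userland.

```python
import os
import random
import tempfile
import threading
import time


def random_file_op(root, rng):
    """Perform one randomly chosen file operation on a small shared set of names."""
    path = os.path.join(root, "f%d" % rng.randrange(8))
    op = rng.choice(["write", "read", "rename", "unlink"])
    try:
        if op == "write":
            with open(path, "wb") as f:
                f.write(os.urandom(rng.randrange(1, 4096)))
        elif op == "read":
            with open(path, "rb") as f:
                f.read()
        elif op == "rename":
            os.rename(path, os.path.join(root, "f%d" % rng.randrange(8)))
        else:
            os.unlink(path)
    except FileNotFoundError:
        # Expected: another worker removed or renamed the file first.
        pass


def worker(root, seed, deadline, failures):
    """Run random operations until the deadline; record the first unexpected error."""
    rng = random.Random(seed)  # per-worker seed so a failing run can be replayed
    while time.monotonic() < deadline:
        try:
            random_file_op(root, rng)
        except Exception as exc:
            failures.append((seed, repr(exc)))
            return


def stress(seconds=1.0, workers=8, seed=0):
    """Race several workers against one scratch directory; return recorded failures."""
    failures = []
    with tempfile.TemporaryDirectory() as root:
        deadline = time.monotonic() + seconds
        threads = [
            threading.Thread(target=worker, args=(root, seed + i, deadline, failures))
            for i in range(workers)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    return failures


if __name__ == "__main__":
    for seed, error in stress(seconds=2.0):
        print("worker seed %d hit: %s" % (seed, error))
```

Recording the seed alongside each failure is a deliberate choice: it narrows down which sequence of operations to replay, although races between threads mean a replay is not guaranteed to reproduce the problem.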
You have to put serious effort into it, though, because after you fix all
the "obvious" panics that can be reproduced in a few minutes of testing,
you end up trying to cause extremely low probability events that can
(and do) nevertheless pop up on real systems given the right combination
of circumstances. If you don't put in the work, the bugs will usually
not get fixed until they crash a user's system and the user bothers to
report it. That doesn't always happen; often users just curse you out.
Once you can no longer trigger bugs no matter how hard you try (assuming
you are achieving good coverage of the system), I think it's reasonable
to provisionally award the label of "pretty stable" to the aspects of
the system you have been testing. There will always be more bugs than
those you found (especially with particular hardware configurations),
but at least you've made a concerted effort to find them.
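To put a rough number on "extremely low probability", here is a small worked calculation. It rests on a strong assumption that is not part of the argument above: that each test run is an independent trial with the same fixed chance of triggering the bug, which real stress runs only approximate. Under that assumption, n clean runs bound the per-run failure probability p by 1 - (1 - c)^(1/n) at confidence c, which is about 3/n for c = 0.95 (the "rule of three"). The function names are hypothetical.

```python
import math


def failure_rate_upper_bound(clean_runs, confidence=0.95):
    """Upper bound on per-run failure probability after clean_runs failure-free runs.

    Assumes independent runs with a fixed failure probability.
    """
    if clean_runs <= 0:
        raise ValueError("clean_runs must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / clean_runs)


def runs_needed(max_failure_rate, confidence=0.95):
    """Failure-free runs needed to bound the failure rate by max_failure_rate."""
    if not 0.0 < max_failure_rate < 1.0:
        raise ValueError("max_failure_rate must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - max_failure_rate))


if __name__ == "__main__":
    print(failure_rate_upper_bound(1000))  # close to 3/1000
    print(runs_needed(1e-6))               # roughly three million runs
```

So ruling out a one-in-a-million event takes on the order of three million clean runs, which is one way to see why fixing the panics that reproduce in a few minutes is only the start.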
This is basically what stress2 and other tools try to help automate,
although automation can never replace human-driven QA. Doing it properly
is literally a full-time job.
Kris