HAMMER update 11-June-2008

Matthew Dillon dillon at apollo.backplane.com
Wed Jun 11 17:30:11 PDT 2008


    After another round of performance tuning on HAMMER, all my benchmarks
    show HAMMER within 10% of UFS's performance, and it handily beats UFS
    in certain tests such as file creation and random write performance.
    Read performance is good but drops more than UFS's under heavy write
    loads (though write performance is much better at the same time).

    I am making progress with blogbench.  It turns out that HAMMER isn't
    quite as horrible as it first appeared.  What was happening was simply
    that the test under UFS never got past blog #200, and thus never wrote
    enough data to blow out the system caches.  The blogbench test builds
    up an ever-growing dataset as it progresses.
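
    For context, the access pattern at issue looks roughly like this.
    This is a minimal, hypothetical sketch of a blogbench-style load,
    not blogbench's actual code: a writer keeps adding files while reads
    hit random existing ones, so the working set grows until it no
    longer fits in the system caches.

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <fcntl.h>
	#include <unistd.h>

	int main(int argc, char **argv) {
	    const char *dir = (argc > 1) ? argv[1] : "/mnt/bench2";
	    char path[1024];
	    static char buf[16384];
	    memset(buf, 'x', sizeof(buf));

	    for (int n = 0; n < 100000; n++) {
	        /* write a new file, growing the dataset */
	        snprintf(path, sizeof(path), "%s/article.%d", dir, n);
	        int fd = open(path, O_CREAT | O_WRONLY, 0644);
	        if (fd < 0)
	            break;
	        (void)write(fd, buf, sizeof(buf));
	        close(fd);

	        /* read back a random older file; once the dataset
	           exceeds RAM these reads mostly miss the cache */
	        snprintf(path, sizeof(path), "%s/article.%d",
	                 dir, rand() % (n + 1));
	        fd = open(path, O_RDONLY);
	        if (fd >= 0) {
	            (void)read(fd, buf, sizeof(buf));
	            close(fd);
	        }
	    }
	    return 0;
	}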

    HAMMER has superior random write performance, and because of that it
    easily got into the blog #350-400 range in the default test, or about
    double the size of the dataset.  It was also writing more during the
    first half of the test, which of course depressed its read performance
    a bit relative to UFS.

    When I increase the number of iterations sufficiently for both UFS and
    HAMMER to blow out the system caches, the results wind up being very
    different.

	blogbench --iterations=100 -d /mnt/bench2

    Now when UFS gets past blog #300 and blows out the system caches, its
    write performance goes completely to hell but it is able to maintain
    good read performance:

Nb blogs   R articles W articles R pictures W pictures R comments W comments
   322     72232      67         55654      88         45740      194
   323     83711      81         64882      81         53844      204
   325     57380      62         43314      62         36603      196
  ...
   345     17494      40         12866      50         12226      137
   347     21895      42         16655      41         13002      128
   347     22803      68         17517      12         14247      122
   348     16976      52         13397      29         12113      119
   348     20068      34         15668      52         13569      135

    HAMMER is the opposite.  It maintains fairly good write performance
    long after the system caches have been blown out, but its read
    performance drops to about the same level as its write performance
    (remember, this is blogbench doing reads from random files).  So where
    UFS's write performance basically comes to a dead halt, HAMMER keeps
    writing at the cost of some read throughput.  However, HAMMER's
    performance numbers become unstable once the system caches are blown
    out.

    Here is HAMMER:

Nb blogs   R articles W articles R pictures W pictures R comments W comments
   297     3904        972       3111        720       3228          3966
   310     2653       1104       1605        751       1740          3936
   325     2703        962       2082        708       1894          2914
   346     3637       1123       2537       1138       2204          4761
   ...
   477     1375        597       1005        700        572          2548
   496     1507       1307        995        900        825          3735
   515     1423       1068        907       1008        569          3877
   ...
   751     1221       1445        817       1086        557          3296
   761     1204        508        719        664        719          1398
   771     1352        438        824        685        525          1856


				Performance TODO

    I am going to continue working on random read and write performance,
    particularly the inconsistencies in HAMMER's performance numbers.

    There are some performance issues when re-running blogbench on a
    directory it has already been run on, under a normal HAMMER mount
    that is retaining a full history of all changes.  I believe the
    problem is related to fragmentation of the directory entries.

    I may make two additional media changes:

    * I may give directory entries their own blockmap zone or their own
      localization parameter (so they can be reblocked separately).  I haven't
      decided for sure yet.

    * I will probably increase the data block size from 16K to 64K for
      files larger than 1MB.  This will cut the number of B-Tree elements
      needed to index large files by a factor of 4 (see the arithmetic
      sketched below).
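
    As a quick sanity check on that factor of 4, here is a
    back-of-the-envelope sketch assuming one B-Tree element per data
    block, which simplifies HAMMER's actual layout:

	#include <stdio.h>

	int main(void) {
	    /* hypothetical example: a 1GB file, well above the 1MB cutoff */
	    const long long file_size = 1024LL * 1024 * 1024;
	    const long long elems_16k = file_size / (16 * 1024);
	    const long long elems_64k = file_size / (64 * 1024);
	    printf("16K blocks: %lld elements\n", elems_16k); /* 65536 */
	    printf("64K blocks: %lld elements\n", elems_64k); /* 16384, 4x fewer */
	    return 0;
	}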

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>




