[issue952] system hang with rsync

Matthew Dillon dillon at apollo.backplane.com
Sun Mar 16 12:57:24 PDT 2008


:The sync did not prevent it.  It is not actually crashing, but just
:hanging.  After I suspend or kill the rsync process on the client
:machine (running NetBSD), the server stays hung for several minutes and
:then eventually comes back alive.  Top never shows any process load.
:
:I also tried syncing every 0.1 seconds.  Same result, but it gave me
:a finer-grained view of the freeze pattern.  I printed a '.'
:for every sync.  The moment I run the rsync command on the client it
:freezes (no more dots).  Some of the time it will freeze for a few
:seconds, print a few more dots, freeze again for much longer, sometimes
:print another dot or two and then stay frozen.  I also tried nicing the
:sync to -10.  Same result.
:
:As I mentioned, it only happens with the rsync '--delete' option.  The
:freeze takes place while deleting large files and seems to occur only
:when there are more than a certain number of them to delete.  When I
:tested a while back with fewer files to delete, it did not happen.  I
:don't know how many files it takes to trigger it.  However, the freeze
:always seems to occur while deleting the first file.  After killing
:rsync and waiting several minutes for the server to come back alive,
:the first file in the list does seem to have been deleted.
:
:The directory size on the DragonFly server is 45GB with most of the file
:sizes ranging between 2GB and 4GB.  I can provide you with the full
:directory listing of both the server and the client if you like.  Either
:email it directly to you or post it to the list.  Let me know.
:
:There are 36 files on the server.
:Here is a sample listing of a few of the files.

    Do you have a console on the server?  Can you break into DDB?

    When it hangs you can break into DDB with control-alt-escape and do a
    'ps' to see what (probably many) processes are stuck on.  That will
    give us a starting point.  You can then 'cont' to continue operation.
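
    The dot-per-sync loop described in the quoted report could also be
    made to record timestamps, so the length of each freeze is known
    rather than estimated by eye.  The following is only a rough sketch
    and not something from the original thread: it assumes Python 3 on
    the server, and the names find_stalls and heartbeat, the 60-second
    run time, and the 1-second stall threshold are all made up for
    illustration.

```python
import os
import sys
import time


def find_stalls(timestamps, threshold):
    """Return (start, gap) pairs for consecutive ticks more than
    `threshold` seconds apart."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > threshold:
            stalls.append((prev, gap))
    return stalls


def heartbeat(interval=0.1, duration=60.0, threshold=1.0):
    """Call sync() every `interval` seconds, print a '.' per sync,
    and return the stalls seen during the run (assumed parameters)."""
    ticks = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        os.sync()                      # same idea as running sync(8)
        ticks.append(time.monotonic())
        sys.stdout.write(".")
        sys.stdout.flush()
        time.sleep(interval)
    return find_stalls(ticks, threshold)


if __name__ == "__main__":
    for start, gap in heartbeat():
        print(f"\nstall of {gap:.1f}s beginning at t={start:.1f}")
```

    Since the script itself freezes along with everything else, a stall
    only shows up as a gap between two ticks once the machine comes
    back, so this complements the DDB 'ps' output rather than replacing
    it.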

    I'm sure you would rather not crash the box, but if you don't mind,
    have core dumps enabled, and can get a core while it is 'stuck',
    that will give us the most information.  If you want to go that route
    I can give you a leaf.dragonflybsd.org account to upload the core to
    so the developers can have a look at it (just send me your public
    DSA key for ssh and your desired username and I will create the leaf
    account).

						-Matt




