critical kernel panic on 4.2 (or any other version) on hammer volume-del

Tomohiro Kusumi kusumi.tomohiro at gmail.com
Thu Mar 24 11:51:49 PDT 2016


This has been fixed in master and 4.4.

https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/2f1df9cece2c18f660b07748af682ac3c47cc50f
https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/622d6d5bb6c78995fbb7d00c0f92e9406d8ef802


2015-09-01 4:51 GMT+09:00 Matthew Dillon <dillon at backplane.com>:

> Well, I'm amazed that volume-add/volume-del work at all.  It was coded way
> back in 2009 by Michael Neumann (I think), but only a few people ever used
> it so the feature has never seen the same sort of use or load that other
> features have seen.
>
> It would be pretty cool if you could get the volume add/del stuff working
> reliably.
>
> If you post the backtrace I might have some ideas on what the timing issue
> between the flusher and the volume-del code could be.
>
> -Matt
>
> On Mon, Aug 31, 2015 at 12:10 PM, Tomohiro Kusumi <
> kusumi.tomohiro at gmail.com> wrote:
>
>> There is a critical kernel panic issue while running hammer volume-del.
>> I've seen this happen not only on master but also on the 4.2.4 release
>> kernel (which predates my changes to the hammer volume-add|del ioctl code),
>> and, given the way it works, probably on older kernels too.
>>
>> This seems to be a timing issue: a race between the ioctl called by a
>> volume-del process and the flusher kernel thread. It's relatively easy to
>> reproduce in my environment with >=4 volumes but never with 2 or 3 volumes.
>> To reproduce it, the first one or two volumes need to be filled up and
>> reblock needs to run before the volume is unloaded.
>>
>> In the worst case, a volume gets removed and its volume header gets erased
>> as usual, but the kernel could panic before the volume count of the
>> remaining volumes gets updated (decremented). This results in an
>> inconsistency between the volume count and the number of valid volumes, and
>> the filesystem will no longer be mountable unless the volume headers of the
>> block devices are edited directly.
>>
>> I haven't fixed this yet, and the fix isn't obvious, but it seems the
>> flusher kernel thread needs to be aware of an ongoing volume-del by
>> checking if
>> hmp->volume_to_remove != -1
>> similar to the way the blockmap allocator checks it while looking for free
>> space within a bigblock.
>>
>
>
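
For anyone reading this thread later, here is a rough userspace sketch of the idea in the last quoted paragraph: a flusher that consults volume_to_remove and leaves a volume alone while a volume-del is in progress. Only the field name volume_to_remove and its -1 sentinel come from the message above. The struct, the sketch_* function names, and the per-volume dirty flags are hypothetical stand-ins, and this is not the actual fix from the commits linked above.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_VOLUMES 8    /* hypothetical limit, for illustration only */

/*
 * Hypothetical stand-in for the mount structure. Only volume_to_remove
 * (and its -1 "nothing being removed" value) is taken from the message.
 */
struct sketch_mount {
	int  volume_to_remove;              /* -1 when no volume-del is running */
	int  nvolumes;                      /* number of volumes in this model */
	bool dirty[SKETCH_MAX_VOLUMES];     /* assumed per-volume dirty flag */
};

/* True if a volume-del is in progress and targets vol_no. */
static bool
sketch_volume_is_leaving(const struct sketch_mount *hmp, int vol_no)
{
	return (hmp->volume_to_remove != -1 &&
		hmp->volume_to_remove == vol_no);
}

/*
 * Model of one flusher pass: flush every dirty volume except the one
 * being removed. Returns the number of volumes flushed.
 */
static int
sketch_flush(struct sketch_mount *hmp)
{
	int flushed = 0;

	assert(hmp->nvolumes >= 0 && hmp->nvolumes <= SKETCH_MAX_VOLUMES);
	for (int v = 0; v < hmp->nvolumes; v++) {
		if (!hmp->dirty[v])
			continue;
		if (sketch_volume_is_leaving(hmp, v))
			continue;       /* leave it to the volume-del path */
		hmp->dirty[v] = false;
		flushed++;
	}
	return flushed;
}
```

In a real kernel the check would also have to be serialized against the ioctl (a lock or similar), which this single-threaded model does not attempt to show.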