[DragonFlyBSD - Bug #2756] Hit kernel panic while running hammer show cmd

bugtracker-admin at leaf.dragonflybsd.org
Sat Jan 3 16:07:41 PST 2015


Issue #2756 has been updated by tkusumi.


Tried again and was able to reproduce it. For some reason it panics once show.out reaches roughly 240-250MB.

dmesg output on the console right before it drops into db>:

> ad0 FAILURE - device detached
> subdisk0: detached
> ad0: detached

It died in a different function this time, though. x/i says it died at
getmicrouptime+0x15

The disassembly up to +0x15 (ffffffff805c3c86) looks like this.
getmicrouptime+0x15 ()
ffffffff805c3c71 <getmicrouptime>:
ffffffff805c3c71:       55                      push   %rbp
ffffffff805c3c72:       48 89 e5                mov    %rsp,%rbp
ffffffff805c3c75:       65 48 8b 0c 25 00 00    mov    %gs:0x0,%rcx
ffffffff805c3c7c:       00 00
ffffffff805c3c7e:       8b b1 28 24 00 00       mov    0x2428(%rcx),%esi
ffffffff805c3c84:       89 f6                   mov    %esi,%esi
ffffffff805c3c86:       48 89 37                mov    %rsi,(%rdi)

The bt is as follows.
getmicrouptime()
  devstat_end_transaction()
    devstat_end_transaction_buf()
      ad_done()
        ata_completed()
          taskqueue_run()
            taskqueue_swi_mp_run()
              ithread_handler()
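
As a sanity check on the offset quoted above, here is a minimal sketch that resolves "symbol+offset" to an absolute address. The base address comes from the listing; the helper names resolve_symbol_offset() and check_fault_address() are made up for illustration and are not part of ddb or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Start of getmicrouptime, taken from the disassembly listing. */
#define GETMICROUPTIME_BASE UINT64_C(0xffffffff805c3c71)

/* Hypothetical helper: turn "symbol+offset" into an absolute address. */
uint64_t
resolve_symbol_offset(uint64_t base, uint64_t offset)
{
        return base + offset;
}

/*
 * Hypothetical self-check: +0x15 should name the instruction at
 * ffffffff805c3c86, i.e. the store "mov %rsi,(%rdi)" in the listing.
 */
void
check_fault_address(void)
{
        assert(resolve_symbol_offset(GETMICROUPTIME_BASE, 0x15) ==
            UINT64_C(0xffffffff805c3c86));
}
```

Calling check_fault_address() from a small test program confirms that the faulting instruction is the store through %rdi rather than one of the loads before it.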


The timeval pointer tvp (%rdi) was NULL, so it died at
mov    %rsi,(%rdi)
which corresponds to
tvp->tv_sec = gd->gd_time_seconds;
in the following.

> void
> getmicrouptime(struct timeval *tvp)
> {
>         struct globaldata *gd = mycpu;
>         sysclock_t delta;
> 
>         do {
>                 tvp->tv_sec = gd->gd_time_seconds;
>                 delta = gd->gd_hardclock.time - gd->gd_cpuclock_base;
>         } while (tvp->tv_sec != gd->gd_time_seconds);
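
To illustrate the failing store in isolation, here is a user-space sketch of the same pattern with an explicit NULL check. This is not a proposed kernel fix: struct fake_globaldata and store_uptime_seconds() are hypothetical stand-ins, and the real question remains why the caller ended up with a NULL pointer after the device detached.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical stand-in for the one globaldata field used by the store. */
struct fake_globaldata {
        time_t gd_time_seconds;
};

/*
 * Hypothetical helper mirroring "tvp->tv_sec = gd->gd_time_seconds".
 * Returns 0 on success and -1 if tvp is NULL, instead of faulting
 * on the store as the kernel code did.
 */
int
store_uptime_seconds(struct timeval *tvp, const struct fake_globaldata *gd)
{
        if (tvp == NULL)
                return -1;
        tvp->tv_sec = gd->gd_time_seconds;
        tvp->tv_usec = 0;
        return 0;
}
```

With a valid struct timeval the helper copies the seconds value; with NULL it reports the error to the caller, which is the case that panicked here.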


Not sure if this is really a HAMMER issue. Worth trying with UFS?


----------------------------------------
Bug #2756: Hit kernel panic while running hammer show cmd
http://bugs.dragonflybsd.org/issues/2756#change-12380

* Author: tkusumi
* Status: New
* Priority: High
* Assignee: 
* Category: Kernel
* Target version: 4.0.x
----------------------------------------
Hit kernel panic while running hammer show cmd. All I did was

# uname -r
4.0-RELEASE
# hammer -f /dev/serno/xxxxxxxx.s1d show > show.out

where /dev/serno/xxxxxxxx.s1d is the volume for the / HAMMER filesystem, with enough space left. It's running as a VirtualBox guest on x86_64. It happens whenever the size of show.out gets to around 250MB.

# df -h
Filesystem                           Size   Used  Avail Capacity  Mounted on
ROOT                                  74G   8.2G    66G    11%    /
...

x/i says it died at the movl at dscheck+0x8b (ffffffff80618025)
ffffffff80618025:       44 8b 7b 0c             mov    0xc(%rbx),%r15d
ffffffff80618029:       44 3b 7d b8             cmp    -0x48(%rbp),%r15d
ffffffff8061802d:       77 28                   ja     ffffffff80618057 <dscheck+0xbd>

dscheck() was reached through a B-tree lookup issued by hammer show:
hammer_vop_strategy_read() -> hammer_ip_first() -> hammer_btree_lookup() -> btree_search() -> hammer_cursor_down() -> hammer_get_node() -> hammer_load_node() -> hammer_get_buffer() -> hammer_load_buffer() -> hammer_io_read() -> hammer_cluster_read() -> ... (failed to catch any further)

I looked at the disassembly of /boot/kernel/kernel, and this movl seems to be a NULL pointer dereference of ssp at
if (slice >= ssp->dss_nslices)
in the following.

> struct bio *
> dscheck(cdev_t dev, struct bio *bio, struct diskslices *ssp)
> {
>         struct buf *bp = bio->bio_buf;
>         struct bio *nbio;
>         disklabel_t lp;
>         disklabel_ops_t ops;
>         long nsec;
>         u_int64_t secno;
>         u_int64_t endsecno;
>         u_int64_t slicerel_secno;
>         struct diskslice *sp;
>         u_int32_t part;
>         u_int32_t slice;
>         int shift;
>         int mask;
> 
>         slice = dkslice(dev);
>         part  = dkpart(dev);
> 
>         if (bio->bio_offset < 0) {
>                 kprintf("dscheck(%s): negative bio_offset %lld\n",
>                         devtoname(dev), (long long)bio->bio_offset);
>                 goto bad;
>         }
>         if (slice >= ssp->dss_nslices) {
>                 kprintf("dscheck(%s): slice too large %d/%d\n",
>                         devtoname(dev), slice, ssp->dss_nslices);
>                 goto bad;
>         }
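
A rough sketch of why a load from 0xc(%rbx) is consistent with reading ssp->dss_nslices through a NULL ssp. The struct below is a hypothetical layout chosen only for illustration (one pointer, one int, then the slice count); it is not copied from the DragonFly headers, and the 0xc result assumes an LP64 target such as x86_64.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical layout, NOT the real struct diskslices: an 8-byte
 * pointer and a 4-byte int ahead of the field being read.
 */
struct fake_diskslices {
        void            *placeholder_ptr;  /* assumed pointer member */
        int             placeholder_int;   /* assumed int member */
        unsigned int    dss_nslices;       /* field read by the movl */
};

/* Hypothetical helper: byte offset of dss_nslices in the fake layout. */
size_t
nslices_offset(void)
{
        return offsetof(struct fake_diskslices, dss_nslices);
}
```

Under those assumptions nslices_offset() yields 0xc, so with %rbx holding a NULL ssp the instruction would read from address 0xc and fault.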






