Postfix suddenly stopped working
Matthew Dillon
dillon at apollo.backplane.com
Sat Aug 12 17:07:28 PDT 2006
:I ran with -HEAD built from a couple of weeks ago and did not see a
:recurrence of the postfix queue "sticking". Last night, I went back to
:
:DragonFly woodstock.nethamilton.net 1.7.0-PREVIEW DragonFly 1.7.0-PREVIEW #6: Sat Aug 12 12:07:04 CDT 2006 hamilton at xxxxxxxxxxxxxxxxxxxxxxxxx:/usr/obj/usr/src/sys/WOODSTOCK i386
:
:with a build/installkernel and installworld.
:
:To my surprise, the symptom popped back up this morning. I checked the source,
:and found that the patch above hadn't been applied. I applied the patch and
:rebuilt and installed the kernel, and the queue got stuck again this
:afternoon.
There are two commits and I'm not sure whether you applied both of them.
kern/kern_lockf.c 1.32 and 1.33 both need to be applied.
PREVIEW isn't HEAD. I will slip the PREVIEW tag for those two commits
right now. If you scrap your manual patch and resync with preview you
should get both patches.
:I ran vnodeinfo as above, and after ripping out the non-locked stuff from
:the output the results are at http://www.nethamilton.net/lock_debug/stuck1.txt
:(which is pre-patch) and http://www.nethamilton.net/lock_debug/stuck2.txt
:(post-patch). I'm not sure what this is trying to tell me aside from
:confirming that postfix is holding a lock on unix.local.
:
:A couple of questions:
:1) is this a different problem, since it's occurring even after I applied
: the patch?
:2) what can I do to diagnose further?
:
:I'm happy to fiddle around to gather info on this, but need a little
:hand holding in terms of exactly what to do.
:
:--
:
: Jon Hamilton
: hamilton at xxxxxxxxx
From the information you posted I'm guessing that a lock did not get
released, which is a symptom of the problem addressed by the 1.32 commit
(the patch I emailed you was the 1.33 commit, but PREVIEW did not have
1.32 OR 1.33).
I have included the diff between 1.31 and 1.33 of kern_lockf.c below
for reference, but if you update to the latest preview you should
get the patches automatically.
-Matt
Matthew Dillon
<dillon at xxxxxxxxxxxxx>
Index: kern_lockf.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_lockf.c,v
retrieving revision 1.31
retrieving revision 1.33
diff -u -r1.31 -r1.33
--- kern_lockf.c 27 May 2006 02:03:17 -0000 1.31
+++ kern_lockf.c 3 Aug 2006 16:06:15 -0000 1.33
@@ -38,7 +38,7 @@
*
* @(#)ufs_lockf.c 8.3 (Berkeley) 1/6/94
* $FreeBSD: src/sys/kern/kern_lockf.c,v 1.25 1999/11/16 16:28:56 phk Exp $
- * $DragonFly: src/sys/kern/kern_lockf.c,v 1.31 2006/05/27 02:03:17 dillon Exp $
+ * $DragonFly: src/sys/kern/kern_lockf.c,v 1.33 2006/08/03 16:06:15 dillon Exp $
*/
#include <sys/param.h>
@@ -239,8 +239,15 @@
switch(ap->a_op) {
case F_SETLK:
- ap->a_vp->v_flag |= VMAYHAVELOCKS;
+ /*
+ * NOTE: It is possible for both lf_range and lf_blocked to
+ * be empty if we block and get woken up, but another process
+ * then gets in and issues an unlock. So VMAYHAVELOCKS must
+ * be set after the lf_setlock() operation completes rather
+ * then before.
+ */
error = lf_setlock(lock, owner, type, flags, start, end);
+ ap->a_vp->v_flag |= VMAYHAVELOCKS;
break;
case F_UNLCK:
@@ -683,7 +690,7 @@
* Extend brange to cover range and scrap range.
*/
brange->lf_end = range->lf_end;
- brange->lf_flags |= brange->lf_flags & F_NOEND;
+ brange->lf_flags |= range->lf_flags & F_NOEND;
TAILQ_REMOVE(&lock->lf_range, range, lf_link);
if (range->lf_flags & F_POSIX)
--count;
@@ -753,20 +760,23 @@
}
/*
- * Wakeup pending lock attempts.
+ * Wakeup pending lock attempts. Theoretically we can stop as soon as
+ * we encounter an exclusive request that covers the whole range (at least
+ * insofar as the sleep code above calls lf_wakeup() if it would otherwise
+ * exit instead of loop), but for now just wakeup all overlapping
+ * requests. XXX
*/
static void
lf_wakeup(struct lockf *lock, off_t start, off_t end)
{
struct lockf_range *range, *nrange;
+
TAILQ_FOREACH_MUTABLE(range, &lock->lf_blocked, lf_link, nrange) {
if (lf_overlap(range, start, end) == 0)
continue;
TAILQ_REMOVE(&lock->lf_blocked, range, lf_link);
range->lf_flags = 1;
wakeup(range);
- if (range->lf_start >= start && range->lf_end <= end)
- break;
}
}
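
Two of the hunks above are easy to misread in diff form, so here is a
minimal user-space sketch of what each one changes mechanically. It is
not kernel code: struct demo_range, DEMO_F_NOEND, DEMO_WOKEN, the
inclusive-end overlap test, and all four function names are hypothetical
stand-ins invented for illustration, and the flag values are arbitrary.
The sketch makes no claim about which hunk was responsible for the stuck
queue.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real definitions live in the kernel. */
#define DEMO_F_NOEND 0x01	/* assumed bit value, illustration only */
#define DEMO_WOKEN   0x02	/* assumed bit value, illustration only */

struct demo_range {
	int64_t start;
	int64_t end;		/* treated as inclusive (an assumption) */
	int flags;
};

/* 1.31 form: ORs brange's own bit back into itself, which is a no-op. */
static void
merge_old(struct demo_range *brange, const struct demo_range *range)
{
	brange->end = range->end;
	brange->flags |= brange->flags & DEMO_F_NOEND;
}

/* 1.33 form: carries the bit over from the range being absorbed. */
static void
merge_new(struct demo_range *brange, const struct demo_range *range)
{
	brange->end = range->end;
	brange->flags |= range->flags & DEMO_F_NOEND;
}

/* Nonzero if r shares at least one offset with [start, end]. */
static int
overlaps(const struct demo_range *r, int64_t start, int64_t end)
{
	return (r->start <= end && r->end >= start);
}

/*
 * Mark overlapping blocked requests as woken and return how many were
 * marked.  stop_early != 0 mimics the 1.31 loop, which left the loop
 * after the first woken request lying entirely inside [start, end];
 * stop_early == 0 mimics the 1.33 loop, which wakes every overlap.
 */
static int
wake_blocked(struct demo_range *blocked, int n, int64_t start, int64_t end,
    int stop_early)
{
	int woken = 0;

	for (int i = 0; i < n; i++) {
		struct demo_range *r = &blocked[i];

		if (!overlaps(r, start, end))
			continue;
		r->flags |= DEMO_WOKEN;
		woken++;
		if (stop_early && r->start >= start && r->end <= end)
			break;
	}
	return (woken);
}
```

With a range { 0, 99, 0 } absorbing { 100, INT64_MAX, DEMO_F_NOEND },
merge_old() extends the end but leaves the flag clear, while merge_new()
sets it. With blocked requests 10-20, 15-30 and 200-300 and a released
span of 0-100, wake_blocked() marks only the first request when
stop_early is set and both overlapping requests when it is not.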