fix for IPSEC-IPV4 breakage

Andrew Atrens atrens at nortelnetworks.com
Thu Oct 14 06:19:21 PDT 2004


>
>     This looks better, but I don't think your assumption that save_len
will
>     cover the TCP or UDP header is correct and I don't think you can
safely

Yeah, I didn't want to know about protocol header sizes so the heuristic I 
used was -

1. Cache the size of the first mbuf.
2. Call decryptor
3. if m->m_len < save_len  we know that the decryptor has modified
    the mbuf chain - effectively has truncated the first mbuf.

In this case my first plan was to m_pullup(m, save_len).  This worked 
exactly the same as my second plan which was to  m_pullup(m, m->m_len + 
m->m_next->m_len) bytes. The printf's generated were the same, and the
behaviour was the same - the printf's were happening, but the packets were
getting dropped.

>     m_pullup() the entire mbuf length plus the following mbuf because that
>     might exceed the mbuf cluster size.  Putting the m_pullup there is

Yeah I was tightly coupling the code to the assumption that *_decrypt was
truncating the first mbuf and decrypting the payload into a new mbuf, 
effectively, always splitting the first mbuf into two mbufs. Admittedly this
is a very brittle assumption.

>     not going to work, so rip out your m_pullup() code :-)
>
>     If you hop down to approximately line 357 in esp_input.c, around this
>     line:
>
>       /* was it transmitted over the IPsec tunnel SA? */
>
>     We have an interesting case here.  The first part of the conditional
>     looks correct... it pulls up the IP header and then queues the packet
>     with netisr_queue().  The else section, start at:
>
>       * strip off ESP header and IV.

I'm using 'transport' mode which definitely uses this path.

>     Does not look correct.  The question is, which code path are your
tests
>     using?  We would need a DDB backtrace of the panic to see.

Here's one backtrace -

(kgdb) bt
#0  0xc01e021f in dumpsys ()
#1  0xc01dffba in boot ()
#2  0xc01e0495 in poweroff_wait ()
#3  0xc027a769 in udp_input ()
#4  0xc0281e69 in esp4_input ()
#5  0xc026aea4 in transport_processing_oncpu ()
#6  0xc026ba94 in ip_input ()
#7  0xc026654e in div_output ()
#8  0xc026674e in div_send ()
#9  0xc020bfe1 in netmsg_pru_send ()
#10 0xc0248061 in netmsg_service_loop ()

the tcp_input panics look similarly.

>     If it is the second code path then we probably have some work to do.
>     We obviously cannot just call the protocol routine directly.  We might

This is definitely the case.

>     be able to get away with requeueing the packet the same wawy the first
>     section does, via netisr_queue(NETISR_IP, m).

Hmm. Worth a try! :)

>     Jeffrey Hsu would know for sure.

I'll copy him (separate email).

>     You could try a quick hack... replace this code:
>
>                 if (nxt != IPPROTO_DONE) {
>                         if ((inetsw[ip_protox[nxt]].pr_flags & PR_LASTHDR)
> != 0 &&
>                             ipsec4_in_reject(m, NULL)) {
>                                 ipsecstat.in_polvio++;
>                                 goto bad;
>                         }
>                         (*inetsw[ip_protox[nxt]].pr_input)(m, off, nxt);
>                 } else ...
>
>     With:
>
>               if (nxt != IPPROTO_DONE) {
>                         if ((inetsw[ip_protox[nxt]].pr_flags & PR_LASTHDR)
> != 0 &&
>                             ipsec4_in_reject(m, NULL)) {
>                                 ipsecstat.in_polvio++;
>                                 goto bad;
>                         }
>                       if (netisr_queue(NETISR_IP, m)) {
>                               ipsecstat.in_inval++;
>                               m = NULL;
>                               goto bad;
>                       }
>               } else ...
>
>     I'm not sure if this is correct, but it's worth a shot.  I'm almost
>     positive that the direct protocol dispatch is illegal.

It's interesting that the second code path is different. Looks like it's
been that way since FBSD version 1.8 (taken directly from a KAME
patchset sometime in 2001).  Incidentally version 1.7 looks very
different.

Okay, I'll give the netisr_queue thing a try.

Andrew.






More information about the Submit mailing list