1. 08 Sep, 2008 6 commits
    • netns : fix kernel panic in timewait socket destruction · d315492b
      Daniel Lezcano authored
      How to reproduce:
       - create a network namespace
       - use the TCP protocol and get a timewait socket
       - exit the network namespace
       - after a moment (when the timewait socket is destroyed), the kernel
         panics.
      
      # BUG: unable to handle kernel NULL pointer dereference at
      0000000000000007
      IP: [<ffffffff821e394d>] inet_twdr_do_twkill_work+0x6e/0xb8
      PGD 119985067 PUD 11c5c0067 PMD 0
      Oops: 0000 [1] SMP
      CPU 1
      Modules linked in: ipv6 button battery ac loop dm_mod tg3 libphy ext3 jbd
      edd fan thermal processor thermal_sys sg sata_svw libata dock serverworks
      sd_mod scsi_mod ide_disk ide_core [last unloaded: freq_table]
      Pid: 0, comm: swapper Not tainted 2.6.27-rc2 #3
      RIP: 0010:[<ffffffff821e394d>] [<ffffffff821e394d>]
      inet_twdr_do_twkill_work+0x6e/0xb8
      RSP: 0018:ffff88011ff7fed0 EFLAGS: 00010246
      RAX: ffffffffffffffff RBX: ffffffff82339420 RCX: ffff88011ff7ff30
      RDX: 0000000000000001 RSI: ffff88011a4d03c0 RDI: ffff88011ac2fc00
      RBP: ffffffff823392e0 R08: 0000000000000000 R09: ffff88002802a200
      R10: ffff8800a5c4b000 R11: ffffffff823e4080 R12: ffff88011ac2fc00
      R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
      FS: 0000000041cbd940(0000) GS:ffff8800bff839c0(0000)
      knlGS:0000000000000000
      CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 0000000000000007 CR3: 00000000bd87c000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 0, threadinfo ffff8800bff9e000, task
      ffff88011ff76690)
      Stack: ffffffff823392e0 0000000000000100 ffffffff821e3a3a
      0000000000000008
      0000000000000000 ffffffff821e3a61 ffff8800bff7c000 ffffffff8203c7e7
      ffff88011ff7ff10 ffff88011ff7ff10 0000000000000021 ffffffff82351108
      Call Trace:
      <IRQ> [<ffffffff821e3a3a>] ? inet_twdr_hangman+0x0/0x9e
      [<ffffffff821e3a61>] ? inet_twdr_hangman+0x27/0x9e
      [<ffffffff8203c7e7>] ? run_timer_softirq+0x12c/0x193
      [<ffffffff820390d1>] ? __do_softirq+0x5e/0xcd
      [<ffffffff8200d08c>] ? call_softirq+0x1c/0x28
      [<ffffffff8200e611>] ? do_softirq+0x2c/0x68
      [<ffffffff8201a055>] ? smp_apic_timer_interrupt+0x8e/0xa9
      [<ffffffff8200cad6>] ? apic_timer_interrupt+0x66/0x70
      <EOI> [<ffffffff82011f4c>] ? default_idle+0x27/0x3b
      [<ffffffff8200abbd>] ? cpu_idle+0x5f/0x7d
      
      
      Code: e8 01 00 00 4c 89 e7 41 ff c5 e8 8d fd ff ff 49 8b 44 24 38 4c 89 e7
      65 8b 14 25 24 00 00 00 89 d2 48 8b 80 e8 00 00 00 48 f7 d0 <48> 8b 04 d0
      48 ff 40 58 e8 fc fc ff ff 48 89 df e8 c0 5f 04 00
      RIP [<ffffffff821e394d>] inet_twdr_do_twkill_work+0x6e/0xb8
      RSP <ffff88011ff7fed0>
      CR2: 0000000000000007
      
      This patch provides a function to purge all timewait sockets related
      to a network namespace. The life cycle of a timewait socket is not tied
      to the network namespace, which means timewait sockets stay alive after
      the network namespace dies. Timewait sockets exist to avoid accepting a
      duplicate packet from the network; once the network namespace is freed,
      its network stack is removed, so there is no chance of receiving any
      packet from the outside world. Furthermore, leaving a destruction timer
      pending on these sockets after the network namespace is freed is not
      safe: it leads to an oops when the timer callback tries to access data
      belonging to the namespace, for example in:
      	inet_twdr_do_twkill_work
      		-> NET_INC_STATS_BH(twsk_net(tw), LINUX_MIB_TIMEWAITED);
      
      Purging the timewait sockets at the network namespace destruction will:
       1) speed up memory freeing for the namespace
       2) fix kernel panic on asynchronous timewait destruction
      Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
      Acked-by: Denis V. Lunev <den@openvz.org>
      Acked-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
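The purge described in this commit can be sketched in userspace C. This is a minimal illustration, not the kernel patch: `struct net_ns`, `struct tw_sock`, `tw_add`, `tw_purge` and `tw_count` are hypothetical names, and a singly linked list stands in for the kernel's timewait hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a network namespace and a timewait socket. */
struct net_ns { int id; };

struct tw_sock {
    struct net_ns *net;     /* namespace that owns this entry */
    struct tw_sock *next;
};

/* Push a new entry owned by `net`; returns the new list head
 * (or the old head unchanged if allocation fails). */
struct tw_sock *tw_add(struct tw_sock *head, struct net_ns *net)
{
    struct tw_sock *tw = malloc(sizeof(*tw));

    if (!tw)
        return head;
    tw->net = net;
    tw->next = head;
    return tw;
}

/* Unlink and free every entry owned by `net`, so that nothing is left
 * to touch the namespace after it is gone. Returns the number purged. */
size_t tw_purge(struct tw_sock **head, const struct net_ns *net)
{
    size_t purged = 0;

    assert(head != NULL);
    while (*head) {
        struct tw_sock *tw = *head;

        if (tw->net == net) {
            *head = tw->next;   /* unlink before freeing */
            free(tw);
            purged++;
        } else {
            head = &tw->next;   /* keep entries of other namespaces */
        }
    }
    return purged;
}

/* Count the entries still on the list. */
size_t tw_count(const struct tw_sock *head)
{
    size_t n = 0;

    for (; head; head = head->next)
        n++;
    return n;
}
```

Calling `tw_purge(&head, &ns)` when namespace `ns` is torn down removes only that namespace's entries and leaves the others on the list.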
    • pkt_sched: Fix qdisc state in net_tx_action() · e8a83e10
      Jarek Poplawski authored
      net_tx_action() can skip clearing the __QDISC_STATE_SCHED bit when the
      qdisc is neither run nor rescheduled, which may cause an endless loop in
      dev_deactivate().
      Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
      Tested-by: Denys Fedoryshchenko <denys@visp.net.lb>
      Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: nf_conntrack_irc: make sure string is terminated before calling simple_strtoul · e3b802ba
      Patrick McHardy authored
      Alexey Dobriyan points out:
      
      1. simple_strtoul() silently accepts all characters for given base even
         if result won't fit into unsigned long. This is amazing stupidity in
         itself, but
      
      2. The nf_conntrack_irc helper uses simple_strtoul() for DCC request
         parsing. Data is first copied into a 64KB buffer, so theoretically
         nothing prevents reading past the end of it, since the data comes
         from the network, given 1).
      
      This is not actually a problem currently since we're guaranteed to have
      a 0 byte in skb_shared_info or in the buffer the data is copied to, but
      to make this more robust, make sure the string is actually terminated.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
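The idea of terminating the string before parsing can be sketched in userspace C. This is an illustration under stated assumptions, not the kernel change: `parse_bounded_ul` is a hypothetical helper, the standard `strtoul()` stands in for `simple_strtoul()`, and the 32-byte scratch buffer is an arbitrary choice (longer inputs are truncated).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a decimal number from data[0..len), which is
 * NOT guaranteed to be NUL-terminated. The bytes are copied into a small
 * scratch buffer and terminated explicitly, so the parser can never run
 * past the end of the input. */
unsigned long parse_bounded_ul(const char *data, size_t len, size_t *consumed)
{
    char buf[32];
    char *end;
    unsigned long val;

    assert(data != NULL);
    if (len > sizeof(buf) - 1)
        len = sizeof(buf) - 1;      /* truncate over-long input */
    memcpy(buf, data, len);
    buf[len] = '\0';                /* the explicit terminator */

    val = strtoul(buf, &end, 10);
    if (consumed)
        *consumed = (size_t)(end - buf);
    return val;
}
```

With an unterminated array holding the bytes `123456`, asking for the first three bytes parses only `123`, because the terminator is written at the caller-supplied length rather than wherever a zero byte happens to follow in memory.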
    • netfilter: nf_conntrack_gre: nf_ct_gre_keymap_flush() fixlet · 51807e91
      Alexey Dobriyan authored
      It does "kfree(list_head)", which looks wrong because the entity that
      was allocated is definitely not a list_head.

      However, this all works because the list_head is the first member of
      struct nf_ct_gre_keymap.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
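Why freeing the embedded list node happens to work, and what the cleaner form looks like, can be sketched in userspace C. The names here are assumptions for illustration: `struct list_node` and `struct keymap` are hypothetical stand-ins for `list_head` and `struct nf_ct_gre_keymap`, and `container_of` is redefined locally in the style of the kernel macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Hypothetical stand-in for struct nf_ct_gre_keymap. */
struct keymap {
    struct list_node list;  /* first member, so it sits at offset 0 */
    int key;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Map a list node back to the object that was actually allocated. */
struct keymap *keymap_from_node(struct list_node *node)
{
    assert(node != NULL);
    return container_of(node, struct keymap, list);
}
```

Because `list` is the first member, `&km->list` and `km` hold the same address, which is why freeing the node pointer works by accident. Freeing `keymap_from_node(node)` instead stays correct even if the member is later moved.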
    • netfilter: nf_conntrack_gre: more locking around keymap list · 887464a4
      Alexey Dobriyan authored
      gre_keymap_list should be protected in all places.
      (unless I'm misreading something)
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: nf_conntrack_sip: de-static helper pointers · 66bf7918
      Alexey Dobriyan authored
      A helper's ->help hook can run concurrently with itself, so iterating
      over SIP helpers with a static pointer won't work reliably.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
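The hazard of a static iteration cursor can be shown deterministically in userspace C. This sketch is an assumption-laden illustration, not the SIP helper code: `visit_static` and `visit_local` are hypothetical functions, and a nested call stands in for a second concurrent invocation, since both share the single static variable in the same way.

```c
#include <assert.h>

#define N_ITEMS 3

/* BROKEN: the cursor is static, so every invocation shares it.
 * The nested call (standing in for a concurrent one) leaves the
 * cursor past the end, and the outer loop stops early. */
int visit_static(int depth)
{
    static int i;
    int visited = 0;

    assert(depth >= 0);
    for (i = 0; i < N_ITEMS; i++) {
        visited++;
        if (depth > 0 && i == 0)
            visit_static(depth - 1);    /* clobbers the shared cursor */
    }
    return visited;
}

/* FIXED: the cursor lives on the stack, one per invocation. */
int visit_local(int depth)
{
    int i;
    int visited = 0;

    assert(depth >= 0);
    for (i = 0; i < N_ITEMS; i++) {
        visited++;
        if (depth > 0 && i == 0)
            visit_local(depth - 1);     /* has its own cursor */
    }
    return visited;
}
```

With one level of nesting, the outer `visit_static(1)` loop visits only one of the three items, while `visit_local(1)` visits all three.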
  2. 03 Sep, 2008 21 commits
  3. 02 Sep, 2008 11 commits
  4. 29 Aug, 2008 2 commits
    • net: Unbreak userspace usage of linux/mroute.h · 7c19a3d2
      David S. Miller authored
      Nothing in linux/pim.h should be exported to userspace.
      
      This should fix the XORP build failure reported by
      Jose Calhariz, the Debian package maintainer.
      
      Nothing originally in linux/mroute.h was exported to userspace
      ever, but some of this stuff started to be when it was moved into
      this new linux/pim.h, and that was wrong.  If we didn't provide these
      definitions for 10 years we can reasonably expect that applications
      defined this stuff locally or used GLIBC headers providing the
      protocol definitions.  And as such the only result of this can
      be conflict and userland build breakage.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pkt_sched: Fix locking of qdisc_root with qdisc_root_sleeping_lock() · 102396ae
      Jarek Poplawski authored
      Use qdisc_root_sleeping_lock() instead of qdisc_root_lock() where
      appropriate. The only difference arises while the dev is deactivated,
      when we can currently use a sleeping qdisc with the lock of noop_qdisc.
      This shouldn't be dangerous, since after deactivation the root lock
      could be used only by gen_estimator code, but it looks wrong anyway.
      Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>