    scsi: fix sense_slab/bio swapping livelock · 164fc5dc
    Hugh Dickins authored
    Since 2.6.25-rc7, I've been seeing an occasional livelock on one x86_64
    machine, copying kernel trees to tmpfs, paging out to swap.
    
    Signature: 6000 pages under writeback but never getting written; most
    tasks of interest trying to reclaim, but each get_swap_bio waiting for a
    bio in mempool_alloc's io_schedule_timeout(5*HZ); every five seconds an
    atomic page allocation failure report from kblockd failing to allocate a
    sense_buffer in __scsi_get_command.
    
    __scsi_get_command has a (one item) free_list to protect against this,
    but rc1's "[SCSI] use dynamically allocated sense buffer" (de25deb1)
    upset that slightly.  When it
    fails to allocate from the separate sense_slab, instead of giving up, it
    must fall back to the command free_list, which is sure to have a
    sense_buffer attached.
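
    The fallback described above can be modelled in user space roughly as
    follows.  This is only a sketch, not the code from scsi.c: struct cmd,
    get_cmd(), put_cmd(), pool_alloc_cmd() and the sense_alloc_should_fail
    switch are all hypothetical names, and malloc/calloc stand in for the
    kernel slab caches.  The one point it illustrates is that a failed
    sense buffer allocation is treated like a failed command allocation,
    so the reserved command (which already owns a sense buffer) gets used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SENSE_SIZE 96   /* illustrative sense buffer size (assumption) */

/* Hypothetical stand-in for a SCSI command with a separate sense buffer. */
struct cmd {
    unsigned char *sense_buffer;
    int from_reserve;   /* 1 if this is the preallocated reserve command */
};

/* Hypothetical switch that simulates allocation failure under pressure. */
static int sense_alloc_should_fail;

/* Stand-in for allocating from the separate sense buffer cache. */
static unsigned char *sense_alloc(void)
{
    if (sense_alloc_should_fail)
        return NULL;
    return calloc(1, SENSE_SIZE);
}

/* One-item reserve: a command that always has a sense buffer attached. */
static unsigned char reserve_sense[SENSE_SIZE];
static struct cmd reserve_cmd = { reserve_sense, 1 };
static struct cmd *free_list = &reserve_cmd;

/* Allocate a command plus sense buffer; fail as a unit if either fails. */
static struct cmd *pool_alloc_cmd(void)
{
    struct cmd *c = calloc(1, sizeof(*c));

    if (!c)
        return NULL;
    c->sense_buffer = sense_alloc();
    if (!c->sense_buffer) {
        free(c);
        return NULL;
    }
    return c;
}

/* Try the normal path first; on any failure fall back to the reserve. */
static struct cmd *get_cmd(void)
{
    struct cmd *c = pool_alloc_cmd();

    if (!c && free_list) {
        c = free_list;
        free_list = NULL;
        /* Reuse the attached sense buffer, just clear its contents. */
        memset(c->sense_buffer, 0, SENSE_SIZE);
    }
    return c;
}

/* Return a command: the reserve goes back on the free list. */
static void put_cmd(struct cmd *c)
{
    assert(c != NULL);
    if (c->from_reserve) {
        free_list = c;
        return;
    }
    free(c->sense_buffer);
    free(c);
}
```

    With the failure switch set, get_cmd() hands out the reserve command
    once and then returns NULL until put_cmd() gives it back, which models
    the host working through its queue one command at a time.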
    
    Either my earlier -rc testing missed this, or there's some recent
    contributory factor.  One very significant factor is SLUB, which merges
    slab caches when it can, and on 64-bit happens to merge both bio cache
    and sense_slab cache into kmalloc's 128-byte cache: so that under this
    swapping load, bios above are liable to gobble up all the slots needed
    for scsi_cmnd sense_buffers below.
    
    That's disturbing behaviour, and I tried a few things to fix it.  Adding
    a no-op constructor to the sense_slab inhibits SLUB from merging it, and
    stops all the allocation failures I was seeing; but it's rather a hack,
    and perhaps in different configurations we have other caches on the
    swapout path which are ill-merged.
    
    Another alternative is to revert the separate sense_slab, using
    cache-line-aligned sense_buffer allocated beyond scsi_cmnd from the one
    kmem_cache; but that might waste more memory, and is only a way of
    diverting around the known problem.
    
    While I don't like seeing the allocation failures, and hate the idea of
    all those bios piled up above a scsi host working one by one, it does
    seem to emerge fairly soon with the livelock fix.  So lacking better
    ideas, stick with that one clear fix for now.

    Signed-off-by: Hugh Dickins <hugh@veritas.com>
    Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
    Cc: Jens Axboe <jens.axboe@oracle.com>
    Cc: Christoph Lameter <clameter@sgi.com>
    Cc: Pekka Enberg <penberg@cs.helsinki.fi>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Rafael J. Wysocki <rjw@sisk.pl>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
scsi.c 32.4 KB