1. 24 Feb, 2010 2 commits
    • 9918b28d
    • RDMA/cxgb3: Doorbell overflow avoidance and recovery · e998f245
      Steve Wise authored
      T3 hardware doorbell FIFO overflows can cause application stalls due
      to lost doorbell ring events.  This has been seen when running large
      NP IMB alltoall MPI jobs.  The T3 hardware supports an xon/xoff-type
      flow control mechanism to help avoid overflowing the HW doorbell FIFO.
      
      This patch uses these interrupts to disable RDMA QP doorbell rings
      when we near an overflow condition, and then turn them back on (and
      ring all the active QP doorbells) when the doorbell FIFO empties
      out.  In addition, if a doorbell ring is dropped by the hardware, the
      code will now recover.
      
      Design:
      
      cxgb3:
      - enable these DB interrupts
      - in the interrupt handler, schedule work tasks to call the ULPs' event
        handlers with the new events.
      - ring all the qset txqs when an overflow is detected.
      
      iw_cxgb3:
      - disable db ringing on all active qps when we get the DB_FULL event
      - enable db ringing on all active qps and ring all active dbs when we get
        the DB_EMPTY event
      - on the DB_DROP event:
             - disable db rings in the event handler
             - delay-schedule a work task which rings and enables the dbs on
               all active qps.
      - in post_send and post_recv logic, don't ring the db if it's disabled.
      
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
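The iw_cxgb3 part of the design above can be pictured as a small state machine. The following is a minimal user-space sketch of that idea, not the driver code from the patch: every name in it (`struct qp`, `handle_db_event`, `run_delayed_work`, the `EV_*` constants, and the `pending` counter) is hypothetical and chosen only for illustration, and the delayed work task is modelled as a plain function call rather than a kernel workqueue item.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_QPS 3

/* Hypothetical stand-ins for the DB_FULL / DB_EMPTY / DB_DROP events. */
enum db_event { EV_DB_FULL, EV_DB_EMPTY, EV_DB_DROP };

/* Hypothetical per-QP state; the real driver structures differ. */
struct qp {
    bool db_enabled;   /* may post_send ring the doorbell? */
    unsigned rings;    /* doorbell rings issued so far */
    unsigned pending;  /* work requests posted while ringing was disabled */
};

struct qp qps[NUM_QPS];
bool drop_work_scheduled;  /* models the delay-scheduled work task */

void init_qps(void)
{
    for (int i = 0; i < NUM_QPS; i++)
        qps[i] = (struct qp){ .db_enabled = true };
    drop_work_scheduled = false;
}

/* Ring one doorbell; one ring covers everything posted so far. */
void ring_db(struct qp *qp)
{
    assert(qp->db_enabled);  /* never ring while ringing is disabled */
    qp->rings++;
    qp->pending = 0;
}

void disable_dbs(void)
{
    for (int i = 0; i < NUM_QPS; i++)
        qps[i].db_enabled = false;
}

/* Re-enable ringing and ring every active doorbell once. */
void enable_and_ring_dbs(void)
{
    for (int i = 0; i < NUM_QPS; i++) {
        qps[i].db_enabled = true;
        ring_db(&qps[i]);
    }
}

/* Post path: queue the work request, but skip the ring if disabled. */
void post_send(struct qp *qp)
{
    if (qp->db_enabled)
        ring_db(qp);
    else
        qp->pending++;
}

void handle_db_event(enum db_event ev)
{
    switch (ev) {
    case EV_DB_FULL:   /* near overflow: stop ringing */
        disable_dbs();
        break;
    case EV_DB_EMPTY:  /* FIFO drained: resume and ring everything */
        enable_and_ring_dbs();
        break;
    case EV_DB_DROP:   /* a ring was lost: stop now, recover later */
        disable_dbs();
        drop_work_scheduled = true;
        break;
    }
}

/* Stands in for the delayed work task that runs after a DB_DROP. */
void run_delayed_work(void)
{
    if (drop_work_scheduled) {
        drop_work_scheduled = false;
        enable_and_ring_dbs();
    }
}
```

In this sketch, a post made between `EV_DB_FULL` and `EV_DB_EMPTY` is only counted as pending, and the single ring issued on `EV_DB_EMPTY` covers it. After `EV_DB_DROP` the same catch-up ring happens when `run_delayed_work()` runs.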
  2. 11 Feb, 2010 32 commits
  3. 10 Feb, 2010 6 commits