• Eric Dumazet's avatar
    [TCP]: Keep copied_seq, rcv_wup and rcv_next together. · 54287cc1
    Eric Dumazet authored
    I noticed in oprofile study a cache miss in tcp_rcv_established() to read
    copied_seq.
    
    ffffffff80400a80 <tcp_rcv_established>: /* tcp_rcv_established total: 4034293  
    2.0400 */
    
     55493  0.0281 :ffffffff80400bc9:   mov    0x4c8(%r12),%eax copied_seq
    543103  0.2746 :ffffffff80400bd1:   cmp    0x3e0(%r12),%eax   rcv_nxt    
    
    if (tp->copied_seq == tp->rcv_nxt &&
            len - tcp_header_len <= tp->ucopy.len) {
    
    In this function, the cache line 0x4c0 -> 0x500 is used only for this
    reading 'copied_seq' field.
    
    rcv_wup and copied_seq should be next to rcv_nxt field, to lower number of
    active cache lines in hot paths. (tcp_rcv_established(), tcp_poll(), ...)
    
    As you suggested, I changed tcp_create_openreq_child() so that these fields
    are changed together, to avoid adding a new store buffer stall.
    
    Patch is 64bit friendly (no new hole because of alignment constraints)
    Signed-off-by: default avatarEric Dumazet <dada1@cosmosbay.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    54287cc1
tcp_minisocks.c 22.6 KB