1. 10 Apr, 2009 4 commits
    • Lai Jiangshan's avatar
      tracing: fix splice return too large · 93cfb3c9
      Lai Jiangshan authored
      I got these from strace:
      
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 16384
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
      
      I wanted to splice_read 4096 bytes, but it returns 8192 or larger.
      
      It is because the return value of tracing_buffers_splice_read()
      does not include "zero out any left over data" bytes.
      
      But tracing_buffers_read() includes these bytes, we make them
      consistent.
      Signed-off-by: default avatarLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46674.9030804@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      93cfb3c9
    • Lai Jiangshan's avatar
      tracing: update file->f_pos when splice(2) it · c7625a55
      Lai Jiangshan authored
      Impact: Cleanup
      
      These two lines:
      
      	if (unlikely(*ppos))
      		return -ESPIPE;
      
      in tracing_buffers_splice_read() are not needed, VFS layer
      has disabled seek(2).
      
      We remove these two lines, and then we can update file->f_pos.
      
      And tracing_buffers_read() updates file->f_pos, this fix
      make tracing_buffers_splice_read() updates file->f_pos too.
      Signed-off-by: default avatarLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46670.4010503@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      c7625a55
    • Lai Jiangshan's avatar
      tracing: allocate page when needed · ddd538f3
      Lai Jiangshan authored
      Impact: Cleanup
      
      Sometimes, we open trace_pipe_raw, but we don't read(2) it,
      we just splice(2) it, thus, the page is not used.
      Signed-off-by: default avatarLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D4666B.4010608@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ddd538f3
    • Lai Jiangshan's avatar
      tracing: disable seeking for trace_pipe_raw · d1e7e02f
      Lai Jiangshan authored
      Impact: disable pread()
      
      We set tracing_buffers_fops.llseek to no_llseek,
      but we can still perform pread() to read this file.
      
      That is not expected.
      
      This fix uses nonseekable_open() to disable it.
      
      tracing_buffers_fops.llseek is still set to no_llseek,
      it mark this file is a "non-seekable device" and is used by
      sys_splice(). See also do_splice() or manual of splice(2):
      
      ERRORS
             EINVAL Target file system doesn't support  splicing;
                    neither  of the descriptors refers to a pipe;
                    or offset given for non-seekable device.
      Signed-off-by: default avatarLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46668.8030806@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      d1e7e02f
  2. 09 Apr, 2009 25 commits
  3. 08 Apr, 2009 11 commits
    • Mikulas Patocka's avatar
      dm kcopyd: fix callback race · 340cd444
      Mikulas Patocka authored
      If the thread calling dm_kcopyd_copy is delayed due to scheduling inside
      split_job/segment_complete and the subjobs complete before the loop in
      split_job completes, the kcopyd callback could be invoked from the
      thread that called dm_kcopyd_copy instead of the kcopyd workqueue.
      
      dm_kcopyd_copy -> split_job -> segment_complete -> job->fn()
      
      Snapshots depend on the fact that callbacks are called from the singlethreaded
      kcopyd workqueue and expect that there is no racing between individual
      callbacks. The racing between callbacks can lead to corruption of exception
      store and it can also mean that exception store callbacks are called twice
      for the same exception - a likely reason for crashes reported inside
      pending_complete() / remove_exception().
      
      This patch fixes two problems:
      
      1. job->fn being called from the thread that submitted the job (see above).
      
      - Fix: hand over the completion callback to the kcopyd thread.
      
      2. job->fn(read_err, write_err, job->context); in segment_complete
      reports the error of the last subjob, not the union of all errors.
      
      - Fix: pass job->write_err to the callback to report all error bits
        (it is done already in run_complete_job)
      
      Cc: stable@kernel.org
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      340cd444
    • Mikulas Patocka's avatar
      dm kcopyd: prepare for callback race fix · 73830857
      Mikulas Patocka authored
      Use a variable in segment_complete() to point to the dm_kcopyd_client
      struct and only release job->pages in run_complete_job() if any are
      defined.  These changes are needed by the next patch.
      
      Cc: stable@kernel.org
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      73830857
    • Mikulas Patocka's avatar
      dm: implement basic barrier support · af7e466a
      Mikulas Patocka authored
      Barriers are submitted to a worker thread that issues them in-order.
      
      The thread is modified so that when it sees a barrier request it waits
      for all pending IO before the request then submits the barrier and
      waits for it.  (We must wait, otherwise it could be intermixed with
      following requests.)
      
      Errors from the barrier request are recorded in a per-device barrier_error
      variable. There may be only one barrier request in progress at once.
      
      For now, the barrier request is converted to a non-barrier request when
      sending it to the underlying device.
      
      This patch guarantees correct barrier behavior if the underlying device
      doesn't perform write-back caching. The same requirement existed before
      barriers were supported in dm.
      
      Bottom layer barrier support (sending barriers by target drivers) and
      handling devices with write-back caches will be done in further patches.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      af7e466a
    • Mikulas Patocka's avatar
      dm: remove dm_request loop · 92c63902
      Mikulas Patocka authored
      Remove queue_io return value and a loop in dm_request.
      
      IO may be submitted to a worker thread with queue_io().  queue_io() sets
      DMF_QUEUE_IO_TO_THREAD so that all further IO is queued for the thread. When
      the thread finishes its work, it clears DMF_QUEUE_IO_TO_THREAD and from this
      point on, requests are submitted from dm_request again. This will be used
      for processing barriers.
      
      Remove the loop in dm_request. queue_io() can submit I/Os to the worker thread
      even if DMF_QUEUE_IO_TO_THREAD was not set.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      92c63902
    • Mikulas Patocka's avatar
      dm: rework queueing and suspension · 3b00b203
      Mikulas Patocka authored
      Rework shutting down on suspend and document the associated rules.
      
      Drop write lock in __split_and_process_bio to allow more processing
      concurrency.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      3b00b203
    • Alasdair G Kergon's avatar
      dm: simplify dm_request loop · 54d9a1b4
      Alasdair G Kergon authored
      Refactor the code in dm_request().
      
      Require the new DMF_BLOCK_FOR_SUSPEND flag on readahead bios we will
      discard so we don't drop such bios while processing a barrier.
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      54d9a1b4
    • Alasdair G Kergon's avatar
      dm: split DMF_BLOCK_IO flag into two · 1eb787ec
      Alasdair G Kergon authored
      Split the DMF_BLOCK_IO flag into two.
      
      DMF_BLOCK_IO_FOR_SUSPEND is set when I/O must be blocked while suspending a
      device.  DMF_QUEUE_IO_TO_THREAD is set when I/O must be queued to a
      worker thread for later processing.
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      1eb787ec
    • Alasdair G Kergon's avatar
      dm: rearrange dm_wq_work · df12ee99
      Alasdair G Kergon authored
      Refactor dm_wq_work() to make later patch more readable.
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      df12ee99
    • Mikulas Patocka's avatar
      dm: remove limited barrier support · 692d0eb9
      Mikulas Patocka authored
      Prepare for full barrier implementation: first remove the restricted support.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      692d0eb9
    • Martin K. Petersen's avatar
      dm: add integrity support · 9c47008d
      Martin K. Petersen authored
      This patch provides support for data integrity passthrough in the device
      mapper.
      
       - If one or more component devices support integrity an integrity
         profile is preallocated for the DM device.
      
       - If all component devices have compatible profiles the DM device is
         flagged as capable.
      
       - Handle integrity metadata when splitting and cloning bios.
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      9c47008d
    • Serge E. Hallyn's avatar
      cap_prctl: don't set error to 0 at 'no_change' · 5bf37ec3
      Serge E. Hallyn authored
      One-liner: capsh --print is broken without this patch.
      
      In certain cases, cap_prctl returns error > 0 for success.  However,
      the 'no_change' label was always setting error to 0.  As a result,
      for example, 'prctl(CAP_BSET_READ, N)' would always return 0.
      It should return 1 if a process has N in its bounding set (as
      by default it does).
      
      I'm keeping the no_change label even though it's now functionally
      the same as 'error'.
      Signed-off-by: default avatarSerge Hallyn <serue@us.ibm.com>
      Acked-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarJames Morris <jmorris@namei.org>
      5bf37ec3