- 10 Dec, 2009 40 commits
-
-
Mikulas Patocka authored
dm-kcopyd: accept zero-size jobs This patch changes dm-kcopyd so that it accepts zero-size jobs and completes them immediatelly via its completion thread. It is needed for multisnapshots snapshot resizing. When we are writing to a chunk beyond origin end, no copying is done. To simplify the code, we submit an empty request to kcopyd and let kcopyd complete it. If we didn't submit a request to kcopyd and called the completion routine immediatelly, it would violate the principle that completion is called only from one thread and it would need additional locking. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mike Snitzer authored
Keep track of whether or not the device is suspended within the snapshot target module, the same as we do in dm-raid1. We will use this later to enforce the correct sequence of ioctls to transfer the in-core exceptions from a snapshot target instance in one table to a replacement one capable of merging them back into the origin. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mike Snitzer authored
Store the reference to the snapshot cow device in the core snapshot code instead of each exception store. It can be accessed through the new function dm_snap_cow(). Exception stores should each now maintain a reference to their parent snapshot struct. This is cleaner and makes part of the forthcoming snapshot merge code simpler. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Cc: Mikulas Patocka <mpatocka@redhat.com>
-
Mike Snitzer authored
Add number of sectors used by metadata to the end of the snapshot's status line. Renamed dm_exception_store_type's 'fraction_full' to 'usage'. Renamed arguments to be clearer about what is being returned. Also added 'metadata_sectors'. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Jon Brassow authored
Rename exception functions. Preparing to pull them out of dm-snap.c for broader use. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Jon Brassow authored
Rename exception_table for broader use outside dm-snap.c Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Jon Brassow authored
The exception structure is not necessarily just a snapshot element (especially after we pull it out of dm-snap.c). Renaming appropriately. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Jon Brassow authored
Consolidate the insert_*exception functions. 'insert_completed_exception' already contains all the logic to handle 'insert_exception' (via check for a hash_shift of 0), so remove redundant function. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
The origin needs to find minimum chunksize of all snapshots. This logic is moved to a separate function because it will be used at another place in the snapshot merge patches. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Removed unnecessary 'and' masking: The right shift discards the lower bits so there is no need to clear them. (A later patch needs this change to support a 32-bit chunk_mask.) Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Jon Brassow authored
Minor code touch-up. We don't need the 'else'. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Roel Kluin authored
strlcpy() will always null terminate the string. The code should already guarantee this as the last bytes are already NULs and the string lengths were restricted before being stored in hc. Removing the '-1' becomes necessary so strlcpy() doesn't lose the last character of a maximum-length string. - agk Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Explicitly initialize bio lists instead of relying on kzalloc. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Takahiro Yasui <tyasui@redhat.com> Tested-by: Takahiro Yasui <tyasui@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Hold all write bios when leg fails and errors are handled When using a userspace daemon such as dmeventd to handle errors, we must delay completing bios until it has done its job. This patch prevents the following race: - primary leg fails - write "1" fail, the write is held, secondary leg is set default - write "2" goes straight to the secondary leg Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Takahiro Yasui <tyasui@redhat.com> Tested-by: Takahiro Yasui <tyasui@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Hold all write bios when errors are handled. Previously the failures list was used only when handling errors with a userspace daemon such as dmeventd. Now, it is always used for all bios. The regions where some writes failed must be marked as nosync. This can only be done in process context (i.e. in raid1 workqueue), not in the write_callback function. Previously the write would succeed if writing to at least one leg succeeded. This is wrong because data from the failed leg may be replicated to the correct leg. Now, if using a userspace daemon, the write with some failures will be held until the daemon has done its job and reconfigured the array. If not using a daemon, the write still succeeds if at least one leg succeeds. This is bad, but it is consistent with current behavior. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Takahiro Yasui <tyasui@redhat.com> Tested-by: Takahiro Yasui <tyasui@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Move bio completion out of dm_rh_mark_nosync in preparation for the next patch. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Takahiro Yasui <tyasui@redhat.com> Tested-by: Takahiro Yasui <tyasui@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Move the logic to get a valid mirror leg into a function for re-use in a later patch. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Takahiro Yasui <tyasui@redhat.com> Tested-by: Takahiro Yasui <tyasui@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Use the hold framework in do_failures. This patch doesn't change the bio processing logic, it just simplifies failure handling and avoids periodically polling the failures list. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Takahiro Yasui <tyasui@redhat.com> Tested-by: Takahiro Yasui <tyasui@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Add framework to delay bios until a suspend and then resubmit them with either DM_ENDIO_REQUEUE (if the suspend was noflush) or complete them with -EIO. I/O barrier support will use this. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Takahiro Yasui <tyasui@redhat.com> Tested-by: Takahiro Yasui <tyasui@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Report flush errors as 'F' instead of 'D' for log and mirror devices. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Implement flush callee. It uses dm_io to send zero-size barrier synchronously and concurrently to all the mirror legs. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Call the flush callback from the log. If flush failed, we have no alternative but to mark the whole log as dirty. Also we set the variable flush_failed to prevent any bits ever being marked as clean again. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Introduce a callback pointer from the log to dm-raid1 layer. Before some region is set as "in-sync", we need to flush hardware cache on all the disks. But the log module doesn't have access to the mirror_set structure. So it will use this callback. So far the callback is unused, it will be used in further patches. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Introduce "flush failed" variable. When a flush before clearing a bit in the log fails, we don't know anything about which which regions are in-sync and which not. So we need to set all regions as not-in-sync and set the variable "flush_failed" to prevent setting the in-sync bit in the future. A target reload is the only way to get out of this situation. The variable will be set in following patches. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Introduce flush_header and use it to flush the log device. Note that we don't have to flush if all the regions transition from "dirty" to "clean" state. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Split the variable "touched" into two, "touched_dirtied" and "touched_cleaned", set when some region was dirtied or cleaned. This will be used to optimize flushes. After a transition from "dirty" to "clean" state we don't have flush hardware cache on the log device. After a transition from "clean" to "dirty" the cache must be flushed. Before a transition from "clean" to "dirty" state we don't have to flush all the raid legs. Before a transition from "dirty" to "clean" we must flush all the legs to make sure that they are really in sync. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Flush support for dm-raid1. When it receives an empty barrier, submit it to all the devices via dm-io. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Remove the hack where we allocate an extra bi_io_vec to store additional private data. This hack prevents us from supporting barriers in dm-raid1 without first making another little block layer change. Instead of doing that, this patch eliminates the bi_io_vec abuse by storing the region number directly in the low bits of bi_private. We need to store two things for each bio, the pointer to the main io structure and, if parallel writes were requested, an index indicating which of these writes this bio belongs to. There can be at most BITS_PER_LONG regions - 32 or 64. The index (region number) was stored in the last (hidden) bio vector and the pointer to struct io was stored in bi_private. This patch now aligns "struct io" on BITS_PER_LONG bytes and stores the region number in the low BITS_PER_LONG bits of bi_private. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Allocate "struct io" from a slab. This patch changes dm-io, so that "struct io" is allocated from a slab cache. It used to be allocated with kmalloc. Allocating from a slab will be needed for the next patch, because it requires a special alignment of "struct io" and kmalloc cannot meet this alignment. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
The "wipe key" message is used to wipe the volume key from memory temporarily, for example when suspending to RAM. But the initialisation vector in ESSIV mode is calculated from the hashed volume key, so the wipe message should wipe this IV key too and reinitialise it when the volume key is reinstated. This patch adds an IV wipe method called from a wipe message callback. ESSIV is then reinitialised using the init function added by the last patch. Cc: stable@kernel.org Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
This patch separates the construction of IV from its initialisation. (For ESSIV it is a hash calculation based on volume key.) Constructor code now preallocates hash tfm and salt array and saves it in a private IV structure. The next patch requires this to reinitialise the wiped IV without reallocating memory when resuming a suspended device. Cc: stable@kernel.org Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
Use kzfree for salt deallocation because it is derived from the volume key. Use a common error path in ESSIV constructor. Required by a later patch which fixes the way key material is wiped from memory. Cc: stable@kernel.org Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
Define private structures for IV so it's easy to add further attributes in a following patch which fixes the way key material is wiped from memory. Also move ESSIV destructor and remove unnecessary 'status' operation. There are no functional changes in this patch. Cc: stable@kernel.org Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
The "wipe key" message is used to wipe a volume key from memory temporarily, for example when suspending to RAM. There are two instances of the key in memory (inside crypto tfm) but only one got wiped. This patch wipes them both. Cc: stable@kernel.org Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Under some special conditions the snapshot hash_size is calculated as zero. This patch instead sets a minimum value of 64, the same as for the pending exception table. rounddown_pow_of_two(0) is an undefined operation (it expands to shift by -1). init_exception_table with an argument of 0 would fail with -ENOMEM. The way to trigger the problem is to create a snapshot with a chunk size that is larger than the origin device. Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Take snapshot lock only for STATUSTYPE_INFO, not STATUSTYPE_TABLE. Commit 4c6fff44 (dm-snapshot-lock-snapshot-while-supplying-status.patch) introduced this use of the lock, but userspace applications using libdevmapper have been found to request STATUSTYPE_TABLE while the device is suspended and the lock is already held, leading to deadlock. Since the lock is not necessary in this case, don't try to take it. Cc: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
This patch just removes an unnecessary warning: kobject: 'dm': does not have a release() function, it is broken and must be fixed. The kobject is embedded in mapped device struct, so code does not need to release memory explicitly here. Cc: stable@kernel.org Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Julia Lawall authored
Error handling code following a kmalloc should free the allocated data. Cc: stable@kernel.org Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Fix a reported deadlock if there are still unprocessed multipath events on a device that is being removed. _hash_lock is held during dev_remove while trying to send the outstanding events. Sending the events requests the _hash_lock again in dm_copy_name_and_uuid. This patch introduces a separate lock around regions that modify the link to the hash table (dm_set_mdptr) or the name or uuid so that dm_copy_name_and_uuid no longer needs _hash_lock. Additionally, dm_copy_name_and_uuid can only be called if md exists so we can drop the dm_get() and dm_put() which can lead to a BUG() while md is being freed. The deadlock: #0 [ffff8106298dfb48] schedule at ffffffff80063035 #1 [ffff8106298dfc20] __down_read at ffffffff8006475d #2 [ffff8106298dfc60] dm_copy_name_and_uuid at ffffffff8824f740 #3 [ffff8106298dfc90] dm_send_uevents at ffffffff88252685 #4 [ffff8106298dfcd0] event_callback at ffffffff8824c678 #5 [ffff8106298dfd00] dm_table_event at ffffffff8824dd01 #6 [ffff8106298dfd10] __hash_remove at ffffffff882507ad #7 [ffff8106298dfd30] dev_remove at ffffffff88250865 #8 [ffff8106298dfd60] ctl_ioctl at ffffffff88250d80 #9 [ffff8106298dfee0] do_ioctl at ffffffff800418c4 #10 [ffff8106298dff00] vfs_ioctl at ffffffff8002fab9 #11 [ffff8106298dff40] sys_ioctl at ffffffff8004bdaf #12 [ffff8106298dff80] tracesys at ffffffff8005d28d (via system_call) Cc: stable@kernel.org Reported-by: guy keren <choo@actcom.co.il> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6Linus Torvalds authored
* 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: ACPICA: Update version to 20091112. ACPICA: Add additional module-level code support ACPICA: Deploy new create integer interface where appropriate ACPICA: New internal utility function to create Integer objects ACPICA: Add repair for predefined methods that must return sorted lists ACPICA: Fix possible fault if return Package objects contain NULL elements ACPICA: Add post-order callback to acpi_walk_namespace ACPICA: Change package length error message to an info message ACPICA: Reduce severity of predefined repair messages, Warning to Info ACPICA: Update version to 20091013 ACPICA: Fix possible memory leak for Scope ASL operator ACPICA: Remove possibility of executing _REG methods twice ACPICA: Add repair for bad _MAT buffers ACPICA: Add repair for bad _BIF/_BIX packages
-