- 06 Jan, 2009 40 commits
-
-
Matthew Wilcox authored
Removing the completion from klist_node reduces its size from 64 bytes to 28 on x86-64. To maintain the semantics of klist_remove(), we add a single list of klist nodes which are pending deletion and scan them. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Matthew Wilcox authored
This minor rearrangement saves 16 bytes from sizeof(struct device) according to pahole. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Alan Stern authored
This patch (as1167) fixes some misspellings in various recently-added macros in pm.h. Fortunately these macros are not yet used anywhere. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
-
Rafael J. Wysocki authored
PM: Simplify the new suspend/hibernation framework for devices Following the discussion at the Kernel Summit, simplify the new device PM framework by merging 'struct pm_ops' and 'struct pm_ext_ops' and removing pointers to 'struct pm_ext_ops' from 'struct platform_driver' and 'struct pci_driver'. After this change, the suspend/hibernation callbacks will only reside in 'struct device_driver' as well as at the bus type/ device class/device type level. Accordingly, PCI and platform device drivers are now expected to put their suspend/hibernation callbacks into the 'struct device_driver' embedded in 'struct pci_driver' or 'struct platform_driver', respectively. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dmLinus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm snapshot: extend exception store functions dm snapshot: split out exception store implementations dm snapshot: rename struct exception_store dm snapshot: separate out exception store interface dm mpath: move trigger_event to system workqueue dm: add name and uuid to sysfs dm table: rework reference counting dm: support barriers on simple devices dm request: extend target interface dm request: add caches dm ioctl: allow dm_copy_name_and_uuid to return only one field dm log: ensure log bitmap fits on log device dm log: move region_size validation dm log: avoid reinitialising io_req on every operation dm: consolidate target deregistration error handling dm raid1: fix error count dm log: fix dm_io_client leak on error paths dm snapshot: change yield to msleep dm table: drop reference at unbind
-
Jonathan Brassow authored
Supply dm_add_exception as a callback to the read_metadata function. Add a status function ready for a later patch and name the functions consistently. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Alasdair G Kergon authored
Move the existing snapshot exception store implementations out into separate files. Later patches will place these behind a new interface in preparation for alternative implementations. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Jonathan Brassow authored
Rename struct exception_store to dm_exception_store. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Jonathan Brassow authored
Pull structures that bridge the gap between snapshot and exception store out of dm-snap.h and put them in a new .h file - dm-exception-store.h. This file will define the API for new exception stores. Ultimately, dm-snap.h is unnecessary, since only dm-snap.c should be using it. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Alasdair G Kergon authored
The same workqueue is used both for sending uevents and processing queued I/O. Deadlock has been reported in RHEL5 when sending a uevent was blocked waiting for the queued I/O to be processed. Use scheduled_work() for the asynchronous uevents instead. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
Implement simple read-only sysfs entry for device-mapper block device. This patch adds a simple sysfs directory named "dm" under block device properties and implements - name attribute (string containing mapped device name) - uuid attribute (string containing UUID, or empty string if not set) The kobject is embedded in mapped_device struct, so no additional memory allocation is needed for initializing sysfs entry. During the processing of sysfs attribute we need to lock mapped device which is done by a new function dm_get_from_kobj, which returns the md associated with kobject and increases the usage count. Each 'show attribute' function is responsible for its own locking. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Rework table reference counting. The existing code uses a reference counter. When the last reference is dropped and the counter reaches zero, the table destructor is called. Table reference counters are acquired/released from upcalls from other kernel code (dm_any_congested, dm_merge_bvec, dm_unplug_all). If the reference counter reaches zero in one of the upcalls, the table destructor is called from almost random kernel code. This leads to various problems: * dm_any_congested being called under a spinlock, which calls the destructor, which calls some sleeping function. * the destructor attempting to take a lock that is already taken by the same process. * stale reference from some other kernel code keeps the table constructed, which keeps some devices open, even after successful return from "dmsetup remove". This can confuse lvm and prevent closing of underlying devices or reusing device minor numbers. The patch changes reference counting so that the table destructor can be called only at predetermined places. The table has always exactly one reference from either mapped_device->map or hash_cell->new_map. After this patch, this reference is not counted in table->holders. A pair of dm_create_table/dm_destroy_table functions is used for table creation/destruction. Temporary references from the other code increase table->holders. A pair of dm_table_get/dm_table_put functions is used to manipulate it. When the table is about to be destroyed, we wait for table->holders to reach 0. Then, we call the table destructor. We use active waiting with msleep(1), because the situation happens rarely (to one user in 5 years) and removing the device isn't performance-critical task: the user doesn't care if it takes one tick more or not. This way, the destructor is called only at specific points (dm_table_destroy function) and the above problems associated with lazy destruction can't happen. Finally remove the temporary protection added to dm_any_congested(). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Andi Kleen authored
Implement barrier support for single device DM devices This patch implements barrier support in DM for the common case of dm linear just remapping a single underlying device. In this case we can safely pass the barrier through because there can be no reordering between devices. NB. Any DM device might cease to support barriers if it gets reconfigured so code must continue to allow for a possible -EOPNOTSUPP on every barrier bio submitted. - agk Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Kiyoshi Ueda authored
This patch adds the following target interfaces for request-based dm. map_rq : for mapping a request rq_end_io : for finishing a request busy : for avoiding performance regression from bio-based dm. Target can tell dm core not to map requests now, and that may help requests in the block layer queue to be bigger by I/O merging. In bio-based dm, this behavior is done by device drivers managing the block layer queue. But in request-based dm, dm core has to do that since dm core manages the block layer queue. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Kiyoshi Ueda authored
This patch prepares some kmem_caches for request-based dm. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
Allow NULL buffer in dm_copy_name_and_uuid if you only want to return one of the fields. (Required by a following patch that adds these fields to sysfs.) Signed-off-by: Milan Broz <mbroz@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
Check that the log bitmap will fit within the log device. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
Move log size validation from mirror target to log constructor. Removed PAGE_SIZE restriction we no longer think necessary. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Takahiro Yasui authored
rw_header function updates three members of io_req data every time when I/O is processed. bi_rw and notify.fn are never modified once they get initialized, and so they can be set in advance. header_to_disk() can also be pulled out of write_header() since only one caller needs it and write_header() can be replaced by rw_header() directly. Signed-off-by: Takahiro Yasui <tyasui@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Change dm_unregister_target to return void and use BUG() for error reporting. dm_unregister_target can only fail because of programming bug in the target driver. It can't fail because of user's behavior or disk errors. This patch changes unregister_target to return void and use BUG if someone tries to unregister non-registered target or unregister target that is in use. This patch removes code duplication (testing of error codes in all dm targets) and reports bugs in just one place, in dm_unregister_target. In some target drivers, these return codes were ignored, which could lead to a situation where bugs could be missed. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Jonathan Brassow authored
Always increase the error count when I/O on a leg of a mirror fails. The error count is used to decide whether to select an alternative mirror leg. If the target doesn't use the "handle_errors" feature, the error count is not updated and the bio can get requeued forever by the read callback. Fix it by increasing error_count before the handle_errors feature checking. Cc: stable@kernel.org Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Takahiro Yasui authored
In create_log_context function, dm_io_client_destroy function needs to be called, when memory allocation of disk_header, sync_bits and recovering_bits failed, but dm_io_client_destroy is not called. Cc: stable@kernel.org Signed-off-by: Takahiro Yasui <tyasui@redhat.com> Acked-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Change yield() to msleep(1). If the thread had realtime priority, yield() doesn't really yield, so the yielding process would loop indefinitely and cause machine lockup. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Move one dm_table_put() so that the last reference in the thread gets dropped in __unbind(). This is required for a following patch, dm-table-rework-reference-counting.patch, which will change the logic in such a way that table destructor is called only at specific points in the code. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
git://git.o-hand.com/linux-mfdLinus Torvalds authored
* 'for-next' of git://git.o-hand.com/linux-mfd: (30 commits) mfd: Fix section mismatch in da903x mfd: move drivers/i2c/chips/menelaus.c to drivers/mfd mfd: move drivers/i2c/chips/tps65010.c to drivers/mfd mfd: dm355evm msp430 driver mfd: Add missing break from wm3850-core mfd: Add WM8351 support mfd: Support configurable numbers of DCDCs and ISINKs on WM8350 mfd: Handle missing WM8350 platform data mfd: Add WM8352 support mfd: Use irq_to_desc in twl4030 code power_supply: Add Dialog DA9030 battery charger driver mfd: Dialog DA9030 battery charger MFD driver mfd: Register WM8400 codec device mfd: Pass driver_data onto child devices mfd: Fix twl4030-core.c build error mfd: twl4030 regulator bug fixes mfd: twl4030: create some regulator devices mfd: twl4030: cleanup symbols and OMAP dependency mfd: twl4030: simplified child creation code power_supply: Add battery health reporting for WM8350 ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linusLinus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: module: convert to stop_machine_create/destroy. stop_machine: introduce stop_machine_create/destroy. parisc: fix module loading failure of large kernel modules module: fix module loading failure of large kernel modules for parisc module: fix warning of unused function when !CONFIG_PROC_FS kernel/module.c: compare symbol values when marking symbols as exported in /proc/kallsyms. remove CONFIG_KMOD
-
Linus Torvalds authored
Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: swiotlb: Don't include linux/swiotlb.h twice in lib/swiotlb.c intel-iommu: fix build error with INTR_REMAP=y and DMAR=n swiotlb: add missing __init annotations
-
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlmLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: fs/dlm/ast.c: fix warning dlm: add new debugfs entry dlm: add time stamp of blocking callback dlm: change lock time stamping dlm: improve how bast mode handling dlm: remove extra blocking callback check dlm: replace schedule with cond_resched dlm: remove kmap/kunmap dlm: trivial annotation of be16 value dlm: fix up memory allocation flags
-
git://aeryn.fluff.org.uk/bjdooks/linuxLinus Torvalds authored
* 'i2c-next' of git://aeryn.fluff.org.uk/bjdooks/linux: i2c-omap: fix type of irq handler function i2c-s3c2410: Change IRQ to be plain integer. i2c-s3c2410: Allow more than one i2c-s3c2410 adapter i2c-s3c2410: Remove default platform data. i2c-s3c2410: Use platform data for gpio configuration i2c-s3c2410: Fixup style problems from checkpatch.pl i2c-omap: Enable I2C wakeups for 34xx i2c-omap: reprogram OCP_SYSCONFIG register after reset i2c-omap: convert 'rev1' flag to generic 'rev' u8 i2c-omap: fix I2C timeouts due to recursive omap_i2c_{un,}idle() i2c-omap: Clean-up i2c-omap i2c-omap: Don't compile in OMAP15xx I2C ISR for non-OMAP15xx builds i2c-omap: Mark init-only functions as __init i2c-omap: Add support for omap34xx i2c-omap: FIFO handling support and broken hw workaround for i2c-omap i2c-omap: Add high-speed support to omap-i2c i2c-omap: Close suspected race between omap_i2c_idle() and omap_i2c_isr() i2c-omap: Do not use interruptible wait call in omap_i2c_xfer_msg Fix up apparently-trivial conflict in drivers/i2c/busses/i2c-s3c2410.c
-
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hidLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (22 commits) HID: fix error condition propagation in hid-sony driver HID: fix reference count leak hidraw HID: add proper support for pensketch 12x9 tablet HID: don't allow DealExtreme usb-radio be handled by usb hid driver HID: fix default Kconfig setting for TopSpeed driver HID: driver for TopSeed Cyberlink quirky remote HID: make boot protocol drivers depend on EMBEDDED HID: avoid sparse warning in HID_COMPAT_LOAD_DRIVER HID: hiddev cleanup -- handle all error conditions properly HID: force feedback driver for GreenAsia 0x12 PID HID: switch specialized drivers from "default y" to !EMBEDDED HID: set proper dev.parent in hidraw HID: add dynids facility HID: use GFP_KERNEL in hid_alloc_buffers HID: usbhid, use usb_endpoint_xfer_int HID: move usbhid flags to usbhid.h HID: add n-trig digitizer support HID: add phys and name ioctls to hidraw HID: struct device - replace bus_id with dev_name(), dev_set_name() HID: automatically call usbhid_set_leds in usbhid driver ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (27 commits) GFS2: Use DEFINE_SPINLOCK GFS2: Fix use-after-free bug on umount (try #2) Revert "GFS2: Fix use-after-free bug on umount" GFS2: Streamline alloc calculations for writes GFS2: Send useful information with uevent messages GFS2: Fix use-after-free bug on umount GFS2: Remove ancient, unused code GFS2: Move four functions from super.c GFS2: Fix bug in gfs2_lock_fs_check_clean() GFS2: Send some sensible sysfs stuff GFS2: Kill two daemons with one patch GFS2: Move gfs2_recoverd into recovery.c GFS2: Fix "truncate in progress" hang GFS2: Clean up & move gfs2_quotad GFS2: Add more detail to debugfs glock dumps GFS2: Banish struct gfs2_rgrpd_host GFS2: Move rg_free from gfs2_rgrpd_host to gfs2_rgrpd GFS2: Move rg_igeneration into struct gfs2_rgrpd GFS2: Banish struct gfs2_dinode_host GFS2: Move i_size from gfs2_dinode_host and rename it to i_disksize ...
-
Linus Torvalds authored
When using "min()", the types of both sides should match. With the cpu mask changes, the type of num_online_cpus() will now depend on config options. Use "min_t()" with an explicit type instead. And make the rx/tx case look the same too, just for sanity. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: (30 commits) sparc: Fix minor SPARC32 compile error sparc: Remove reg*.h from Kbuild sparc: Clean arch-specific code in prom_common.c sparc: Kill asm/reg*.h sparc: Use 64BIT config entry MAINTAINERS: update sparc maintainer sparc: unify ipcbuf.h sparc: Update 64-bit defconfig. sparc: remove NO_PROC_ID - it is no longer used sparc: drop get_tbr() in traps.h sparc: fix warning in userspace header traps.h sparc: fix warnings in userspace header byteorder.h sparc: fix warning in userspace header jsflash.h sparc: unify openprom.h sparc64: delete unused linux_prom64_ranges from openprom_64.h sparc: prepare openprom for unification sparc: remove linux_prom_pci_assigned_addresses from openprom_32.h sparc: remove ebus definitions from openprom*.h sparc: unify siginfo.h sparc: unify ptrace.h ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (44 commits) qlge: Fix sparse warnings for tx ring indexes. qlge: Fix sparse warning regarding rx buffer queues. qlge: Fix sparse endian warning in ql_hw_csum_setup(). qlge: Fix sparse endian warning for inbound packet control block flags. qlge: Fix sparse warnings for byte swapping in qlge_ethool.c myri10ge: print MAC and serial number on probe failure pkt_sched: cls_u32: Fix locking in u32_change() iucv: fix cpu hotplug af_iucv: Free iucv path/socket in path_pending callback af_iucv: avoid left over IUCV connections from failing connects af_iucv: New error return codes for connect() net/ehea: bitops work on unsigned longs Revert "net: Fix for initial link state in 2.6.28" tcp: Kill extraneous SPLICE_F_NONBLOCK checks. tcp: don't mask EOF and socket errors on nonblocking splice receive dccp: Integrate the TFRC library with DCCP dccp: Clean up ccid.c after integration of CCID plugins dccp: Lockless integration of CCID congestion-control plugins qeth: get rid of extra argument after printk to dev_* conversion qeth: No large send using EDDP for HiperSockets. ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6Linus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: ice1724 - Fix a typo in IEC958 PCM name ASoC: fix davinci-sffsdr buglet ALSA: sound/usb: Use negated usb_endpoint_xfer_control, etc ALSA: hda - cxt5051 report jack state ALSA: hda - add basic jack reporting functions to patch_conexant.c ALSA: Use usb_set/get_intfdata ASoC: Clean up kerneldoc warnings ASoC: Fix pxa2xx-pcm checks for invalid DMA channels LSA: hda - Add HP Acacia detection ALSA: hda - fix name for ALC1200 ALSA: sound/usb: use USB API functions rather than constants ASoC: TWL4030: DAPM based capture implementation ASoC: TWL4030: Make the enum filter generic for twl4030
-
git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreqLinus Torvalds authored
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Fix on resume, now preserves user policy min/max. [CPUFREQ] Add Celeron Core support to p4-clockmod. [CPUFREQ] add to speedstep-lib additional fsb values for core processors [CPUFREQ] Disable sysfs ui for p4-clockmod. [CPUFREQ] p4-clockmod: reduce noise [CPUFREQ] clean up speedstep-centrino and reduce cpumask_t usage
-
git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2Linus Torvalds authored
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (138 commits) ocfs2: Access the right buffer_head in ocfs2_merge_rec_left. ocfs2: use min_t in ocfs2_quota_read() ocfs2: remove unneeded lvb casts ocfs2: Add xattr support checking in init_security ocfs2: alloc xattr bucket in ocfs2_xattr_set_handle ocfs2: calculate and reserve credits for xattr value in mknod ocfs2/xattr: fix credits calculation during index create ocfs2/xattr: Always updating ctime during xattr set. ocfs2/xattr: Remove extend_trans call and add its credits from the beginning ocfs2/dlm: Fix race during lockres mastery ocfs2/dlm: Fix race in adding/removing lockres' to/from the tracking list ocfs2/dlm: Hold off sending lockres drop ref message while lockres is migrating ocfs2/dlm: Clean up errors in dlm_proxy_ast_handler() ocfs2/dlm: Fix a race between migrate request and exit domain ocfs2: One more hamming code optimization. ocfs2: Another hamming code optimization. ocfs2: Don't hand-code xor in ocfs2_hamming_encode(). ocfs2: Enable metadata checksums. ocfs2: Validate superblock with checksum and ecc. ocfs2: Checksum and ECC for directory blocks. ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6Linus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: inotify: fix type errors in interfaces fix breakage in reiserfs_new_inode() fix the treatment of jfs special inodes vfs: remove duplicate code in get_fs_type() add a vfs_fsync helper sys_execve and sys_uselib do not call into fsnotify zero i_uid/i_gid on inode allocation inode->i_op is never NULL ntfs: don't NULL i_op isofs check for NULL ->i_op in root directory is dead code affs: do not zero ->i_op kill suid bit only for regular files vfs: lseek(fd, 0, SEEK_CUR) race condition
-
Nick Piggin authored
An XFS workload showed up a bug in the lockless pagecache patch. Basically it would go into an "infinite" loop, although it would sometimes be able to break out of the loop! The reason is a missing compiler barrier in the "increment reference count unless it was zero" case of the lockless pagecache protocol in the gang lookup functions. This would cause the compiler to use a cached value of struct page pointer to retry the operation with, rather than reload it. So the page might have been removed from pagecache and freed (refcount==0) but the lookup would not correctly notice the page is no longer in pagecache, and keep attempting to increment the refcount and failing, until the page gets reallocated for something else. This isn't a data corruption because the condition will be detected if the page has been reallocated. However it can result in a lockup. Linus points out that ACCESS_ONCE is also required in that pointer load, even if it's absence is not causing a bug on our particular build. The most general way to solve this is just to put an rcu_dereference in radix_tree_deref_slot. Assembly of find_get_pages, before: .L220: movq (%rbx), %rax #* ivtmp.1162, tmp82 movq (%rax), %rdi #, prephitmp.1149 .L218: testb $1, %dil #, prephitmp.1149 jne .L217 #, testq %rdi, %rdi # prephitmp.1149 je .L203 #, cmpq $-1, %rdi #, prephitmp.1149 je .L217 #, movl 8(%rdi), %esi # <variable>._count.counter, c testl %esi, %esi # c je .L218 #, after: .L212: movq (%rbx), %rax #* ivtmp.1109, tmp81 movq (%rax), %rdi #, ret testb $1, %dil #, ret jne .L211 #, testq %rdi, %rdi # ret je .L197 #, cmpq $-1, %rdi #, ret je .L211 #, movl 8(%rdi), %esi # <variable>._count.counter, c testl %esi, %esi # c je .L212 #, (notice the obvious infinite loop in the first example, if page->count remains 0) Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alan Cox authored
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-