1. 01 Apr, 2009 40 commits
    • rtc-parisc: remove redundant locking · 05439f1f
      dann frazier authored
      The RTC subsystem provides ops locking, so there is no need to implement our own.
      Signed-off-by: dann frazier <dannf@hp.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc-parisc: add a missing include for linux/rtc.h · 93d456d9
      dann frazier authored
      Signed-off-by: dann frazier <dannf@hp.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc: add platform driver for EFI · 5e3fd9e5
      dann frazier authored
      Munge Stephane Eranian's efirtc.c code into an rtc platform driver
      
      [akpm@linux-foundation.org: use is_leap_year()]
      Signed-off-by: dann frazier <dannf@hp.com>
      Cc: Alessandro Zummo <alessandro.zummo@towertech.it>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: David Brownell <david-b@pacbell.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc: convert LEAP_YEAR into an inline · 78d89ef4
      Andrew Morton authored
      - The LEAP_YEAR macro is buggy: it references its argument multiple times.
        Fix this by turning it into a C function.
      
      - Give it a more appropriate name.
      
      - Move it to rtc.h so that other .c files can use it, instead of copying it.
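      
      The multiple-evaluation problem described above can be sketched in plain
      userspace C.  The macro below is written in the style of the old LEAP_YEAR,
      not copied from the driver, and fetch_year() is a hypothetical helper whose
      only purpose is to count how often the argument is evaluated.

```c
#include <stdbool.h>

/* Macro in the old style: the argument appears three times in the expansion. */
#define LEAP_YEAR(year) \
	((!((year) % 4) && ((year) % 100)) || !((year) % 400))

/* Inline replacement: the argument is evaluated exactly once. */
static inline bool is_leap_year(unsigned int year)
{
	return (!(year % 4) && (year % 100)) || !(year % 400);
}

/* Hypothetical helper with a side effect, used to count evaluations. */
static unsigned int fetch_count;

static unsigned int fetch_year(void)
{
	fetch_count++;
	return 2000;
}
```

      Calling LEAP_YEAR(fetch_year()) runs the helper up to three times, while
      is_leap_year(fetch_year()) runs it once, which is the behavior a caller
      would expect.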
      
      Cc: dann frazier <dannf@hp.com>
      Acked-by: Alessandro Zummo <alessandro.zummo@towertech.it>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: David Brownell <david-b@pacbell.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc: convert wm8350 use new alarm and update operations · 47367a3b
      Mark Brown authored
      These are the only two ioctls so the ioctl() function is also removed.
      Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
      Cc: Acked-by: Alessandro Zummo <a.zummo@towertech.it>
      Cc: David Brownell <david-b@pacbell.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: fix kernel includes · 79955898
      Ian Kent authored
      autofs_dev-ioctl.h is included by both the kernel module and user space tools
      and it includes two kernel header files.  Compiles work if the kernel headers
      are installed but fail otherwise.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: fix lookup deadlock · 8f63aaa8
      Ian Kent authored
      A deadlock can occur when user space uses a signal (autofs version 4 uses
      SIGCHLD for this) to effect expire completion.
      
      The order of events is:
      
      Expire process completes, but before being able to send SIGCHLD to its parent
      ...
      
      Another process walks onto a different mount point and drops the directory
      inode semaphore prior to sending the request to the daemon as it must ...
      
      A third process does an lstat on the expired mount point causing it to wait
      on expire completion (unfortunately) holding the directory semaphore.
      
      The mount request then arrives at the daemon which does an lstat and,
      deadlock.
      
      For some time I was concerned about releasing the directory semaphore around
      the expire wait in autofs4_lookup as well as for the mount call back.  I
      finally realized that the last round of changes in this function made the
      expiring dentry and the lookup dentry separate and distinct so the check and
      possible wait can be done anywhere prior to the mount call back.  This patch
      moves the check to just before the mount call back and inside the directory
      inode mutex release.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: cleanup expire code duplication · 56fcef75
      Ian Kent authored
      A significant portion of the autofs_dev_ioctl_expire() and
      autofs4_expire_multi() functions is duplicated code.  This patch cleans that
      up.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ecryptfs: use kzfree() · 00fcf2cb
      Johannes Weiner authored
      Use kzfree() instead of memset() + kfree().
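      
      The pattern being replaced can be sketched in userspace C.  This is an
      analogue, not the kernel code: zfree() and wipe() are hypothetical names,
      and the caller must pass the length explicitly, whereas kzfree() takes
      only the pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Zero a buffer through a volatile pointer so the stores are not elided. */
static void wipe(void *p, size_t len)
{
	volatile unsigned char *b = p;

	while (len--)
		*b++ = 0;
}

/* Hypothetical userspace analogue of kzfree(): wipe the buffer, then free it. */
static void zfree(void *p, size_t len)
{
	if (!p)
		return;
	wipe(p, len);
	free(p);
}
```

      Folding both steps into one helper means a caller holding sensitive data
      cannot forget the clearing step or pass a mismatched length to memset().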
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc/fsl_soc: isolate legacy fsl_spi support to mpc832x_rdb boards · e2801806
      Anton Vorontsov authored
      The advantages of this:
      - Don't encourage legacy support;
      - Fewer external symbols, less code to compile in for !MPC832x_RDB
        platforms.
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc/83xx: add mmc-spi support via the device tree for MPC8323E-RDB · 75458285
      Anton Vorontsov authored
      - Add gpio-controller node to manage QE GPIO Bank D;
      - Add mmc-spi node;
      - Modify board file so that it won't use legacy SPI support with the new
        device trees.
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc: add mmc-spi-slot bindings · 3f1c6ebf
      Anton Vorontsov authored
      The bindings describe the case where an MMC/SD/SDIO slot is directly connected
      to a SPI bus.  Such setups are widely used on embedded PowerPC boards.
      
      The patch also adds the mmc-spi-slot entry to the OpenFirmware modalias
      table.
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • spi_mpc83xx: add OF platform driver bindings · 35b4b3c0
      Anton Vorontsov authored
      Implement full support for OF SPI bindings.  Now the driver can manage its
      own chip selects without any help from the board files and/or fsl_soc
      constructors.
      
      The "legacy" code is well isolated and could be removed as time goes by.
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • spi_mpc83xx: rework chip selects handling · 364fdbc0
      Anton Vorontsov authored
      The main purpose of this patch is to pass 'struct spi_device' to the chip
      select handling routines.  This is needed so that we could implement
      full-fledged OpenFirmware support for this driver.
      
      While at it, also:
      - Replace two {de,activate}_cs routines by a single cs_control().
      - Don't duplicate platform data callbacks in mpc83xx_spi struct.
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • spi_mpc83xx: fix sparse warnings · 34c8a20c
      Anton Vorontsov authored
      The patch fixes the following sparse warnings:
      
        CHECK   spi_mpc83xx.c
      spi_mpc83xx.c:145:1: warning: symbol 'mpc83xx_spi_rx_buf_u8' was not declared. Should it be static?
      spi_mpc83xx.c:146:1: warning: symbol 'mpc83xx_spi_rx_buf_u16' was not declared. Should it be static?
      spi_mpc83xx.c:147:1: warning: symbol 'mpc83xx_spi_rx_buf_u32' was not declared. Should it be static?
      spi_mpc83xx.c:148:1: warning: symbol 'mpc83xx_spi_tx_buf_u8' was not declared. Should it be static?
      spi_mpc83xx.c:149:1: warning: symbol 'mpc83xx_spi_tx_buf_u16' was not declared. Should it be static?
      spi_mpc83xx.c:150:1: warning: symbol 'mpc83xx_spi_tx_buf_u32' was not declared. Should it be static?
      spi_mpc83xx.c:175:32: warning: incorrect type in initializer (different address spaces)
      spi_mpc83xx.c:175:32:    expected void *tmp_ptr
      spi_mpc83xx.c:175:32:    got unsigned int [noderef] <asn:2>*<noident>
      spi_mpc83xx.c:183:26: warning: incorrect type in argument 1 (different address spaces)
      spi_mpc83xx.c:183:26:    expected unsigned int [noderef] [usertype] <asn:2>*reg
      spi_mpc83xx.c:183:26:    got void *tmp_ptr
      spi_mpc83xx.c:184:26: warning: incorrect type in argument 1 (different address spaces)
      spi_mpc83xx.c:184:26:    expected unsigned int [noderef] [usertype] <asn:2>*reg
      spi_mpc83xx.c:184:26:    got void *tmp_ptr
      spi_mpc83xx.c:287:31: warning: incorrect type in initializer (different address spaces)
      spi_mpc83xx.c:287:31:    expected void *tmp_ptr
      spi_mpc83xx.c:287:31:    got unsigned int [noderef] <asn:2>*<noident>
      spi_mpc83xx.c:295:25: warning: incorrect type in argument 1 (different address spaces)
      spi_mpc83xx.c:295:25:    expected unsigned int [noderef] [usertype] <asn:2>*reg
      spi_mpc83xx.c:295:25:    got void *tmp_ptr
      spi_mpc83xx.c:296:25: warning: incorrect type in argument 1 (different address spaces)
      spi_mpc83xx.c:296:25:    expected unsigned int [noderef] [usertype] <asn:2>*reg
      spi_mpc83xx.c:296:25:    got void *tmp_ptr
      spi_mpc83xx.c:486:13: warning: symbol 'mpc83xx_spi_irq' was not declared. Should it be static?
      Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Kumar Gala <galak@gate.crashing.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ramfs: add support for "mode=" mount option · c3b1b1cb
      Wu Fengguang authored
      Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12843
      
      "I use ramfs instead of tmpfs for /tmp because I don't use swap on my
      laptop.  Some apps need 1777 mode for /tmp directory, but ramfs does not
      support 'mode=' mount option."
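      
      How a "mode=" option could be pulled out of a mount option string is
      sketched below in userspace C.  This is not the ramfs code:
      parse_mode_option(), DEFAULT_MODE and ALLOWED_BITS are hypothetical names,
      and the default of 0755 is an assumption made for the example.

```c
#include <stdlib.h>
#include <string.h>

#define DEFAULT_MODE 0755u   /* assumed default for this sketch */
#define ALLOWED_BITS 07777u  /* permission bits plus setuid/setgid/sticky */

/*
 * Hypothetical parser: scan a comma-separated option string for "mode=NNN"
 * (octal). Returns the mode, DEFAULT_MODE if the option is absent, or -1 if
 * the value is not a valid octal number. Other options are ignored.
 */
static long parse_mode_option(const char *opts)
{
	long mode = DEFAULT_MODE;

	while (opts && *opts) {
		size_t len = strcspn(opts, ",");  /* length of this option */

		if (strncmp(opts, "mode=", 5) == 0) {
			const char *val = opts + 5;
			char *end;
			unsigned long bits;

			if (*val < '0' || *val > '7')
				return -1;        /* empty or non-octal value */
			bits = strtoul(val, &end, 8);
			if (end != opts + len)
				return -1;        /* trailing junk after digits */
			mode = (long)(bits & ALLOWED_BITS);
		}
		opts += len;
		if (*opts == ',')
			opts++;
	}
	return mode;
}
```

      With this sketch, the /tmp case from the bug report corresponds to the
      option string "mode=1777", which parses to octal 01777.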
      Reported-by: Avan Anishchuk <matimatik@gmail.com>
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lis3: SPI transport layer · bb233fdf
      Daniel Mack authored
      Make use of the new abstraction layer and add a new transport layer for
      SPI.  Works fine on a PXA based board.
      Signed-off-by: Daniel Mack <daniel@caiaq.de>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Eric Piel <eric.piel@tremplin-utc.net>
      Cc: David Brownell <david-b@pacbell.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lis3: solve dependency between core and ACPI · a38da2ed
      Daniel Mack authored
      This solves the dependency between lis3lv02d.[ch] and ACPI specific
      methods.  It introduces a ->bus_priv pointer to the device struct which is
      cast to 'struct acpi_device' in the ACPI layer.  Changed hp_accel.c
      accordingly.
      Signed-off-by: Daniel Mack <daniel@caiaq.de>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Eric Piel <eric.piel@tremplin-utc.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lis3: reorder functions to make forward decl obsolete · ab337a63
      Daniel Mack authored
      Move lis3lv02d_init_device() down so that the forward declaration of
      lis3lv02d_add_fs() becomes unnecessary.
      Signed-off-by: Daniel Mack <daniel@caiaq.de>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Eric Piel <eric.piel@tremplin-utc.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hp_accel: axis conversion for hp compaq 8710w · 12a324b6
      Luca Cappa authored
      I have an HP Compaq 8710W laptop and compiled the LIS3LV02DL and HP_ACCEL
      drivers into my kernel.  On loading, the driver cannot recognize the laptop
      model, so I am sending the necessary information to update the database of
      axis orientations.
      
      >When the laptop is horizontal the position reported is about 0 for X and Y
      >and a positive value for Z
      Yes, it is about 0,0,1000, the actual reading says: (-17,-26,1018);
      
      > If the left side is elevated, X increases (becomes positive)
      Yes, X goes toward positive 1000.
      
      >If the front side (where the touchpad is) is elevated, Y decreases (becomes negative)
      No, Y goes toward positive 1000.
      
      >If the laptop is put upside-down, Z becomes negative
      Yes, the laptop on a table Z gives 1000, and if upsidedown the Z reads
      -1000.
      
      So in few words the Y axis is inverted.
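      
      One way to express such a per-model correction is a table of signed axis
      indices, sketched below.  The struct layout, get_axis(), convert() and the
      y_inverted entry are assumptions made for illustration and are not copied
      from hp_accel.c.

```c
/*
 * Signed, 1-based axis indices: 1, 2, 3 select the raw X, Y, Z value and a
 * negative index selects the same axis with its sign flipped. 0 is invalid.
 */
struct axis_conversion {
	signed char x;
	signed char y;
	signed char z;
};

/* Pick one logical axis out of the raw hardware triple, negating if asked. */
static int get_axis(signed char axis, const int hw[3])
{
	return axis > 0 ? hw[axis - 1] : -hw[-axis - 1];
}

/* Apply a conversion entry to a raw reading. */
static void convert(const struct axis_conversion *ac, const int hw[3],
		    int out[3])
{
	out[0] = get_axis(ac->x, hw);
	out[1] = get_axis(ac->y, hw);
	out[2] = get_axis(ac->z, hw);
}

/* Hypothetical entry for a machine whose Y axis is inverted. */
static const struct axis_conversion y_inverted = { 1, -2, 3 };
```

      Feeding the reading quoted above, (-17, -26, 1018), through y_inverted
      yields (-17, 26, 1018): only the sign of Y changes.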
      
      Cc: Eric Piel <eric.piel@tremplin-utc.net>
      Signed-off-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hp_accel: add two more axis information · 9d7639d3
      Pavel Machek authored
      Add two more laptops to whitelist.
      Signed-off-by: Michal Marek <mmarek@suse.cz>
      Signed-off-by: Pavel Machek <pavel@ucw.cz>
      Cc: Daniel Mack <daniel@caiaq.de>
      Cc: Eric Piel <eric.piel@tremplin-utc.net>
      Cc: Vladimir Botka <vbotka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hwmon: Add LTC4215 driver · 72f5de92
      Ira Snyder authored
      Add Linux support for the Linear Technology LTC4215 Hot Swap controller
      I2C monitoring interface.
      
      I have tested the driver with my board, and it appears to work fine.  With
      the power supplies disabled, it reads 11.93V input, 1.93V output, no
      current and no power.  With the supplies enabled, it reads 11.93V input,
      11.98V output, no current, no power.  I'm not drawing any current at the
      moment, so this is reasonable.  The value in the sense register never
      reads anything except 0, so I expect to get zero from the current and
      power calculations.
      
      I didn't attempt to support changing any of the chip's settings or
      enabling the FET.  I'm not sure even how to do that and still fit within
      the hwmon framework.  :)
      Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
      Cc: Jean Delvare <khali@linux-fr.org>
      Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hwmon: LM95241 driver · 06160327
      Davide Rizzo authored
      A hwmon driver for the National Semiconductor LM95241 triple temperature
      sensor chip.
      Signed-off-by: Davide Rizzo <elpa-rizzo@gmail.com>
      Cc: Jean Delvare <khali@linux-fr.org>
      Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hp_accel: adev is poor name of exported symbol · be84cfc5
      Pavel Machek authored
      As Andrew noted, adev is a pretty poor name for an exported symbol.
      Rename it to lis3.
      Signed-off-by: Pavel Machek <pavel@ucw.cz>
      Cc: Eric Piel <eric.piel@tremplin-utc.net>
      Cc: Vladimir Botka <vbotka@suse.cz>
      Cc: <Quoc.Pham@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hp_accel: small documentation updates · 2b872903
      Pavel Machek authored
      Fix English in Documentation, add "how to test" description.
      Signed-off-by: Pavel Machek <pavel@suse.cz>
      Cc: Eric Piel <eric.piel@tremplin-utc.net>
      Cc: Vladimir Botka <vbotka@suse.cz>
      Cc: <Quoc.Pham@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: make tty use keyed wakeups · 4b19449d
      Davide Libenzi authored
      Introduce keyed event wakeups inside the TTY code.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: make eventfd use keyed wakeups · 39510888
      Davide Libenzi authored
      Introduce keyed event wakeups inside the eventfd code.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: teach epoll about hints coming with the wakeup key · 2dfa4eea
      Davide Libenzi authored
      Use the events hint now sent by some devices, to avoid unnecessary wakeups
      for events that are of no interest to the caller.  This code handles both
      devices that are sending keyed events, and the ones that are not (and
      even the ones that sometimes send events, and sometimes don't).
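      
      The filtering rule described above can be sketched as a small userspace
      model.  Nothing here is kernel code: struct waiter, waiter_callback(),
      wake_all() and the EV_* constants are hypothetical stand-ins for the epoll
      item, its wait-queue callback, the wakeup loop and POLLIN/POLLOUT.

```c
#include <stddef.h>

#define EV_IN  0x001u  /* stand-in for POLLIN */
#define EV_OUT 0x004u  /* stand-in for POLLOUT */

/* Hypothetical waiter: the events it cares about and a wakeup counter. */
struct waiter {
	unsigned long events;
	unsigned int wakeups;
};

/*
 * Wakeup callback. key == 0 means the device sent no hint, so the waiter
 * must be woken and the real events resolved later. A non-zero key that
 * shares no bits with the interest mask is dropped.
 */
static int waiter_callback(struct waiter *w, unsigned long key)
{
	if (key && !(key & w->events))
		return 0;
	w->wakeups++;
	return 1;
}

/* Deliver one wakeup to every waiter; returns how many were woken. */
static size_t wake_all(struct waiter *ws, size_t n, unsigned long key)
{
	size_t woken = 0;

	for (size_t i = 0; i < n; i++)
		woken += (size_t)waiter_callback(&ws[i], key);
	return woken;
}
```

      A device that sends no hint (key of 0) still wakes every waiter, so
      unconverted devices keep working; only hinted wakeups are filtered.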
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: make sockets use keyed wakeups · 37e5540b
      Davide Libenzi authored
      Add support for event-aware wakeups to the sockets code.  Events are
      delivered to the wakeup target, so that epoll can avoid spurious wakeups
      for non-interesting events.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: introduce new *_poll() wakeup macros · c0da3775
      Davide Libenzi authored
      Introduce new wakeup macros that allow passing an event mask to the wakeup
      targets.  They exactly mimic their non-_poll() counterparts, with the added
      event mask passing capability.  I added only the ones currently
      requested, avoiding the _nr() and _all() for the moment.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll keyed wakeups: add __wake_up_locked_key() and __wake_up_sync_key() · 4ede816a
      Davide Libenzi authored
      This patchset introduces wakeup hints for some of the most popular (from
      epoll POV) devices, so that epoll code can avoid spurious wakeups on its
      waiters.
      
      The problem with epoll is that the callback-based wakeups do not, ATM,
      carry any information about the events the wakeup is related to.  So the
      only choice epoll has (not being able to call f_op->poll() from inside the
      callback), is to add the file* to a ready-list and resolve the real events
      later on, at epoll_wait() (or its own f_op->poll()) time.  This can cause
      spurious wakeups, since the wake_up() itself might be for an event the
      caller is not interested in.
      
      The rate of these spurious wakeups can be pretty high when many
      network sockets are being monitored.
      
      By allowing devices to report the events the wakeups refer to (at least
      the two major classes - POLLIN/POLLOUT), we are able to spare useless
      wakeups by proper handling inside the epoll's poll callback.
      
      Epoll will have in any case to call f_op->poll() on the file* later on,
      since the change to be done in order to have the full event set sent via
      wakeup, is too invasive for the way our f_op->poll() system works (the
      full event set is calculated inside the poll function - there are too many
      of them to even start thinking the change - also poll/select would need
      change too).
      
      Epoll is changed in a way that both devices which send event hints, and
      the ones that don't, are correctly handled.  The former will gain some
      efficiency though.
      
      As a general rule, devices should add an event mask by using the
      key-aware wakeup macros when waking up poll wait queues.  I tested it
      (together with the epoll's poll fix patch Andrew has in -mm) and wakeups
      for the supported devices are correctly filtered.
      
      Test program available here:
      
      http://www.xmailserver.org/epoll_test.c
      
      This patch:
      
      Nothing revolutionary here.  Just using the available "key" that our
      wakeup core already supports.  The __wake_up_locked_key() was a no-brainer,
      since both __wake_up_locked() and __wake_up_locked_key() are thin wrappers
      around __wake_up_common().
      
      The __wake_up_sync() function had a body, so the choice was between
      borrowing the body for __wake_up_sync_key() and calling it from
      __wake_up_sync(), or making an inline and calling it from both.  I chose the
      former since in most archs it all resolves to "mov $0, REG; jmp ADDR".
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@movementarian.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • eventfd: improve support for semaphore-like behavior · bcd0b235
      Davide Libenzi authored
      People started using eventfd in a semaphore-like way where before they
      were using pipes.
      
      That is, counter-based resource access, where a "wait()" returns
      immediately by decrementing the counter by one if the counter is greater than
      zero, and otherwise waits, and where a "post(count)" adds count to
      the counter, releasing the appropriate number of waiters.  With eventfd the
      "post" (write) part is fine, while the "wait" (read) does not dequeue 1,
      but the whole counter value.
      
      The problem with eventfd is that a read() on the fd returns and wipes the
      whole counter, making the use of it as semaphore a little bit more
      cumbersome.  You can do a read() followed by a write() of COUNTER-1, but
      IMO it's pretty easy and cheap to make this work w/out extra steps.  This
      patch introduces a new eventfd flag that tells eventfd to only dequeue 1
      from the counter, allowing simple read/write to make it behave like a
      semaphore.  Simple test here:
      
      http://www.xmailserver.org/eventfd-sem.c
      
      To be back-compatible with earlier kernels, userspace applications should
      probe for the availability of this feature via
      
      #ifdef EFD_SEMAPHORE
      	fd = eventfd2 (CNT, EFD_SEMAPHORE);
      	if (fd == -1 && errno == EINVAL)
      		<fallback>
      #else
      		<fallback>
      #endif
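      
      A fuller userspace sketch of the semaphore-like usage follows.  It assumes
      a Linux system whose C library provides the eventfd() wrapper with a flags
      argument and defines EFD_SEMAPHORE; the efd_sem_* helper names are
      hypothetical, and EFD_NONBLOCK is added only so that a wait on an empty
      counter fails instead of blocking.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create a semaphore-like eventfd with an initial count; -1 on failure. */
static int efd_sem_create(unsigned int initial)
{
#ifdef EFD_SEMAPHORE
	return eventfd(initial, EFD_SEMAPHORE | EFD_NONBLOCK);
#else
	(void)initial;
	return -1;  /* feature not available at build time */
#endif
}

/* post(count): add count to the counter. Returns 0 on success. */
static int efd_sem_post(int fd, uint64_t count)
{
	ssize_t n = write(fd, &count, sizeof(count));

	return n == (ssize_t)sizeof(count) ? 0 : -1;
}

/*
 * Non-blocking wait(): take exactly one unit from the counter. Returns 0 on
 * success and -1 if the counter is zero (read fails with EAGAIN) or on error.
 */
static int efd_sem_trywait(int fd)
{
	uint64_t val;

	if (read(fd, &val, sizeof(val)) != (ssize_t)sizeof(val))
		return -1;
	return val == 1 ? 0 : -1;
}
```

      After a post of 3, three waits succeed and a fourth fails, whereas
      without EFD_SEMAPHORE the first read would drain the whole counter.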
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: <linux-api@vger.kernel.org>
      Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: use real type instead of void * · 4f0989db
      Tony Battersby authored
      eventpoll.c uses void * in one place for no obvious reason; change it to
      use the real type instead.
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: clean up ep_modify · e057e15f
      Tony Battersby authored
      ep_modify() doesn't need to set event.data from within the ep->lock
      spinlock as the comment suggests.  The only place event.data is used is
      ep_send_events_proc(), and this is protected by ep->mtx instead of
      ep->lock.  Also update the comment for mutex_lock() at the top of
      ep_scan_ready_list(), which mentions epoll_ctl(EPOLL_CTL_DEL) but not
      epoll_ctl(EPOLL_CTL_MOD).
      
      ep_modify() can also use spin_lock_irq() instead of spin_lock_irqsave().
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: remove unnecessary xchg · d1bc90dd
      Tony Battersby authored
      xchg in ep_unregister_pollwait() is unnecessary because it is protected by
      either epmutex or ep->mtx (the same protection as ep_remove()).
      
      If xchg was necessary, it would be insufficient to protect against
      problems: if multiple concurrent calls to ep_unregister_pollwait() were
      possible then a second caller that returns without doing anything because
      nwait == 0 could return before the waitqueues are removed by the first
      caller, which looks like it could lead to problematic races with
      ep_poll_callback().
      
      So remove xchg and add comments about the locking.
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: remember the event if epoll_wait returns -EFAULT · d0305882
      Tony Battersby authored
      If epoll_wait returns -EFAULT, the event that was being returned when the
      fault was encountered will be forgotten.  This is not a big deal since
      EFAULT will happen only if a buggy userspace program passes in a bad
      address, in which case what happens later usually doesn't matter.
      However, it is easy to remember the event for later, and this patch makes
      a simple change to do that.
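
      The behaviour described above can be observed from userspace. The following is a minimal sketch, not part of the patch: epoll_wait_fault_demo() is a hypothetical helper, and it assumes a PROT_NONE mapping is a reasonable stand-in for the "bad address" a buggy program might pass. The event is registered edge-triggered so that it would not simply be reported again by level-triggered polling.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): queue one edge-triggered
 * event, call epoll_wait() with an unwritable buffer, then retry with a
 * valid one. Returns the errno of the first call (0 if it did not fail)
 * and stores the result of the retry in *retry_res.
 */
static int epoll_wait_fault_demo(int *retry_res)
{
	int pfds[2], epfd, rc, first_errno = 0;
	struct epoll_event evt = { 0 }, out;
	void *bad;

	rc = pipe(pfds);
	assert(rc == 0);
	epfd = epoll_create(1);
	assert(epfd >= 0);

	/* A mapped but inaccessible page: the kernel cannot copy events into it. */
	bad = mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	assert(bad != MAP_FAILED);

	/* Edge-triggered, so the pending event is not re-reported by polling. */
	evt.events = EPOLLIN | EPOLLET;
	evt.data.fd = pfds[0];
	rc = epoll_ctl(epfd, EPOLL_CTL_ADD, pfds[0], &evt);
	assert(rc == 0);
	rc = (int) write(pfds[1], "w", 1);
	assert(rc == 1);

	/* The event is ready, but the destination buffer faults. */
	if (epoll_wait(epfd, bad, 1, 0) < 0)
		first_errno = errno;

	/* With a valid buffer, a kernel that remembers the event can report it. */
	*retry_res = epoll_wait(epfd, &out, 1, 0);

	munmap(bad, 4096);
	close(epfd);
	close(pfds[0]);
	close(pfds[1]);
	return first_errno;
}
```

      The first call is expected to fail with EFAULT. On a kernel that includes this change, the retry should then be able to return the remembered event instead of reporting nothing.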
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: don't use current in irq context · abff55ce
      Tony Battersby authored
      ep_call_nested() (formerly ep_poll_safewake()) uses "current" (without
      dereferencing it) to detect callback recursion, but it may be called from
      irq context where the use of current is generally discouraged.  It would
      be better to use get_cpu() and put_cpu() to detect the callback recursion.
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: remove debugging code · bb57c3ed
      Davide Libenzi authored
      Remove debugging code from epoll.  There's no need for it to be included
      in mainline code.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: fix epoll's own poll (update) · 296e236e
      Davide Libenzi authored
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: fix epoll's own poll · 5071f97e
      Davide Libenzi authored
      Fix a bug in epoll's f_op->poll() code, which returns POLLIN even though
      no monitored fds are actually ready.  The bug shows up if you add an
      epoll fd to another fd container (poll, select, epoll).
      
      The problem is that the callback-based wakeups used by epoll do not carry
      any information about the events that actually happened (patches will
      follow to fix this).  So the callback code, since it can't call the
      file*'s ->poll() inside the callback, chains the file* into a ready-list.
      
      So, suppose you added an fd with EPOLLOUT only and some data shows up on
      the fd: the file* mapped by the fd will be added to the ready-list (via
      the wakeup callback).  During normal epoll_wait() use, this condition is
      sorted out at the time we're actually able to call the file*'s
      f_op->poll().
      
      Inside the old epoll's f_op->poll() though, only a quick check
      !list_empty(ready-list) was performed, and this could have led to
      reporting POLLIN even though no ready fds would show up at a following
      epoll_wait().  In order to correctly report the ready status for an epoll
      fd, the ready-list must be checked to see if any really available fd+event
      would be ready in a following epoll_wait().
      
      This operation (calling f_op->poll() from inside f_op->poll()) must, like
      wakeups, be handled with care because epoll fds can be added to other
      epoll fds.
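
      The scenario described above can be reduced to a few lines before reading the full test program. This is a minimal sketch under stated assumptions: nested_epoll_revents() is a hypothetical helper, not code from the patch, and it covers only the simplest case of one epoll fd placed inside a poll(2) set.

```c
#include <assert.h>
#include <poll.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the revents that poll(2) reports for an
 * epoll fd whose only monitored fd (a pipe read end) is registered for
 * EPOLLOUT and then receives data.
 */
static int nested_epoll_revents(void)
{
	int pfds[2], epfd, rc;
	struct epoll_event evt = { 0 };
	struct pollfd pfd = { 0 };

	rc = pipe(pfds);
	assert(rc == 0);
	epfd = epoll_create(1);
	assert(epfd >= 0);

	/* Watch the read side for EPOLLOUT only: data arriving is not a match. */
	evt.events = EPOLLOUT;
	evt.data.fd = pfds[0];
	rc = epoll_ctl(epfd, EPOLL_CTL_ADD, pfds[0], &evt);
	assert(rc == 0);

	/* Data shows up on the fd, which wakes the pipe's wait queue. */
	rc = (int) write(pfds[1], "w", 1);
	assert(rc == 1);

	/* Put the epoll fd inside another fd container, here poll(2). */
	pfd.fd = epfd;
	pfd.events = POLLIN;
	rc = poll(&pfd, 1, 0);
	assert(rc >= 0);

	close(epfd);
	close(pfds[0]);
	close(pfds[1]);
	return pfd.revents;
}
```

      On a kernel with this fix, the returned mask should not contain POLLIN, because a following epoll_wait() would have nothing to report.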
      
      Test code:
      
      /*
       *  epoll_test by Davide Libenzi (Simple code to test epoll internals)
       *  Copyright (C) 2008  Davide Libenzi
       *
       *  This program is free software; you can redistribute it and/or modify
       *  it under the terms of the GNU General Public License as published by
       *  the Free Software Foundation; either version 2 of the License, or
       *  (at your option) any later version.
       *
       *  This program is distributed in the hope that it will be useful,
       *  but WITHOUT ANY WARRANTY; without even the implied warranty of
       *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
       *  GNU General Public License for more details.
       *
       *  You should have received a copy of the GNU General Public License
       *  along with this program; if not, write to the Free Software
       *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
       *
       *  Davide Libenzi <davidel@xmailserver.org>
       *
       */
      
      #include <sys/types.h>
      #include <unistd.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <errno.h>
      #include <signal.h>
      #include <limits.h>
      #include <poll.h>
      #include <sys/epoll.h>
      #include <sys/wait.h>
      
      #define EPWAIT_TIMEO	(1 * 1000)
      #ifndef POLLRDHUP
      #define POLLRDHUP 0x2000
      #endif
      
      #define EPOLL_MAX_CHAIN	100L
      
      #define EPOLL_TF_LOOP (1 << 0)
      
      struct epoll_test_cfg {
      	long size;
      	long flags;
      };
      
      static int xepoll_create(int n) {
      	int epfd;
      
      	if ((epfd = epoll_create(n)) == -1) {
      		perror("epoll_create");
      		exit(2);
      	}
      
      	return epfd;
      }
      
      static void xepoll_ctl(int epfd, int cmd, int fd, struct epoll_event *evt) {
      	if (epoll_ctl(epfd, cmd, fd, evt) < 0) {
      		perror("epoll_ctl");
      		exit(3);
      	}
      }
      
      static void xpipe(int *fds) {
      	if (pipe(fds)) {
      		perror("pipe");
      		exit(4);
      	}
      }
      
      static pid_t xfork(void) {
      	pid_t pid;
      
      	if ((pid = fork()) == (pid_t) -1) {
      		perror("fork");
      		exit(5);
      	}
      
      	return pid;
      }
      
      static int run_forked_proc(int (*proc)(void *), void *data) {
      	int status;
      	pid_t pid;
      
      	if ((pid = xfork()) == 0)
      		exit((*proc)(data));
      	if (waitpid(pid, &status, 0) != pid) {
      		perror("waitpid");
      		return -1;
      	}
      
      	return WIFEXITED(status) ? WEXITSTATUS(status): -2;
      }
      
      static int check_events(int fd, int timeo) {
      	struct pollfd pfd;
      
      	fprintf(stdout, "Checking events for fd %d\n", fd);
      	memset(&pfd, 0, sizeof(pfd));
      	pfd.fd = fd;
      	pfd.events = POLLIN | POLLOUT;
      	if (poll(&pfd, 1, timeo) < 0) {
      		perror("poll()");
      		return 0;
      	}
      	if (pfd.revents & POLLIN)
      		fprintf(stdout, "\tPOLLIN\n");
      	if (pfd.revents & POLLOUT)
      		fprintf(stdout, "\tPOLLOUT\n");
      	if (pfd.revents & POLLERR)
      		fprintf(stdout, "\tPOLLERR\n");
      	if (pfd.revents & POLLHUP)
      		fprintf(stdout, "\tPOLLHUP\n");
      	if (pfd.revents & POLLRDHUP)
      		fprintf(stdout, "\tPOLLRDHUP\n");
      
      	return pfd.revents;
      }
      
      static int epoll_test_tty(void *data) {
      	int epfd, ifd = fileno(stdin), res;
      	struct epoll_event evt;
      
      	if (check_events(ifd, 0) != POLLOUT) {
      		fprintf(stderr, "Something is cooking on STDIN (%d)\n", ifd);
      		return 1;
      	}
      	epfd = xepoll_create(1);
      	fprintf(stdout, "Created epoll fd (%d)\n", epfd);
      	memset(&evt, 0, sizeof(evt));
      	evt.events = EPOLLIN;
      	xepoll_ctl(epfd, EPOLL_CTL_ADD, ifd, &evt);
      	if (check_events(epfd, 0) & POLLIN) {
      		res = epoll_wait(epfd, &evt, 1, 0);
      		if (res == 0) {
      			fprintf(stderr, "Epoll fd (%d) is ready when it shouldn't!\n",
      				epfd);
      			return 2;
      		}
      	}
      
      	return 0;
      }
      
      static int epoll_wakeup_chain(void *data) {
      	struct epoll_test_cfg *tcfg = data;
      	int i, res, epfd, bfd, nfd, pfds[2];
      	pid_t pid;
      	struct epoll_event evt;
      
      	memset(&evt, 0, sizeof(evt));
      	evt.events = EPOLLIN;
      
      	epfd = bfd = xepoll_create(1);
      
      	for (i = 0; i < tcfg->size; i++) {
      		nfd = xepoll_create(1);
      		xepoll_ctl(bfd, EPOLL_CTL_ADD, nfd, &evt);
      		bfd = nfd;
      	}
      	xpipe(pfds);
      	if (tcfg->flags & EPOLL_TF_LOOP)
      	{
      		xepoll_ctl(bfd, EPOLL_CTL_ADD, epfd, &evt);
      		/*
      		 * If we're testing for a loop, we want the wakeup
      		 * triggered by the write to the pipe done in the child
      		 * process to trigger a fake event. So we add the pipe
      		 * read side with EPOLLOUT events. This will trigger
      		 * an addition to the ready-list, but no real events
      		 * will be there. Then the epoll kernel code will proceed
      		 * in calling f_op->poll() of the epfd, triggering the
      		 * loop we want to test.
      		 */
      		evt.events = EPOLLOUT;
      	}
      	xepoll_ctl(bfd, EPOLL_CTL_ADD, pfds[0], &evt);
      
      	/*
      	 * The pipe write must come after the poll(2) call inside
      	 * check_events(). This tests the nested wakeup code in
      	 * fs/eventpoll.c:ep_poll_safewake().
      	 * By having the check_events() (hence poll(2)) happen first,
      	 * we have the poll wait queue filled up, and the write(2) in the
      	 * child will trigger the wakeup chain.
      	 */
      	if ((pid = xfork()) == 0) {
      		sleep(1);
      		write(pfds[1], "w", 1);
      		exit(0);
      	}
      
      	res = check_events(epfd, 2000) & POLLIN;
      
      	if (waitpid(pid, NULL, 0) != pid) {
      		perror("waitpid");
      		return -1;
      	}
      
      	return res;
      }
      
      static int epoll_poll_chain(void *data) {
      	struct epoll_test_cfg *tcfg = data;
      	int i, res, epfd, bfd, nfd, pfds[2];
      	pid_t pid;
      	struct epoll_event evt;
      
      	memset(&evt, 0, sizeof(evt));
      	evt.events = EPOLLIN;
      
      	epfd = bfd = xepoll_create(1);
      
      	for (i = 0; i < tcfg->size; i++) {
      		nfd = xepoll_create(1);
      		xepoll_ctl(bfd, EPOLL_CTL_ADD, nfd, &evt);
      		bfd = nfd;
      	}
      	xpipe(pfds);
      	if (tcfg->flags & EPOLL_TF_LOOP)
      	{
      		xepoll_ctl(bfd, EPOLL_CTL_ADD, epfd, &evt);
      		/*
      		 * If we're testing for a loop, we want the wakeup
      		 * triggered by the write to the pipe done in the child
      		 * process to trigger a fake event. So we add the pipe
      		 * read side with EPOLLOUT events. This will trigger
      		 * an addition to the ready-list, but no real events
      		 * will be there. Then the epoll kernel code will proceed
      		 * in calling f_op->poll() of the epfd, triggering the
      		 * loop we want to test.
      		 */
      		evt.events = EPOLLOUT;
      	}
      	xepoll_ctl(bfd, EPOLL_CTL_ADD, pfds[0], &evt);
      
      	/*
      	 * The pipe write must come before the poll(2) call inside
      	 * check_events(). This tests the nested f_op->poll calls code in
      	 * fs/eventpoll.c:ep_eventpoll_poll().
      	 * By having the pipe write(2) happen first, we make the kernel
      	 * epoll code load the ready lists, and the following poll(2)
      	 * done inside check_events() will test the nested poll code in
      	 * ep_eventpoll_poll().
      	 */
      	if ((pid = xfork()) == 0) {
      		write(pfds[1], "w", 1);
      		exit(0);
      	}
      	sleep(1);
      	res = check_events(epfd, 1000) & POLLIN;
      
      	if (waitpid(pid, NULL, 0) != pid) {
      		perror("waitpid");
      		return -1;
      	}
      
      	return res;
      }
      
      int main(int ac, char **av) {
      	int error;
      	struct epoll_test_cfg tcfg;
      
      	fprintf(stdout, "\n********** Testing TTY events\n");
      	error = run_forked_proc(epoll_test_tty, NULL);
      	fprintf(stdout, error == 0 ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = 3;
      	tcfg.flags = 0;
      	fprintf(stdout, "\n********** Testing short wakeup chain\n");
      	error = run_forked_proc(epoll_wakeup_chain, &tcfg);
      	fprintf(stdout, error == POLLIN ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = EPOLL_MAX_CHAIN;
      	tcfg.flags = 0;
      	fprintf(stdout, "\n********** Testing long wakeup chain (HOLD ON)\n");
      	error = run_forked_proc(epoll_wakeup_chain, &tcfg);
      	fprintf(stdout, error == 0 ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = 3;
      	tcfg.flags = 0;
      	fprintf(stdout, "\n********** Testing short poll chain\n");
      	error = run_forked_proc(epoll_poll_chain, &tcfg);
      	fprintf(stdout, error == POLLIN ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = EPOLL_MAX_CHAIN;
      	tcfg.flags = 0;
      	fprintf(stdout, "\n********** Testing long poll chain (HOLD ON)\n");
      	error = run_forked_proc(epoll_poll_chain, &tcfg);
      	fprintf(stdout, error == 0 ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = 3;
      	tcfg.flags = EPOLL_TF_LOOP;
      	fprintf(stdout, "\n********** Testing loopy wakeup chain (HOLD ON)\n");
      	error = run_forked_proc(epoll_wakeup_chain, &tcfg);
      	fprintf(stdout, error == 0 ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	tcfg.size = 3;
      	tcfg.flags = EPOLL_TF_LOOP;
      	fprintf(stdout, "\n********** Testing loopy poll chain (HOLD ON)\n");
      	error = run_forked_proc(epoll_poll_chain, &tcfg);
      	fprintf(stdout, error == 0 ?
      		"********** OK\n": "********** FAIL (%d)\n", error);
      
      	return 0;
      }
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>