- 25 Jul, 2008 21 commits
-
-
Nathan Fontenot authored
Verify memory entitlement updates can be handled by vio. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Robert Jennings authored
This is a large patch but the normal code path is not affected. For non-pSeries platforms the code is ifdef'ed out and for non-CMO enabled pSeries systems this does not affect the normal code path. Devices that do not perform DMA operations do not need modification with this patch. The function get_desired_dma was renamed from get_io_entitlement for clarity. Overview Cooperative Memory Overcommitment (CMO) allows for a set of OS partitions to be run with less RAM than the aggregate needs of the group of partitions. The firmware will balance memory between the partitions and page in/out memory as needed. Based on the number and type of IO adpaters preset each partition is allocated an amount of memory for DMA operations and this allocation will be guaranteed to the partition; this is referred to as the partition's 'entitlement'. Partitions running in a CMO environment can only have virtual IO devices present. The VIO bus layer will manage the IO entitlement for the system. Accounting, at a system and per-device level, is tracked in the VIO bus code and exposed via sysfs. A set of dma_ops functions are added to the bus to allow for this accounting. Bus initialization At initialization, the bus will calculate the minimum needs of the system based on providing each device present with a standard minimum entitlement along with a spare allocation for the bus to handle hotplug events. If the minimum needs can not be met the system boot will be halted. Device changes The significant changes for devices while running under CMO are that the devices must specify how much dedicated IO entitlement they desire and must also handle DMA mapping errors that can occur due to constrained IO memory. The virtual IO drivers are modified to silence errors when DMA mappings fail for CMO and handle these failures gracefully. Each devices will be guaranteed a minimum entitlement that can always be mapped. Devices will specify how much entitlement they desire and the VIO bus will attempt to provide for this. Devices can change their desired entitlement level at any point in time to address particular needs (via vio_cmo_set_dev_desired()), not just at device probe time. VIO bus changes The system will have a particular entitlement level available from which it can provide memory to the devices. The bus defines two pools of memory within this entitlement, the reserved and excess pools. Each device is provided with it's own entitlement no less than a system defined minimum entitlement and no greater than what the device has specified as it's desired entitlement. The entitlement provided to devices comes from the reserve pool. The reserve pool can also contain a spare allocation as large as the system defined minimum entitlement which is used for device hotplug events. Any entitlement not needed to fulfill the needs of a reserve pool is placed in the excess pool. Each device is guaranteed that it can map up to it's entitled level; additional mapping are possible as long as there is unmapped memory in the excess pool. Bus probe As the system starts, each device is given an entitlement equal only to the system defined minimum entitlement. The reserve pool is equal to the sum of these entitlements, plus a spare allocation. The VIO bus also tracks the aggregate desired entitlement of all the devices. If the system desired entitlement is greater than the size of the reserve pool, when devices unmap IO memory it will be reserved and a balance operation will be scheduled for some time in the future. Entitlement balancing The balance function tries to fairly distribute entitlement between the devices in the system with the goal of providing each device with it's desired amount of entitlement. Devices using more than what would be ideal will have their entitled set-point adjusted; this will effectively set a goal for lower IO memory usage as future mappings can fail and deallocations will trigger a balance operation to distribute the newly unmapped memory. A fair distribution of entitlement can take several balance operations to achieve. Entitlement changes and device DLPAR events will alter the state of CMO and will trigger balance operations. Hotplug events The VIO bus allows for changes in system entitlement at run-time via 'vio_cmo_entitlement_update()'. When devices are added the hotplug device event will be preceded by a system entitlement increase and this is reversed when devices are removed. The following changes are made that the VIO bus layer for CMO: * add IO memory accounting per device structure. * add IO memory entitlement query function to driver structure. * during vio bus probe, if CMO is enabled, check that driver has memory entitlement query function defined. Fail if function not defined. * fail to register driver if io entitlement function not defined. * create set of dma_ops at vio level for CMO that will track allocations and return DMA failures once entitlement is reached. Entitlement will limited by overall system entitlement. Devices will have a reserved quantity of memory that is guaranteed, the rest can be used as available. * expose entitlement, current allocation, desired allocation, and the allocation error counter for devices to the user through sysfs * provide mechanism for changing a device's desired entitlement at run time for devices as an exported function and sysfs tunable * track any DMA failures for entitled IO memory for each vio device. * check entitlement against available system entitlement on device add * track entitlement metrics (high water mark, current usage) * provide function to reset high water mark * provide minimum and desired entitlement numbers at a bus level * provide drivers with a minimum guaranteed entitlement * balance available entitlement between devices to satisfy their needs * handle system entitlement changes and device hotplug Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Robert Jennings authored
To support Cooperative Memory Overcommitment (CMO), we need to check for failure from some of the tce hcalls. These changes for the pseries platform affect the powerpc architecture; patches for the other affected platforms are included in this patch. pSeries platform IOMMU code changes: * platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and return an error. Architecture IOMMU code changes: * Calls to ppc_md.tce_build need to check return values and return DMA_MAPPING_ERROR for transient errors. Architecture changes: * struct machdep_calls for tce_build*_pSeriesLP functions need to change to indicate failure. * all other platforms will need updates to iommu functions to match the new calling semantics; they will return 0 on success. The other platforms default configs have been built, but no further testing was performed. Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Brian King authored
With the addition of Cooperative Memory Overcommitment (CMO) support for IBM Power Systems, two fields have been added to the VPA to report paging statistics. Add support in lparcfg to report them to userspace. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Brian King authored
Adds a collaborative memory manager, which acts as a simple balloon driver for System p machines that support cooperative memory overcommitment (CMO). Adds a platform configuration option for CMO called PPC_SMLPAR. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Brian King authored
Newer versions of firmware support page states, which are used by the collaborative memory manager (future patch) to "loan" pages to the hypervisor for use by other partitions. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Robert Jennings authored
For Cooperative Memory Overcommitment (CMO), set the FW_FEATURE_CMO flag in powerpc_firmware_features from the rtas ibm,get-system-parameters table prior to calling iommu_init_early_pSeries. With this, any CMO specific functionality can be controlled by checking: firmware_has_feature(FW_FEATURE_CMO) Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Robert Jennings authored
Split the retrieval of processor entitlement data returned in the H_GET_PPP hcall into its own helper routine. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Nathan Fontenot authored
Update /proc/ppc64/lparcfg to display Cooperative Memory Overcommitment statistics as reported by the H_GET_MPP hcall. This also updates the lparcfg interface to allow setting memory entitlement and weight. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Nathan Fotenot authored
Split the retrieval and setting of processor entitlement and weight into helper routines. This also removes the printing of the raw values returned from h_get_ppp, the values are already parsed and printed. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Nathan Fontenot authored
Remove the extraneous error reporting used when a hcall made from lparcfg fails. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Segher Boessenkool authored
My previous patch to fix compilation with binutils-2.17 causes a "file truncated" build error from ld with binutils 2.15 (and possibly older), and a warning with 2.16 and 2.17. This fixes it. Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Acked-by: Chuck Meade <chuckmeade@mindspring.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mark Nelson authored
At the moment the fixed mapping is by default strongly ordered (the iommu_fixed=weak boot option must be used to make the fixed mapping weakly ordered). If we're on a setup where the southbridge is being used in endpoint mode (triblade and CAB boards) the default should be a weakly ordered fixed mapping. This adds a check so that if a node of type pcie-endpoint can be found in the device tree the fixed mapping is set to be weak by default (but can be overridden using iommu_fixed=strong). Signed-off-by: Mark Nelson <markn@au1.ibm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Luis Machado authored
This patch implements support for HW based watchpoint via the DBSR_DAC (Data Address Compare) facility of the BookE processors. It does so by interfacing with the existing DABR breakpoint code and adding the necessary bits and pieces for the new bits to be properly set or cleared Signed-off-by: Luis Machado <luisgpm@br.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Stephen Rothwell authored
A struct sysdev_attribute * parameter was added to the show routine by commit 4a0b2b4d "sysdev: Pass the attribute to the low level sysdev show/store function". This eliminates a warning: arch/powerpc/kernel/sysfs.c:538: warning: initialization from incompatible pointer type Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Nathan Lynch authored
Stash the first platform string matched by identify_cpu() in powerpc_base_platform, and supply that to the ELF loader for the value of AT_BASE_PLATFORM. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Nathan Lynch authored
Some IBM POWER-based platforms have the ability to run in a mode which mostly appears to the OS as a different processor from the actual hardware. For example, a Power6 system may appear to be a Power5+, which makes the AT_PLATFORM value "power5+". This means that programs are restricted to the ISA supported by Power5+; Power6-specific instructions are treated as illegal. However, some applications (virtual machines, optimized libraries) can benefit from knowledge of the underlying CPU model. A new aux vector entry, AT_BASE_PLATFORM, will denote the actual hardware. For example, on a Power6 system in Power5+ compatibility mode, AT_PLATFORM will be "power5+" and AT_BASE_PLATFORM will be "power6". The idea is that AT_PLATFORM indicates the instruction set supported, while AT_BASE_PLATFORM indicates the underlying microarchitecture. If the architecture has defined ELF_BASE_PLATFORM, copy that value to the user stack in the same manner as ELF_PLATFORM. Signed-off-by: Nathan Lynch <ntl@pobox.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Benjamin Herrenschmidt authored
-
Benjamin Herrenschmidt authored
-
Adrian Bunk authored
This fixes the following compile error caused by commit f9247273 ("UFS: add const to parser token table"): CC fs/nfs/nfsroot.o /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/nfs/nfsroot.c:130: error: tokens causes a section type conflict make[3]: *** [fs/nfs/nfsroot.o] Error 1 Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
..otherwise oprofile will fall back on that poor timer interrupt. Also replace the unreadable chain of if-statements with a "switch()" statement instead. It generates better code, and is a lot clearer. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 24 Jul, 2008 19 commits
-
-
Linus Torvalds authored
Suresh Siddha wants to fix a possible FPU leakage in error conditions, but the fact that save/restore_i387() are inlines in a header file makes that harder to do than necessary. So start off with an obvious cleanup. This just moves the x86-64 version of save/restore_i387() out of the header file, and moves it to the only file that it is actually used in: arch/x86/kernel/signal_64.c. So exposing it in a header file was wrong to begin with. [ Side note: I'd like to fix up some of the games we play with the 32-bit version of these functions too, but that's a separate matter. The 32-bit versions are shared - under different names at that! - by both the native x86-32 code and the x86-64 32-bit compatibility code ] Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits) ide: use proper printk() KERN_* levels in ide-probe.c ide: fix for EATA SCSI HBA in ATA emulating mode ide: remove stale comments from drivers/ide/Makefile ide: enable local IRQs in all handlers for TASKFILE_NO_DATA data phase ide-scsi: remove kmalloced struct request ht6560b: remove old history ht6560b: update email address ide-cd: fix oops when using growisofs gayle: release resources on ide_host_add() failure palm_bk3710: add UltraDMA/100 support ide: trivial sparse annotations ide: ide-tape.c sparse annotations and unaligned access removal ide: drop 'name' parameter from ->init_chipset method ide: prefix messages from IDE PCI host drivers by driver name it821x: remove DECLARE_ITE_DEV() macro it8213: remove DECLARE_ITE_DEV() macro ide: include PCI device name in messages from IDE PCI host drivers ide: remove <asm/ide.h> for some archs ide-generic: remove ide_default_{io_base,irq}() inlines (take 3) ide-generic: is no longer needed on ppc32 ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-2.6Linus Torvalds authored
* 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-2.6: acpi: fix crash in core ACPI code, triggered by CONFIG_ACPI_PCI_SLOT=y ACPI: thinkpad-acpi: don't misdetect in get_thinkpad_model_data() on -ENOMEM ACPI: thinkpad-acpi: bump up version to 0.21 ACPI: thinkpad-acpi: add bluetooth and WWAN rfkill support ACPI: thinkpad-acpi: WLSW overrides other rfkill switches ACPI: thinkpad-acpi: prepare for bluetooth and wwan rfkill support ACPI: thinkpad-acpi: consolidate wlsw notification function ACPI: thinkpad-acpi: minor refactor on radio switch init Revert "ACPI: don't walk tables if ACPI was disabled" Revert "dock: bay: Don't call acpi_walk_namespace() when ACPI is disabled." Revert "Fix FADT parsing" ACPI : Set FAN device to correct state in boot phase ACPI: Ignore _BQC object when registering backlight device ACPI: stop complaints about interrupt link End Tags and blank IRQ descriptors
-
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6Linus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: fixup sparse endianness warnings in proc.c PCI PM: make more PCI PM core functionality available to drivers PCI/DMAR: don't assume presence of RMRRs PCI hotplug: fix error path in pci_slot's register_slot
-
Bartlomiej Zolnierkiewicz authored
While at it: - fixup printk() messages in save_match() and hwif_init(). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Bartlomiej Zolnierkiewicz authored
IDE probing code used to skip devices attached to EATA SCSI HBA in ATA emulating mode but because of warm-plug support port I/O resources are no longer freed if no devices are detected on a port and the decision about the driver to use is left up to the user. Remove no longer valid EATA SCSI HBA quirk from do_identify(). Noticed-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Bartlomiej Zolnierkiewicz authored
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Bartlomiej Zolnierkiewicz authored
It is already done by task_no_data_intr() and there is no reason not to do it in other TASKFILE_NO_DATA data phase handlers. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
FUJITA Tomonori authored
This converts ide-scsi to use blk_get/put_request instead of kmalloc/kfree. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Jan Evert van Grootheest authored
Remove the ancient version history. Git does a better job. From: Jan Evert van Grootheest <j.e.van.grootheest@caiway.nl> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Jan Evert van Grootheest authored
Update email address. From: Jan Evert van Grootheest <j.e.van.grootheest@caiway.nl> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Jens Axboe authored
cdrom_read_capacity() will blindly return the capacity from the device without sanity-checking it. This later causes code in fs/buffer.c to oops. Fix this by checking that the device is telling us sensible things. From: Jens Axboe <jens.axboe@oracle.com> Cc: Michael Buesch <mb@bu3sch.de> Cc: Jan Kara <jack@suse.cz> Cc: Arnd Bergmann <arnd@arndb.de> Cc: <stable@kernel.org> Cc: Borislav Petkov <petkovbb@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [bart: print device name instead of driver name] Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> [harvey: blocklen is a big-endian value] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Bartlomiej Zolnierkiewicz authored
"gayle: reserve memory resources at once" patch temporary removed freeing of resources on failure (to ease convertion to ide_host_add() interface). This patch fixes it. Thanks to Geert for noticing the issue. Noticed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Sergei Shtylyov authored
This controller supports UltraDMA up to mode 5 but it should be clocked with at least twice the data strobe frequency, so enable mode 5 for 100+ MHz IDECLK. While at it, start passing the correct device to clk_get() -- it worked anyway but WTF? :-/ Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Harvey Harrison authored
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Harvey Harrison authored
If this is actually unaligned the access of speed/max_speed above is already broken and needs a get_unaligned. Otherwise it is aligned and they can be removed. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Bartlomiej Zolnierkiewicz authored
There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Bartlomiej Zolnierkiewicz authored
Prefix messages from IDE PCI host drivers by driver name instead of marketed chipset name (it is still possible to exactly identify the particular chipset basing on driver messages). As a bonus this provides nice code savings for some drivers: text data bss dec hex filename 3826 112 8 3946 f6a drivers/ide/pci/amd74xx.o.before 2786 112 8 2906 b5a drivers/ide/pci/amd74xx.o.after 764 108 0 872 368 drivers/ide/pci/cs5520.o.before 680 108 0 788 314 drivers/ide/pci/cs5520.o.after 1680 112 4 1796 704 drivers/ide/pci/generic.o.before 1155 112 4 1271 4f7 drivers/ide/pci/generic.o.after 7128 792 0 7920 1ef0 drivers/ide/pci/hpt366.o.before 6984 792 0 7776 1e60 drivers/ide/pci/hpt366.o.after 2800 148 0 2948 b84 drivers/ide/pci/pdc202xx_new.o.before 2523 148 0 2671 a6f drivers/ide/pci/pdc202xx_new.o.after 2831 148 0 2979 ba3 drivers/ide/pci/pdc202xx_old.o.before 2683 148 0 2831 b0f drivers/ide/pci/pdc202xx_old.o.after 3776 112 4 3892 f34 drivers/ide/pci/piix.o.before 2804 112 4 2920 b68 drivers/ide/pci/piix.o.after 4693 116 0 4809 12c9 drivers/ide/pci/siimage.o.before 4600 116 0 4716 126c drivers/ide/pci/siimage.o.after Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
Bartlomiej Zolnierkiewicz authored
While at it: * it821x_chipsets[] -> it821x_chipset. * Fix it821x_chipset's name field (as it is used for IT8211/8212). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-