commit 5047887caf1806f31652210df27fb62a7c43f27d Merge: 996abf0... 973b7d8... Author: Linus Torvalds Date: Fri Jul 25 11:08:17 2008 -0700 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (34 commits) powerpc: Wireup new syscalls Move update_mmu_cache() declaration from tlbflush.h to pgtable.h powerpc/pseries: Remove kmalloc call in handling writes to lparcfg powerpc/pseries: Update arch vector to indicate support for CMO ibmvfc: Add support for collaborative memory overcommit ibmvscsi: driver enablement for CMO ibmveth: enable driver for CMO ibmveth: Automatically enable larger rx buffer pools for larger mtu powerpc/pseries: Verify CMO memory entitlement updates with virtual I/O powerpc/pseries: vio bus support for CMO powerpc/pseries: iommu enablement for CMO powerpc/pseries: Add CMO paging statistics powerpc/pseries: Add collaborative memory manager powerpc/pseries: Utilities to set firmware page state powerpc/pseries: Enable CMO feature during platform setup powerpc/pseries: Split retrieval of processor entitlement data into a helper routine powerpc/pseries: Add memory entitlement capabilities to /proc/ppc64/lparcfg powerpc/pseries: Split processor entitlement retrieval and gathering to helper routines powerpc/pseries: Remove extraneous error reporting for hcall failures in lparcfg powerpc: Fix compile error with binutils 2.15 ... Fixed up conflict in arch/powerpc/platforms/52xx/Kconfig manually. commit 996abf053eec4d67136be8b911bbaaf989cfb99c Merge: 93082f0... d37e6bf... 
Author: Linus Torvalds Date: Fri Jul 25 11:02:17 2008 -0700 Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubi-2.6 * 'linux-next' of git://git.infradead.org/~dedekind/ubi-2.6: (22 commits) UBI: always start the background thread UBI: fix gcc warning UBI: remove pre-sqnum images support UBI: fix kernel-doc errors and warnings UBI: fix checkpatch.pl errors and warnings UBI: bugfix - do not torture PEB needlessly UBI: rework scrubbing messages UBI: implement multiple volumes rename UBI: fix and re-work debugging stuff UBI: amend commentaries UBI: fix error message UBI: improve mkvol request validation UBI: add ubi_sync() interface UBI: fix 64-bit calculations UBI: fix LEB locking UBI: fix memory leak on error path UBI: do not forget to free internal volumes UBI: fix memory leak UBI: avoid unnecessary division operations UBI: fix buffer padding ... commit 93082f0b15841b8926c38ef224d0e6f720000635 Author: Linus Torvalds Date: Fri Jul 25 10:56:36 2008 -0700 Fix ahci driver 'flags' type The new type checking of the flags arguments to irqsave and friends (commit 3f307891ce0e7b0438c432af1aacd656a092ff45) pointed out this thing with a big nice warning. Signed-off-by: Linus Torvalds commit f87bd330edf06fd49b3fbc368d90fb180375f2a2 Author: Dave Jiang Date: Fri Jul 25 01:49:14 2008 -0700 edac: mpc85xx fix pci ofdev 2nd pass Convert PCI err device from platform to open firmware of_dev to comply with powerpc schemes. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Dave Jiang Signed-off-by: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit fcb19171d196172a4f57e056f7a60e6d1e2e8c85 Author: Dave Jiang Date: Fri Jul 25 01:49:14 2008 -0700 edac: mv64x60 add pci fixup Fixup of missing bit 0 on 64360 PCIx_ERR_MASK and errata FEr-#11 and FEr-#16 for the 64460. Bit 0 must remain 0. 
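The "bit 0 must remain 0" rule can be illustrated with a minimal sketch. This is not the driver code: the helper name and the sample value are made up for illustration, and the sketch only shows the masking step, assuming the register is handled as a plain integer.

```python
# Hypothetical helper; the name and the sample value are illustrative only
# and are not taken from the mv64x60 EDAC driver.
RESERVED_BIT0 = 0x1


def sanitize_err_mask(value):
    """Return value with bit 0 cleared, leaving every other bit untouched."""
    return value & ~RESERVED_BIT0


requested = 0x0000FF01  # made-up mask with bit 0 set by mistake
print(hex(sanitize_err_mask(requested)))  # 0xff00
```

Applying the mask unconditionally is idempotent, so a value that already has bit 0 clear passes through unchanged.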
Signed-off-by: Dave Jiang Signed-off-by: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 596d3941035d4d4b484c820f10f57fd4816c6615 Author: Dave Jiang Date: Fri Jul 25 01:49:13 2008 -0700 edac: mv64x60 fix get_property Update get_property() call to use of_get_property() in order to fix compile Signed-off-by: Dave Jiang Signed-off-by: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 10d33e9c36827e5371479e55ef4089e000af2638 Author: Doug Thompson Date: Fri Jul 25 01:49:12 2008 -0700 edac: e752x fix too loud on nonmemory errors This module harvests more than just memory errors, it also harvests various bus and dma errors that the Chipset detects. Previously, it would report all such errors, which would cause output to be TOO loud. This patches therefore adds a parameter which is used to turn off NON-MEMORY error reports by default. Or the reporting can be enabled via the parameter Also did code style cleanup: less than 80 characters per line rule Signed-off-by: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 124682c78563e10ba8b2ecd21b0f1098903b7808 Author: Arthur Jones Date: Fri Jul 25 01:49:12 2008 -0700 edac: core fix added newline to sysfs dimm labels The channel DIMM label does not seem to be used much in the edac code. However, where it is used (in the core code), it is assumed to not have a newline embedded. This leaves the sysfs file newline free which looks funny when cat'ing it. Here we just add the trailing newline to the sysfs chX_dimm_label output... [Doug Thompson note: the DIMM label is one of the primary uses of EDAC. User space daemon scripts, edac-utils@sourceforge, populate the DIMM label fields, via /sys/devices/system/edac attributes, with the silk screen labels of the motherboard in use. dmidecode access BIOS tables, but BIOS tables are well known to be incorrect and useless in these respects. 
edac-utils will strip off any newlines before its use of the output, when displaying DIMM slot silk screen labels. Signed-off-by: Arthur Jones Signed-off-by: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f9fc82adca43d38a1b79128d80750bd361e15abe Author: Arthur Jones Date: Fri Jul 25 01:49:11 2008 -0700 edac: core fix static to dynamic kset Static kobjects and ksets are not supported in Linux kernel. Convert the mc_kset from static to dynamic. This patch depends on my previous patch to remove the module parameter attributes from mc... Signed-off-by: Arthur Jones Signed-off-by: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 327dafb1c61c9da7b95ac6cc7634a2340cc9509c Author: Arthur Jones Date: Fri Jul 25 01:49:10 2008 -0700 edac: core fix redundant sysfs controls to parameters /sys/devices/system/edac/mc has a few files which are duplicated in /sys/module/edac_core/parameters. Now that all the functionality is duplicated between these two locations, we remove the former kobject attributes and update the documentation. Signed-off-by: Arthur Jones Signed-off-by: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 096846e2b0ef39cb7c348f837f06984ef6ba8aa7 Author: Arthur Jones Date: Fri Jul 25 01:49:09 2008 -0700 edac: core fix workq timer When updating the edac_mc_poll_msec module parameter from the sysfs /sys/module/edac_core/parameters/edac_mc_poll_msec file, we don't update the workq timers. So that, if we move from a big poll time to a small one, the small one won't take effect until the big one has timed out. Here we provide a new module parameter set method to call out to the update routine. This brings the /sys/module/edac_core/parameters functionality up to that provided by the /sys/drivers/system/edac/mc sysfs module parameter files so that we can remove them or at least link to the /sys/module files... 
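The timer problem described above can be modelled with a small toy, shown here as a sketch rather than the EDAC implementation. The class and method names are hypothetical; the only point is the difference between storing a new period and also re-arming the pending deadline.

```python
class Poller:
    """Toy model of a periodic poll timer (not the EDAC code itself)."""

    def __init__(self, now_ms, period_ms):
        self.period_ms = period_ms
        self.deadline_ms = now_ms + period_ms

    def set_period_naive(self, now_ms, period_ms):
        # Only stores the value; the pending timer keeps its old deadline.
        self.period_ms = period_ms

    def set_period_rearm(self, now_ms, period_ms):
        # Stores the value and re-arms the pending timer from "now".
        self.period_ms = period_ms
        self.deadline_ms = now_ms + period_ms


# Start with a 60 s poll period, then ask for 100 ms one second later.
naive = Poller(now_ms=0, period_ms=60_000)
naive.set_period_naive(now_ms=1_000, period_ms=100)
print(naive.deadline_ms)  # 60000: the short period waits for the long one

fixed = Poller(now_ms=0, period_ms=60_000)
fixed.set_period_rearm(now_ms=1_000, period_ms=100)
print(fixed.deadline_ms)  # 1100: the short period takes effect at once
```

In the naive variant the next poll still fires at the old 60 s deadline, which matches the behaviour the commit message describes before the fix.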
Signed-off-by: Arthur Jones Signed-off-by: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 14cc571bb1d072d3f4be2875ea520ab03e093471 Author: Arthur Jones Date: Fri Jul 25 01:49:08 2008 -0700 edac: core fix to use dynamic kobject Static kobjects are not supported in linux kernel. Convert the edac_pci_top_main_kobj from static to dynamic. This avoids the double free of the edac_pci_top_main_kobj.name that we see on module reload of the e752x edac driver (and probably others as well). In addition Greg KH has pointed out that this code may be cleaned up significantly. I will look at that as a follow-on patch, for now, I just want the minimum fix to get this double-free oops bug squashed... Many thanks to Greg KH for his patience in showing me what the Documentation/kobject.txt already said (oops)... Signed-off-by: Arthur Jones Signed-off-by: Doug Thompson Acked-by: Greg Kroah-Hartman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b238e57723a6fb2c365fc35de5d7c48ccf9300cd Author: Arthur Jones Date: Fri Jul 25 01:49:08 2008 -0700 edac: i5100: cleanup Some code cleanliness issues found by Andrew Morton (thanks!) which should not affect functionality, but which should help make the code more maintainable. In particular, we now: * convert all #define's w/ a parameter to static inlines * use 1UL rather than 1ULL when calculating an unsigned long * use pci_disable_device The resulting code is tested and seems to work fine... Signed-off-by: Arthur Jones Cc: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 178d5a742291976d13bff55fa2b130879d4510de Author: Arthur Jones Date: Fri Jul 25 01:49:06 2008 -0700 edac: i5100 fix unmask ecc bits Explicitly unmask ECC errors we are interested in reporting. 
Signed-off-by: Arthur Jones Signed-off-by: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 43920a598f9358a12eb59eeddc4cd950f03aea8c Author: Arthur Jones Date: Fri Jul 25 01:49:06 2008 -0700 edac: i5100 fix enable ecc hardware It is possible that the BIOS did not enable ECC at boot time. We check for that case and fail to load if it is true. Signed-off-by: Arthur Jones Signed-off-by: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f7952ffcffa88c9a3fa92c26081f4ec9143c680f Author: Arthur Jones Date: Fri Jul 25 01:49:05 2008 -0700 edac: i5100 fix missing bits The error mask we use to trigger ECC notifications is missing many bits of interest. We add these bits here so that all possible ECC errors can be reported. Signed-off-by: Arthur Jones Signed-off-by: Doug Thompson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8f421c595a9145959d8aab09172743132abdffdb Author: Arthur Jones Date: Fri Jul 25 01:49:04 2008 -0700 edac: i5100 new intel chipset driver Preliminary support for the Intel 5100 MCH. CE and UE errors are reported along with the current DIMM label information and other memory parameters. Reasons why this is preliminary: 1) This chip has 2 independent memory controllers which, for best perforance, use interleaved accesses to the DDR2 memory. This architecture does not map very well to the current edac data structures which depend on symmetric channel access to the interleaved data. Without core changes, the best I could do for now is to map both memory controllers to different csrows (first all ranks of controller 0, then all ranks of controller 1). Someone much more familiar with the edac core than I will probably need to come up with a more general data structure to handle the interleaving and de-interleaving of the two memory controllers. 2) I have not yet tackled the de-interleaving of the rank/controller address space into the physical address space of the CPU. 
There is nothing fundamentally missing, it is just ending up to be a lot of code, and I'd rather keep it separate for now, especially since it doesn't work yet...

3) The code depends on a particular i5100 chip select to DIMM mainboard chip select mapping. This mapping seems obvious to me in order to support dual and single ranked memory, but it is not unique and DIMM labels could be wrong on other mainboards. There is no way to query this mapping that I know of.

4) The code requires that the i5100 is in 32GB mode. Only 4 ranks per controller, 2 ranks per DIMM are supported. I do not have hardware (nor do I expect to have hardware anytime soon) for the 48GB (6 ranks per controller) mode.

5) The serial presence detect code should be broken out into a "real" i2c driver so that decode-dimms.pl can work.

Signed-off-by: Arthur Jones
Signed-off-by: Doug Thompson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

commit 48e90761b570ff57f58b726229d229729949c5bb
Author: Miklos Szeredi
Date: Fri Jul 25 01:49:02 2008 -0700

fuse: lockd support

If a fuse filesystem doesn't define its own lock operations, then allow the lock manager to work with fuse. Adding lockd support for remote locking is also possible, but more rarely used, so leave it till later.

Signed-off-by: Miklos Szeredi
Cc: "J. Bruce Fields"
Cc: Trond Myklebust
Cc: Matthew Wilcox
Cc: David Teigland
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

commit 33670fa296860283f04a7975b8c790f101e43a6e
Author: Miklos Szeredi
Date: Fri Jul 25 01:49:02 2008 -0700

fuse: nfs export special lookups

Implement the get_parent export operation by sending a LOOKUP request with ".." as the name.

Implement looking up an inode by node ID after it has been evicted from the cache. This is done by sending a LOOKUP request with "." as the name (for all file types, not just directories).

The filesystem can set the FUSE_EXPORT_SUPPORT flag in the INIT reply, to indicate that it supports these special lookups.
Thanks to John Muir for the original implementation of this feature. Signed-off-by: Miklos Szeredi Cc: "J. Bruce Fields" Cc: Trond Myklebust Cc: Matthew Wilcox Cc: David Teigland Cc: Christoph Hellwig Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c180eebe1390c2076ead6a9bc95a02efb994edb7 Author: Miklos Szeredi Date: Fri Jul 25 01:49:01 2008 -0700 fuse: add fuse_lookup_name() helper Add a new helper function which sends a LOOKUP request with the supplied name. This will be used by the next patch to send special LOOKUP requests with "." and ".." as the name. Signed-off-by: Miklos Szeredi Cc: Christoph Hellwig Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit dbd561d236ff16f8143bc727d91758ddd190e8cb Author: Miklos Szeredi Date: Fri Jul 25 01:49:00 2008 -0700 fuse: add export operations Implement export_operations, to allow fuse filesystems to be exported to NFS. This feature has been in the out-of-tree fuse module, and is widely used and tested. It has not been originally merged into mainline, because doing the NFS export in userspace was thought to be a cleaner and more efficient way of doing it, than through the kernel. While that is true, it would also have involved a lot of duplicated effort at reimplementing NFS exporting (all the different versions of the protocol). This effort was unfortunately not undertaken by anyone, so we are left with doing it the easy but less efficient way. If this feature goes in, the out-of-tree fuse module can go away, which would have several advantages: - not having to maintain two versions - less confusion for users - no bugs due to kernel API changes Comment from hch: - Use the same fh_type values as XFS, since we use the same fh encoding. 
Signed-off-by: Miklos Szeredi Cc: Christoph Hellwig Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 0de6256daafa3a97a269995e9b29f956bd419bbf Author: Miklos Szeredi Date: Fri Jul 25 01:48:59 2008 -0700 fuse: prepare lookup for nfs export Use d_splice_alias() instead of d_add() in fuse lookup code, to allow NFS exporting. Signed-off-by: Miklos Szeredi Cc: Christoph Hellwig Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 764c76b371722e0cba5c24d91225f0f954b69d44 Author: Miklos Szeredi Date: Fri Jul 25 01:48:58 2008 -0700 locks: allow ->lock() to return FILE_LOCK_DEFERRED Allow filesystem's ->lock() method to call posix_lock_file() instead of posix_lock_file_wait(), and return FILE_LOCK_DEFERRED. This makes it possible to implement a such a ->lock() function, that works with the lock manager, which needs the call to be asynchronous. Now the vfs_lock_file() helper can be used, so this is a cleanup as well. Signed-off-by: Miklos Szeredi Cc: "J. Bruce Fields" Cc: Trond Myklebust Cc: Matthew Wilcox Cc: David Teigland Cc: Christoph Hellwig Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b648a6de00770cc325c22f43bdd4e935f6a2ee55 Author: Miklos Szeredi Date: Fri Jul 25 01:48:57 2008 -0700 locks: cleanup code duplication Extract common code into a function. Signed-off-by: Miklos Szeredi Cc: "J. Bruce Fields" Cc: Trond Myklebust Cc: Matthew Wilcox Cc: David Teigland Cc: Christoph Hellwig Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit bde74e4bc64415b142e556a34d295a52a1b7da9d Author: Miklos Szeredi Date: Fri Jul 25 01:48:57 2008 -0700 locks: add special return value for asynchronous locks Use a special error value FILE_LOCK_DEFERRED to mean that a locking operation returned asynchronously. 
This is returned by:

- posix_lock_file(), for sleeping locks, to mean that the lock has been queued on the block list and that the caller will be woken up when the lock might become available and needs to be retried (either fl_lmops->fl_notify() is called or fl_wait is woken up).

- f_op->lock(), to mean either the above, or that the filesystem will call back with fl_lmops->fl_grant() when the result of the locking operation is known. The filesystem can do this for sleeping as well as non-sleeping locks.

This makes sure that return values of -EAGAIN and -EINPROGRESS from filesystems are not mistaken for an asynchronous locking operation.

This also makes error handling in fs/locks.c and lockd/svclock.c slightly cleaner.

Signed-off-by: Miklos Szeredi
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Cc: Matthew Wilcox
Cc: David Teigland
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

commit cc77b1521d06be07c9bb1a4a3e1f775dcaa15093
Author: Miklos Szeredi
Date: Fri Jul 25 01:48:55 2008 -0700

lockd: dont return EAGAIN for a permanent error

Fix nlm_fopen() to return NLM_FAILED (or NLM_LCK_DENIED_NOLOCKS) instead of NLM_LCK_DENIED. The latter means the lock request failed because of a conflicting lock (i.e. a temporary error), which is wrong in this case.

Also fix the client to return ENOLCK instead of EAGAIN if a blocking lock request returns with NLM_LCK_DENIED.

Signed-off-by: Miklos Szeredi
Cc: Trond Myklebust
Cc: "J.
Bruce Fields" Cc: Matthew Wilcox Cc: David Teigland Cc: Christoph Hellwig Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b81f3ea92ba1fa676775677679889dc2a7f03c8b Author: Vegard Nossum Date: Fri Jul 25 01:48:55 2008 -0700 taskstats: remove initialization of static per-cpu variable Cc: Shailabh Nagar Signed-off-by: Vegard Nossum Cc: Balbir Singh Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 9b0975a20af1ff2f367e3b6b7c150eb114c6b500 Author: Keika Kobayashi Date: Fri Jul 25 01:48:54 2008 -0700 per-task-delay-accounting: update document and getdelays.c for memory reclaim Update document and make getdelays.c show delay accounting for memory reclaim. For making a distinction between "swapping in pages" and "memory reclaim" in getdelays.c, MEM is changed to SWAP. Signed-off-by: Keika Kobayashi Acked-by: Balbir Singh Cc: KOSAKI Motohiro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 016ae219b920c4e606088761d3d6070cdf8ba706 Author: Keika Kobayashi Date: Fri Jul 25 01:48:53 2008 -0700 per-task-delay-accounting: update taskstats for memory reclaim delay Add members for memory reclaim delay to taskstats, and accumulate them in __delayacct_add_tsk() . Signed-off-by: Keika Kobayashi Cc: Hiroshi Shimamoto Cc: Balbir Singh Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 873b47717732c2f33a4b14de02571a4295a02f0c Author: Keika Kobayashi Date: Fri Jul 25 01:48:52 2008 -0700 per-task-delay-accounting: add memory reclaim delay Sometimes, application responses become bad under heavy memory load. Applications take a bit time to reclaim memory. The statistics, how long memory reclaim takes, will be useful to measure memory usage. This patch adds accounting memory reclaim to per-task-delay-accounting for accounting the time of do_try_to_free_pages(). - When System is under low memory load, memory reclaim may not occur. 
$ free
             total       used       free     shared    buffers     cached
Mem:       8197800    1577300    6620500          0       4808    1516724
-/+ buffers/cache:      55768    8142032
Swap:     16386292          0   16386292

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b    swpd    free   buff   cache   si   so    bi    bo   in   cs us sy  id wa
 0  0       0 5069748  10612 3014060    0    0     0     0    3   26  0  0 100  0
 0  0       0 5069748  10612 3014060    0    0     0     0    4   22  0  0 100  0
 0  0       0 5069748  10612 3014060    0    0     0     0    3   18  0  0 100  0

Measure the time of the tar command.

$ ls -s test.dat
1501472 test.dat

$ time tar cvf test.tar test.dat
real    0m13.388s
user    0m0.116s
sys     0m5.304s

$ ./delayget -d -p
CPU        count   real total   virtual total   delay total
             428   5528345500      5477116080      62749891
IO         count   delay total
             338    8078977189
SWAP       count   delay total
               0             0
RECLAIM    count   delay total
               0             0

- When system is under heavy memory load, memory reclaim may occur.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b    swpd    free   buff   cache   si   so    bi    bo   in   cs us sy  id wa
 0  0 7159032   49724   1812    3012    0    0     0     0    3   24  0  0 100  0
 0  0 7159032   49724   1812    3012    0    0     0     0    4   24  0  0 100  0
 0  0 7159032   49848   1812    3012    0    0     0     0    3   22  0  0 100  0

In this case, one process uses more than 8G of memory by calling malloc() and memset().

$ time tar cvf test.tar test.dat
real    1m38.563s    <- increased by 85 sec
user    0m0.140s
sys     0m7.060s

$ ./delayget -d -p
CPU        count   real total   virtual total   delay total
            9021   7140446250      7315277975     923201824
IO         count   delay total
            8965   90466349669
SWAP       count   delay total
               3      21036367
RECLAIM    count   delay total
             740   61011951153

In the latter case, the RECLAIM values increase, so taskstats can show how much memory reclaim influences turnaround time (TAT).
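As a rough worked example, the RECLAIM figures from the heavy-load run can be turned into a total and a per-event average. This is a sketch, assuming the "delay total" columns are in nanoseconds; the helper name is made up for illustration and is not part of getdelays.c.

```python
# Figures copied from the heavy-load ./delayget run above.
# Assumption: the "delay total" column is in nanoseconds.
NSEC_PER_SEC = 1_000_000_000
NSEC_PER_MSEC = 1_000_000


def avg_delay_ms(count, delay_total_ns):
    """Mean delay per event in milliseconds; 0.0 when nothing was counted."""
    if count == 0:
        return 0.0
    return delay_total_ns / count / NSEC_PER_MSEC


reclaim_count = 740
reclaim_delay_total_ns = 61011951153

reclaim_total_s = reclaim_delay_total_ns / NSEC_PER_SEC
reclaim_avg_ms = avg_delay_ms(reclaim_count, reclaim_delay_total_ns)

print(f"reclaim delay total: {reclaim_total_s:.1f} s")   # 61.0 s
print(f"average per reclaim: {reclaim_avg_ms:.1f} ms")   # 82.4 ms
```

Under that assumption, about 61 seconds of reclaim delay were recorded in a run whose wall-clock time grew by about 85 seconds. The delay categories may overlap, so this should be read as an indication rather than a strict breakdown.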
Signed-off-by: Keika Kobayashi Acked-by: Balbir Singh Acked-by: KOSAKI Motohiro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3e85ba034deec351f02cb55ff225bbd616463841 Author: David Howells Date: Fri Jul 25 01:48:50 2008 -0700 tsacct: fix bacct_add_tsk()'s use of do_div() Fix bacct_add_tsk()'s use of do_div() on an s64 by making ac_etime a u64 instead and dividing that. Possibly this should be guarded lest the interval calculation turn up negative, but the possible negativity of the result of the division is cast away, and it shouldn't end up negative anyway. This was introduced by patch f3cef7a99469afc159fec3a61b42dc7ca5b6824f. Signed-off-by: David Howells Cc: Jay Lan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 297c5d92634c809cef23d73e7b2556f2528ff7e2 Author: Andrea Righi Date: Fri Jul 25 01:48:49 2008 -0700 task IO accounting: provide distinct tgid/tid I/O statistics Report per-thread I/O statistics in /proc/pid/task/tid/io and aggregate parent I/O statistics in /proc/pid/io. This approach follows the same model used to account per-process and per-thread CPU times. As a practial application, this allows for example to quickly find the top I/O consumer when a process spawns many child threads that perform the actual I/O work, because the aggregated I/O statistics can always be found in /proc/pid/io. [ Oleg Nesterov points out that we should check that the task is still alive before we iterate over the threads, but also says that we can do that fixup on top of this later. 
- Linus ]

Acked-by: Balbir Singh
Signed-off-by: Andrea Righi
Cc: Matt Heaton
Cc: Shailabh Nagar
Acked-by-with-comments: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

commit 0c18d7a5df82524e634637c3aec24d4cba096442
Author: Pavel Emelyanov
Date: Fri Jul 25 01:48:49 2008 -0700

bsdacct: fix and add comments around acct_process()

Fix the comment describing what this function does, and add one more about the absence of locking around the pid namespaces loop.

Signed-off-by: Pavel Emelyanov
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

commit 7d1e13505be8c2bd2207894f4e0f069e1f9b51c9
Author: Pavel Emelyanov
Date: Fri Jul 25 01:48:48 2008 -0700

bsdacct: account dying tasks in all relevant namespaces

This just makes acct_process() walk the pid namespaces from current up to the top and account a task in each one that has accounting turned on.

Accessing ns->parent without locking is safe, since current is still alive and holds its namespace, which in turn holds its parent.

Signed-off-by: Pavel Emelyanov
Cc: Balbir Singh
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

commit b5a7174875ea570cc675f2c503e800db8efdd6a7
Author: Pavel Emelyanov
Date: Fri Jul 25 01:48:47 2008 -0700

bsdacct: turn acct off for all pidns-s on umount time

All the bsd_acct_structs with opened accounting are linked into a global list. So, the acct_auto_close(_mnt) walks this list and drops the accounting for each.

Signed-off-by: Pavel Emelyanov
Cc: Balbir Singh
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

commit 0b6b030fc30d169bb406b34b4fc60d99dde4a9c6
Author: Pavel Emelyanov
Date: Fri Jul 25 01:48:47 2008 -0700

bsdacct: switch from global bsd_acct_struct instance to per-pidns one

Allocate the structure on the first call to sys_acct(). After this, each namespace that requested the accounting keeps this structure until its own death.
Two notes - routines, that close the accounting on fs umount time use the init_pid_ns's acct by now; - accounting routine accounts to dying task's namespace (also by now). Signed-off-by: Pavel Emelyanov Cc: Balbir Singh Cc: "Eric W. Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6248b1b342005a428b1247b4e89249da1528d88d Author: Pavel Emelyanov Date: Fri Jul 25 01:48:46 2008 -0700 bsdacct: make internal code work with passed bsd_acct_struct, not global This adds the appropriate pointer to all the internal (i.e. static) functions that work with global acct instance. API calls pass a global instance to them (while we still have such). Mostly this is a s/acct_globals./acct->/ over the file. Signed-off-by: Pavel Emelyanov Cc: Balbir Singh Cc: "Eric W. Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a75d97976517dcda69150fd81d6be86ae63324a1 Author: Pavel Emelyanov Date: Fri Jul 25 01:48:45 2008 -0700 bsdacct: turn the acct_lock from on-the-struct to global Don't use per-bsd-acct-struct lock, but work with a global one. This lock is taken for short periods, so it doesn't seem it'll become a bottleneck, but it will allow us to easily avoid many locking difficulties in the future. So this is a mostly s/acct_globals.lock/acct_lock/ over the file. Signed-off-by: Pavel Emelyanov Cc: Balbir Singh Cc: "Eric W. Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e59a04a7aa5ce2483470aee4f2eb79ba6b9afe8b Author: Pavel Emelyanov Date: Fri Jul 25 01:48:44 2008 -0700 bsdacct: make check timer accept a bsd_acct_struct argument We're going to have many bsd_acct_struct instances, not just one, so the timer (currently working with a global one) has to know which one to work with. Use a handy setup_timer macro for it (thanks to Oleg for one). Signed-off-by: Pavel Emelyanov Cc: Balbir Singh Cc: "Eric W. 
Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 1c552858ac2b1732a99d234d46b98098baef41ff Author: Pavel Emelyanov Date: Fri Jul 25 01:48:44 2008 -0700 bsdacct: "truthify" a comment near acct_process The acct_process does not accept any arguments actually. Signed-off-by: Pavel Emelyanov Cc: Balbir Singh Cc: "Eric W. Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 20fad13ac66ac001c19220d3d08b4de5b6cca6e1 Author: Pavel Emelyanov Date: Fri Jul 25 01:48:43 2008 -0700 pidns: add the struct bsd_acct_struct pointer on pid_namespace struct All the bsdacct-related info will be stored in the area, pointer by this one. It will be NULL automatically for all new namespaces. Signed-off-by: Pavel Emelyanov Cc: Balbir Singh Cc: "Eric W. Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 84406c153a5bfa5d8b428a0933e9d39db6b59a75 Author: Pavel Emelyanov Date: Fri Jul 25 01:48:42 2008 -0700 pidns: use kzalloc when allocating new pid_namespace struct It makes many fields initialization implicit helping in auto-setting #ifdef-ed fields (bsd-acct related pointer will be such). Signed-off-by: Pavel Emelyanov Cc: Balbir Singh Cc: "Eric W. Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 081e4c8a75692c21f3a119a81ca3270081879d0e Author: Pavel Emelyanov Date: Fri Jul 25 01:48:42 2008 -0700 bsdacct: rename acct_gbls to bsd_acct_struct After I fixed access to task->tgid in kernel/acct.c, Oleg pointed out some bad side effects with this accounting vs pid namespaces interaction. I.e. when some task in pid namespace sets this accounting up, this blocks all the others from doing the same. Restricting this to init namespace only could help, but didn't look a graceful solution. So here is the approach to make this accounting work with pid namespaces properly. The idea is simple - when a task dies it accounts itself in each namespace it is visible from and which set the accounting up. 
For example here are the commands run and the output of lastcomm from init and sub namespaces:

init_ns# accton pacct
sub_ns# accton pacct
(this is a different file - sub ns is run in a chroot-ed environment)
init_ns# cat /dev/null
sub_ns# ls /dev/null
init_ns# accton
sub_ns# accton

sub_ns# lastcomm -f pacct
ls         0     [136,0]    0.00 secs Thu May 15 10:30
accton     0     [136,0]    0.00 secs Thu May 15 10:30

init_ns# lastcomm -f pacct
accton     root  pts/0      0.00 secs Thu May 15 14:30 << got from sub
cat        root  pts/1      0.00 secs Thu May 15 14:30
ls         root  pts/0      0.00 secs Thu May 15 14:30 << got from sub
accton     root  pts/1      0.00 secs Thu May 15 14:30

That was the summary, the details are in patches.

This patch:

It will be visible in pid_namespace.h file, so fix its name to look better outside the acct.c file.

Signed-off-by: Pavel Emelyanov
Cc: Balbir Singh
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

commit 49b5cf34727a6c1be1568ab28e89a2d9a6bf51e0
Author: Jonathan Lim
Date: Fri Jul 25 01:48:40 2008 -0700

accounting: account for user time when updating memory integrals

Adapt acct_update_integrals() to include user time when calculating the time difference. The units of acct_rss_mem1 and acct_vm_mem1 are also changed from pages-jiffies to pages-usecs to avoid calling jiffies_to_usecs() in xacct_add_tsk() which might overflow.

Signed-off-by: Jonathan Lim
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

commit 7394f0f6c0baab650ea9194cb1be847df646fb57
Author: Adrian Bunk
Date: Fri Jul 25 01:48:40 2008 -0700

unexport uts_sem

With the removal of the Solaris binary emulation the export of uts_sem became unused.

Signed-off-by: Adrian Bunk
Acked-by: David S.
Miller Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a89cc1959d0ea5f36bf7421dc97b34f03809637d Author: Harvey Harrison Date: Fri Jul 25 01:48:39 2008 -0700 markers: fix sparse integer as NULL pointer warning kernel/trace/trace_sysprof.c:164:20: warning: Using plain integer as NULL pointer Signed-off-by: Harvey Harrison Cc: Mathieu Desnoyers Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 28325df0d9339b7f3aba9c45174d4586223ef46b Author: Mathieu Desnoyers Date: Fri Jul 25 01:48:38 2008 -0700 markers: use rcu_barrier_sched() and call_rcu_sched() rcu_barrier_sched() and call_rcu_sched() were introduced in 2.6.26 for the Markers. Change the marker code to use them. It can be seen as a fix since the marker code was using an ugly, temporary, #ifdef hack to work around CONFIG_PREEMPT_RCU. Signed-off-by: Mathieu Desnoyers Acked-by: Paul McKenney Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 24879a8e3e68f146d4d85528cc0b5dea712b77c5 Author: Matthias Kaehlcke Date: Fri Jul 25 01:48:38 2008 -0700 aoe: convert emsgs_sema into a completion ATA over Ethernet: The semaphore emsgs_sema is used for signalling an event, convert it in a completion. Signed-off-by: Matthias Kaehlcke Cc: "Ed L. Cashin" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit dbda0de52618d13d1b927c7ba7bb839cfddc4e8c Author: Pavel Emelyanov Date: Fri Jul 25 01:48:37 2008 -0700 pidns: remove find_task_by_pid, unused for a long time It seems to me that it was a mistake marking this function as deprecated and scheduling it for removal, rather than resolutely removing it after the last caller's death. Anyway - better late, then never. Signed-off-by: Pavel Emelyanov Cc: Oleg Nesterov Cc: "Eric W. Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e49859e71e0318b564de1546bdc30fab738f9deb Author: Pavel Emelyanov Date: Fri Jul 25 01:48:36 2008 -0700 pidns: remove now unused find_pid function. 
This one had the only users so far - the kill_proc, which is removed, so drop this (invalid in namespaced world) call too. And of course - erase all references on it from comments. Signed-off-by: Pavel Emelyanov Cc: Oleg Nesterov Cc: "Eric W. Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 19b0cfcca41dd772065671ad0584e1cea0f3fd13 Author: Pavel Emelyanov Date: Fri Jul 25 01:48:35 2008 -0700 pidns: remove now unused kill_proc function This function operated on a pid_t to kill a task, which is no longer valid in a containerized system. It has finally lost all its users and we can safely remove it from the tree. Signed-off-by: Pavel Emelyanov Cc: Oleg Nesterov Cc: "Eric W. Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 33166b1ffca5e1945246bcaa77d72a22b0d3e531 Author: Richard Kennedy Date: Fri Jul 25 01:48:35 2008 -0700 shrink struct pid by removing padding on 64 bit builds When struct pid is built on a 64 bit platform gcc has to insert padding to maintain the correct alignment, by simply reordering its members the memory usage shrinks from 88 bytes to 80. I've successfully run with this patch on my desktop AMD64 machine. There are no significant kernel size changes to a default config.X86_64 on the latest git v2.6.26-rc1 text data bss dec hex filename 5404828 976760 734280 7115868 6c945c vmlinux 5404811 976760 734280 7115851 6c944b vmlinux.pid-patch Acked-by: "Eric W. 
    Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 3ae4eed34be0177a8e003411a84e4ee212adbced
Author: Adrian Bunk
Date:   Fri Jul 25 01:48:34 2008 -0700

    proper pid{hash,map}_init() prototypes

    This patch adds proper prototypes for pid{hash,map}_init() in
    include/linux/pid_namespace.h

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 4ecb90090c84210a8bd2a9d7a5906e616735873c
Author: Stephen Hemminger
Date:   Fri Jul 25 01:48:32 2008 -0700

    sysctl: allow override of /proc/sys/net with CAP_NET_ADMIN

    Extend the permission check for networking sysctl's to allow
    modification when current process has CAP_NET_ADMIN capability and is
    not root. This version uses the until now unused permissions hook to
    override the mode value for /proc/sys/net if accessed by a user with
    capabilities.

    Found while working with Quagga. It is impossible to turn forwarding
    on/off through the command interface because Quagga uses secure coding
    practice of dropping privileges during initialization and only raising
    via capabilities when necessary. Since the daemon has reset
    real/effective uid after initialization, all attempts to access
    /proc/sys/net variables will fail.

    Signed-off-by: Stephen Hemminger
    Acked-by: "Eric W. Biederman"
    Cc: Chris Wright
    Cc: Alexey Dobriyan
    Cc: Andrew Morgan
    Cc: Pavel Emelyanov
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 99541c23cd32bacf1a591ca537a7c0cb9053ad7e
Author: Alexey Dobriyan
Date:   Fri Jul 25 01:48:31 2008 -0700

    sysctl: check for bogus modes

    Catch, e.g., 644/0644 typo.

    Signed-off-by: Alexey Dobriyan
    Acked-by: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 339caf2a224fc9af0f01686bf287dda32c6efca6
Author: David Sterba
Date:   Fri Jul 25 01:48:31 2008 -0700

    proc: misplaced export of find_get_pid

    Move EXPORT_SYMBOL right after the func

    Signed-off-by: David Sterba
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 6eedf8d30d2b48e86fbcee1a32fb2fa5f42219ee
Author: Alexey Dobriyan
Date:   Fri Jul 25 01:48:30 2008 -0700

    proc: move Kconfig to fs/proc/Kconfig

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit a9bd4a3e070ba7494f154e1a11687a8a957d88dc
Author: Alexey Dobriyan
Date:   Fri Jul 25 01:48:30 2008 -0700

    proc: remove pathetic remount code

    MS_RMT_MASK will unmask changes in do_remount_sb() anyway.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 881adb85358309ea9c6f707394002719982ec607
Author: Alexey Dobriyan
Date:   Fri Jul 25 01:48:29 2008 -0700

    proc: always do ->release

    Current two-stage scheme of removing PDE emphasizes one bug in proc:

        open
        rmmod
        remove_proc_entry
        close

    ->release won't be called because ->proc_fops were cleared. In simple
    cases it's small memory leak.

    For every ->open, ->release has to be done. List of openers is
    introduced which is traversed at remove_proc_entry() if needed.

    Discussions with Al long ago (sigh).

    Signed-off-by: Alexey Dobriyan
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 6e644c3126149b65460610fe5a00d8a162092abe
Author: Adrian Bunk
Date:   Fri Jul 25 01:48:28 2008 -0700

    move proc_kmsg_operations to fs/proc/internal.h

    This patch moves the extern of struct proc_kmsg_operations to
    fs/proc/internal.h and adds an #include "internal.h" to fs/proc/kmsg.c
    so that the latter sees the former.

    Signed-off-by: Adrian Bunk
    Cc: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit cd9a6f1078ed07fe919667b73e829f3bac485573
Author: Adrian Bunk
Date:   Fri Jul 25 01:48:28 2008 -0700

    unexport proc_clear_tty

    With the removal of the Solaris binary emulation the export of
    proc_clear_tty became unused.

    Signed-off-by: Adrian Bunk
    Acked-by: David S. Miller
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 25377479de7539fdc871a0f0ecaa39da42353bbc
Author: Akinobu Mita
Date:   Fri Jul 25 01:48:27 2008 -0700

    dell_rbu: use memory_read_from_buffer()

    Signed-off-by: Akinobu Mita
    Cc: Abhay Salunke
    Cc: Zhang Rui
    Cc: Matt Domsch
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit d991696263a704be7f41ac186f1a0ed17963c260
Author: Thomas Gleixner
Date:   Fri Jul 25 01:48:26 2008 -0700

    fs/partitions/efi: convert to pr_debug

    Convert the local Dprintk() compile time debug printk wrappers to the
    generic pr_debug() wrapper.

    Signed-off-by: Thomas Gleixner
    Cc: Matt Domsch
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 04ebd4aee52b06a2c38127d9208546e5b96f3a19
Author: Abdel Benamrouche
Date:   Fri Jul 25 01:48:26 2008 -0700

    block/ioctl.c and fs/partition/check.c: check value returned by add_partition()

    Now that add_partition() has been taught to propagate errors, let's
    check them.
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Abdel Benamrouche Cc: Jens Axboe Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit d805dda412346225a50af2d399d958a4bc676c38 Author: Abdel Benamrouche Date: Fri Jul 25 01:48:25 2008 -0700 fs/partition/check.c: fix return value warning fs/partitions/check.c:381: warning: ignoring return value of ___device_add___, declared with attribute warn_unused_result [akpm@linux-foundation.org: multiple-return-statements-per-function are evil] Signed-off-by: Abdel Benamrouche Cc: Jens Axboe Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit abe19b7b822a8fdbe3dbfd6e066d0698b4eefb06 Author: Akinobu Mita Date: Fri Jul 25 01:48:24 2008 -0700 dcdbas: use memory_read_from_buffer() Signed-off-by: Akinobu Mita Cc: Doug Warzecha Cc: Zhang Rui Cc: Matt Domsch Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f37e66173e0cc09b4e5a89eb0294abbefc15f435 Author: Akinobu Mita Date: Fri Jul 25 01:48:23 2008 -0700 firmware: use memory_read_from_buffer() Signed-off-by: Akinobu Mita Cc: Greg Kroah-Hartman Cc: Markus Rechberger Cc: Kay Sievers Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ec905a18656daa4d9300bad2bebc02d5dba7883d Author: Jiri Slaby Date: Fri Jul 25 01:48:23 2008 -0700 drivers/misc/phantom: note PCI Tell users that the driver is only for PCI devices to stop asking for support of firewire and parallel devices. 
    Signed-off-by: Jiri Slaby
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit ace7dd96695769f9d76980c7e52139e73228221c
Author: Jiri Slaby
Date:   Fri Jul 25 01:48:22 2008 -0700

    Char: mxser, various cleanups

    - remove unused macro
    - some whitespace cleanup
    - useless debug prints removal

    Signed-off-by: Jiri Slaby
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 1df0092477b8b2df605812e298624f5c35bb4805
Author: Jiri Slaby
Date:   Fri Jul 25 01:48:22 2008 -0700

    Char: mxser, remove predefined isa support

    Remove a support of ISA addresses predefined at compile time. It is
    unused (filled by zeroes) and prolongs the code.

    Don't initialize global array and add `ioaddr' module param
    description.

    Signed-off-by: Jiri Slaby
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 83766bc63f7e49b0215811026e7802bd09a9c7e1
Author: Jiri Slaby
Date:   Fri Jul 25 01:48:21 2008 -0700

    Char: mxser, prints cleanup

    - use dev_* for printing in pci probe function
    - move ISA prints directly into isa find function, do not postpone it.
      Remove macros bound to it then.
    - prepend some prints by "mxser: " to know what it belongs to

    Signed-off-by: Jiri Slaby
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 729f0edbecd0c59c82ee9bf92009acc7e984c425
Author: Jiri Slaby
Date:   Fri Jul 25 01:48:20 2008 -0700

    Char: mxser, update documentation

    Update Documentation/moxa-smartio to the later document from the mxser
    package.

    Signed-off-by: Jiri Slaby
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 72800df9ba3199df02a95b3830c49fbf16ec4a6d
Author: Jiri Slaby
Date:   Fri Jul 25 01:48:20 2008 -0700

    Char: mxser, globals cleanup

    - remove unused mxvar_diagflag
    - move mxser_msr into the only user/function
    - GMStatus, hmm, fix race-prone access to it. We need only one instance
      for real, not MXSER_PORTS. Move it to MOXA_GETMSTATUS ioctl.
- mxser_mon_ext, almost the same, but alloc it on heap, since it has more than 2 kilos. - fix indexing, `i' is not the index value, `i * MXSER_PORTS_PER_BOARD + j' is Signed-off-by: Jiri Slaby Acked-by: Alan Cox Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 41aee9a121fd0c31ae22dfe57e8f9ee9d6d85c25 Author: Jiri Slaby Date: Fri Jul 25 01:48:19 2008 -0700 Char: mxser, ioctl cleanup - remove break ctl from ioctl handler, it's never reached, since tty_ops->break_ctl is defined (mxser break handling is done in software) - mark MOXA_GET_MAJOR as deprecated - fix TIOCGICOUNT (some retval non-checks of put_user). Use copy_to_user to whole structure instead. Signed-off-by: Jiri Slaby Acked-by: Alan Cox Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6ee8928d94841aa764aeaf645ad16daff811dc26 Author: Akinobu Mita Date: Fri Jul 25 01:48:18 2008 -0700 nwflash: use simple_read_from_buffer() Signed-off-by: Akinobu Mita Cc: Russell King Cc: Tim Schmielau Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 236b8756a2b6f90498d45b2c36d43e5372f2d4b8 Author: Alan Cox Date: Fri Jul 25 01:48:17 2008 -0700 dsp56k: BKL pushdown Push the BKL down into the driver ioctl methods Signed-off-by: Alan Cox Cc: Jiri Slaby Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b8e35919653d76e7dceb8d3b8569c4ec1004d546 Author: Alan Cox Date: Fri Jul 25 01:48:17 2008 -0700 ds1302: push down the BKL into the driver ioctl code Signed-off-by: Alan Cox Cc: Jiri Kosina Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6d535d3e6ad395345750c361bd2b7f1b9429455d Author: Alan Cox Date: Fri Jul 25 01:48:16 2008 -0700 ppdev: wrap ioctl handler in driver and push lock down Signed-off-by: Alan Cox Cc: Jiri Slaby Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e05e9f7c4aeb82eaa23e46b29580ff514590c641 Author: Alan Cox Date: Fri Jul 25 01:48:16 2008 -0700 ixj: push BKL into driver and wrap ioctls Signed-off-by: Alan Cox Cc: 
Nishanth Aravamudan Cc: Domen Puncer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 11af7478addd34c42999b3b84095903ed9e67038 Author: Alan Cox Date: Fri Jul 25 01:48:15 2008 -0700 sx: push BKL down into the firmware ioctl handler Also fix the capability checking for firmware load. Signed-off-by: Alan Cox Cc: Jiri Slaby Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f6759fdcfd79ff1827fd5d4ddfe876164466d30d Author: Alan Cox Date: Fri Jul 25 01:48:14 2008 -0700 rio: push down the BKL into the firmware ioctl handler TTY side is already done. Signed-off-by: Alan Cox Cc: Jiri Slaby Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 909d145f0decbc4f17955e1fc4122a669a51fbc0 Author: Alan Cox Date: Fri Jul 25 01:48:14 2008 -0700 mwave: ioctl BKL pushdown Push the BKL down to the point it wraps the actual mwave method handlers Signed-off-by: Alan Cox Cc: Eric Sesterhenn Cc: Yani Ioannou Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 47be36a24defbd19aea1354c416ec99f291c7ab8 Author: Alan Cox Date: Fri Jul 25 01:48:13 2008 -0700 ip2: push BKL down for the firmware interface (The tty side is already done) Signed-off-by: Alan Cox Cc: Jiri Slaby Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 76528a42e2c5199a1208909318a9c9948d25d0b7 Author: Alan Cox Date: Fri Jul 25 01:48:12 2008 -0700 efirtc: push down the BKL Push it down as far as the EFI method calls. Someone who knows EFI can do the other bits. Also fix another wrong unknown ioctl return. Signed-off-by: Alan Cox Cc: Joe Perches Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 372572e9b1dcc5e36091199be63766d13e5a8ae0 Author: Adrian Bunk Date: Fri Jul 25 01:48:11 2008 -0700 #if 0 hpet_unregister() This patch #if 0's the unused hpet_unregister(). 
Signed-off-by: Adrian Bunk Acked-by: Clemens Ladisch Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8d1e120f695e9bcf01585e052577dc1e099033f9 Author: Adrian Bunk Date: Fri Jul 25 01:48:11 2008 -0700 proper extern for mwave_s_mdd This patch adds a proper extern for mwave_s_mdd in drivers/char/mwave/mwavedd.h Signed-off-by: Adrian Bunk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 79885b227740b9c7d3057f2de556f4098d37cc8f Author: Edgar E. Iglesias Date: Fri Jul 25 01:48:10 2008 -0700 elf: use ELF_CORE_EFLAGS for kcore ELF header flags ELF_CORE_EFLAGS is already used by the binfmt_elf coredumper to set correct arch specific ELF header flags on coredumps. Use it for kcore dumps as well. At the moment, this affects the CRIS and the H8300 arch. Signed-off-by: Edgar E. Iglesias Cc: Mikael Starvik Cc: Yoshinori Sato Cc: Ralf Baechle Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7833351b5260b3a58b54a0c2e7065001d986d749 Author: Adrian Bunk Date: Fri Jul 25 01:48:09 2008 -0700 pty: remove unused UNIX98_PTY_COUNT options The h8300 and sparc options somehow survived when the code stopped using CONFIG_UNIX98_PTY_COUNT. Reviewed-by: Robert P. J. Day Signed-off-by: Adrian Bunk Cc: Yoshinori Sato Cc: "David S. Miller" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 9eefe520c814f6f62c5d36a2ddcd3fb99dfdb30e Author: Nadia Derbey Date: Fri Jul 25 01:48:08 2008 -0700 ipc: do not use a negative value to re-enable msgmni automatic recomputing This patch proposes an alternative to the "magical positive-versus-negative number trick" Andrew complained about last week in http://lkml.org/lkml/2008/6/24/418. This had been introduced with the patches that scale msgmni to the amount of lowmem. With these patches, msgmni has a registered notification routine that recomputes msgmni value upon memory add/remove or ipc namespace creation/ removal. When msgmni is changed from user space (i.e. 
    value written to the proc file), that notification routine is
    unregistered, and the way to make it registered back is to write a
    negative value into the proc file. This is the "magical
    positive-versus-negative number trick".

    To fix this, a new proc file is introduced:
    /proc/sys/kernel/auto_msgmni. This file acts as ON/OFF for msgmni
    automatic recomputing.

    With this patch, the process is the following:

    1) kernel boots in "automatic recomputing mode"
       /proc/sys/kernel/msgmni contains the value that has been computed
       (depends on lowmem)
       /proc/sys/kernel/auto_msgmni contains "1"

    2) echo <value> > /proc/sys/kernel/msgmni
       . sets msg_ctlmni to <value>
       . de-activates automatic recomputing (i.e. if, say, some memory is
         added msgmni won't be recomputed anymore)
       . /proc/sys/kernel/auto_msgmni now contains "0"

    3) echo "0" > /proc/sys/kernel/auto_msgmni
       . de-activates msgmni automatic recomputing
         (this has the same effect as 2) except that msg_ctlmni's value
         stays blocked at its current value)

    4) echo "1" > /proc/sys/kernel/auto_msgmni
       . recomputes msgmni's value based on the current available memory
         size and number of ipc namespaces
       . re-activates automatic recomputing for msgmni.

    Signed-off-by: Nadia Derbey
    Cc: Solofo Ramangalahy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit f1a43f93f0f3bab418800eaccb9e2e3b5427e173
Author: Akinobu Mita
Date:   Fri Jul 25 01:48:07 2008 -0700

    ipc: use simple_read_from_buffer()

    Also this patch kills unnecessary trailing NULL character.

    Signed-off-by: Akinobu Mita
    Cc: Nadia Derbey
    Cc: Manfred Spraul
    Cc: Pierre Peiffer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 380af1b33b3ff92df5cda96329b58f5d1b6b5a53
Author: Manfred Spraul
Date:   Fri Jul 25 01:48:06 2008 -0700

    ipc/sem.c: rewrite undo list locking

    The attached patch:
    - reverses the locking order of ulp->lock and sem_lock:
      Previously, it was first ulp->lock, then inside sem_lock.
      Now it's the other way around.
    - converts the undo structure to rcu.

    Benefits:
    - With the old locking order, IPC_RMID could not kfree the undo
      structures. The stale entries remained in the linked lists and were
      released later.
    - The patch fixes a race in semtimedop(): if both IPC_RMID and a
      semget() that recreates exactly the same id happen between
      find_alloc_undo() and sem_lock, then semtimedop() would access
      already kfree'd memory.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Manfred Spraul
    Reviewed-by: Nadia Derbey
    Cc: Pierre Peiffer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit a1193f8ec091cd8fd309cc2982abe4499f6f2b4d
Author: Manfred Spraul
Date:   Fri Jul 25 01:48:06 2008 -0700

    ipc/sem.c: convert sem_array.sem_pending to struct list_head

    sem_array.sem_pending is a double linked list, the attached patch
    converts it to struct list_head.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Manfred Spraul
    Reviewed-by: Nadia Derbey
    Cc: Pierre Peiffer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 2c0c29d414087f3b021059673c20a7088f5f1fff
Author: Manfred Spraul
Date:   Fri Jul 25 01:48:05 2008 -0700

    ipc/sem.c: remove unused entries from struct sem_queue

    sem_queue.sma and sem_queue.id were never used, the attached patch
    removes them.

    Signed-off-by: Manfred Spraul
    Reviewed-by: Nadia Derbey
    Cc: Pierre Peiffer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 4daa28f6d8f5cda8ea0f55048e3c8811c384cbdd
Author: Manfred Spraul
Date:   Fri Jul 25 01:48:04 2008 -0700

    ipc/sem.c: convert undo structures to struct list_head

    The undo structures contain two linked lists, the attached patch
    replaces them with generic struct list_head lists.
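For readers unfamiliar with the pattern, here is a minimal user-space sketch of one record sitting on two generic lists at once. It is not the ipc/sem.c code: `struct undo`, its member names, and the helpers `list_len()` and `first_semid()` are assumptions made for illustration, and the list primitives are simplified copies of the kernel's `struct list_head` idiom.

```c
#include <assert.h>
#include <stddef.h>

/* User-space copy of the kernel's circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing object from a pointer to its embedded list node. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link 'entry' at the tail of the list anchored at 'head'. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	assert(entry && head);
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink 'entry' from whatever list it is on and leave it self-linked. */
static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	list_init(entry);
}

/* Illustrative undo record: one object linked into two lists at once. */
struct undo {
	struct list_head per_process;	/* all undos of one process */
	struct list_head per_array;	/* all undos of one semaphore array */
	int semid;
};

static int list_len(const struct list_head *head)
{
	int n = 0;

	for (const struct list_head *p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/* semid of the first undo on a per-array list, or -1 if the list is empty. */
static int first_semid(struct list_head *array_list)
{
	if (array_list->next == array_list)
		return -1;
	return container_of(array_list->next, struct undo, per_array)->semid;
}
```

Removing a record from one list with `list_del(&u.per_array)` leaves its membership in the other list untouched, which is the property the two embedded nodes provide.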
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Manfred Spraul Cc: Nadia Derbey Cc: Pierre Peiffer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 00c2bf85d8febfcfddde63822043462b026134ff Author: Nadia Derbey Date: Fri Jul 25 01:48:03 2008 -0700 ipc: get rid of ipc_lock_down() Remove the ipc_lock_down() routines: they used to call idr_find() locklessly (given that the ipc ids lock was already held), so they are not needed anymore. Signed-off-by: Nadia Derbey Acked-by: "Paul E. McKenney" Cc: Manfred Spraul Cc: Jim Houston Cc: Pierre Peiffer Acked-by: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 983bfb7db303cfde56ae5bbf4e0f2f46e38c9576 Author: Nadia Derbey Date: Fri Jul 25 01:48:03 2008 -0700 ipc: call idr_find() without locking in ipc_lock() Call idr_find() locklessly from ipc_lock(), since the idr tree is now RCU protected. Signed-off-by: Nadia Derbey Acked-by: "Paul E. McKenney" Cc: Manfred Spraul Cc: Jim Houston Cc: Pierre Peiffer Acked-by: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit cf481c20c476ad2c0febdace9ce23f5a4db19582 Author: Nadia Derbey Date: Fri Jul 25 01:48:02 2008 -0700 idr: make idr_remove rcu-safe Introduce the free_layer() routine: it is the one that actually frees memory after a grace period has elapsed. Signed-off-by: Nadia Derbey Reviewed-by: "Paul E. McKenney" Cc: Manfred Spraul Cc: Jim Houston Cc: Pierre Peiffer Acked-by: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f9c46d6ea5ce138a886c3a0f10a46130afab75f5 Author: Nadia Derbey Date: Fri Jul 25 01:48:01 2008 -0700 idr: make idr_find rcu-safe Make idr_find rcu-safe: it can now be called inside an rcu_read critical section. Signed-off-by: Nadia Derbey Reviewed-by: "Paul E. 
McKenney" Cc: Manfred Spraul Cc: Jim Houston Cc: Pierre Peiffer Acked-by: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3219b3b7456d5cf15ba7b1fe7b1bcf15ce8840e2 Author: Nadia Derbey Date: Fri Jul 25 01:48:00 2008 -0700 idr: make idr_get_new* rcu-safe Make the idr_get_new* routines rcu-safe. Signed-off-by: Nadia Derbey Reviewed-by: "Paul E. McKenney" Cc: Manfred Spraul Cc: Jim Houston Cc: Pierre Peiffer Acked-by: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 944ca05c7b4972f2ebf37262e0f4933d178ad6db Author: Nadia Derbey Date: Fri Jul 25 01:47:59 2008 -0700 idr: error checking factorization Do some code factorization in the return code analysis. Signed-off-by: Nadia Derbey Cc: "Paul E. McKenney" Cc: Manfred Spraul Cc: Jim Houston Cc: Pierre Peiffer Acked-by: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f098ad655f4dd8e3da98ffbeda9cedcc4459c01a Author: Nadia Derbey Date: Fri Jul 25 01:47:59 2008 -0700 idr: fix a printk call Fix the incomplete printk call. Signed-off-by: Nadia Derbey Reviewed-by: "Paul E. McKenney" Cc: Manfred Spraul Cc: Jim Houston Cc: Pierre Peiffer Acked-by: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4ae537892ab9858f71c78701f4651ad1ca531a1b Author: Nadia Derbey Date: Fri Jul 25 01:47:58 2008 -0700 idr: rename some of the idr APIs internal routines This is a trivial patch that renames: . alloc_layer to get_from_free_list since it idr_pre_get that actually allocates memory. . free_layer to move_to_free_list since memory is not actually freed there. This makes things more clear for the next patches. Signed-off-by: Nadia Derbey Reviewed-by: "Paul E. 
    McKenney"
    Cc: Manfred Spraul
    Cc: Jim Houston
    Cc: Pierre Peiffer
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 2027d1abc25ff770cc3bc936abd33570ce85d85a
Author: Nadia Derbey
Date:   Fri Jul 25 01:47:57 2008 -0700

    idr: change the idr structure

    After scalability problems have been detected when using the sysV
    ipcs, I have proposed to use an RCU based implementation of the IDR api
    instead (see threads http://lkml.org/lkml/2008/4/11/212 and
    http://lkml.org/lkml/2008/4/29/295).

    This resulted in many people asking to convert the idr API and make it
    rcu safe (because most of the code was duplicated and thus
    unmaintainable and unreviewable).

    So here is a first attempt.

    The important change wrt the idr API itself is during idr removes:
    idr layers are freed after a grace period, instead of being moved to
    the free list.

    The important change wrt ipcs is that idr_find() can now be called
    locklessly inside a rcu read critical section.

    Here are the results I've got for the pmsg test sent by Manfred:

       2.6.25-rc3-mm1   2.6.25-rc3-mm1+   2.6.25-mm1   Patched 2.6.25-mm1
    1         1168441           1064021       876000               947488
    2         1094264            921059      1549592              1730685
    3         2082520           1738165      1694370              2324880
    4         2079929           1695521       404553              2400408
    5         2898758            406566       391283              3246580
    6         2921417            261275       263249              3752148
    7         3308761            126056       191742              4243142
    8         3329456            100129       141722              4275780

    1st column: stock 2.6.25-rc3-mm1
    2nd column: 2.6.25-rc3-mm1 + ipc patches (store ipcs into idrs)
    3rd column: stock 2.6.25-mm1
    4th column: 2.6.25-mm1 + this patch series.

    This patch:

    Add an rcu_head to the idr_layer structure in order to free it after a
    grace period.

    Signed-off-by: Nadia Derbey
    Reviewed-by: "Paul E.
McKenney" Cc: Manfred Spraul Cc: Jim Houston Cc: Pierre Peiffer Acked-by: Rik van Riel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 95b68dec0d52c7b8fea3698b3938cf3ab936436b Author: Chandru Date: Fri Jul 25 01:47:55 2008 -0700 calgary iommu: use the first kernels TCE tables in kdump kdump kernel fails to boot with calgary iommu and aacraid driver on a x366 box. The ongoing dma's of aacraid from the first kernel continue to exist until the driver is loaded in the kdump kernel. Calgary is initialized prior to aacraid and creation of new tce tables causes wrong dma's to occur. Here we try to get the tce tables of the first kernel in kdump kernel and use them. While in the kdump kernel we do not allocate new tce tables but instead read the base address register contents of calgary iommu and use the tables that the registers point to. With these changes the kdump kernel and hence aacraid now boots normally. Signed-off-by: Chandru Siddalingappa Acked-by: Muli Ben-Yehuda Cc: Ingo Molnar Cc: Thomas Gleixner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8448502cfc915f70e3f8923849ade27d472044cb Author: Oleg Nesterov Date: Fri Jul 25 01:47:54 2008 -0700 workqueues: do CPU_UP_CANCELED if CPU_UP_PREPARE fails The bug was pointed out by Akinobu Mita , and this patch is based on his original patch. workqueue_cpu_callback(CPU_UP_PREPARE) expects that if it returns NOTIFY_BAD, _cpu_up() will send CPU_UP_CANCELED then. However, this is not true since "cpu hotplug: cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE" commit: a0d8cdb652d35af9319a9e0fb7134de2a276c636 The callback which has returned NOTIFY_BAD will not receive CPU_UP_CANCELED. Change the code to fulfil the CPU_UP_CANCELED logic if CPU_UP_PREPARE fails. 
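The rule described in this commit message (a callback that fails its prepare step is not sent the cancel event, so it has to undo its own partial setup) can be sketched in user space as follows. This is not the workqueue code: the enums, `NR_QUEUES`, `fail_at` and all helper names are hypothetical stand-ins for the hotplug events and per-queue thread creation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the hotplug events and notifier return codes. */
enum event  { UP_PREPARE, UP_CANCELED };
enum result { RES_OK, RES_BAD };

#define NR_QUEUES 3

static bool thread_created[NR_QUEUES];
static int fail_at = -1;	/* queue whose setup should fail, -1 for none */

static bool create_thread(int q)
{
	assert(q >= 0 && q < NR_QUEUES);
	if (q == fail_at)
		return false;
	thread_created[q] = true;
	return true;
}

static void cleanup_threads(void)
{
	for (int q = 0; q < NR_QUEUES; q++)
		thread_created[q] = false;
}

static int threads_alive(void)
{
	int n = 0;

	for (int q = 0; q < NR_QUEUES; q++)
		n += thread_created[q];
	return n;
}

static enum result cpu_callback(enum event ev)
{
	switch (ev) {
	case UP_PREPARE:
		for (int q = 0; q < NR_QUEUES; q++) {
			if (!create_thread(q)) {
				/*
				 * A callback that reports failure is not sent
				 * UP_CANCELED, so undo the partial setup here.
				 */
				cleanup_threads();
				return RES_BAD;
			}
		}
		return RES_OK;
	case UP_CANCELED:
		cleanup_threads();
		return RES_OK;
	}
	return RES_OK;
}
```

Sharing `cleanup_threads()` between the failure path and the `UP_CANCELED` case keeps the two rollback paths from drifting apart.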
Signed-off-by: Oleg Nesterov Reported-by: Akinobu Mita Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8de6d308bab4f67fcf953562f9f08f9527cad72d Author: Oleg Nesterov Date: Fri Jul 25 01:47:53 2008 -0700 workqueues: schedule_on_each_cpu() can use schedule_work_on() schedule_on_each_cpu() can use schedule_work_on() to avoid the code duplication. Signed-off-by: Oleg Nesterov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ef1ca236b8d645349ed6569598ae3f6c1b9511c0 Author: Oleg Nesterov Date: Fri Jul 25 01:47:53 2008 -0700 workqueues: queue_work() can use queue_work_on() queue_work() can use queue_work_on() to avoid the code duplication. Signed-off-by: Oleg Nesterov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a67da70dc0955580665f5444f318b92e69a3c272 Author: Oleg Nesterov Date: Fri Jul 25 01:47:52 2008 -0700 workqueues: lockdep annotations for flush_work() Add lockdep annotations to flush_work() and update the comment. Signed-off-by: Oleg Nesterov Cc: Jarek Poplawski Acked-by: Johannes Berg Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 69b895fd13d73aebf62b75502eb6513d43057ba3 Author: Oleg Nesterov Date: Fri Jul 25 01:47:51 2008 -0700 S390 topology: don't use kthread() for arch_reinit_sched_domains() Now that it is safe to use get_online_cpus() we can revert [S390] cpu topology: Fix possible deadlock. commit: fd781fa25c9e9c6fd1599df060b05e7c4ad724e5 and call arch_reinit_sched_domains() directly from topology_work_fn(). 
Signed-off-by: Oleg Nesterov Cc: Gautham R Shenoy Tested-by: Heiko Carstens Cc: Max Krasnyansky Cc: Paul Jackson Cc: Paul Menage Cc: Peter Zijlstra Cc: Vegard Nossum Cc: Martin Schwidefsky Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3da1c84c00c7e5fa8348336bd8c342f9128b0f14 Author: Oleg Nesterov Date: Fri Jul 25 01:47:50 2008 -0700 workqueues: make get_online_cpus() useable for work->func() workqueue_cpu_callback(CPU_DEAD) flushes cwq->thread under cpu_maps_update_begin(). This means that the multithreaded workqueues can't use get_online_cpus() due to the possible deadlock, very bad and very old problem. Introduce the new state, CPU_POST_DEAD, which is called after cpu_hotplug_done() but before cpu_maps_update_done(). Change workqueue_cpu_callback() to use CPU_POST_DEAD instead of CPU_DEAD. This means that create/destroy functions can't rely on get_online_cpus() any longer and should take cpu_add_remove_lock instead. [akpm@linux-foundation.org: fix CONFIG_SMP=n] Signed-off-by: Oleg Nesterov Acked-by: Gautham R Shenoy Cc: Heiko Carstens Cc: Max Krasnyansky Cc: Paul Jackson Cc: Paul Menage Cc: Peter Zijlstra Cc: Vegard Nossum Cc: Martin Schwidefsky Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8616a89ab761239c963eea3a63be383f127cc7e8 Author: Oleg Nesterov Date: Fri Jul 25 01:47:49 2008 -0700 workqueues: schedule_on_each_cpu: use flush_work() Change schedule_on_each_cpu() to use flush_work() instead of flush_workqueue(), this way we don't wait for other work_struct's which can be queued meanwhile. 
    Signed-off-by: Oleg Nesterov
    Cc: Jarek Poplawski
    Cc: Max Krasnyansky
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit db700897224b5ebdf852f2d38920ce428940d059
Author: Oleg Nesterov
Date:   Fri Jul 25 01:47:49 2008 -0700

    workqueues: implement flush_work()

    Most users of flush_workqueue() can be changed to use
    cancel_work_sync(), but sometimes we really need to wait for the
    completion and cancelling is not an option. schedule_on_each_cpu() is a
    good example.

    Add the new helper, flush_work(work), which waits for the completion
    of the specific work_struct. More precisely, it "flushes" the result of
    the last queue_work() which is visible to the caller.

    For example, this code

        queue_work(wq, work);
        /* WINDOW */
        queue_work(wq, work);

        flush_work(work);

    doesn't necessarily work "as expected". What can happen in the WINDOW
    above is

        - wq starts the execution of work->func()

        - the caller migrates to another CPU

    now, after the 2nd queue_work() this work is active on the previous
    CPU, and at the same time it is queued on another. In this case
    flush_work(work) may return before the first work->func() completes.

    It is trivial to add another helper

        int flush_work_sync(struct work_struct *work)
        {
            return flush_work(work) || wait_on_work(work);
        }

    which works "more correctly", but it has to iterate over all CPUs and
    thus it is much slower than flush_work().

    Signed-off-by: Oleg Nesterov
    Acked-by: Max Krasnyansky
    Acked-by: Jarek Poplawski
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 1a4d9b0aa0d3c50314e57525a5e5ec2cfc48b4c8
Author: Oleg Nesterov
Date:   Fri Jul 25 01:47:47 2008 -0700

    workqueues: insert_work: use "list_head *" instead of "int tail"

    insert_work() inserts the new work_struct before or after
    cwq->worklist, depending on the "int tail" parameter. Change it to
    accept "list_head *" instead, this shrinks .text a bit and allows us to
    insert the barrier after specific work_struct.
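The position-based insert described above can be sketched in user space as follows. This is an illustration under assumptions, not the kernel's insert_work(): `insert_before()`, `struct work` with its `id` field, and `nth_id()` are hypothetical names, and the list node is a simplified copy of the kernel's `struct list_head`.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link 'entry' immediately before 'pos'; the caller picks the position. */
static void insert_before(struct list_head *entry, struct list_head *pos)
{
	assert(entry && pos);
	entry->prev = pos->prev;
	entry->next = pos;
	pos->prev->next = entry;
	pos->prev = entry;
}

/* Hypothetical work item; 'id' only exists so the order can be checked. */
struct work {
	struct list_head entry;
	int id;
};

/* Return the id of the n-th item (0-based), or -1 if the list is shorter. */
static int nth_id(struct list_head *head, int n)
{
	for (struct list_head *p = head->next; p != head; p = p->next, n--) {
		if (n == 0) {
			struct work *w = (struct work *)
				((char *)p - offsetof(struct work, entry));
			return w->id;
		}
	}
	return -1;
}
```

With a position argument one function covers three cases: `insert_before(&w.entry, &queue)` appends at the tail, `insert_before(&w.entry, queue.next)` pushes to the front, and `insert_before(&w.entry, other.entry.next)` places `w` directly after `other`, which a head-or-tail flag cannot express.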
Signed-off-by: Oleg Nesterov Cc: Jarek Poplawski Cc: Max Krasnyansky Cc: Peter Zijlstra Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 565b9b14e7f48131bca58840aa404bbef058fa89 Author: Oleg Nesterov Date: Fri Jul 25 01:47:47 2008 -0700 coredump: format_corename: fix the "core_uses_pid" logic I don't understand why the multi-thread coredump implies the core_uses_pid behaviour, but we shouldn't use mm->mm_users for that. This counter can be incremented by get_task_mm(). Use the valued returned by coredump_wait() instead. Also, remove the "const char *pattern" argument, format_corename() can use core_pattern directly. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Oleg Nesterov Cc: Roland McGrath Cc: Alan Cox Cc: Andi Kleen Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a94e2d408eaedbd85aae259621d46fafc10479a2 Author: Oleg Nesterov Date: Fri Jul 25 01:47:46 2008 -0700 coredump: kill mm->core_done Now that we have core_state->dumper list we can use it to wake up the sub-threads waiting for the coredump completion. This uglifies the code and .text grows by 47 bytes, but otoh mm_struct lessens by sizeof(struct completion). Also, with this change we can decouple exit_mm() from the coredumping code. Signed-off-by: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 182c515fd2a942623aed4e4e0e0b37fe96571b05 Author: Oleg Nesterov Date: Fri Jul 25 01:47:45 2008 -0700 coredump: elf_fdpic_core_dump: use core_state->dumper list Kill the nasty rcu_read_lock() + do_each_thread() loop, use the list encoded in mm->core_state instead, s/GFP_ATOMIC/GFP_KERNEL/. This patch allows futher cleanups in binfmt_elf_fdpic.c. 
Signed-off-by: Oleg Nesterov Acked-by: Roland McGrath Cc: David Howells Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 83914441f94c6f2cd468ca97365f6c34f418706e Author: Oleg Nesterov Date: Fri Jul 25 01:47:45 2008 -0700 coredump: elf_core_dump: use core_state->dumper list Kill the nasty rcu_read_lock() + do_each_thread() loop, use the list encoded in mm->core_state instead, s/GFP_ATOMIC/GFP_KERNEL/. This patch allows further cleanups in binfmt_elf.c, in particular we can kill the parallel info->threads list. Signed-off-by: Oleg Nesterov Acked-by: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b564daf806d492dd4f7afe9b6c83b8d35d137669 Author: Oleg Nesterov Date: Fri Jul 25 01:47:44 2008 -0700 coredump: construct the list of coredumping threads at startup time binfmt->core_dump() has to iterate over all the threads in the system in order to find the coredumping threads and construct the list using the GFP_ATOMIC allocations. With this patch each thread allocates the list node on exit_mm()'s stack and adds itself to the list. This allows us to do further changes: - simplify ->core_dump() - change exit_mm() to clear ->mm first, then wait for ->core_done. This makes the coredumping process visible to oom_kill - kill mm->core_done Signed-off-by: Oleg Nesterov Acked-by: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 9d5b327bf198d2720666de958dcc2ae219d86952 Author: Oleg Nesterov Date: Fri Jul 25 01:47:43 2008 -0700 coredump: make mm->core_state visible to ->core_dump() Move the "struct core_state core_state" from coredump_wait() to do_coredump(), this makes mm->core_state visible to binfmt->core_dump().
Signed-off-by: Oleg Nesterov Acked-by: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c5f1cc8c1828486a61ab3e575da6e2c62b34d399 Author: Oleg Nesterov Date: Fri Jul 25 01:47:42 2008 -0700 coredump: turn core_state->nr_threads into atomic_t Turn core_state->nr_threads into atomic_t and kill now unneeded down_write(&mm->mmap_sem) in exit_mm(). Signed-off-by: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8cd9c249128a59e8e833d454a784b0cbd338d468 Author: Oleg Nesterov Date: Fri Jul 25 01:47:42 2008 -0700 coredump: simplify core_state->nr_threads calculation Change zap_process() to return int instead of incrementing mm->core_state->nr_threads directly. Change zap_threads() to set mm->core_state only on success. This patch restores the original size of .text, and more importantly now ->nr_threads is used in two places only. Signed-off-by: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 999d9fc1670bc082928b93b11d1f2e0e417d973c Author: Oleg Nesterov Date: Fri Jul 25 01:47:41 2008 -0700 coredump: move mm->core_waiters into struct core_state Move mm->core_waiters into "struct core_state" allocated on stack. This shrinks mm_struct a little bit and allows further changes. This patch mostly does s/core_waiters/core_state. The only essential change is that coredump_wait() must clear mm->core_state before return. The coredump_wait()'s path is uglified and .text grows by 30 bytes, this is fixed by the next patch. Signed-off-by: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 32ecb1f26dd50eeaac4e3f4dea4541c97848e459 Author: Oleg Nesterov Date: Fri Jul 25 01:47:41 2008 -0700 coredump: turn mm->core_startup_done into the pointer to struct core_state mm->core_startup_done points to "struct completion startup_done" allocated on the coredump_wait()'s stack. 
Introduce the new structure, core_state, which holds this "struct completion". This way we can add more info visible to the threads participating in coredump without enlarging mm_struct. No changes in affected .o files. Signed-off-by: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 24d5288f06ed8b3a368eba967d587cdb012dfdf7 Author: Oleg Nesterov Date: Fri Jul 25 01:47:40 2008 -0700 coredump: elf_core_dump: skip kernel threads linux_binfmt->core_dump() runs before the process does exit_aio(), this means that we can hit the kernel thread which shares the same ->mm. Afaics, nothing really bad can happen, but perhaps it makes sense to fix this minor bug. It is sad we have to iterate over all threads in the system and use GFP_ATOMIC. Hopefully we can kill these ugly do_each_thread()s, but this needs some nontrivial changes in mm_struct and do_coredump. Signed-off-by: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 15b9f360c0316c06d37c09b02d85565edbaf9dd3 Author: Oleg Nesterov Date: Fri Jul 25 01:47:39 2008 -0700 coredump: zap_threads() must skip kernel threads The main loop in zap_threads() must skip kthreads which may use the same mm. Otherwise we "kill" this thread erroneously (for example, it can not fork or exec after that), and the coredumping task gets stuck in the TASK_UNINTERRUPTIBLE state forever because of the wrong ->core_waiters count. Signed-off-by: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 246bb0b1deb29726990620d8b5e55ca29f331362 Author: Oleg Nesterov Date: Fri Jul 25 01:47:38 2008 -0700 kill PF_BORROWED_MM in favour of PF_KTHREAD Kill PF_BORROWED_MM. Change use_mm/unuse_mm to not play with ->flags, and do s/PF_BORROWED_MM/PF_KTHREAD/ for a couple of other users. No functional changes yet. But this allows us to do further fixes/cleanups.
oom_kill/ptrace/etc often check "p->mm != NULL" to filter out the kthreads, this is wrong because of use_mm(). The problem with PF_BORROWED_MM is that we need task_lock() to avoid races. With this patch we can check PF_KTHREAD directly, or use a simple lockless helper: /* The result must not be dereferenced !!! */ struct mm_struct *__get_task_mm(struct task_struct *tsk) { if (tsk->flags & PF_KTHREAD) return NULL; return tsk->mm; } Note also ecard_task(). It runs with ->mm != NULL, but it's the kernel thread without PF_BORROWED_MM. Signed-off-by: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7b34e4283c685f5cc6ba6d30e939906eee0d4bcf Author: Oleg Nesterov Date: Fri Jul 25 01:47:37 2008 -0700 introduce PF_KTHREAD flag Introduce the new PF_KTHREAD flag to mark the kernel threads. It is set by INIT_TASK() and copied to the forked childs (we could set it in kthreadd() along with PF_NOFREEZE instead). daemonize() was changed as well. In that case testing of PF_KTHREAD is racy, but daemonize() is hopeless anyway. This flag is cleared in do_execve(), before search_binary_handler(). Probably not the best place, we can do this in exec_mmap() or in start_thread(), or clear it along with PF_FORKNOEXEC. But I think this doesn't matter in practice, and if do_execve() fails kthread should die soon. Signed-off-by: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3d749b9e676b26584a47e75c235aa6f69d0697ae Author: Oleg Nesterov Date: Fri Jul 25 01:47:37 2008 -0700 ptrace: simplify ptrace_stop()->sigkill_pending() path 1. SIGKILL can't be blocked, remove this check from sigkill_pending(). 2. When ptrace_stop() sees sigkill_pending() == T, it can just return. Kill "int killed" and simplify the code. This also is more correct, the tracer shouldn't see us in TASK_TRACED if we are not going to stop. I strongly believe this code needs further changes. 
We should do the "was this task killed" check unconditionally, currently it depends on arch_ptrace_stop_needed(). On the other hand, sigkill_pending() isn't very clever. If the task was killed tkill(SIGKILL), the signal can be already dequeued if the caller is do_exit(). Signed-off-by: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 364d3c13c17f45da6d638011078d4c4d3070d719 Author: Oleg Nesterov Date: Fri Jul 25 01:47:36 2008 -0700 ptrace: give more respect to SIGKILL ptrace_stop() has some complicated checks to prevent the scheduling in the TASK_TRACED state with the pending SIGKILL, but these checks are racy, and they depend on arch_ptrace_stop_needed(). This patch assumes that the traced task should die asap if it was killed by SIGKILL, in that case schedule()->signal_pending_state() has no reason to ignore the TASK_WAKEKILL part of TASK_TRACED, and we can kill this nasty special case. Note: do_exit()->ptrace_notify() is special, the killed task can already dequeue SIGKILL at this point. Another indication that fatal_signal_pending() is not exactly right. 
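The rule this patch argues for (a task sleeping in TASK_TRACED should not stay asleep when SIGKILL is pending) can be modelled in userspace. This is a simplified sketch under stated assumptions, not the kernel's signal_pending_state(): the DEMO_* bit values, struct demo_task and demo_signal_pending_state() are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative state bits; the values are chosen for this sketch only. */
enum {
	DEMO_INTERRUPTIBLE   = 0x01,
	DEMO_UNINTERRUPTIBLE = 0x02,
	DEMO_TRACED_BIT      = 0x08,
	DEMO_WAKEKILL        = 0x80,
	/* A traced task sleeps "killably": the wakekill bit is part of the state. */
	DEMO_TRACED          = DEMO_WAKEKILL | DEMO_TRACED_BIT
};

/* Hypothetical, heavily reduced view of a task's pending signals. */
struct demo_task {
	int signal_pending;	/* nonzero if any signal is pending */
	int sigkill_pending;	/* nonzero if a fatal SIGKILL is pending */
};

/*
 * Returns 1 if a task about to sleep in "state" should stay runnable.
 * There is no special case for the traced state: the wakekill bit alone
 * decides whether a pending SIGKILL prevents the sleep.
 */
int demo_signal_pending_state(long state, const struct demo_task *p)
{
	assert(p != NULL);
	if (!(state & (DEMO_INTERRUPTIBLE | DEMO_WAKEKILL)))
		return 0;
	if (!p->signal_pending)
		return 0;
	return (state & DEMO_INTERRUPTIBLE) || p->sigkill_pending;
}
```

In this model a traced task with only a non-fatal signal pending still goes to sleep, while one with SIGKILL pending does not.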
Signed-off-by: Oleg Nesterov Cc: Ingo Molnar Cc: Matthew Wilcox Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f22ab814a24e654b1de24db0c5f8b57b5ab2026a Author: Adrian Bunk Date: Fri Jul 25 01:47:34 2008 -0700 include/asm/ptrace.h userspace headers cleanup This patch contains the following cleanups for the asm/ptrace.h userspace headers: - include/asm-generic/Kbuild.asm already lists ptrace.h, remove the superfluous listings in the Kbuild files of the following architectures: - cris - frv - powerpc - x86 - don't expose function prototypes and macros to userspace: - arm - blackfin - cris - mn10300 - parisc - remove #ifdef CONFIG_'s around #define's: - blackfin - m68knommu - sh: AFAIK __SH5__ should work in both kernel and userspace, no need to leak CONFIG_SUPERH64 to userspace - xtensa: cosmetical change to remove empty #ifndef __ASSEMBLY__ #else #endif from the userspace headers Not changed by this patch is the fact that the following architectures have a different struct pt_regs depending on CONFIG_ variables: - h8300 - m68knommu - mips This does not work in userspace. Signed-off-by: Adrian Bunk Cc: Cc: Roland McGrath Cc: Oleg Nesterov Acked-by: Greg Ungerer Acked-by: Paul Mundt Acked-by: Grant Grundler Acked-by: Jesper Nilsson Acked-by: Chris Zankel Acked-by: David Howells Acked-by: Paul Mackerras Acked-by: Russell King Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit bc64efd220dcd4449aef8dd2564d73127b583b09 Author: Gustavo Fernando Padovan Date: Fri Jul 25 01:47:33 2008 -0700 kernel/signal.c: change vars pid and tgid types to pid_t Change the type of pid and tgid variables from int to the POSIX type pid_t. Signed-off-by: Gustavo F. 
Padovan Cc: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit d8878ba3f05ae5bbfad5a6e72e5121c0ea35f989 Author: Michael Kerrisk Date: Fri Jul 25 01:47:32 2008 -0700 signals: make siginfo_t si_utime + si_sstime report times in USER_HZ, not HZ In the switch to configurable HZ in 2.6, the treatment of the si_utime and si_stime fields that are exposed to userland via the siginfo structure looks to have been botched. As things stand, these fields report times in units of HZ, so that userland gets information that varies depending on the HZ that the kernel was configured with. This patch changes the reported values to use USER_HZ units. Signed-off-by: Michael Kerrisk Acked-by: Oleg Nesterov Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e4901f92a8dbe843e76651a50f7a2a6dd3d53474 Author: Oleg Nesterov Date: Fri Jul 25 01:47:31 2008 -0700 coredump: zap_threads: comments && use while_each_thread() No changes in fs/exec.o The for_each_process() loop in zap_threads() is very subtle, it is not clear why we don't race with fork/exit/exec. Add the fat comment. Also, change the code to use while_each_thread(). Signed-off-by: Oleg Nesterov Acked-by: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 2b201a9eddf509e8e935b45e573648e36f4b623f Author: Oleg Nesterov Date: Fri Jul 25 01:47:31 2008 -0700 signals: do_signal_stop: kill the SIGNAL_UNKILLABLE check fae5fa44f1fd079ffbed8e0add929dd7bbd1347f changed do_signal_stop() to check SIGNAL_UNKILLABLE, this wasn't needed. If signal_group_exit() == F, the signal sent to SIGNAL_UNKILLABLE task must be already filtered out by the caller, get_signal_to_deliver(). And if signal_group_exit() == T we are not going to stop. 
Signed-off-by: Oleg Nesterov Acked-by: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 92413d771e7123304fb4b9efd2a00cccc946e383 Author: Oleg Nesterov Date: Fri Jul 25 01:47:30 2008 -0700 signals: dequeue_signal: don't check SIGNAL_GROUP_EXIT when setting SIGNAL_STOP_DEQUEUED dequeue_signal() checks SIGNAL_GROUP_EXIT before setting SIGNAL_STOP_DEQUEUED. This was added by 788e05a67c343fa22f2ae1d3ca264e7f15c25eaf long ago to avoid the coredump/SIGSTOP race. Since then the related code was changed, and now this subtle check is both incomplete and unneeded at the same time. It is incomplete because nowadays exec() doesn't set SIGNAL_GROUP_EXIT, so in fact we should check signal_group_exit() to avoid a similar race. Fortunately, we don't need the check at all. The only function which relies on SIGNAL_STOP_DEQUEUED is do_signal_stop(), and it ignores this flag if signal_group_exit() == T, this covers the SIGNAL_GROUP_EXIT case. Signed-off-by: Oleg Nesterov Acked-by: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3854a771821c970065e3203a0b40ddc4101538cc Author: Oleg Nesterov Date: Fri Jul 25 01:47:29 2008 -0700 __exit_signal: don't take rcu lock There is no reason for rcu_read_lock() in __exit_signal(). tsk->sighand can only be changed if tsk does exec, obviously this is not possible. Signed-off-by: Oleg Nesterov Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 100360f03077663b7bef3af44805b6cf700c3bee Author: Oleg Nesterov Date: Fri Jul 25 01:47:29 2008 -0700 signals: change collect_signal() to return void With the recent changes collect_signal() always returns true. Change it to return void and update the single caller.
Signed-off-by: Oleg Nesterov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit d4434207616980885205c605697868c0f07e4378 Author: Oleg Nesterov Date: Fri Jul 25 01:47:28 2008 -0700 signals: collect_signal: simplify the "still_pending" logic Factor out sigdelset() calls and remove the "still_pending" variable. Signed-off-by: Oleg Nesterov Acked-by: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6715ca451cfff1c9ce4b33ad9918a1dacf43997c Author: Oleg Nesterov Date: Fri Jul 25 01:47:27 2008 -0700 signals: collect_signal: remove the unneeded sigismember() check collect_signal() checks sigismember(&list->signal, sig), this is not needed. This "sig" was just found by next_signal(), so it must be valid. We have a (completely broken) call to ->notifier in between, but it must not play with sigpending->signal bits or unlock ->siglock. Signed-off-by: Oleg Nesterov Acked-by: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 96347e7759e2e433c427defa0fa1adfc8cce6226 Author: Oleg Nesterov Date: Fri Jul 25 01:47:27 2008 -0700 posix timers: release_posix_timer: kill the bogus put_task_struct(->it_process); release_posix_timer() can't be called with ->it_process != NULL. Once sys_timer_create() sets ->it_process it must not call release_posix_timer(), otherwise we can race with another thread doing sys_timer_delete(), this timer is visible to idr_find() and unlocked. The same is true for two other callers (actually, for any possible caller), sys_timer_delete() and itimer_delete(). They must clear ->it_process before unlock_timer() + release_posix_timer(). 
Signed-off-by: Oleg Nesterov Acked-by: Roland McGrath Cc: john stultz Cc: Thomas Gleixner Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4b7a1304267bff68260ae861784b27130e805be3 Author: Oleg Nesterov Date: Fri Jul 25 01:47:26 2008 -0700 posix timers: timer_delete: remove the bogus "->it_process != NULL" check sys_timer_delete() and itimer_delete() check "timer->it_process != NULL", this looks completely bogus. ->it_process == NULL means that this timer is already under destruction or it is not fully initialized, this must not happen. sys_timer_delete: the timer is locked, and lock_timer() can't succeed if ->it_process == NULL. itimer_delete: it is called by exit_itimers() when there are no other threads which can play with signal_struct->posix_timers. Signed-off-by: Oleg Nesterov Acked-by: Roland McGrath Cc: john stultz Cc: Thomas Gleixner Cc: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit da5ef6bb96158b0fc0d808704237a453af449124 Author: Lai Jiangshan Date: Fri Jul 25 01:47:25 2008 -0700 cpuset: two minor code-cleanups In cpuset_update_task_memory_state() local variable struct task_struct *tsk = current; And local variable tsk is used 14 times and statement task_cs(tsk) is used twice in this function. So using task_cs(tsk) instead of task_cs(current) is better for readability. And "(struct cgroup_scanner *)&scan" is not good for readability also. (and "container_of" is used in cpuset_do_move_task(), not "(cpuset_hotplug_scanner *)scan") Signed-off-by: Lai Jiangshan Acked-by: Paul Menage Cc: Paul Jackson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 02412483777651a26b19a75e49c2a451a174ca9c Author: Lai Jiangshan Date: Fri Jul 25 01:47:24 2008 -0700 cpuset: code-cleanup for started_after cgroup(cgroup_scan_tasks) will initialize heap->gt for us. This patch removes started_after() and its helper-function. 
Signed-off-by: Lai Jiangshan Acked-by: Paul Menage Cc: Paul Jackson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 489a5393a20dcbf91104052120eb2eff8791b61b Author: Lai Jiangshan Date: Fri Jul 25 01:47:23 2008 -0700 cpuset: don't pass empty cpumasks to partition_sched_domains() I created lots of empty cpusets (empty cpumasks) and turned off "sched_load_balance" in the top cpuset. I found that all these empty cpumasks are passed to partition_sched_domains() in rebuild_sched_domains(), which is very time-consuming for partition_sched_domains() and is not needed. Skipping them also reduces the memory consumed and some of the work done in rebuild_sched_domains(). Signed-off-by: Lai Jiangshan Acked-by: Paul Menage Cc: Paul Jackson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c372e817afc629fea9ff6321313325ed0b4a855b Author: Li Zefan Date: Fri Jul 25 01:47:23 2008 -0700 cpuset: avoid unnecessary sched domains rebuilding When changing 'sched_relax_domain_level', don't rebuild sched domains if 'cpus' is empty or 'sched_load_balance' is not set. Also make the comments of rebuild_sched_domains() more readable. Signed-off-by: Li Zefan Cc: Hidetoshi Seto Cc: Paul Jackson Cc: Paul Menage Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f9b4fb8dabf38fb456c97f01aace07cb6e7c1723 Author: Miao Xie Date: Fri Jul 25 01:47:22 2008 -0700 cpusets: update task's cpus_allowed and mems_allowed after CPU/NODE offline/online The bug is that a task may run on the cpu/node which is not in its cpuset.cpus/ cpuset.mems.
It can be reproduced by the following commands:
-----------------------------------
# mkdir /dev/cpuset
# mount -t cpuset xxx /dev/cpuset
# mkdir /dev/cpuset/0
# echo 0-1 > /dev/cpuset/0/cpus
# echo 0 > /dev/cpuset/0/mems
# echo $$ > /dev/cpuset/0/tasks
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 1 > /sys/devices/system/cpu/cpu1/online
-----------------------------------
There is only CPU0 in cpuset.cpus, but the task in this cpuset runs on both CPU0 and CPU1. This is because the task's cpus_allowed didn't get updated after the CPU offline/online manipulation. The same holds for mems_allowed. This patch fixes this bug except for the root cpuset, because for the root cpuset it is unclear whether all of its tasks need to be updated after cpu/node offline/online. If we update them, some kernel threads which are bound to a specific cpu will be unbound. If we don't, there is a bug in the root cpuset, which is also caused by offline/online manipulation. For example, take a dual-cpu machine. We create a sub cpuset in the root cpuset and assign 1 to its cpus, and then we attach some tasks to this sub cpuset. After this, we offline CPU1. Now the tasks in this new cpuset are moved into the root cpuset automatically because there is no cpu left in the sub cpuset. Then we online CPU1, and we find that all the tasks which didn't originally belong to the root cpuset just run on CPU0. Maybe we need to add a flag in the task_struct to mark which task can't be unbound?
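The stale-mask problem in the reproduction above can be modelled with plain bitmasks. This is a userspace sketch under stated assumptions, not the kernel implementation: struct model_task, struct model_cpuset, model_cpu_offline() and model_update_tasks() are hypothetical names, and the model ignores the case where a cpuset becomes empty and its tasks move to the parent.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical model of a task: bit n set means "may run on CPU n". */
struct model_task {
	unsigned long cpus_allowed;
};

/* Hypothetical model of a non-root cpuset and the tasks attached to it. */
struct model_cpuset {
	unsigned long cpus;		/* contents of the cpuset's "cpus" file */
	struct model_task *tasks;
	size_t nr_tasks;
};

/*
 * CPU hot-unplug removes the CPU from the cpuset's mask. The attached
 * tasks are deliberately left alone, which reproduces the bug: their
 * masks go stale. A later online does not add the CPU back to the cpuset.
 */
void model_cpu_offline(struct model_cpuset *cs, unsigned int cpu)
{
	assert(cpu < sizeof(unsigned long) * CHAR_BIT);
	cs->cpus &= ~(1UL << cpu);
}

/* The idea of the fix: after hotplug, copy the cpuset mask to every task. */
void model_update_tasks(struct model_cpuset *cs)
{
	size_t i;

	for (i = 0; i < cs->nr_tasks; i++)
		cs->tasks[i].cpus_allowed = cs->cpus;
}
```

Without the call to model_update_tasks(), the task keeps CPU1 in its mask even though the cpuset no longer lists it, so it can run there again as soon as CPU1 is back online.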
Signed-off-by: Miao Xie Acked-by: Paul Jackson Cc: Li Zefan Cc: Paul Jackson Cc: Paul Menage Cc: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 0b2f630a28d53b5a2082a5275bc3334b10373508 Author: Miao Xie Date: Fri Jul 25 01:47:21 2008 -0700 cpusets: restructure the function update_cpumask() and update_nodemask() Extract two functions from update_cpumask() and update_nodemask().They will be used later for updating tasks' cpus_allowed and mems_allowed after CPU/NODE offline/online. [lizf@cn.fujitsu.com: build fix] Signed-off-by: Miao Xie Acked-by: Paul Jackson Cc: David Rientjes Cc: Li Zefan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 628f42355389cfb596ca3a5a5f64fb9054a2a06a Author: KAMEZAWA Hiroyuki Date: Fri Jul 25 01:47:20 2008 -0700 memcg: limit change shrink usage Shrinking memory usage at limit change. [akpm@linux-foundation.org: coding-style fixes] Acked-by: Balbir Singh Acked-by: Pavel Emelyanov Signed-off-by: KAMEZAWA Hiroyuki Cc: Paul Menage Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 12b9804419cfb1c1bdac413f6c373af3b88d154b Author: KAMEZAWA Hiroyuki Date: Fri Jul 25 01:47:19 2008 -0700 res_counter: limit change support ebusy Add an interface to set limit. This is necessary to memory resource controller because it shrinks usage at set limit. Other controllers may not need this interface to shrink usage because shrinking is not necessary or impossible. Acked-by: Balbir Singh Acked-by: Pavel Emelyanov Signed-off-by: KAMEZAWA Hiroyuki Cc: Paul Menage Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit cede86acd8bd5d2205dec28db8ac86410a3a19e8 Author: Li Zefan Date: Fri Jul 25 01:47:18 2008 -0700 memcg: clean up checking of the disabled flag Those checks are unnecessary, because when the subsystem is disabled it can't be mounted, so those functions won't get called. The check is needed in functions which will be called in other places except cgroup. 
[hugh@veritas.com: further checking of disabled flag] Signed-off-by: Li Zefan Acked-by: Balbir Singh Acked-by: KAMEZAWA Hiroyuki Acked-by: KOSAKI Motohiro Signed-off-by: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit accf163e6ab729f1fc5fffaa0310e498270bf4e7 Author: KAMEZAWA Hiroyuki Date: Fri Jul 25 01:47:17 2008 -0700 memcg: remove a redundant check Because of the remove-refcnt patch, it is now a very rare case that mem_cgroup_charge_common() is called against a page which is already accounted. mem_cgroup_charge_common() is called when: 1. a page is added into file cache. 2. an anon page is _newly_ mapped. A racy case is that a newly-swapped-in anonymous page is referenced by multiple threads in do_swap_page() at the same time. (a page is not Locked when mem_cgroup_charge() is called from do_swap_page.) Another case is shmem. It charges its page before calling add_to_page_cache(). Then, mem_cgroup_charge_cache() is called twice. This case is handled in mem_cgroup_cache_charge(). But this check may be too hacky... Signed-off-by: KAMEZAWA Hiroyuki Cc: Balbir Singh Cc: "Eric W. Biederman" Cc: Pavel Emelyanov Cc: Li Zefan Cc: Hugh Dickins Cc: YAMAMOTO Takashi Cc: Paul Menage Cc: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b76734e5e34e1889ab9fc5f3756570b1129f0f50 Author: KAMEZAWA Hiroyuki Date: Fri Jul 25 01:47:16 2008 -0700 memcg: add hints for branch Show the branch direction for obvious conditions. Signed-off-by: KAMEZAWA Hiroyuki Cc: Balbir Singh Cc: "Eric W. Biederman" Cc: Pavel Emelyanov Cc: Li Zefan Cc: Hugh Dickins Cc: YAMAMOTO Takashi Cc: Paul Menage Cc: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c9b0ed51483cc2fc42bb801b6675c4231b0e4634 Author: KAMEZAWA Hiroyuki Date: Fri Jul 25 01:47:15 2008 -0700 memcg: helper function for relcaim from shmem. A new call, mem_cgroup_shrink_usage() is added for shmem handling and replacing non-standard usage of mem_cgroup_charge/uncharge.
Now, shmem calls mem_cgroup_charge() just to reclaim some pages from mem_cgroup. In general, shmem is used by some process group and not for global resource (like file caches). So, it's reasonable to reclaim pages from mem_cgroup where shmem is mainly used. [hugh@veritas.com: shmem_getpage release page sooner] [hugh@veritas.com: mem_cgroup_shrink_usage css_put] Signed-off-by: KAMEZAWA Hiroyuki Cc: Balbir Singh Cc: "Eric W. Biederman" Cc: Pavel Emelyanov Cc: Li Zefan Cc: YAMAMOTO Takashi Cc: Paul Menage Cc: David Rientjes Signed-off-by: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 69029cd550284e32de13d6dd2f77b723c8a0e444 Author: KAMEZAWA Hiroyuki Date: Fri Jul 25 01:47:14 2008 -0700 memcg: remove refcnt from page_cgroup memcg: performance improvements Patch Description
1/5 ... remove refcnt from page_cgroup patch (shmem handling is fixed)
2/5 ... swapcache handling patch
3/5 ... add helper function for shmem's memory reclaim patch
4/5 ... optimize by likely/unlikely patch
5/5 ... remove redundant check patch (shmem handling is fixed.)
Unix bench result.
== 2.6.26-rc2-mm1 + memory resource controller
Execl Throughput                           2915.4 lps  (29.6 secs, 3 samples)
C Compiler Throughput                      1019.3 lpm  (60.0 secs, 3 samples)
Shell Scripts (1 concurrent)               5796.0 lpm  (60.0 secs, 3 samples)
Shell Scripts (8 concurrent)               1097.7 lpm  (60.0 secs, 3 samples)
Shell Scripts (16 concurrent)               565.3 lpm  (60.0 secs, 3 samples)
File Read 1024 bufsize 2000 maxblocks   1022128.0 KBps (30.0 secs, 3 samples)
File Write 1024 bufsize 2000 maxblocks   544057.0 KBps (30.0 secs, 3 samples)
File Copy 1024 bufsize 2000 maxblocks    346481.0 KBps (30.0 secs, 3 samples)
File Read 256 bufsize 500 maxblocks      319325.0 KBps (30.0 secs, 3 samples)
File Write 256 bufsize 500 maxblocks     148788.0 KBps (30.0 secs, 3 samples)
File Copy 256 bufsize 500 maxblocks       99051.0 KBps (30.0 secs, 3 samples)
File Read 4096 bufsize 8000 maxblocks   2058917.0 KBps (30.0 secs, 3 samples)
File Write 4096 bufsize 8000 maxblocks  1606109.0 KBps (30.0 secs, 3 samples)
File Copy 4096 bufsize 8000 maxblocks    854789.0 KBps (30.0 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places         126145.2 lpm  (30.0 secs, 3 samples)

INDEX VALUES
TEST                                    BASELINE    RESULT   INDEX
Execl Throughput                            43.0    2915.4   678.0
File Copy 1024 bufsize 2000 maxblocks     3960.0  346481.0   875.0
File Copy 256 bufsize 500 maxblocks       1655.0   99051.0   598.5
File Copy 4096 bufsize 8000 maxblocks     5800.0  854789.0  1473.8
Shell Scripts (8 concurrent)                 6.0    1097.7  1829.5
                                                         =========
FINAL SCORE                                                  991.3

== 2.6.26-rc2-mm1 + this set ==
Execl Throughput                           3012.9 lps  (29.9 secs, 3 samples)
C Compiler Throughput                       981.0 lpm  (60.0 secs, 3 samples)
Shell Scripts (1 concurrent)               5872.0 lpm  (60.0 secs, 3 samples)
Shell Scripts (8 concurrent)               1120.3 lpm  (60.0 secs, 3 samples)
Shell Scripts (16 concurrent)               578.0 lpm  (60.0 secs, 3 samples)
File Read 1024 bufsize 2000 maxblocks   1003993.0 KBps (30.0 secs, 3 samples)
File Write 1024 bufsize 2000 maxblocks   550452.0 KBps (30.0 secs, 3 samples)
File Copy 1024 bufsize 2000 maxblocks    347159.0 KBps (30.0 secs, 3 samples)
File Read 256 bufsize 500 maxblocks      314644.0 KBps (30.0 secs, 3 samples)
File Write 256 bufsize 500 maxblocks     151852.0 KBps (30.0 secs, 3 samples)
File Copy 256 bufsize 500 maxblocks      101000.0 KBps (30.0 secs, 3 samples)
File Read 4096 bufsize 8000 maxblocks   2033256.0 KBps (30.0 secs, 3 samples)
File Write 4096 bufsize 8000 maxblocks  1611814.0 KBps (30.0 secs, 3 samples)
File Copy 4096 bufsize 8000 maxblocks    847979.0 KBps (30.0 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places         128148.7 lpm  (30.0 secs, 3 samples)

INDEX VALUES
TEST                                    BASELINE    RESULT   INDEX
Execl Throughput                            43.0    3012.9   700.7
File Copy 1024 bufsize 2000 maxblocks     3960.0  347159.0   876.7
File Copy 256 bufsize 500 maxblocks       1655.0  101000.0   610.3
File Copy 4096 bufsize 8000 maxblocks     5800.0  847979.0  1462.0
Shell Scripts (8 concurrent)                 6.0    1120.3  1867.2
                                                         =========
FINAL SCORE                                                 1004.6

This patch: Remove refcnt from page_cgroup(). After this,
* A page is charged only when !page_mapped() && no page_cgroup is assigned.
    * Anon page is newly mapped.
    * File page is added to mapping->tree.
* A page is uncharged only when
    * Anon page is fully unmapped.
    * File page is removed from LRU.
There is no change in behavior from the user's view. This patch also removes unnecessary calls in rmap.c which were used only for refcnt management. [akpm@linux-foundation.org: fix warning] [hugh@veritas.com: fix shmem_unuse_inode charging] Signed-off-by: KAMEZAWA Hiroyuki Cc: Balbir Singh Cc: "Eric W. Biederman" Cc: Pavel Emelyanov Cc: Li Zefan Cc: Hugh Dickins Cc: YAMAMOTO Takashi Cc: Paul Menage Cc: David Rientjes Signed-off-by: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e8589cc189f96b87348ae83ea4db38eaac624135 Author: KAMEZAWA Hiroyuki Date: Fri Jul 25 01:47:10 2008 -0700 memcg: better migration handling This patch changes page migration under memory controller to use a different algorithm. (thanks to Christoph for new idea.) Before: - page_cgroup is migrated from an old page to a new page.
After: - a new page is accounted, no reuse of page_cgroup. Pros: - We can avoid complicated lock dependencies and races in migration. Cons: - new param to mem_cgroup_charge_common(). - mem_cgroup_getref() is added for handling ref_cnt ping-pong. This version simplifies the complicated lock dependency in page migration under the memory resource controller. The new refcnt sequence is as follows. a mapped page:
  prepage_migration() ..... +1 to NEW page
  try_to_unmap()      ..... all refs to OLD page are gone.
  move_pages()        ..... +1 to NEW page if page cache.
  remap...            ..... all refs from *map* are added to NEW one.
  end_migration()     ..... -1 to NEW page.
page's mapcount + (page_is_cache) refs are added to NEW one. Signed-off-by: KAMEZAWA Hiroyuki Cc: Balbir Singh Cc: Pavel Emelyanov Cc: Li Zefan Cc: YAMAMOTO Takashi Cc: Hugh Dickins Cc: Christoph Lameter Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 508b7be0a5b06b64203512ed9b34191cddc83f56 Author: KAMEZAWA Hiroyuki Date: Fri Jul 25 01:47:09 2008 -0700 memcg: avoid unnecessary initialization * remove overkill initialization (in fast path) * make the condition for PAGE_CGROUP_FLAG_ACTIVE more obvious. Signed-off-by: KAMEAZAWA Hiroyuki Reviewed-by: Li Zefan Acked-by: Balbir Singh Acked-by: Pavel Emelyanov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a181b0e888a1d917edcab57cd73ccf7d8e75a46c Author: KAMEZAWA Hiroyuki Date: Fri Jul 25 01:47:08 2008 -0700 memcg: make global var read_mostly mem_cgroup_subsys and page_cgroup_cache should be read_mostly and MEM_CGROUP_RECLAIM_RETRIES can be just a fixed number.
Signed-off-by: KAMEZAWA Hiroyuki Acked-by: Balbir Singh Acked-by: Pavel Emelyanov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7759fc9d10d3559f365cb122d81e0c0a185fe0fe Author: Li Zefan Date: Fri Jul 25 01:47:08 2008 -0700 devcgroup: code cleanup - clean up set_majmin() - use simple_strtoul() to parse major/minor [akpm@linux-foundation.org: fix simple_strtoul() usage] [kosaki.motohiro@jp.fujitsu.com: fix warnings] Signed-off-by: Li Zefan Acked-by: Serge Hallyn Cc: Serge Hallyn Cc: Paul Menage Cc: Pavel Emelyanov Signed-off-by: KOSAKI Motohiro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4efd1a1b2f09a4b746dd9dc057986c6dadcb1317 Author: Pavel Emelyanov Date: Fri Jul 25 01:47:07 2008 -0700 devcgroup: relax white-list protection down to RCU Currently this list is protected with a simple spinlock, even for reading from one. This is OK, but can be better. Actually I want it to be better very much, since after replacing the OpenVZ device permissions engine with the cgroup-based one I noticed that we set 12 default device permissions for each newly created container (for /dev/null, full, terminals, etc. devices), and people sometimes have up to 20 perms more, so traversing the ~30-40 elements list under a spinlock doesn't seem very good. Here's the RCU protection for white-list - dev_whitelist_item-s are added and removed under the devcg->lock, but are looked up in permissions checking under the rcu_read_lock. Signed-off-by: Pavel Emelyanov Acked-by: Serge Hallyn Cc: Balbir Singh Cc: Paul Menage Cc: "Paul E. McKenney" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e885dcde75685e09f23cffae1f6d5169c105b8a0 Author: Serge E. Hallyn Date: Fri Jul 25 01:47:06 2008 -0700 cgroup_clone: use pid of newly created task for new cgroup cgroup_clone creates a new cgroup with the pid of the task.
This works correctly for unshare, but for clone cgroup_clone is called from copy_namespaces inside copy_process, which happens before the new pid is created. As a result, the new cgroup was created with current's pid. This patch: 1. Moves the call inside copy_process to after the new pid is created 2. Passes the struct pid into ns_cgroup_clone (as it is not yet attached to the task) 3. Passes a name from ns_cgroup_clone() into cgroup_clone() so as to keep cgroup_clone() itself simpler 4. Uses pid_vnr() to get the process id value, so that the pid used to name the new cgroup is always the pid as it would be known to the task which did the cloning or unsharing. I think that is the most intuitive thing to do. This way, task t1 does clone(CLONE_NEWPID) to get t2, which does clone(CLONE_NEWPID) to get t3, then the cgroup for t3 will be named for the pid by which t2 knows t3. (Thanks to Dan Smith for finding the main bug) Changelog: June 11: Incorporate Paul Menage's feedback: don't pass NULL to ns_cgroup_clone from unshare, and reduce patch size by using 'nodename' in cgroup_clone. June 10: Original version [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Serge Hallyn Acked-by: Paul Menage Tested-by: Dan Smith Cc: Balbir Singh Cc: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 856c13aa1ff6136c1968414fdea5938ea9d5ebf2 Author: Paul Menage Date: Fri Jul 25 01:47:04 2008 -0700 cgroup files: convert res_counter_write() to be a cgroups write_string() handler Currently res_counter_write() is a raw file handler even though it's ultimately taking a number, since in some cases it wants to pre-process the string when converting it to a number. 
This patch converts res_counter_write() from a raw file handler to a write_string() handler; this allows some of the boilerplate copying/locking/checking to be removed, and simplies the cleanup path, since these functions are now performed by the cgroups framework. [lizf@cn.fujitsu.com: build fix] Signed-off-by: Paul Menage Cc: Paul Jackson Cc: Pavel Emelyanov Cc: Balbir Singh Cc: Serge Hallyn Cc: KAMEZAWA Hiroyuki Signed-off-by: Li Zefan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f92523e3a7861f5dbd76021e0719a35fe8771f2d Author: Paul Menage Date: Fri Jul 25 01:47:03 2008 -0700 cgroup files: convert devcgroup_access_write() into a cgroup write_string() handler This patch converts devcgroup_access_write() from a raw file handler into a handler for the cgroup write_string() method. This allows some boilerplate copying/locking/checking to be removed and simplifies the cleanup path, since these functions are performed by the cgroups framework before calling the handler. Signed-off-by: Paul Menage Cc: Paul Jackson Cc: Pavel Emelyanov Cc: Balbir Singh Acked-by: Serge Hallyn Cc: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e37123953292146445c8629b3950d0513fd10ae2 Author: Paul Menage Date: Fri Jul 25 01:47:02 2008 -0700 cgroup files: remove cpuset_common_file_write() This patch tweaks the signatures of the update_cpumask() and update_nodemask() functions so that they can be called directly as handlers for the new cgroups write_string() method. This allows cpuset_common_file_write() to be removed. 
Signed-off-by: Paul Menage Cc: Paul Jackson Cc: Pavel Emelyanov Cc: Balbir Singh Cc: Serge Hallyn Cc: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit af351026aafc8da16518a02b41c66d3e0c1cdef4 Author: Paul Menage Date: Fri Jul 25 01:47:01 2008 -0700 cgroup files: turn attach_task_by_pid directly into a cgroup write handler This patch changes attach_task_by_pid() to take a u64 rather than a string; as a result it can be called directly as a control groups write_u64 handler, and cgroup_common_file_write() can be removed. Signed-off-by: Paul Menage Cc: Paul Jackson Cc: Pavel Emelyanov Cc: Balbir Singh Cc: Serge Hallyn Cc: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6379c106152388f7ea45d6dda63edda0e9181fc8 Author: Paul Menage Date: Fri Jul 25 01:47:01 2008 -0700 cgroup files: move notify_on_release file to separate write handler This patch moves the write handler for the cgroups notify_on_release file into a separate handler. This handler requires no cgroups locking since it relies on atomic bitops for synchronization. Signed-off-by: Paul Menage Cc: Paul Jackson Cc: Pavel Emelyanov Cc: Balbir Singh Cc: Serge Hallyn Cc: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 84eea842886ac35020be6043e04748ed22014359 Author: Paul Menage Date: Fri Jul 25 01:47:00 2008 -0700 cgroups: misc cleanups to write_string patchset This patch contains cleanups suggested by reviewers for the recent write_string() patchset: - pair cgroup_lock_live_group() with cgroup_unlock() in cgroup.c for clarity, rather than directly unlocking cgroup_mutex. 
- make the return type of cgroup_lock_live_group() a bool - use a #define'd constant for the local buffer size in read/write functions Signed-off-by: Paul Menage Cc: Paul Jackson Cc: Pavel Emelyanov Cc: Balbir Singh Acked-by: Serge Hallyn Cc: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e788e066c651b1bbf4a927dc95395c1aa13be436 Author: Paul Menage Date: Fri Jul 25 01:46:59 2008 -0700 cgroup files: move the release_agent file to use typed handlers Adds cgroup_release_agent_write() and cgroup_release_agent_show() methods to handle writing/reading the path to a cgroup hierarchy's release agent. As a result, cgroup_common_file_read() is now unnecessary. As part of the change, a previously-tolerated race in cgroup_release_agent() is avoided by copying the current release_agent_path prior to calling call_usermode_helper(). Signed-off-by: Paul Menage Cc: Paul Jackson Cc: Pavel Emelyanov Cc: Balbir Singh Acked-by: Serge Hallyn Cc: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit db3b14978abc02041046ed8353f0899cb58ffffc Author: Paul Menage Date: Fri Jul 25 01:46:58 2008 -0700 cgroup files: add write_string cgroup control file method This patch adds a write_string() method for cgroups control files. The semantics are that a buffer is copied from userspace to kernelspace and the handler function invoked on that buffer. The buffer is guaranteed to be nul-terminated, and no longer than max_write_len (defaulting to 64 bytes if unspecified). Later patches will convert existing raw file write handlers in control group subsystems to use this method. 
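A minimal userspace sketch of the write_string() semantics described above, not the kernel implementation. Assumptions: memcpy() stands in for the copy from userspace, the limit counts payload bytes excluding the terminating nul, and the names dispatch_write_string(), write_string_fn, length_handler() and DEFAULT_MAX_WRITE_LEN are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical default, taken from the "64 bytes" mentioned in the text. */
#define DEFAULT_MAX_WRITE_LEN 64

/* Hypothetical handler type: always receives a nul-terminated buffer. */
typedef int (*write_string_fn)(const char *buffer);

/*
 * Model of the dispatch step: bound the length, copy the caller's
 * bytes, nul-terminate them, then invoke the handler on the copy.
 * max_write_len == 0 means "unspecified" and selects the default.
 */
static int dispatch_write_string(const char *src, size_t nbytes,
                                 size_t max_write_len,
                                 write_string_fn handler)
{
    size_t limit = max_write_len ? max_write_len : DEFAULT_MAX_WRITE_LEN;
    char *buf;
    int ret;

    assert(handler != NULL);

    if (nbytes > limit)
        return -E2BIG;          /* longer than the handler accepts */

    buf = malloc(nbytes + 1);   /* room for the terminating nul */
    if (!buf)
        return -ENOMEM;

    memcpy(buf, src, nbytes);   /* stand-in for the userspace copy */
    buf[nbytes] = '\0';

    ret = handler(buf);
    free(buf);
    return ret;
}

/* Example handler: reports the length of the string it was given. */
static int length_handler(const char *buffer)
{
    return (int)strlen(buffer);
}
```

The point of the design is that the handler never sees an unterminated or oversized buffer, so each subsystem can drop its own copy and length checks.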
    Signed-off-by: Paul Menage
    Cc: Paul Jackson
    Cc: Pavel Emelyanov
    Acked-by: Balbir Singh
    Acked-by: Serge Hallyn
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit ce16b49d37e748574f7fabc2726268d542d0aa1a
Author: Paul Menage
Date:   Fri Jul 25 01:46:57 2008 -0700

    cgroup files: clean up whitespace in struct cftype

    This patch removes some extraneous spaces from method declarations in
    struct cftype, to fit in with conventional kernel style.

    Signed-off-by: Paul Menage
    Cc: Paul Jackson
    Cc: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: Serge Hallyn
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 8947f9d5b361ce927be6d5c11fed57905b7a4100
Author: Li Zefan
Date:   Fri Jul 25 01:46:56 2008 -0700

    cgroups: annotate two variables with __read_mostly

    - need_forkexit_callback will be read only after system boot.
    - use_task_css_set_links will be read only after it's set.

    And these 2 variables are checked when a new process is forked.

    Signed-off-by: Li Zefan
    Acked-by: Paul Menage
    Acked-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 71cbb949d17d4d776abd547135feb7f3282405c8
Author: KOSAKI Motohiro
Date:   Fri Jul 25 01:46:55 2008 -0700

    cgroup: list_for_each cleanup

    --------------------------
    while () {
            list_entry();
            ...
    }
    --------------------------

    is equivalent to the following code.

    --------------------------
    list_for_each_entry() {
            ...
    }
    --------------------------

    The latter is easier to review.

    This patch is just a cleanup; it doesn't change any behavior.
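The equivalence can be illustrated with a small userspace sketch. Assumptions: the list_head, list_entry() and list_for_each_entry() definitions below are simplified stand-ins modelled on the kernel's list.h (they rely on the GCC/Clang __typeof__ extension), and struct item and the sum_*() helpers are hypothetical.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's list primitives. */
struct list_head {
    struct list_head *next, *prev;
};

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry(pos, head, member)                       \
    for (pos = list_entry((head)->next, __typeof__(*pos), member);   \
         &pos->member != (head);                                     \
         pos = list_entry(pos->member.next, __typeof__(*pos), member))

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Hypothetical payload type embedding a list node. */
struct item {
    int value;
    struct list_head link;
};

/* Open-coded form: walk the raw nodes and call list_entry() by hand. */
static int sum_open_coded(struct list_head *head)
{
    struct list_head *l = head->next;
    int sum = 0;

    while (l != head) {
        struct item *it = list_entry(l, struct item, link);

        sum += it->value;
        l = l->next;
    }
    return sum;
}

/* Same traversal expressed with list_for_each_entry(). */
static int sum_with_macro(struct list_head *head)
{
    struct item *it;
    int sum = 0;

    list_for_each_entry(it, head, link)
        sum += it->value;
    return sum;
}

/* Builds a three-element list; returns 1 if both forms agree on the sum. */
static int demo_sums_match(void)
{
    struct list_head head;
    struct item a = { .value = 1 }, b = { .value = 2 }, c = { .value = 3 };

    list_init(&head);
    list_add_tail(&a.link, &head);
    list_add_tail(&b.link, &head);
    list_add_tail(&c.link, &head);

    return sum_open_coded(&head) == 6 && sum_with_macro(&head) == 6;
}
```

Both helpers visit the same entries in the same order; the macro form only hides the cursor bookkeeping.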
Signed-off-by: KOSAKI Motohiro Cc: Paul Menage Cc: Li Zefan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f2992db2a4f7ae10f61d5bc68c7c1528cec639e2 Author: Pavel Emelyanov Date: Fri Jul 25 01:46:55 2008 -0700 Mark res_counter_charge(_locked) with __must_check Ignoring their return values may result in counter underflow in the future - when the value charged will be uncharged (or in "leaks" - when the value is not uncharged). This also prevents from using charging routines to decrement the counter value (i.e. uncharge it) ;) (Current code works OK with res_counter, however :) ) Signed-off-by: Pavel Emelyanov Cc: Balbir Singh Cc: Paul Menage Cc: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7e9abd89cbdf9b73d327d8173343abce9022609b Author: Li Zefan Date: Fri Jul 25 01:46:54 2008 -0700 cgroup: use read lock to guard find_existing_css_set() The function does not modify anything (except the temporary css template), so it's sufficient to hold read lock. Signed-off-by: Li Zefan Acked-by: Paul Menage Cc: Balbir Singh Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 9d96d82da437ed5f2053821779ed5d7797ed1f81 Author: Mike Frysinger Date: Fri Jul 25 01:46:53 2008 -0700 procfs-guide: drop pointless   entities Having trailing   entities in a revision numer seems pretty pointless to me. More so, it's causing me pains, so just drop them since no other guide is doing this. Signed-off-by: Mike Frysinger Acked-by: Randy Dunlap Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 657d3bfa98e542271b449f8cd84c7501ae2b2255 Author: Jan Kara Date: Fri Jul 25 01:46:52 2008 -0700 quota: implement sending information via netlink about user below quota Sometimes it may be useful for userspace to know (e.g. for some hosting guys) that some user stopped exceeding his hardlimit or softlimit in quotas. 
Implement sending of such events to userspace via quota netlink protocol so that they don't have to poll for such events. Based on idea and initial implementation by Vladislav Bogdanov. Cc: Vladislav Bogdanov Signed-off-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 03b063436ca1076301de58d9d628f610ab5404ad Author: Jan Kara Date: Fri Jul 25 01:46:52 2008 -0700 quota: convert macros to inline functions Signed-off-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 74abb9890dafb12a50dc140de215ed477beb1b88 Author: Jan Kara Date: Fri Jul 25 01:46:51 2008 -0700 quota: move function-macros from quota.h to quotaops.h Move declarations of some macros, which should be in fact functions to quotaops.h. This way they can be later converted to inline functions because we can now use declarations from quota.h. Also add necessary includes of quotaops.h to a few files. [akpm@linux-foundation.org: fix JFS build] [akpm@linux-foundation.org: fix UFS build] [vegard.nossum@gmail.com: fix QUOTA=n build] Signed-off-by: Jan Kara Cc: Vegard Nossum Cc: Arjen Pool Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 02a55ca87185e114e5d298a8d00608501dbabf67 Author: Jan Kara Date: Fri Jul 25 01:46:50 2008 -0700 quota: cleanup loop in sync_dquots() Make loop in sync_dquots() checking whether there's something to write more readable, remove useless variable and macro info_any_dirty() which is used only in this place. Signed-off-by: Jan Kara Cc: "Vegard Nossum" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b85f4b87a511bea86dac68c4f0fabaee2cac6c4c Author: Jan Kara Date: Fri Jul 25 01:46:50 2008 -0700 quota: rename quota functions from upper case, make bigger ones non-inline Cleanup quotaops.h: Rename functions from uppercase to lowercase (and define backward compatibility macros), move larger functions to dquot.c and make them non-inline. 
Signed-off-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b48d380541f634663b71766005838edbb7261685 Author: Jan Kara Date: Fri Jul 25 01:46:49 2008 -0700 quota: fix possible infinite loop in quota code When quota structure is going to be dropped and it is dirty, quota code tries to write it. If the write fails for some reason (e. g. transaction cannot be started because the journal is aborted), we try writing again and again and again... Fix the problem by clearing the dirty bit even if the write failed. (akpm: for 2.6.27, 2.6.26.x and 2.6.25.x) Signed-off-by: Jan Kara Reviewed-by: dingdinghua Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 41003cde95e7e976d3876dbdcdc83dd0a9059279 Author: Joe Peterson Date: Fri Jul 25 01:46:48 2008 -0700 UTC timestamp option for FAT filesystems fix Signed-off-by: Joe Peterson Acked-by: OGAWA Hirofumi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b271e067c896ad4082b15e96077675d08db40625 Author: Joe Peterson Date: Fri Jul 25 01:46:47 2008 -0700 fatfs: add UTC timestamp option Provide a new mount option ("tz=UTC") for DOS (vfat/msdos) filesystems, allowing timestamps to be in coordinated universal time (UTC) rather than local time in applications where doing this is advantageous. In particular, portable devices that use fat/vfat (such as digital cameras) can benefit from using UTC in their internal clocks, thus avoiding daylight saving time errors and general time ambiguity issues. The user of the device does not have to worry about changing the time when moving from place or when daylight saving changes. The new mount option, when set, disables the counter-adjustment that Linux currently makes to FAT timestamp info in anticipation of the normal userspace time zone correction. When used in this new mode, all daylight saving time and time zone handling is done in userspace as is normal for many other filesystems (like ext3). 
The default mode, which remains unchanged, is still appropriate when mounting volumes written in Windows (because of its use of local time). I originally based this patch on one submitted last year by Paul Collins, but I updated it to work with current source and changed variable/option naming. Ogawa Hirofumi (who maintains these filesystems) and I discussed this patch at length on lkml, and he suggested using the option name in the attached version of the patch. Barry Bouwsma pointed out a good addition to the patch as well. Signed-off-by: Joe Peterson Signed-off-by: Paul Collins Acked-by: OGAWA Hirofumi Cc: Barry Bouwsma Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e8938a62a85d1f487e02c3b01955b47c9598f6d2 Author: Adrian Bunk Date: Fri Jul 25 01:46:46 2008 -0700 remove unused #include 's Remove some unused #include 's. Signed-off-by: Adrian Bunk Cc: Ralf Baechle Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit cf6ae8b50e0ee3f764392dadd1970e3f03c40773 Author: Adrian Bunk Date: Fri Jul 25 01:46:46 2008 -0700 remove the in-kernel struct dirent{,64} The kernel struct dirent{,64} were different from the ones in userspace. Even worse, we exported the kernel ones to userspace. But after the fat usages are fixed we can remove the conflicting kernel versions. Reviewed-by: H. Peter Anvin Signed-off-by: Adrian Bunk Cc: OGAWA Hirofumi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7557bc66be629d19a402e752673708bfbb8b5e86 Author: Rene Scharfe Date: Fri Jul 25 01:46:45 2008 -0700 msdos fs: remove unsettable atari option It has been impossible to set the option 'atari' of the MSDOS filesystem for several years. Since nobody seems to have missed it, let's remove its remains. 
Signed-off-by: Rene Scharfe Acked-by: OGAWA Hirofumi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit dcd8c53f13f068ee039589d84fbd0baf686abc41 Author: OGAWA Hirofumi Date: Fri Jul 25 01:46:44 2008 -0700 fat: small optimization to __fat_readdir() This removes unnecessary parsing for directory entries. If short_only, we don't need to parse longname. And if !both and it found the longname, we don't need shortname. Signed-off-by: OGAWA Hirofumi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 98a15160049fc1a0f822047f33ff513906a35567 Author: OGAWA Hirofumi Date: Fri Jul 25 01:46:44 2008 -0700 fat: use same logic in fat_search_long() and __fat_readdir() This uses uses stack for shortname, and uses __getname() for longname in fat_search_long() and __fat_readdir(). By this, it removes unneeded __getname() for shortname. Signed-off-by: OGAWA Hirofumi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit d688611674cc9c265ee67e89d2ea8bf060c17e8d Author: OGAWA Hirofumi Date: Fri Jul 25 01:46:43 2008 -0700 fat: cleanup fs/fat/dir.c This is no logic changes, just cleans fs/fat/dir.c up. Signed-off-by: OGAWA Hirofumi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 531f710f8e68fd2bad7516a090bff372f5f9cf6d Author: Adrian Bunk Date: Fri Jul 25 01:46:43 2008 -0700 fat/dir.c: switch to struct __fat_dirent struct __fat_dirent is what was formerly the kernel struct dirent (that was different from the userspace struct dirent). Converting all fat users to struct __fat_dirent will allow us to get rid of the conflicting struct dirent definition. Signed-off-by: Adrian Bunk Signed-off-by: OGAWA Hirofumi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4596c8aaf96e8634ca755c9f34b91420a39bebd4 Author: OGAWA Hirofumi Date: Fri Jul 25 01:46:42 2008 -0700 fat: fix VFAT_IOCTL_READDIR_xxx and cleanup for userland "struct dirent" is a kernel type here, but is a **different type** in userspace! 
This means both the structure and the IOCTL number is wrong! So, this adds new "struct __fat_dirent" to generate correct IOCTL number. And kernel stuff moves to under __KERNEL__. Signed-off-by: OGAWA Hirofumi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8d44d9741f6808c107a144f469fb89e6fe7c55e3 Author: OGAWA Hirofumi Date: Fri Jul 25 01:46:41 2008 -0700 fat: fix parse_options() Current parse_options() exits too early. We need to run the code of bottom in this function even if users doesn't specify options. Signed-off-by: OGAWA Hirofumi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3264d4ded4d916d294d776b77b72d477c63ac3be Author: Shen Feng Date: Fri Jul 25 01:46:41 2008 -0700 reiserfs: remove double definitions of xattr macros remove the definitions of macros: XATTR_SECURITY_PREFIX XATTR_TRUSTED_PREFIX XATTR_USER_PREFIX since they are defined in linux/xattr.h Signed-off-by: Shen Feng Signed-off-by: Mingming Cao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 90415deac75a761a25239af6f56381546f8d2201 Author: Jeff Mahoney Date: Fri Jul 25 01:46:40 2008 -0700 reiserfs: convert j_commit_lock to mutex j_commit_lock is a semaphore but uses it as if it were a mutex. This patch converts it to a mutex. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jeff Mahoney Cc: Matthew Wilcox Cc: Chris Mason Cc: Edward Shishkin Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit afe70259076fff0446001eaa1a287f615241a357 Author: Jeff Mahoney Date: Fri Jul 25 01:46:39 2008 -0700 reiserfs: convert j_flush_sem to mutex j_flush_sem is a semaphore but uses it as if it were a mutex. This patch converts it to a mutex. 
[akpm@linux-foundation.org: fix mutex_trylock retval treatment] Signed-off-by: Jeff Mahoney Cc: Matthew Wilcox Cc: Chris Mason Cc: Edward Shishkin Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f68215c4640a38d66429014e524a627bf572d26a Author: Jeff Mahoney Date: Fri Jul 25 01:46:38 2008 -0700 reiserfs: convert j_lock to mutex j_lock is a semaphore but uses it as if it were a mutex. This patch converts it to a mutex. Signed-off-by: Jeff Mahoney Cc: Matthew Wilcox Cc: Chris Mason Cc: Edward Shishkin Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 00b441970a0ab48185244300ac7d4e4eb76df692 Author: Jan Kara Date: Fri Jul 25 01:46:38 2008 -0700 reiserfs: correct mount option parsing to detect when quota options can be changed We should not allow user to change quota mount options when quota is just suspended. It would make mount options and internal quota state inconsistent. Also we should not allow user to change quota format when quota is turned on. On the other hand we can just silently ignore when some option is set to the value it already has (some mount versions do this on remount). Finally, we should not discard current quota options if parsing of mount options fails. Cc: Signed-off-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4506567b24d3ea707e46e8aad64caef539382f4b Author: Jan Kara Date: Fri Jul 25 01:46:37 2008 -0700 reiserfs: fix typos in messages and comments (journalled -> journaled) Cc: Signed-off-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 5d4f7fddf8882b214e4aabb3bdb37f90a72b2537 Author: Jan Kara Date: Fri Jul 25 01:46:36 2008 -0700 reiserfs: fix synchronization of quota files in journal=data mode In journal=data mode, it is not enough to do write_inode_now() as done in vfs_quota_on() to write all data to their final location (which is needed for quota_read to work correctly). 
Calling journal_end_sync() before calling vfs_quota_on() does it's job because transactions are committed to the journal and data marked as dirty in memory so write_inode_now() writes them to their final locations. Cc: Signed-off-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 895c23f8c39c0c8d7b536bb2566d4aa968d78be2 Author: Matthias Kaehlcke Date: Fri Jul 25 01:46:36 2008 -0700 hfsplus: convert the extents_lock in a mutex Apple Extended HFS file system: The semaphore extents lock is used as a mutex. Convert it to the mutex API. Signed-off-by: Matthias Kaehlcke Cc: Roman Zippel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 39f8d472f280dee6503a364d1d911b9e20ce3ec9 Author: Matthias Kaehlcke Date: Fri Jul 25 01:46:35 2008 -0700 hfs: convert extents_lock in a mutex Apple Macintosh file system: The semaphore extens_lock is used as a mutex. Convert it to the mutex API Signed-off-by: Matthias Kaehlcke Cc: Roman Zippel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3084b72de73a6f8af0f16989ffb348068bd066d4 Author: Matthias Kaehlcke Date: Fri Jul 25 01:46:34 2008 -0700 hfs: convert bitmap_lock in a mutex Apple Macintosh file system: The semaphore bitmap_lock is used as a mutex. Convert it to the mutex API Signed-off-by: Matthias Kaehlcke Cc: Roman Zippel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit de0ca06a99c33df8333955642843331ab6b6e7ff Author: Adrian Bunk Date: Fri Jul 25 01:46:34 2008 -0700 coda: remove CODA_FS_OLD_API While fixing CONFIG_ leakages to the userspace kernel headers I ran into CODA_FS_OLD_API. After five years, are there still people using the old API left? Especially considering that you have to choose at compile time which API to support in the kernel (and distributions tend to offer the new API for some time). Jan: "The old API can definitely go. 
Around the time the new interface went in there were some non-Coda userspace file system implementations that took a while longer to convert to the new API, but by now they all switched to the new interface or in some cases to a FUSE-based solution." Signed-off-by: Adrian Bunk Acked-by: Jan Harkes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c0a1633b6201ef79e31b7da464d44fdf5953054d Author: Adam Greenblatt Date: Fri Jul 25 01:46:32 2008 -0700 isofs: fix minor filesystem corruption Some iso9660 images contain files with rockridge data that is either incorrect or incompletely parsed. Prior to commit f2966632a134e865db3c819346a1dc7d96e05309 ("[PATCH] rock: handle directory overflows") (included with kernel 2.6.13) the kernel ignored the rockridge data for these files, while still allowing the files to be accessed under their non-rockridge names. That commit inadvertently changed things so that files with invalid rockridge data could not be accessed at all. (I ran across the problem when comparing some old CDs with hard disk copies I had made long ago under kernel 2.4: a few of the files on the hard disk copies were no longer visible on the CDs.) This change reverts to the pre-2.6.13 behavior. Signed-off-by: Adam Greenblatt Reviewed-by: Pekka Enberg Cc: [2.6.25.x, 2.6.26.x] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 275c0a8f1253a7542ad9726956c918d8a1f694c4 Author: Duane Griffin Date: Fri Jul 25 01:46:31 2008 -0700 ext3: validate directory entry data before use ext3_dx_find_entry uses ext3_next_entry without verifying that the entry is valid. If its rec_len == 0 this causes an infinite loop. Refactor the loop to check the validity of entries before checking whether they match and moving onto the next one. There are other uses of ext3_next_entry in this file which also look problematic. They should be reviewed and fixed if/when we have a test-case that triggers them. 
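The "validate before match, then advance" loop shape can be sketched in userspace as follows. This is an illustration under stated assumptions, not the ext3 code: the 3-byte record header (little-endian rec_len, then name_len), find_record(), the REC_* return codes and the sample blocks are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define HDR_SIZE      3     /* hypothetical header: rec_len (2) + name_len (1) */
#define REC_NOT_FOUND (-1L)
#define REC_CORRUPT   (-2L)

/*
 * Walk variable-length records in a block. Each record is validated
 * before it is compared or used to advance, so a rec_len of 0 (or one
 * that overruns the block) ends the walk instead of looping forever.
 * Returns the offset of the matching record, or a REC_* code.
 */
static long find_record(const unsigned char *block, size_t block_size,
                        const char *name)
{
    size_t want = strlen(name);
    size_t off = 0;

    while (off + HDR_SIZE <= block_size) {
        size_t rec_len = block[off] | ((size_t)block[off + 1] << 8);
        size_t name_len = block[off + 2];

        /* Validity check comes first. */
        if (rec_len < HDR_SIZE + name_len || rec_len > block_size - off)
            return REC_CORRUPT;

        if (name_len == want &&
            memcmp(block + off + HDR_SIZE, name, want) == 0)
            return (long)off;

        off += rec_len;     /* rec_len >= HDR_SIZE, so this always advances */
    }
    return REC_NOT_FOUND;
}

/* Two well-formed records: "ab" at offset 0 and "c" at offset 5. */
static const unsigned char good_block[] = {
    5, 0, 2, 'a', 'b',
    4, 0, 1, 'c',
};

/* Same layout, but the second record claims rec_len == 0. */
static const unsigned char bad_block[] = {
    5, 0, 2, 'a', 'b',
    0, 0, 1, 'c',
};
```

With the check placed before the comparison, a corrupted record is reported as an error rather than being matched or silently skipped.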
This patch fixes the first case (image hdb.25.softlockup.gz) reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882. Signed-off-by: Duane Griffin Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit cbe5f466f6995e10a10c7ae66d6dc8608f08a6b8 Author: Hidehiro Kawai Date: Fri Jul 25 01:46:30 2008 -0700 jbd: don't abort if flushing file data failed In ordered mode, the current jbd aborts the journal if a file data buffer has an error. But this behavior is unintended, and we found that it has been adopted accidentally. This patch undoes it and just calls printk() instead of aborting the journal. Additionally, set AS_EIO into the address_space object of the failed buffer which is submitted by journal_do_submit_data() so that fsync() can get -EIO. Missing error checkings are also added to inform errors on file data buffers to the user. The following buffers are targeted. (a) the buffer which has already been written out by pdflush (b) the buffer which has been unlocked before scanned in the t_locked_list loop [akpm@linux-foundation.org: improve grammar in a printk] Signed-off-by: Hidehiro Kawai Acked-by: Jan Kara Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8ef2720397bb813d4985405a5ae7b8ad6474188b Author: Li Zefan Date: Fri Jul 25 01:46:29 2008 -0700 ext3: kill 2 useless magic numbers dx_root_limit() will never return 20, and I can't figure out what 20 stands for. This function has never changed since htree directory indexing was merged. Similar for dx_node_limit() and the magic 22. Signed-off-by: Li Zefan Acked-by: Andreas Dilger Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit fc80c44277b3c92d808b73e9d40e120229aa4b6a Author: Toshiyuki Okajima Date: Fri Jul 25 01:46:29 2008 -0700 jbd: positively dispose the unmapped data buffers in journal_commit_transaction() After ext3-ordered files are truncated, there is a possibility that the pages which cannot be estimated still remain. 
Remaining pages can be released when the system has really few memory. So, it is not memory leakage. But the resource management software etc. may not work correctly. It is possible that journal_unmap_buffer() cannot release the buffers, and the pages to which they belong because they are attached to a commiting transaction and journal_unmap_buffer() cannot release them. To release such the buffers and the pages later, journal_unmap_buffer() leaves it to journal_commit_transaction(). (journal_unmap_buffer() puts the mark 'BH_Freed' to the buffers so that journal_commit_transaction() can identify whether they can be released or not.) In the journalled mode and the writeback mode, jbd does with only metadata buffers. But in the ordered mode, jbd does with metadata buffers and also data buffers. Actually, journal_commit_transaction() releases only the metadata buffers of which release is demanded by journal_unmap_buffer(), and also releases the pages to which they belong if possible. As a result, the data buffers of which release is demanded by journal_unmap_buffer() remain after a transaction commits. And also the pages to which they belong remain. Such the remained pages don't have mapping any longer. Due to this fact, there is a possibility that the pages which cannot be estimated remain. The metadata buffers marked 'BH_Freed' and the pages to which they belong can be released at 'JBD: commit phase 7'. Therefore, by applying the same code into 'JBD: commit phase 2' (where the data buffers are done with), journal_commit_transaction() can also release the data buffers marked 'BH_Freed' and the pages to which they belong. As a result, all the buffers marked 'BH_Freed' can be released, and also all the pages to which these buffers belong can be released at journal_commit_transaction(). So, the page which cannot be estimated is lost. 
    > spin_lock(&journal->j_list_lock);
    > while (commit_transaction->t_forget) {
    >         transaction_t *cp_transaction;
    >         struct buffer_head *bh;
    >
    >         jh = commit_transaction->t_forget;
    >...
    >         if (buffer_freed(bh)) {
    >         ^^^^^^^^^^^^^^^^^^^^^^^^
    >                 clear_buffer_freed(bh);
    >                 ^^^^^^^^^^^^^^^^^^^^^^^^
    >                 clear_buffer_jbddirty(bh);
    >         }
    >
    >         if (buffer_jbddirty(bh)) {
    >                 JBUFFER_TRACE(jh, "add to new checkpointing trans");
    >                 __journal_insert_checkpoint(jh, commit_transaction);
    >                 JBUFFER_TRACE(jh, "refile for checkpoint writeback");
    >                 __journal_refile_buffer(jh);
    >                 jbd_unlock_bh_state(bh);
    >         } else {
    >                 J_ASSERT_BH(bh, !buffer_dirty(bh));
    >                 ...
    >                 JBUFFER_TRACE(jh, "refile or unfile freed buffer");
    >                 __journal_refile_buffer(jh);
    >                 if (!jh->b_transaction) {
    >                         jbd_unlock_bh_state(bh);
    >                         /* needs a brelse */
    >                         journal_remove_journal_head(bh);
    >                         release_buffer_page(bh);
    >                         ^^^^^^^^^^^^^^^^^^^^^^^^
    >                 } else
    >         }

    ****************************************************************
    * Apply the code of "^^^^^^" lines into 'JBD: commit phase 2'  *
    ****************************************************************

    At journal_commit_transaction() code, there is one extra message in the
    series of jbd debug messages. ("JBD: commit phase 2") This patch fixes
    it, too.

    Signed-off-by: Toshiyuki Okajima
    Acked-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit a10320e8f7c4dcfa050aac566092f29b40458d5a
Author: Adrian Bunk
Date:   Fri Jul 25 01:46:26 2008 -0700

    jbd: unexport journal_update_superblock

    Remove the unused EXPORT_SYMBOL(journal_update_superblock).

    Signed-off-by: Adrian Bunk
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 3ccc3167b0e5d46ab3bf03e22fbdb7616ce038cd
Author: Duane Griffin
Date:   Fri Jul 25 01:46:26 2008 -0700

    ext3: handle deleting corrupted indirect blocks

    While freeing indirect blocks we attach a journal head to the parent
    buffer head, free the blocks, then journal the parent.
If the indirect block list is corrupted and points to the parent the journal head will be detached when the block is cleared, causing an OOPS. Check for that explicitly and handle it gracefully. This patch fixes the third case (image hdb.20000057.nullderef.gz) reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882. Immediately above the change, in the ext3_free_data function, we call ext3_clear_blocks to clear the indirect blocks in this parent block. If one of those blocks happens to actually be the parent block it will clear b_private / BH_JBD. I did the check at the end rather than earlier as it seemed more elegant. I don't think there should be much practical difference, although it is possible the FS may not be quite so badly corrupted if we did it the other way (and didn't clear the block at all). To be honest, I'm not convinced there aren't other similar failure modes lurking in this code, although I couldn't find any with a quick review. [akpm@linux-foundation.org: fix printk warning] Signed-off-by: Duane Griffin Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 95450f5a7e53d5752ce1a0d0b8282e10fe745ae0 Author: Hidehiro Kawai Date: Fri Jul 25 01:46:24 2008 -0700 ext3: don't read inode block if the buffer has a write error A transient I/O error can corrupt inode data. Here is the scenario: (1) update inode_A at the block_B (2) pdflush writes out new inode_A to the filesystem, but it results in write I/O error, at this point, BH_Uptodate flag of the buffer for block_B is cleared and BH_Write_EIO is set (3) create new inode_C which located at block_B, and __ext3_get_inode_loc() tries to read on-disk block_B because the buffer is not uptodate (4) if it can read on-disk block_B successfully, inode_A is overwritten by old data This patch makes __ext3_get_inode_loc() not read the inode block if the buffer has BH_Write_EIO flag. 
In this case, the buffer should have the latest information, so setting the uptodate flag to the buffer (this avoids WARN_ON_ONCE() in mark_buffer_dirty().) According to this change, we would need to test BH_Write_EIO flag for the error checking. Currently nobody checks write I/O errors on metadata buffers, but it will be done in other patches I'm working on. Signed-off-by: Hidehiro Kawai Cc: sugita Cc: Satoshi OSHIMA Cc: Nick Piggin Cc: Jan Kara Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ae76dd9a6b5bbe5315fb7028e03f68f75b8538f3 Author: Duane Griffin Date: Fri Jul 25 01:46:23 2008 -0700 ext3: handle corrupted orphan list at mount If the orphan node list includes valid, untruncatable nodes with nlink > 0 the ext3_orphan_cleanup loop which attempts to delete them will not do so, causing it to loop forever. Fix by checking for such nodes in the ext3_orphan_get function. This patch fixes the second case (image hdb.20000009.softlockup.gz) reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: printk warning fix] Signed-off-by: Duane Griffin Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ef1afd39519b74fbe1f63c9ab5a14490effec0e3 Author: Shen Feng Date: Fri Jul 25 01:46:23 2008 -0700 ext3: remove double definitions of xattr macros remove the definitions of macros: XATTR_TRUSTED_PREFIX XATTR_USER_PREFIX since they are defined in linux/xattr.h Signed-off-by: Shen Feng Signed-off-by: Mingming Cao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3f31fddfa26b7594b44ff2b34f9a04ba409e0f91 Author: Mingming Cao Date: Fri Jul 25 01:46:22 2008 -0700 jbd: fix race between free buffer and commit transaction journal_try_to_free_buffers() could race with jbd commit transaction when the later is holding the buffer reference while waiting for the data buffer to flush to disk. 
If the caller of journal_try_to_free_buffers() tries hard to release the buffers, it will treat the failure as an error and return it to its own caller. We have seen direct IO fail due to this race. Some callers of releasepage() also expect the buffer to be dropped when the GFP_KERNEL mask is passed to releasepage()->journal_try_to_free_buffers(). With this patch, if the caller passes __GFP_WAIT and __GFP_FS to indicate that this call may wait, then when try_to_free_buffers() fails we wait for journal_commit_transaction() to finish committing the current committing transaction, and then try to free those buffers again. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Mingming Cao Reviewed-by: Badari Pulavarty Acked-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 9ebfbe9f926553eabc21b4400918d1216b27ed0c Author: Shen Feng Date: Fri Jul 25 01:46:21 2008 -0700 ext3: improve some code in rb tree part of dir.c - remove unnecessary code in free_rb_tree_fname - rename free_rb_tree_fname to ext3_htree_create_dir_info since it and ext3_htree_free_dir_info are a pair - replace kmalloc with kzalloc in ext3_htree_free_dir_info Signed-off-by: Shen Feng Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 1984bb763c2e50d0ebfb0cf56d1b319bd7afe63a Author: Duane Griffin Date: Fri Jul 25 01:46:21 2008 -0700 jbd: tidy up revoke cache initialisation and destruction Make revocation cache destruction safe to call if initialisation fails partially or entirely. This allows it to be used to clean up in the case of initialisation failure, simplifying that code slightly.
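The "destruction safe after partial initialisation" idea from the revoke cache commit above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct cache`, `cache_create()`, `caches_init()` and `caches_destroy()` are hypothetical names, and `malloc()`/`free()` stand in for the kernel's slab cache calls; it is not the jbd code itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a slab cache; not the jbd data structure. */
struct cache {
	const char *name;
};

static struct cache *record_cache;
static struct cache *table_cache;

/* Create a cache, or return NULL when 'fail' simulates an allocation error. */
static struct cache *cache_create(const char *name, int fail)
{
	struct cache *c;

	assert(name != NULL);
	if (fail)
		return NULL;
	c = malloc(sizeof(*c));
	if (c)
		c->name = name;
	return c;
}

/* Safe in every state: nothing, one cache, or both caches allocated. */
static void caches_destroy(void)
{
	if (record_cache) {
		free(record_cache);
		record_cache = NULL;
	}
	if (table_cache) {
		free(table_cache);
		table_cache = NULL;
	}
}

/* Returns 0 on success, -1 on failure; failure leaves nothing allocated. */
static int caches_init(int fail_second)
{
	record_cache = cache_create("record", 0);
	if (!record_cache)
		goto fail;
	table_cache = cache_create("table", fail_second);
	if (!table_cache)
		goto fail;
	return 0;
fail:
	caches_destroy();	/* one cleanup path for every partial state */
	return -1;
}
```

Because the destroy routine checks each pointer and resets it to NULL, the init path needs only one error label, which is the simplification the commit message describes.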
Signed-off-by: Duane Griffin Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f4d79ca2fa211cffc07306eeed7013448e77d7ec Author: Duane Griffin Date: Fri Jul 25 01:46:20 2008 -0700 jbd: eliminate duplicated code in revocation table init/destroy functions The revocation table initialisation/destruction code is repeated for each of the two revocation tables stored in the journal. Refactoring the duplicated code into functions is tidier, simplifies the logic in initialisation in particular, and slightly reduces the code size. There should not be any functional change. Signed-off-by: Duane Griffin Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3850f7a521dc17659ef6758a219f083418788490 Author: Duane Griffin Date: Fri Jul 25 01:46:19 2008 -0700 jbd: replace potentially false assertion with if block If an error occurs during jbd cache initialisation it is possible for the journal_head_cache to be NULL when journal_destroy_journal_head_cache is called. Replace the J_ASSERT with an if block to handle the situation correctly. Note that even with this fix things will break badly if jbd is statically compiled in and cache initialisation fails. Signed-off-by: Duane Griffin Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit d06bf1d252fe16f5f0d13e04da7a9913420aa1cf Author: Jan Kara Date: Fri Jul 25 01:46:18 2008 -0700 ext3: correct mount option parsing to detect when quota options can be changed We should not allow the user to change quota mount options when quota is just suspended. It would make mount options and internal quota state inconsistent. Also we should not allow the user to change the quota format when quota is turned on. On the other hand we can just silently ignore the case where some option is set to the value it already has (mount does this on remount).
Cc: Signed-off-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 99aeaf639f61ab6be1967e5f92e2e28dafad8383 Author: Jan Kara Date: Fri Jul 25 01:46:17 2008 -0700 ext3: fix typos in messages and comments (journalled -> journaled) Cc: Signed-off-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 9cfe7b9010aa66da5f3b2bc33d9e30a4d53bd274 Author: Jan Kara Date: Fri Jul 25 01:46:16 2008 -0700 ext3: fix synchronization of quota files in journal=data mode In journal=data mode, it is not enough to do write_inode_now as done in vfs_quota_on() to write all data to their final location (which is needed for quota_read to work correctly). Calling journal_flush() does its job. Reported-by: Nick Cc: Signed-off-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 50c33a84db4aa5082e3af8d873b22344ae2ebea8 Author: Samuel Thibault Date: Fri Jul 25 01:46:16 2008 -0700 ext2: fix typo in Hurd part of include/linux/ext2_fs.h Fix typo in Hurd part of include/linux/ext2_fs.h The ';' here is redundant or can even pose problem. This is actually not used by the Linux kernel, but it is exposed in GNU/Hurd. Signed-off-by: Samuel Thibault Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f905f06fca5d3949eca12f5a43e251a404b3470a Author: Shen Feng Date: Fri Jul 25 01:46:15 2008 -0700 ext2: remove double definitions of xattr macros remove the definitions of macros: XATTR_TRUSTED_PREFIX XATTR_USER_PREFIX since they are defined in linux/xattr.h Signed-off-by: Shen Feng Signed-off-by: Mingming Cao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit fb523f32275344282f20ef3352cbf03e599241e6 Author: Adrian Bunk Date: Fri Jul 25 01:46:14 2008 -0700 minix: remove !NO_TRUNCATE code This patch removes the !NO_TRUNCATE code that anyway required a manual editing of the code for being used. 
Signed-off-by: Adrian Bunk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit bbcd6d543de335bf81e96477f46a60a8bf51039c Author: Eric Miao Date: Fri Jul 25 01:46:14 2008 -0700 gpio: max732x driver This adds a driver supporting a family of I2C port expanders from Maxim, which includes the MAX7319 and MAX7320-7327 chips. [dbrownell@users.sourceforge.net: minor fixes] Signed-off-by: Jack Ren Signed-off-by: Eric Miao Acked-by: Jean Delvare Signed-off-by: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7444a72effa632fcd8edc566f880d96fe213c73b Author: Michael Buesch Date: Fri Jul 25 01:46:11 2008 -0700 gpiolib: allow user-selection This patch adds functionality to the gpio-lib subsystem to make it possible to enable the gpio-lib code even if the architecture code didn't request to get it built in. The architecture code does still need to implement the gpiolib accessor functions in its asm/gpio.h file. This patch adds the implementations for x86 and PPC. With these changes it is possible to run generic GPIO expansion cards on every architecture that implements the trivial wrapper functions. Support for more architectures can easily be added. Signed-off-by: Michael Buesch Cc: Benjamin Herrenschmidt Cc: Stephen Rothwell Cc: David Brownell Cc: Russell King Cc: Haavard Skinnemoen Cc: Jesper Nilsson Cc: Ralf Baechle Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Jean Delvare Cc: Samuel Ortiz Cc: Kumar Gala Cc: Sam Ravnborg Cc: Adrian Bunk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ff1d5c2f0268f4e32103536e2e65480b5b7b6530 Author: Michael Buesch Date: Fri Jul 25 01:46:10 2008 -0700 gpio: add bt8xxgpio driver This adds the bt8xxgpio driver. The purpose of the bt8xxgpio driver is to export all of the 24 GPIO pins available on Brooktree 8xx chips to the kernel GPIO infrastructure. This makes it possible to use a physically modified BT8xx card as a cheap digital GPIO card.
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Michael Buesch Cc: David Brownell Cc: Stephen Rothwell Cc: Mauro Carvalho Chehab Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8f1cc3b10e6ee0c5c7c8ed27f8771c4f252b4862 Author: David Brownell Date: Fri Jul 25 01:46:09 2008 -0700 gpio: mcp23s08 handles multiple chips per chipselect Teach the mcp23s08 driver about a curious feature of these chips: up to four of them can share the same chipselect, with the SPI signals wired in parallel, by matching two bits in the first protocol byte against two address lines on the chip. This is handled by three software changes: * Platform data now holds an array of per-chip structs, not just one chip's address and pullup configuration. * Probe() and remove() now use another level of structure, wrapping an instance of the original structure for each mcp23s08 chip sharing that chipselect. * The HAEN bit is set, so that the hardware address bits can no longer be ignored (boot firmware may not have enabled them). The "one struct per chip" preserves the guts of the current code, but platform_data will need minor changes. OLD: /* incorrect "slave" ID may not have mattered */ .slave = 3, .pullups = BIT(3) | BIT(1) | BIT(0), NEW: /* slave address _must_ match chip's wiring */ .chip[3] = { .is_present = true, .pullups = BIT(3) | BIT(1) | BIT(0), }, There's no change in how things _behave_ for spi_device nodes with a single mcp23s08 chip. New multi-chip configurations assign GPIOs in sequence, without holes. The spi_device just resembles a bigger controller, but internally it has multiple gpio_chip instances. Signed-off-by: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit d8f388d8dc8d4f36539dd37c1fff62cc404ea0fc Author: David Brownell Date: Fri Jul 25 01:46:07 2008 -0700 gpio: sysfs interface This adds a simple sysfs interface for GPIOs. /sys/class/gpio /export ... asks the kernel to export a GPIO to userspace /unexport ... 
to return a GPIO to the kernel /gpioN ... for each exported GPIO #N /value ... always readable, writes fail for input GPIOs /direction ... r/w as: in, out (default low); write high, low /gpiochipN ... for each gpiochip; #N is its first GPIO /base ... (r/o) same as N /label ... (r/o) descriptive, not necessarily unique /ngpio ... (r/o) number of GPIOs; numbered N .. N+(ngpio - 1) GPIOs claimed by kernel code may be exported by its owner using a new gpio_export() call, which should be most useful for driver debugging. Such exports may optionally be done without a "direction" attribute. Userspace may ask to take over a GPIO by writing to a sysfs control file, helping to cope with incomplete board support or other "one-off" requirements that don't merit full kernel support: echo 23 > /sys/class/gpio/export ... will gpio_request(23, "sysfs") and gpio_export(23); use /sys/class/gpio/gpio-23/direction to (re)configure it, when that GPIO can be used as both input and output. echo 23 > /sys/class/gpio/unexport ... will gpio_free(23), when it was exported as above The extra D-space footprint is a few hundred bytes, except for the sysfs resources associated with each exported GPIO. The additional I-space footprint is about two thirds of the current size of gpiolib (!). Since no /dev node creation is involved, no "udev" support is needed. Related changes: * This adds a device pointer to "struct gpio_chip". When GPIO providers initialize that, sysfs gpio class devices become children of that device instead of being "virtual" devices. * The (few) gpio_chip providers which have such a device node have been updated. * Some gpio_chip drivers also needed to update their module "owner" field ... for which missing kerneldoc was added. * Some gpio_chips don't support input GPIOs. Those GPIOs are now flagged appropriately when the chip is registered. Based on previous patches, and discussion both on and off LKML. 
A Documentation/ABI/testing/sysfs-gpio update is ready to submit once this merges to mainline. [akpm@linux-foundation.org: a few maintenance build fixes] Signed-off-by: David Brownell Cc: Guennadi Liakhovetski Cc: Greg KH Cc: Kay Sievers Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8b6dd986823a8d92ed9f54baa5cef8604d9d9d44 Author: Abhishek Sagar Date: Fri Jul 25 01:46:05 2008 -0700 kprobes: remove redundant config check I noticed that there's a CONFIG_KPROBES check inside kernel/kprobes.c, which is redundant. Signed-off-by: Abhishek Sagar Acked-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Anil S Keshavamurthy Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ef53d9c5e4da147ecaa43c44c5e5945eb83970a2 Author: Srinivasa D S Date: Fri Jul 25 01:46:04 2008 -0700 kprobes: improve kretprobe scalability with hashed locking Currently the lists of kretprobe instances are stored in the kretprobe object (as used_instances, free_instances) and in the kretprobe hash table. We have one global kretprobe lock to serialise access to these lists. This allows only one kretprobe handler to execute at a time, which hurts system performance, particularly on SMP systems and when return probes are set on a lot of functions (such as all system calls). The solution proposed here uses fine-grained locks that perform better on SMP systems than the present kretprobe implementation. Solution: 1) Instead of having one global lock to protect the kretprobe instances present in the kretprobe object and the kretprobe hash table, we will have two locks: one lock protecting the kretprobe hash table and another lock for the kretprobe object. 2) We hold the lock present in the kretprobe object while we modify a kretprobe instance in the kretprobe object, and we hold the per-hash-list lock while modifying kretprobe instances present in that hash list. To prevent deadlock, we never grab a per-hash-list lock while holding a kretprobe lock.
3) We can remove used_instances from struct kretprobe, as we can track used instances of kretprobe instances using the kretprobe hash table. Time duration for kernel compilation ("make -j 8") on an 8-way ppc64 system with return probes set on all system calls looks like this.
        cacheline        non-cacheline    Un-patched kernel
        aligned patch    aligned patch
===============================================================================
real    9m46.784s        9m54.412s        10m2.450s
user    40m5.715s        40m7.142s        40m4.273s
sys     2m57.754s        2m58.583s        3m17.430s
===========================================================
Time duration for kernel compilation ("make -j 8") on the same system, when kernel is not probed.
=========================
real    9m26.389s
user    40m8.775s
sys     2m7.283s
=========================
Signed-off-by: Srinivasa DS Signed-off-by: Jim Keniston Acked-by: Ananth N Mavinakayanahalli Cc: Anil S Keshavamurthy Cc: David S. Miller Cc: Masami Hiramatsu Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 53a9600c634e3bfd6230e0597aca159bf4d4d010 Author: Ben Dooks Date: Fri Jul 25 01:46:03 2008 -0700 mfd: sm501 fix gpio number calculation for upper bank The sm501_gpio_pin2nr() routine returns the wrong values for gpios in the upper bank. Signed-off-by: Ben Dooks Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f2999209d779573e17468b680f5f267d8cb2a9c7 Author: Ben Dooks Date: Fri Jul 25 01:46:02 2008 -0700 mfd: sm501 build fixes when CONFIG_MFD_SM501_GPIO unset Fix the build problems if CONFIG_MFD_SM501_GPIO is not set, which is generally when there is no gpiolib support available as currently happens on x86 when building PCI SM501. Signed-off-by: Ben Dooks Tested-by: Li Zefan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 28130bea3bcfefe3437b0a5dcab786f1f0296953 Author: Ben Dooks Date: Fri Jul 25 01:46:02 2008 -0700 sm501: fixes for akpms comments on gpiolib addition Fixup the comments from the patch that added the gpiolib support from Andrew Morton.
These include spotting some missing frees on error or release, and changing a memcpy for a type-safe assignment. Signed-off-by: Ben Dooks Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 42cd2366fb9b58cdfc1855be32b31a78e40b2079 Author: Ben Dooks Date: Fri Jul 25 01:46:01 2008 -0700 sm501: gpio I2C support Add support for adding the GPIO based I2C resources. Signed-off-by: Ben Dooks Cc: Arnaud Patard Cc: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 60e540d617b40eb3d37f1dd99c97af588ff9b70b Author: Arnaud Patard Date: Fri Jul 25 01:46:00 2008 -0700 sm501: gpio dynamic registration for PCI devices The SM501 PCI card requires a dynamic gpio allocation as the number of cards is not known at compile time. Fixup the platform data and registration to deal with this. Acked-by: Ben Dooks Signed-off-by: Arnaud Patard Cc: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f61be273d3699d174bc1438e6804f9f9e52bb932 Author: Ben Dooks Date: Fri Jul 25 01:45:59 2008 -0700 sm501: add gpiolib support Add support for exporting the GPIOs on the SM501 via gpiolib. Signed-off-by: Ben Dooks Cc: Arnaud Patard Cc: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 472dba7d117844c746be97db6be26c2810d79b62 Author: Ben Dooks Date: Fri Jul 25 01:45:58 2008 -0700 sm501: add power control callback Add callback to get or set the power control if the device has the sleep connected to some form of GPIO. Signed-off-by: Ben Dooks Cc: Arnaud Patard Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 717115e1a5856b57af0f71e1df7149108294fc10 Author: Dave Young Date: Fri Jul 25 01:45:58 2008 -0700 printk ratelimiting rewrite All ratelimit users share the same jiffies and burst params, so some messages (callbacks) will be lost. For example: if a calls printk_ratelimit(5 * HZ, 1) and b calls printk_ratelimit(5 * HZ, 1) before the 5*HZ timeout of a expires, then b will be suppressed.
- rewrite __ratelimit, and use a ratelimit_state as parameter. Thanks for hints from andrew. - Add WARN_ON_RATELIMIT, update rcupreempt.h - remove __printk_ratelimit - use __ratelimit in net_ratelimit Signed-off-by: Dave Young Cc: "David S. Miller" Cc: "Paul E. McKenney" Cc: Dave Young Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 2711b793eb62a5873a0ba583a69252040aef176e Author: Vegard Nossum Date: Fri Jul 25 01:45:56 2008 -0700 kallsyms: unify 32- and 64-bit code Use the %p format string which already accounts for the padding you need with a pointer type on a particular architecture. Also replace the macro with a static inline function to match the rest of the file. Cc: Heiko Carstens Cc: Arjan van de Ven Signed-off-by: Vegard Nossum Cc: Sam Ravnborg Cc: Randy Dunlap Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 924d9addb9b1474fc81a78a5c6706755efea7aaa Author: Dave Jones Date: Fri Jul 25 01:45:55 2008 -0700 list debugging: use WARN() instead of BUG() Arjan noted that the list_head debugging is BUG'ing when it detects corruption. By causing the box to panic immediately, we're possibly losing some bug reports. Changing this to a WARN() should mean we at the least start seeing reports collected at kerneloops.org Signed-off-by: Dave Jones Cc: Matthew Wilcox Cc: Arjan van de Ven Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit d955c78ac4699ac9c3fe07be62982cda13d13267 Author: Arjan van de Ven Date: Fri Jul 25 01:45:55 2008 -0700 Example use of WARN() Now that WARN() exists, we can fold some of the printk's into it. 
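As a rough illustration of folding a printk() plus WARN_ON() pair into a single WARN(), here is a userspace sketch. Everything in it is an assumption made for the example: `warn_report()`, `warn_count` and `check_irq()` are hypothetical, `fprintf(stderr, ...)` stands in for printk(), and the macro only approximates the kernel's WARN(), which also dumps a stack trace.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

static int warn_count;	/* hypothetical counter, only for the sketch */

/* Print the banner and the formatted message together, as one report. */
static void warn_report(const char *file, int line, const char *fmt, ...)
{
	va_list args;

	assert(fmt != NULL);
	fprintf(stderr, "------------[ cut here ]------------\n");
	fprintf(stderr, "WARNING: at %s:%d\n", file, line);
	va_start(args, fmt);
	vfprintf(stderr, fmt, args);
	va_end(args);
	warn_count++;
}

/*
 * Approximation of WARN(): report when the condition is true and
 * evaluate to the condition's truth value so it can sit inside an if ().
 */
#define WARN(condition, ...) \
	((condition) ? (warn_report(__FILE__, __LINE__, __VA_ARGS__), 1) : 0)

/* Hypothetical caller. Old style would have been:
 *   if (irq >= nr_irqs) { printk("bad irq ..."); WARN_ON(1); return -1; }
 */
static int check_irq(int irq, int nr_irqs)
{
	if (WARN(irq >= nr_irqs, "bad irq %d (limit %d)\n", irq, nr_irqs))
		return -1;
	return 0;
}
```

Keeping the message inside the warning report is the point of the conversion: the text travels with the "cut here" section instead of appearing as a separate log line that a bug reporter may omit.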
Signed-off-by: Arjan van de Ven Cc: Greg KH Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7a2c477069fbd32f91598f05334003979b987a39 Author: Arjan van de Ven Date: Fri Jul 25 01:45:54 2008 -0700 kernel/irq/manage.c: replace a printk + WARN_ON() to a WARN() Replace a printk+WARN_ON() by a WARN(); this increases the chance of the string making it into the bugreport (ie: it goes inside the ---[ cut here ]--- section) Signed-off-by: Arjan van de Ven Cc: Thomas Gleixner Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a8f18b909c0a3f22630846207035c8b84bb252b8 Author: Arjan van de Ven Date: Fri Jul 25 01:45:53 2008 -0700 Add a WARN() macro; this is WARN_ON() + printk arguments Add a WARN() macro that acts like WARN_ON(), with the added feature that it takes a printk like argument that is printed as part of the warning message. [akpm@linux-foundation.org: fix printk arguments] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Arjan van de Ven Cc: Greg KH Cc: Jiri Slaby Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b6c63937001889af6fe431aaba97e59d04e028e7 Author: Arjan van de Ven Date: Fri Jul 25 01:45:52 2008 -0700 Rename WARN() to WARNING() to clear the namespace We want to use WARN() as a variant of WARN_ON(), however a few drivers are using WARN() internally. This patch renames these to WARNING() to avoid the namespace clash. A few cases were defining but not using the thing, for those cases I just deleted the definition. 
Signed-off-by: Arjan van de Ven Acked-by: Greg KH Cc: Karsten Keil Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f38954c93c4a548f55d73ac5c1cf5e7f4023bb6c Author: Andrew Morton Date: Fri Jul 25 01:45:52 2008 -0700 drivers/misc/hpilo.c needs CONFIG_PCI m68k allmodconfig: drivers/misc/hpilo.c: In function 'ilo_ccb_close': drivers/misc/hpilo.c:225: error: implicit declaration of function 'pci_free_consistent' drivers/misc/hpilo.c: In function 'ilo_ccb_open': drivers/misc/hpilo.c:244: error: implicit declaration of function 'pci_alloc_consistent' drivers/misc/hpilo.c:245: warning: assignment makes pointer from integer without a cast Cc: David Altobelli Cc: Greg Kroah-Hartman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a7f371e54fac49ff62bb640d4a7276fca01527e8 Author: Johannes Weiner Date: Fri Jul 25 01:45:51 2008 -0700 documentation: update CodingStyle tips for Emacs users Describe a setup that integrates better with Emacs' cc-mode and also fixes up the alignment of continuation lines to really only use tabs. Signed-off-by: Johannes Weiner Cc: Jonathan Corbet Cc: Randy Dunlap Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 197dcffc8ba0ea943fee86e28e99cd9575799772 Author: Daniel Guilak Date: Fri Jul 25 01:45:50 2008 -0700 init/version.c: define version_string only if CONFIG_KALLSYMS is not defined int Version_* is only used with ksymoops, which is only needed (according to README and Documentation/Changes) if CONFIG_KALLSYMS is NOT defined. Therefore this patch defines version_string only if CONFIG_KALLSYMS is not defined. 
Signed-off-by: Daniel Guilak Cc: Randy Dunlap Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 277e2c695907a70b316a31769cd891dc4d43b7f3 Author: Daniel Guilak Date: Fri Jul 25 01:45:49 2008 -0700 init/version.c: silence sparse warning by declaring the version string Signed-off-by: Daniel Guilak Cc: Randy Dunlap Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4500d067eeb3d00679335d9cf5c6536e79cd3ef4 Author: Robert P. J. Day Date: Fri Jul 25 01:45:49 2008 -0700 init.h: remove obsolete content Remove apparently obsolete content from init.h referring to gcc 2.9x and to "no_module_init". Signed-off-by: Robert P. J. Day Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit db358b40e0674fd4079204d8e3e1c8ab3829a1b9 Author: Kay Sievers Date: Fri Jul 25 01:45:48 2008 -0700 parport: fix platform driver hotplug/coldplug Since 43cc71eed1250755986da4c0f9898f9a635cb3bf (platform: prefix MODALIAS with "platform:"), the platform modalias is prefixed with "platform:". Add MODULE_ALIAS() to the hotpluggable parport platform drivers, to re-enable auto loading. Signed-off-by: Kay Sievers Signed-off-by: David Brownell Cc: Greg KH Cc: "Rafael J. Wysocki" Acked-by: Ben Dooks Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4f46d6e7e5ffbce0ee1d1a80767fdf45e56cc863 Author: Kay Sievers Date: Fri Jul 25 01:45:47 2008 -0700 mfd: fix platform driver hotplug/coldplug Since 43cc71eed1250755986da4c0f9898f9a635cb3bf (platform: prefix MODALIAS with "platform:"), the platform modalias is prefixed with "platform:". Add MODULE_ALIAS() to the MFD platform drivers, to re-enable auto loading. [dbrownell@users.sourceforge.net: one was missing] Signed-off-by: Kay Sievers Signed-off-by: David Brownell Cc: Greg KH Cc: "Rafael J. 
Wysocki" Cc: Samuel Ortiz Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 2f5a5cf93fae7b8354b45b8443dcc3448a8fc276 Author: Kay Sievers Date: Fri Jul 25 01:45:46 2008 -0700 drivers/power: fix platform driver hotplug/coldplug Since 43cc71eed1250755986da4c0f9898f9a635cb3bf (platform: prefix MODALIAS with "platform:"), the platform modalias is prefixed with "platform:". Add MODULE_ALIAS() to the hotpluggable "power" drivers, to re-enable auto loading. [dbrownell@users.sourceforge.net: one was missing] Signed-off-by: Kay Sievers Signed-off-by: David Brownell Cc: Greg KH Cc: "Rafael J. Wysocki" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 2d6ffcca623a9a16df6cdfbe8250b7a5904a5f5e Author: Thomas Petazzoni Date: Fri Jul 25 01:45:44 2008 -0700 inflate: refactor inflate malloc code Inflate requires some dynamic memory allocation very early in the boot process and this is provided with a set of four functions: malloc/free/gzip_mark/gzip_release. The old inflate code used a mark/release strategy rather than implement free. This new version instead keeps a count of the number of outstanding allocations and when it hits zero, it resets the malloc arena. This allows removing all the mark and release implementations and unifying all the malloc/free implementations. The architecture-dependent code must define two addresses: - free_mem_ptr, the address of the beginning of the area in which allocations should be made - free_mem_end_ptr, the address of the end of the area in which allocations should be made. If set to 0, then no check is made on the number of allocations, it just grows as much as needed The architecture-dependent code can also provide an arch_decomp_wdog() function call. This function will be called several times during the decompression process, and allows the watchdog to be notified that the system is still running.
If an architecture provides such a call, then it must define ARCH_HAS_DECOMP_WDOG so that the generic inflate code calls arch_decomp_wdog(). Work initially done by Matt Mackall, updated to a recent version of the kernel and improved by me. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Thomas Petazzoni Cc: Matt Mackall Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Mikael Starvik Cc: Jesper Nilsson Cc: Haavard Skinnemoen Cc: David Howells Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Andi Kleen Cc: "H. Peter Anvin" Acked-by: Paul Mundt Acked-by: Yoshinori Sato Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ba92a43dbaee339cf5915ef766d3d3ffbaaf103c Author: Hugh Dickins Date: Fri Jul 25 01:45:43 2008 -0700 exec: remove some includes fs/exec.c used to need mman.h pagemap.h swap.h and rmap.h when it did mm-ish stuff in install_arg_page(); but no need for them after 2.6.22. [akpm@linux-foundation.org: unbreak arm] Signed-off-by: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 2b4bc46052ea8cd7c370b67ca0b9c26586f1439a Author: OGAWA Hirofumi Date: Fri Jul 25 01:45:42 2008 -0700 pdflush: use time_after() instead of open-coding it Signed-off-by: OGAWA Hirofumi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b69c49b78457f681ecfb3147bd968434ee6559c1 Author: FUJITA Tomonori Date: Fri Jul 25 01:45:40 2008 -0700 clean up duplicated alloc/free_thread_info We duplicate alloc/free_thread_info defines on many platforms (the majority uses __get_free_pages/free_pages). This patch defines common defines and removes these duplicated defines. __HAVE_ARCH_THREAD_INFO_ALLOCATOR is introduced for platforms that do something different. 
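The "common default unless the architecture opts out" pattern described in the thread_info commit above can be sketched as follows. This is a userspace approximation under stated assumptions: the `THREAD_SIZE` value and the `struct thread_info` layout are made up for the example, and `calloc()`/`free()` stand in for the kernel's page allocator calls.

```c
#include <assert.h>
#include <stdlib.h>

#define THREAD_SIZE 8192	/* hypothetical size for this sketch */

struct thread_info {
	unsigned long flags;	/* hypothetical single field */
};

/*
 * Generic defaults. A platform that needs something different would
 * define __HAVE_ARCH_THREAD_INFO_ALLOCATOR before this point and
 * provide its own alloc_thread_info()/free_thread_info().
 */
#ifndef __HAVE_ARCH_THREAD_INFO_ALLOCATOR
static struct thread_info *alloc_thread_info(void)
{
	/* calloc() is a stand-in for __get_free_pages() */
	return calloc(1, THREAD_SIZE);
}

static void free_thread_info(struct thread_info *ti)
{
	assert(ti != NULL);
	free(ti);		/* stand-in for free_pages() */
}
#endif
```

With the guard in one shared place, each platform that uses the majority behaviour carries no allocator definitions of its own, which is the duplication the commit removes.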
Signed-off-by: FUJITA Tomonori Acked-by: Russell King Cc: Pekka Enberg Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 62ec30d45ecbb85b5991474c8f04192697687495 Author: Matthew Garrett Date: Fri Jul 25 01:45:39 2008 -0700 misc: add HP WMI laptop extras driver This driver adds support for reading and configuring certain information on modern HP laptops with WMI BIOS interfaces. It supports enabling and disabling the ambient light sensor, querying attached displays and hard drive temperature, sending events on docking and querying the state of the dock and toggling the state of the wifi, bluetooth and wwan hardware via rfkill. It also makes the little "(i)" button work on machines that send that via WMI rather than via the keyboard controller. Signed-off-by: Matthew Garrett Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ac331d158e198d2a91a5b0a3ec4ca9991fdb57af Author: KOSAKI Motohiro Date: Fri Jul 25 01:45:38 2008 -0700 call_usermodehelper(): increase reliability Presently call_usermodehelper_setup() uses GFP_ATOMIC. but it can return NULL _very_ easily. GFP_ATOMIC is needed only when we can't sleep. and, GFP_KERNEL is robust and better. thus, I add gfp_mask argument to call_usermodehelper_setup(). So, its callers pass the gfp_t as below: call_usermodehelper() and call_usermodehelper_keys(): depend on 'wait' argument. call_usermodehelper_pipe(): always GFP_KERNEL because always run under process context. orderly_poweroff(): pass to GFP_ATOMIC because may run under interrupt context. Signed-off-by: KOSAKI Motohiro Cc: "Paul Menage" Reviewed-by: Li Zefan Acked-by: Jeremy Fitzhardinge Cc: Rusty Russell Cc: Andi Kleen Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f557d0996a6f9c06912528ea85e1dba0fb7d485f Author: Adrian Bunk Date: Fri Jul 25 01:45:37 2008 -0700 remove some more tipar bits Some bits were missed when the tipar driver was removed. 
Signed-off-by: Adrian Bunk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f16695f4ac088cf7593e113574046d2d7e5af5eb Author: Adrian Bunk Date: Fri Jul 25 01:45:36 2008 -0700 asm-generic/int-ll64.h: always provide __{s,u}64 Several compilers offer "long long" without claiming to support C99. Considering how frequent __s64/__u64 are used our userspace headers are anyway unusable without __s64/__u64 available. Always offer __s64/__u64 to non-gcc non-C99 compilers - if they provide "long long" that makes the headers compiling and if they don't they are anyway screwed. Signed-off-by: Adrian Bunk Acked-by: H. Peter Anvin Cc: Harvey Harrison Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit cebbd3fb803603b12408458ba17c29ce1e15a5f2 Author: Andrew Morton Date: Fri Jul 25 01:45:35 2008 -0700 build-kernel-profileo-only-when-requested-cleanups Cc: Adrian Bunk Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b03f6489f9f27dc519a4c60ebf39cc7b8a58eae7 Author: Adrian Bunk Date: Fri Jul 25 01:45:35 2008 -0700 build kernel/profile.o only when requested Build kernel/profile.o only if CONFIG_PROFILING is enabled. This makes CONFIG_PROFILING=n kernels smaller. As a bonus, some profile_tick() calls and one branch from schedule() are now eliminated with CONFIG_PROFILING=n (but I doubt these are measurable effects). This patch changes the effects of CONFIG_PROFILING=n, but I don't think having more than two choices would be the better choice. This patch also adds the name of the first parameter to the prototypes of profile_{hits,tick}() since I anyway had to add them for the dummy functions. Signed-off-by: Adrian Bunk Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 696adfe84c11c571a1e0863460ff0ec142b4e5a9 Author: Paul E. 
McKenney Date: Fri Jul 25 01:45:34 2008 -0700 list_for_each_rcu must die: networking All uses of list_for_each_rcu() can be profitably replaced by the easier-to-use list_for_each_entry_rcu(). This patch makes this change for networking, in preparation for removing the list_for_each_rcu() API entirely. Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 2fc9c4e18f94431e7eb77d97edb2a995b46fba55 Author: Vegard Nossum Date: Fri Jul 25 01:45:34 2008 -0700 kallsyms: fix potential overflow in binary search This will probably never trigger... but it won't hurt to be careful. http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html Signed-off-by: Vegard Nossum Cc: Joshua Bloch Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 58340a07c194e0aed7bc58b61ff24330bb2a409f Author: Johannes Berg Date: Fri Jul 25 01:45:33 2008 -0700 introduce HAVE_EFFICIENT_UNALIGNED_ACCESS Kconfig symbol In many cases, especially in networking, it can be beneficial to know at compile time whether the architecture can do unaligned accesses efficiently. This patch introduces a new Kconfig symbol HAVE_EFFICIENT_UNALIGNED_ACCESS for that purpose and adds it to the powerpc and x86 architectures. Also add some documentation about alignment and networking, and especially one intended use of this symbol. Signed-off-by: Johannes Berg Acked-by: Sam Ravnborg Acked-by: Ingo Molnar [x86 architecture part] Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e0ce0da9fefcc723dc006c35a7f91a32750abd40 Author: Robert P. J. 
Day Date: Fri Jul 25 01:45:32 2008 -0700 lists: remove a redundant conditional definition of list_add() Remove the conditional surrounding the definition of list_add() from list.h since, if you define CONFIG_DEBUG_LIST, the definition you will subsequently pick up from lib/list_debug.c will be absolutely identical, at which point you can remove that redundant definition from list_debug.c as well. Signed-off-by: Robert P. J. Day Cc: Dave Jones Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit fd193829744bc77392395cf8f47889235c97f0a3 Author: Robert P. J. Day Date: Fri Jul 25 01:45:31 2008 -0700 lib: allow memparse() to accept a NULL and ignorable second parm Extend memparse() to allow the caller to use a NULL second parameter, which would represent no interest in returning the address of the end of the parsed string. In numerous cases, callers invoke memparse() to parse a possibly-suffixed string (such as "64K" or "2G" or whatever) and define a character pointer to accept the end pointer being returned by memparse() even though they have no interest in it and promptly throw it away. This (backward-compatible) enhancement allows callers to use NULL in the cases where they just don't care about getting back that end pointer. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Robert P. J. Day Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit cb345d7352aa9e692ef4b83c41d3e6e1cdb2f846 Author: Robert P. J. Day Date: Fri Jul 25 01:45:30 2008 -0700 init/: delete hard-coded setting and testing of BUILD_CRAMDISK There seems to be little point in explicitly setting, then testing the macro BUILD_CRAMDISK within the context of a single source file. Signed-off-by: Robert P. J. Day Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b39c08cb692cb8898c30e0d8187c7cbe27cc905c Author: Robert P. J. Day Date: Fri Jul 25 01:45:29 2008 -0700 Remove apparently unused fd1772.h header file. 
This header file has been unused for quite some time, and the corresponding source files appear to have been removed back in commit 99eb8a550dbccc0e1f6c7e866fe421810e0585f6 ("Remove the arm26 port") Signed-off-by: Robert P. J. Day Cc: Adrian Bunk Cc: Ian Molton Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 82c8253ac27291d6c70114eb445c714359812a10 Author: Adrian Bunk Date: Fri Jul 25 01:45:29 2008 -0700 init/do_mounts.c should #include Every file should include the headers containing the externs for its global code (in this case for rd_doload). Signed-off-by: Adrian Bunk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit abddaec56ebb7911bbf0578a4636a74bd7376d92 Author: Eric Sandeen Date: Fri Jul 25 01:45:28 2008 -0700 fix checkstack.pl arch detection uname -m was leaving a newline in $arch, and not passing the tests. Also, printing the unknown arch on failure is probably helpful. Signed-off-by: Eric Sandeen Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 585e93ae83b80c874bf4eb50a239027cef5db4af Author: Eric Sandeen Date: Fri Jul 25 01:45:27 2008 -0700 find dynamic stack allocations in checkstack.pl Currently, checkstack.pl only looks for fixed subtractions from the stack pointer. However, things like this: void function(int size) { char stackbuster[size << 2]; ... are certainly worth pointing out, I think. This could perhaps be done more cleanly, and the following patch only adds "dynamic" REs for x86 and x86_64, but it works: 0x00b0 crypto_cbc_decrypt_inplace [cbc]: Dynamic (%rax) 0x00ad crypto_pcbc_decrypt_inplace [pcbc]: Dynamic (%rax) 0x02f6 crypto_pcbc_encrypt_inplace [pcbc]: Dynamic (%rax) 0x036c _crypto_xcbc_digest_setkey [xcbc]: Dynamic (%rax) ... 
(Inspired by Keith Owens' old stack-check script) Signed-off-by: Eric Sandeen Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 545e400619b24b6b17b7f1f1e838e9ff6d036949 Author: Harvey Harrison Date: Fri Jul 25 01:45:27 2008 -0700 lzo: use get/put_unaligned_* helpers Signed-off-by: Harvey Harrison Acked-by: Richard Purdie Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8b5ac31e27135a6f2c210c40d03bf8f1b3a86b77 Author: Harvey Harrison Date: Fri Jul 25 01:45:26 2008 -0700 include: use get/put_unaligned_* helpers Signed-off-by: Harvey Harrison Cc: "John W. Linville" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b7bbf8fa6ba329b3552b75a0716f5fbc6f839499 Author: Harvey Harrison Date: Fri Jul 25 01:45:25 2008 -0700 fs: ldm.[ch] use get_unaligned_* helpers Replace the private BE16/BE32/BE64 macros with direct calls to get_unaligned_be16/32/64. Signed-off-by: Harvey Harrison Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3f307891ce0e7b0438c432af1aacd656a092ff45 Author: Steven Rostedt Date: Fri Jul 25 01:45:25 2008 -0700 locking: add typecheck on irqsave and friends for correct flags There have been several areas in the kernel where an int has been used for flags in local_irq_save() and friends instead of a long. This can cause some hard-to-debug problems on some architectures. This patch adds a typecheck inside the irqsave and restore functions to flag these cases.
[akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: build fix] Signed-off-by: Steven Rostedt Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e0deaff470900a4c3222ca7139f6c9639e26a2f5 Author: Andrew Morton Date: Fri Jul 25 01:45:24 2008 -0700 split the typecheck macros out of include/linux/kernel.h Needed to fix up a recursive include snafu in locking-add-typecheck-on-irqsave-and-friends-for-correct-flags.patch Cc: Steven Rostedt Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 5df439ef06d4173357711a04740aa8bfcf50d621 Author: Wang Chen Date: Fri Jul 25 01:45:23 2008 -0700 flag parameters: fix compile error of sys_epoll_create1 GEN .version CHK include/linux/compile.h UPD include/linux/compile.h CC init/version.o LD init/built-in.o LD vmlinux arch/x86/kernel/built-in.o: In function `sys_call_table': (.rodata+0x8a4): undefined reference to `sys_epoll_create1' make: *** [vmlinux] Error 1 Signed-off-by: Wang Chen Cc: Ulrich Drepper Cc: Davide Libenzi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c82dd5321cf779f1f536ef26b383cbe8c9de7f10 Author: Andrew Morton Date: Fri Jul 25 01:45:22 2008 -0700 mfd: don't use memzero For it doesn't exist on i386. Cc: Ian Molton Cc: Dmitry Baryshkov Cc: Russell King Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3d6f4a20cc287a8980c6186624834cf10a70752b Author: David Miller Date: Thu Jul 24 23:38:31 2008 -0700 endian: Always evaluate arguments. Changeset 7fa897b91a3ea0f16c2873b869d7a0eef05acff4 ("ide: trivial sparse annotations") created an IDE bootup regression on big-endian systems. In drivers/ide/ide-iops.c, function ide_fixstring() we now have the loop: for (p = end ; p != s;) be16_to_cpus((u16 *)(p -= 2)); which will never terminate on big-endian because in such a configuration be16_to_cpus() evaluates to "do { } while (0)" Therefore, always evaluate the arguments to nop endian transformation operations. 
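The failure mode above can be reproduced in isolation. The sketch below is an illustration only: nop_swab_dropping() and nop_swab_evaluating() are hypothetical macros written for this example, not the kernel's be16_to_cpus() definitions. It shows that a no-op macro which discards its argument also discards the argument's side effects, so a loop that advances its pointer inside the argument never makes progress.

```c
#include <assert.h>

/* Hypothetical no-op helpers, not the kernel's macros. */
#define nop_swab_dropping(p)    do { } while (0)             /* argument vanishes */
#define nop_swab_evaluating(p)  do { (void)(p); } while (0)  /* argument still runs */

/* Distance p moved after one call whose argument has a side effect. */
int bytes_stepped_dropping(void)
{
        char buf[8];
        char *end = buf + sizeof(buf);
        char *p = end;

        nop_swab_dropping(p -= 2);      /* "p -= 2" is never evaluated */
        return (int)(end - p);
}

int bytes_stepped_evaluating(void)
{
        char buf[8];
        char *end = buf + sizeof(buf);
        char *p = end;

        nop_swab_evaluating(p -= 2);    /* "p -= 2" is evaluated once */
        return (int)(end - p);
}

/*
 * Same loop shape as the one quoted from ide_fixstring(). It terminates
 * only because the macro evaluates its argument.
 */
int count_loop_iterations(void)
{
        char buf[8];
        char *s = buf;
        char *end = buf + sizeof(buf);
        char *p;
        int n = 0;

        for (p = end; p != s;) {
                nop_swab_evaluating(p -= 2);
                n++;
        }
        return n;
}

/* Small self-check for the sketch. */
void endian_self_check(void)
{
        assert(bytes_stepped_dropping() == 0);
        assert(bytes_stepped_evaluating() == 2);
        assert(count_loop_iterations() == 4);
}
```

Swapping nop_swab_dropping() into count_loop_iterations() would leave p stuck at end, which is the non-terminating behaviour the commit message describes on big-endian systems.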
Signed-off-by: David S. Miller Signed-off-by: Linus Torvalds commit 43de804df8d6002059bf4af4522fa9273a19b8aa Author: Huang Weiyi Date: Fri Jul 25 23:30:15 2008 +0800 char/xilinx_hwicap/xilinx_hwicap.c: Removed duplicated include Removed duplicated include file in char/xilinx_hwicap/xilinx_hwicap.c. Signed-off-by: Huang Weiyi Signed-off-by: Linus Torvalds commit 29b309e52d3d51ef8a15bd15590903cf272beb93 Author: Linus Torvalds Date: Fri Jul 25 09:19:36 2008 -0700 Undo duplicate "m68k: drivers/input/serio/hp_sdc.c needs " Both commits 0f17e4c796e89d1f69f13b653aba60e6ccfb8ae0 ("Add missing semaphore.h includes") and 4933d07531711e399d8d578036aa9fc1be2f9b20 ("m68k: drivers/input/serio/hp_sdc.c needs ") added a We only really need one ;) Reported-by: Huang Weiyi Requested-by: Dmitry Torokhov Signed-off-by: Linus Torvalds commit d37e6bf68fc1eb34a4ad21d9ae8890ed37ea80e7 Author: Artem Bityutskiy Date: Thu Jul 24 18:28:11 2008 +0300 UBI: always start the background thread This fix only affects UBI debugging. If the background thread is disabled for debugging purposes, start it anyway, because otherwise we see tons of kernel debugging complaints like this: INFO: task ubi_bgt0d:26857 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ubi_bgt0d D dd37bf94 0 26857 2 dd37bfcc 00000086 f8e17cea dd37bf94 00000046 00000000 00000000 f5c62430 f5c62430 f5c62590 c2a09c80 f6cbd498 dd8e9cbc 00000296 dd37bfb0 00000296 dd8e9cb8 dd8e9cbc dd37bfcc c0119774 00000000 00000000 c0132e89 f6961560 Call Trace: [] ? ubi_thread+0x0/0x127 [ubi] [] ? complete+0x43/0x4b [] ? kthread+0x0/0x5b [] ? ubi_thread+0x0/0x127 [ubi] [] kthread+0x25/0x5b [] ? kthread+0x0/0x5b [] kernel_thread_helper+0x7/0x14 ======================= So start it, and go to sleep inside it, instead of creating it and never starting it.
Signed-off-by: Artem Bityutskiy commit 973b7d83ebeb1e34b8bee69208916e5f0e2353c3 Author: Tony Breeds Date: Fri Jul 25 16:21:51 2008 +1000 powerpc: Wireup new syscalls signalfd4, eventfd2, epoll_create1, dup3, pipe2 and inotify_init1 Signed-off-by: Tony Breeds Signed-off-by: Benjamin Herrenschmidt commit 1e3519f8e1baec0b733cd42684fcd3d9681662f1 Author: Benjamin Herrenschmidt Date: Fri Jul 25 16:21:11 2008 +1000 Move update_mmu_cache() declaration from tlbflush.h to pgtable.h where it belongs. This fixes some build problems on some configs Signed-off-by: Benjamin Herrenschmidt commit 16c14b4621c7b6fc4611abf1f86cd78cdb1b2b03 Author: Nathan Fontenot Date: Thu Jul 24 05:10:46 2008 +1000 powerpc/pseries: Remove kmalloc call in handling writes to lparcfg There are only 4 valid name=value pairs for writes to /proc/ppc64/lparcfg. Current code allocates a buffer to copy this information in from the user. Since the longest name=value pair will easily fit into a buffer of 64 characters, simply put the buffer on the stack instead of allocating the buffer. Signed-off-by: Nathan Fotenot Signed-off-by: Benjamin Herrenschmidt commit 8391e42a5c1f3d757faa5e7f46a4a68f9aa6cb12 Author: Nathan Fontenot Date: Thu Jul 24 04:36:38 2008 +1000 powerpc/pseries: Update arch vector to indicate support for CMO Update the architecture vector to indicate that Cooperative Memory Overcommitment is supported if CONFIG_PPC_SMLPAR is set. Signed-off-by: Nathan Fontenot Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit 39c1ffecc6aabcc8105602a95ce769f27bcf6048 Author: Brian King Date: Thu Jul 24 04:35:48 2008 +1000 ibmvfc: Add support for collaborative memory overcommit Adds support to the ibmvfc driver for collaborative memory overcommit. 
Signed-off-by: Brian King Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit 7912a0ac5907df1f8b214b3ca15ccf96129daae0 Author: Robert Jennings Date: Thu Jul 24 04:35:27 2008 +1000 ibmvscsi: driver enablement for CMO Enable the driver to function in a Cooperative Memory Overcommitment (CMO) environment. The following changes are made to enable the driver for CMO: * DMA mapping errors will not result in error messages if entitlement has been exceeded and resources were not available. * The driver has a get_desired_dma function defined to function in a CMO environment. It will indicate how much IO memory it would like to function. Signed-off-by: Robert Jennings Acked by: Brian King Acked-by: Paul Mackerras Acked-by: James Bottomley Signed-off-by: Benjamin Herrenschmidt commit 1096d63d8e7d226630706e15648705d0187787e4 Author: Robert Jennings Date: Thu Jul 24 04:34:52 2008 +1000 ibmveth: enable driver for CMO Enable ibmveth for Cooperative Memory Overcommitment (CMO). For this driver it means calculating a desired amount of IO memory based on the current MTU and updating this value with the bus when MTU changes occur. Because DMA mappings can fail, we have added a bounce buffer for temporary cases where the driver can not map IO memory for the buffer pool. The following changes are made to enable the driver for CMO: * DMA mapping errors will not result in error messages if entitlement has been exceeded and resources were not available. * DMA mapping errors are handled gracefully, ibmveth_replenish_buffer_pool() is corrected to check the return from dma_map_single and fail gracefully. * The driver will have a get_desired_dma function defined to function in a CMO environment. 
* When the MTU is changed, the driver will update the device IO entitlement Signed-off-by: Robert Jennings Signed-off-by: Brian King Signed-off-by: Santiago Leon Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit ea866e6526b8a2ead92875732d41b26fdb470312 Author: Santiago Leon Date: Thu Jul 24 04:34:23 2008 +1000 ibmveth: Automatically enable larger rx buffer pools for larger mtu Activates larger rx buffer pools when the MTU is changed to a larger value. This patch de-activates the large rx buffer pools when the MTU changes to a smaller value. Signed-off-by: Santiago Leon Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit 22e1a4dd3f2a9009d1d8896a5e833b6094877008 Author: Nathan Fontenot Date: Thu Jul 24 04:31:52 2008 +1000 powerpc/pseries: Verify CMO memory entitlement updates with virtual I/O Verify memory entitlement updates can be handled by vio. Signed-off-by: Nathan Fontenot Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit a90ab95a9576d35de0d05f9f4fc435edcccafaa9 Author: Robert Jennings Date: Thu Jul 24 04:31:33 2008 +1000 powerpc/pseries: vio bus support for CMO This is a large patch but the normal code path is not affected. For non-pSeries platforms the code is ifdef'ed out and for non-CMO enabled pSeries systems this does not affect the normal code path. Devices that do not perform DMA operations do not need modification with this patch. The function get_desired_dma was renamed from get_io_entitlement for clarity. Overview Cooperative Memory Overcommitment (CMO) allows for a set of OS partitions to be run with less RAM than the aggregate needs of the group of partitions. The firmware will balance memory between the partitions and page in/out memory as needed. 
Based on the number and type of IO adapters present, each partition is allocated an amount of memory for DMA operations and this allocation will be guaranteed to the partition; this is referred to as the partition's 'entitlement'. Partitions running in a CMO environment can only have virtual IO devices present. The VIO bus layer will manage the IO entitlement for the system. Accounting, at a system and per-device level, is tracked in the VIO bus code and exposed via sysfs. A set of dma_ops functions is added to the bus to allow for this accounting. Bus initialization At initialization, the bus will calculate the minimum needs of the system based on providing each device present with a standard minimum entitlement along with a spare allocation for the bus to handle hotplug events. If the minimum needs can not be met the system boot will be halted. Device changes The significant changes for devices while running under CMO are that the devices must specify how much dedicated IO entitlement they desire and must also handle DMA mapping errors that can occur due to constrained IO memory. The virtual IO drivers are modified to silence errors when DMA mappings fail for CMO and handle these failures gracefully. Each device will be guaranteed a minimum entitlement that can always be mapped. Devices will specify how much entitlement they desire and the VIO bus will attempt to provide for this. Devices can change their desired entitlement level at any point in time to address particular needs (via vio_cmo_set_dev_desired()), not just at device probe time. VIO bus changes The system will have a particular entitlement level available from which it can provide memory to the devices. The bus defines two pools of memory within this entitlement, the reserved and excess pools. Each device is provided with its own entitlement no less than a system defined minimum entitlement and no greater than what the device has specified as its desired entitlement.
The entitlement provided to devices comes from the reserve pool. The reserve pool can also contain a spare allocation as large as the system defined minimum entitlement which is used for device hotplug events. Any entitlement not needed to fulfill the needs of a reserve pool is placed in the excess pool. Each device is guaranteed that it can map up to its entitled level; additional mappings are possible as long as there is unmapped memory in the excess pool. Bus probe As the system starts, each device is given an entitlement equal only to the system defined minimum entitlement. The reserve pool is equal to the sum of these entitlements, plus a spare allocation. The VIO bus also tracks the aggregate desired entitlement of all the devices. If the system desired entitlement is greater than the size of the reserve pool, when devices unmap IO memory it will be reserved and a balance operation will be scheduled for some time in the future. Entitlement balancing The balance function tries to fairly distribute entitlement between the devices in the system with the goal of providing each device with its desired amount of entitlement. Devices using more than what would be ideal will have their entitled set-point adjusted; this will effectively set a goal for lower IO memory usage as future mappings can fail and deallocations will trigger a balance operation to distribute the newly unmapped memory. A fair distribution of entitlement can take several balance operations to achieve. Entitlement changes and device DLPAR events will alter the state of CMO and will trigger balance operations. Hotplug events The VIO bus allows for changes in system entitlement at run-time via 'vio_cmo_entitlement_update()'. When devices are added the hotplug device event will be preceded by a system entitlement increase and this is reversed when devices are removed. The following changes are made to the VIO bus layer for CMO: * add IO memory accounting per device structure.
* add IO memory entitlement query function to driver structure. * during vio bus probe, if CMO is enabled, check that driver has memory entitlement query function defined. Fail if function not defined. * fail to register driver if io entitlement function not defined. * create set of dma_ops at vio level for CMO that will track allocations and return DMA failures once entitlement is reached. Entitlement will be limited by overall system entitlement. Devices will have a reserved quantity of memory that is guaranteed, the rest can be used as available. * expose entitlement, current allocation, desired allocation, and the allocation error counter for devices to the user through sysfs * provide mechanism for changing a device's desired entitlement at run time for devices as an exported function and sysfs tunable * track any DMA failures for entitled IO memory for each vio device. * check entitlement against available system entitlement on device add * track entitlement metrics (high water mark, current usage) * provide function to reset high water mark * provide minimum and desired entitlement numbers at a bus level * provide drivers with a minimum guaranteed entitlement * balance available entitlement between devices to satisfy their needs * handle system entitlement changes and device hotplug Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit 6490c4903d12f242bec4454301f76f6a7520e399 Author: Robert Jennings Date: Thu Jul 24 04:31:16 2008 +1000 powerpc/pseries: iommu enablement for CMO To support Cooperative Memory Overcommitment (CMO), we need to check for failure from some of the tce hcalls. These changes for the pseries platform affect the powerpc architecture; patches for the other affected platforms are included in this patch. pSeries platform IOMMU code changes: * platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and return an error.
Architecture IOMMU code changes: * Calls to ppc_md.tce_build need to check return values and return DMA_MAPPING_ERROR for transient errors. Architecture changes: * struct machdep_calls for tce_build*_pSeriesLP functions need to change to indicate failure. * all other platforms will need updates to iommu functions to match the new calling semantics; they will return 0 on success. The other platforms default configs have been built, but no further testing was performed. Signed-off-by: Robert Jennings Acked-by: Olof Johansson Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit ffa5abbd0c399b32fc13a1b4718d87ee7a716999 Author: Brian King Date: Thu Jul 24 04:30:58 2008 +1000 powerpc/pseries: Add CMO paging statistics With the addition of Cooperative Memory Overcommitment (CMO) support for IBM Power Systems, two fields have been added to the VPA to report paging statistics. Add support in lparcfg to report them to userspace. Signed-off-by: Brian King Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit 84af458bb23bf5f0ba1af4320dd2a57f7c4363e5 Author: Brian King Date: Thu Jul 24 04:30:29 2008 +1000 powerpc/pseries: Add collaborative memory manager Adds a collaborative memory manager, which acts as a simple balloon driver for System p machines that support cooperative memory overcommitment (CMO). Adds a platform configuration option for CMO called PPC_SMLPAR. Signed-off-by: Brian King Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit 86630a32320f83736c4c24e2c8bae218e4c56c7c Author: Brian King Date: Thu Jul 24 04:29:16 2008 +1000 powerpc/pseries: Utilities to set firmware page state Newer versions of firmware support page states, which are used by the collaborative memory manager (future patch) to "loan" pages to the hypervisor for use by other partitions. 
Signed-off-by: Brian King Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit e46de429cb954d30a5642fba81d516ede518c65e Author: Robert Jennings Date: Thu Jul 24 04:29:03 2008 +1000 powerpc/pseries: Enable CMO feature during platform setup For Cooperative Memory Overcommitment (CMO), set the FW_FEATURE_CMO flag in powerpc_firmware_features from the rtas ibm,get-system-parameters table prior to calling iommu_init_early_pSeries. With this, any CMO specific functionality can be controlled by checking: firmware_has_feature(FW_FEATURE_CMO) Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit 398778f78b76fb72cb18411487af01dea202709e Author: Robert Jennings Date: Thu Jul 24 04:28:05 2008 +1000 powerpc/pseries: Split retrieval of processor entitlement data into a helper routine Split the retrieval of processor entitlement data returned in the H_GET_PPP hcall into its own helper routine. Signed-off-by: Nathan Fontenot Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit dfc3403f0e5ffb94ee29942f313b87d4061d951b Author: Nathan Fontenot Date: Thu Jul 24 04:27:30 2008 +1000 powerpc/pseries: Add memory entitlement capabilities to /proc/ppc64/lparcfg Update /proc/ppc64/lparcfg to display Cooperative Memory Overcommitment statistics as reported by the H_GET_MPP hcall. This also updates the lparcfg interface to allow setting memory entitlement and weight. Signed-off-by: Nathan Fontenot Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit 11529396ea3190113173f7a15e59a58dbcaa36c8 Author: Nathan Fotenot Date: Thu Jul 24 04:25:16 2008 +1000 powerpc/pseries: Split processor entitlement retrieval and gathering to helper routines Split the retrieval and setting of processor entitlement and weight into helper routines. 
This also removes the printing of the raw values returned from h_get_ppp, the values are already parsed and printed. Signed-off-by: Nathan Fontenot Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit 545500b307658ad5783e0f3a52a32b97b2dfaed2 Author: Nathan Fontenot Date: Thu Jul 24 04:25:00 2008 +1000 powerpc/pseries: Remove extraneous error reporting for hcall failures in lparcfg Remove the extraneous error reporting used when a hcall made from lparcfg fails. Signed-off-by: Nathan Fontenot Signed-off-by: Robert Jennings Acked-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt commit 80c60bf9b96f6108c630d90efc073cd520801e6c Author: Segher Boessenkool Date: Fri Jul 25 10:08:41 2008 +1000 powerpc: Fix compile error with binutils 2.15 My previous patch to fix compilation with binutils-2.17 causes a "file truncated" build error from ld with binutils 2.15 (and possibly older), and a warning with 2.16 and 2.17. This fixes it. Signed-off-by: Segher Boessenkool Acked-by: Chuck Meade Signed-off-by: Benjamin Herrenschmidt commit 7886250e9d71b24d0205ac6798ee855fb3836318 Author: Mark Nelson Date: Thu Jul 24 14:28:48 2008 +1000 powerpc/cell: Fixed IOMMU mapping uses weak ordering for a pcie endpoint At the moment the fixed mapping is by default strongly ordered (the iommu_fixed=weak boot option must be used to make the fixed mapping weakly ordered). If we're on a setup where the southbridge is being used in endpoint mode (triblade and CAB boards) the default should be a weakly ordered fixed mapping. This adds a check so that if a node of type pcie-endpoint can be found in the device tree the fixed mapping is set to be weak by default (but can be overridden using iommu_fixed=strong). 
Signed-off-by: Mark Nelson Acked-by: Arnd Bergmann Signed-off-by: Benjamin Herrenschmidt commit d6a61bfc06d6f2248f3e75f208d64e794082013c Author: Luis Machado Date: Thu Jul 24 02:10:41 2008 +1000 powerpc: BookE hardware watchpoint support This patch implements support for HW based watchpoint via the DBSR_DAC (Data Address Compare) facility of the BookE processors. It does so by interfacing with the existing DABR breakpoint code and adding the necessary bits and pieces for the new bits to be properly set or cleared Signed-off-by: Luis Machado Signed-off-by: Benjamin Herrenschmidt commit 00bf6e906156b07cd641fe154ad0efe78f989692 Author: Stephen Rothwell Date: Wed Jul 23 10:44:58 2008 +1000 powerpc: Fallout from sysdev API changes A struct sysdev_attribute * parameter was added to the show routine by commit 4a0b2b4dbe1335b8b9886ba3dc85a145d5d938ed "sysdev: Pass the attribute to the low level sysdev show/store function". This eliminates a warning: arch/powerpc/kernel/sysfs.c:538: warning: initialization from incompatible pointer type Signed-off-by: Stephen Rothwell Signed-off-by: Benjamin Herrenschmidt commit 9115d13453dee22473a1e8cacc90a8d64a9c4bc9 Author: Nathan Lynch Date: Wed Jul 16 09:58:51 2008 +1000 powerpc: Enable AT_BASE_PLATFORM aux vector Stash the first platform string matched by identify_cpu() in powerpc_base_platform, and supply that to the ELF loader for the value of AT_BASE_PLATFORM. Signed-off-by: Nathan Lynch Signed-off-by: Benjamin Herrenschmidt commit 483fad1c3fa1060d7e6710e84a065ad514571739 Author: Nathan Lynch Date: Tue Jul 22 04:48:46 2008 +1000 ELF loader support for auxvec base platform string Some IBM POWER-based platforms have the ability to run in a mode which mostly appears to the OS as a different processor from the actual hardware. For example, a Power6 system may appear to be a Power5+, which makes the AT_PLATFORM value "power5+". 
This means that programs are restricted to the ISA supported by Power5+; Power6-specific instructions are treated as illegal. However, some applications (virtual machines, optimized libraries) can benefit from knowledge of the underlying CPU model. A new aux vector entry, AT_BASE_PLATFORM, will denote the actual hardware. For example, on a Power6 system in Power5+ compatibility mode, AT_PLATFORM will be "power5+" and AT_BASE_PLATFORM will be "power6". The idea is that AT_PLATFORM indicates the instruction set supported, while AT_BASE_PLATFORM indicates the underlying microarchitecture. If the architecture has defined ELF_BASE_PLATFORM, copy that value to the user stack in the same manner as ELF_PLATFORM. Signed-off-by: Nathan Lynch Acked-by: Andrew Morton Signed-off-by: Benjamin Herrenschmidt commit e9f76354ce83a20c7768ad37caa033f6506b4f96 Merge: c174aff... ad1ede1... Author: Benjamin Herrenschmidt Date: Fri Jul 25 15:35:10 2008 +1000 Merge commit 'jk/jk-merge' commit c174aff95642bcc830102becb9802adeb8f87a5a Merge: fb2e405... 79c28ac... Author: Benjamin Herrenschmidt Date: Fri Jul 25 15:35:03 2008 +1000 Merge commit 'gcl/gcl-next' commit 832fe9c222c7d431c2bff5765a0ac61bcb3df8c8 Merge: ed9559d... e34f872... Author: Linus Torvalds Date: Thu Jul 24 19:11:49 2008 -0700 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: virtio: Add transport feature handling stub for virtio_ring. virtio: Rename set_features to finalize_features virtio: Formally reserve bits 28-31 to be 'transport' features. 
s390: use virtio_console for KVM on s390 virtio: console as a config option virtio_console: use virtqueue notification for hvc_console hvc_console: rework setup to replace irq functions with callbacks virtio_blk: check for hardsector size from host virtio: Use bus_type probe and remove methods virtio: don't always force a notification when ring is full virtio: clarify that ABI is usable by any implementations virtio: Recycle unused recv buffer pages for large skbs in net driver virtio net: Allow receiving SG packets virtio net: Add ethtool ops for SG/GSO virtio: fix virtio_net xmit of freed skb bug commit ed9559d38a87a44e3bda87d73a50aab92471d7dc Author: Rusty Russell Date: Fri Jul 25 12:11:09 2008 +1000 Label kthread_create() with printf attribute tag. Obvious misc patch been in my queue (& linux-next) for over a cycle. Signed-off-by: Rusty Russell Signed-off-by: Linus Torvalds commit e34f87256794b87e7f4a8f1812538be7b7b5214c Author: Rusty Russell Date: Fri Jul 25 12:06:13 2008 -0500 virtio: Add transport feature handling stub for virtio_ring. To prepare for virtio_ring transport feature bits, hook in a call in all the users to manipulate them. This currently just clears all the bits, since it doesn't understand any features. Signed-off-by: Rusty Russell commit c624896e488ba2bff5ae497782cfb265c8b00646 Author: Rusty Russell Date: Fri Jul 25 12:06:07 2008 -0500 virtio: Rename set_features to finalize_features Rather than explicitly handing the features to the lower-level, we just hand the virtio_device and have it set the features. This make it clear that it has the chance to manipulate the features of the device at this point (and that all feature negotiation is already done). Signed-off-by: Rusty Russell commit dd7c7bc46211785a1aa7d70feb15830f62682b3c Author: Rusty Russell Date: Fri Jul 25 12:06:07 2008 -0500 virtio: Formally reserve bits 28-31 to be 'transport' features. 
We assign feature bits as required, but it makes sense to reserve some for the particular transport, rather than the particular device. Signed-off-by: Rusty Russell commit faeba830b086bc9e58748869054e994cb09693cd Author: Christian Borntraeger Date: Fri Jun 20 15:24:18 2008 +0200 s390: use virtio_console for KVM on s390 This patch enables virtio_console as the default console on kvm for s390. We currently use the same notify hack as lguest for early console output. I will try to address this for lguest and s390 later. Signed-off-by: Christian Borntraeger Signed-off-by: Rusty Russell commit 7721c494a28e06543a3d6aa412957aa783a4a531 Author: Christian Borntraeger Date: Fri Jul 25 12:06:06 2008 -0500 virtio: console as a config option I also added a small Kconfig change that allows the user to specify the virtio console in menuconfig. (Fixes to export symbols from Stephen Rothwell ) (Fixes for CONFIG_VIRTIO_CONSOLE=y vs CONFIG_VIRTIO=m from Christian himself) Signed-off-by: Rusty Russell Cc: Stephen Rothwell commit 91fcad19d03ed67cb50fd0e1913a8b89cc3ed3ec Author: Christian Borntraeger Date: Fri Jun 20 15:24:15 2008 +0200 virtio_console: use virtqueue notification for hvc_console This patch exploits the new notifier callbacks of the hvc_console. We can use the virtio callbacks instead of the polling code. Signed-off-by: Christian Borntraeger Signed-off-by: Rusty Russell commit 611e097d7707741a336a0677d9d69bec40f29f3d Author: Christian Borntraeger Date: Fri Jun 20 15:24:08 2008 +0200 hvc_console: rework setup to replace irq functions with callbacks This patch tries to change hvc_console to not use request_irq/free_irq if the backend does not use irqs. This allows virtio_console to use hvc_console without having a linker reference to request_irq/free_irq. In addition, together with patch 2/3 it improves the performance for virtio console input. 
    (an earlier version of this patch was tested by Yajin on lguest)

    The irq specific code is moved to hvc_irq.c and selected by the drivers that use irqs (System p, System i, XEN). I replaced "int irq" with the opaque "int data". The request_irq and free_irq calls are replaced with notifier_add and notifier_del. I have also changed the code a bit to call the notifier_add and notifier_del inside the spinlock area as the callbacks are found via hp->ops.

    Changes since last version:
    o remove ifdef
    o reintroduce "irq_requested" as "notified"
    o cleanups, sparse..

    I did not move the timer based polling into a separate polling scheme. I played with several variants, but it seems we need to sleep/schedule in a thread even for irq based consoles, as there are throttling and buffer size constraints. I also kept hvc_struct defined in hvc_console.h so that hvc_irq.c can access the irq_requested element.

    Feedback is appreciated. virtio_console is currently the only available console for kvm on s390. I plan to push this change as soon as all affected parties agree on it. I would love to get test results from System p, Xen etc.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Rusty Russell

commit 066f4d82a67f621ddd547bfa4b9c94631d8457b0
Author: Christian Borntraeger
Date:   Thu May 29 11:08:26 2008 +0200

    virtio_blk: check for hardsector size from host

    Currently virtio_blk assumes a 512 byte hard sector size. This can cause trouble / performance issues if the backing has a different block size (like a file on an ext3 file system formatted with 4k block size or a dasd). Let's add a feature flag that tells the guest to use a different hard sector size than 512 bytes.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Rusty Russell

commit e962fa660d391fc9b90988e6538c94c858c099f9
Author: Mark McLoughlin
Date:   Fri Jun 13 13:46:40 2008 +0100

    virtio: Use bus_type probe and remove methods

    Hook up to the probe() and remove() methods in bus_type rather than device_driver.
    The bus_type methods have been preferred since 2.6.16.

    Signed-off-by: Mark McLoughlin
    Signed-off-by: Rusty Russell

commit 44653eae1407f79dff6f52fcf594ae84cb165ec4
Author: Rusty Russell
Date:   Fri Jul 25 12:06:04 2008 -0500

    virtio: don't always force a notification when ring is full

    We force notification when the ring is full, even if the host has indicated it doesn't want to know. This seemed like a good idea at the time: if we fill the transmit ring, we should tell the host immediately.

    Unfortunately this logic also applies to the receiving ring, which is refilled constantly. We should introduce real notification thresholds to replace this logic. Meanwhile, removing the logic altogether breaks the heuristics which KVM uses, so we use a hack: only notify if there are outgoing parts of the new buffer.

    Here is the number of exits with lguest's crappy network implementation:
    Before: network xmit 7859051 recv 236420
    After:  network xmit 7858610 recv 118136

    Signed-off-by: Rusty Russell

commit 674bfc23c585b34c42263d73fb51710d49762a23
Author: Rusty Russell
Date:   Fri Jul 25 12:06:03 2008 -0500

    virtio: clarify that ABI is usable by any implementations

    We want others to implement and use virtio, so it makes sense to BSD license the non-__KERNEL__ parts of the headers to make this crystal clear.
Signed-off-by: Rusty Russell Acked-by: Christian Borntraeger Acked-by: Mark McLoughlin Acked-by: Ryan Harper Acked-by: Eric Van Hensbergen Acked-by: Anthony Liguori commit fb6813f480806d62361719e84777c8e00d3e86a8 Author: Rusty Russell Date: Fri Jul 25 12:06:01 2008 -0500 virtio: Recycle unused recv buffer pages for large skbs in net driver If we hack the virtio_net driver to always allocate full-sized (64k+) skbuffs, the driver slows down (lguest numbers): Time to receive 1GB (small buffers): 10.85 seconds Time to receive 1GB (64k+ buffers): 24.75 seconds Of course, large buffers use up more space in the ring, so we increase that from 128 to 2048: Time to receive 1GB (64k+ buffers, 2k ring): 16.61 seconds If we recycle pages rather than using alloc_page/free_page: Time to receive 1GB (64k+ buffers, 2k ring, recycle pages): 10.81 seconds This demonstrates that with efficient allocation, we don't need to have a separate "small buffer" queue. Signed-off-by: Rusty Russell commit 97402b96f87c6e32f75f1bffdd91a5ee144b679d Author: Herbert Xu Date: Fri Apr 18 11:24:27 2008 +0800 virtio net: Allow receiving SG packets Finally this patch lets virtio_net receive GSO packets in addition to sending them. This can definitely be optimised for the non-GSO case. For comparison the Xen approach stores one page in each skb and uses subsequent skb's pages to construct an SG skb instead of preallocating the maximum amount of pages per skb. 
    Signed-off-by: Rusty Russell (added feature bits)

commit a9ea3fc6f2654a7407864fec983d1671d775b5ee
Author: Herbert Xu
Date:   Fri Apr 18 11:21:42 2008 +0800

    virtio net: Add ethtool ops for SG/GSO

    This patch adds some basic ethtool operations to virtio_net so I could test SG without GSO (which was really useful because TSO turned out to be buggy :)

    Signed-off-by: Rusty Russell (remove MTU setting)

commit 9953ca6cb757fb317bb7cdd2fcbf9b88312e241b
Author: Mark McLoughlin
Date:   Tue May 27 12:06:26 2008 +0100

    virtio: fix virtio_net xmit of freed skb bug

    On Mon, 2008-05-26 at 17:42 +1000, Rusty Russell wrote:
    > If we fail to transmit a packet, we assume the queue is full and put
    > the skb into last_xmit_skb. However, if more space frees up before we
    > xmit it, we loop, and the result can be transmitting the same skb twice.
    >
    > Fix is simple: set skb to NULL if we've used it in some way, and check
    > before sending.
    ...
    > diff -r 564237b31993 drivers/net/virtio_net.c
    > --- a/drivers/net/virtio_net.c Mon May 19 12:22:00 2008 +1000
    > +++ b/drivers/net/virtio_net.c Mon May 19 12:24:58 2008 +1000
    > @@ -287,21 +287,25 @@ again:
    >          free_old_xmit_skbs(vi);
    >
    >          /* If we has a buffer left over from last time, send it now. */
    > -        if (vi->last_xmit_skb) {
    > +        if (unlikely(vi->last_xmit_skb)) {
    >                  if (xmit_skb(vi, vi->last_xmit_skb) != 0) {
    >                          /* Drop this skb: we only queue one. */
    >                          vi->dev->stats.tx_dropped++;
    >                          kfree_skb(skb);
    > +                        skb = NULL;
    >                          goto stop_queue;
    >                  }
    >                  vi->last_xmit_skb = NULL;

    With this, we may drop an skb and then later in the function discover that we could have sent it after all. Poor wee skb :)

    How about the incremental patch below?

    Cheers,
    Mark.

    Subject: [PATCH] virtio_net: Delay dropping tx skbs

    Currently we drop the skb in start_xmit() if we have a queued buffer and fail to transmit it. However, if we delay dropping it until we've stopped the queue and enabled the tx notification callback, then there is a chance space might become available for it.
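    The delayed-drop idea described above can be sketched in plain user-space C. This is a minimal model under stated assumptions, not the driver's code: struct nic, struct pkt, reclaim(), ring_put(), enable_cb() and the late_done knob are hypothetical stand-ins for the virtio_net state, free_old_xmit_skbs(), xmit_skb() and the tx notification callback. It only shows the ordering: park at most one packet, never reuse a packet that was sent or parked, and drop the new packet only after the queue is stopped and the callback is armed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct pkt { int id; };

struct nic {
	int free_slots;      /* room left in the transmit ring */
	int pending_done;    /* completed buffers not yet reclaimed */
	int late_done;       /* completions that land while we stop the queue */
	struct pkt *parked;  /* the single packet kept after a failed send */
	bool stopped;        /* models the stopped state of the tx queue */
	int tx_sent;
	int tx_dropped;
};

/* Reclaim completed buffers so their ring slots can be reused. */
static void reclaim(struct nic *n)
{
	assert(n->pending_done >= 0);
	n->free_slots += n->pending_done;
	n->pending_done = 0;
}

/* Put one packet in the ring; false means the ring is full. */
static bool ring_put(struct nic *n)
{
	if (n->free_slots == 0)
		return false;
	n->free_slots--;
	n->tx_sent++;
	return true;
}

/* Arm the tx callback; false means buffers completed in the meantime. */
static bool enable_cb(struct nic *n)
{
	n->pending_done += n->late_done;
	n->late_done = 0;
	return n->pending_done == 0;
}

static void start_xmit(struct nic *n, struct pkt *p)
{
	for (;;) {
		bool full = false;

		reclaim(n);

		/* Send the parked packet first, if there is one. */
		if (n->parked) {
			if (ring_put(n))
				n->parked = NULL;
			else
				full = true;
		}

		/* Then the new packet; park it if the ring is full. */
		if (!full && p) {
			if (!ring_put(n)) {
				n->parked = p;
				full = true;
			}
			p = NULL;	/* sent or parked: never touch it again */
		}

		if (!full)
			return;

		n->stopped = true;
		if (enable_cb(n))
			break;		/* callback armed, ring still full */
		n->stopped = false;	/* space appeared: try again */
	}

	/* Drop only now, after the last chance for space has passed. */
	if (p)
		n->tx_dropped++;
}
```

    In this model, a call made with a full ring and one parked packet drops the new packet only when no completion arrives while the queue is being stopped; if completions do arrive, both packets go out on the retry.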
Signed-off-by: Mark McLoughlin Signed-off-by: Rusty Russell commit fb2e405fc1fc8b20d9c78eaa1c7fd5a297efde43 Author: Adrian Bunk Date: Fri Jul 25 02:55:49 2008 +0300 fix fs/nfs/nfsroot.c compilation This fixes the following compile error caused by commit f9247273cb69ba101877e946d2d83044409cc8c5 ("UFS: add const to parser token table"): CC fs/nfs/nfsroot.o /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/nfs/nfsroot.c:130: error: tokens causes a section type conflict make[3]: *** [fs/nfs/nfsroot.o] Error 1 Signed-off-by: Adrian Bunk Signed-off-by: Linus Torvalds commit 4b9f12a3779c548b68bc9af7d94030868ad3aa1b Author: Linus Torvalds Date: Thu Jul 24 17:29:00 2008 -0700 x86/oprofile/nmi_int: add Nehalem to list of ppro cores ..otherwise oprofile will fall back on that poor timer interrupt. Also replace the unreadable chain of if-statements with a "switch()" statement instead. It generates better code, and is a lot clearer. Signed-off-by: Linus Torvalds commit b30f3ae50cd03ef2ff433a5030fbf88dd8323528 Author: Linus Torvalds Date: Thu Jul 24 15:43:44 2008 -0700 x86-64: Clean up 'save/restore_i387()' usage Suresh Siddha wants to fix a possible FPU leakage in error conditions, but the fact that save/restore_i387() are inlines in a header file makes that harder to do than necessary. So start off with an obvious cleanup. This just moves the x86-64 version of save/restore_i387() out of the header file, and moves it to the only file that it is actually used in: arch/x86/kernel/signal_64.c. So exposing it in a header file was wrong to begin with. [ Side note: I'd like to fix up some of the games we play with the 32-bit version of these functions too, but that's a separate matter. The 32-bit versions are shared - under different names at that! - by both the native x86-32 code and the x86-64 32-bit compatibility code ] Acked-by: Suresh Siddha Signed-off-by: Linus Torvalds commit b5684b83b1e1579bbbc80e703e990c0cccf5892c Merge: 1481b91... 1b8ebad... 
Author: Linus Torvalds Date: Thu Jul 24 14:55:09 2008 -0700 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits) ide: use proper printk() KERN_* levels in ide-probe.c ide: fix for EATA SCSI HBA in ATA emulating mode ide: remove stale comments from drivers/ide/Makefile ide: enable local IRQs in all handlers for TASKFILE_NO_DATA data phase ide-scsi: remove kmalloced struct request ht6560b: remove old history ht6560b: update email address ide-cd: fix oops when using growisofs gayle: release resources on ide_host_add() failure palm_bk3710: add UltraDMA/100 support ide: trivial sparse annotations ide: ide-tape.c sparse annotations and unaligned access removal ide: drop 'name' parameter from ->init_chipset method ide: prefix messages from IDE PCI host drivers by driver name it821x: remove DECLARE_ITE_DEV() macro it8213: remove DECLARE_ITE_DEV() macro ide: include PCI device name in messages from IDE PCI host drivers ide: remove for some archs ide-generic: remove ide_default_{io_base,irq}() inlines (take 3) ide-generic: is no longer needed on ppc32 ... commit 1481b9109fe771ec8b035d7760f42e36d2bed5d4 Merge: 5042d99... f88133d... 
Author: Linus Torvalds Date: Thu Jul 24 13:57:37 2008 -0700 Merge branch 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-2.6 * 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-2.6: acpi: fix crash in core ACPI code, triggered by CONFIG_ACPI_PCI_SLOT=y ACPI: thinkpad-acpi: don't misdetect in get_thinkpad_model_data() on -ENOMEM ACPI: thinkpad-acpi: bump up version to 0.21 ACPI: thinkpad-acpi: add bluetooth and WWAN rfkill support ACPI: thinkpad-acpi: WLSW overrides other rfkill switches ACPI: thinkpad-acpi: prepare for bluetooth and wwan rfkill support ACPI: thinkpad-acpi: consolidate wlsw notification function ACPI: thinkpad-acpi: minor refactor on radio switch init Revert "ACPI: don't walk tables if ACPI was disabled" Revert "dock: bay: Don't call acpi_walk_namespace() when ACPI is disabled." Revert "Fix FADT parsing" ACPI : Set FAN device to correct state in boot phase ACPI: Ignore _BQC object when registering backlight device ACPI: stop complaints about interrupt link End Tags and blank IRQ descriptors commit 5042d99795d3d817bef2f4cc46e953bee9bf7398 Merge: 5c40235... f17a077... Author: Linus Torvalds Date: Thu Jul 24 13:57:13 2008 -0700 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: fixup sparse endianness warnings in proc.c PCI PM: make more PCI PM core functionality available to drivers PCI/DMAR: don't assume presence of RMRRs PCI hotplug: fix error path in pci_slot's register_slot commit 1b8ebad87b459e2e1333fbf28005977245ff5402 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:36 2008 +0200 ide: use proper printk() KERN_* levels in ide-probe.c While at it: - fixup printk() messages in save_match() and hwif_init(). 
Signed-off-by: Bartlomiej Zolnierkiewicz commit 52f3a771feafe3e9c56f8d00c8eb53fd8f578f2d Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:36 2008 +0200 ide: fix for EATA SCSI HBA in ATA emulating mode IDE probing code used to skip devices attached to EATA SCSI HBA in ATA emulating mode but because of warm-plug support port I/O resources are no longer freed if no devices are detected on a port and the decision about the driver to use is left up to the user. Remove no longer valid EATA SCSI HBA quirk from do_identify(). Noticed-by: Alan Cox Signed-off-by: Bartlomiej Zolnierkiewicz commit d0b53f6866fa185da94968e62ae97923db18298c Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:36 2008 +0200 ide: remove stale comments from drivers/ide/Makefile Signed-off-by: Bartlomiej Zolnierkiewicz commit 90d2c6bc68745d67cdbf00bab43818d90aa0dfb6 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:36 2008 +0200 ide: enable local IRQs in all handlers for TASKFILE_NO_DATA data phase It is already done by task_no_data_intr() and there is no reason not to do it in other TASKFILE_NO_DATA data phase handlers. Signed-off-by: Bartlomiej Zolnierkiewicz commit e27420d046600cd3e4139ea1b6cba59a8b4050eb Author: FUJITA Tomonori Date: Thu Jul 24 22:53:35 2008 +0200 ide-scsi: remove kmalloced struct request This converts ide-scsi to use blk_get/put_request instead of kmalloc/kfree. Signed-off-by: FUJITA Tomonori Signed-off-by: Bartlomiej Zolnierkiewicz commit 216f9a88feabf5ed574c3aa78447a6bd872910bc Author: Jan Evert van Grootheest Date: Thu Jul 24 22:53:35 2008 +0200 ht6560b: remove old history Remove the ancient version history. Git does a better job. From: Jan Evert van Grootheest Signed-off-by: Bartlomiej Zolnierkiewicz commit eb34b2d90e71380ad19695188934230b06a3668b Author: Jan Evert van Grootheest Date: Thu Jul 24 22:53:35 2008 +0200 ht6560b: update email address Update email address. 
    From: Jan Evert van Grootheest
    Signed-off-by: Bartlomiej Zolnierkiewicz

commit e8e7b9eb11c34ee18bde8b7011af41938d1ad667
Author: Jens Axboe
Date:   Thu Jul 24 22:53:35 2008 +0200

    ide-cd: fix oops when using growisofs

    cdrom_read_capacity() will blindly return the capacity from the device without sanity-checking it. This later causes code in fs/buffer.c to oops.

    Fix this by checking that the device is telling us sensible things.

    From: Jens Axboe
    Cc: Michael Buesch
    Cc: Jan Kara
    Cc: Arnd Bergmann
    Cc:
    Cc: Borislav Petkov
    Signed-off-by: Andrew Morton
    [bart: print device name instead of driver name]
    Signed-off-by: Bartlomiej Zolnierkiewicz
    [harvey: blocklen is a big-endian value]
    Signed-off-by: Harvey Harrison
    Signed-off-by: Bartlomiej Zolnierkiewicz

commit 96cc112c09b3c6674da01ef8b377f7a916883ea2
Author: Bartlomiej Zolnierkiewicz
Date:   Thu Jul 24 22:53:34 2008 +0200

    gayle: release resources on ide_host_add() failure

    "gayle: reserve memory resources at once" patch temporarily removed freeing of resources on failure (to ease conversion to ide_host_add() interface). This patch fixes it.

    Thanks to Geert for noticing the issue.

    Noticed-by: Geert Uytterhoeven
    Signed-off-by: Bartlomiej Zolnierkiewicz

commit a0f403bc58dcaa118f02ec70c3ecfec1bc26e445
Author: Sergei Shtylyov
Date:   Thu Jul 24 22:53:34 2008 +0200

    palm_bk3710: add UltraDMA/100 support

    This controller supports UltraDMA up to mode 5 but it should be clocked with at least twice the data strobe frequency, so enable mode 5 for 100+ MHz IDECLK.

    While at it, start passing the correct device to clk_get() -- it worked anyway but WTF?
:-/ Signed-off-by: Sergei Shtylyov Signed-off-by: Bartlomiej Zolnierkiewicz commit 7fa897b91a3ea0f16c2873b869d7a0eef05acff4 Author: Harvey Harrison Date: Thu Jul 24 22:53:34 2008 +0200 ide: trivial sparse annotations Signed-off-by: Harvey Harrison Signed-off-by: Bartlomiej Zolnierkiewicz commit cd740ab0f69f6c94d9c7f916758e308f30a439fa Author: Harvey Harrison Date: Thu Jul 24 22:53:33 2008 +0200 ide: ide-tape.c sparse annotations and unaligned access removal If this is actually unaligned the access of speed/max_speed above is already broken and needs a get_unaligned. Otherwise it is aligned and they can be removed. Signed-off-by: Harvey Harrison Cc: Borislav Petkov Signed-off-by: Bartlomiej Zolnierkiewicz commit a326b02b0c576001353dbc489154959b0889c6bf Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:33 2008 +0200 ide: drop 'name' parameter from ->init_chipset method There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz commit ced3ec8aa7d0fa3300187ee47c144a22ccfc974e Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:32 2008 +0200 ide: prefix messages from IDE PCI host drivers by driver name Prefix messages from IDE PCI host drivers by driver name instead of marketed chipset name (it is still possible to exactly identify the particular chipset basing on driver messages). 
    As a bonus this provides nice code savings for some drivers:

       text    data     bss     dec     hex filename
       3826     112       8    3946     f6a drivers/ide/pci/amd74xx.o.before
       2786     112       8    2906     b5a drivers/ide/pci/amd74xx.o.after
        764     108       0     872     368 drivers/ide/pci/cs5520.o.before
        680     108       0     788     314 drivers/ide/pci/cs5520.o.after
       1680     112       4    1796     704 drivers/ide/pci/generic.o.before
       1155     112       4    1271     4f7 drivers/ide/pci/generic.o.after
       7128     792       0    7920    1ef0 drivers/ide/pci/hpt366.o.before
       6984     792       0    7776    1e60 drivers/ide/pci/hpt366.o.after
       2800     148       0    2948     b84 drivers/ide/pci/pdc202xx_new.o.before
       2523     148       0    2671     a6f drivers/ide/pci/pdc202xx_new.o.after
       2831     148       0    2979     ba3 drivers/ide/pci/pdc202xx_old.o.before
       2683     148       0    2831     b0f drivers/ide/pci/pdc202xx_old.o.after
       3776     112       4    3892     f34 drivers/ide/pci/piix.o.before
       2804     112       4    2920     b68 drivers/ide/pci/piix.o.after
       4693     116       0    4809    12c9 drivers/ide/pci/siimage.o.before
       4600     116       0    4716    126c drivers/ide/pci/siimage.o.after

    Signed-off-by: Bartlomiej Zolnierkiewicz

commit 04ba6e739e9c0623c25f94b191fd20dfbd1b26e3
Author: Bartlomiej Zolnierkiewicz
Date:   Thu Jul 24 22:53:32 2008 +0200

    it821x: remove DECLARE_ITE_DEV() macro

    While at it:
    * it821x_chipsets[] -> it821x_chipset.
    * Fix it821x_chipset's name field (as it is used for IT8211/8212).

    Signed-off-by: Bartlomiej Zolnierkiewicz

commit 29f1ca920cb8d65b979f7edf2fc7d11095461306
Author: Bartlomiej Zolnierkiewicz
Date:   Thu Jul 24 22:53:32 2008 +0200

    it8213: remove DECLARE_ITE_DEV() macro

    While at it:
    * it8213_chipsets[] -> it8213_chipset.

    Signed-off-by: Bartlomiej Zolnierkiewicz

commit 28cfd8af52a9ed4e5bd1751ea6bc0b8c870f68ec
Author: Bartlomiej Zolnierkiewicz
Date:   Thu Jul 24 22:53:31 2008 +0200

    ide: include PCI device name in messages from IDE PCI host drivers

    While at it:
    * Apply small fixes to messages (s/dma/DMA/, remove trailing '.', etc).
    * Fix printk() call in ide_setup_pci_baseregs() to use KERN_INFO.
    * Move printk() call from ide_pci_clear_simplex() to the caller.
    * Cleanup do_ide_setup_pci_device() a bit.
    * amd74xx.c: remove superfluous PCI device revision information.
    * hpt366.c: fix two printk() calls in ->init_chipset to use KERN_INFO.
    * pdc202xx_new.c: fix printk() call in ->init_chipset to use KERN_INFO.
    * pdc202xx_old.c: fix driver message in pdc202xx_init_one().
    * via82cxxx.c: fix driver warning message in via_init_one().

    Signed-off-by: Bartlomiej Zolnierkiewicz

commit 2a8f7450f828eaee49d66f41f99ac2e54f1160a6
Author: Bartlomiej Zolnierkiewicz
Date:   Thu Jul 24 22:53:31 2008 +0200

    ide: remove for some archs

    * Remove include from ( includes which is enough).
    * Remove for alpha/blackfin/h8300/ia64/m32r/sh/x86/xtensa (this leaves us with arm/frv/m68k/mips/mn10300/parisc/powerpc/sparc[64]).

    There should be no functional changes caused by this patch.

    Signed-off-by: Bartlomiej Zolnierkiewicz

commit f01d35d87f39ab794ddcdefadb79c11054bcbfbc
Author: Bartlomiej Zolnierkiewicz
Date:   Thu Jul 24 22:53:31 2008 +0200

    ide-generic: remove ide_default_{io_base,irq}() inlines (take 3)

    Replace ide_default_{io_base,irq}() inlines by legacy_{bases,irqs}[].

    v2: Add missing zero-ing of hws[] (caught during testing by Borislav Petkov).
    v3: Fix zero-ing of hws[] for _real_ this time.

    There should be no functional changes caused by this patch.

    Signed-off-by: Bartlomiej Zolnierkiewicz

commit 35bbac9a2f73a7e0967d0a1d3e3673e2590ef716
Author: Bartlomiej Zolnierkiewicz
Date:   Thu Jul 24 22:53:30 2008 +0200

    ide-generic: is no longer needed on ppc32

    Cc: Benjamin Herrenschmidt
    Signed-off-by: Bartlomiej Zolnierkiewicz

commit ffed0b6e1a6f5132681d4b521531d992f893190b
Author: Bartlomiej Zolnierkiewicz
Date:   Thu Jul 24 22:53:30 2008 +0200

    ide-generic: remove broken PPC_PREP support

    PPC_PREP has been depending on BROKEN for some time now.
Cc: Benjamin Herrenschmidt Signed-off-by: Bartlomiej Zolnierkiewicz commit d83b8b85cd56a083d30df73f3fd5e4714591b910 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:30 2008 +0200 ide: define MAX_HWIFS in * Now that ide_hwif_t instances are allocated dynamically the difference between MAX_HWIFS == 2 and MAX_HWIFS == 10 is ~100 bytes (x86-32) so use MAX_HWIFS == 10 on all archs except these ones that use MAX_HWIFS == 1. * Define MAX_HWIFS in instead of . [ Please note that avr32/cris/v850 have no and alpha/ia64/sh always define CONFIG_IDE_MAX_HWIFS. ] Signed-off-by: Bartlomiej Zolnierkiewicz commit 2c9d86438a0104800da2a8ecdc1e27baf38ba6a4 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:29 2008 +0200 ide: remove Remove and . This has been a broken code for some time now and needs rewrite to match IDE core code / host driver model anyway. Cc: Jesper Nilsson Cc: Mikael Starvik Signed-off-by: Bartlomiej Zolnierkiewicz commit b6cd7da5be2522b62bbc48d41b36c828b88e02fe Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:28 2008 +0200 ide-generic: remove "no_pci_devices()" quirk from ide_default_io_base() Since the decision to probe for ISA ide2-6 is now left to the user "no_pci_devices()" quirk is no longer needed and may be removed. Signed-off-by: Bartlomiej Zolnierkiewicz commit dbdec839c4c2bfc8f2da8e50c06b9947e5ad0394 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:28 2008 +0200 ide-generic: minor fix for mips Move ide_probe_legacy() call to ide_generic_init() so it fails early if necessary and returns the proper error value (nowadays ide_default_io_base() is used only by ide-generic). Cc: Ralf Baechle Signed-off-by: Bartlomiej Zolnierkiewicz commit ac32f3238c1d95a6ebea2c312160dbdbd61bf91c Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:27 2008 +0200 ide-generic: fix ide_default_io_base() for m32r Fix ide_default_io_base() to match ide_default_irq(). 
Cc: Hirokazu Takata Signed-off-by: Bartlomiej Zolnierkiewicz commit b0a62817961796f6dcef5f316134d8bc7279bf6e Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:27 2008 +0200 ide: fix * Add missing include. While at it: * Remove needless ide_default_{irq,io_base}() inlines. Cc: Chris Zankel Signed-off-by: Bartlomiej Zolnierkiewicz commit 37c5ef56989717d871d048f98fb6411e7a17c43d Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:27 2008 +0200 rapide: add module_exit() Cc: Russell King Signed-off-by: Bartlomiej Zolnierkiewicz commit 8e27cb1135de4cc69bf358209f91e1f7ba81eca1 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:27 2008 +0200 icside: add module_exit() Cc: Russell King Signed-off-by: Bartlomiej Zolnierkiewicz commit 585f67e736eece4cdf96b628042170273221e770 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:26 2008 +0200 via82cxxx: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit fc2c32b737fa370683f8c44d74f41febe33b9c23 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:26 2008 +0200 trm290: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit 29d72f2df933ea5ecf294b170b2f02af2af88120 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:26 2008 +0200 triflex: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit ea881d6d6c58aa6d56105d1faba7432243ea7118 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:26 2008 +0200 tc86c001: add ->remove method and module_exit() Cc: Sergei Shtylyov Signed-off-by: Bartlomiej Zolnierkiewicz commit 64b0fed31d6704e4e2e42e9a1ac5995b0a1b54e4 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:25 2008 +0200 slc90e66: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit 6ce7199897bcbad05ecd06a4df22795fb37f4d0a Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:25 2008 +0200 sl82c105: add ->remove method and module_exit() Signed-off-by: Bartlomiej 
Zolnierkiewicz commit 1ceb906b4062954e92295191402e9214345ee0e9 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:25 2008 +0200 sis5513: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit fe3825808ad67af02bd826a0d2ca6831e947e80e Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:25 2008 +0200 siimage: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit bc2c9a8025921972f0774859b8f19b324734e824 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:25 2008 +0200 serverworks: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit 991f5e69c512b284aaec81432dff0440b2a2f418 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:24 2008 +0200 sc1200: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit 0fd188047ca75df85191cc55f929cb2889631430 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:24 2008 +0200 rz1000: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit da8c3e0d21c5dbb2815d7c8f1f09e0c68f626ed1 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:24 2008 +0200 piix: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit 574a1c24b63fdb584935b4924a38b451eeb0880e Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:24 2008 +0200 pdc202xx_old: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit d69c8f8c0068b9fc7f5a5082d8a891618b732e2d Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:23 2008 +0200 pdc202xx_new: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit adc7f85ae68bd2e8db2e0136dcd4679891e5c321 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:23 2008 +0200 opti621: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit aa6e518d75742fd3ac3d2cb4c2bcbae850319fc1 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:23 2008 
+0200 ns87415: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit 1bcaaba7749dce7c0506cff0e811c9bed8121f38 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:22 2008 +0200 jmicron: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit 87d8b61356108835f5e91c0fb32b830ec585978c Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:22 2008 +0200 it821x: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit 5102f768570b3486979afb68c595b71cfb7f026f Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:22 2008 +0200 it8213: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit a6c43a2be9721d00ef9d6ef5b7b0e8113444577b Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:21 2008 +0200 hpt366: add ->remove method and module_exit() Cc: Sergei Shtylyov Signed-off-by: Bartlomiej Zolnierkiewicz commit 741ac62f6fca55ddbef52513fbc687ba6b04f99e Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:21 2008 +0200 hpt34x: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit f566bcae9fb39b108e39a2f31594c028d6ee2e77 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:21 2008 +0200 ide/pci/generic: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit cd68841b854e24076d41c32eae3ccfce6ae60a59 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:21 2008 +0200 cy82c693: add ->remove method and module_exit() Fix the refcounting for dev2 while at it. 
Signed-off-by: Bartlomiej Zolnierkiewicz commit 40c8a7f67d38de87f97a548b81b6cd0621a3ff9a Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:20 2008 +0200 cs5535: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit d16492a9789982955e627a7ffdcd1c3b945f7e85 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:20 2008 +0200 cs5530: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit e2b15b4765ca032d0837dfc8c195ecd3bc56a433 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:20 2008 +0200 cmd64x: add ->remove method and module_exit() Cc: Sergei Shtylyov Signed-off-by: Bartlomiej Zolnierkiewicz commit f354fbc4b45a730aa0f876322ea4f096b47d1013 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:20 2008 +0200 atiixp: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit b2509ac1d9dbe7a9d3a9915afbe108978002c95b Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:19 2008 +0200 amd74xx: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit 8ee3f3b69d9c37f86a45862f53451699ec77fe12 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:19 2008 +0200 alim15x3: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit eb7cb98b1cc8be1d4395d9accf49ae3924cd68f1 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:19 2008 +0200 aec62xx: add ->remove method and module_exit() Signed-off-by: Bartlomiej Zolnierkiewicz commit ef0b04276d8f719d754c092434fbd62c2aeb5307 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:19 2008 +0200 ide: add ide_pci_remove() helper * Add 'unsigned long host_flags' field to struct ide_host. * Set ->host_flags in ide_host_alloc_all(). * Always set PCI dev's ->driver_data in ide_pci_init_{one,two}(). * Add ide_pci_remove() helper (the default implementation for struct pci_driver's ->remove method). 
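    The pattern in the list above (init always stores the host in the PCI device's ->driver_data, so one shared helper can serve as the default ->remove method) can be pictured with a small user-space mock. This is only a sketch under stated assumptions: every mock_* and demo_* name is hypothetical, and the real ide_pci_remove(), struct ide_host and struct pci_driver live in the kernel and do considerably more than this.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space mocks; not the kernel's structures. */
struct mock_host {
	unsigned long host_flags;  /* copied from the driver at init time */
	int ports_registered;
};

struct mock_pci_dev {
	void *driver_data;         /* set by init so remove can find the host */
	int enabled;
};

struct mock_pci_driver {
	const char *name;
	int (*probe)(struct mock_pci_dev *dev);
	void (*remove)(struct mock_pci_dev *dev);
};

/* Shared init path: allocate the host and always record it in the device. */
static int mock_pci_init_one(struct mock_pci_dev *dev, unsigned long host_flags)
{
	struct mock_host *host = calloc(1, sizeof(*host));

	if (!host)
		return -1;
	host->host_flags = host_flags;
	host->ports_registered = 1;
	dev->enabled = 1;
	dev->driver_data = host;
	return 0;
}

/* Shared default remove: undo everything the init path did. */
static void mock_pci_remove(struct mock_pci_dev *dev)
{
	struct mock_host *host = dev->driver_data;

	assert(host != NULL);
	host->ports_registered = 0;
	free(host);
	dev->driver_data = NULL;
	dev->enabled = 0;
}

/* A host driver only needs its own probe; remove is the shared helper. */
static int demo_probe(struct mock_pci_dev *dev)
{
	return mock_pci_init_one(dev, 0x1UL);
}

static const struct mock_pci_driver demo_driver = {
	.name   = "demo_ide",
	.probe  = demo_probe,
	.remove = mock_pci_remove,
};
```

    Because the init path always fills in the driver data, a driver with no private teardown can point ->remove at the shared helper and add nothing else, which is what makes the long run of "add ->remove method and module_exit()" commits so uniform.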
Signed-off-by: Bartlomiej Zolnierkiewicz commit 37525bebcfc15a1fe5a9cb50bf49b21bf43559c1 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:18 2008 +0200 via82cxxx: cleanup ->init_chipset method * Move the boot message and via_clock setup from init_chipset_via82cxxx() to via_init_one(). * Set vdev->via_config in via_init_one() and cleanup init_chipset_via82cxxx() accordingly. Signed-off-by: Bartlomiej Zolnierkiewicz commit 0794230fd4b1bf61af8aabd7e987a595d6dbc430 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:17 2008 +0200 cmd64x: cleanup ->init_chipset method Remove verbose reporting for CMD646 (PCI device revision is always logged by IDE PCI layer). Cc: Sergei Shtylyov Signed-off-by: Bartlomiej Zolnierkiewicz commit d51f19c86583ca70468883d8137a92689f1a80c1 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:17 2008 +0200 amd74xx: cleanup ->init_chipset method Move amd_clock setup from init_chipset_amd74xx() to amd74xx_probe(). Signed-off-by: Bartlomiej Zolnierkiewicz commit b16040b14e766d390138b04c8829c816f4c1d95b Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:17 2008 +0200 tc86c001: remove ->init_chipset method * Reserve PCI BAR 5 in tc86c001_init_one() and remove no longer needed init_chipset_tc86c001(). While at it: * Add & use DRV_NAME define. Cc: Sergei Shtylyov Signed-off-by: Bartlomiej Zolnierkiewicz commit ee77325b074a73694b66ec9eca4f7e55dad58b84 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:17 2008 +0200 via82cxxx: convert to use ->host_priv Signed-off-by: Bartlomiej Zolnierkiewicz commit 4c674235d667d7ddc6b0c95a228a507eb94da2d6 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:16 2008 +0200 siimage: convert to use ->host_priv While at it: * Reserve PCI BAR 5 in siimage_init_one() and remove no longer needed setup_mmio_siimage(). 
Signed-off-by: Bartlomiej Zolnierkiewicz commit 96776f3b57eb7beb889a4368937cc9d74082a47e Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:16 2008 +0200 sc1200: convert to use ->host_priv Signed-off-by: Bartlomiej Zolnierkiewicz commit 1d76d9dc448d5a6fc7b49ba06c634aa6927bcc3d Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:16 2008 +0200 it821x: convert to use ->host_priv While at it: * Allocate both struct it821x_dev instances at once. * Don't leak itdevs on ide_pci_init_one() failure. Signed-off-by: Bartlomiej Zolnierkiewicz commit 74811f355f4f69a187fa74892dcf2a684b84ce99 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:15 2008 +0200 hpt366: convert to use ->host_priv While at it: * Allocate both struct hpt_info instances at once. Cc: Sergei Shtylyov Signed-off-by: Bartlomiej Zolnierkiewicz commit 60e57ed7c12917932a01d1679d92a7a8735afbce Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:15 2008 +0200 aec62xx: convert to use ->host_priv Signed-off-by: Bartlomiej Zolnierkiewicz commit 08da591e14cf87247ec09b17c350235157a92fc3 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:15 2008 +0200 ide: add ide_device_{get,put}() helpers * Add 'struct ide_host *host' field to ide_hwif_t and set it in ide_host_alloc_all(). * Add ide_device_{get,put}() helpers loosely based on SCSI's scsi_device_{get,put}() ones. * Convert IDE device drivers to use ide_device_{get,put}(). Signed-off-by: Bartlomiej Zolnierkiewicz commit 6cdf6eb357c2681596b7b1672b92396ba82333d4 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:14 2008 +0200 ide: add ->dev and ->host_priv fields to struct ide_host * Add 'struct device *dev[2]' and 'void *host_priv' fields to struct ide_host. * Set ->dev[] in ide_host_alloc_all()/ide_setup_pci_device[s](). * Pass 'void *priv' argument to ide_setup_pci_device[s]() and use it to set ->host_priv. * Set PCI dev's ->driver_data to point to the struct ide_host instance if PCI host driver wants to use ->host_priv. 
* Rename ide_setup_pci_device[s]() to ide_pci_init_{one,two}(). Signed-off-by: Bartlomiej Zolnierkiewicz commit 8c2eece50a368c7986bae0b3e52739558dd71b51 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:12 2008 +0200 ide: call ide_pci_setup_ports() before do_ide_setup_pci_device() * Call ide_pci_setup_ports() before do_ide_setup_pci_device() in ide_setup_pci_device[s](). While at it: * Remove stale FIXMEs. Signed-off-by: Bartlomiej Zolnierkiewicz commit a742d6cf0b37b1a96a1549b1fda0d6b19e0185c2 Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:12 2008 +0200 ide: move ide_setup_pci_controller() call to ide_setup_pci_device[s]() There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz commit a95925a309cd9a2e7f5a5713fd70e0dadb09890c Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:11 2008 +0200 ide: respect dev->irq in do_ide_setup_pci_device() also if 'tried_config' * If device is in the PCI native mode respect dev->irq regardless of 'tried_config' in do_ide_setup_pci_device(). * Drop no longer needed 'config' argument from ide_setup_pci_controller(). Signed-off-by: Bartlomiej Zolnierkiewicz commit 708e5f9eb68589b87724af3f0fb4e681dfdfd69f Author: Bartlomiej Zolnierkiewicz Date: Thu Jul 24 22:53:11 2008 +0200 ide: always call ->init_chipset method in do_ide_setup_pci_device() Call ->init_chipset method also for 'tried_config' / '!pciirq' conditions. Signed-off-by: Bartlomiej Zolnierkiewicz commit 5c402355adf8f920531f02099f4ec0d2bccd4c64 Merge: ecc8b65... 2cc1773... 
Author: Linus Torvalds Date: Thu Jul 24 12:56:07 2008 -0700 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: MAINTAINERS: Remove Glenn Streiff from NetEffect entry mlx4_core: Improve error message when not enough UAR pages are available IB/mlx4: Add support for memory management extensions and local DMA L_Key IB/mthca: Keep free count for MTT buddy allocator mlx4_core: Keep free count for MTT buddy allocator mlx4_code: Add missing FW status return code IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg mlx4_core: Add module parameter to enable QoS support RDMA/iwcm: Remove IB_ACCESS_LOCAL_WRITE from remote QP attributes IPoIB: Include err code in trace message for ib_sa_path_rec_get() failures IB/sa_query: Check if sm_ah is NULL in ib_sa_remove_one() IB/ehca: Release mutex in error path of alloc_small_queue_page() IB/ehca: Use default value for Local CA ACK Delay if FW returns 0 IB/ehca: Filter PATH_MIG events if QP was never armed IB/iser: Add support for RDMA_CM_EVENT_ADDR_CHANGE event RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT event RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event commit ecc8b655b38a880b578146895e0e1e2d477ca2c0 Merge: 2528ce3... e338125... Author: Linus Torvalds Date: Thu Jul 24 12:55:01 2008 -0700 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: nohz: adjust tick_nohz_stop_sched_tick() call of s390 as well nohz: prevent tick stop outside of the idle loop commit 2528ce3237be4e900f5eaa455490146e1422e424 Merge: 8ffa5b6... 36bd53d... 
Author: Linus Torvalds Date: Thu Jul 24 12:54:26 2008 -0700 Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: arch/mips/kernel/stacktrace.c: Heiko can't type kthread: reduce stack pressure in create_kthread and kthreadd fix core/stacktrace changes on avr32, mips, sh commit 8ffa5b65968262ba6bb046329972791c0d960745 Merge: 6209ed9... 58838cf... Author: Linus Torvalds Date: Thu Jul 24 12:53:51 2008 -0700 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: clean up compiler warning sched: fix hrtick & generic-ipi dependency commit 6209ed9d8443b63c36d340908530fa470c4d4fff Author: Linus Torvalds Date: Thu Jul 24 12:49:26 2008 -0700 x86-64: make BUILD_IRQ() also reset section back Commit 9d25d4db81833029d30b7b03cc1000cbbe09e192 ("x86: BUILD_IRQ say .text to avoid .data.percpu") added a ".text" specifier to make sure that BUILD_IRQ() builds the irq trampoline in the text segment rather than in some random left-over segment that the compiler happened to leave the asm in. However, we should also make sure that we switch back by adding a ".previous" at the end, so that there are no subtle issues with subsequent compiler-generated code. Signed-off-by: Linus Torvalds commit 6044110742bc2ae0577b962985e7c63e0634b2e9 Merge: 7540081... 04bbe43... 
Author: Linus Torvalds Date: Thu Jul 24 12:33:51 2008 -0700 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix header export, asm-x86/processor-flags.h, CONFIG_* leaks x86: BUILD_IRQ say .text to avoid .data.percpu xen: don't use sysret for sysexit32 x86: call early_cpu_init at the same point commit 7540081c6b16dc941895bca840749cabfd0d3b48 Merge: 3fde80e... b552068... Author: Linus Torvalds Date: Thu Jul 24 12:24:40 2008 -0700 Merge branch 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc * 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: Remove __DECLARE_SEMAPHORE_GENERIC Remove asm/semaphore.h Remove use of asm/semaphore.h Add missing semaphore.h includes Remove mention of semaphores from kernel-locking commit 3fde80e94c2bbffbb13f5faa3340cf438440ebea Merge: ac9f80a... 9b0e741... Author: Linus Torvalds Date: Thu Jul 24 12:17:19 2008 -0700 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68knommu: put ColdFire head code into .text.head section m68knommu: remove last use of CONFIG_FADS and CONFIG_RPXCLASSIC m68knommu: remove RPXCLASSIC from the m68k tree m68knommu: fec: remove FADS m68knommu: MCF5307 PIT GENERIC_CLOCKEVENTS support m68knommu: add read_barrier_depends() and irqs_disabled_flags() m68knommu: add byteswap assembly opcode for ISA A+ m68knommu: add ffs and __ffs plattform which support ISA A+ or ISA C m68knommu: add sched_clock() for the DMA timer m68knommu: complete generic time m68knommu: move code within time.c m68knommu: m68knommu: add old stack trace method m68knommu: Add Coldfire DMA Timer support m68knommu: defconfig for M5407C3 board m68knommu: defconfig for M5307C3 board m68knommu: defconfig for M5275EVB board m68knommu: defconfig for 
M5249EVB board m68knommu: change to a configs directory for board configurations commit ac9f80ad16e6e934b6c1f12f82d27889c0f9abcc Merge: c54554d... f6ec2d9... Author: Linus Torvalds Date: Thu Jul 24 12:16:40 2008 -0700 Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-backlight * 'for-linus' of git://git.o-hand.com/linux-rpurdie-backlight: backlight: Fix missing kernel doc entry backlight: Add Nvidia-based Apple Macbook Pro backlight driver commit c54554d388369f7f88ddcbe285ca96f7fb8a2d4b Merge: 4378dcc... fe3025b... Author: Linus Torvalds Date: Thu Jul 24 12:16:02 2008 -0700 Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds * 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds: leds: Ensure led->trigger is set earlier leds: Add support for Philips PCA955x I2C LED drivers leds: Fix sparse warnings in leds-h1940 driver leds: mark led_classdev.default_trigger as const leds: fix unsigned value overflow in atmel pwm driver leds: Add pca9532 platform data for Thecus N2100 leds: Add pca9532 led driver commit 4378dcca8578b0fd0fba883a3354ad4820d4f85f Merge: c3c2233... 7ae93f5... Author: Linus Torvalds Date: Thu Jul 24 12:15:16 2008 -0700 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Fix cpufreq notifier registry. sparc64: Fix lockdep issues in LDC protocol layer. commit c3c2233d84bee397b8271923c007264eb3efa67b Merge: f924727... f867e6a... Author: Linus Torvalds Date: Thu Jul 24 12:14:58 2008 -0700 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: pkt_sched: sch_sfq: dump a real number of flows atm: [fore200e] use MODULE_FIRMWARE() and other suggested cleanups netfilter: make security table depend on NETFILTER_ADVANCED tcp: Clear probes_out more aggressively in tcp_ack(). 
e1000e: fix e1000_netpoll(), remove extraneous e1000_clean_tx_irq() call net: Update entry in af_family_clock_key_strings netdev: Remove warning from __netif_schedule(). sky2: don't stop queue on shutdown commit f9247273cb69ba101877e946d2d83044409cc8c5 Author: Steven Whitehouse Date: Thu Jul 24 17:22:13 2008 +0100 UFS: add const to parser token table This patch adds a "const" to the parser token table. I've done an allmodconfig build to see if this produces any warnings/failures and the patch includes a fix for the only warning that was produced. Signed-off-by: Steven Whitehouse Acked-by: Alexander Viro Acked-by: Evgeniy Dushistov Signed-off-by: Linus Torvalds commit b340e8a57ef381e69c99a7a8ede61a6bf71a8014 Author: Akinobu Mita Date: Wed Jul 23 21:31:51 2008 -0700 auxdisplay: small cleanups - Use BUILD_BUG_ON for CFAG12864B_SIZE instead of runtime-check - Use get_zeroed_page() Signed-off-by: Akinobu Mita Cc: Miguel Ojeda Sandonis Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 5bb49fcd501aa9fd3d321a22b7c01d9b0db7ab36 Author: Philippe De Muyter Date: Wed Jul 23 21:31:50 2008 -0700 video/fb: cleanup FB_MAJOR usage Currently, linux/major.h defines a GRAPHDEV_MAJOR (29) that nobody uses, and linux/fb.h defines the real FB_MAJOR (also 29), that only fbmem.c needs. Drop GRAPHDEV_MAJOR from major.h, move FB_MAJOR definition from fb.h to major.h, and fix fbmem.c to use major.h's definition. Signed-off-by: Philippe De Muyter Cc: Krzysztof Helt Cc: "Antonino A. 
Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit cba603bf514c101bf48f6adf393c3d00ed457a57 Author: Jan Beulich Date: Wed Jul 23 21:31:49 2008 -0700 fbcon: remove stray semicolons [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jan Beulich Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3e074058d72486676f6fdf6fe803200c62dcb403 Author: Hans-Christian Egtvedt Date: Wed Jul 23 21:31:48 2008 -0700 fbdev: LCD backlight driver using Atmel PWM driver This patch adds a platform driver using the ATMEL PWM driver to control a backlight which requires a PWM signal and optional GPIO signal for discrete on/off signal. It has been tested on Favr-32 board from EarthLCD. The driver is configurable by supplying a struct with the platform data. See the include/linux/atmel-pwm-bl.h for details. The board code for Favr-32 will be submitted to the AVR32 kernel list. Signed-off-by: Hans-Christian Egtvedt Cc: Krzysztof Helt Cc: Haavard Skinnemoen Cc: Richard Purdie Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 2d04a4a72d7e1519b4838f24bdd4b5d0f3f426dc Author: Stefano Stabellini Date: Wed Jul 23 21:31:48 2008 -0700 fbcon: bgcolor fix The fourth bit of the background color is the blink property bit, not the intensity bit, as for the foreground color. Therefore it shouldn't be included in the background color. Signed-off-by: Stefano Stabellini Cc: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4a25e41831ee851c1365d8b41decc22493b18e6d Author: Nobuhiro Iwamatsu Date: Wed Jul 23 21:31:46 2008 -0700 video: sh7760fb: SH7760/SH7763 LCDC framebuffer driver Framebuffer driver for the SH7760/SH7763 integrated LCD controller. Signed-off-by: Manuel Lauss Signed-off-by: Nobuhiro Iwamatsu Reviewed-by: Paul Mundt Cc: Krzysztof Helt Cc: "Antonino A. 
Daplas" Cc: Siegfried Schaefer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c6b044d6bab5e2878d408666469362fc200a889a Author: Krzysztof Helt Date: Wed Jul 23 21:31:45 2008 -0700 neofb: drop the xtimings structure Remove the xtimings structure which only stored some values to be used later (mostly once). Calculate and use these values in places they are needed. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 1ca6b62f8ca668ccfab0da9112c0125ef82343bd Author: Krzysztof Helt Date: Wed Jul 23 21:31:45 2008 -0700 neofb: drop redundant code Drop structure which is only set but never read. Drop variables which are only set and never read. Convert one long switch into two shorter ones. Add cpu_relax() in busy waiting loop. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7fc80b7bd682b47825e806018cca8ff7dc6bb55a Author: Krzysztof Helt Date: Wed Jul 23 21:31:44 2008 -0700 neofb: simplify clock calculation There is nothing to gain by converting value in kHz to fixed point MHz. Just calculate everything in kHz. A reorder of the loop allows reducing number of iterations (check if frequency is not too high already). 
Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 5798712d608f5ebad994487748a2ccf3cc613d78 Author: Adrian Bunk Date: Wed Jul 23 21:31:43 2008 -0700 drivers/video/amifb.c cleanups This patch contains the following cleanups: - make the needlessly global amifb_init() static - rename cleanup_module() to amifb_exit(), make it static __exit, use module_exit(), there's no need to #ifdef MODULE it Signed-off-by: Adrian Bunk Acked-by: Geert Uytterhoeven Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 104b198dd0b3b62a4fc4e9146f01f2abc718e926 Author: Jordan Crouse Date: Wed Jul 23 21:31:43 2008 -0700 lxfb: fix console blanking Simply enabling DAC blanking without turning off the CRT seems to be resulting in characters remaining on the screen when the monitor blanks. This patch turns off the CRT for all modes, and also powers down the DACs when vsync and/or hsync are disabled. Signed-off-by: Jordan Crouse Acked-by: Andres Salomon Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit be935d5b6301865b4e9ec35d79d398cedb3c82b7 Author: Andres Salomon Date: Wed Jul 23 21:31:41 2008 -0700 lxfb: drop dead declarations from header We never sent the gamma stuff upstream, and don't really care about it. However, lx_[gs]_et_gamma prototypes snuck into lxfb.h anyways; there are no definitions for them. Drop the dead code. Signed-off-by: Andres Salomon Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 18b095d4b847bb08bf8a1bace7711a93d27732c0 Author: Yoichi Yuasa Date: Wed Jul 23 21:31:41 2008 -0700 drivers/char: remove old broken Cobalt LCD driver Remove old broken Cobalt LCD driver. Signed-off-by: Yoichi Yuasa Acked-by: Ralf Baechle Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 5abe3b4063f16245b8fafbff37bd93814eb8e363 Author: Yoichi Yuasa Date: Wed Jul 23 21:31:40 2008 -0700 fbdev: add new Cobalt LCD framebuffer driver Add new Cobalt LCD framebuffer driver. 
[akpm@linux-foundation.org: fix build] Signed-off-by: Yoichi Yuasa Cc: Krzysztof Helt Cc: "Antonino A. Daplas" Cc: Ralf Baechle Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6b51d51a9d24719f905ba9657b29e04efd82a7ea Author: Timur Tabi Date: Wed Jul 23 21:31:39 2008 -0700 fsl-diu-fb: update Freescale DIU driver to use page_alloc_exact() Update the Freescale DIU driver to use page_alloc_exact() to allocate a DMA buffer. This also eliminates the rheap-based memory allocator. We can do this now because commit 6ccf61f9 allows us to allocate 8MB physically-contiguous memory blocks. [akpm@linux-foundation.org: fix printk warnings] Signed-off-by: Timur Tabi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c25826a7cf1c61b5c6e6db8365172eb97ef39ef3 Author: Ben Dooks Date: Wed Jul 23 21:31:38 2008 -0700 lcd: add platform_lcd driver Add a platform_lcd driver to allow boards with simple lcd power controls to register themselves easily. [akpm@linux-foundation.org: build fix] Signed-off-by: Ben Dooks Cc: Richard Purdie Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 0c531360ed504aa0ce995fcb8ef08e82b6534d0b Author: Ben Dooks Date: Wed Jul 23 21:31:38 2008 -0700 lcd: add lcd_device to check_fb() entry in lcd_ops Add the lcd_device being checked to the check_fb entry of lcd_ops. This ensures that any driver using this to check against its own state can do so, and also makes all the calls in lcd_ops more orthogonal in their arguments. Signed-off-by: Ben Dooks Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit cccb6d3c149603b9c15d3c460dff317455df1766 Author: Ben Dooks Date: Wed Jul 23 21:31:37 2008 -0700 fb: add support for the ILI9320 video display controller Provide support for the ILI9320 display controller chip which is found in many LCD displays. Included with this is support for an example LCD using this chip, the VGG2432A4.
Signed-off-by: Ben Dooks Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit d05254190dd1a4751284f4a51efb70fcc16c45a4 Author: Ben Dooks Date: Wed Jul 23 21:31:37 2008 -0700 sm501: fixup allocation code to be 64bit resource compliant As pointed out by Andrew Morton, we have a problem when setting the 64bit resources option. Alter the allocation routines to remove the need to use the start and end fields, use the proper HEAD_PANEL/HEAD_CRT and update the comments. Note, we also fix the bug where we failed to check the size of the CRT memory allocation. [akpm@linux-foundation.org: cleanup] Signed-off-by: Ben Dooks Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 9b599fb2fc23386dfd2965bf7d10b2b0f628b208 Author: Ben Dooks Date: Wed Jul 23 21:31:36 2008 -0700 sm501: restructure init to allow only 1 fb on an SM501 Add the ability to register only one of the two possible main framebuffer devices on the SM501 by passing platform data for only the framebuffer that you are interested in having. As a side note, we update the init sequence to commonise the code that is executed twice, and fix a pair of missing frees that we didn't do on framebuffer exit, such as freeing the fb's cmap. Signed-off-by: Ben Dooks Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 206c5d69d0540024faffd423fc703f1e457332d7 Author: Ben Dooks Date: Wed Jul 23 21:31:35 2008 -0700 sm501: add inversion controls for VBIASEN and FPEN Add flags to allow the driver to invert the sense of both VBIASEN and FPEN signals coming from the SM501. Signed-off-by: Ben Dooks Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 968910bd03b226ed410d092c2da59dffe5bfe8de Author: Nicolas Ferre Date: Wed Jul 23 21:31:34 2008 -0700 atmel_lcdfb: avoid division by zero Avoid division by zero in atmel_lcdfb_check_var() function.
If pixclock is not specified while passing a var structure in the check_var() function, a division by zero occurs (when translating pixclock to KHz). This patch adds a check of this value and tries to choose a video mode from the modelist. The mode found in the probe function is added to the modelist. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Nicolas Ferre Cc: Haavard Skinnemoen Cc: Andrew Victor Cc: "Antonino A. Daplas" Cc: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 84c41ce83e9b2987ccef352f28ba0055b26c8f8e Author: Krzysztof Helt Date: Wed Jul 23 21:31:34 2008 -0700 skeletonfb: update to correct platform driver usage Update skeletonfb to the new platform driver API. skeletonfb is a template for creating new drivers. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a882ef47c7156e8cc47e72f2aa396f2514569c48 Author: Akinobu Mita Date: Wed Jul 23 21:31:33 2008 -0700 aty: use memory_read_from_buffer() Signed-off-by: Akinobu Mita Cc: Benjamin Herrenschmidt Cc: "Antonino A. Daplas" Cc: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 1c554ff9554d67b4db0fb5e2f78c7cb4b2e0d627 Author: Ville Syrjala Date: Wed Jul 23 21:31:32 2008 -0700 atyfb: fix a cast The argument to iounmap() is void __iomem *. Fix the cast. Signed-off-by: Ville Syrjala Cc: "Antonino A. Daplas" Cc: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 89c69d2b8eb3ee2338fded9d70a0795b4712f112 Author: Ville Syrjala Date: Wed Jul 23 21:31:32 2008 -0700 atyfb: report probe errors Properly propagate errors to the probe function. Signed-off-by: Ville Syrjala Cc: "Antonino A. Daplas" Cc: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6cfafc15994ac2a2377b32b5a65cf62a90a80d49 Author: Ville Syrjala Date: Wed Jul 23 21:31:31 2008 -0700 atyfb: use a PCI device ID table Convert atyfb to use a PCI device ID table.
Signed-off-by: Ville Syrjala Cc: "Antonino A. Daplas" Cc: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3880b0b5297ae9bf58a7662d13a46b5d5f0b2af6 Author: Ville Syrjala Date: Wed Jul 23 21:31:30 2008 -0700 atyfb: correct_chipset() can fail Atari probe code relies on correct_chipset() failing if the device is not a mach64 GX/CX. aty_chips[] array would be indexed with -1 in that case. Signed-off-by: Ville Syrjala Cc: "Antonino A. Daplas" Cc: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 50cd0221c9062ec5dac8a3620f36f568df052ac1 Author: Olaf Hering Date: Wed Jul 23 21:31:29 2008 -0700 atyfb: remove dead code Remove dead code. This will slightly change the behaviour of the driver on systems that support backlight control. Previously they would just turn the backlight off using the backlight control but now the generic LCD code will also turn off the LCD using the POWER_MANAGEMENT register. Signed-off-by: Olaf Hering Signed-off-by: Ville Syrjala Cc: "Antonino A. Daplas" Cc: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7572a1ea034a8fc45e57de28cc7573264975532a Author: Ville Syrjala Date: Wed Jul 23 21:31:28 2008 -0700 fbdev: xoffset, yoffset and yres are unsigned The xoffset, yoffset and yres members of fb_var_screeninfo are __u32. Make them unsigned in the code as well. Signed-off-by: Ville Syrjala Cc: "Antonino A. Daplas" Cc: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 43a3abc6aca8505e708508e2c7c2f99a7f8f820b Author: Ville Syrjala Date: Wed Jul 23 21:31:27 2008 -0700 fbdev: width and height are unsigned The width and height members of fb_var_screeninfo are __u32. The code initializes them to -1 which seems wrong, and 0 seems like an equally good default value. Signed-off-by: Ville Syrjala Cc: "Antonino A. 
Daplas" Cc: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 2870086e9f2032bdd95b8da9bd187e3c16fc6d49 Author: Krzysztof Helt Date: Wed Jul 23 21:31:26 2008 -0700 hgafb: convert to new platform driver API Convert the hgafb driver to use new platform driver API. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=9689 Signed-off-by: Krzysztof Helt Cc: Anton Vorontsov Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b604838ac6d233fd6bffc0e758a818133a01ff22 Author: Frans Pop Date: Wed Jul 23 21:31:26 2008 -0700 vfb: only enable if explicitly requested when compiled in The Kconfig help for the vfb driver says: Do NOT enable it for normal systems! To protect the innocent, it has to be enabled explicitly at boot time using the kernel option `video=vfb:'. This change lets the code match the description. Support for vfb:disable is kept for backwards compatibility; vfb:off works because it is tested at a higher level. Note: any undefined option (e.g. vfb:enable) will also enable this driver. The relevant code has been unchanged since before the migration to git (2.6.12). This patch fixes bugzilla #9310 and was the root cause behind http://lkml.org/lkml/2008/5/31/220. Signed-off-by: Frans Pop Cc: Antonino A. Daplas Acked-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit cfb4f5d1750e05f43902197713c50c29e7dfbc99 Author: Magnus Damm Date: Wed Jul 23 21:31:24 2008 -0700 fbdev: SuperH Mobile LCDC Driver This is the SuperH Mobile LCDC frame buffer driver V2, adding support for the LCDC block found in SuperH Mobile processors. The hardware supports up to two LCD panels per LCDC block, and both RGB and SYS interfaces can be used to hook up LCD panels/modules. The device driver is a regular platform driver, so LCD configuration and board specific hooks are passed to the driver using platform data. 
LCD modules using SYS interface often require special configuration using the SYS bus, and to solve this cleanly the driver provides SYS interface operations to the board code. Tested on sh7723 and sh7722 processors with a SYS16A QVGA panel and WVGA panels using RGB16 and RGB18 interfaces. Signed-off-by: Magnus Damm Acked-by: Paul Mundt Reviewed-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c2c12155cf05bf3e25eeae5711beffc634505400 Author: Krzysztof Helt Date: Wed Jul 23 21:31:24 2008 -0700 tdfxfb: remove ypan checks done by a higher layer These checks and assignments are done by a higher layer so remove them from the driver. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 98219374d9ed2d257e56e8e1fcd9d16a083397bb Author: Krzysztof Helt Date: Wed Jul 23 21:31:23 2008 -0700 vga16fb: source code improvement Use constants and functions from the vga.h file. Also add module description. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ea9014bcacf236124d5e0ff971838049a98456cb Author: Krzysztof Helt Date: Wed Jul 23 21:31:22 2008 -0700 tdfxfb: add mode_option module parameter Small step toward unification of mode setting parameter. This is required to fix the Bugzilla's bug 9847 Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a90ed92ed852a3d4b8a6f20b10bba771997f5ede Author: Krzysztof Helt Date: Wed Jul 23 21:31:22 2008 -0700 tridentfb: documentation update Make the tridentfb documentation closer to current state of the tridentfb driver. Fix also some formatting. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 012e26096b36bfeacaba2c9e31eaf32d6faa6567 Author: Krzysztof Helt Date: Wed Jul 23 21:31:21 2008 -0700 uvesafb: change mode parameter to mode_option Make more drivers use the "mode_option" parameter. 
This one is quite new so drop the old "mode" parameter before someone starts using it seriously. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 49a1d28f57adc9cb064572f0373e26363b0a412f Author: Krzysztof Helt Date: Wed Jul 23 21:31:21 2008 -0700 fbcon: make logo_height a local variable Make logo_height variable local in the only function it is used. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit d22579b837358cbef12ccca5adaf7e93ae09ab7a Author: Nicolas Ferre Date: Wed Jul 23 21:31:20 2008 -0700 atmel_lcdfb: FIFO underflow management Manage atmel_lcdfb FIFO underflow Resetting the LCD and DMA allows to fix screen shifting after a FIFO underflow. It follows reset sequence from errata "LCD Screen Shifting After a Reset". Signed-off-by: Nicolas Ferre Cc: Haavard Skinnemoen Cc: Andrew Victor Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 77a6e7abb09de0e85a15e2fe42c21ffc59847759 Author: Roel Kluin <12o3l@tiscali.nl> Date: Wed Jul 23 21:31:19 2008 -0700 vga16fb: test virtual screen range before subtraction on unsigned dx and dy are u32's, so the test should occur before the subtraction Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Cc: Antonino Daplas Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 1c0face9d4024bf942096297937759bdf0e1aeac Author: Roel Kluin <12o3l@tiscali.nl> Date: Wed Jul 23 21:31:18 2008 -0700 atafb: test virtual screen range before subtraction on unsigned dx and dy are u32's, so the test should occur before the subtraction Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Cc: Tim Schmielau Cc: Krzysztof Helt Cc: Antonino Daplas Cc: Geert Uytterhoeven Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 091c82c01295719d47b89b38d24e41ad2066ead8 Author: Roel Kluin <12o3l@tiscali.nl> Date: Wed Jul 23 21:31:18 2008 -0700 amifb: test virtual screen range before subtraction on unsigned dx and dy are u32's, so the test 
should occur before the subtraction Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Cc: Antonino Daplas Cc: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 816664f88707b03fde24fb09759d569ed42406cb Author: Roel Kluin <12o3l@tiscali.nl> Date: Wed Jul 23 21:31:17 2008 -0700 aty128fb: test below 0 on unsigned pll->post_divider pll->post_divider is unsigned, so the test fails Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Cc: Benjamin Herrenschmidt Cc: Antonino Daplas Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit fcea8030b3c2e71ad89f080901c63a04f07881c8 Author: Tony Breeds Date: Wed Jul 23 21:31:16 2008 -0700 drivers/video/aty/radeon_base.c: notify user if sysfs_create_bin_file() failed Current kernel builds warn about: drivers/video/aty/radeon_base.c: In function 'radeonfb_pci_register': drivers/video/aty/radeon_base.c:2334: warning: ignoring return value of 'sysfs_create_bin_file', declared with attribute warn_unused_result drivers/video/aty/radeon_base.c:2336: warning: ignoring return value of 'sysfs_create_bin_file', declared with attribute warn_unused_result Do minimal checking of these functions and issue a warning if either fails. They don't seem to be critical.. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Tony Breeds Cc: "Antonino A. Daplas" Cc: Benjamin Herrenschmidt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7951ac91c7d45b61f54f1cdabc24b52b40785de6 Author: Matthias Kaehlcke Date: Wed Jul 23 21:31:16 2008 -0700 sa1100fb: convert ctrlr_sem in a mutex The semaphore ctrlr_sem is used as a mutex. Convert it to the mutex API Signed-off-by: Matthias Kaehlcke Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b91dbce56a8dbf312f6255d5121b295553d2b4db Author: Matthias Kaehlcke Date: Wed Jul 23 21:31:14 2008 -0700 pxafb: convert ctrlr_sem in a mutex The semaphore ctrlr_sem is used as a mutex. Convert it to the mutex API. 
Signed-off-by: Matthias Kaehlcke Cc: Daniel Mack Cc: Eric Miao Cc: Russell King Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 14aefd1b49ff3bd13caa37fb06bd53488d5d1486 Author: Adrian Bunk Date: Wed Jul 23 21:31:12 2008 -0700 video/sis/: remove compat code This patch removes compat code for older kernel versions. Signed-off-by: Adrian Bunk Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 0b9cf3aa6b1e934807b40b4d478d7e11f7c43f55 Author: Roland Kletzing Date: Wed Jul 23 21:31:10 2008 -0700 mdacon messing up default vc's - set default to vc13-16 again mdacon incorrectly detects MDA hardware on systems without such a graphics card. One may load this module by chance, for example when doing some systematic module-testing, and if there is no Monochrome Display Adapter attached, module init renders vc1-16 completely unusable. I and others have run into this more than once. See [Bug 224522 - modprobe mdacon freezes machine -> https://bugzilla.novell.com/show_bug.cgi?id=224522 ] for example. Proper MDA detection seems to have been broken for a long time; this seems to be related to those #ifdef TEST_MDA_B statements added by Edward Betts. This commit back in 2002 made things even worse: http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commit;h=c72757b49c88914433244757fb4967fc63546685 It changed default vc allocation from 13-16 to 1-16 for no apparent reason (!?), and with that (and without X), mdacon grabs the vc you're currently sitting on and locks you out. This is from Kconfig: >config MDA_CONSOLE > depends on !M68K && !PARISC && ISA > tristate "MDA text console (dual-headed) (EXPERIMENTAL)" > ---help--- > Say Y here if you have an old MDA or monochrome Hercules graphics > adapter in your system acting as a second head ( = video card). You > will then be able to use two monitors with your Linux system. Do not > say Y here if your MDA card is the primary card in your system; the > normal VGA driver will handle it.
As we can see, mdacon is just meant as an additional driver for dual-head setups, and since kernel 2.4.36 still defaults to vc13-16, setting the default back to that value shouldn't do any harm. I'm hereby reverting that change, setting the default back to vc13-16. mdacon may rarely or never be used these days and could perhaps be thrown out anyway (pre-dinosaur hardware!), and this is not a real solution, but at least it removes the unfortunate side-effect of messing up the vc you're working on. Signed-off-by: Roland Kletzing Cc: James Simmons Cc: "Antonino A. Daplas" Cc: Tim Schmielau Cc: Jan Engelhardt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 663b0e15877293451bdfea619db45eafae9dec54 Author: Krzysztof Helt Date: Wed Jul 23 21:31:09 2008 -0700 tridentfb: remove warning message that cyblafb driver should be used The tridentfb driver should now handle all chipsets handled by the cyblafb driver. Remove the message which claims that support will be removed. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 0292be4a382957016e8b574dc292779cfb49e029 Author: Krzysztof Helt Date: Wed Jul 23 21:31:08 2008 -0700 tridentfb: add imageblit acceleration for Blade3D family Add imageblit acceleration for the Blade3D family of cores. The code is based on code from the cyblafb driver. It is a step toward assimilating the cyblafb driver back into the tridentfb driver. The cyblafb driver handles a subfamily of the Trident Blade3D cores.
Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6280fd4f9c2683a4d2f096320dd74ded4e5106ad Author: Krzysztof Helt Date: Wed Jul 23 21:31:08 2008 -0700 tridentfb: Blade3D clock fixes This patch fixes following problems: - does not allow the m parameter to reach 0 as it locks the graphics core (power cycle needed) - for the newer chips (with new clock registers) does not allow of n / m ratio below 4 as it gives unstable image on the Blade3D core - extend shift parameter (k) range to 2 for the newer chips to cope with the n /m >= 4 limit at low resolution (bandwidth) modes - prefer modes with higher n / m ratio (higher k values) Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f330c4b1961d730ef15ac184e4b7f1c25847d0ae Author: Krzysztof Helt Date: Wed Jul 23 21:31:07 2008 -0700 tridentfb: y-panning fixes The Trident cards uses only 20-bit address of screen start in double words. This allows addressing for only 4MB of video memory so check this. Also remove some redundant checks and assignments. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a4af1798d768ab2f12ab623e21ad68dc8c248005 Author: Krzysztof Helt Date: Wed Jul 23 21:31:06 2008 -0700 tridentfb: fix 224 color logo at 8 bpp Fix depth setting for 8 bpp mode. The nice 224 color logo is not displayed in 8 bpp depth without this fix. 
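A note on the y-panning fix above: the 4MB figure follows from the arithmetic 2^20 double words x 4 bytes = 4 MiB. The sketch below shows one way such a limit could be checked. It is an assumption-laden illustration, not the driver's actual check; pan_offset_ok() and the two macros are hypothetical names.

```c
#include <stdint.h>

/* From the commit text: the screen start is a 20-bit address counted in
 * 32-bit double words. The macro names are made up for this sketch. */
#define START_ADDR_BITS   20u
#define MAX_START_DWORDS  (1u << START_ADDR_BITS)   /* 1048576 dwords = 4 MiB */

/* Hypothetical check: returns 1 if a screen start at line `yoffset`, with
 * `line_length` bytes per line, fits in the 20-bit start register. */
static int pan_offset_ok(uint32_t yoffset, uint32_t line_length)
{
    /* 64-bit product so that large virtual resolutions cannot overflow. */
    uint64_t start_bytes = (uint64_t)yoffset * line_length;
    uint64_t start_dwords = start_bytes / 4;

    return start_dwords < MAX_START_DWORDS;
}
```

With 2048 bytes per line, line 2047 starts just below the 4 MiB boundary and passes, while line 2048 starts exactly at 4 MiB and is rejected.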
Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 13b0de49f52ec8638b3e3e59192a959b35214d9e Author: Krzysztof Helt Date: Wed Jul 23 21:31:06 2008 -0700 tridentfb: fix console freeze when switching from X11 This patch fixes two problems when acceleration is enabled: - a console switch from Xorg locks up the computer because the Xorg code locks some registers and disables the mmio mode, so reenable these in tridentfb_set_par() and enable_mmio() - blacklist the Image975 chipset from setting PCI burst mode. This helps with random lockups of the framebuffer on this chip. The same fix is probably needed for Xorg as well. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 5cf138457af20b0ef79d8c249381927718ca1417 Author: Krzysztof Helt Date: Wed Jul 23 21:31:05 2008 -0700 tridentfb: source code improvements This patch contains general source code improvements: - more simple functions are inline - removes some meaningless output and the VERSION string as it is of no use - eng_par is moved into the tridentfb_par - removed a small section of code for CyberBladeXPAi1 which is maybe right for only one resolution and refresh rate and is probably redundant now - other minor improvements Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 01a2d9ed85c945fc8a672622780533a1a0b7caf5 Author: Krzysztof Helt Date: Wed Jul 23 21:31:04 2008 -0700 tridentfb: acceleration constants change This patch replaces the deprecated constant FB_ACCELF_TEXT with FBINFO_HWACCEL_DISABLED and adds constants for Trident families of accelerators. FBINFO_HWACCEL_DISABLED is now used correctly, so the noaccel parameter works.
Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 34dec24317d6824b7db172bb0072b909a9c376f2 Author: Krzysztof Helt Date: Wed Jul 23 21:31:04 2008 -0700 tridentfb: various pixclock and timing improvements This patch fixes a few issues related to timings and pixclock generation: - disallow pixclocks with a numerator lower than double the denominator. This fixes display instability for some modes. - choose the pixelclock with the highest numerator and denominator values. This improves image quality and fixes display instability for some modes. - make interlaced modes work. - set synchronization pulse polarity correctly. - horizontal synchronization timings are now the same as generated by X. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 2c86a0c26fbe8ea218f7a267645679fb78aba8a3 Author: Krzysztof Helt Date: Wed Jul 23 21:31:03 2008 -0700 tridentfb: acceleration bug fixes This patch fixes two problems when acceleration is enabled: - bit for bitblt direction is corrected so scrolling down works as expected on 3DImage chips - initialization of acceleration is done later; this helps with the initial console malfunction (on Blade3D chips) well documented here: http://marc.info/?l=linux-fbdev-users&m=111386953124478&w=2 Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 49b1f4b44bcdc47a10d2b354b269305043ef2a32 Author: Krzysztof Helt Date: Wed Jul 23 21:31:02 2008 -0700 tridentfb: acceleration code improvements This patch brings various acceleration improvements: - set copyarea/fillrect for non-accelerated framebuffer (fix) - remove 15 bpp depth handling to simplify code as it hardly works (15 bpp handling was obviously missing in some switches) - add fb_sync call and move waiting before accelerated function to make acceleration more asynchronous to cpu (few % of speed improvement) - add cpu_relax() call in waiting loops - make longer register
names and name more registers - move registers' definition to header - general code improvements (shortening, simplifying) Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit bcac2d5fe36238dcfc955b49f9db10ad3ae3e53c Author: Krzysztof Helt Date: Wed Jul 23 21:31:01 2008 -0700 tridentfb: add acceleration for TGUI families This patch adds acceleration for TGUI 9440 and 96xx chips. These chips requires line length to be power of 2, so this is also changed. It also moves the troubling enable_mmio() function to its final destination. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 54f019e54244fef0ad927ce5501927d9033492de Author: Krzysztof Helt Date: Wed Jul 23 21:31:01 2008 -0700 tridentfb: fix hi-color modes for TGUI 9440 The TGUI 9440 requires doubling clock for 16bpp (hi-color) modes. The patch also moves back enable_mmio() call to the right position. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 65e93e038c8a6eb65b6907d6aed22a8ff1029d3a Author: Krzysztof Helt Date: Wed Jul 23 21:31:00 2008 -0700 tridentfb: preserve memory type settings Do not overwrite bits which contain memory type settings. It removes noise pixels ("snow") on Blade3D and 3DImage chips. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 74a933feaf13f705e6c798d87efe6a9d758b3ca0 Author: Krzysztof Helt Date: Wed Jul 23 21:31:00 2008 -0700 tridentfb: improve check_var function Do some additional checks (like pixelclock versus ramdac speed) to eliminate modes which do not work. Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit aa0aa8ab2f28d8985daa79ecab51970376e17157 Author: Krzysztof Helt Date: Wed Jul 23 21:30:59 2008 -0700 tridentfb: fix unitialized pseudo_palette Initialize the pseudo_palette pointer properly. 
This fixes crash when 16bpp or 32bpp mode is selected. Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a0d922562d56073f147a4de2983bee499dd2a10e Author: Krzysztof Helt Date: Wed Jul 23 21:30:58 2008 -0700 tridentfb: add TGUI 9440 support Add support for TGUI 9440 chip. Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 0e73a47f094a919e2edeaa88e840cd0400adc423 Author: Krzysztof Helt Date: Wed Jul 23 21:30:58 2008 -0700 tridentfb: improved register values on TGUI 9680 Improved values for some registers after Xorg Trident driver. The main problem was that values set by BIOS have been ignored. This patch completely remove random pixels ("snow") on the TGUI 9680 and 9440 (not supported yet by the driver). It does not help with the "snow" on 3DImage and Blade3D cards. There is also small improvement in timing calculations (hblank start and vblank start) Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3876ae8beb2c7c19e21279b9603b1244fcd744dd Author: Krzysztof Helt Date: Wed Jul 23 21:30:57 2008 -0700 tridentfb: improve probe function Add missing release of allocated fb_info structure and move enable_mmio() to fix error path. Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6bdf1035602abf0564d24a7447eea1c149c4bcb1 Author: Krzysztof Helt Date: Wed Jul 23 21:30:56 2008 -0700 tridentfb: fix clock settings for older Trident 96XX chips The Xorg code shows that Trident models 9660, 9680 and 9682 require a different clock setting method. Add the second clock setting method for older models. 
Signed-off-by: Krzysztof Helt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c1724fecabfed504a4cfb87319ad3b9d3a8baa92 Author: Krzysztof Helt Date: Wed Jul 23 21:30:56 2008 -0700 tridentfb: use mmio access for clock setting Use the mmio outb function instead of the direct one. The mmio registers are already mapped (in the probe function). Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7f762d23e607af786bba8ff4a18059f43950c0e8 Author: Krzysztof Helt Date: Wed Jul 23 21:30:55 2008 -0700 tridentfb: fix timing calculations Fix broken timing calculations. This patch helps with the following problems: - no left part of screen visible (up to half of the screen) - monitor's frequencies are not the ones intended for selected modes - if a mode with resolution y > 1024 is selected at least once then all modes with y < 1024 are "out of sync" (no display) Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 10172ed6dc4d40ff42bf5ce2dd2f65f401a93696 Author: Krzysztof Helt Date: Wed Jul 23 21:30:54 2008 -0700 tridentfb: make use of functions and constants from the vga.h Make use of functions and constants from the vga.h header to compact the code and make it more readable. Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit d9cad04bcde00411976402eda726199ac13b29ca Author: Krzysztof Helt Date: Wed Jul 23 21:30:54 2008 -0700 tridentfb: move global acceleration hooks into structure This patch moves acceleration hooks into the tridentfb_par structure and removes global hooks. Signed-off-by: Krzysztof Helt Cc: "Antonino A.
Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e0759a5fbba12e0f2c9149d85bea1ec7df0178fd Author: Krzysztof Helt Date: Wed Jul 23 21:30:53 2008 -0700 tridentfb: convert is_blade and is_xp macros into functions This patch converts the is_blade() and is_xp() macros into local functions. Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6eed8e1ec8532a6cd10c8b27236bde023c52c56a Author: Krzysztof Helt Date: Wed Jul 23 21:30:53 2008 -0700 tridentfb: move global flat panel variable into structure This patch moves the flat panel indicator into the tridentfb_par structure and removes related global variables and macros. Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 122e8ad3cbf172043ea93f2db8e107fa9f9b0192 Author: Krzysztof Helt Date: Wed Jul 23 21:30:52 2008 -0700 tridentfb: move global chip_id into structure This patch moves the chip_id into the tridentfb_par structure and removes global chip_id related constants. It also bumps the version of the driver to 0.7.9. Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ea8ee55c12f77cbbb6e067f91e0cd794baa692ab Author: Krzysztof Helt Date: Wed Jul 23 21:30:51 2008 -0700 tridentfb: move global pseudo palette into structure This patch moves the pseudo palette into the tridentfb_par structure and removes the global default_var. Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e09ed099d0169ac3a22b17cfeece0fa54a9e43eb Author: Krzysztof Helt Date: Wed Jul 23 21:30:51 2008 -0700 tridentfb: convert fb_info into allocated one This patch converts the fb_info structure from a global variable to an allocated one. The global default_par is moved into a function-local variable. Signed-off-by: Krzysztof Helt Cc: "Antonino A.
Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 306fa6f60a2870b7a9827a64e1b45cd35a9549aa Author: Krzysztof Helt Date: Wed Jul 23 21:30:50 2008 -0700 tridentfb: replace macros with functions This patch replaces macros with static functions and puts the tridentfb_par pointer as the first argument of these functions. This is a step toward multihead support. Additionally, the bogus TRIDENT_MMIO define is removed as the driver supports graphics cards only through the mmio mode. Signed-off-by: Krzysztof Helt Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 2ece5f43b041b96fa2a05107a10a6b0ea0c03a3b Author: Sebastian Siewior Date: Wed Jul 23 21:30:49 2008 -0700 fbdev: add the carmine FB driver Basic FB driver for the carmine chip. The driver registers two FB devices for the two possible screens. The DRAM settings can be switched via Kconfig (between eval board and custom). Signed-off-by: Sebastian Siewior Cc: "Antonino A. Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4cad4431fcd872a1b2efc093b0db6df943f5a898 Author: Yoichi Yuasa Date: Wed Jul 23 21:30:48 2008 -0700 rtc-vr41xx: add irq_set_freq() and irq_set_state() Implement the ioctls RTC_PIE_ON, RTC_PIE_OFF, RTC_IRQP_SET and RTC_IRQP_READ in the standard RTC way. Thanks Dave for noticing it. Signed-off-by: Yoichi Yuasa Cc: David Brownell Cc: Ralf Baechle Cc: Alessandro Zummo Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7e2a31da854dcf8324012a83a31b40bc11e52589 Author: David Brownell Date: Wed Jul 23 21:30:47 2008 -0700 rtc-cmos: avoid spurious irqs This fixes kernel http://bugzilla.kernel.org/show_bug.cgi?id=11112 (bogus RTC update IRQs reported) for rtc-cmos, in two ways: - When HPET is stealing the IRQs, use the first IRQ to grab the seconds counter which will be monitored (instead of using whatever was previously in that memory); - In sane IRQ handling modes, scrub out old IRQ status before enabling IRQs.
That latter is done by tightening up IRQ handling for rtc-cmos everywhere, also ensuring that when HPET is used it's the only thing triggering IRQ reports to userspace; net object shrink. Also fix a bogus HPET message related to its RTC emulation. Signed-off-by: David Brownell Report-by: W Unruh Cc: Andrew Victor Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 449321b39f6c6ebfa15d6da24f134240bd51db29 Author: David Brownell Date: Wed Jul 23 21:30:46 2008 -0700 rtc-at91rm9200: avoid spurious irqs This fixes kernel http://bugzilla.kernel.org/show_bug.cgi?id=11112 (bogus RTC update IRQs reported) for rtc-at91rm9200 by scrubbing old IRQ status before enabling IRQs. It also removes nonfunctional periodic IRQ support from this driver; only update IRQs are reported, or provided by the hardware. I suspect some other RTCs probably have versions of #11112; it's easy to overlook, since most non-RTC drivers don't care about spurious IRQs: they're not reported to userspace. Signed-off-by: David Brownell Report-by: W Unruh Cc: Andrew Victor Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 773be7ee97c11fbb6b8a912a58b268dbe8a6a3fe Author: Ben Dooks Date: Wed Jul 23 21:30:45 2008 -0700 rtc: rtc-s3c: update IRQ handling The rtc-s3c.c driver has been using its own ioctl() handling to deal with alarm and periodic interrupts to handle what should now be done with the rtc core code. Change to using the .irq_set_freq and .irq_set_state driver entries and remove the .ioctl handling. Signed-off-by: Ben Dooks Acked-by: Alessandro Zummo Cc: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4cd0c5c40b64ef9fd94fb8382dade2fd28f2b935 Author: Ben Dooks Date: Wed Jul 23 21:30:44 2008 -0700 rtc: rtc-s3c: add __devexit and __devinit markers Add the relevant __devinit and __devexit attributes to the rtc-s3c driver. 
Signed-off-by: Ben Dooks Acked-by: Alessandro Zummo Cc: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 35d3fdd5f304c06654c940921fc045c60df34693 Author: David Brownell Date: Wed Jul 23 21:30:43 2008 -0700 rtc-cmos: improve HPET IRQ glue Resolve http://bugzilla.kernel.org/show_bug.cgi?id=11051 and other bugs related to the way the HPET glue code in rtc-cmos was incomplete and inconsistent: * Switch the approach so that the basic driver code flow isn't changed by having HPET ... instead, just have HPET shadow the RTC_CONTROL irq enables and RTC_FREQ_SELECT data. It's only coping with IRQ thievery, after all. * Do that consistently (!!) to avoid problems when the HPET code is out of sync with the real RTC intent. Examples include: - cmos_procfs(), which now reports correct data - cmos_irq_set_state() ... also removing the previous PIE_{ON,OFF} ioctl support so only one code path manages "periodic" IRQs - cmos_do_shutdown() ... currently a "just in case" change. - cmos_suspend() and cmos_resume() ... also handling a bug that was specific to HPET's IRQ thievery, where the alarm wasn't disabled after waking the system * Always call that HPET code under the RTC spinlock (it doesn't do its own locking) Also clean up the HPET glue: * Add some comments explaining what's going on. * Switch to having just one #ifdef for the HPET glue, and inline functions (not #defines) to avoid some compiler warnings. * Have the probe message also report when HPET IRQs are involved This still leaves various holes in the HPET glue, like the emulated update IRQs being out of sync with the RTC, alarms never using day or month matches, and many extra IRQs (at 64 Hz). [akpm@linux-foundation.org: fix build] Signed-off-by: David Brownell Cc: Tomas Janousek Cc: Bernhard Walle Cc: Carlos R. Mafra Acked-by: Alessandro Zummo Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c68d07b2da54c941bb36c9d6d35fe8f263ee10ef Author: Carlos R. 
Mafra Date: Wed Jul 23 21:30:40 2008 -0700 rtc: remove and clarify unneeded externs When CONFIG_HPET_EMULATE_RTC is defined the external declaration of hpet_rtc_interrupt is redundant due to the inclusion of hpet.h. When !CONFIG_HPET_EMULATE_RTC we make it clear that hpet_rtc_interrupt is not used by defining it to return zero. Signed-off-by: Carlos R. Mafra Cc: Ingo Molnar Cc: Thomas Gleixner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 02bb584f3b1cfc8188522a4d2c8881b65073a4f1 Author: Wolfram Sang Date: Wed Jul 23 21:30:39 2008 -0700 rtc: convert the PCF8583 driver to the new I2C style framework with device_ids Convert the PCF8583 driver to the new I2C style framework with device_ids Signed-off-by: Juergen Beisert Signed-off-by: Wolfram Sang Signed-off-by: Alessandro Zummo Cc: David Brownell Acked-by: Jean Delvare Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 71fc822455ccb63a66be0b6e97a415aceb0062c6 Author: David Brownell Date: Wed Jul 23 21:30:38 2008 -0700 rtc: rtc-omap footprint shrinkage Shrink the runtime footprint of the OMAP1 RTC driver a bunch by removing some old hacks and switching to platform_driver_probe(). Signed-off-by: David Brownell Cc: Alessandro Zummo Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit d3de851a445123f24ad8ece18662014b5e8a8b4e Author: David Brownell Date: Wed Jul 23 21:30:37 2008 -0700 rtc: BCD codeshrink This updates to define the key routines as constant functions, which the macros will then call. Newer code can now call bcd2bin() instead of SCREAMING BCD2BIN() TO THE FOUR WINDS. This lets each driver shrink their codespace by using N function calls to a single (global) copy of those routines, instead of N inlined copies of these functions per driver. These routines aren't used in speed-critical code. Almost all callers are in the RTC framework. Typical per-driver savings is near 300 bytes. 
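For readers unfamiliar with the conversion that the BCD codeshrink commit refers to, here is a minimal userspace sketch of what routines like bcd2bin() compute. It assumes packed BCD in a single byte (two decimal digits, one per nibble) and is an illustration only, not a copy of the kernel source; bin2bcd() is shown as the natural inverse.

```c
#include <assert.h>

/* Convert a packed-BCD byte (high nibble = tens, low nibble = ones)
 * to its binary value, e.g. 0x59 becomes 59. */
static unsigned bcd2bin(unsigned char val)
{
    return (val & 0x0f) + (val >> 4) * 10;
}

/* Convert a binary value in the range 0..99 to packed BCD,
 * e.g. 59 becomes 0x59. */
static unsigned char bin2bcd(unsigned val)
{
    assert(val < 100);   /* two BCD digits hold at most 99 */
    return (unsigned char)(((val / 10) << 4) + val % 10);
}
```

Having one out-of-line copy of each routine, rather than a macro expanded at every call site, is what yields the per-driver size saving described above.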
Signed-off-by: David Brownell Acked-by: Adrian Bunk Cc: Alessandro Zummo Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 53e84b672c1a8190af2b376c35c7a39cf1214f59 Author: David Brownell Date: Wed Jul 23 21:30:36 2008 -0700 rtc: ds1305/ds1306 driver Support the Dallas/Maxim DS1305 and DS1306 RTC chips. These use SPI, and support alarms, NVRAM, and a trickle charger for use when their backup power supply is a supercap or rechargeable cell. This basic driver doesn't yet support suspend/resume or wakealarms. Signed-off-by: David Brownell Cc: Alessandro Zummo Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 8fc2c767b06067b417c565c4e75731e68ed41fd8 Author: Kim B. Heino Date: Wed Jul 23 21:30:34 2008 -0700 rtc: add support for ST M41T94 SPI RTC This patch adds kernel driver for M41T94 RTC chip connected via SPI. I've tested it on two different AT91-based hardwares. This is third revision of the patch: some comments made by Alessandro Zummo fixed. Revision two added support for century bit and fixes. Signed-off-by: Kim B. Heino Signed-off-by: Alessandro Zummo Cc: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 5ad31a575157147b43fa84ef1e21471661653878 Author: David Brownell Date: Wed Jul 23 21:30:33 2008 -0700 rtc: remove BKL for ioctl() Remove implicit use of BKL in ioctl() from the RTC framework. Instead, the rtc->ops_lock is used. That's the same lock that already protects the RTC operations when they're issued through the exported rtc_*() calls in drivers/rtc/interface.c ... making this a bugfix, not just a cleanup, since both ioctl calls and set_alarm() need to update IRQ enable flags and that implies a common lock (which RTC drivers as a rule do not provide on their own). A new comment at the declaration of "struct rtc_class_ops" summarizes current locking rules. It's not clear to me that the exceptions listed there should exist ... 
if not, those are pre-existing problems which can be fixed in a patch that doesn't relate to BKL removal. Signed-off-by: David Brownell Cc: Alan Cox Cc: Jonathan Corbet Acked-by: Alessandro Zummo Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 53f1b1433da7eac2607a4a0898a221a4485fd732 Author: Alan Cox Date: Wed Jul 23 21:30:32 2008 -0700 rtc: push the BKL down into the driver ioctl method For now just wrap the main logic, but this driver is a prime candidate for someone wanting to eliminate the lock entirely. [lizf@cn.fujitsu.com: fix build failure] Signed-off-by: Alan Cox Signed-off-by: Li Zefan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4c228db0b30fa12d65ae7461ce29ed1f4da12c5b Author: Maciej W. Rozycki Date: Wed Jul 23 21:30:32 2008 -0700 rtc: m41t80: use pr_info() as appropriate Replace printk(KERN_INFO ...) calls with appropriate pr_info(...) equivalents. Signed-off-by: Maciej W. Rozycki Cc: Alessandro Zummo Cc: Alexander Bigga Cc: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 35aa64f3a117a16c466f688f52ac3847b3b572e8 Author: Maciej W. Rozycki Date: Wed Jul 23 21:30:29 2008 -0700 rtc: m41t80: sort header inclusions for readability Sort the header inclusions for readability. No functional changes. Signed-off-by: Maciej W. Rozycki Cc: Alessandro Zummo Cc: Alexander Bigga Cc: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit aa55ddf340c9fa3f303ee16bbf35887e42c50304 Author: Ian Kent Date: Wed Jul 23 21:30:29 2008 -0700 autofs4: remove unused ioctls The ioctls AUTOFS_IOC_TOGGLEREGHOST and AUTOFS_IOC_ASKREGHOST were added several years ago but what they were intended for has never been implemented (as far as I'm aware no one uses them) so remove them.
Signed-off-by: Ian Kent Reviewed-by: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 06a3598552dc3b2b30eb18bd53bbac2a901489d7 Author: Ian Kent Date: Wed Jul 23 21:30:28 2008 -0700 autofs4: reorganize expire pending wait function calls This patch reorganizes the checking for and waiting on active expires and eliminates redundant checks. Signed-off-by: Ian Kent Cc: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ec6e8c7d3f9073336ec7b2eed3fcda6f922087c3 Author: Ian Kent Date: Wed Jul 23 21:30:28 2008 -0700 autofs4: fix direct mount pending expire race - correction Apologies, somehow I seem to have sent an outdated version of this patch. Here is an additional patch that brings the patch up to date. Signed-off-by: Ian Kent Cc: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6e60a9ab5f5d314735467752f623072f5b75157a Author: Ian Kent Date: Wed Jul 23 21:30:27 2008 -0700 autofs4: fix direct mount pending expire race For direct and offset type mounts that are covered by another mount we cannot check the AUTOFS_INF_EXPIRING flag during a path walk, which leads to lookups walking into an expiring mount while it is being expired. For example, for the direct multi-mount map entry with a couple of offsets: /race/mm1 / :/ /om1 :/ /om2 :/ an autofs trigger mount is mounted on /race/mm1 and when accessed it is over-mounted and trigger mounts are made for /race/mm1/om1 and /race/mm1/om2. So it isn't possible for path walks to see the expiring flag at all and they happily walk into the file system while it is expiring. When expiring these mounts, follow_down() must stop at the autofs mount and all processes must block in the ->follow_link() method (except the daemon) until the expire is complete. This is done by decrementing the d_mounted field of the autofs trigger mount root dentry until the expire is completed.
In ->follow_link() all processes wait on the expire and the mount following is completed for the daemon until the expire is complete. Signed-off-by: Ian Kent Cc: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 97e7449a7ad883bf9f516fc970778d75999c7843 Author: Ian Kent Date: Wed Jul 23 21:30:26 2008 -0700 autofs4: fix indirect mount pending expire race The selection of a dentry for expiration and the setting of the AUTOFS_INF_EXPIRING flag aren't done atomically, which can lead to lookups walking into an expiring mount. What happens is that an expire is initiated by the daemon and a dentry is selected for expire but, since there is no lock held between the selection and setting of the expiring flag, a process may find the flag clear and continue walking into the mount tree at the same time the daemon attempts to expire it. Signed-off-by: Ian Kent Reviewed-by: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 26e81b3142f1ba497d4cd0365c13661684b784ce Author: Ian Kent Date: Wed Jul 23 21:30:25 2008 -0700 autofs4: fix pending checks There are two cases for which a dentry that has a pending mount request does not wait for completion. One is via autofs4_revalidate() and the other via autofs4_follow_link(). In revalidate, after the mount point directory is created, but before the mount is done, the check in try_to_fill_dentry() can fail to send the dentry to the wait queue since the dentry is positive and the lookup flags may contain only LOOKUP_FOLLOW. Although we don't trigger a mount for the LOOKUP_FOLLOW flag, if there's one pending we might as well wait and use the mounted dentry for the lookup. In autofs4_follow_link() the dentry is not checked to see if it is pending so it may fail to call try_to_fill_dentry() and not wait for mount completion. A dentry that is pending must always be sent to the wait queue.
Signed-off-by: Ian Kent Reviewed-by: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ff9cd499d6258952385cb2f12e9a3c0908fd5786 Author: Ian Kent Date: Wed Jul 23 21:30:24 2008 -0700 autofs4: cleanup redundant readir code The mount triggering functionality of readdir and related functions is no longer used (and is quite broken as well). The unused portions have been removed. Signed-off-by: Ian Kent Reviewed-by: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c72305b5472522299bb6f45b736080128eb1c822 Author: Ian Kent Date: Wed Jul 23 21:30:23 2008 -0700 autofs4: indirect dentry must almost always be positive We have been seeing mount requests coming to the automount daemon for keys of the form "/" which are lookups for invalid map keys. But we can check for this in the kernel module and return a failure immediately, without having to send a request to the daemon. It is possible to recognise that these requests are invalid based on whether the request dentry is negative and its relation to the autofs file system root. For example, given the indirect multi-mount map entry: idm1 \ /mm1 :/ /mm2 :/ For a request to mount idm1, IS_ROOT((idm1)->d_parent) will always be true and the dentry may be negative. But directories idm1/mm1 and idm1/mm2 will always be created as part of the mount request for idm1. So any mount request within idm1 itself must have a positive dentry, otherwise the map key is invalid. In version 4 these multi-mount entries are all mounted and umounted as a single request and in version 5 the directories idm1/mm1 and idm1/mm2 are created and an autofs fs mounted on them to act as a mount trigger so the above is also true. This also holds true for the autofs version 4 pseudo direct mount feature. When this feature is used without the "--ghost" option automount(8) will create internal submounts as we go down the map key paths which are essentially normal indirect mounts for which the above holds.
If the "--ghost" option is given the directories for map keys are created at daemon startup so valid map entries correspond to positive dentries in the autofs fs. autofs version 5 direct mount maps are similar except that the IS_ROOT check is not needed. This has been addressed in a previous patch titled "autofs4 - detect invalid direct mount requests". For example, given the direct multi-mount map entry: /test/dm1 \ /mm1 :/ /mm2 :/ An autofs fs is mounted on /test/dm1 as a trigger mount and when a mount is triggered for /test/dm1, the multi-mount offset directories /test/dm1/mm1 and /test/dm1/mm2 are created and an autofs fs is mounted on them to act as mount triggers. So valid direct mount requests must always have a positive dentry if they correspond to a valid map entry. Signed-off-by: Ian Kent Acked-by: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit eb3b176796b0e53fd26fce86847231542eb0d198 Author: Ian Kent Date: Wed Jul 23 21:30:22 2008 -0700 autofs4: detect invalid direct mount requests autofs v5 direct and offset mounts within an autofs filesystem are triggered by existing autofs trigger mounts so the mount point dentry must be positive. If the mount point dentry is negative then the trigger doesn't exist so we can return fail immediately. Signed-off-by: Ian Kent Cc: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 296f7bf78bc5c7a4d772aea580ce800d14040d1a Author: Ian Kent Date: Wed Jul 23 21:30:21 2008 -0700 autofs4: fix waitq memory leak If an autofs mount becomes catatonic before autofs4_wait_release() is called the wait queue counter will not be decremented down to zero and the entry will never be freed. There are also races decrementing the wait counter in the wait release function. To deal with this the counter needs to be updated while holding the wait queue mutex and waiters need to be woken up unconditionally when the wait is removed from the queue to ensure we eventually free the wait.
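The counter handling described above can be pictured with a small user-space sketch. It is an illustration only, not the autofs4 code: struct model_wait, model_wq_mutex, model_wait_new() and model_wait_put() are hypothetical names, and a pthread mutex stands in for the wait queue mutex. It models just the "update the counter under the mutex, free on the last reference" part; waking the waiters is not modelled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical user-space model of a wait queue entry; not the kernel struct. */
struct model_wait {
	int wait_ctr;	/* references still held on this entry */
};

/* Stands in for the wait queue mutex; guards every wait_ctr update. */
static pthread_mutex_t model_wq_mutex = PTHREAD_MUTEX_INITIALIZER;
static int model_entries_freed;	/* bookkeeping for this example only */

/* Allocate an entry that starts with 'refs' references. */
static struct model_wait *model_wait_new(int refs)
{
	struct model_wait *wq = malloc(sizeof(*wq));

	if (wq)
		wq->wait_ctr = refs;
	return wq;
}

/* Drop one reference. Returns 1 if this call freed the entry, else 0. */
static int model_wait_put(struct model_wait *wq)
{
	int last;

	/* Decrement and test under the mutex so that two releasers cannot
	 * both miss (or both see) the transition to zero. */
	pthread_mutex_lock(&model_wq_mutex);
	assert(wq->wait_ctr > 0);
	last = (--wq->wait_ctr == 0);
	if (last)
		model_entries_freed++;
	pthread_mutex_unlock(&model_wq_mutex);

	/* Only the caller that saw the count reach zero frees the entry. */
	if (last)
		free(wq);
	return last;
}
```

Every holder, including the path that runs when the mount goes catatonic, would call model_wait_put() exactly once, so the entry is freed no matter which path drops the final reference.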
Signed-off-by: Ian Kent Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e64be33ccaceaca67c84237dff8805b861398eab Author: Ian Kent Date: Wed Jul 23 21:30:20 2008 -0700 autofs4: check kernel communication pipe is valid for write It is possible for an autofs mount to become catatonic (and for the daemon communication pipe to become NULL) after a wait has been initialized but before the request has been sent to the daemon. We need to check for this before sending the request packet. Signed-off-by: Ian Kent Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f4c7da02615bebcaf89f15a8d055922f515160b8 Author: Ian Kent Date: Wed Jul 23 21:30:19 2008 -0700 autofs4: add missing kfree It seems that the patch titled "autofs4 - fix pending mount race" is missing a change that I had recently made. It's missing a kfree for the case mutex_lock_interruptible() fails to acquire the wait queue mutex. Signed-off-by: Ian Kent Cc: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a1362fe92f1bde687b3a9e93d6b8d105d0a84f74 Author: Ian Kent Date: Wed Jul 23 21:30:19 2008 -0700 autofs4: fix pending mount race Close a race between a pending mount that is about to finish and a new lookup for the same directory. Process P1 triggers a mount of directory foo. It sets DCACHE_AUTOFS_PENDING in the ->lookup routine, creates a waitq entry for 'foo', and calls out to the daemon to perform the mount. The autofs daemon will then create the directory 'foo', using a new dentry that will be hashed in the dcache. Before the mount completes, another process, P2, tries to walk into the 'foo' directory. The vfs path walking code finds an entry for 'foo' and calls the revalidate method. Revalidate finds that the entry is not PENDING (because PENDING was never set on the dentry created by the mkdir), but it does find the directory is empty.
Revalidate calls try_to_fill_dentry, which sets the PENDING flag and then calls into the autofs4 wait code to trigger or wait for a mount of 'foo'. The wait code finds the entry for 'foo' and goes to sleep waiting for the completion of the mount. Yet another process, P3, tries to walk into the 'foo' directory. This process again finds a dentry in the dcache for 'foo', and calls into the autofs revalidate code. The revalidate code finds that the PENDING flag is set, and so calls try_to_fill_dentry. a) try_to_fill_dentry sets the PENDING flag redundantly for this dentry, then calls into the autofs4 wait code. b) the autofs4 wait code takes the waitq mutex and searches for an entry for 'foo' Between a and b, P1 is woken up because the mount completed. P1 takes the wait queue mutex, clears the PENDING flag from the dentry, and removes the waitqueue entry for 'foo' from the list. When it releases the waitq mutex, P3 (eventually) acquires it. At this time, it looks for an existing waitq for 'foo', finds none, and so creates a new one and calls out to the daemon to mount the 'foo' directory. Now, the reason that three processes are required to trigger this race is that, because the PENDING flag is not set on the dentry created by mkdir, the window for the race would be way too slim for it to ever occur. Basically, between the testing of d_mountpoint(dentry) and the taking of the waitq mutex, the mount would have to complete and the daemon would have to be woken up, and that in turn would have to wake up P1. This is simply impossible. Add the third process, though, and it becomes slightly more likely. Signed-off-by: Jeff Moyer Signed-off-by: Ian Kent Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 5a11d4d0ee1ff284271f7265929d07ea4a1168a6 Author: Ian Kent Date: Wed Jul 23 21:30:17 2008 -0700 autofs4: fix waitq locking The autofs4_catatonic_mode() function accesses the wait queue without any locking but can be called at any time.
This could lead to a possible double free of the name field of the wait and a double fput of the daemon communication pipe or an fput of a NULL file pointer. Signed-off-by: Ian Kent Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 70b52a0a5005ce6a0ceec56e97222437a0ba7506 Author: Jeff Moyer Date: Wed Jul 23 21:30:16 2008 -0700 autofs4: use struct qstr in waitq.c The autofs_wait_queue already contains all of the fields of the struct qstr, so change it into a qstr. This patch, from Jeff Moyer, has been modified a little by myself. Signed-off-by: Jeff Moyer Signed-off-by: Ian Kent Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6d5cb926fa0162b1e62f37c117cc7ce763cfcbb9 Author: Ian Kent Date: Wed Jul 23 21:30:15 2008 -0700 autofs4: use lookup intent flags to trigger mounts When an open(2) call is made on an autofs mount point directory that already exists and the O_DIRECTORY flag is not used the needed mount callback to the daemon is not done. This leads to the path walk continuing resulting in a callback to the daemon with an incorrect key. open(2) is called without O_DIRECTORY by the "find" utility but this should be handled properly anyway. This happens because autofs needs to use the lookup flags to decide when to call back to the daemon to perform a mount to prevent mount storms. For example, an autofs indirect mount map that has the "browse" option will have the mount point directories pre-created and the stat(2) call made by a color ls against each directory will cause all these directories to be mounted. It is unfortunate we need to resort to this but mount maps can be quite large. Additionally, if a user manually umounts an autofs indirect mount the directory isn't removed which also leads to this situation. To resolve this autofs needs to use the lookup intent flags to enable it to make this decision.
This patch adds this check and triggers a call back if any of the lookup intent flags are set as all these calls warrant a mount attempt be requested. I know that external VFS code which uses the lookup flags is something that the VFS would like to eliminate but I have no choice as I can't see any other way to do this. A VFS dentry or inode operation callback which returns the lookup "type" (requires a definition) would be sufficient. But this change is needed now and I'm not aware of the form that coming VFS changes will take so I'm not willing to propose anything along these lines. If anyone can provide an alternate method I would be happy to use it. [akpm@linux-foundation.org: fix build for concurrent VFS changes] Signed-off-by: Ian Kent Cc: Al Viro Cc: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit c432c2586a0811c7d0030d78f0993568bc889a6f Author: Ian Kent Date: Wed Jul 23 21:30:14 2008 -0700 autofs4: don't release directory mutex if called in oz_mode Since we now delay hashing of dentrys until the ->mkdir() call, dropping and re-taking the directory mutex within the ->lookup() function when we are being called by user space is not needed. This can lead to a race when other processes are attempting to access the same directory during mount point directory creation. In this case we need to hang onto the mutex to ensure we don't get user processes trying to create a mount request for a newly created dentry after the mount point entry has already been created. This ensures that when we need to check a dentry passed to autofs4_wait(), if it is hashed, it is always the mount point dentry and not a new dentry created by another lookup during directory creation.
Signed-off-by: Ian Kent Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ef581a742874ebc4c28d24b374c78b762144ebdc Author: Ian Kent Date: Wed Jul 23 21:30:13 2008 -0700 autofs4: fix symlink name allocation The length of the symlink name has been moved but it needs to be set before allocating space for it in the dentry info struct. This corrects a mistake in a recent patch. Signed-off-by: Ian Kent Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 2576737873dc1d9ea461a5955a5f6779b569a350 Author: Ian Kent Date: Wed Jul 23 21:30:12 2008 -0700 autofs4: use look aside list for lookups A while ago a patch to resolve a deadlock during directory creation was merged. This delayed the hashing of lookup dentrys until the ->mkdir() (or ->symlink()) operation completed to ensure we always went through ->lookup() instead of also having processes go through ->revalidate() so our VFS locking remained consistent. Now we are seeing a couple of side effects of that change in situations with heavy mount activity. Two cases have been identified: 1) When a mount request is triggered, due to the delayed hashing, the directory created by user space for the mount point doesn't have the DCACHE_AUTOFS_PENDING flag set. In the case of an autofs multi-mount where a tree of mount point directories is created this can lead to the path walk continuing rather than the dentry being sent to the wait queue to wait for request completion. This is because, if the pending flag isn't set, the criteria for deciding this is a mount in progress fails to hold, namely that the dentry is not a mount point and has no subdirectories. 2) A mount request dentry is initially created negative and unhashed. It remains this way until the ->mkdir() callback completes. Since it is unhashed a fresh dentry is used when the user space mount request creates the mount point directory. This leaves the original dentry negative and unhashed.
But revalidate has no way to tell the VFS that the dentry has changed, other than to force another ->lookup() by returning false, which is at best wasteful and at worst not possible. This results in an -ENOENT return from the original path walk when in fact the mount succeeded. To resolve this we need to ensure that the same dentry is used in all calls to ->lookup() during the course of a mount request. This patch achieves that by adding the initial dentry to a look aside list and removing it at ->mkdir() or ->symlink() completion (or when the dentry is released), since these are the only create operations autofs4 supports. Signed-off-by: Ian Kent Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit caf7da3d5d4d9dd873eb52d025d8cc63b89f1fdb Author: Ian Kent Date: Wed Jul 23 21:30:11 2008 -0700 autofs4: revert - redo lookup in ttfd This patch series enables the use of a single dentry for lookups prior to the dentry being hashed and so we no longer need to redo the lookup. This patch reverts the patch of commit 033790449ba9c4dcf8478a87693d33df625c23b5. Signed-off-by: Ian Kent Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 5f6f4f28b6ba543beef8bad91aa6f69c7ffeee51 Author: Ian Kent Date: Wed Jul 23 21:30:09 2008 -0700 autofs4: don't make expiring dentry negative Correct the error of making a positive dentry negative after it has been instantiated. The code that makes this error attempts to re-use the dentry from a concurrent expire and mount to resolve a race and the dentry used for the lookup must be negative for mounts to trigger in the required cases. The fact is that the dentry doesn't need to be re-used because all that is needed is to preserve the flag that indicates an expire is still incomplete at the time of the mount request. This change uses the dentry to check the flag and wait for the expire to complete then discards it instead of attempting to re-use it.
Signed-off-by: Ian Kent Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 391b52f98cf2e9bff227dad8bf9ea206fec43fa4 Author: Michael Halcrow Date: Wed Jul 23 21:30:08 2008 -0700 eCryptfs: Make all persistent file opens delayed There is no good reason to immediately open the lower file, and that can cause problems with files that the user does not intend to immediately open, such as device nodes. This patch removes the persistent file open from the interpose step and pushes that to the locations where eCryptfs really does need the lower persistent file, such as just before reading or writing the metadata stored in the lower file header. Two functions are jumping to out_dput when they should just be jumping to out on error paths. This patch also fixes these. Signed-off-by: Michael Halcrow Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 72b55fffd631a89e5be6fe1b4f2565bc4cd90deb Author: Michael Halcrow Date: Wed Jul 23 21:30:07 2008 -0700 eCryptfs: do not try to open device files on mknod When creating device nodes, eCryptfs needs to delay actually opening the lower persistent file until an application tries to open. Device handles may not be backed by anything when they first come into existence. 
[Valdis.Kletnieks@vt.edu: build fix] Signed-off-by: Michael Halcrow Cc: Signed-off-by: Linus Torvalds commit 0a688ad713949643e201431d3f4a4ceddfeb70ca Author: Harvey Harrison Date: Wed Jul 23 21:30:07 2008 -0700 ecryptfs: inode.c mmap.c use unaligned byteorder helpers Fixes sparse warnings: fs/ecryptfs/inode.c:368:15: warning: cast to restricted __be64 fs/ecryptfs/mmap.c:385:12: warning: incorrect type in assignment (different base types) fs/ecryptfs/mmap.c:385:12: expected unsigned long long [unsigned] [assigned] [usertype] file_size fs/ecryptfs/mmap.c:385:12: got restricted __be64 [usertype] fs/ecryptfs/mmap.c:428:12: warning: incorrect type in assignment (different base types) fs/ecryptfs/mmap.c:428:12: expected unsigned long long [unsigned] [assigned] [usertype] file_size fs/ecryptfs/mmap.c:428:12: got restricted __be64 [usertype] Signed-off-by: Harvey Harrison Cc: Michael Halcrow Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 29335c6a41568d4708d4ec3b9187f9b6d302e5ea Author: Harvey Harrison Date: Wed Jul 23 21:30:06 2008 -0700 ecryptfs: crypto.c use unaligned byteorder helpers Fixes the following sparse warnings: fs/ecryptfs/crypto.c:1036:8: warning: cast to restricted __be32 fs/ecryptfs/crypto.c:1038:8: warning: cast to restricted __be32 fs/ecryptfs/crypto.c:1077:10: warning: cast to restricted __be32 fs/ecryptfs/crypto.c:1103:6: warning: incorrect type in assignment (different base types) fs/ecryptfs/crypto.c:1105:6: warning: incorrect type in assignment (different base types) fs/ecryptfs/crypto.c:1124:8: warning: incorrect type in assignment (different base types) fs/ecryptfs/crypto.c:1241:21: warning: incorrect type in assignment (different base types) fs/ecryptfs/crypto.c:1244:30: warning: incorrect type in assignment (different base types) fs/ecryptfs/crypto.c:1414:23: warning: cast to restricted __be32 fs/ecryptfs/crypto.c:1417:32: warning: cast to restricted __be16 Signed-off-by: Harvey Harrison Cc: Michael Halcrow Signed-off-by: Andrew
Morton Signed-off-by: Linus Torvalds commit 8f2368095e25018838e1bf145041f58270ccd32e Author: Miklos Szeredi Date: Wed Jul 23 21:30:05 2008 -0700 ecryptfs: string copy cleanup Clean up overcomplicated string copy, which also gets rid of this bogus warning: fs/ecryptfs/main.c: In function 'ecryptfs_parse_options': include/asm/arch/string_32.h:75: warning: array subscript is above array bounds Signed-off-by: Miklos Szeredi Cc: Michael Halcrow Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 982363c97f8cad7aea4c3d2cfebffc1cc2d2f166 Author: Eric Sandeen Date: Wed Jul 23 21:30:04 2008 -0700 ecryptfs: propagate key errors up at mount time Mounting with invalid key signatures should probably fail, if they were specifically requested but not available. Also fix case checks in process_request_key_err() for the right sign of the errnos, as spotted by Jan Tluka. Signed-off-by: Eric Sandeen Reviewed-by: Jan Tluka Acked-by: Michael Halcrow Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6c4c17b073cd4a5a61bc04329561632870bb21fc Author: Tyler Hicks Date: Wed Jul 23 21:30:04 2008 -0700 ecryptfs: discard ecryptfsd registration messages in miscdev The userspace eCryptfs daemon sends HELO and QUIT messages to the kernel for per-user daemon (un)registration. These messages are required when netlink is used as the transport, but (un)registration is handled by opening and closing the device file when miscdev is the transport. These messages should be discarded in the miscdev transport so that a daemon isn't registered twice. Signed-off-by: Tyler Hicks Cc: Michael Halcrow Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 746f1e558bc52b9693c1a1ecdab60f8392e5ff18 Author: Michael Halcrow Date: Wed Jul 23 21:30:02 2008 -0700 eCryptfs: Privileged kthread for lower file opens eCryptfs would really like to have read-write access to all files in the lower filesystem. 
Right now, the persistent lower file may be opened read-only if the attempt to open it read-write fails. One way to keep from having to do that is to have a privileged kthread that can open the lower persistent file on behalf of the user opening the eCryptfs file; this patch implements this functionality. This patch will properly allow a less-privileged user to open the eCryptfs file, followed by a more-privileged user opening the eCryptfs file, with the first user only being able to read and the second user being able to both read and write. eCryptfs currently does this wrong; it will wind up calling vfs_write() on a file that was opened read-only. This is fixed in this patch. Signed-off-by: Michael Halcrow Cc: Dave Kleikamp Cc: Serge Hallyn Cc: Eric Sandeen Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 0293902a4d66fab27d0ddcc0766e05dae68f004e Author: Wang Chen Date: Wed Jul 23 21:30:01 2008 -0700 I2O: handle sysfs_create_link() failures Compile warning: ignoring return value of `sysfs_create_link', declared with attribute warn_unused_result. If sysfs_create_link() fails, check the return value and handle the error. Since sysfs_remove_link() will check whether a link exists, when removing the link in the error path, we don't need to care whether a link was created. Signed-off-by: Wang Chen Cc: Markus Lidel Cc: Jens Axboe Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f700d6e5e5549cb9349d22043f4bd153792c621f Author: Stefano Stabellini Date: Wed Jul 23 21:29:59 2008 -0700 vt: do not update when the console is blanked The vt.c DO_UPDATE macro checks if the console is visible but doesn't check if the console is blanked. In fact updating fbcon while the console is blanked is not only unnecessary but can even cause screen corruption. Therefore I am adding a simple check on console_blanked in DO_UPDATE. Signed-off-by: Stefano Stabellini Cc: Krzysztof Helt Cc: "Antonino A.
Daplas" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e0426e6a09954d205da2d674a3d368d2715e3afd Author: Jiri Slaby Date: Wed Jul 23 21:29:58 2008 -0700 vt: hold console_sem across sysfs operations Hold console sem while creating/destroying sysfs files. Serialisation is so far done by BKL held in tty release_dev and chrdev_open, but no other locks are held in open path. Signed-off-by: Jiri Slaby Cc: Alan Cox Cc: Aristeu Rozanski Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit bbe48ecc7f6559318cfc6c023da225a0b0e14ab3 Author: Jan Nikitenko Date: Wed Jul 23 21:29:57 2008 -0700 spi: au1550_spi: improve pio transfer mode Improve the PIO transfer mode of the au1550 spi controller by continuing the spi transfer instead of aborting it when a transmit underflow interrupt occurs. Verified by oscilloscope that the spi clock pauses on transmit underflow, so transfer continuation is perfectly valid even though the au1550 datasheet says that on tx underflow zeroes will be transferred. Also make some error messages more specific. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jan Nikitenko Signed-off-by: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 3a93a159c61e38a12f7ecbb3a25cf3f012abcf7a Author: Manuel Lauss Date: Wed Jul 23 21:29:56 2008 -0700 spi: au1550_spi: proper platform device Remove the Au1550 resource table and instead extract MMIO/IRQ/DMA resources from platform resource information like any well-behaved platform driver.
Signed-off-by: Manuel Lauss Signed-off-by: Jan Nikitenko Signed-off-by: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 4ef754b7d7971a704d5b1b4608839da1bae37e5e Author: Alan Cox Date: Wed Jul 23 21:29:55 2008 -0700 spidev: BKL removal Another step to removing ->ioctl and to removing the BKL [dbrownell@users.sourceforge.net: take final step; BKL not needed] Signed-off-by: Alan Cox Signed-off-by: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 102eb97564c73ea73645b38599c5cbe6f54b030c Author: Grant Likely Date: Wed Jul 23 21:29:55 2008 -0700 spi: make spi_board_info.modalias a char array Currently, 'modalias' in the spi_device structure is a 'const char *'. The spi_new_device() function fills in the modalias value from a passed in spi_board_info data block. Since it is a pointer copy, the new spi_device remains dependent on the spi_board_info structure after the new spi_device is registered (no other fields in spi_device directly depend on the spi_board_info structure; all of the other data is copied). This causes a problem when dynamically populating the list of attached SPI devices. For example, in arch/powerpc, the list of SPI devices can be populated from data in the device tree. With the current code, the device tree adapter must kmalloc() a new spi_board_info structure for each new SPI device it finds in the device tree, and there is no simple mechanism in place for keeping track of these allocations. This patch changes modalias from a 'const char *' to a fixed char array. By copying the modalias string instead of referencing it, the dependency on the spi_board_info structure is eliminated and an outside caller does not need to maintain a separate spi_board_info allocation for each device. I searched through the code to the best of my ability for any references to modalias which may be affected by this change and haven't found anything.
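The difference between referencing and copying the modalias string can be shown with a short user-space sketch. This is an illustration under stated assumptions, not the SPI core code: struct model_board_info, struct model_device, model_new_device() and the MODEL_NAME_SIZE value of 32 are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODEL_NAME_SIZE 32	/* assumed buffer size, for illustration only */

/* Hypothetical stand-in for the board description handed to the core. */
struct model_board_info {
	char modalias[MODEL_NAME_SIZE];
};

/* Hypothetical stand-in for the registered device: it owns its own copy
 * of the name instead of holding a 'const char *' into the board info. */
struct model_device {
	char modalias[MODEL_NAME_SIZE];
};

/* Copy the name into the device; snprintf() always NUL-terminates and
 * truncates if the source does not fit. */
static void model_new_device(struct model_device *dev,
			     const struct model_board_info *info)
{
	snprintf(dev->modalias, sizeof(dev->modalias), "%s", info->modalias);
}
```

Because the device holds a copy, the caller may build the board info in a temporary (for example on the stack while walking a device tree) and discard it right after registration; with the pointer version the name would dangle at that point.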
It has been tested with the lite5200b platform in arch/powerpc. [dbrownell@users.sourceforge.net: cope with linux-next changes: KOBJ_NAME_LEN obliterated, etc] Signed-off-by: Grant Likely Signed-off-by: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6291fe2abce4689d6ee7cbaea16692c79bf0d01b Author: Robert P. J. Day Date: Wed Jul 23 21:29:53 2008 -0700 SPI Kconfig simplifications Use "if SPI_MASTER" to remove numerous dependencies. [dbrownell@users.sourceforge.net: remove a couple now-needless EXPERIMENTAL dependencies too] Signed-off-by: Robert P. J. Day Signed-off-by: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 166a375b657b7af494f4ce3f72c4d2002180da44 Author: Roel Kluin <12o3l@tiscali.nl> Date: Wed Jul 23 21:29:53 2008 -0700 xilinx_spi: test below 0 on unsigned irq in xilinx_spi_probe() xilinx_spi->irq is unsigned, so the test fails Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Cc: David Brownell Cc: Andrei Konovalov Cc: Yuri Frolov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a61f5345eba34772a71523227de890a28410f320 Author: Chen Gong Date: Wed Jul 23 21:29:52 2008 -0700 spi: spi_mpc83xx clockrate fixes This updates the SPI clock rate calculations for the spi_mpc83xx driver. 
Some boundary conditions were wrong, and in several cases divide-by-16 wasn't always needed Signed-off-by: Chen Gong Signed-off-by: David Brownell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 708d8cefd0f6d8dc13027f899e865ccfa5f63871 Author: Andre Haupt Date: Wed Jul 23 21:29:51 2008 -0700 stallion: removed unused variable Signed-off-by: Andre Haupt Acked-by: Alan Cox Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ae2d4c396e19f45918ed6e0900b031538d009823 Author: Nye Liu Date: Wed Jul 23 21:29:50 2008 -0700 cpm1: don't send break on TX_STOP, don't interrupt RX/TX when adjusting termios parameters Before setting STOP_TX, set _brkcr to 0 so the SMC does not send a break character. The driver appears to properly re-initialize _brkcr when the SMC is restarted. Do not interrupt RX/TX when the termios is being adjusted; it results in corrupted characters appearing on the line. Cc: Vitaly Bordug Cc: Scott Wood Cc: Paul Mackerras Cc: Kumar Gala Cc: Alan Cox Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e9a8f4d1de12633bfb71b5fee47745b32877b7b5 Author: Maciej W. Rozycki Date: Wed Jul 23 21:29:49 2008 -0700 serial: DZ11: avoid a hang at console switch-over Changes to the generic console support code that happened a while ago introduced a scenario where the initial console is used in parallel with the final console during a brief period when switching between the two is in progress. During that time a message about the switch-over is printed. With some combinations of chips, firmware and drivers, such as the DEC DZ11 clone used with the DECstation, a hang may happen because the firmware used for the initial console may not expect the state of the chip after it has been initialised by the driver. This is a workaround for the DZ11 which reuses the power-management callback to keep the transmitter of the line associated with the console enabled. It reflects the consensus reached in a discussion a while ago. 
Signed-off-by: Maciej W. Rozycki Cc: Jiri Slaby Cc: Alan Cox Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 377135912806ddc87d56d64fafa685f4063c45f1 Author: Maciej W. Rozycki Date: Wed Jul 23 21:29:48 2008 -0700 serial: Z85C30: avoid a hang at console switch-over Changes to the generic console support code that happened a while ago introduced a scenario where the initial console is used in parallel with the final console during a brief period when switching between the two is in progress. During that time a message about the switch-over is printed. With some combinations of chips, firmware and drivers, such as the Zilog Z85C30 SCC used with the DECstation, a hang may happen because the firmware used for the initial console may not expect the state of the chip after it has been initialised by the driver. This is not a bug in the firmware, as some registers it would have to examine are write-only. This is a workaround for the Z85C30 which reuses the power-management callback to keep the transmitter of the line associated with the console enabled. It reflects the consensus reached in a discussion a while ago. Signed-off-by: Maciej W. Rozycki Cc: Jiri Slaby Cc: Alan Cox Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b76c5a0717094f0a900d9afd8e36f7ad8dbba587 Author: Catalin(ux) M BOIE Date: Wed Jul 23 21:29:46 2008 -0700 serial: add support for a no-name 4 ports multiserial card It is a no-name PCI card. I found no reference to a producer so I used "UNKNOWN_0x1584" as the name. 
Full lspci: 01:07.0 0780: 10b5:9050 (rev 01) Subsystem: 10b5:1584 Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- \ ParErr- Stepping- SERR+ FastB2B- Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- \ DEVSEL=medium >TAbort- SERR- Acked-by: Alan Cox Acked-by: Russell King Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7500b1f602aad75901774a67a687ee985d85893f Author: Aristeu Rozanski Date: Wed Jul 23 21:29:45 2008 -0700 8250: fix break handling for Intel 82571 Intel 82571 has a "Serial Over LAN" feature that doesn't properly implement the receiving of break characters. When a break is received, it doesn't set UART_LSR_DR and unless another character is received, the break won't be received by the application. Signed-off-by: Aristeu Rozanski Acked-by: Alan Cox Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 920519c1c31ca46ef6caab1a4be102ed0dfb5fbc Author: Adrian Bunk Date: Wed Jul 23 21:29:44 2008 -0700 serial/8250_gsc.c: add MODULE_LICENSE This patch adds the missing MODULE_LICENSE("GPL"). Signed-off-by: Adrian Bunk Acked-by: Alan Cox Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 9fe5ad9c8cef9ad5873d8ee55d1cf00d9b607df0 Author: Ulrich Drepper Date: Wed Jul 23 21:29:43 2008 -0700 flag parameters add-on: remove epoll_create size param Remove the size parameter from the new epoll_create syscall and rename the syscall itself. The updated test program follows.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#define _GNU_SOURCE		/* for syscall() and O_CLOEXEC */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_epoll_create2
# ifdef __x86_64__
#  define __NR_epoll_create2 291
# elif defined __i386__
#  define __NR_epoll_create2 329
# else
#  error "need __NR_epoll_create2"
# endif
#endif

#define EPOLL_CLOEXEC O_CLOEXEC

int
main (void)
{
  /* Without the flag, close-on-exec must not be set.  */
  int fd = syscall (__NR_epoll_create2, 0);
  if (fd == -1)
    {
      puts ("epoll_create2(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("epoll_create2(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  /* With EPOLL_CLOEXEC, close-on-exec must be set.  */
  fd = syscall (__NR_epoll_create2, EPOLL_CLOEXEC);
  if (fd == -1)
    {
      puts ("epoll_create2(EPOLL_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("epoll_create2(EPOLL_CLOEXEC) does not set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper Acked-by: Davide Libenzi Cc: Michael Kerrisk Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit e38b36f325153eaadd1c2a7abc5762079233e540 Author: Ulrich Drepper Date: Wed Jul 23 21:29:42 2008 -0700 flag parameters: check magic constants This patch adds tests that ensure the boundary conditions for the various constants introduced in the previous patches are met. No code is generated. [akpm@linux-foundation.org: fix alpha] Signed-off-by: Ulrich Drepper Acked-by: Davide Libenzi Cc: Michael Kerrisk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 510df2dd482496083e1c3b1a8c9b6afd5fa4c7d7 Author: Ulrich Drepper Date: Wed Jul 23 21:29:41 2008 -0700 flag parameters: NONBLOCK in inotify_init This patch adds non-blocking support for inotify_init1. The additional changes needed are minimal.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    #ifndef __NR_inotify_init1
    # ifdef __x86_64__
    #  define __NR_inotify_init1 294
    # elif defined __i386__
    #  define __NR_inotify_init1 332
    # else
    #  error "need __NR_inotify_init1"
    # endif
    #endif

    #define IN_NONBLOCK O_NONBLOCK

    int
    main (void)
    {
      int fd = syscall (__NR_inotify_init1, 0);
      if (fd == -1)
        {
          puts ("inotify_init1(0) failed");
          return 1;
        }
      int fl = fcntl (fd, F_GETFL);
      if (fl == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (fl & O_NONBLOCK)
        {
          puts ("inotify_init1(0) set non-blocking mode");
          return 1;
        }
      close (fd);

      fd = syscall (__NR_inotify_init1, IN_NONBLOCK);
      if (fd == -1)
        {
          puts ("inotify_init1(IN_NONBLOCK) failed");
          return 1;
        }
      fl = fcntl (fd, F_GETFL);
      if (fl == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((fl & O_NONBLOCK) == 0)
        {
          puts ("inotify_init1(IN_NONBLOCK) does not set non-blocking mode");
          return 1;
        }
      close (fd);

      puts ("OK");

      return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit be61a86d7237dd80510615f38ae21d6e1e98660c
Author: Ulrich Drepper
Date:   Wed Jul 23 21:29:40 2008 -0700

    flag parameters: NONBLOCK in pipe

    This patch adds O_NONBLOCK support to pipe2.  It is minimally more
    involved than the patches for eventfd et al. but still trivial.  The
    interfaces of the create_write_pipe and create_read_pipe helper functions
    were changed, and the one other caller was updated as well.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include #include #include #include #ifndef __NR_pipe2 # ifdef __x86_64__ # define __NR_pipe2 293 # elif defined __i386__ # define __NR_pipe2 331 # else # error "need __NR_pipe2" # endif #endif int main (void) { int fds[2]; if (syscall (__NR_pipe2, fds, 0) == -1) { puts ("pipe2(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { int fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { printf ("pipe2(0) set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } if (syscall (__NR_pipe2, fds, O_NONBLOCK) == -1) { puts ("pipe2(O_NONBLOCK) failed"); return 1; } for (int i = 0; i < 2; ++i) { int fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { printf ("pipe2(O_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper Acked-by: Davide Libenzi Cc: Michael Kerrisk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 6b1ef0e60d42f2fdaec26baee8327eb156347b4f Author: Ulrich Drepper Date: Wed Jul 23 21:29:39 2008 -0700 flag parameters: NONBLOCK in timerfd_create This patch adds support for the TFD_NONBLOCK flag to timerfd_create. The additional changes needed are minimal. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. 

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include <fcntl.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    #ifndef __NR_timerfd_create
    # ifdef __x86_64__
    #  define __NR_timerfd_create 283
    # elif defined __i386__
    #  define __NR_timerfd_create 322
    # else
    #  error "need __NR_timerfd_create"
    # endif
    #endif

    #define TFD_NONBLOCK O_NONBLOCK

    int
    main (void)
    {
      int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0);
      if (fd == -1)
        {
          puts ("timerfd_create(0) failed");
          return 1;
        }
      int fl = fcntl (fd, F_GETFL);
      if (fl == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (fl & O_NONBLOCK)
        {
          puts ("timerfd_create(0) set non-blocking mode");
          return 1;
        }
      close (fd);

      fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_NONBLOCK);
      if (fd == -1)
        {
          puts ("timerfd_create(TFD_NONBLOCK) failed");
          return 1;
        }
      fl = fcntl (fd, F_GETFL);
      if (fl == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((fl & O_NONBLOCK) == 0)
        {
          puts ("timerfd_create(TFD_NONBLOCK) does not set non-blocking mode");
          return 1;
        }
      close (fd);

      puts ("OK");

      return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit e7d476dfdf0bcfed478a207aecfdc84f81efecaf
Author: Ulrich Drepper
Date:   Wed Jul 23 21:29:38 2008 -0700

    flag parameters: NONBLOCK in eventfd

    This patch adds support for the EFD_NONBLOCK flag to eventfd2.  The
    additional changes needed are minimal.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include #include #include #include #ifndef __NR_eventfd2 # ifdef __x86_64__ # define __NR_eventfd2 290 # elif defined __i386__ # define __NR_eventfd2 328 # else # error "need __NR_eventfd2" # endif #endif #define EFD_NONBLOCK O_NONBLOCK int main (void) { int fd = syscall (__NR_eventfd2, 1, 0); if (fd == -1) { puts ("eventfd2(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("eventfd2(0) sets non-blocking mode"); return 1; } close (fd); fd = syscall (__NR_eventfd2, 1, EFD_NONBLOCK); if (fd == -1) { puts ("eventfd2(EFD_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("eventfd2(EFD_NONBLOCK) does not set non-blocking mode"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper Acked-by: Davide Libenzi Cc: Michael Kerrisk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 5fb5e04926a54bc1c22bba7ca166840f4476196f Author: Ulrich Drepper Date: Wed Jul 23 21:29:37 2008 -0700 flag parameters: NONBLOCK in signalfd This patch adds support for the SFD_NONBLOCK flag to signalfd4. The additional changes needed are minimal. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include #include #include #include #include #ifndef __NR_signalfd4 # ifdef __x86_64__ # define __NR_signalfd4 289 # elif defined __i386__ # define __NR_signalfd4 327 # else # error "need __NR_signalfd4" # endif #endif #define SFD_NONBLOCK O_NONBLOCK int main (void) { sigset_t ss; sigemptyset (&ss); sigaddset (&ss, SIGUSR1); int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0); if (fd == -1) { puts ("signalfd4(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("signalfd4(0) set non-blocking mode"); return 1; } close (fd); fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_NONBLOCK); if (fd == -1) { puts ("signalfd4(SFD_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("signalfd4(SFD_NONBLOCK) does not set non-blocking mode"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper Acked-by: Davide Libenzi Cc: Michael Kerrisk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 77d2720059618b9b6e827a8b73831eb6c6fad63c Author: Ulrich Drepper Date: Wed Jul 23 21:29:35 2008 -0700 flag parameters: NONBLOCK in socket and socketpair This patch introduces support for the SOCK_NONBLOCK flag in socket, socketpair, and paccept. To do this the internal function sock_attach_fd gets an additional parameter which it uses to set the appropriate flag for the file descriptor. Given that in modern, scalable programs almost all socket connections are non-blocking and the minimal additional cost for the new functionality I see no reason not to add this code. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include #include #include #include #include #include #include #ifndef __NR_paccept # ifdef __x86_64__ # define __NR_paccept 288 # elif defined __i386__ # define SYS_PACCEPT 18 # define USE_SOCKETCALL 1 # else # error "need __NR_paccept" # endif #endif #ifdef USE_SOCKETCALL # define paccept(fd, addr, addrlen, mask, flags) \ ({ long args[6] = { \ (long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \ syscall (__NR_socketcall, SYS_PACCEPT, args); }) #else # define paccept(fd, addr, addrlen, mask, flags) \ syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags) #endif #define PORT 57392 #define SOCK_NONBLOCK O_NONBLOCK static pthread_barrier_t b; static void * tf (void *arg) { pthread_barrier_wait (&b); int s = socket (AF_INET, SOCK_STREAM, 0); struct sockaddr_in sin; sin.sin_family = AF_INET; sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK); sin.sin_port = htons (PORT); connect (s, (const struct sockaddr *) &sin, sizeof (sin)); close (s); pthread_barrier_wait (&b); pthread_barrier_wait (&b); s = socket (AF_INET, SOCK_STREAM, 0); sin.sin_port = htons (PORT); connect (s, (const struct sockaddr *) &sin, sizeof (sin)); close (s); pthread_barrier_wait (&b); return NULL; } int main (void) { int fd; fd = socket (PF_INET, SOCK_STREAM, 0); if (fd == -1) { puts ("socket(0) failed"); return 1; } int fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { puts ("socket(0) set non-blocking mode"); return 1; } close (fd); fd = socket (PF_INET, SOCK_STREAM|SOCK_NONBLOCK, 0); if (fd == -1) { puts ("socket(SOCK_NONBLOCK) failed"); return 1; } fl = fcntl (fd, F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { puts ("socket(SOCK_NONBLOCK) does not set non-blocking mode"); return 1; } close (fd); int fds[2]; if (socketpair (PF_UNIX, SOCK_STREAM, 0, fds) == -1) { puts ("socketpair(0) failed"); return 1; } for (int i 
= 0; i < 2; ++i) { fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if (fl & O_NONBLOCK) { printf ("socketpair(0) set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } if (socketpair (PF_UNIX, SOCK_STREAM|SOCK_NONBLOCK, 0, fds) == -1) { puts ("socketpair(SOCK_NONBLOCK) failed"); return 1; } for (int i = 0; i < 2; ++i) { fl = fcntl (fds[i], F_GETFL); if (fl == -1) { puts ("fcntl failed"); return 1; } if ((fl & O_NONBLOCK) == 0) { printf ("socketpair(SOCK_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i); return 1; } close (fds[i]); } pthread_barrier_init (&b, NULL, 2); struct sockaddr_in sin; pthread_t th; if (pthread_create (&th, NULL, tf, NULL) != 0) { puts ("pthread_create failed"); return 1; } int s = socket (AF_INET, SOCK_STREAM, 0); int reuse = 1; setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse)); sin.sin_family = AF_INET; sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK); sin.sin_port = htons (PORT); bind (s, (struct sockaddr *) &sin, sizeof (sin)); listen (s, SOMAXCONN); pthread_barrier_wait (&b); int s2 = paccept (s, NULL, 0, NULL, 0); if (s2 < 0) { puts ("paccept(0) failed"); return 1; } fl = fcntl (s2, F_GETFL); if (fl & O_NONBLOCK) { puts ("paccept(0) set non-blocking mode"); return 1; } close (s2); close (s); pthread_barrier_wait (&b); s = socket (AF_INET, SOCK_STREAM, 0); sin.sin_port = htons (PORT); setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse)); bind (s, (struct sockaddr *) &sin, sizeof (sin)); listen (s, SOMAXCONN); pthread_barrier_wait (&b); s2 = paccept (s, NULL, 0, NULL, SOCK_NONBLOCK); if (s2 < 0) { puts ("paccept(SOCK_NONBLOCK) failed"); return 1; } fl = fcntl (s2, F_GETFL); if ((fl & O_NONBLOCK) == 0) { puts ("paccept(SOCK_NONBLOCK) does not set non-blocking mode"); return 1; } close (s2); close (s); pthread_barrier_wait (&b); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper 
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 99829b832997d907c30669bfd17da32151e18f04
Author: Ulrich Drepper
Date:   Wed Jul 23 21:29:33 2008 -0700

    flag parameters: NONBLOCK in anon_inode_getfd

    Building on the previous change to anon_inode_getfd, this patch
    introduces support for handling of O_NONBLOCK in addition to the already
    supported O_CLOEXEC.  Following patches will take advantage of this
    support.  As can be seen, the additional code needed to support this
    functionality is minimal.

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 4006553b06306b34054529477b06b68a1c66249b
Author: Ulrich Drepper
Date:   Wed Jul 23 21:29:32 2008 -0700

    flag parameters: inotify_init

    This patch introduces the new syscall inotify_init1 (note: the 1 stands
    for the one parameter the syscall takes, as opposed to no parameter
    before).  The values accepted for this parameter are function-specific
    and defined in the inotify.h header.  Here the values must match the O_*
    flags, though.  In this patch CLOEXEC support is introduced.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include #include #include #include #ifndef __NR_inotify_init1 # ifdef __x86_64__ # define __NR_inotify_init1 294 # elif defined __i386__ # define __NR_inotify_init1 332 # else # error "need __NR_inotify_init1" # endif #endif #define IN_CLOEXEC O_CLOEXEC int main (void) { int fd; fd = syscall (__NR_inotify_init1, 0); if (fd == -1) { puts ("inotify_init1(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("inotify_init1(0) set close-on-exit"); return 1; } close (fd); fd = syscall (__NR_inotify_init1, IN_CLOEXEC); if (fd == -1) { puts ("inotify_init1(IN_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("inotify_init1(O_CLOEXEC) does not set close-on-exit"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: Ulrich Drepper Acked-by: Davide Libenzi Cc: Michael Kerrisk Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit ed8cae8ba01348bfd83333f4648dd807b04d7f08 Author: Ulrich Drepper Date: Wed Jul 23 21:29:30 2008 -0700 flag parameters: pipe This patch introduces the new syscall pipe2 which is like pipe but it also takes an additional parameter which takes a flag value. This patch implements the handling of O_CLOEXEC for the flag. I did not add support for the new syscall for the architectures which have a special sys_pipe implementation. I think the maintainers of those archs have the chance to go with the unified implementation but that's up to them. The implementation introduces do_pipe_flags. I did that instead of changing all callers of do_pipe because some of the callers are written in assembler. I would probably screw up changing the assembly code. 
To avoid breaking code do_pipe is now a small wrapper around do_pipe_flags. Once all callers are changed over to do_pipe_flags the old do_pipe function can be removed. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include #include #include #include #ifndef __NR_pipe2 # ifdef __x86_64__ # define __NR_pipe2 293 # elif defined __i386__ # define __NR_pipe2 331 # else # error "need __NR_pipe2" # endif #endif int main (void) { int fd[2]; if (syscall (__NR_pipe2, fd, 0) != 0) { puts ("pipe2(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { printf ("pipe2(0) set close-on-exit for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0) { puts ("pipe2(O_CLOEXEC) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { printf ("pipe2(O_CLOEXEC) does not set close-on-exit for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper Acked-by: Davide Libenzi Cc: Michael Kerrisk Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 336dd1f70ff62d7dd8655228caed4c5bfc818c56 Author: Ulrich Drepper Date: Wed Jul 23 21:29:29 2008 -0700 flag parameters: dup2 This patch adds the new dup3 syscall. It extends the old dup2 syscall by one parameter which is meant to hold a flag value. Support for the O_CLOEXEC flag is added in this patch. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. 

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include <fcntl.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    #ifndef __NR_dup3
    # ifdef __x86_64__
    #  define __NR_dup3 292
    # elif defined __i386__
    #  define __NR_dup3 330
    # else
    #  error "need __NR_dup3"
    # endif
    #endif

    int
    main (void)
    {
      int fd = syscall (__NR_dup3, 1, 4, 0);
      if (fd == -1)
        {
          puts ("dup3(0) failed");
          return 1;
        }
      int coe = fcntl (fd, F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (coe & FD_CLOEXEC)
        {
          puts ("dup3(0) set close-on-exec flag");
          return 1;
        }
      close (fd);

      fd = syscall (__NR_dup3, 1, 4, O_CLOEXEC);
      if (fd == -1)
        {
          puts ("dup3(O_CLOEXEC) failed");
          return 1;
        }
      coe = fcntl (fd, F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((coe & FD_CLOEXEC) == 0)
        {
          puts ("dup3(O_CLOEXEC) does not set close-on-exec flag");
          return 1;
        }
      close (fd);

      puts ("OK");

      return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit a0998b50c3f0b8fdd265c63e0032f86ebe377dbf
Author: Ulrich Drepper
Date:   Wed Jul 23 21:29:27 2008 -0700

    flag parameters: epoll_create

    This patch adds the new epoll_create2 syscall.  It extends the old
    epoll_create syscall by one parameter which is meant to hold a flag
    value.  In this patch the only flag supported is EPOLL_CLOEXEC, which
    causes the close-on-exec flag for the returned file descriptor to be set.

    A new name EPOLL_CLOEXEC is introduced which in this implementation must
    have the same value as O_CLOEXEC.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include <fcntl.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    #ifndef __NR_epoll_create2
    # ifdef __x86_64__
    #  define __NR_epoll_create2 291
    # elif defined __i386__
    #  define __NR_epoll_create2 329
    # else
    #  error "need __NR_epoll_create2"
    # endif
    #endif

    #define EPOLL_CLOEXEC O_CLOEXEC

    int
    main (void)
    {
      int fd = syscall (__NR_epoll_create2, 1, 0);
      if (fd == -1)
        {
          puts ("epoll_create2(0) failed");
          return 1;
        }
      int coe = fcntl (fd, F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (coe & FD_CLOEXEC)
        {
          puts ("epoll_create2(0) set close-on-exec flag");
          return 1;
        }
      close (fd);

      fd = syscall (__NR_epoll_create2, 1, EPOLL_CLOEXEC);
      if (fd == -1)
        {
          puts ("epoll_create2(EPOLL_CLOEXEC) failed");
          return 1;
        }
      coe = fcntl (fd, F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((coe & FD_CLOEXEC) == 0)
        {
          puts ("epoll_create2(EPOLL_CLOEXEC) does not set close-on-exec flag");
          return 1;
        }
      close (fd);

      puts ("OK");

      return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 11fcb6c14676023d0bd437841f5dcd670e7990a0
Author: Ulrich Drepper
Date:   Wed Jul 23 21:29:26 2008 -0700

    flag parameters: timerfd_create

    The timerfd_create syscall already has a flags parameter.  It has just
    been unused so far.  This patch changes this by introducing the
    TFD_CLOEXEC flag to set the close-on-exec flag for the returned file
    descriptor.

    A new name TFD_CLOEXEC is introduced which in this implementation must
    have the same value as O_CLOEXEC.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include <fcntl.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    #ifndef __NR_timerfd_create
    # ifdef __x86_64__
    #  define __NR_timerfd_create 283
    # elif defined __i386__
    #  define __NR_timerfd_create 322
    # else
    #  error "need __NR_timerfd_create"
    # endif
    #endif

    #define TFD_CLOEXEC O_CLOEXEC

    int
    main (void)
    {
      int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0);
      if (fd == -1)
        {
          puts ("timerfd_create(0) failed");
          return 1;
        }
      int coe = fcntl (fd, F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (coe & FD_CLOEXEC)
        {
          puts ("timerfd_create(0) set close-on-exec flag");
          return 1;
        }
      close (fd);

      fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_CLOEXEC);
      if (fd == -1)
        {
          puts ("timerfd_create(TFD_CLOEXEC) failed");
          return 1;
        }
      coe = fcntl (fd, F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((coe & FD_CLOEXEC) == 0)
        {
          puts ("timerfd_create(TFD_CLOEXEC) does not set close-on-exec flag");
          return 1;
        }
      close (fd);

      puts ("OK");

      return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit b087498eb5605673b0f260a7620d91818cd72304
Author: Ulrich Drepper
Date:   Wed Jul 23 21:29:25 2008 -0700

    flag parameters: eventfd

    This patch adds the new eventfd2 syscall.  It extends the old eventfd
    syscall by one parameter which is meant to hold a flag value.  In this
    patch the only flag supported is EFD_CLOEXEC, which causes the
    close-on-exec flag for the returned file descriptor to be set.

    A new name EFD_CLOEXEC is introduced which in this implementation must
    have the same value as O_CLOEXEC.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include #include #include #include #ifndef __NR_eventfd2 # ifdef __x86_64__ # define __NR_eventfd2 290 # elif defined __i386__ # define __NR_eventfd2 328 # else # error "need __NR_eventfd2" # endif #endif #define EFD_CLOEXEC O_CLOEXEC int main (void) { int fd = syscall (__NR_eventfd2, 1, 0); if (fd == -1) { puts ("eventfd2(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("eventfd2(0) sets close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_eventfd2, 1, EFD_CLOEXEC); if (fd == -1) { puts ("eventfd2(EFD_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("eventfd2(EFD_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: Ulrich Drepper Acked-by: Davide Libenzi Cc: Michael Kerrisk Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 9deb27baedb79759c3ab9435a7d8b841842d56e9 Author: Ulrich Drepper Date: Wed Jul 23 21:29:24 2008 -0700 flag parameters: signalfd This patch adds the new signalfd4 syscall. It extends the old signalfd syscall by one parameter which is meant to hold a flag value. In this patch the only flag support is SFD_CLOEXEC which causes the close-on-exec flag for the returned file descriptor to be set. A new name SFD_CLOEXEC is introduced which in this implementation must have the same value as O_CLOEXEC. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include #include #include #include #include #ifndef __NR_signalfd4 # ifdef __x86_64__ # define __NR_signalfd4 289 # elif defined __i386__ # define __NR_signalfd4 327 # else # error "need __NR_signalfd4" # endif #endif #define SFD_CLOEXEC O_CLOEXEC int main (void) { sigset_t ss; sigemptyset (&ss); sigaddset (&ss, SIGUSR1); int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0); if (fd == -1) { puts ("signalfd4(0) failed"); return 1; } int coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { puts ("signalfd4(0) set close-on-exec flag"); return 1; } close (fd); fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_CLOEXEC); if (fd == -1) { puts ("signalfd4(SFD_CLOEXEC) failed"); return 1; } coe = fcntl (fd, F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { puts ("signalfd4(SFD_CLOEXEC) does not set close-on-exec flag"); return 1; } close (fd); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [akpm@linux-foundation.org: add sys_ni stub] Signed-off-by: Ulrich Drepper Acked-by: Davide Libenzi Cc: Michael Kerrisk Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 7d9dbca34240ebb6ff88d8a29c6c7bffd098f0c1 Author: Ulrich Drepper Date: Wed Jul 23 21:29:22 2008 -0700 flag parameters: anon_inode_getfd extension This patch just extends the anon_inode_getfd interface to take an additional parameter with a flag value. The flag value is passed on to get_unused_fd_flags in anticipation for a use with the O_CLOEXEC flag. No actual semantic changes here, the changed callers all pass 0 for now. 

    [akpm@linux-foundation.org: KVM fix]
    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit c019bbc612f6633ede7ed67725cbf68de45ae8a4
Author: Ulrich Drepper
Date:   Wed Jul 23 21:29:21 2008 -0700

    flag parameters: paccept w/out set_restore_sigmask

    Some platforms do not have support to restore the signal mask in the
    return path from a syscall.  For those platforms syscalls like pselect
    are not defined at all.  This is, I think, not a good choice for
    paccept() since paccept() adds more value on top of accept() than just
    the signal mask handling.

    Therefore this patch defines a scaled down version of the sys_paccept
    function for those platforms.  It returns -EINVAL in case the signal mask
    is non-NULL but behaves the same otherwise.

    Note that I explicitly included the required header.  I saw that it is
    currently included, but only indirectly, two levels down.  There is too
    much risk in relying on this.  The header might change and then suddenly
    the function definition would change without anyone immediately noticing.

    Signed-off-by: Ulrich Drepper
    Cc: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit aaca0bdca573f3f51ea03139f9c7289541e7bca3
Author: Ulrich Drepper
Date:   Wed Jul 23 21:29:20 2008 -0700

    flag parameters: paccept

    This patch is by far the most complex in the series.  It adds a new
    syscall paccept.  This syscall differs from accept in that it adds (at
    the userlevel) two additional parameters:

    - a signal mask
    - a flags value

    The flags parameter can be used to set flags like SOCK_CLOEXEC.  This is
    implemented here as well.  Some people argued that this is a property
    which should be inherited from the file descriptor for the server but
    this is against POSIX.  Additionally, we really want the signal mask
    parameter as well (similar to pselect, ppoll, etc).  So an interface
    change is inevitable.

    The flag value is the same as for socket and socketpair.
    I think diverging here will only create confusion.  Similar to the
    filesystem interfaces where the use of the O_* constants differs, it is
    acceptable here.

    The signal mask is handled as for pselect etc.  The mask is temporarily
    installed for the thread and removed before the call returns.  I modeled
    the code after pselect.  If there is a problem it's likely also in
    pselect.

    For architectures which use socketcall I maintained this interface
    instead of adding a system call.  The symmetry shouldn't be broken.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include <errno.h>
    #include <fcntl.h>
    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <sys/syscall.h>

    #ifndef __NR_paccept
    # ifdef __x86_64__
    #  define __NR_paccept 288
    # elif defined __i386__
    #  define SYS_PACCEPT 18
    #  define USE_SOCKETCALL 1
    # else
    #  error "need __NR_paccept"
    # endif
    #endif

    /* The fifth argument (8) is the size of the signal mask in bytes.  */
    #ifdef USE_SOCKETCALL
    # define paccept(fd, addr, addrlen, mask, flags) \
      ({ long args[6] = { \
           (long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \
         syscall (__NR_socketcall, SYS_PACCEPT, args); })
    #else
    # define paccept(fd, addr, addrlen, mask, flags) \
      syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags)
    #endif

    #define PORT 57392

    #define SOCK_CLOEXEC O_CLOEXEC

    static pthread_barrier_t b;

    /* Client thread: connects twice, then interrupts the main thread.  */
    static void *
    tf (void *arg)
    {
      pthread_barrier_wait (&b);
      int s = socket (AF_INET, SOCK_STREAM, 0);
      struct sockaddr_in sin;
      sin.sin_family = AF_INET;
      sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
      sin.sin_port = htons (PORT);
      connect (s, (const struct sockaddr *) &sin, sizeof (sin));
      close (s);

      pthread_barrier_wait (&b);
      s = socket (AF_INET, SOCK_STREAM, 0);
      sin.sin_port = htons (PORT);
      connect (s, (const struct sockaddr *) &sin, sizeof (sin));
      close (s);
      pthread_barrier_wait (&b);

      pthread_barrier_wait (&b);
      sleep (2);
      pthread_kill ((pthread_t) arg, SIGUSR1);

      return NULL;
    }

    static void
    handler (int s)
    {
    }

    int
    main (void)
    {
      pthread_barrier_init (&b, NULL, 2);

      struct sockaddr_in sin;
      pthread_t th;
      if (pthread_create (&th, NULL, tf, (void *) pthread_self ()) != 0)
        {
          puts ("pthread_create failed");
          return 1;
        }

      int s = socket (AF_INET, SOCK_STREAM, 0);
      int reuse = 1;
      setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse));
      sin.sin_family = AF_INET;
      sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
      sin.sin_port = htons (PORT);
      bind (s, (struct sockaddr *) &sin, sizeof (sin));
      listen (s, SOMAXCONN);

      pthread_barrier_wait (&b);

      /* Without flags the new descriptor must not be close-on-exec.  */
      int s2 = paccept (s, NULL, 0, NULL, 0);
      if (s2 < 0)
        {
          puts ("paccept(0) failed");
          return 1;
        }

      int coe = fcntl (s2, F_GETFD);
      if (coe & FD_CLOEXEC)
        {
          puts ("paccept(0) set close-on-exec-flag");
          return 1;
        }
      close (s2);

      pthread_barrier_wait (&b);

      /* With SOCK_CLOEXEC the new descriptor must be close-on-exec.  */
      s2 = paccept (s, NULL, 0, NULL, SOCK_CLOEXEC);
      if (s2 < 0)
        {
          puts ("paccept(SOCK_CLOEXEC) failed");
          return 1;
        }

      coe = fcntl (s2, F_GETFD);
      if ((coe & FD_CLOEXEC) == 0)
        {
          puts ("paccept(SOCK_CLOEXEC) does not set close-on-exec flag");
          return 1;
        }
      close (s2);

      pthread_barrier_wait (&b);

      struct sigaction sa;
      sa.sa_handler = handler;
      sa.sa_flags = 0;
      sigemptyset (&sa.sa_mask);
      sigaction (SIGUSR1, &sa, NULL);

      /* Block SIGUSR1 in this thread, but pass paccept a mask that
         leaves it unblocked for the duration of the call.  */
      sigset_t ss;
      pthread_sigmask (SIG_SETMASK, NULL, &ss);
      sigaddset (&ss, SIGUSR1);
      pthread_sigmask (SIG_SETMASK, &ss, NULL);

      sigdelset (&ss, SIGUSR1);
      alarm (4);
      pthread_barrier_wait (&b);

      errno = 0;
      s2 = paccept (s, NULL, 0, &ss, 0);
      if (s2 != -1 || errno != EINTR)
        {
          puts ("paccept did not fail with EINTR");
          return 1;
        }

      close (s);

      puts ("OK");

      return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    [akpm@linux-foundation.org: make it compile]
    [akpm@linux-foundation.org: add sys_ni stub]
    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Cc:
    Cc: "David S. Miller"
    Cc: Roland McGrath
    Cc: Kyle McMartin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit a677a039be7243357d93502bff2b40850c942e2d
Author: Ulrich Drepper
Date:   Wed Jul 23 21:29:17 2008 -0700

    flag parameters: socket and socketpair

    This patch adds support for flag values which are ORed to the type
    passed to socket and socketpair.  The additional code is minimal.  The
    flag values in this implementation can and must match the O_* flags.
    This avoids overhead in the conversion.

    The internal functions sock_alloc_fd and sock_map_fd get a new parameter
    and all callers are changed.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <sys/socket.h>

    #define PORT 57392

    /* For Linux these must be the same.  */
    #define SOCK_CLOEXEC O_CLOEXEC

    int
    main (void)
    {
      int fd;
      fd = socket (PF_INET, SOCK_STREAM, 0);
      if (fd == -1)
        {
          puts ("socket(0) failed");
          return 1;
        }
      int coe = fcntl (fd, F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (coe & FD_CLOEXEC)
        {
          puts ("socket(0) set close-on-exec flag");
          return 1;
        }
      close (fd);

      fd = socket (PF_INET, SOCK_STREAM|SOCK_CLOEXEC, 0);
      if (fd == -1)
        {
          puts ("socket(SOCK_CLOEXEC) failed");
          return 1;
        }
      coe = fcntl (fd, F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((coe & FD_CLOEXEC) == 0)
        {
          puts ("socket(SOCK_CLOEXEC) does not set close-on-exec flag");
          return 1;
        }
      close (fd);

      int fds[2];
      if (socketpair (PF_UNIX, SOCK_STREAM, 0, fds) == -1)
        {
          puts ("socketpair(0) failed");
          return 1;
        }
      for (int i = 0; i < 2; ++i)
        {
          coe = fcntl (fds[i], F_GETFD);
          if (coe == -1)
            {
              puts ("fcntl failed");
              return 1;
            }
          if (coe & FD_CLOEXEC)
            {
              printf ("socketpair(0) set close-on-exec flag for fds[%d]\n", i);
              return 1;
            }
          close (fds[i]);
        }

      if (socketpair (PF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, fds) == -1)
        {
          puts ("socketpair(SOCK_CLOEXEC) failed");
          return 1;
        }
      for (int i = 0; i < 2; ++i)
        {
          coe = fcntl (fds[i], F_GETFD);
          if (coe == -1)
            {
              puts ("fcntl failed");
              return 1;
            }
          if ((coe & FD_CLOEXEC) == 0)
            {
              printf ("socketpair(SOCK_CLOEXEC) does not set close-on-exec flag for fds[%d]\n", i);
              return 1;
            }
          close (fds[i]);
        }

      puts ("OK");

      return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Cc: "David S. Miller"
    Cc: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 6e2c10a12a2170856f5582d62d583cbcd1cb5eaf
Author: Akinobu Mita
Date:   Wed Jul 23 21:29:15 2008 -0700

    binfmt_misc: use simple_read_from_buffer()

    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 76a6f3dc9a7108785c145a298f82c72f9208fe17
Author: Adrian Bunk
Date:   Wed Jul 23 21:29:15 2008 -0700

    CONFIG_SOUND_WM97XX: remove stale makefile line

    The driver has been gone for a long time.

    Reported-by: Robert P. J. Day
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 7102ed519a08b70eadc8fea9d8765d2d990241d1
Author: Adrian Bunk
Date:   Wed Jul 23 21:29:13 2008 -0700

    remove the OSS trident driver

    SOUND_TRIDENT was the last PCI OSS driver, and since there's already an
    ALSA driver for the same hardware we can remove it.

    [muli@il.ibm.com: update CREDITS]
    Signed-off-by: Adrian Bunk
    Signed-off-by: Muli Ben-Yehuda
    Signed-off-by: Muli Ben-Yehuda
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 33cba0657393a75e18e1781e3e13613303f18124
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:12 2008 -0700

    checkpatch: version 0.21

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 234fff6515a11cf3e67c793146689da426787fea
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:12 2008 -0700

    checkpatch: types cannot start mid word for pointer tests

    When checking spacing for pointer checks the type cannot start in the
    middle of a word, i.e.
    this is not 'int * bar':

        x = fooint * bar;

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 292f1a9b342d763f94ea3915726a48905be4acd1
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:11 2008 -0700

    checkpatch: complex macros need to ignore comments

    Ensure we ignore comments in complex macro detection else we incorrectly
    report this:

        #define PFM_GROUP_PERM_ANY -1 /* any user/group */

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 74048ed811152a995a88945ba9e0dded34adfff4
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:10 2008 -0700

    checkpatch: variants -- move the main unary/binary operators to use variants

    Now that we have a variants system, move to using that to carry the
    unary/binary designation for +, -, &, and *.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 1f65f947a6a875e1fe7867dc08e981c4101d435d
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:10 2008 -0700

    checkpatch: add checks for question mark and colon spacing

    Add checks for the question mark colon operator spacing, and also check
    the other uses of colon.  Colon means a number of things:

     - it introduces the else part of the ?: operator,
     - it terminates a goto label,
     - it terminates the case value,
     - it separates the identifier from the bit size on bit fields, and
     - it is used to introduce option types in asm().
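
For illustration only, the five uses of the colon listed above can all appear in one small C fragment. This is a sketch, not code from the patch: `struct flags` and `classify()` are hypothetical names, the `__asm__` statement assumes a GCC-compatible compiler, and the spacing shown is ordinary kernel style rather than a statement of exactly what checkpatch accepts or rejects.

```c
/* Bit fields: the colon separates the identifier from the bit size. */
struct flags {
	unsigned int ready : 1;
	unsigned int mode : 3;
};

/* Hypothetical helper that exercises each use of ':' named above. */
int classify(int value)
{
	struct flags f = { .ready = 1, .mode = 5 };
	int result;

	/* asm(): colons introduce the output, input and clobber lists
	 * (GCC extended asm; this statement is only a compiler barrier). */
	__asm__ volatile("" : : : "memory");

	switch (value) {
	case 0:		/* the colon terminates the case value */
		result = f.ready;
		break;
	case 1:
		result = f.mode;
		break;
	default:
		goto out_negative;
	}

	/* ?: operator: the colon introduces the else part. */
	return result > 1 ? result : 0;

out_negative:		/* the colon terminates a goto label */
	return -1;
}
```

Because the same character appears in five syntactically different positions, a spacing check has to decide which role a given colon plays before it can say what spacing is expected around it.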
    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit d2506586586c59f5db0e2ce00d5d31ccec6260b8
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:09 2008 -0700

    checkpatch: possible modifiers -- handle multiple modifiers and trailing

    Add support for multiple modifiers such as:

        int __one __two foo;

    Also handle trailing known modifiers when detecting modifiers:

        int __one foo __read_mostly;

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 0221f55c142b0ac8baf6f0b6c4e1ec89f0c98e96
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:08 2008 -0700

    checkpatch: possible types -- known modifiers cannot be types

    Ensure we do not inadvertently load known modifiers up as possible types.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 8ea3eb9a20f39d5afa52900a34092b4b5f6b55cb
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:08 2008 -0700

    checkpatch: handle return types of pointers to functions

    Make sure we correctly mark the return type of the pointer to a function
    declaration.

        const void *(*sb_tag)(struct sysfs_tag_info *info);

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit b8f96a31f38c8e9fc75f0a89c6815e7cbc402858
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:07 2008 -0700

    checkpatch: macro complexity checks are meaningless in linker scripts

    Exclude vmlinux.lds.h from the macro complexity checks.  They will never
    apply sanely here.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit d2172eb5bd4b7d06577113ec40635083619ca54a
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:07 2008 -0700

    checkpatch: possible modifiers are not being correctly matched

    Although we are finding the added modifier in the declaration below we
    are not correctly matching it as a type.  Fix the declaration.

        static void __ref *vmem_alloc_pages(unsigned int order)
        {
        }

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 7429c6903e3628fc2cfea65ec7e13bac030c7bfe
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:06 2008 -0700

    checkpatch: improve type matcher debug

    Improve type matcher debug so we can see what it does match.  As part of
    this move us to using the common debug framework.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 389a2fe57ffc59a649bea39db4d7e6d2eff2b562
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:05 2008 -0700

    checkpatch: allow for type modifiers on multiple declarations

    Allow for type modifiers mid declaration on multiple declarations:

        struct mxser_mstatus ms, __user *msu = argp;

    Reported by Jiri Slaby.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 3c232147a7d5b0418b0a0bae0e5b9a62fb81f4f2
Author: Wolfram Sang
Date:   Wed Jul 23 21:29:05 2008 -0700

    checkpatch: correct spelling in kfree checks

    Correct spelling in the kfree reports.

    Signed-off-by: Wolfram Sang
    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 4c432a8f0134504814aa8dcce6cc57c89d175604
Author: Greg Kroah-Hartman
Date:   Wed Jul 23 21:29:04 2008 -0700

    checkpatch: usb_free_urb() can take NULL

    usb_free_urb() can take a NULL, so let's check and warn about that.

    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit f5fe35dd95549b1b419cdeb2ec3fe61fda94fa93
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:03 2008 -0700

    checkpatch: condition/loop indent checks

    Check to see if the block/statement which a condition or loop introduces
    is indented correctly.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 53210168feeff9a3c780bd42f69936d4c12381d5
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:03 2008 -0700

    checkpatch: toughen trailing if statement checks and extend them to while and for

    Extend the trailing statement checks to report a trailing semi-colon ';'
    as we really want it on the next line and indented so it is really really
    obvious.  Also extend the tests to include while and for.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 8d31cfcecf67563d70cd68616cb8fb4384f24b51
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:02 2008 -0700

    checkpatch: check spacing for square brackets

    Check on the spacing before square brackets.  We should only allow spaces
    there if this is part of a type definition or an initialiser.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit e2a763c20b89890d2153551b1af6962b135de4c0
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:02 2008 -0700

    checkpatch: switch -- report trailing statements on case and default

    Report trailing statements on case and default lines.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit f4c014c0dede10cc0a8463e748892e738e190699
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:01 2008 -0700

    checkpatch: allow printk strings to exceed 80 characters to maintain their searchability

    Allow printk strings to break the 80 character width limits, thus keeping
    them complete and searchable.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 548596d523d83dff5a670beb84be0daf4c3bcd16
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:01 2008 -0700

    checkpatch: trailing statement indent: fix end of statement location

    Fix end of statement location.
    Where the last line of the statement is replaced we are misreporting
    the newly added replacement as an incorrectly indented trailing statement
    for the negative context.  We are also incorrectly reporting negative
    statements generally.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit a3bb97a7aba36055d476896ed6393ab35a119d5b
Author: Andy Whitcroft
Date:   Wed Jul 23 21:29:00 2008 -0700

    checkpatch: macros: fix statement counting block end detection

    We are incorrectly counting the lines in a block while accumulating the
    trailing lines in a macro statement, leading to false positives.  Fix end
    of block handling and general counting for negative context lines.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 6ef9b297f6e8850da3be9c9ff5f00385c0977004
Author: Andy Whitcroft
Date:   Wed Jul 23 21:28:59 2008 -0700

    checkpatch: types: unary -- goto introduces unary context

    When we see a goto we enter unary context.  For example:

        goto *h;

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit beae6332493a40116dba24928154621f2e88b9a9
Author: Andy Whitcroft
Date:   Wed Jul 23 21:28:59 2008 -0700

    checkpatch: comment detection: ignore macro continuation when detecting associated comments

    When looking for an associated comment, it may be suffixed by a macro
    continuation.  Ignore this.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit d3ddcf471ea90d7ff711dbaa371ef379ed625ec0
Author: Andy Whitcroft
Date:   Wed Jul 23 21:28:58 2008 -0700

    checkpatch: possible types: __asm__ is never a type

    We are false matching __asm__ as a type, and then tripping the external
    function checks.  Squash.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit f3db6639fee577f6ed92c0a1fc881e748c47ec48
Author: Michael Ellerman
Date:   Wed Jul 23 21:28:57 2008 -0700

    checkpatch: add a checkpatch warning for new uses of __initcall().

    [apw@shadowen.org: generalise pattern and add tests]
    Signed-off-by: Michael Ellerman
    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit c8cb2ca37ed51aa1f3b20e3eff1e72df1c400f70
Author: Andy Whitcroft
Date:   Wed Jul 23 21:28:57 2008 -0700

    checkpatch: types: some types may also be identifiers

    Some types such as typedefs may overlap real identifiers.  Be more
    targeted about when a type can really exist.  Where it cannot, let it be
    an identifier.  This prevents false reporting of the minus '-' in unary
    context in the following:

        foo[bar->bool - 1];

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit fee61c47d15270bdea699a8a3dd867f0825c3541
Author: Andy Whitcroft
Date:   Wed Jul 23 21:28:56 2008 -0700

    checkpatch: return is not a function -- parentheses for casts are ok too

    Casts require parentheses so it is possible to have something like this:

        return (int)(*a);

    This mistakenly trips the complexity function.  Ensure that the two
    separate parenthesised sections are not coalesced.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 6cbb2e711128b505209f7c910018aac77335c887
Author: Andy Whitcroft
Date:   Wed Jul 23 21:28:55 2008 -0700

    checkpatch: Version: 0.20

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 5aa0769d089125e63f8dc23e0283e559e1790493
Author: Hans-Christian Egtvedt
Date:   Wed Jul 23 21:28:55 2008 -0700

    atmel_pwm: set up only one PWM clock when allocating a clock

    This patch will only set up one clock, if free, and return this clock to
    the caller.
    The previous solution would set up both clocks with the same prescaler
    and divider and return PWM_CPR_CLKB, thus taking both clocks in the same
    call without the caller knowing.

    Signed-off-by: Hans-Christian Egtvedt
    Cc: David Brownell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 82736f4d1d2b7063b829cc93171a6e5aea8a9c49
Author: Uwe Kleine-König
Date:   Wed Jul 23 21:28:54 2008 -0700

    generic irqs: handle failure of irqchip->set_type in setup_irq

    set_type returns an int indicating success or failure, but up to now
    setup_irq ignores that.

    In my case this resulted in a machine hang: gpio-keys requested
    IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING, but arm/ns9xxx can only
    trigger on one direction, so set_type didn't touch the configuration,
    which happens to default to level sensitivity, and returned -EINVAL.
    setup_irq ignored that and unmasked the irq.  This resulted in an endless
    triggering of the gpio-key interrupt service routine which effectively
    killed the machine.

    With this patch applied setup_irq propagates the error to the caller.

    Note that previously, in the case chip && !chip->set_type && !chip->name,
    a NULL pointer was fed to printk.  This is fixed, too.

    Signed-off-by: Uwe Kleine-König
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit f606ddf42fd4edc558eeb48bfee66d2c591571d2
Author: Adrian Bunk
Date:   Wed Jul 23 21:28:50 2008 -0700

    remove the v850 port

    Trying to compile the v850 port produces many compile errors, one of
    which has existed since at least kernel 2.6.19.

    There also seems to be no one willing to bring this port back into a
    usable state.

    This patch therefore removes the v850 port.

    If anyone ever decides to revive the v850 port the code will still be
    available from older kernels, and it wouldn't be impossible for the port
    to reenter the kernel if it would become actively maintained again.

    Signed-off-by: Adrian Bunk
    Acked-by: Greg Ungerer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 99764fa4ceeecba8b9e0a8a5565b418a2e94f83b
Author: WANG Cong
Date:   Wed Jul 23 21:28:49 2008 -0700

    UML: make several more things static

    - Make some variables and functions static, since they don't need to be
      global.

    - Remove an unused function - arch/um/kernel/time.c::sched_clock().

    - Clean up the style a bit, as complained about by checkpatch.pl.

    Cc: Jeff Dike
    Signed-off-by: WANG Cong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 4a5675820436e4ad738dd442c1cc8a165101509b
Author: WANG Cong
Date:   Wed Jul 23 21:28:49 2008 -0700

    arch/um/kernel/mem.c: remove arch_validate()

    - Remove arch_validate(), because no one uses it.

    - Remove useless macro HAVE_ARCH_VALIDATE.

    - Make the variable 'empty_bad_page' static.

    Cc: Jeff Dike
    Signed-off-by: WANG Cong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 4c182ae7810f3fe444e666f3f78c209a7c116fdf
Author: WANG Cong
Date:   Wed Jul 23 21:28:47 2008 -0700

    arch/um/kernel/irq.c: clean up some functions

    Make activate_fd() and free_irq_by_irq_and_dev() static.  Remove
    init_aio_irq() since it has no users.

    Cc: Jeff Dike
    Signed-off-by: WANG Cong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit ed62f77bb631bc4a2d8acb0521b720cb55e58183
Author: Akinobu Mita
Date:   Wed Jul 23 21:28:46 2008 -0700

    cris: use simple_read_from_buffer()

    Signed-off-by: Akinobu Mita
    Cc: Mikael Starvik
    Cc: Jesper Nilsson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit d50004b0867a59f8a81116f000edb352595343d9
Author: Fernando Luis Vazquez Cao
Date:   Wed Jul 23 21:28:45 2008 -0700

    cris: remove unused global_flush_tlb

    global_flush_tlb is declared but never used.

    Signed-off-by: Fernando Luis Vazquez Cao
    Cc: Mikael Starvik
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 912019572180f287e85b5534fbb1c1e3ca6df6c9
Author: Adrian Bunk
Date:   Wed Jul 23 21:28:45 2008 -0700

    mn10300: move sg_dma_{address,len}() to asm/scatterlist.h

    mn10300 was the only architecture where sg_dma_{address,len}() were not
    in asm/scatterlist.h, and it's not a big surprise that this caused a
    compile error somewhere:

    /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/media/video/videobuf-dma-sg.c: In function `videobuf_dma_map':
    /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/media/video/videobuf-dma-sg.c:238: error: implicit declaration of function 'sg_dma_address'

    Acked-by: David Howells
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit f0af566da6e9a4a2f5a83c5a70f3d0a772050e21
Author: David Howells
Date:   Wed Jul 23 21:28:44 2008 -0700

    pm: fix try_to_freeze_tasks()'s use of do_div()

    Fix try_to_freeze_tasks()'s use of do_div() on an s64 by making
    elapsed_csecs64 a u64 instead and dividing that.

    Possibly this should be guarded lest the interval calculation turn up
    negative, but the possible negativity of the result of the division is
    cast away anyway.

    This was introduced by patch 438e2ce68dfd4af4cfcec2f873564fb921db4bb5.

    Signed-off-by: David Howells
    Acked-by: "Rafael J. Wysocki"
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit e41fb7c58e3ca18ec5c9c9bb7bb68e8e653c9e8e
Author: Carlos Corbacho
Date:   Wed Jul 23 21:28:43 2008 -0700

    pm: acpi pm: add DMI quirk list for ACPI 1.0 suspend ordering

    There are a few BIOSes that we know of already that need to use the ACPI
    1.0 suspend order.  This appears to be only a small minority of mostly
    nVidia based systems.

    Based on observation of Windows behaviour, it's clear that Windows is
    also maintaining its own list of broken hardware that needs this
    workaround.

    Signed-off-by: Carlos Corbacho
    Signed-off-by: Rafael J. Wysocki
    Cc: Andi Kleen
    Cc: Len Brown
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit bdfe6b7c681669148dae4db27eb24ee5408ba371
Author: Shaohua Li
Date:   Wed Jul 23 21:28:41 2008 -0700

    pm: acpi hibernation: utilize hardware signature

    ACPI defines a hardware signature.  BIOS calculates the signature
    according to the hardware configuration, and if the hardware changes
    while hibernated, the signature will change.  In that case, S4 resume
    should fail.

    Still, there may be systems on which this mechanism does not work
    correctly, so it is better to provide a workaround for them.  For this
    reason, add a new switch to the acpi_sleep= command line argument
    allowing one to disable hardware signature checking.

    [shaohua.li@intel.com: build fix]
    Signed-off-by: Shaohua Li
    Signed-off-by: Rafael J. Wysocki
    Cc: Andi Kleen
    Cc: Len Brown
    Acked-by: Pavel Machek
    Cc:
    Cc: Shaohua Li
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 2f15fc4bdf91eb399da3f47a09c55831d9f22826
Author: Zhang Rui
Date:   Wed Jul 23 21:28:40 2008 -0700

    pm: schedule sysrq poweroff on boot cpu

    Schedule sysrq poweroff on the boot cpu.

    sysrq poweroff needs to disable nonboot cpus, and we need to run this on
    the boot cpu to avoid any recursion.

    http://bugzilla.kernel.org/show_bug.cgi?id=10897

    [kosaki.motohiro@jp.fujitsu.com: build fix]
    Signed-off-by: Zhang Rui
    Tested-by: Rus
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit c1a220e7acf8ad2c03504891f4a70cd9c32c904b
Author: Zhang Rui
Date:   Wed Jul 23 21:28:39 2008 -0700

    pm: introduce new interfaces schedule_work_on() and queue_work_on()

    This interface allows adding a job on a specific cpu.

    Although a work struct on a cpu will be scheduled to another cpu if the
    cpu dies, there is a recursion if a work task tries to offline the cpu
    it's running on.  We need to schedule the task to a specific cpu in this
    case.

    http://bugzilla.kernel.org/show_bug.cgi?id=10897

    [oleg@tv-sign.ru: cleanups]
    Signed-off-by: Zhang Rui
    Tested-by: Rus
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 0d83304c7e7bd3b05be90281b3a47841bc8f057a
Author: Akinobu Mita
Date:   Wed Jul 23 21:28:38 2008 -0700

    pm: hibernation: simplify memory bitmap

    This patch simplifies the memory bitmap manipulations.

    - remove the member size in struct bm_block

      It is not necessary for struct bm_block to have the number of bit
      chunks that can be calculated by using end_pfn and start_pfn.

    - use find_next_bit() for memory_bm_next_pfn

      No need to invent the bitmap library only for the memory bitmap.

    Signed-off-by: Akinobu Mita
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 8111d1b552349921aae1acf73e4e8cea98e80970
Author: Alan Stern
Date:   Wed Jul 23 21:28:37 2008 -0700

    pm: add new PM_EVENT codes for runtime power transitions

    This patch (as1112) adds some new PM_EVENT_* codes for use by kernel
    subsystems.  They describe runtime power-state transitions of the sort
    already implemented by the USB subsystem.

    Signed-off-by: Alan Stern
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 8c363265d57d755e62053e9f69a1f2164e83f7ea
Author: Rafael J. Wysocki
Date:   Wed Jul 23 21:28:37 2008 -0700

    pm: drop unnecessary includes from pm.h

    Drop unnecessary includes from include/linux/pm.h .

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 40b4ac33b4d1bdd5cbeb2241be2399c550fa3696
Author: Rafael J. Wysocki
Date:   Wed Jul 23 21:28:36 2008 -0700

    pm: remove obsolete piece of PM documentation

    Remove some obsolete PM documentation.

    The majority of contents of Documentation/power/pm.txt are outdated.
    Remove the outdated parts of this file and move the rest to
    Documentation/power/apm-acpi.txt .  Update the index in
    Documentation/power/ as appropriate.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Acked-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit e7ecb331e11d1f7aa66aeef9170fc20781c9bb55
Author: Rafael J. Wysocki
Date:   Wed Jul 23 21:28:35 2008 -0700

    pm: remove remaining obsolete definitions from pm.h

    Remove the remaining obsolete definitions from include/linux/pm.h and
    move the definitions of PM_SUSPEND and PM_RESUME to the header of h3600
    which is the only user of them.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 558481f038e587b22d02167af58914c814ce9de5
Author: Rafael J. Wysocki
Date:   Wed Jul 23 21:28:35 2008 -0700

    pm: remove definition of struct pm_dev

    Remove the definition of 'struct pm_dev', which is not used any more,
    along with some related stuff from include/linux/pm.h .

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit d75f65fd247fe85d90a3880d143b1bb22fe13a48
Author: Adrian Bunk
Date:   Wed Jul 23 21:28:34 2008 -0700

    remove include/linux/pm_legacy.h

    Remove the obsolete and no longer used include/linux/pm_legacy.h

    Reviewed-by: Robert P. J. Day
    Signed-off-by: Adrian Bunk
    Cc: Pavel Machek
    Acked-by: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 77437fd4e61f87cc94d9314baa5cbf50e3ccdf54
Author: David Brownell
Date:   Wed Jul 23 21:28:33 2008 -0700

    pm: boot time suspend selftest

    Boot-time test for system suspend states (STR or standby).  The generic
    RTC framework triggers wakeup alarms, which are used to exit those
    states.

     - Measures some aspects of suspend time ... this uses "jiffies" until
       someone converts it to use a timebase that works properly even while
       timer IRQs are disabled.

     - Triggered by a command line parameter.
       By default nothing even vaguely troublesome will happen, but
       "test_suspend=mem" will give you a brief STR test during system boot.
       (Or you may need to use "test_suspend=standby" instead, if your
       hardware needs that.)

    This isn't without problems.  It fires early enough during boot that for
    example both PCMCIA and MMC stacks have misbehaved.  The workaround in
    those cases was to boot without such media cards inserted.

    [matthltc@us.ibm.com: fix compile failure in boot time suspend selftest]
    Signed-off-by: David Brownell
    Cc: Ingo Molnar
    Cc: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Matt Helsley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit 0d63081d418c73cc187c893069e0f24c4c6eecd3
Author: Pavel Machek
Date:   Wed Jul 23 21:28:32 2008 -0700

    swsusp: provide users with a hint about the no_console_suspend option

    Tell the user about the no_console_suspend option, so that we don't have
    to tell each bug reporter personally.

    [akpm@linux-foundation.org: clarify the text a little]
    Signed-off-by: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

commit fb9ba4e95