aboutsummaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2006-06-27[PATCH] spin/rwlock init cleanupsIngo Molnar14-18/+18
locking init cleanups: - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK() - convert rwlocks in a similar manner this patch was generated automatically. Motivation: - cleanliness - lockdep needs control of lock initialization, which the open-coded variants do not give - it's also useful for -rt and for lock debugging in general Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds2-2/+2
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: typo fixes Clean up 'inline is not at beginning' warnings for usb storage Storage class should be first i386: Trivial typo fixes ixj: make ixj_set_tone_off() static spelling fixes fix paniced->panicked typos Spelling fixes for Documentation/atomic_ops.txt move acknowledgment for Mark Adler to CREDITS remove the bouncing email address of David Campbell
2006-06-26Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds26-459/+706
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits) [IOAT]: Do not dereference THIS_MODULE directly to set unsafe. [NETROM]: Fix possible null pointer dereference. [NET] netpoll: break recursive loop in netpoll rx path [NET] netpoll: don't spin forever sending to stopped queues [IRDA]: add some IBM think pads [ATM]: atm/mpc.c warning fix [NET]: skb_find_text ignores to argument [NET]: make net/core/dev.c:netdev_nit static [NET]: Fix GSO problems in dev_hard_start_xmit() [NET]: Fix CHECKSUM_HW GSO problems. [TIPC]: Fix incorrect correction to discovery timer frequency computation. [TIPC]: Get rid of dynamically allocated arrays in broadcast code. [TIPC]: Fixed link switchover bugs [TIPC]: Enhanced & cleaned up system messages; fixed 2 obscure memory leaks. [TIPC]: First phase of assert() cleanup [TIPC]: Disallow config operations that aren't supported in certain modes. [TIPC]: Fixed memory leak in tipc_link_send() when destination is unreachable [TIPC]: Added missing warning for out-of-memory condition [TIPC]: Withdrawing all names from nameless port now returns success, not error [TIPC]: Optimized argument validation done by connect(). ...
2006-06-26[PATCH] net/rxrpc: use list_move()Akinobu Mita3-6/+3
This patch converts the combination of list_del(A) and list_add(A, B) to list_move(A, B) under net/rxrpc. Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26spelling fixesAndreas Mohr2-2/+2
acquired (aquired) contiguous (contigious) successful (succesful, succesfull) surprise (suprise) whether (weather) some other misspellings Signed-off-by: Andreas Mohr <andi@lisas.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-26[NETROM]: Fix possible null pointer dereference.Ralf Baechle1-4/+8
If in nr_link_failed the neighbour list is non-empty but the node list is empty we'll end dereferencing a in a NULL pointer. This fixes coverity 362. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-26[NET] netpoll: break recursive loop in netpoll rx pathNeil Horman1-2/+24
The netpoll system currently has a rx to tx path via: netpoll_rx __netpoll_rx arp_reply netpoll_send_skb dev->hard_start_tx This rx->tx loop places network drivers at risk of inadvertently causing a deadlock or BUG halt by recursively trying to acquire a spinlock that is used in both their rx and tx paths (this problem was origionally reported to me in the 3c59x driver, which shares a spinlock between the boomerang_interrupt and boomerang_start_xmit routines). This patch breaks this loop, by queueing arp frames, so that they can be responded to after all receive operations have been completed. Tested by myself and the reported with successful results. Specifically it was tested with netdump. Heres the BZ with details: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=194055 Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-26[NET] netpoll: don't spin forever sending to stopped queuesJeremy Fitzhardinge1-7/+3
When transmitting a skb in netpoll_send_skb(), only retry a limited number of times if the device queue is stopped. Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-26[ATM]: atm/mpc.c warning fixAndrew Morton1-2/+1
net/atm/mpc.c: In function 'MPOA_res_reply_rcvd': net/atm/mpc.c:1116: warning: unused variable 'ip' Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-26[NET]: skb_find_text ignores to argumentPhil Oester1-1/+4
skb_find_text takes a "to" argument which is supposed to limit how far into the skb it will search for the given text. At present, it seems to ignore that argument on the first skb, and instead return a match even if the text occurs beyond the limit. Patch below fixes this, after adjusting for the "from" starting point. This consequently fixes the netfilter string match's "--to" handling, which currently is broken. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[NET]: make net/core/dev.c:netdev_nit staticAdrian Bunk1-1/+1
netdev_nit can now become static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[NET]: Fix GSO problems in dev_hard_start_xmit()Michael Chan1-0/+3
Fix 2 problems in dev_hard_start_xmit(): 1. nskb->next needs to link back to skb->next if hard_start_xmit() returns non-zero. 2. Since the total number of GSO fragments may exceed MAX_SKB_FRAGS + 1, it needs to stop transmitting if the netif_queue is stopped. Signed-off-by: Michael Chan <mchan@broadcom.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[NET]: Fix CHECKSUM_HW GSO problems.Herbert Xu1-11/+11
Fix checksum problems in the GSO code path for CHECKSUM_HW packets. The ipv4 TCP pseudo header checksum has to be adjusted for GSO segmented packets. The adjustment is needed because the length field in the pseudo-header changes. However, because we have the inequality oldlen > newlen, we know that delta = (u16)~oldlen + newlen is still a 16-bit quantity. This also means that htonl(delta) + th->check still fits in 32 bits. Therefore we don't have to use csum_add on this operations. This is based on a patch by Michael Chan <mchan@broadcom.com>. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Fix incorrect correction to discovery timer frequency computation.Allan Stephens1-3/+3
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Get rid of dynamically allocated arrays in broadcast code.Allan Stephens1-24/+17
This change improves an earlier change which replaced the large local variable arrays used during broadcasting with dynamically allocated arrays. The temporary arrays are now incoprorated into the multicast link data structure. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Fixed link switchover bugsAllan Stephens3-8/+31
Incorporates several related fixes: - switchover now occurs when switching from an active link to a standby link - failure of a standby link no longer initiates switchover - links now display correct # of received packtes following reactivation Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Enhanced & cleaned up system messages; fixed 2 obscure memory leaks.Allan Stephens13-126/+149
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: First phase of assert() cleanupAllan Stephens7-109/+153
This also contains enhancements to simplify comparisons in name table publication removal algorithm and to simplify name table sanity checking when shutting down TIPC. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Disallow config operations that aren't supported in certain modes.Allan Stephens1-45/+38
This change provides user-friendly feedback when TIPC is unable to perform certain configuration operations that don't work properly in certain modes. (In particular, any reconfiguration request that would temporarily take TIPC from network mode to standalone mode, or from standalone mode to not running mode, is disallowed.) Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Fixed memory leak in tipc_link_send() when destination is unreachableAllan Stephens1-1/+5
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Added missing warning for out-of-memory conditionAllan Stephens1-0/+1
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Withdrawing all names from nameless port now returns success, not errorAllan Stephens1-3/+0
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Optimized argument validation done by connect().Allan Stephens1-4/+13
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Simplify code for returning partial success of stream send request.Allan Stephens1-2/+2
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: recvmsg() now returns TIPC ancillary data using correct level (SOL_TIPC)Allan Stephens1-3/+3
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Improved performance of error checking during socket creation.Allan Stephens1-6/+3
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Stream socket send indicates partial success if data partially sent.Allan Stephens1-7/+11
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Connected send now checks socket state when retrying congested send.Allan Stephens1-8/+8
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Can now return destination name of form {0,x,y} via ancillary data.Allan Stephens1-2/+6
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Implied connect now saves dest name for retrieval as ancillary data.Allan Stephens1-4/+4
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Fixed connect() to detect a dest address that is missing or too short.Allan Stephens1-1/+1
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Non-operation-affecting corrections to comments & function definitions.Allan Stephens1-5/+7
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Validate entire interface name when locating bearer to enable.Allan Stephens1-3/+2
This fix prevents a bearer from being enabled using the wrong interface. For example, specifying "eth:eth14" might enable "eth:eth1" by mistake. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com>
2006-06-25[TIPC]: Added support for MODULE_VERSION capability.Allan Stephens1-2/+3
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Fix misleading comment in buf_discard() routine.Allan Stephens1-1/+1
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Fixed privilege checking typo in dest_name_check().Allan Stephens1-1/+1
This patch originated by Stephane Ouellette <ouellettes@videotron.ca>. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC] Fix for NULL pointer dereferenceEric Sesterhenn1-1/+3
This fixes a bug spotted by the coverity checker, bug id #366. If (mod(seqno - prev) != 1) we set buf to NULL, dereference it in the for case, and set it to whatever value happes to be at adress 0+next, if it happens to be non-zero, we even stay in the loop. It seems that the author intended to break there. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Allow compilation when CONFIG_TIPC_DEBUG is not set.Allan Stephens1-5/+14
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Multicast link failure now resets all links to "nacking" node.Allan Stephens2-28/+128
This fix prevents node from crashing. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Links now validate destination node specified by incoming messages.Allan Stephens1-0/+5
This fix prevents link flopping and name table inconsistency problems arising when a node is assigned a different <Z.C.N> value than it used previously. (Changing the <Z.C.N> value causes other nodes to have two link endpoints sending to the same MAC address using two different destination <Z.C.N> values, requiring the receiving node to filter out the unwanted messages.) Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Allow ports to receive multicast messages through native API.Allan Stephens1-10/+16
This fix prevents a kernel panic if an application mistakenly sends a multicast message to TIPC's topology service or configuration service. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Corrected potential misuse of tipc_media_addr structure.Allan Stephens1-1/+3
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Use correct upper bound when validating network zone number.Allan Stephens1-1/+1
Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC]: Prevent name table corruption if no room for new publicationAllan Stephens1-9/+9
Now exits cleanly if attempt to allocate larger array of subsequences fails, without losing track of pointer to existing array. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25[TIPC] Improved tolerance to promiscuous mode interfaceJon Maloy1-9/+11
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25Merge git://git.linux-nfs.org/pub/linux/nfs-2.6Linus Torvalds6-7/+41
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (51 commits) nfs: remove nfs_put_link() nfs-build-fix-99 git-nfs-build-fixes Merge branch 'odirect' NFS: alloc nfs_read/write_data as direct I/O is scheduled NFS: Eliminate nfs_get_user_pages() NFS: refactor nfs_direct_free_user_pages NFS: remove user_addr, user_count, and pos from nfs_direct_req NFS: "open code" the NFS direct write rescheduler NFS: Separate functions for counting outstanding NFS direct I/Os NLM: Fix reclaim races NLM: sem to mutex conversion locks.c: add the fl_owner to nlm_compare_locks NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts NFS: Split fs/nfs/inode.c NFS: Fix typo in nfs_do_clone_mount() NFS: Fix compile errors introduced by referrals patches NFSv4: Ensure that referral mounts bind to a reserved port NFSv4: A root pathname is sent as a zero component4 NFSv4: Follow a referral ...
2006-06-25[PATCH] Define __raw_get_cpu_var and use itPaul Mackerras1-1/+1
There are several instances of per_cpu(foo, raw_smp_processor_id()), which is semantically equivalent to __get_cpu_var(foo) but without the warning that smp_processor_id() can give if CONFIG_DEBUG_PREEMPT is enabled. For those architectures with optimized per-cpu implementations, namely ia64, powerpc, s390, sparc64 and x86_64, per_cpu() turns into more and slower code than __get_cpu_var(), so it would be preferable to use __get_cpu_var on those platforms. This defines a __raw_get_cpu_var(x) macro which turns into per_cpu(x, raw_smp_processor_id()) on architectures that use the generic per-cpu implementation, and turns into __get_cpu_var(x) on the architectures that have an optimized per-cpu implementation. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] remove for_each_cpu()Andrew Morton1-1/+1
Convert a few stragglers over to for_each_possible_cpu(), remove for_each_cpu(). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-24Merge branch 'master' of /home/trondmy/kernel/linux-2.6/Trond Myklebust25-122/+635
Conflicts: fs/nfs/inode.c fs/super.c Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch 'VFS: Permit filesystem to override root dentry on mount'
2006-06-23Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds16-85/+602
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: Require CAP_NET_ADMIN to create tuntap devices. [NET]: fix net-core kernel-doc [TCP]: Move inclusion of <linux/dmaengine.h> to correct place in <linux/tcp.h> [IPSEC]: Handle GSO packets [NET]: Added GSO toggle [NET]: Add software TSOv4 [NET]: Add generic segmentation offload [NET]: Merge TSO/UFO fields in sk_buff [NET]: Prevent transmission after dev_deactivate [IPV6] ADDRCONF: Fix default source address selection without CONFIG_IPV6_PRIVACY [IPV6]: Fix source address selection. [NET]: Avoid allocating skb in skb_pad
2006-06-23[PATCH] list: use list_replace_init() instead of list_splice_init()Oleg Nesterov2-6/+5
list_splice_init(list, head) does unneeded job if it is known that list_empty(head) == 1. We can use list_replace_init() instead. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] clean up default value of IP_DCCP_ACKVECJean-Luc Leger1-1/+1
Default values for boolean and tristate options can only be 'y', 'm' or 'n'. This patch removes wrong default for IP_DCCP_ACKVEC. Signed-off-by: Jean-Luc Leger <jean-luc.leger@dspnet.fr.eu.org> Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] VFS: Permit filesystem to override root dentry on mountDavid Howells2-6/+7
Extend the get_sb() filesystem operation to take an extra argument that permits the VFS to pass in the target vfsmount that defines the mountpoint. The filesystem is then required to manually set the superblock and root dentry pointers. For most filesystems, this should be done with simple_set_mnt() which will set the superblock pointer and then set the root dentry to the superblock's s_root (as per the old default behaviour). The get_sb() op now returns an integer as there's now no need to return the superblock pointer. This patch permits a superblock to be implicitly shared amongst several mount points, such as can be done with NFS to avoid potential inode aliasing. In such a case, simple_set_mnt() would not be called, and instead the mnt_root and mnt_sb would be set directly. The patch also makes the following changes: (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount pointer argument and return an integer, so most filesystems have to change very little. (*) If one of the convenience function is not used, then get_sb() should normally call simple_set_mnt() to instantiate the vfsmount. This will always return 0, and so can be tail-called from get_sb(). (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the dcache upon superblock destruction rather than shrink_dcache_anon(). This is required because the superblock may now have multiple trees that aren't actually bound to s_root, but that still need to be cleaned up. The currently called functions assume that the whole tree is rooted at s_root, and that anonymous dentries are not the roots of trees which results in dentries being left unculled. However, with the way NFS superblock sharing are currently set to be implemented, these assumptions are violated: the root of the filesystem is simply a dummy dentry and inode (the real inode for '/' may well be inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries with child trees. [*] Anonymous until discovered from another tree. (*) The documentation has been adjusted, including the additional bit of changing ext2_* into foo_* in the documentation. [akpm@osdl.org: convert ipath_fs, do other stuff] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[NET]: fix net-core kernel-docRandy Dunlap1-2/+2
Warning(/var/linsrc/linux-2617-g4//include/linux/skbuff.h:304): No description found for parameter 'dma_cookie' Warning(/var/linsrc/linux-2617-g4//include/net/sock.h:1274): No description found for parameter 'copied_early' Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'chan' Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'event' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[IPSEC]: Handle GSO packetsHerbert Xu2-10/+83
This patch segments GSO packets received by the IPsec stack. This can happen when a NIC driver injects GSO packets into the stack which are then forwarded to another host. The primary application of this is going to be Xen where its backend driver may inject GSO packets into dom0. Of course this also can be used by other virtualisation schemes such as VMWare or UML since the tap device could be modified to inject GSO packets received through splice. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[NET]: Added GSO toggleHerbert Xu2-6/+40
This patch adds a generic segmentation offload toggle that can be turned on/off for each net device. For now it only supports in TCPv4. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[NET]: Add software TSOv4Herbert Xu3-0/+239
This patch adds the GSO implementation for IPv4 TCP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[NET]: Add generic segmentation offloadHerbert Xu2-10/+136
This patch adds the infrastructure for generic segmentation offload. The idea is to tap into the potential savings of TSO without hardware support by postponing the allocation of segmented skb's until just before the entry point into the NIC driver. The same structure can be used to support software IPv6 TSO, as well as UFO and segmentation offload for other relevant protocols, e.g., DCCP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[NET]: Merge TSO/UFO fields in sk_buffHerbert Xu8-42/+56
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not going to scale if we add any more segmentation methods (e.g., DCCP). So let's merge them. They were used to tell the protocol of a packet. This function has been subsumed by the new gso_type field. This is essentially a set of netdev feature bits (shifted by 16 bits) that are required to process a specific skb. As such it's easy to tell whether a given device can process a GSO skb: you just have to and the gso_type field and the netdev's features field. I've made gso_type a conjunction. The idea is that you have a base type (e.g., SKB_GSO_TCPV4) that can be modified further to support new features. For example, if we add a hardware TSO type that supports ECN, they would declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO packets would be SKB_GSO_TCPV4. This means that only the CWR packets need to be emulated in software. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[NET]: Prevent transmission after dev_deactivateHerbert Xu2-6/+12
The dev_deactivate function has bit-rotted since the introduction of lockless drivers. In particular, the spin_unlock_wait call at the end has no effect on the xmit routine of lockless drivers. With a little bit of work, we can make it much more useful by providing the guarantee that when it returns, no more calls to the xmit routine of the underlying driver will be made. The idea is simple. There are two entry points in to the xmit routine. The first comes from dev_queue_xmit. That one is easily stopped by using synchronize_rcu. This works because we set the qdisc to noop_qdisc before the synchronize_rcu call. That in turn causes all subsequent packets sent to dev_queue_xmit to be dropped. The synchronize_rcu call also ensures all outstanding calls leave their critical section. The other entry point is from qdisc_run. Since we now have a bit that indicates whether it's running, all we have to do is to wait until the bit is off. I've removed the loop to wait for __LINK_STATE_SCHED to clear. This is useless because netif_wake_queue can cause it to be set again. It is also harmless because we've disarmed qdisc_run. I've also removed the spin_unlock_wait on xmit_lock because its only purpose of making sure that all outstanding xmit_lock holders have exited is also given by dev_watchdog_down. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[IPV6] ADDRCONF: Fix default source address selection without ↵YOSHIFUJI Hideaki1-0/+3
CONFIG_IPV6_PRIVACY We need to update hiscore.rule even if we don't enable CONFIG_IPV6_PRIVACY, because we have more less significant rule; longest match. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[IPV6]: Fix source address selection.Łukasz Stelmach1-0/+6
Two additional labels (RFC 3484, sec. 10.3) for IPv6 addreses are defined to make a distinction between global unicast addresses and Unique Local Addresses (fc00::/7, RFC 4193) and Teredo (2001::/32, RFC 4380). It is necessary to avoid attempts of connection that would either fail (eg. fec0:: to 2001:feed::) or be sub-optimal (2001:0:: to 2001:feed::). Signed-off-by: Łukasz Stelmach <stlman@poczta.fm> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23[NET]: Avoid allocating skb in skb_padHerbert Xu1-10/+26
First of all it is unnecessary to allocate a new skb in skb_pad since the existing one is not shared. More importantly, our hard_start_xmit interface does not allow a new skb to be allocated since that breaks requeueing. This patch uses pskb_expand_head to expand the existing skb and linearize it if needed. Actually, someone should sift through every instance of skb_pad on a non-linear skb as they do not fit the reasons why this was originally created. Incidentally, this fixes a minor bug when the skb is cloned (tcpdump, TCP, etc.). As it is skb_pad will simply write over a cloned skb. Because of the position of the write it is unlikely to cause problems but still it's best if we don't do it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-22Merge branch 'master' into upstreamJeff Garzik5-23/+19
2006-06-20Merge branch 'master' of /home/trondmy/kernel/linux-2.6/Trond Myklebust147-2077/+5674
2006-06-20[ATM]: fix broken uses of NIPQUAD in net/atmAl Viro2-18/+7
NIPQUAD expects an l-value of type __be32, _NOT_ a pointer to __be32. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20[SCTP]: sctp_unpack_cookie() fixAl Viro1-2/+2
sizeof(pointer) != sizeof(array)... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20Merge branch 'upstream' of ↵Jeff Garzik1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream
2006-06-19[NET]: Prevent multiple qdisc runsHerbert Xu1-2/+9
Having two or more qdisc_run's contend against each other is bad because it can induce packet reordering if the packets have to be requeued. It appears that this is an unintended consequence of relinquinshing the queue lock while transmitting. That in turn is needed for devices that spend a lot of time in their transmit routine. There are no advantages to be had as devices with queues are inherently single-threaded (the loopback device is not but then it doesn't have a queue). Even if you were to add a queue to a parallel virtual device (e.g., bolt a tbf filter in front of an ipip tunnel device), you would still want to process the queue in sequence to ensure that the packets are ordered correctly. The solution here is to steal a bit from net_device to prevent this. BTW, as qdisc_restart is no longer used by anyone as a module inside the kernel (IIRC it used to with netif_wake_queue), I have not exported the new __qdisc_run function. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-19[NETFILTER]: xt_sctp: fix endless loop caused by 0 chunk lengthPatrick McHardy1-1/+1
Fix endless loop in the SCTP match similar to those already fixed in the SCTP conntrack helper (was CVE-2006-1527). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-19Merge branch 'for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (46 commits) IB/uverbs: Don't serialize with ib_uverbs_idr_mutex IB/mthca: Make all device methods truly reentrant IB/mthca: Fix memory leak on modify_qp error paths IB/uverbs: Factor out common idr code IB/uverbs: Don't decrement usecnt on error paths IB/uverbs: Release lock on error path IB/cm: Use address handle helpers IB/sa: Add ib_init_ah_from_path() IB: Add ib_init_ah_from_wc() IB/ucm: Get rid of duplicate P_Key parameter IB/srp: Factor out common request reset code IB/srp: Support SRP rev. 10 targets [SCSI] srp.h: Add I/O Class values IB/fmr: Use device's max_map_map_per_fmr attribute in FMR pool. IB/mthca: Fill in max_map_per_fmr device attribute IB/ipath: Add client reregister event generation IB/mthca: Add client reregister event generation IB: Move struct port_info from ipath to <rdma/ib_smi.h> IPoIB: Handle client reregister events IB: Add client reregister event type ...
2006-06-19Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds132-1846/+5268
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (109 commits) [ETHTOOL]: Fix UFO typo [SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer. [SCTP]: Send only 1 window update SACK per message. [SCTP]: Don't do CRC32C checksum over loopback. [SCTP] Reset rtt_in_progress for the chunk when processing its sack. [SCTP]: Reject sctp packets with broadcast addresses. [SCTP]: Limit association max_retrans setting in setsockopt. [PFKEYV2]: Fix inconsistent typing in struct sadb_x_kmprivate. [IPV6]: Sum real space for RTAs. [IRDA]: Use put_unaligned() in irlmp_do_discovery(). [BRIDGE]: Add support for NETIF_F_HW_CSUM devices [NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM [TG3]: Convert to non-LLTX [TG3]: Remove unnecessary tx_lock [TCP]: Add tcp_slow_start_after_idle sysctl. [BNX2]: Update version and reldate [BNX2]: Use CPU native page size [BNX2]: Use compressed firmware [BNX2]: Add firmware decompression [BNX2]: Allow WoL settings on new 5708 chips ... Manual fixup for conflict in drivers/net/tulip/winbond-840.c
2006-06-17[ETHTOOL]: Fix UFO typoHerbert Xu1-1/+2
The function ethtool_get_ufo was referring to ETHTOOL_GTSO instead of ETHTOOL_GUFO. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.Neil Horman1-1/+9
In the event that our entire receive buffer is full with a series of chunks that represent a single gap-ack, and then we accept a chunk (or chunks) that fill in the gap between the ctsn and the first gap, we renege chunks from the end of the buffer, which effectively does nothing but move our gap to the end of our received tsn stream. This does little but move our missing tsns down stream a little, and, if the sender is sending sufficiently large retransmit frames, the result is a perpetual slowdown which can never be recovered from, since the only chunk that can be accepted to allow progress in the tsn stream necessitates that a new gap be created to make room for it. This leads to a constant need for retransmits, and subsequent receiver stalls. The fix I've come up with is to deliver the frame without reneging if we have a full receive buffer and the receiving sockets sk_receive_queue is empty(indicating that the receive buffer is being blocked by a missing tsn). Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[SCTP]: Send only 1 window update SACK per message.Tsutomu Fujii1-2/+28
Right now, every time we increase our rwnd by more then MTU bytes, we trigger a SACK. When processing large messages, this will generate a SACK for almost every other SCTP fragment. However since we are freeing the entire message at the same time, we might as well collapse the SACK generation to 1. Signed-off-by: Tsutomu Fujii <t-fujii@nb.jp.nec.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[SCTP]: Don't do CRC32C checksum over loopback.Sridhar Samudrala2-22/+29
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[SCTP] Reset rtt_in_progress for the chunk when processing its sack.Vlad Yasevich1-0/+1
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[SCTP]: Reject sctp packets with broadcast addresses.Vlad Yasevich4-5/+14
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[SCTP]: Limit association max_retrans setting in setsockopt.Vlad Yasevich1-1/+25
When using ASSOCINFO socket option, we need to limit the number of maximum association retransmissions to be no greater than the sum of all the path retransmissions. This is specified in Section 7.1.2 of the SCTP socket API draft. However, we only do this if the association has multiple paths. If there is only one path, the protocol stack will use the assoc_max_retrans setting when trying to retransmit packets. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[IPV6]: Sum real space for RTAs.YOSHIFUJI Hideaki1-4/+24
This patch fixes RTNLGRP_IPV6_IFINFO netlink notifications. Issue pointed out by Patrick McHardy <kaber@trash.net>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[IRDA]: Use put_unaligned() in irlmp_do_discovery().David S. Miller1-1/+5
irda_device_info->hints[] is byte aligned but is being accessed as a u16 Based upon a patch by Luke Yang <luke.adi@gmail.com>. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[BRIDGE]: Add support for NETIF_F_HW_CSUM devicesHerbert Xu2-6/+12
As it is the bridge will only ever declare NETIF_F_IP_CSUM even if all its constituent devices support NETIF_F_HW_CSUM. This patch fixes this by supporting the first one out of NETIF_F_NO_CSUM, NETIF_F_HW_CSUM, and NETIF_F_IP_CSUM that is supported by all constituent devices. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUMHerbert Xu5-19/+8
The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM identically so we test for them in quite a few places. For the sake of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two. We also test the disjunct of NETIF_F_IP_CSUM and the other two in various places, for that purpose I've added NETIF_F_ALL_CSUM. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP]: Add tcp_slow_start_after_idle sysctl.David S. Miller2-1/+13
A lot of people have asked for a way to disable tcp_cwnd_restart(), and it seems reasonable to add a sysctl to do that. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP] Westwood: reset RTT min after FRTOLuca De Cicco1-2/+16
RTT_min is updated each time a timeout event occurs in order to cope with hard handovers in wireless scenarios such as UMTS. Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP] Westwood: bandwidth filter startupLuca De Cicco1-3/+9
The bandwidth estimate filter is now initialized with the first sample in order to have better performances in the case of small file transfers. Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP] Westwood: comment fixesLuca De Cicco1-4/+21
Cleanup some comments and add more references Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP] Westwood: fix first sampleStephen Hemminger1-1/+12
Need to update send sequence number tracking after first ack. Rework of patch from Luca De Cicco. Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NET]: net.ipv4.ip_autoconfig sysctl removalStephen Hemminger1-8/+0
The sysctl net.ipv4.ip_autoconfig is a legacy value that is not used. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[IPX]: Endian bug in ipxrtr_route_packet()Alexey Dobriyan1-1/+1
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NET]: Warn in __skb_trim if skb is pagedHerbert Xu1-5/+2
It's better to warn and fail rather than rarely triggering BUG on paths that incorrectly call skb_trim/__skb_trim on a non-linear skb. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NET]: skb_trim auditHerbert Xu2-18/+6
I found a few more spots where pskb_trim_rcsum could be used but were not. This patch changes them to use it. Also, sk_filter can get paged skb data. Therefore we must use pskb_trim instead of skb_trim. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NET]: Clean up skb_linearizeHerbert Xu5-81/+10
The linearisation operation doesn't need to be super-optimised. So we can replace __skb_linearize with __pskb_pull_tail which does the same thing but is more general. Also, most users of skb_linearize end up testing whether the skb is linear or not so it helps to make skb_linearize do just that. Some callers of skb_linearize also use it to copy cloned data, so it's useful to have a new function skb_linearize_cow to copy the data if it's either non-linear or cloned. Last but not least, I've removed the gfp argument since nobody uses it anymore. If it's ever needed we can easily add it back. Misc bugs fixed by this patch: * via-velocity error handling (also, no SG => no frags) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NET]: Add netif_tx_lockHerbert Xu7-53/+41
Various drivers use xmit_lock internally to synchronise with their transmission routines. They do so without setting xmit_lock_owner. This is fine as long as netpoll is not in use. With netpoll it is possible for deadlocks to occur if xmit_lock_owner isn't set. This is because if a printk occurs while xmit_lock is held and xmit_lock_owner is not set can cause netpoll to attempt to take xmit_lock recursively. While it is possible to resolve this by getting netpoll to use trylock, it is suboptimal because netpoll's sole objective is to maximise the chance of getting the printk out on the wire. So delaying or dropping the message is to be avoided as much as possible. So the only alternative is to always set xmit_lock_owner. The following patch does this by introducing the netif_tx_lock family of functions that take care of setting/unsetting xmit_lock_owner. I renamed xmit_lock to _xmit_lock to indicate that it should not be used directly. I didn't provide irq versions of the netif_tx_lock functions since xmit_lock is meant to be a BH-disabling lock. This is pretty much a straight text substitution except for a small bug fix in winbond. It currently uses netif_stop_queue/spin_unlock_wait to stop transmission. This is unsafe as an IRQ can potentially wake up the queue. So it is safer to use netif_tx_disable. The hamradio bits used spin_lock_irq but it is unnecessary as xmit_lock must never be taken in an IRQ handler. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: hashlimit match: fix random initializationPatrick McHardy1-2/+5
hashlimit does: if (!ht->rnd) get_random_bytes(&ht->rnd, 4); ignoring that 0 is also a valid random number. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: recent match: missing refcnt initializationPatrick McHardy1-0/+1
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: recent match: fix "sleeping function called from invalid context"Patrick McHardy1-5/+10
create_proc_entry must not be called with locks held. Use a mutex instead to protect data only changed in user context. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[SECMARK]: Add CONNSECMARK xtables targetJames Morris3-0/+167
Add a new xtables target, CONNSECMARK, which is used to specify rules for copying security marks from packets to connections, and for copyying security marks back from connections to packets. This is similar to the CONNMARK target, but is more limited in scope in that it only allows copying of security marks to and from packets, as this is all it needs to do. A typical scenario would be to apply a security mark to a 'new' packet with SECMARK, then copy that to its conntrack via CONNMARK, and then restore the security mark from the connection to established and related packets on that connection. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[SECMARK]: Add secmark support to conntrackJames Morris6-0/+40
Add a secmark field to IP and NF conntracks, so that security markings on packets can be copied to their associated connections, and also copied back to packets as required. This is similar to the network mark field currently used with conntrack, although it is intended for enforcement of security policy rather than network policy. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[SECMARK]: Add xtables SECMARK targetJames Morris3-0/+166
Add a SECMARK target to xtables, allowing the admin to apply security marks to packets via both iptables and ip6tables. The target currently handles SELinux security marking, but can be extended for other purposes as needed. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[SECMARK]: Add secmark support to core networking.James Morris5-1/+12
Add a secmark field to the skbuff structure, to allow security subsystems to place security markings on network packets. This is similar to the nfmark field, except is intended for implementing security policy, rather than than networking policy. This patch was already acked in principle by Dave Miller. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NET]: Fix warnings after LSM-IPSEC changes.David S. Miller1-2/+2
Assignment used as truth value in xfrm_del_sa() and xfrm_get_policy(). Wrong argument type declared for security_xfrm_state_delete() when SELINUX is disabled. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NET]: NET_TCPPROBE Kconfig fixDave Jones1-1/+1
Just spotted this typo in a new option. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[LSM-IPsec]: SELinux AuthorizeCatherine Zhang2-13/+23
This patch contains a fix for the previous patch that adds security contexts to IPsec policies and security associations. In the previous patch, no authorization (besides the check for write permissions to SAD and SPD) is required to delete IPsec policies and security assocations with security contexts. Thus a user authorized to change SAD and SPD can bypass the IPsec policy authorization by simply deleteing policies with security contexts. To fix this security hole, an additional authorization check is added for removing security policies and security associations with security contexts. Note that if no security context is supplied on add or present on policy to be deleted, the SELinux module allows the change unconditionally. The hook is called on deletion when no context is present, which we may want to change. At present, I left it up to the module. LSM changes: The patch adds two new LSM hooks: xfrm_policy_delete and xfrm_state_delete. The new hooks are necessary to authorize deletion of IPsec policies that have security contexts. The existing hooks xfrm_policy_free and xfrm_state_free lack the context to do the authorization, so I decided to split authorization of deletion and memory management of security data, as is typical in the LSM interface. Use: The new delete hooks are checked when xfrm_policy or xfrm_state are deleted by either the xfrm_user interface (xfrm_get_policy, xfrm_del_sa) or the pfkey interface (pfkey_spddelete, pfkey_delete). SELinux changes: The new policy_delete and state_delete functions are added. Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com> Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[IPV4] icmp: Kill local 'ip' arg in icmp_redirect().David S. Miller1-3/+2
It is typed wrong, and it's only assigned and used once. So just pass in iph->daddr directly which fixes both problems. Based upon a patch by Alexey Dobriyan. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[IPV4]: Right prototype of __raw_v4_lookup()Alexey Dobriyan1-1/+1
All users pass 32-bit values as addresses and internally they're compared with 32-bit entities. So, change "laddr" and "raddr" types to __be32. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[IPV4] igmp: Fixup struct ip_mc_list::multiaddr typeAlexey Dobriyan1-1/+1
All users except two expect 32-bit big-endian value. One is of ->multiaddr = ->multiaddr variety. And last one is "%08lX". Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP]: Fix compile warning in tcp_probe.cDavid S. Miller1-1/+3
The suseconds_t et al. are not necessarily any particular type on every platform, so cast to unsigned long so that we can use one printf format string and avoid warnings across the board Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP]: Limited slow start for Highspeed TCPStephen Hemminger1-3/+21
Implementation of RFC3742 limited slow start. Added as part of the TCP highspeed congestion control module. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP]: TCP Probe congestion window tracingStephen Hemminger3-0/+195
This adds a new module for tracking TCP state variables non-intrusively using kprobes. It has a simple /proc interface that outputs one line for each packet received. A sample usage is to collect congestion window and ssthresh over time graphs. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP]: Minimum congestion window consolidation.Stephen Hemminger8-46/+21
Many of the TCP congestion methods all just use ssthresh as the minimum congestion window on decrease. Rather than duplicating the code, just have that be the default if that handle in the ops structure is not set. Minor behaviour change to TCP compound. It probably wants to use this (ssthresh) as lower bound, rather than ssthresh/2 because the latter causes undershoot on loss. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP]: TCP Compound quad root functionStephen Hemminger1-24/+66
The original code did a 64 bit divide directly, which won't work on 32 bit platforms. Rather than doing a 64 bit square root twice, just implement a 4th root function in one pass using Newton's method. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP]: TCP Compound congestion controlAngelo P. Castellani3-0/+418
TCP Compound is a sender-side only change to TCP that uses a mixed Reno/Vegas approach to calculate the cwnd. For further details look here: ftp://ftp.research.microsoft.com/pub/tr/TR-2005-86.pdf Signed-off-by: Angelo P. Castellani <angelo.castellani@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP]: TCP Veno congestion controlBin Zhou3-0/+251
TCP Veno module is a new congestion control module to improve TCP performance over wireless networks. The key innovation in TCP Veno is the enhancement of TCP Reno/Sack congestion control algorithm by using the estimated state of a connection based on TCP Vegas. This scheme significantly reduces "blind" reduction of TCP window regardless of the cause of packet loss. This work is based on the research paper "TCP Veno: TCP Enhancement for Transmission over Wireless Access Networks." C. P. Fu, S. C. Liew, IEEE Journal on Selected Areas in Communication, Feb. 2003. Original paper and many latest research works on veno: http://www.ntu.edu.sg/home/ascpfu/veno/veno.html Signed-off-by: Bin Zhou <zhou0022@ntu.edu.sg> Cheng Peng Fu <ascpfu@ntu.edu.sg> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP]: TCP Low Priority congestion controlWong Hoi Sing Edison3-0/+349
TCP Low Priority is a distributed algorithm whose goal is to utilize only the excess network bandwidth as compared to the ``fair share`` of bandwidth as targeted by TCP. Available from: http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf Original Author: Aleksandar Kuzmanovic <akuzma@northwestern.edu> See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation. As of 2.6.13, Linux supports pluggable congestion control algorithms. Due to the limitation of the API, we take the following changes from the original TCP-LP implementation: o We use newReno in most core CA handling. Only add some checking within cong_avoid. o Error correcting in remote HZ, therefore remote HZ will be keeped on checking and updating. o Handling calculation of One-Way-Delay (OWD) within rtt_sample, sicne OWD have a similar meaning as RTT. Also correct the buggy formular. o Handle reaction for Early Congestion Indication (ECI) within pkts_acked, as mentioned within pseudo code. o OWD is handled in relative format, where local time stamp will in tcp_time_stamp format. Port from 2.4.19 to 2.6.16 as module by: Wong Hoi Sing Edison <hswong3i@gmail.com> Hung Hing Lun <hlhung3i@gmail.com> Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[LLC]: Fix double receive of SKB.Andrew Morton1-1/+0
Oops fix from Stephen: remove duplicate rcv() calls. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: PPTP helper: fixup gre_keymap_lookup() return typeAlexey Dobriyan1-3/+3
GRE keys are 16-bit wide. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: Add SIP connection tracking helperPatrick McHardy4-0/+740
Add SIP connection tracking helper. Originally written by Christian Hentschel <chentschel@arnet.com.ar>, some cleanup, minor fixes and bidirectional SIP support added by myself. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: H.323 helper: replace internal_net_addr parameter by ↵Patrick McHardy1-30/+27
routing-based heuristic Call Forwarding doesn't need to create an expectation if both peers can reach each other without our help. The internal_net_addr parameter lets the user explicitly specify a single network where this is true, but is not very flexible and even fails in the common case that calls will both be forwarded to outside parties and inside parties. Use an optional heuristic based on routing instead, the assumption is that if bpth the outgoing device and the gateway are equal, both peers can reach each other directly. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: H.323 helper: Add support for Call ForwardingJing Min Zhao4-7/+196
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: amanda helper: convert to textsearch infrastructurePatrick McHardy2-49/+96
When a port number within a packet is replaced by a differently sized number only the packet is resized, but not the copy of the data. Following port numbers are rewritten based on their offsets within the copy, leading to packet corruption. Convert the amanda helper to the textsearch infrastructure to avoid the copy entirely. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: FTP helper: search optimizationPatrick McHardy2-68/+86
Instead of skipping search entries for the wrong direction simply index them by direction. Based on patch by Pablo Neira <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: SNMP helper: fix debug module param typePatrick McHardy1-1/+1
debug is the debug level, not a bool. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: ctnetlink: change table dumping not to require an unique IDPatrick McHardy2-16/+48
Instead of using the ID to find out where to continue dumping, take a reference to the last entry dumped and try to continue there. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: ctnetlink: fix NAT configurationPatrick McHardy2-62/+44
The current configuration only allows to configure one manip and overloads conntrack status flags with netlink semantic. Signed-off-by: Patrick Mchardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: conntrack: add fixed timeout flag in connection trackingEric Leblond2-0/+12
Add a flag in a connection status to have a non updated timeout. This permits to have connection that automatically die at a given time. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: conntrack: add sysctl to disable checksummingPatrick McHardy9-8/+32
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: conntrack: don't call helpers for related ICMP messagesPatrick McHardy3-3/+3
None of the existing helpers expects to get called for related ICMP packets and some even drop them if they can't parse them. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: recent match: replace by rewritten versionPatrick McHardy1-891/+377
Replace the unmaintainable ipt_recent match by a rewritten version that should be fully compatible. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: x_tables: add statistic matchPatrick McHardy3-0/+119
Add statistic match which is a combination of the nth and random matches. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: x_tables: add quota matchPatrick McHardy3-0/+107
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: x_tables: add SCTP/DCCP support where missingPatrick McHardy3-65/+26
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NETFILTER]: x_tables: remove some unnecessary castsPatrick McHardy6-9/+6
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[IPSEC] proto: Move transport mode input path into xfrm_mode_transportHerbert Xu8-83/+59
Now that we have xfrm_mode objects we can move the transport mode specific input decapsulation code into xfrm_mode_transport. This removes duplicate code as well as unnecessary header movement in case of tunnel mode SAs since we will discard the original IP header immediately. This also fixes a minor bug for transport-mode ESP where the IP payload length is set to the correct value minus the header length (with extension headers for IPv6). Of course the other neat thing is that we no longer have to allocate temporary buffers to hold the IP headers for ESP and IPComp. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[IPSEC] xfrm: Abstract out encapsulation modesHerbert Xu15-170/+532
This patch adds the structure xfrm_mode. It is meant to represent the operations carried out by transport/tunnel modes. By doing this we allow additional encapsulation modes to be added without clogging up the xfrm_input/xfrm_output paths. Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and BEET modes. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[IPSEC] xfrm: Undo afinfo lock proliferationHerbert Xu6-44/+37
The number of locks used to manage afinfo structures can easily be reduced down to one each for policy and state respectively. This is based on the observation that the write locks are only held by module insertion/removal which are very rare events so there is no need to further differentiate between the insertion of modules like ipv6 versus esp6. The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look suspicious at first. However, after you realise that nobody ever takes the corresponding write lock you'll feel better :) As far as I can gather it's an attempt to guard against the removal of the corresponding modules. Since neither module can be unloaded at all we can leave it to whoever fixes up IPv6 unloading :) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[TCP]: tcp_rcv_rtt_measure_ts() call in pure-ACK path is superfluousDavid S. Miller1-2/+0
We only want to take receive RTT mesaurements for data bearing frames, here in the header prediction fast path for a pure-sender, we know that we have a pure-ACK and thus the checks in tcp_rcv_rtt_mesaure_ts() will not pass. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[BRIDGE]: netlink interface for link managementStephen Hemminger6-2/+214
Add basic netlink support to the Ethernet bridge. Including: * dump interfaces in bridges * monitor link status changes * change state of bridge port For some demo programs see: http://developer.osdl.org/shemminger/prototypes/brnl.tar.gz These are to allow building a daemon that does alternative implementations of Spanning Tree Protocol. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[BRIDGE]: fix module startup error handlingStephen Hemminger2-9/+22
Return address in use, if some other kernel code has the SAP. Propogate out error codes from netfilter registration and unwind. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[BRIDGE]: optimize conditional in forward pathStephen Hemminger1-8/+4
Small optimizations of bridge forwarding path. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[LLC]: add multicast support for datagramsStephen Hemminger1-8/+51
Allow mulitcast reception of datagrams (similar to UDP). All sockets bound to the same SAP receive a clone. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[LLC]: allow applications to get copy of kernel datagramsStephen Hemminger1-1/+3
It is legal for an application to bind to a SAP that is also being used by the kernel. This happens if the bridge module binds to the STP SAP, and the user wants to have a daemon for STP as well. It is possible to have kernel doing STP on one bridge, but let application do RSTP on another bridge. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[LLC]: use rcu_dereference on receive handlerStephen Hemminger1-2/+5
The receive hander pointer might be modified during network changes of protocol. So use rcu_dereference (only matters on alpha). Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[LLC]: allow datagram recvmsgStephen Hemminger1-2/+2
LLC receive is broken for SOCK_DGRAM. If an application does recv() on a datagram socket and there is no data present, don't return "not connected". Instead, just do normal datagram semantics. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[LLC]: use more efficient ether address routinesStephen Hemminger1-2/+0
Use more cache efficient Ethernet address manipulation functions in etherdevice.h. Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
2006-06-17[I/OAT]: TCP recv offload to I/OATChris Leech4-22/+185
Locks down user pages and sets up for DMA in tcp_recvmsg, then calls dma_async_try_early_copy in tcp_v4_do_rcv Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[I/OAT]: Add a sysctl for tuning the I/OAT offloaded I/O thresholdChris Leech2-0/+14
Any socket recv of less than this ammount will not be offloaded Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[I/OAT]: Make sk_eat_skb I/OAT aware.Chris Leech3-7/+7
Add an extra argument to sk_eat_skb, and make it move early copied packets to the async_wait_queue instead of freeing them. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[I/OAT]: Rename cleanup_rbuf to tcp_cleanup_rbuf and make non-staticChris Leech1-5/+5
Needed to be able to call tcp_cleanup_rbuf in tcp_input.c for I/OAT Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[I/OAT]: Structure changes for TCP recv offload to I/OATChris Leech1-0/+6
Adds an async_wait_queue and some additional fields to tcp_sock, and a dma_cookie_t to sk_buff. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[I/OAT]: Utility functions for offloading sk_buff to iovec copiesChris Leech2-0/+128
Provides for pinning user space pages in memory, copying to iovecs, and copying from sk_buffs including fragmented and chained sk_buffs. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[I/OAT]: Setup the networking subsystem as a DMA clientChris Leech1-0/+104
Attempts to allocate per-CPU DMA channels Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[NET]: Export ip_dev_find()Sean Hefty1-0/+1
Export ip_dev_find() to allow locating a net_device given an IP address. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-15[PATCH] wireless: correct dump of WPA IELarry Finger1-1/+1
In net/ieee80211/softmac/ieee80211softmac_wx.c, there is a bug that prints extended sign information whenever the byte value exceeds 0x7f. The following patch changes the printk to use a u8 cast to limit the output to 2 digits. This bug was first noticed by Dan Williams <dcbw@redhat.com>. This patch applies to the current master branch of the Linville tree. Signed-Off-By: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-13Merge branch 'master' into upstreamJeff Garzik3-3/+3
2006-06-13Merge branch 'from-linus' into upstreamJohn W. Linville9-46/+17
2006-06-12[IPV4]: Increment ipInHdrErrors when TTL expires.Weidong1-0/+1
Signed-off-by: Weidong <weid@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-11[TCP]: continued: reno sacked_out count fixAki M Nyrhinen1-3/+1
From: Aki M Nyrhinen <anyrhine@cs.helsinki.fi> IMHO the current fix to the problem (in_flight underflow in reno) is incorrect. it treats the symptons but ignores the problem. the problem is timing out packets other than the head packet when we don't have sack. i try to explain (sorry if explaining the obvious). with sack, scanning the retransmit queue for timed out packets is fine because we know which packets in our retransmit queue have been acked by the receiver. without sack, we know only how many packets in our retransmit queue the receiver has acknowledged, but no idea which packets. think of a "typical" slow-start overshoot case, where for example every third packet in a window get lost because a router buffer gets full. with sack, we check for timeouts on those every third packet (as the rest have been sacked). the packet counting works out and if there is no reordering, we'll retransmit exactly the packets that were lost. without sack, however, we check for timeout on every packet and end up retransmitting consecutive packets in the retransmit queue. in our slow-start example, 2/3 of those retransmissions are unnecessary. these unnecessary retransmissions eat the congestion window and evetually prevent fast recovery from continuing, if enough packets were lost. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-11[DCCP] Ackvec: fix soft lockup in ackvec handling codeAndrea Bittau1-0/+1
A soft lockup existed in the handling of ack vector records. Specifically, when a tail of the list of ack vector records was removed, it was possible to end up iterating infinitely on an element of the tail. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-09NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mountsTrond Myklebust2-0/+3
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09RPC: Allow struc xdr_stream to read the page section of an xdr_bufTrond Myklebust1-2/+26
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09VFS: Unexport do_kern_mount() and clean up simple_pin_fs()Trond Myklebust1-1/+1
Replace all module uses with the new vfs_kern_mount() interface, and fix up simple_pin_fs(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09SUNRPC: NFS_ROOT always uses the same XIDsChuck Lever1-2/+2
The XID generator uses get_random_bytes to generate an initial XID. NFS_ROOT starts up before the random driver, though, so get_random_bytes doesn't set a random XID for NFS_ROOT. This causes NFS_ROOT mount points to reuse XIDs every time the client is booted. If the client boots often enough, the server will start serving old replies out of its DRC. Use net_random() instead. Test plan: I/O intensive workloads should perform well and generate no errors. Traces taken during client reboots should show that NFS_ROOT mounts use unique XIDs after every reboot. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09SUNRPC: select privileged port numbers at randomChuck Lever1-2/+9
Make the RPC client select privileged ephemeral source ports at random. This improves DRC behavior on the server by using the same port when reconnecting for the same mount point, but using a different port for fresh mounts. The Linux TCP implementation already does this for nonprivileged ports. Note that TCP sockets in TIME_WAIT will prevent quick reuse of a random ephemeral port number by leaving the port INUSE until the connection transitions out of TIME_WAIT. Test plan: Connectathon against every known server implementation using multiple mount points. Locking especially. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-08Merge branch 'upstream' of ↵Jeff Garzik6-109/+125
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream
2006-06-08Merge branch 'master' into upstreamJeff Garzik11-48/+22
2006-06-05[BRIDGE]: fix locking and memory leak in br_add_bridgeJiri Benc1-12/+7
There are several bugs in error handling in br_add_bridge: - when dev_alloc_name fails, allocated net_device is not freed - unregister_netdev is called when rtnl lock is held - free_netdev is called before netdev_run_todo has a chance to be run after unregistering net_device Signed-off-by: Jiri Benc <jbenc@suse.cz> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05[IRDA]: Missing allocation result check in irlap_change_speed().Florin Malita1-1/+2
The skb allocation may fail, which can result in a NULL pointer dereference in irlap_queue_xmit(). Coverity CID: 434. Signed-off-by: Florin Malita <fmalita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05[NET]: Eliminate unused /proc/sys/net/ethernetJes Sorensen3-23/+0
The /proc/sys/net/ethernet directory has been sitting empty for more than 10 years! Time to eliminate it! Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05[TCP]: Avoid skb_pull if possible when trimming headHerbert Xu ~{PmVHI~}1-7/+5
Trimming the head of an skb by calling skb_pull can cause the packet to become unaligned if the length pulled is odd. Since the length is entirely arbitrary for a FIN packet carrying data, this is actually quite common. Unaligned data is not the end of the world, but we should avoid it if it's easily done. In this case it is trivial. Since we're discarding all of the head data it doesn't matter whether we move skb->data forward or back. However, it is still possible to have unaligned skb->data in general. So network drivers should be prepared to handle it instead of crashing. This patch also adds an unlikely marking on len < headlen since partial ACKs on head data are extremely rare in the wild. As the return value of __pskb_trim_head is no longer ever NULL that has been removed. Signed-off-by: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05[PATCH] softmac: unified capabilities computationJoseph Jezak1-32/+53
This patch moves the capabilities field computation to a function for clarity and adds some previously unimplemented bits. Signed off by Joseph Jezak <josejx@gentoo.org> Signed-off-by: Daniel Drake <dsd@gentoo.org> Acked-By: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05[PATCH] softmac: Fix handling of authentication failureDaniel Drake2-4/+23
My router blew up earlier, but exhibited some interesting behaviour during its dying moments. It was broadcasting beacons but wouldn't respond to any authentication requests. I noticed that softmac wasn't playing nice with this, as I couldn't make it try to connect to other networks after it had timed out authenticating to my ill router. To resolve this, I modified the softmac event/notify API to pass the event code to the callback, so that callbacks being notified from IEEE80211SOFTMAC_EVENT_ANY masks can make some judgement. In this case, the ieee80211softmac_assoc callback needs to make a decision based upon whether the association passed or failed. Signed-off-by: Daniel Drake <dsd@gentoo.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05[PATCH] softmac: complete shared key authenticationDaniel Drake4-28/+49
This patch finishes of the partially-complete shared key authentication implementation in softmac. The complication here is that we need to encrypt a management frame during the authentication process. I don't think there are any other scenarios where this would have to happen. To get around this without causing too many headaches, we decided to just use software encryption for this frame. The softmac config option now selects IEEE80211_CRYPT_WEP so that we can ensure this available. This also involved a modification to some otherwise unused ieee80211 API. Signed-off-by: Daniel Drake <dsd@gentoo.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05[PATCH] ieee80211softmac_io.c: fix warning "defined but not used"Toralf Förster1-45/+0
Got this compiler warning and Johannes Berg <johannes@sipsolutions.net> wrote: Yeah, known 'bug', we have that code there but never use it. Feel free to submit a patch (to John Linville, CC netdev and softmac-dev) to remove it. Signed-off-by: Toralf Foerster <toralf.foerster@gmx.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05Merge branch 'from-linus' into upstreamJohn W. Linville7-19/+30
2006-06-02[TCP] tcp_highspeed: Fix problem observed by Xiaoliang (David) WeiStephen Hemminger1-1/+2
When snd_cwnd is smaller than 38 and the connection is in congestion avoidance phase (snd_cwnd > snd_ssthresh), the snd_cwnd seems to stop growing. The additive increase was confused because C array's are 0 based. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-28[NETFILTER]: PPTP helper: fix sstate/cstate typoAlexey Dobriyan1-2/+2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-28[NETFILTER]: mark H.323 helper experimentalPatrick McHardy1-2/+2
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-28[NETFILTER]: Fix small information leak in SO_ORIGINAL_DST (CVE-2006-1343)Marcel Holtmann2-0/+2
It appears that sockaddr_in.sin_zero is not zeroed during getsockopt(...SO_ORIGINAL_DST...) operation. This can lead to an information leak (CVE-2006-1343). Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-26Merge branch 'upstream-fixes' into upstreamJeff Garzik2-14/+22
2006-05-26[NET]: dev.c comment fixesStephen Hemminger1-9/+11
Noticed that dev_alloc_name() comment was incorrect, and more spellung errors. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-26[IPV6] ROUTE: Don't try less preferred routes for on-link routes.YOSHIFUJI Hideaki1-5/+11
In addition to the real on-link routes, NONEXTHOP routes should be considered on-link. Problem reported by Meelis Roos <mroos@linux.ee>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-26Merge branch 'from-linus' into upstreamJohn W. Linville9-30/+37
2006-05-24Merge branch 'upstream' of ↵Jeff Garzik6-60/+146
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream
2006-05-24Merge branch 'master' into upstreamJeff Garzik3-17/+23
2006-05-23[BRIDGE]: need to ref count the LLC sapStephen Hemminger1-1/+2
Bridge will OOPS on removal if other application has the SAP open. The bridge SAP might be shared with other usages, so need to do reference counting on module removal rather than explicit close/delete. Since packet might arrive after or during removal, need to clear the receive function handle, so LLC only hands it to user (if any). Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-23[NETFILTER]: SNMP NAT: fix memleak in snmp_object_decodeChris Wright1-0/+1
If kmalloc fails, error path leaks data allocated from asn1_oid_decode(). Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-23[NETFILTER]: H.323 helper: fix sequence extension parsingPatrick McHardy1-1/+1
When parsing unknown sequence extensions the "son"-pointer points behind the last known extension for this type, don't try to interpret it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-23[NETFILTER]: H.323 helper: fix parser error propagationPatrick McHardy1-15/+19
The condition "> H323_ERROR_STOP" can never be true since H323_ERROR_STOP is positive and is the highest possible return code, while real errors are negative, fix the checks. Also only abort on real errors in some spots that were just interpreting any return value != 0 as error. Fixes crashes caused by use of stale data after a parsing error occured: BUG: unable to handle kernel paging request at virtual address bfffffff printing eip: c01aa0f8 *pde = 1a801067 *pte = 00000000 Oops: 0000 [#1] PREEMPT Modules linked in: ip_nat_h323 ip_conntrack_h323 nfsd exportfs sch_sfq sch_red cls_fw sch_hfsc xt_length ipt_owner xt_MARK iptable_mangle nfs lockd sunrpc pppoe pppoxx CPU: 0 EIP: 0060:[<c01aa0f8>] Not tainted VLI EFLAGS: 00210646 (2.6.17-rc4 #8) EIP is at memmove+0x19/0x22 eax: d77264e9 ebx: d77264e9 ecx: e88d9b17 edx: d77264e9 esi: bfffffff edi: bfffffff ebp: de6a7680 esp: c0349db8 ds: 007b es: 007b ss: 0068 Process asterisk (pid: 3765, threadinfo=c0349000 task=da068540) Stack: <0>00000006 c0349e5e d77264e3 e09a2b4e e09a38a0 d7726052 d7726124 00000491 00000006 00000006 00000006 00000491 de6a7680 d772601e d7726032 c0349f74 e09a2dc2 00000006 c0349e5e 00000006 00000000 d76dda28 00000491 c0349f74 Call Trace: [<e09a2b4e>] mangle_contents+0x62/0xfe [ip_nat] [<e09a2dc2>] ip_nat_mangle_tcp_packet+0xa1/0x191 [ip_nat] [<e0a2712d>] set_addr+0x74/0x14c [ip_nat_h323] [<e0ad531e>] process_setup+0x11b/0x29e [ip_conntrack_h323] [<e0ad534f>] process_setup+0x14c/0x29e [ip_conntrack_h323] [<e0ad57bd>] process_q931+0x3c/0x142 [ip_conntrack_h323] [<e0ad5dff>] q931_help+0xe0/0x144 [ip_conntrack_h323] ... Found by the PROTOS c07-h2250v4 testsuite. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-23Merge branch 'master' into upstreamJeff Garzik7-13/+14
2006-05-23Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds6-13/+13
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NETFILTER]: SNMP NAT: fix memory corruption [IRDA]: fixup type of ->lsap_state [IRDA]: fix 16/32 bit confusion [NET]: Fix "ntohl(ntohs" bugs [BNX2]: Use kmalloc instead of array [BNX2]: Fix bug in bnx2_nvram_write() [TG3]: Add some missing rx error counters
2006-05-23[PATCH] knfsd: Fix two problems that can cause rmmod nfsd to dieNeilBrown1-0/+1
Both cause the 'entries' count in the export cache to be non-zero at module removal time, so unregistering that cache fails and results in an oops. 1/ exp_pseudoroot (used for NFSv4 only) leaks a reference to an export entry. 2/ sunrpc_cache_update doesn't increment the entries count when it adds an entry. Thanks to "david m. richter" <richterd@citi.umich.edu> for triggering the problem and finding one of the bugs. Cc: "david m. richter" <richterd@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-22[NETFILTER]: SNMP NAT: fix memory corruptionPatrick McHardy1-8/+7
Fix memory corruption caused by snmp_trap_decode: - When snmp_trap_decode fails before the id and address are allocated, the pointers contain random memory, but are freed by the caller (snmp_parse_mangle). - When snmp_trap_decode fails after allocating just the ID, it tries to free both address and ID, but the address pointer still contains random memory. The caller frees both ID and random memory again. - When snmp_trap_decode fails after allocating both, it frees both, and the callers frees both again. The corruption can be triggered remotely when the ip_nat_snmp_basic module is loaded and traffic on port 161 or 162 is NATed. Found by multiple testcases of the trap-app and trap-enc groups of the PROTOS c06-snmpv1 testsuite. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-22[IRDA]: fix 16/32 bit confusionAlexey Dobriyan1-1/+2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-22[NET]: Fix "ntohl(ntohs" bugsAlexey Dobriyan4-4/+4
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-22Merge branch 'from-linus' into upstreamJohn W. Linville18-127/+188
2006-05-20Merge branch 'master' into upstreamJeff Garzik64-389/+554
2006-05-19[SCTP]: Allow linger to abort 1-N style sockets.Vladislav Yasevich1-6/+6
Enable SO_LINGER functionality for 1-N style sockets. The socket API draft will be clarfied to allow for this functionality. The linger settings will apply to all associations on a given socket. Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19[SCTP]: Validate the parameter length in HB-ACK chunk.Vladislav Yasevich1-0/+6
If SCTP receives a badly formatted HB-ACK chunk, it is possible that we may access invalid memory and potentially have a buffer overflow. We should really make sure that the chunk format is what we expect, before attempting to touch the data. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19[SCTP]: A better solution to fix the race between sctp_peeloff() andVladislav Yasevich2-67/+89
sctp_rcv(). The goal is to hold the ref on the association/endpoint throughout the state-machine process. We accomplish like this: /* ref on the assoc/ep is taken during lookup */ if owned_by_user(sk) sctp_add_backlog(skb, sk); else inqueue_push(skb, sk); /* drop the ref on the assoc/ep */ However, in sctp_add_backlog() we take the ref on assoc/ep and hold it while the skb is on the backlog queue. This allows us to get rid of the sock_hold/sock_put in the lookup routines. Now sctp_backlog_rcv() needs to account for potential association move. In the unlikely event that association moved, we need to retest if the new socket is locked by user. If we don't this, we may have two packets racing up the stack toward the same socket and we can't deal with it. If the new socket is still locked, we'll just add the skb to its backlog continuing to hold the ref on the association. This get's rid of the need to move packets from one backlog to another and it also safe in case new packets arrive on the same backlog queue. The last step, is to lock the new socket when we are moving the association to it. This is needed in case any new packets arrive on the association when it moved. We want these to go to the backlog since we would like to avoid the race between this new packet and a packet that may be sitting on the backlog queue of the old socket toward the same association. Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>