aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/connector
AgeCommit message (Collapse)AuthorFilesLines
2024-02-13connector/cn_proc: revert "connector: Fix proc_event_num_listeners count not ↵Keqi Wang1-3/+2
cleared" This reverts commit c46bfba1337d ("connector: Fix proc_event_num_listeners count not cleared"). It is not accurate to reset proc_event_num_listeners according to cn_netlink_send_mult() return value -ESRCH. In the case of stress-ng netlink-proc, -ESRCH will always be returned, because netlink_broadcast_filtered will return -ESRCH, which may cause stress-ng netlink-proc performance degradation. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202401112259.b23a1567-oliver.sang@intel.com Fixes: c46bfba1337d ("connector: Fix proc_event_num_listeners count not cleared") Signed-off-by: Keqi Wang <wangkeqi_chris@163.com> Link: https://lore.kernel.org/r/20240209091659.68723-1-wangkeqi_chris@163.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-2/+3
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt.c e009b2efb7a8 ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()") 0f2b21477988 ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL") https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-02connector: Fix proc_event_num_listeners count not clearedwangkeqi1-2/+3
When we register a cn_proc listening event, the proc_event_num_listener variable will be incremented by one, but if PROC_CN_MCAST_IGNORE is not called, the count will not decrease. This will cause the proc_*_connector function to take the wrong path. It will reappear when the forkstat tool exits via ctrl + c. We solve this problem by determining whether there are still listeners to clear proc_event_num_listener. Signed-off-by: wangkeqi <wangkeqiwang@didiglobal.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-19netlink: introduce typedef for filter functionJiri Pirko1-3/+2
Make the code using filter function a bit nicer by consolidating the filter function arguments using typedef. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-24Fix NULL pointer dereference in cn_filter()Anjali Kulkarni1-1/+1
Check that sk_user_data is not NULL, else return from cn_filter(). Could not reproduce this issue, but Oliver Sang verified it has fixed the "Closes" problem below. Fixes: 2aa1f7a1f47c ("connector/cn_proc: Add filtering to fix some bugs") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202309201456.84c19e27-oliver.sang@intel.com/ Signed-off-by: Anjali Kulkarni <anjali.k.kulkarni@oracle.com> Link: https://lore.kernel.org/r/20231020234058.2232347-1-anjali.k.kulkarni@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-07-23connector/cn_proc: Allow non-root users accessAnjali Kulkarni2-6/+19
There were a couple of reasons for not allowing non-root users access initially - one is there was some point no proper receive buffer management in place for netlink multicast. But that should be long fixed. See link below for more context. Second is that some of the messages may contain data that is root only. But this should be handled with a finer granularity, which is being done at the protocol layer. The only problematic protocols are nf_queue and the firewall netlink. Hence, this restriction for non-root access was relaxed for NETLINK_ROUTE initially: https://lore.kernel.org/all/20020612013101.A22399@wotan.suse.de/ This restriction has also been removed for following protocols: NETLINK_KOBJECT_UEVENT, NETLINK_AUDIT, NETLINK_SOCK_DIAG, NETLINK_GENERIC, NETLINK_SELINUX. Since process connector messages are not sensitive (process fork, exit notifications etc.), and anyone can read /proc data, we can allow non-root access here. However, since process event notification is not the only consumer of NETLINK_CONNECTOR, we can make this change even more fine grained than the protocol level, by checking for multicast group within the protocol. Allow non-root access for NETLINK_CONNECTOR via NL_CFG_F_NONROOT_RECV but add new bind function cn_bind(), which allows non-root access only for CN_IDX_PROC multicast group. Signed-off-by: Anjali Kulkarni <anjali.k.kulkarni@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-23connector/cn_proc: Performance improvementsAnjali Kulkarni1-6/+56
This patch adds the capability to filter messages sent by the proc connector on the event type supplied in the message from the client to the connector. The client can register to listen for an event type given in struct proc_input. This event based filteting will greatly enhance performance - handling 8K exits takes about 70ms, whereas 8K-forks + 8K-exits takes about 150ms & handling 8K-forks + 8K-exits + 8K-execs takes 200ms. There are currently 9 different types of events, and we need to listen to all of them. Also, measuring the time using pidfds for monitoring 8K process exits took much longer - 200ms, as compared to 70ms using only exit notifications of proc connector. We also add a new event type - PROC_EVENT_NONZERO_EXIT, which is only sent by kernel to a listening application when any process exiting, has a non-zero exit status. This will help the clients like Oracle DB, where a monitoring process wants notfications for non-zero process exits so it can cleanup after them. This kind of a new event could also be useful to other applications like Google's lmkd daemon, which needs a killed process's exit notification. The patch takes care that existing clients using old mechanism of not sending the event type work without any changes. cn_filter function checks to see if the event type being notified via proc connector matches the event type requested by client, before sending(matches) or dropping(does not match) a packet. Signed-off-by: Anjali Kulkarni <anjali.k.kulkarni@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-23connector/cn_proc: Add filtering to fix some bugsAnjali Kulkarni2-14/+64
The current proc connector code has the foll. bugs - if there are more than one listeners for the proc connector messages, and one of them deregisters for listening using PROC_CN_MCAST_IGNORE, they will still get all proc connector messages, as long as there is another listener. Another issue is if one client calls PROC_CN_MCAST_LISTEN, and another one calls PROC_CN_MCAST_IGNORE, then both will end up not getting any messages. This patch adds filtering and drops packet if client has sent PROC_CN_MCAST_IGNORE. This data is stored in the client socket's sk_user_data. In addition, we only increment or decrement proc_event_num_listeners once per client. This fixes the above issues. cn_release is the release function added for NETLINK_CONNECTOR. It uses the newly added netlink_release function added to netlink_sock. It will free sk_user_data. Signed-off-by: Anjali Kulkarni <anjali.k.kulkarni@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26connector/cn_proc: Use task_is_in_init_pid_ns()Leo Yan1-1/+1
This patch replaces open code with task_is_in_init_pid_ns() to check if a task is in root PID namespace. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-16net/connector: Add const qualifier to cb_idGeoff Levand2-6/+6
The connector driver never modifies any cb_id passed to it, so add a const qualifier to those arguments so callers can declare their struct cb_id as a constant object. Fixes build warnings like these when passing a constant struct cb_id: warning: passing argument 1 of ‘cn_add_callback’ discards ‘const’ qualifier from pointer target Signed-off-by: Geoff Levand <geoff@infradead.org> Link: https://lore.kernel.org/r/a9e49c9e-67fa-16e7-0a6b-72f6bd30c58a@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-21connector: simplify the return expression of cn_add_callback()Qinglang Miao1-6/+1
Simplify the return expression. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-14treewide: replace '---help---' in Kconfig files with 'help'Masahiro Yamada1-2/+2
Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over '---help---'"), the number of '---help---' has been gradually decreasing, but there are still more than 2400 instances. This commit finishes the conversion. While I touched the lines, I also fixed the indentation. There are a variety of indentation styles found. a) 4 spaces + '---help---' b) 7 spaces + '---help---' c) 8 spaces + '---help---' d) 1 space + 1 tab + '---help---' e) 1 tab + '---help---' (correct indentation) f) 1 tab + 1 space + '---help---' g) 1 tab + 2 spaces + '---help---' In order to convert all of them to 1 tab + 'help', I ran the following commend: $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/' Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-05-28connector/cn_proc: Protect send_msg() with a local lockMike Galbraith1-7/+14
send_msg() disables preemption to avoid out-of-order messages. As the code inside the preempt disabled section acquires regular spinlocks, which are converted to 'sleeping' spinlocks on a PREEMPT_RT kernel and eventually calls into a memory allocator, this conflicts with the RT semantics. Convert it to a local_lock which allows RT kernels to substitute them with a real per CPU lock. On non RT kernels this maps to preempt_disable() as before. No functional change. [bigeasy: Patch description] Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20200527201119.1692513-6-bigeasy@linutronix.de
2019-07-21connector: remove redundant input callback from cn_devVasily Averin1-5/+1
A small cleanup: this callback is never used. Originally fixed by Stanislav Kinsburskiy <skinsbursky@virtuozzo.com> for OpenVZ7 bug OVZ-6877 cc: stanislav.kinsburskiy@gmail.com Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156Thomas Gleixner3-44/+3
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation inc 59 temple place suite 330 boston ma 02111 1307 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 1334 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21treewide: Add SPDX license identifier - Makefile/KconfigThomas Gleixner2-0/+2
Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-08connector: fix unsafe usage of ->real_parentLi RongQing1-4/+18
proc_exit_connector() uses ->real_parent lockless. This is not safe that its parent can go away at any moment, so use RCU to protect it, and ensure that this task is not released. [ 747.624551] ================================================================== [ 747.632946] BUG: KASAN: use-after-free in proc_exit_connector+0x1f7/0x310 [ 747.640686] Read of size 4 at addr ffff88a0276988e0 by task sshd/2882 [ 747.648032] [ 747.649804] CPU: 11 PID: 2882 Comm: sshd Tainted: G E 4.19.26-rc2 #11 [ 747.658629] Hardware name: IBM x3550M4 -[7914OFV]-/00AM544, BIOS -[D7E142BUS-1.71]- 07/31/2014 [ 747.668419] Call Trace: [ 747.671269] dump_stack+0xf0/0x19b [ 747.675186] ? show_regs_print_info+0x5/0x5 [ 747.679988] ? kmsg_dump_rewind_nolock+0x59/0x59 [ 747.685302] print_address_description+0x6a/0x270 [ 747.691162] kasan_report+0x258/0x380 [ 747.695835] ? proc_exit_connector+0x1f7/0x310 [ 747.701402] proc_exit_connector+0x1f7/0x310 [ 747.706767] ? proc_coredump_connector+0x2d0/0x2d0 [ 747.712715] ? _raw_write_unlock_irq+0x29/0x50 [ 747.718270] ? _raw_write_unlock_irq+0x29/0x50 [ 747.723820] ? ___preempt_schedule+0x16/0x18 [ 747.729193] ? ___preempt_schedule+0x16/0x18 [ 747.734574] do_exit+0xa11/0x14f0 [ 747.738880] ? mm_update_next_owner+0x590/0x590 [ 747.744525] ? debug_show_all_locks+0x3c0/0x3c0 [ 747.761448] ? ktime_get_coarse_real_ts64+0xeb/0x1c0 [ 747.767589] ? lockdep_hardirqs_on+0x1a6/0x290 [ 747.773154] ? check_chain_key+0x139/0x1f0 [ 747.778345] ? check_flags.part.35+0x240/0x240 [ 747.783908] ? __lock_acquire+0x2300/0x2300 [ 747.789171] ? _raw_spin_unlock_irqrestore+0x59/0x70 [ 747.795316] ? _raw_spin_unlock_irqrestore+0x59/0x70 [ 747.801457] ? do_raw_spin_unlock+0x10f/0x1e0 [ 747.806914] ? do_raw_spin_trylock+0x120/0x120 [ 747.812481] ? preempt_count_sub+0x14/0xc0 [ 747.817645] ? _raw_spin_unlock+0x2e/0x50 [ 747.822708] ? __handle_mm_fault+0x12db/0x1fa0 [ 747.828367] ? __pmd_alloc+0x2d0/0x2d0 [ 747.833143] ? check_noncircular+0x50/0x50 [ 747.838309] ? match_held_lock+0x7f/0x340 [ 747.843380] ? check_noncircular+0x50/0x50 [ 747.848561] ? handle_mm_fault+0x21a/0x5f0 [ 747.853730] ? check_flags.part.35+0x240/0x240 [ 747.859290] ? check_chain_key+0x139/0x1f0 [ 747.864474] ? __do_page_fault+0x40f/0x760 [ 747.869655] ? __audit_syscall_entry+0x4b/0x1f0 [ 747.875319] ? syscall_trace_enter+0x1d5/0x7b0 [ 747.880877] ? trace_raw_output_preemptirq_template+0x90/0x90 [ 747.887895] ? trace_raw_output_sys_exit+0x80/0x80 [ 747.893860] ? up_read+0x3b/0x90 [ 747.898142] ? stop_critical_timings+0x260/0x260 [ 747.903909] do_group_exit+0xe0/0x1c0 [ 747.908591] ? __x64_sys_exit+0x30/0x30 [ 747.913460] ? trace_raw_output_preemptirq_template+0x90/0x90 [ 747.920485] ? tracer_hardirqs_on+0x270/0x270 [ 747.925956] __x64_sys_exit_group+0x28/0x30 [ 747.931214] do_syscall_64+0x117/0x400 [ 747.935988] ? syscall_return_slowpath+0x2f0/0x2f0 [ 747.941931] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 747.947788] ? trace_hardirqs_on_caller+0x1d0/0x1d0 [ 747.953838] ? lockdep_sys_exit+0x16/0x8e [ 747.958915] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 747.964784] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 747.971021] RIP: 0033:0x7f572f154c68 [ 747.975606] Code: Bad RIP value. [ 747.979791] RSP: 002b:00007ffed2dfaa58 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 [ 747.989324] RAX: ffffffffffffffda RBX: 00007f572f431840 RCX: 00007f572f154c68 [ 747.997910] RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001 [ 748.006495] RBP: 0000000000000001 R08: 00000000000000e7 R09: fffffffffffffee0 [ 748.015079] R10: 00007f572f4387e8 R11: 0000000000000246 R12: 00007f572f431840 [ 748.023664] R13: 000055a7f90f2c50 R14: 000055a7f96e2310 R15: 000055a7f96e2310 [ 748.032287] [ 748.034509] Allocated by task 2300: [ 748.038982] kasan_kmalloc+0xa0/0xd0 [ 748.043562] kmem_cache_alloc_node+0xf5/0x2e0 [ 748.049018] copy_process+0x1781/0x4790 [ 748.053884] _do_fork+0x166/0x9a0 [ 748.058163] do_syscall_64+0x117/0x400 [ 748.062943] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 748.069180] [ 748.071405] Freed by task 15395: [ 748.075591] __kasan_slab_free+0x130/0x180 [ 748.080752] kmem_cache_free+0xc2/0x310 [ 748.085619] free_task+0xea/0x130 [ 748.089901] __put_task_struct+0x177/0x230 [ 748.095063] finish_task_switch+0x51b/0x5d0 [ 748.100315] __schedule+0x506/0xfa0 [ 748.104791] schedule+0xca/0x260 [ 748.108978] futex_wait_queue_me+0x27e/0x420 [ 748.114333] futex_wait+0x251/0x550 [ 748.118814] do_futex+0x75b/0xf80 [ 748.123097] __x64_sys_futex+0x231/0x2a0 [ 748.128065] do_syscall_64+0x117/0x400 [ 748.132835] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 748.139066] [ 748.141289] The buggy address belongs to the object at ffff88a027698000 [ 748.141289] which belongs to the cache task_struct of size 12160 [ 748.156589] The buggy address is located 2272 bytes inside of [ 748.156589] 12160-byte region [ffff88a027698000, ffff88a02769af80) [ 748.171114] The buggy address belongs to the page: [ 748.177055] page:ffffea00809da600 count:1 mapcount:0 mapping:ffff888107d01e00 index:0x0 compound_mapcount: 0 [ 748.189136] flags: 0x57ffffc0008100(slab|head) [ 748.194688] raw: 0057ffffc0008100 ffffea00809a3200 0000000300000003 ffff888107d01e00 [ 748.204424] raw: 0000000000000000 0000000000020002 00000001ffffffff 0000000000000000 [ 748.214146] page dumped because: kasan: bad access detected [ 748.220976] [ 748.223197] Memory state around the buggy address: [ 748.229128] ffff88a027698780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 748.238271] ffff88a027698800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 748.247414] >ffff88a027698880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 748.256564] ^ [ 748.264267] ffff88a027698900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 748.273493] ffff88a027698980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 748.282630] ================================================================== Fixes: b086ff87251b4a4 ("connector: add parent pid and tgid to coredump and exit events") Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08connector: fix defined but not used warningRandy Dunlap1-1/+2
Fix a build warning in connector.c when CONFIG_PROC_FS is not enabled by marking the unused function as __maybe_unused. ../drivers/connector/connector.c:242:12: warning: 'cn_proc_show' defined but not used [-Wunused-function] Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-0/+4
Pull networking updates from David Miller: 1) Add Maglev hashing scheduler to IPVS, from Inju Song. 2) Lots of new TC subsystem tests from Roman Mashak. 3) Add TCP zero copy receive and fix delayed acks and autotuning with SO_RCVLOWAT, from Eric Dumazet. 4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard Brouer. 5) Add ttl inherit support to vxlan, from Hangbin Liu. 6) Properly separate ipv6 routes into their logically independant components. fib6_info for the routing table, and fib6_nh for sets of nexthops, which thus can be shared. From David Ahern. 7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP messages from XDP programs. From Nikita V. Shirokov. 8) Lots of long overdue cleanups to the r8169 driver, from Heiner Kallweit. 9) Add BTF ("BPF Type Format"), from Martin KaFai Lau. 10) Add traffic condition monitoring to iwlwifi, from Luca Coelho. 11) Plumb extack down into fib_rules, from Roopa Prabhu. 12) Add Flower classifier offload support to igb, from Vinicius Costa Gomes. 13) Add UDP GSO support, from Willem de Bruijn. 14) Add documentation for eBPF helpers, from Quentin Monnet. 15) Add TLS tx offload to mlx5, from Ilya Lesokhin. 16) Allow applications to be given the number of bytes available to read on a socket via a control message returned from recvmsg(), from Soheil Hassas Yeganeh. 17) Add x86_32 eBPF JIT compiler, from Wang YanQing. 18) Add AF_XDP sockets, with zerocopy support infrastructure as well. From Björn Töpel. 19) Remove indirect load support from all of the BPF JITs and handle these operations in the verifier by translating them into native BPF instead. From Daniel Borkmann. 20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha. 21) Allow XDP programs to do lookups in the main kernel routing tables for forwarding. From David Ahern. 22) Allow drivers to store hardware state into an ELF section of kernel dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy. 23) Various RACK and loss detection improvements in TCP, from Yuchung Cheng. 24) Add TCP SACK compression, from Eric Dumazet. 25) Add User Mode Helper support and basic bpfilter infrastructure, from Alexei Starovoitov. 26) Support ports and protocol values in RTM_GETROUTE, from Roopa Prabhu. 27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard Brouer. 28) Add lots of forwarding selftests, from Petr Machata. 29) Add generic network device failover driver, from Sridhar Samudrala. * ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits) strparser: Add __strp_unpause and use it in ktls. rxrpc: Fix terminal retransmission connection ID to include the channel net: hns3: Optimize PF CMDQ interrupt switching process net: hns3: Fix for VF mailbox receiving unknown message net: hns3: Fix for VF mailbox cannot receiving PF response bnx2x: use the right constant Revert "net: sched: cls: Fix offloading when ingress dev is vxlan" net: dsa: b53: Fix for brcm tag issue in Cygnus SoC enic: fix UDP rss bits netdev-FAQ: clarify DaveM's position for stable backports rtnetlink: validate attributes in do_setlink() mlxsw: Add extack messages for port_{un, }split failures netdevsim: Add extack error message for devlink reload devlink: Add extack to reload and port_{un, }split operations net: metrics: add proper netlink validation ipmr: fix error path when ipmr_new_table fails ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds net: hns3: remove unused hclgevf_cfg_func_mta_filter netfilter: provide udp*_lib_lookup for nf_tproxy qed*: Utilize FW 8.37.2.0 ...
2018-05-16proc: introduce proc_create_single{,_data}Christoph Hellwig1-14/+1
Variants of proc_create{,_data} that directly take a seq_file show callback and drastically reduces the boilerplate code in the callers. All trivial callers converted over. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-01connector: add parent pid and tgid to coredump and exit eventsStefan Strogin1-0/+4
The intention is to get notified of process failures as soon as possible, before a possible core dumping (which could be very long) (e.g. in some process-manager). Coredump and exit process events are perfect for such use cases (see 2b5faa4c553f "connector: Added coredumping event to the process connector"). The problem is that for now the process-manager cannot know the parent of a dying process using connectors. This could be useful if the process-manager should monitor for failures only children of certain parents, so we could filter the coredump and exit events by parent process and/or thread ID. Add parent pid and tgid to coredump and exit process connectors event data. Signed-off-by: Stefan Strogin <sstrogin@cisco.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22drivers, connector: convert cn_callback_entry.refcnt from atomic_t to refcount_tElena Reshetova2-3/+3
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable cn_callback_entry.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05connector: make cn_proc explicitly non-modularPaul Gortmaker1-3/+1
The Kconfig controlling build of this code is currently: drivers/connector/Kconfig:config PROC_EVENTS drivers/connector/Kconfig: bool "Report process events to userspace" ...meaning that it currently is not being built as a module by anyone. Lets remove the two modular references, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: netdev@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28connector: fix out-of-order cn_proc netlink message deliveryAaron Campbell1-21/+22
The proc connector messages include a sequence number, allowing userspace programs to detect lost messages. However, performing this detection is currently more difficult than necessary, since netlink messages can be delivered to the application out-of-order. To fix this, leave pre-emption disabled during cn_netlink_send(), and use GFP_NOWAIT. The following was written as a test case. Building the kernel w/ make -j32 proved a reliable way to generate out-of-order cn_proc messages. int main(int argc, char *argv[]) { static uint32_t last_seq[CPU_SETSIZE], seq; int cpu, fd; struct sockaddr_nl sa; struct __attribute__((aligned(NLMSG_ALIGNTO))) { struct nlmsghdr nl_hdr; struct __attribute__((__packed__)) { struct cn_msg cn_msg; struct proc_event cn_proc; }; } rmsg; struct __attribute__((aligned(NLMSG_ALIGNTO))) { struct nlmsghdr nl_hdr; struct __attribute__((__packed__)) { struct cn_msg cn_msg; enum proc_cn_mcast_op cn_mcast; }; } smsg; fd = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR); if (fd < 0) { perror("socket"); } sa.nl_family = AF_NETLINK; sa.nl_groups = CN_IDX_PROC; sa.nl_pid = getpid(); if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) { perror("bind"); } memset(&smsg, 0, sizeof(smsg)); smsg.nl_hdr.nlmsg_len = sizeof(smsg); smsg.nl_hdr.nlmsg_pid = getpid(); smsg.nl_hdr.nlmsg_type = NLMSG_DONE; smsg.cn_msg.id.idx = CN_IDX_PROC; smsg.cn_msg.id.val = CN_VAL_PROC; smsg.cn_msg.len = sizeof(enum proc_cn_mcast_op); smsg.cn_mcast = PROC_CN_MCAST_LISTEN; if (send(fd, &smsg, sizeof(smsg), 0) != sizeof(smsg)) { perror("send"); } while (recv(fd, &rmsg, sizeof(rmsg), 0) == sizeof(rmsg)) { cpu = rmsg.cn_proc.cpu; if (cpu < 0) { continue; } seq = rmsg.cn_msg.seq; if ((last_seq[cpu] != 0) && (seq != last_seq[cpu] + 1)) { printf("out-of-order seq=%d on cpu=%d\n", seq, cpu); } last_seq[cpu] = seq; } /* NOTREACHED */ perror("recv"); return -1; } Signed-off-by: Aaron Campbell <aaron@monkey.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04connector: bump skb->users before callback invocationFlorian Westphal1-8/+3
Dmitry reports memleak with syskaller program. Problem is that connector bumps skb usecount but might not invoke callback. So move skb_get to where we invoke the callback. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-06mm, page_alloc: distinguish between being unable to sleep, unwilling to ↵Mel Gorman1-1/+2
sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-01-07kconfig: use bool instead of boolean for type definition attributesChristoph Jaeger1-1/+1
Support for keyword 'boolean' will be dropped later on. No functional change. Reference: http://lkml.kernel.org/r/cover.1418003065.git.cj@linux.com Signed-off-by: Christoph Jaeger <cj@linux.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2014-11-26cn: verify msg->len before making callbackDavid Fries1-0/+6
The struct cn_msg len field comes from userspace and needs to be validated. More logical to do so here where the cn_msg pointer is pulled out of the sk_buff than the callback which is passed cn_msg * and might assume no validation is needed. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David Fries <David@Fries.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-23connector: Use ktime_get_ns()Thomas Gleixner1-27/+9
Replace the ever recurring: ts = ktime_get_ts(); ns = timespec_to_ns(&ts); with ns = ktime_get_ns(); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: John Stultz <john.stultz@linaro.org>
2014-06-03Merge tag 'char-misc-3.16-rc1' of ↵Linus Torvalds1-2/+15
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc into next Pull char/misc driver patches from Greg KH: "Here is the big char / misc driver update for 3.16-rc1. Lots of different driver updates for a variety of different drivers and minor driver subsystems. All have been in linux-next with no reported issues" * tag 'char-misc-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (79 commits) hv: use correct order when freeing monitor_pages spmi: of: fixup generic SPMI devicetree binding example applicom: dereferencing NULL on error path misc: genwqe: fix uninitialized return value in genwqe_free_sync_sgl() miscdevice.h: Simple syntax fix to make pointers consistent. MAINTAINERS: Add miscdevice.h to file list for char/misc drivers. mcb: Add support for shared PCI IRQs drivers: Remove duplicate conditionally included subdirs misc: atmel_pwm: only build for supported platforms mei: me: move probe quirk to cfg structure mei: add per device configuration mei: me: read H_CSR after asserting reset mei: me: drop harmful wait optimization mei: me: fix hw ready reset flow mei: fix memory leak of mei_clients array uio: fix vma io range check in mmap drivers: uio_dmem_genirq: Fix memory leak in uio_dmem_genirq_probe() w1: do not unlock unheld list_mutex in __w1_remove_master_device() w1: optional bundling of netlink kernel replies connector: allow multiple messages to be sent in one packet ...
2014-05-27connector: allow multiple messages to be sent in one packetDavid Fries1-2/+15
This increases the amount of bundling to reduce the number of packets sent. For the one wire use there can be multiple struct w1_netlink_cmd in a struct w1_netlink_msg and multiple of those in struct cn_msg, and with this change multiple of those in a struct nlmsghdr, and at each level the len identifies there being multiple of the next. Signed-off-by: David Fries <David@Fries.net> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-24net: Use netlink_ns_capable to verify the permisions of netlink messagesEric W. Biederman1-1/+1
It is possible by passing a netlink socket to a more privileged executable and then to fool that executable into writing to the socket data that happens to be valid netlink message to do something that privileged executable did not intend to do. To keep this from happening replace bare capable and ns_capable calls with netlink_capable, netlink_net_calls and netlink_ns_capable calls. Which act the same as the previous calls except they verify that the opener of the socket had the desired permissions as well. Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-1/+0
Pull networking updates from David Miller: "Here is my initial pull request for the networking subsystem during this merge window: 1) Support for ESN in AH (RFC 4302) from Fan Du. 2) Add full kernel doc for ethtool command structures, from Ben Hutchings. 3) Add BCM7xxx PHY driver, from Florian Fainelli. 4) Export computed TCP rate information in netlink socket dumps, from Eric Dumazet. 5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas Dichtel. 6) Convert many drivers to pci_enable_msix_range(), from Alexander Gordeev. 7) Record SKB timestamps more efficiently, from Eric Dumazet. 8) Switch to microsecond resolution for TCP round trip times, also from Eric Dumazet. 9) Clean up and fix 6lowpan fragmentation handling by making use of the existing inet_frag api for it's implementation. 10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss. 11) Auto size SKB lengths when composing netlink messages based upon past message sizes used, from Eric Dumazet. 12) qdisc dumps can take a long time, add a cond_resched(), From Eric Dumazet. 13) Sanitize netpoll core and drivers wrt. SKB handling semantics. Get rid of never-used-in-tree netpoll RX handling. From Eric W Biederman. 14) Support inter-address-family and namespace changing in VTI tunnel driver(s). From Steffen Klassert. 15) Add Altera TSE driver, from Vince Bridgers. 16) Optimizing csum_replace2() so that it doesn't adjust the checksum by checksumming the entire header, from Eric Dumazet. 17) Expand BPF internal implementation for faster interpreting, more direct translations into JIT'd code, and much cleaner uses of BPF filtering in non-socket ocntexts. From Daniel Borkmann and Alexei Starovoitov" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits) netpoll: Use skb_irq_freeable to make zap_completion_queue safe. net: Add a test to see if a skb is freeable in irq context qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port' net: ptp: move PTP classifier in its own file net: sxgbe: make "core_ops" static net: sxgbe: fix logical vs bitwise operation net: sxgbe: sxgbe_mdio_register() frees the bus Call efx_set_channels() before efx->type->dimension_resources() xen-netback: disable rogue vif in kthread context net/mlx4: Set proper build dependancy with vxlan be2net: fix build dependency on VxLAN mac802154: make csma/cca parameters per-wpan mac802154: allow only one WPAN to be up at any given time net: filter: minor: fix kdoc in __sk_run_filter netlink: don't compare the nul-termination in nla_strcmp can: c_can: Avoid led toggling for every packet. can: c_can: Simplify TX interrupt cleanup can: c_can: Store dlc private can: c_can: Reduce register access can: c_can: Make the code readable ...
2014-03-03connector: remove duplicated code in cn_call_callback()Alexey Khoroshilov1-1/+0
There were a couple of patches fixing the same bug that results in duplicated err = 0; assignment. The patch removes one of them. Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-07connector: add portid to unicast in addition to broadcastingDavid Fries2-16/+22
This allows replying only to the requestor portid while still supporting broadcasting. Pass 0 to portid for the previous behavior. Signed-off-by: David Fries <David@Fries.net> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-14connector: improved unaligned access error fixChris Metcalf1-30/+42
In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access bug for ia64 by converting a 64-bit write to use put_unaligned(). Unfortunately, since gcc will convert a short memset() to a series of appropriately-aligned stores, the problem is now visible again on tilegx, where the memset that zeros out proc_event is converted to three 64-bit stores, causing an unaligned access panic. A better fix for the original problem is to ensure that proc_event is aligned to 8 bytes here. We can do that relatively easily by arranging to start the struct cn_msg aligned to 8 bytes and then offset by 4 bytes. Doing so means that the immediately following proc_event structure is then correctly aligned to 8 bytes. The result is that the memset() stores are now aligned, and as an added benefit, we can remove the put_unaligned() calls in the code. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-02connector: use 'size' everywhere in cn_netlink_send()Mathias Krause1-1/+1
We calculated the size for the netlink message buffer as size. Use size in the memcpy() call as well instead of recalculating it. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-02connector: use nlmsg_len() to check message lengthMathias Krause1-3/+4
The current code tests the length of the whole netlink message to be at least as long to fit a cn_msg. This is wrong as nlmsg_len includes the length of the netlink message header. Use nlmsg_len() instead to fix this "off-by-NLMSG_HDRLEN" size check. Cc: stable@vger.kernel.org # v2.6.14+ Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-02proc connector: fix info leaksMathias Krause1-0/+18
Initialize event_data for all possible message types to prevent leaking kernel stack contents to userland (up to 20 bytes). Also set the flags member of the connector message to 0 to prevent leaking two more stack bytes this way. Cc: stable@vger.kernel.org # v2.6.15+ Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-28connector: replace obsolete NLMSG_* with type safe nlmsg_*Hong zhi guo1-6/+6
Signed-off-by: Hong Zhiguo <honkiko@gmail.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20connector: Added coredumping event to the process connectorJesper Derehag1-0/+25
Process connector can now also detect coredumping events. Main aim of patch is get notified at start of coredumping, instead of having to wait for it to finish and then being notified through EXIT event. Could be used for instance by process-managers that want to get notified as soon as possible about process failures, and not necessarily beeing notified after coredump, which could be in the order of minutes depending on size of coredump, piping and so on. Signed-off-by: Jesper Derehag <jderehag@hotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-27proc connector: reject unprivileged listener bumpsKees Cook1-0/+8
While PROC_CN_MCAST_LISTEN/IGNORE is entirely advisory, it was possible for an unprivileged user to turn off notifications for all listeners by sending PROC_CN_MCAST_IGNORE. Instead, require the same privileges as required for a multicast bind. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: stable@vger.kernel.org Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Acked-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18net: proc: change proc_net_remove to remove_proc_entryGao feng1-1/+1
proc_net_remove is only used to remove proc entries that under /proc/net,it's not a general function for removing proc entries of netns. if we want to remove some proc entries which under /proc/net/stat/, we still need to call remove_proc_entry. this patch use remove_proc_entry to replace proc_net_remove. we can remove proc_net_remove after this patch. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18net: proc: change proc_net_fops_create to proc_createGao feng1-1/+1
Right now, some modules such as bonding use proc_create to create proc entries under /proc/net/, and other modules such as ipv4 use proc_net_fops_create. It looks a little chaos.this patch changes all of proc_net_fops_create to proc_create. we can remove proc_net_fops_create after this patch. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-03Drivers: misc: remove __dev* attributes.Greg Kroah-Hartman1-2/+2
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, __devinitdata, __devinitconst, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-2/+1
Pull networking changes from David Miller: 1) GRE now works over ipv6, from Dmitry Kozlov. 2) Make SCTP more network namespace aware, from Eric Biederman. 3) TEAM driver now works with non-ethernet devices, from Jiri Pirko. 4) Make openvswitch network namespace aware, from Pravin B Shelar. 5) IPV6 NAT implementation, from Patrick McHardy. 6) Server side support for TCP Fast Open, from Jerry Chu and others. 7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel Borkmann. 8) Increate the loopback default MTU to 64K, from Eric Dumazet. 9) Use a per-task rather than per-socket page fragment allocator for outgoing networking traffic. This benefits processes that have very many mostly idle sockets, which is quite common. From Eric Dumazet. 10) Use up to 32K for page fragment allocations, with fallbacks to smaller sizes when higher order page allocations fail. Benefits are a) less segments for driver to process b) less calls to page allocator c) less waste of space. From Eric Dumazet. 11) Allow GRO to be used on GRE tunnels, from Eric Dumazet. 12) VXLAN device driver, one way to handle VLAN issues such as the limitation of 4096 VLAN IDs yet still have some level of isolation. From Stephen Hemminger. 13) As usual there is a large boatload of driver changes, with the scale perhaps tilted towards the wireless side this time around. Fix up various fairly trivial conflicts, mostly caused by the user namespace changes. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits) hyperv: Add buffer for extended info after the RNDIS response message. hyperv: Report actual status in receive completion packet hyperv: Remove extra allocated space for recv_pkt_list elements hyperv: Fix page buffer handling in rndis_filter_send_request() hyperv: Fix the missing return value in rndis_filter_set_packet_filter() hyperv: Fix the max_xfer_size in RNDIS initialization vxlan: put UDP socket in correct namespace vxlan: Depend on CONFIG_INET sfc: Fix the reported priorities of different filter types sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP sfc: Fix loopback self-test with separate_tx_channels=1 sfc: Fix MCDI structure field lookup sfc: Add parentheses around use of bitfield macro arguments sfc: Fix null function pointer in efx_sriov_channel_type vxlan: virtual extensible lan igmp: export symbol ip_mc_leave_group netlink: add attributes to fdb interface tg3: unconditionally select HWMON support when tg3 is enabled. Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT" gre: fix sparse warning ...
2012-09-08netlink: hide struct module parameter in netlink_kernel_createPablo Neira Ayuso1-2/+1
This patch defines netlink_kernel_create as a wrapper function of __netlink_kernel_create to hide the struct module *me parameter (which seems to be THIS_MODULE in all existing netlink subsystems). Suggested by David S. Miller. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-06userns: Convert process event connector to handle kuids and kgidsEric W. Biederman1-4/+14
- Only allow asking for events from the initial user and pid namespace, where we generate the events in. - Convert kuids and kgids into the initial user namespace to report them via the process event connector. Cc: David Miller <davem@davemloft.net> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-07-16drivers: connector: fixed coding style issuesValentin Ilie3-25/+28
V2: Replaced assignment in if statement. Fixed coding style issues. Signed-off-by: Valentin Ilie <valentin.ilie@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-29netlink: add netlink_kernel_cfg parameter to netlink_kernel_createPablo Neira Ayuso1-4/+9
This patch adds the following structure: struct netlink_kernel_cfg { unsigned int groups; void (*input)(struct sk_buff *skb); struct mutex *cb_mutex; }; That can be passed to netlink_kernel_create to set optional configurations for netlink kernel sockets. I've populated this structure by looking for NULL and zero parameters at the existing code. The remaining parameters that always need to be set are still left in the original interface. That includes optional parameters for the netlink socket creation. This allows easy extensibility of this interface in the future. This patch also adapts all callers to use this new interface. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-26connector: use nlmsg_put() instead of NLMSG_PUT() macro.Javier Martinez Canillas1-6/+6
The NLMSG_PUT() macro contains a hidden goto which makes the code hard to audit and very error prone. While been there also use the inline function nlmsg_data() instead of the NLMSG_DATA() macro to do explicit type checking. Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-28connector: add comm change event report to proc connectorVladimir Zapolskiy1-0/+26
Add an event to monitor comm value changes of tasks. Such an event becomes vital, if someone desires to control threads of a process in different manner. A natural characteristic of threads is its comm value, and helpfully application developers have an opportunity to change it in runtime. Reporting about such events via proc connector allows to fine-grain monitoring and control potentials, for instance a process control daemon listening to proc connector and following comm value policies can place specific threads to assigned cgroup partitions. It might be possible to achieve a pale partial one-shot likeness without this update, if an application changes comm value of a thread generator task beforehand, then a new thread is cloned, and after that proc connector listener gets the fork event and reads new thread's comm value from procfs stat file, but this change visibly simplifies and extends the matter. Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-28proc_fork_connector: a lockless ->real_parent usage is not safeOleg Nesterov1-2/+6
proc_fork_connector() uses ->real_parent lockless. This is not safe if copy_process() was called with CLONE_THREAD or CLONE_PARENT, in this case the parent != current can go away at any moment. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Vladimir Zapolskiy <vzapolskiy@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-26atomic: use <linux/atomic.h>Arun Sharma1-1/+2
This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-25drivers/connector/cn_proc.c: remove unused localAndrew Morton1-1/+0
Fix the warning drivers/connector/cn_proc.c: In function 'proc_ptrace_connector': drivers/connector/cn_proc.c:176: warning: unused variable 'tracer' Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-22Merge branch 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/miscLinus Torvalds1-0/+35
* 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: (39 commits) ptrace: do_wait(traced_leader_killed_by_mt_exec) can block forever ptrace: fix ptrace_signal() && STOP_DEQUEUED interaction connector: add an event for monitoring process tracers ptrace: dont send SIGSTOP on auto-attach if PT_SEIZED ptrace: mv send-SIGSTOP from do_fork() to ptrace_init_task() ptrace_init_task: initialize child->jobctl explicitly has_stopped_jobs: s/task_is_stopped/SIGNAL_STOP_STOPPED/ ptrace: make former thread ID available via PTRACE_GETEVENTMSG after PTRACE_EVENT_EXEC stop ptrace: wait_consider_task: s/same_thread_group/ptrace_reparented/ ptrace: kill real_parent_is_ptracer() in in favor of ptrace_reparented() ptrace: ptrace_reparented() should check same_thread_group() redefine thread_group_leader() as exit_signal >= 0 do not change dead_task->exit_signal kill task_detached() reparent_leader: check EXIT_DEAD instead of task_detached() make do_notify_parent() __must_check, update the callers __ptrace_detach: avoid task_detached(), check do_notify_parent() kill tracehook_notify_death() make do_notify_parent() return bool ptrace: s/tracehook_tracer_task()/ptrace_parent()/ ...
2011-07-18connector: add an event for monitoring process tracersVladimir Zapolskiy1-0/+35
This change adds a procfs connector event, which is emitted on every successful process tracer attach or detach. If some process connects to other one, kernelspace connector reports process id and thread group id of both these involved processes. On disconnection null process id is returned. Such an event allows to create a simple automated userspace mechanism to be aware about processes connecting to others, therefore predefined process policies can be applied to them if needed. Note, a detach signal is emitted only in case, if a tracer process explicitly executes PTRACE_DETACH request. In other cases like tracee or tracer exit detach event from proc connector is not reported. Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-07Connector: Correctly set the error code in case of success when dispatching ↵K. Y. Srinivasan1-0/+1
receive callbacks The recent changes to the connector code introduced this bug where even when a callback was invoked, we would return an error resulting in double freeing of the skb. This patch fixes this bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: stable <stable@kernel.org> [.39] Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-12connector: fix skb double free in cn_rx_skb()Patrick McHardy1-0/+1
When a skb is delivered to a registered callback, cn_call_callback() incorrectly returns -ENODEV after freeing the skb, causing cn_rx_skb() to free the skb a second time. Reported-by: Eric B Munson <emunson@mgebm.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Tested-by: Eric B Munson <emunson@mgebm.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-30connector: convert to synchronous netlink message processingPatrick McHardy2-76/+29
Commits 01a16b21 (netlink: kill eff_cap from struct netlink_skb_parms) and c53fa1ed (netlink: kill loginuid/sessionid/sid members from struct netlink_skb_parms) removed some members from struct netlink_skb_parms that depend on the current context, all netlink users are now required to do synchronous message processing. connector however queues received messages and processes them in a work queue, which is not valid anymore. This patch converts connector to do synchronous message processing by invoking the registered callback handler directly from the netlink receive function. In order to avoid invoking the callback with connector locks held, a reference count is added to struct cn_callback_entry, the reference is taken when finding a matching callback entry on the device's queue_list and released after the callback handler has been invoked. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23connector: Convert char *name to const char *nameJoe Perches2-4/+5
Allow more const declarations. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-12-17connector: Use this_cpu operationsChristoph Lameter1-2/+3
The patch was originally in the use cpuops patchset but it needs an inc_return and is therefore dependent on an extension of the cpu ops. Fixed up and verified that it compiles. get_seq can benefit from this_cpu_operations. Address calculation is avoided and the increment is done using an xadd. Cc: Scott James Remnant <scott@ubuntu.com> Cc: Mike Frysinger <vapier@gentoo.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-10connector: add module aliasStephen Hemminger1-0/+1
Since connector can be built as a module and uses netlink socket to communicate. The module should have an alias to autoload when socket of NETLINK_CONNECTOR type is requested. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-24connector: remove lazy workqueue creationTejun Heo2-72/+12
Commit 1a5645bc (connector: create connector workqueue only while needed once) implements lazy workqueue creation for connector workqueue. With cmwq now in place, lazy workqueue creation doesn't make much sense while adding a lot of complexity. Remove it and allocate an ordered workqueue during initialization. This also removes a call to flush_scheduled_work() which is deprecated and scheduled to be removed. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2-0/+2
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-02-02connector: Delete buggy notification code.Evgeniy Polyakov1-175/+0
On Tue, Feb 02, 2010 at 02:57:14PM -0800, Greg KH (gregkh@suse.de) wrote: > > There are at least two ways to fix it: using a big cannon and a small > > one. The former way is to disable notification registration, since it is > > not used by anyone at all. Second way is to check whether calling > > process is root and its destination group is -1 (kind of priveledged > > one) before command is dispatched to workqueue. > > Well if no one is using it, removing it makes the most sense, right? > > No objection from me, care to make up a patch either way for this? Getting it is not used, let's drop support for notifications about (un)registered events from connector. Another option was to check credentials on receiving, but we can always restore it without bugs if needed, but genetlink has a wider code base and none complained, that userspace can not get notification when some other clients were (un)registered. Kudos for Sebastian Krahmer <krahmer@suse.de>, who found a bug in the code. Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-06connector: Fix incompatible pointer type warningStephen Boyd1-1/+2
Commit 7069331 (connector: Provide the sender's credentials to the callback, 2009-10-02) changed callbacks to take two arguments but missed this one. drivers/connector/cn_proc.c: In function ‘cn_proc_init’: drivers/connector/cn_proc.c:263: warning: passing argument 3 of ‘cn_add_callback’ from incompatible pointer type Signed-off-by: Stephen Boyd <bebarino@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-02connector: Removed the destruct_data callback since it is always kfree_skb()Philipp Reisner2-10/+5
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-02connector: Provide the sender's credentials to the callbackPhilipp Reisner2-5/+6
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-02connector: Keep the skb in cn_callback_dataPhilipp Reisner2-7/+7
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-23proc connector: add event for process becoming session leaderScott James Remnant1-0/+25
The act of a process becoming a session leader is a useful signal to a supervising init daemon such as Upstart. While a daemon will normally do this as part of the process of becoming a daemon, it is rare for its children to do so. When the children do, it is nearly always a sign that the child should be considered detached from the parent and not supervised along with it. The poster-child example is OpenSSH; the per-login children call setsid() so that they may control the pty connected to them. If the primary daemon dies or is restarted, we do not want to consider the per-login children and want to respawn the primary daemon without killing the children. This patch adds a new PROC_SID_EVENT and associated structure to the proc_event event_data union, it arranges for this to be emitted when the special PIDTYPE_SID pid is set. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Scott James Remnant <scott@ubuntu.com> Acked-by: Matt Helsley <matthltc@us.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Acked-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-23Merge branch 'master' of ↵David S. Miller2-3/+3
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwmc3200wifi/netdev.c net/wireless/scan.c
2009-07-21connector: maintainer/mail update.Evgeniy Polyakov2-3/+3
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-17connector: make callback argument type explicitMike Frysinger3-7/+9
The connector documentation states that the argument to the callback function is always a pointer to a struct cn_msg, but rather than encode it in the API itself, it uses a void pointer everywhere. This doesn't make much sense to encode the pointer in documentation as it prevents proper C type checking from occurring and can easily allow people to use the wrong pointer type. So convert the argument type to an explicit struct cn_msg pointer. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-12trivial: Kconfig: .ko is normally not included in module namesPavel Machek1-1/+1
.ko is normally not included in Kconfig help, make it consistent. Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-02-02connector: create connector workqueue only while needed onceFrederic Weisbecker2-20/+79
The netlink connector uses its own workqueue to relay the datas sent from userspace to the appropriate callback. If you launch the test from Documentation/connector and change it a bit to send a high flow of data, you will see thousands of events coming to the "cqueue" workqueue by looking at the workqueue tracer. This flow of events can be sent very quickly. So, to not encumber the kevent workqueue and delay other jobs, the "cqueue" workqueue should remain. But this workqueue is pointless most of the time, it will always be created (assuming you have built it of course) although only developpers with specific needs will use it. So avoid this "most of the time useless task", this patch proposes to create this workqueue only when needed once. The first jobs to be sent to connector callbacks will be sent to kevent while the "cqueue" thread creation will be scheduled to kevent too. The following jobs will continue to be scheduled to keventd until the cqueue workqueue is created, and then the rest of the jobs will continue to perform as usual, through this dedicated workqueue. Each time I tested this patch, only the first event was sent to keventd, the rest has been sent to cqueue which have been created quickly. Also, this patch fixes some trailing whitespaces on the connector files. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-14CRED: Use RCU to access another task's creds and to release a task's own credsDavid Howells1-5/+11
Use RCU to access another task's creds and to release a task's own creds. This means that it will be possible for the credentials of a task to be replaced without another task (a) requiring a full lock to read them, and (b) seeing deallocated memory. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14CRED: Separate task security context from task_structDavid Howells1-4/+4
Separate the task security context from task_struct. At this point, the security data is temporarily embedded in the task_struct with two pointers pointing to it. Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in entry.S via asm-offsets. With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>
2008-06-27CONNECTOR: add a proc entry to list connectorsLi Zefan1-0/+40
I got a problem when I wanted to check if the kernel supports process event connector, and It seems there's no way to do this check. At best I can check if the kernel supports connector or not, by looking into /proc/net/netlink, or maybe checking the return value of bind() to see if it's ENOENT. So it would be useful to add /proc/net/connector to list all supported connectors: # cat /proc/net/connector Name ID connector 4294967295:4294967295 cn_proc 1:1 w1 3:1 Changelog: - fix memory leak: s/seq_release/single_release - use spin_lock_bh instead of spin_lock_irqsave Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23connector: convert to single-threaded workqueueEvgeniy Polyakov1-1/+1
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru> We don't need one cqueue thread for each CPU. cqueue is used for receiving userspace datagrams, which are very rare and thus will happily live with a single queue. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-26CONNECTOR: make cn_already_initialized staticLi Zefan1-1/+1
It is used in connector.c only, so make it static. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Consolidate kernel netlink socket destruction.Denis V. Lunev1-6/+3
Create a specific helper for netlink kernel socket disposal. This just let the code look better and provides a ground for proper disposal inside a namespace. Signed-off-by: Denis V. Lunev <den@openvz.org> Tested-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CONNECTOR]: Cleanup struct cn_callback_entryLi Zefan1-1/+0
- 'cb' is a fake struct member. In a previous patch struct cn_callback was renamed to cn_callback_id, so 'cb' should have been deleted at that time. - 'nls' isn't used and is redundant, we can retrieve this data through cn_callback_entry.pdev->nls. - 'seq' and 'group' should be u32, as they are declared to be u32 in other places. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CONNECTOR]: Cleanup struct cn_queue_devLi Zefan1-1/+0
Struct member netlink_groups is never used, and I don't see how it can be useful. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CONNECTOR]: clean up {,__}cn_rx_skb()Li Zefan1-26/+4
- __cn_rx_skb() does nothing but calls cn_call_callback(), it doesn't check skb and msg sizes as the comment suggests, but cn_rx_skb() checks those sizes. - In cn_rx_skb() Local variable 'len' is not used. 'len' is probably intended to be passed to skb_pull(), but here skb_pull() is not needed, instead skb_free() is called. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CONNECTOR]: add a missing break in cn_netlink_send()Li Zefan1-0/+1
Each entry in the list has a unique id, so just break out of the loop if the matched id is found. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-08[CONNECTOR]: Don't touch queue dev after decrement of ref count.Li Zefan1-1/+1
cn_queue_free_callback() will touch 'dev'(i.e. cbq->pdev), so it should be called before atomic_dec(&dev->refcnt). Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-04[CONNECTOR]: Return proper error code in cn_call_callback()Li Zefan1-0/+2
Error code should be set to EINVAL instead of ENODEV if !queue_work(). There's another call of queue_work() which may set err to EINVAL. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30[CONNECTOR]: Fix a spurious kfree_skb() callMichal Januszewski1-4/+1
Remove a spurious call to kfree_skb() in the connector rx_skb handler. This fixes a regression introduced by the '[NET]: make netlink user -> kernel interface synchronious' patch (cd40b7d3983c708aabe3d3008ec64ffce56d33b0) Signed-off-by: Michal Januszewski <spock@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NET]: make netlink user -> kernel interface synchroniousDenis V. Lunev1-13/+1
This patch make processing netlink user -> kernel messages synchronious. This change was inspired by the talk with Alexey Kuznetsov about current netlink messages processing. He says that he was badly wrong when introduced asynchronious user -> kernel communication. The call netlink_unicast is the only path to send message to the kernel netlink socket. But, unfortunately, it is also used to send data to the user. Before this change the user message has been attached to the socket queue and sk->sk_data_ready was called. The process has been blocked until all pending messages were processed. The bad thing is that this processing may occur in the arbitrary process context. This patch changes nlk->data_ready callback to get 1 skb and force packet processing right in the netlink_unicast. Kernel -> user path in netlink_unicast remains untouched. EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock drop, but the process remains in the cycle until the message will be fully processed. So, there is no need to use this kludges now. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NET]: Support multiple network namespaces with netlinkEric W. Biederman1-1/+1
Each netlink socket will live in exactly one network namespace, this includes the controlling kernel sockets. This patch updates all of the existing netlink protocols to only support the initial network namespace. Request by clients in other namespaces will get -ECONREFUSED. As they would if the kernel did not have the support for that netlink protocol compiled in. As each netlink protocol is updated to be multiple network namespace safe it can register multiple kernel sockets to acquire a presence in the rest of the network namespaces. The implementation in af_netlink is a simple filter implementation at hash table insertion and hash table look up time. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16Use menuconfig objects: connectorJan Engelhardt1-3/+4
Use menuconfigs instead of menus, so the whole menu can be disabled at once instead of going through all options. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-25[NETLINK]: Switch cb_lock spinlock to mutex and allow to override itPatrick McHardy1-1/+1
Switch cb_lock to mutex and allow netlink kernel users to override it with a subsystem specific mutex for consistent locking in dump callbacks. All netlink_dump_start users have been audited not to rely on any side-effects of the previously used spinlock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NETLINK]: Introduce nlmsg_hdr() helperArnaldo Carvalho de Melo1-1/+1
For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the number of direct accesses to skb->data and for consistency with all the other cast skb member helpers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-07[CONNECTOR]: Bugfix for cn_call_callback()Philipp Reisner1-11/+11
When system under heavy stress and must allocate new work instead of reusing old one, new work must use correct completion callback. Patch is based on Philipp's and Lars' work. I only cleaned small stuff (and removed spaces instead of tabs). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-01-05[PATCH] connector: some fixes for ia64 unaligned access errorsErik Jacobson1-5/+6
On ia64, the various functions that make up cn_proc.c cause kernel unaligned access errors. If you are using these, for example, to get notification about all tasks forking and exiting, you get multiple unaligned access errors per process. Use put_unaligned() in the appropriate palces to fix this. Signed-off-by: Erik Jacobson <erikj@sgi.com> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: <stable@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-18[CONNECTOR]: Replace delayed work with usual work queue.Evgeniy Polyakov2-12/+9
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-17[CONNECTOR]: Fix compilation breakage introduced recently.Evgeniy Polyakov1-2/+1
Linus has changed work queue structure and has not tested it with connector compiled in, his changes break the build. Attached patch fixes compilation error. Patch is against commit 99f5e9718185f07458ae70c2282c2153a2256c91. Thanks to Toralf Förster for pointing this out. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-22WorkStruct: make allyesconfigDavid Howells2-19/+20
Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>
2006-07-31[PATCH] Process Events: Fix biarch compatibility issue. use __u64 timestampChandra Seetharaman1-5/+15
Events sent by Process Events Connector from a 64-bit kernel are not binary compatible with a 32-bit userspace program because the "timestamp" field (struct timespec) is not arch independent. This affects the fields that follow "timestamp" as they will be be off by 8 bytes. This is a problem for 32-bit userspace programs running with 64-bit kernels on ppc64, s390, x86-64.. any "biarch" system. Matt had submitted a different solution to lkml as an RFC earlier. We have since switched to a solution recommended by Evgeniy Polyakov. This patch fixes the problem by changing the timestamp to be a __u64, which stores the number of nanoseconds. Tested on a x86_64 system with both 32 bit application and 64 bit application and on a i386 system. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] connector-exportsAndrew Morton1-4/+3
Put the connector exports at the functions so people can see them in context. Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] Process Events - Header CleanupMatt Helsley1-0/+1
Move connector header include to precisely where it's needed. Remove unused time.h header file as well. This was leftover from previous iterations of the process events patches. Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: Nguyen Anh Quynh <aquynh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-19[CONNECTOR]: Initialize subsystem earlier.Evgeniy Polyakov1-5/+9
Attached patch declares connector init function as subsys_init() and returns -EAGAIN in case connector is not initialized yet. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[CONNECTOR]: Fix warning in cn_queue.cAndreas Schwab1-1/+1
cn_queue.c:130: warning: value computed is not used There is no point in testing the atomic value if the result is thrown away. From Evgeniy: It was created to put implicit smp barrier, but it is not needed there. Signed-off-by: Andreas Schwab <schwab@suse.de> Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-23[PATCH] sem2mutex: drivers: raw, connector, dcdbas, ppp_genericArjan van de Ven1-7/+8
Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-20[CONNECTOR]: Use netlink_has_listeners() to avoind unnecessary allocations.Evgeniy Polyakov1-3/+4
Return -ESRCH from cn_netlink_send() when there are not listeners, just as it could be done by netlink_broadcast(). Propagate netlink_broadcast() error back to the caller. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10[PATCH] Switch getnstimestamp() calls to ktime_get_ts()Matt Helsley1-5/+6
Use ktime_get_ts() to take the timestamp instead of getnstimestamp(). This patch prepares to remove getnstimestamp() by switching its only user to a different function with almost exactly the same code. Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Cc: john stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] drivers/connector/cn_proc.c typosDavid S. Miller1-2/+2
The parameter to put_cpu_var() is unreferenced by the implementation, and the compiler doesn't try to comprehend comments, so this wouldn't cause any problem, but if bugged me enough to post a fix :-) Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-12[PATCH] Add timestamp field to process eventsMatt Helsley1-0/+5
This adds a timestamp field to the events sent via the process event connector. The timestamp allows listeners to accurately account the duration(s) between a process' events and offers strong means with which to determine the order of events with respect to a given task while also avoiding the addition of per-task data. This alters the size and layout of the event structure and hence would break compatibility if process events connector as it stands in 2.6.15-rc2 were released as a mainline kernel. Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07[PATCH] Process Events ConnectorMatt Helsley3-0/+231
This patch adds a connector that reports fork, exec, id change, and exit events for all processes to userspace. It replaces the fork_advisor patch that ELSA is currently using. Applications that may find these events useful include accounting/auditing (e.g. ELSA), system activity monitoring (e.g. top), security, and resource management (e.g. CKRM). Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-08[PATCH] gfp flags annotations - part 1Al Viro1-2/+1
- added typedef unsigned int __nocast gfp_t; - replaced __nocast uses for gfp flags with gfp_t - it gives exactly the same warnings as far as sparse is concerned, doesn't change generated code (from gcc point of view we replaced unsigned int with typedef) and documents what's going on far better. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-04[CONNECTOR]: fix sparse gfp nocast warningsRandy Dunlap1-1/+2
Fix implicit nocast warnings in connector code: drivers/connector/connector.c:102:24: warning: implicit cast to nocast type drivers/connector/connector.c:114:45: warning: implicit cast to nocast type Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-26[CONNECTOR]: async connector mode.Evgeniy Polyakov2-49/+57
If input message rate from userspace is too high, do not drop them, but try to deliver using work queue allocation. Failing there is some kind of congestion control. It also removes warn_on on this condition, which scares people. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-11[NET]: Add netlink connector.Evgeniy Polyakov4-0/+675
Kernel connector - new userspace <-> kernel space easy to use communication module which implements easy to use bidirectional message bus using netlink as it's backend. Connector was created to eliminate complex skb handling both in send and receive message bus direction. Connector driver adds possibility to connect various agents using as one of it's backends netlink based network. One must register callback and identifier. When driver receives special netlink message with appropriate identifier, appropriate callback will be called. From the userspace point of view it's quite straightforward: socket(); bind(); send(); recv(); But if kernelspace want to use full power of such connections, driver writer must create special sockets, must know about struct sk_buff handling... Connector allows any kernelspace agents to use netlink based networking for inter-process communication in a significantly easier way: int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *)); void cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask); struct cb_id { __u32 idx; __u32 val; }; idx and val are unique identifiers which must be registered in connector.h for in-kernel usage. void (*callback) (void *) - is a callback function which will be called when message with above idx.val will be received by connector core. Using connector completely hides low-level transport layer from it's users. Connector uses new netlink ability to have many groups in one socket. [ Incorporating many cleanups and fixes by myself and Andrew Morton -DaveM ] Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>