Age | Commit message | Author | Files | Lines
13 daysMerge remote-tracking branch 'main/main' into nextHEADmastermainDavid Ahern10-25/+24
Signed-off-by: David Ahern <dsahern@kernel.org>
13 daysip link: hsr: Add support for passing information about INTERLINK deviceLukasz Majewski3-3/+24
An HSR-capable device can operate in two modes: Doubly Attached Node for HSR (DANH) and RedBOX (HSR-SAN). The latter allows non-HSR-aware device(s) to connect to an HSR network. Such a node is called a SAN (Singly Attached Network) and is connected via the INTERLINK network device. This patch adds support for passing information about the INTERLINK device, so the Linux driver can set it up properly. Signed-off-by: Lukasz Majewski <lukma@denx.de> Signed-off-by: David Ahern <dsahern@kernel.org>
13 daysUpdate kernel headersDavid Ahern3-5/+176
Update kernel headers to commit: 5829614a7b3b ("Merge branch 'net-sysctl-sentinel'") Signed-off-by: David Ahern <dsahern@kernel.org>
13 daysrdma: Add an option to display driver-specific QPs in the rdma toolChiara Meiohas3-0/+21
Utilize the -dd flag (driver-specific details) in the rdma tool to view driver-specific QPs which are not exposed yet.

The following examples show the mlx5 UMR QP, which is visible now:

$ rdma resource show qp link ibp8s0f1
link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]

$ rdma resource show qp link ibp8s0f1 -dd
link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 465 type DRIVER subtype REG_UMR state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]

$ rdma resource show
0: ibp8s0f0: pd 3 cq 4 qp 3 cm_id 0 mr 0 ctx 0 srq 2
1: ibp8s0f1: pd 3 cq 4 qp 3 cm_id 0 mr 0 ctx 0 srq 2

$ rdma resource show -dd
0: ibp8s0f0: pd 3 cq 4 qp 4 cm_id 0 mr 0 ctx 0 srq 2
1: ibp8s0f1: pd 3 cq 4 qp 4 cm_id 0 mr 0 ctx 0 srq 2

Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
13 daysrdma: update uapi headerChiara Meiohas1-0/+6
Update rdma_netlink.h file up to kernel commit e18fa0bbcedf ("RDMA/core: Add an option to display driver-specific QPs in the rdmatool") Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-29uapi: update vdpa.hStephen Hemminger1-3/+3
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-25ip: Exit exec in child process if setup failsYedaya Katsman1-1/+6
If we forked, returning from the function makes the calling code continue in both the child and the parent process. Make cmd_exec exit if setup failed and it has already forked.

An example of the issues this causes, where a failure in setup leads to multiple unnecessary tries:

```
$ ip netns
ef
ab
$ ip -all netns exec ls
netns: ef
setting the network namespace "ef" failed: Operation not permitted
netns: ab
setting the network namespace "ab" failed: Operation not permitted
netns: ab
setting the network namespace "ab" failed: Operation not permitted
```

Signed-off-by: Yedaya Katsman <yedaya.ka@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
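The fork-then-exit pattern described in this commit message can be sketched as follows. This is a minimal illustration and not the actual cmd_exec code: `run_in_child`, `setup_fn`, `setup_ok` and `setup_fail` are hypothetical names, and the child simply exits where a real tool would exec the requested command.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical setup callback: returns 0 on success, non-zero on failure. */
typedef int (*setup_fn)(void *arg);

/*
 * Fork, run setup() in the child, and report the child's exit status.
 * If setup() fails, the child exits right away instead of returning
 * into the caller's code, so only the parent keeps running the caller.
 */
static int run_in_child(setup_fn setup, void *arg)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;		/* fork failed, still a single process */

	if (pid == 0) {
		if (setup && setup(arg))
			_exit(1);	/* never return from the child on failure */
		_exit(0);		/* a real tool would exec the command here */
	}

	if (waitpid(pid, &status, 0) < 0)
		return -1;

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Example callbacks used to exercise both paths. */
static int setup_ok(void *arg)   { (void)arg; return 0; }
static int setup_fail(void *arg) { (void)arg; return -1; }
```

With this shape, a failing setup step is reported once by the child, and a loop in the parent (such as one iterating over all namespaces) is not re-run by a stray child process.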
2024-04-23Merge branch 'pfcp' into nextDavid Ahern5-1/+161
Wojciech Drewek says: ==================== The new PFCP module was accepted in the kernel together with cls_flower changes which allow filtering packets by PFCP-specific fields [1]. Packet Forwarding Control Protocol is a 3GPP Protocol defined in TS 29.244 [2]. Extend ip link with support for the new PFCP device. Add pfcp_opts support in tc-flower. [1] https://lore.kernel.org/netdev/171196563119.11638.12210788830829801735.git-patchwork-notify@kernel.org/ [2] https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3111 ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-23use missing argument helperStephen Hemminger6-19/+13
There is a helper in utilities to handle missing argument, but it was not being used consistently. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-23f_flower: implement pfcp optsMichal Swiatkowski3-0/+150
Allow adding a tc filter for the PFCP header. Add support for parsing TCA_FLOWER_KEY_ENC_OPTS_PFCP. Options are as follows: TYPE:SEID. TYPE is an 8-bit value represented in hex and can be 1 for session header and 0 for node header. In a PFCP packet this is the S flag in the header. SEID is a 64-bit session id value represented in hex. This patch enables adding hardware filters using PFCP fields, see [1]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=d823265dd45bbf14bd67aa476057108feb4143ce Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David Ahern <dsahern@kernel.org>
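The TYPE:SEID format described above could be parsed roughly as below. This is only a sketch under the assumptions stated in the commit message (an 8-bit hex TYPE, a colon, a 64-bit hex SEID); `struct pfcp_opt` and `parse_pfcp_opt` are hypothetical names and do not come from f_flower.c.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for the two fields named in the commit message. */
struct pfcp_opt {
	uint8_t type;	/* 1 = session header, 0 = node header */
	uint64_t seid;	/* 64-bit session id */
};

/* Parse "TYPE:SEID", both in hex. Returns 0 on success, -1 on bad input. */
static int parse_pfcp_opt(const char *str, struct pfcp_opt *opt)
{
	char *end;
	unsigned long type;
	unsigned long long seid;

	/* Require a hex digit up front so signs and whitespace are rejected. */
	if (!isxdigit((unsigned char)*str))
		return -1;
	errno = 0;
	type = strtoul(str, &end, 16);
	if (errno || *end != ':' || type > UINT8_MAX)
		return -1;

	str = end + 1;
	if (!isxdigit((unsigned char)*str))
		return -1;
	errno = 0;
	seid = strtoull(str, &end, 16);
	if (errno || *end != '\0')
		return -1;

	opt->type = (uint8_t)type;
	opt->seid = seid;
	return 0;
}
```

Checking the range of TYPE before narrowing it to 8 bits avoids silently truncating an out-of-range value such as 100 (hex).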
2024-04-23ip: PFCP device supportWojciech Drewek2-1/+11
Packet Forwarding Control Protocol is a 3GPP Protocol defined in TS 29.244 [1]. Add support for the PFCP device type in ip link. It is capable of receiving PFCP messages and extracting their metadata (session ID). Its only purpose is to be used together with tc flower to create SW/HW filters. The PFCP module does not take any netlink attributes, so there is no need to parse any args. Add new sections to the man page to let the user know about the new device type. [1] https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3111 Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-20man: fix doc, ip link does support "change"Jiayun Chen2-2/+2
ip link does support "change":

	if (matches(*argv, "set") == 0 ||
	    matches(*argv, "change") == 0)
		return iplink_modify(RTM_NEWLINK, 0, argc-1, argv+1);

The attached patch documents this.

Signed-off-by: Jiayun Chen <jiayunchen@smail.nju.edu.cn>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-21tc/util: remove unused argument from print_tcstats2_attrStephen Hemminger3-6/+4
The function doesn't use the FILE handle. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-21tc/police: remove unused argument to tc_print_policeStephen Hemminger9-11/+11
FILE handle no longer used. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-21tc/util: remove unused argument from print_action_controlStephen Hemminger20-24/+22
The FILE handle is no longer used. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-21tc/util: remove unused argument from print_tmStephen Hemminger21-21/+21
File argument no longer used. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-21tc/u32: remove FILE argumentStephen Hemminger1-7/+7
The pretty printing routines no longer use the file handle. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-21Merge remote-tracking branch 'main/main' into nextDavid Ahern87-333/+434
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-16man: use clsact qdisc for port mirroring examples on matchall and mirredArınç ÜNAL2-16/+16
The clsact qdisc supports ingress and egress. Instead of using two qdiscs to do ingress and egress port mirroring, clsact can be used. Therefore, use clsact for the port mirroring examples on the tc-matchall.8 and tc-mirred.8 documents. Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-15mnl: initialize generic netlink versionStephen Hemminger1-0/+2
The version field in mnlu was being passed in but never set. This meant that everywhere mnlu_gen_socket was used, the version would be uninitialized data from malloc(). Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-04-13ss: mptcp: print out last time countersGeliang Tang1-0/+6
Three new "last time" counters have been added to "struct mptcp_info": last_data_sent, last_data_recv and last_ack_recv. They have been added in commit 18d82cde7432 ("mptcp: add last time fields in mptcp_info") in net-next recently. This patch prints out these new counters into mptcp_stats output in ss. Signed-off-by: Geliang Tang <geliang@kernel.org> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-13devlink: Support setting max_io_eqsParav Pandit2-1/+40
Devices send event notifications for the IO queues, such as tx and rx queues, through event queues.

Enable a privileged owner, such as a hypervisor PF, to set the number of IO event queues for the VF and SF during the provisioning stage.

example:

Get maximum IO event queues of the VF device::

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
    function:
        hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 10

Set maximum IO event queues of the VF device::

$ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
    function:
        hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 32

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-13Update kernel headersDavid Ahern5-2/+57
Update kernel headers to commit: 32affa5578f0 ("fib: rules: no longer hold RTNL in fib_nl_dumprule()") Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-13ip: Support filter links with no VF inforenmingshuai4-7/+18
The kernel has added the IFLA_EXT_MASK attribute for indicating that certain extended ifinfo values are requested by the user application. The ip link show command always requests the VFs' extended ifinfo.

In this case, RTM_GETLINK for more than about 220 VFs truncates IFLA_VFINFO_LIST because the maximum value of nlattr's nla_len is exceeded. As a result, the ip link show command only shows the truncated VF info, such as:

#ip link show dev eth0
1: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 ...
    link/ether ...
    vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff ...
Truncated VF list: eth0

This patch adds novf to support filtering links with no VF info:
ip link show novf

v2:
- use a one-word option instead of an option with on/off.
- fix the issue that broke the changes already made to the link filter for VFs.

v3:
- "novf" sets vfinfo to 0 and the RTEXT_FILTER_VF flag is not added.

Signed-off-by: Mingshuai Ren <renmingshuai@huawei.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-04-03man: fix brief explanation of `ip netns attach NAME PID`Yusuke Ichiki1-1/+1
Rewrite the explanation as it was duplicated with that of `ip netns add NAME`. Signed-off-by: Yusuke Ichiki <public@yusuke.pub> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-28arpd: create /var/lib/arpd on first useMax Gautier2-2/+11
The motivation is to build distribution packages without /var, to move towards stateless systems; see the link below (TL;DR: provisioning anything outside of /usr on boot). We only try to create the database directory when it's in the default location, and assume its parent (/var/lib in the usual case) exists. Links: https://0pointer.net/blog/projects/stateless.html Signed-off-by: Max Gautier <mg@max.gautier.name> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-28ila: allow show, list and lst as synonymsStephen Hemminger1-1/+3
Across ip commands, show, list and the misspelling lst are treated the same. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-26bridge: vlan: fix compressvlans usageDate Huang1-0/+6
Add the missing 'compressvlans' to man page Signed-off-by: Date Huang <tjjh89017@hotmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-26bridge: vlan: fix compressvlans usageDate Huang1-1/+1
Fix the incorrect short opt for compressvlans and color in usage Signed-off-by: Date Huang <tjjh89017@hotmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-24uapi: update vdpa.hStephen Hemminger1-0/+17
Autogenerated from 6.9-rc1. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-19ifstat: don't set errno if strdup failsDenis Kirjanov1-1/+0
The strdup man page states that errno is set by the function, so there is no need to set it. Signed-off-by: Denis Kirjanov <dkirjanov@suse.de> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-19ifstat: handle strdup return valueDenis Kirjanov1-0/+4
get_nlmsg_extended is missing the check that get_nlmsg already performs. v2: don't set the errno value explicitly Signed-off-by: Denis Kirjanov <dkirjanov@suse.de> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
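The strdup() handling discussed in these two ifstat commits follows a common shape, sketched below. The names `struct ent`, `ent_new` and `ent_free` are hypothetical and only stand in for the real ifstat structures; the point is that a NULL return is checked and errno is left as strdup() set it.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type holding a heap-allocated interface name. */
struct ent {
	char *name;
};

/* Allocate an entry and copy the name. Returns NULL on allocation failure. */
static struct ent *ent_new(const char *name)
{
	struct ent *e = malloc(sizeof(*e));

	if (!e)
		return NULL;

	e->name = strdup(name);
	if (!e->name) {
		/* strdup() has already set errno, so only clean up here. */
		free(e);
		return NULL;
	}
	return e;
}

/* Release an entry created by ent_new(); NULL is accepted. */
static void ent_free(struct ent *e)
{
	if (e) {
		free(e->name);
		free(e);
	}
}
```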
2024-03-16uapi: update headersStephen Hemminger4-1/+69
User headers based on pre 6.9-rc1 Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-15Merge branch 'nexthop-grp-stats' into nextDavid Ahern4-0/+156
Petr Machata says:

====================
Next hop group stats allow verification of balancedness of a next hop group. The feature was merged in kernel commit 7cf497e5a122 ("Merge branch 'nexthop-group-stats'"). This patchset adds to ip the corresponding support.

NH group stats come in two flavors: as statistics for SW and for HW datapaths. The former is shown when -s is given to "ip nexthop". The latter demands more work from the kernel, and possibly driver and HW, and might not be always necessary. Therefore tie it to -s -s, similarly to how ip link shows more detailed stats when -s is given twice.

Here's an example usage:

# ip link add name gre1 up type gre \
     local 172.16.1.1 remote 172.16.1.2 tos inherit
# ip nexthop replace id 1001 dev gre1
# ip nexthop replace id 1002 dev gre1
# ip nexthop replace id 1111 group 1001/1002 hw_stats on
# ip -s -s -j -p nexthop show id 1111
[ {
        [ ...snip... ]
        "hw_stats": {
            "enabled": true,
            "used": true
        },
        "group_stats": [ {
                "id": 1001,
                "packets": 0,
                "packets_hw": 0
            },{
                "id": 1002,
                "packets": 0,
                "packets_hw": 0
            } ]
    } ]

hw_stats.enabled shows whether hw_stats have been requested for the given group. hw_stats.used shows whether any driver actually implemented the counter. group_stats[].packets show the total stats, packets_hw only the HW-datapath stats.
====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-15ip: ipnexthop: Allow toggling collection of nexthop group HW statisticsPetr Machata2-0/+14
Besides SW datapath stats, the kernel also supports collecting statistics from HW datapath, for nexthop groups offloaded to HW. Since collection of these statistics may consume HW resources, there is an interface to request that the HW stats be recorded. Add this toggle to "ip nexthop". Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-15ip: ipnexthop: Support dumping next hop group HW statsPetr Machata2-0/+33
Besides SW datapath stats, the kernel also supports collecting statistics from HW datapath, for nexthop groups offloaded to HW. Request that these be collected when ip is given "-s -s", similarly to how "ip link" shows more statistics in that case. Besides the statistics themselves, also show whether the collection of HW statistics was in fact requested, and whether any driver actually implemented the request. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-15ip: ipnexthop: Support dumping next hop group statsPetr Machata2-0/+95
Next hop group stats allow verification of balancedness of a next hop group. The feature was merged in kernel commit 7cf497e5a122 ("Merge branch 'nexthop-group-stats'"). Add to ip the corresponding support. The statistics are requested if "ip nexthop" is started with -s. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-15libnetlink: Add rta_getattr_uint()Petr Machata1-0/+14
NLA_UINT attributes have a 4-byte payload if possible, and an 8-byte one if necessary. Add a function to extract these. Since we need to dispatch on length anyway, make the getter truly universal by supporting also u8 and u16. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
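The length-based dispatch described above can be illustrated without any netlink headers. The sketch below is not the libnetlink helper itself: `get_uint` is a hypothetical function that takes a raw payload pointer and its length, whereas the real getter works on a struct rtattr. Returning an all-ones value for unsupported lengths is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read an unsigned integer whose width is only known from the payload
 * length (1, 2, 4 or 8 bytes, host byte order). memcpy() is used so
 * that unaligned payloads are handled safely.
 */
static uint64_t get_uint(const void *payload, size_t len)
{
	switch (len) {
	case sizeof(uint8_t): {
		uint8_t v;

		memcpy(&v, payload, sizeof(v));
		return v;
	}
	case sizeof(uint16_t): {
		uint16_t v;

		memcpy(&v, payload, sizeof(v));
		return v;
	}
	case sizeof(uint32_t): {
		uint32_t v;

		memcpy(&v, payload, sizeof(v));
		return v;
	}
	case sizeof(uint64_t): {
		uint64_t v;

		memcpy(&v, payload, sizeof(v));
		return v;
	}
	}
	return UINT64_MAX;	/* unsupported length (assumed sentinel) */
}
```

A caller therefore does not need to know in advance whether the sender used the 4-byte or the 8-byte encoding for a given counter.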
2024-03-15Update kernel headersDavid Ahern4-1/+69
Update kernel headers to commit: 237bb5f7f7f5 ("cxgb4: unnecessary check for 0 in the free_sge_txq_uld() function") Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-13tc-simple.8: take Jamal's prompt off examplesStephen Hemminger1-6/+6
The examples on tc-simple man page had extra stuff in the prompt which is not necessary. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13simple: support json outputStephen Hemminger1-3/+5
Last action that never got JSON support. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13skbmod: support json in printStephen Hemminger1-16/+21
This tc action never got jsonized. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13pedit: log errors to stderrStephen Hemminger1-3/+3
The errors should go to stderr, not to stdout. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13tc: support JSON for legacy statsStephen Hemminger1-13/+15
The extended stats already supported JSON output, add to the legacy stats as well. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13man: fix typo found by LintianLuca Boccassi1-1/+1
Signed-off-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13tc: remove no longer used helpersStephen Hemminger4-20/+3
The removal of tick usage in netem means that some of the helper functions in tc are no longer used and can be safely removed. Other functions can be made static. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13netem: use 64 bit value for latency and jitterStephen Hemminger1-36/+46
The current version of netem in iproute2 has a maximum of 4.3 seconds because of scaled 32 bit clock values. Some users would like to be able to use larger delays to emulate things like storage delays. Since kernel version 4.15, netem qdisc had netlink parameters to express wider range of delays in nanoseconds. But the iproute2 side was never updated to use them. This does break compatibility with older kernels (4.14 and earlier). With these out of support kernels, the latency/delay parameter will end up being ignored. Reported-by: Marc Blanchet <marc.blanchet@viagenie.ca> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-13README: add note about kernel version compatibilityStephen Hemminger1-0/+13
Since next netem changes will break some usages of out of support kernels, add an explicit policy about range of kernel versions. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-12tc: make exec_util arg constStephen Hemminger3-3/+3
The callbacks in exec_util should not be modifying underlying qdisc operations structure. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-12tc: make action_util arg constStephen Hemminger21-49/+49
The callbacks in action_util should not be modifying underlying qdisc operations structure. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-12tc: make filter_util args constStephen Hemminger12-27/+27
The callbacks in filter_util should not be modifying underlying qdisc operations structure. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-12tc: make qdisc_util arg constStephen Hemminger34-102/+103
The callbacks in qdisc_util should not be modifying underlying qdisc operations structure. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-11Merge branch 'main' of git://git.kernel.org/pub/scm/network/iproute2/iproute2-nextStephen Hemminger21-97/+838
2024-03-11tc/action: remove trailing whitespaceStephen Hemminger1-1/+1
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-11rt_names: whitespace cleanupStephen Hemminger1-21/+22
Fix indentation. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-11v6.8.0v6.8.0Stephen Hemminger1-1/+1
2024-03-06iproute2: move generic_proc_open into libDenis Kirjanov4-34/+24
The function has the same definition in ifstat and ss. v2: fix the typo in the changelog v3: rebase on master Signed-off-by: Denis Kirjanov <dkirjanov@suse.de> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-06ifstat: support 64 interface statsStephen Hemminger1-5/+32
The 32 bit statistics are problematic since a 32 bit value can easily wrap around at high speed. Use 64 bit stats if available. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@kernel.org>
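To put a number on how quickly a 32 bit byte counter wraps, the arithmetic can be sketched as below. The helper `wrap_seconds` is hypothetical, and the 10 Gbit/s link speed used in the note afterwards is an assumed example rate, not a figure from the commit.

```c
#include <stdint.h>

/*
 * Seconds until a byte counter of the given width wraps around when
 * traffic flows at a constant rate of bits_per_second.
 */
static double wrap_seconds(unsigned int counter_bits, double bits_per_second)
{
	double bytes_per_second = bits_per_second / 8.0;
	double capacity;

	if (counter_bits >= 64)
		capacity = 18446744073709551616.0;	/* 2^64 */
	else
		capacity = (double)(1ULL << counter_bits);

	return capacity / bytes_per_second;
}
```

At an assumed 10 Gbit/s, `wrap_seconds(32, 10e9)` comes to roughly 3.4 seconds, so two samples taken a few seconds apart can no longer be compared reliably, while the 64 bit counter would take centuries to wrap at the same rate.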
2024-03-04ss: fix output of MD5 signature keys configured on TCP socketsLars Ellenberg1-1/+16
da9cc6ab introduced printing of MD5 signature keys when found. But when changing printf() to out() calls with 90351722, the implicit printf call in print_escape_buf() was overlooked. That results in a funny output in the first line: "<all-your-tcp-signature-keys-concatenated>State" and ambiguity as to which of those bytes belong to which socket. Add a static void out_escape_buf() immediately before we use it. da9cc6ab (ss: print MD5 signature keys configured on TCP sockets, 2017-10-06) 90351722 (ss: Replace printf() calls for "main" output by calls to helper, 2017-12-12) Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-04tc: Fix json output for f_u32Takanori Hirano1-1/+1
Signed-off-by: Takanori Hirano <me@hrntknr.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-03ifstat: handle unlink return valueDenis Kirjanov1-2/+4
Print an error message if we can't remove the history file v2: exit if unlink failed v3: restore the changelog Signed-off-by: Denis Kirjanov <dkirjanov@suse.de> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-03-03nstat: convert sprintf to snprintfDenis Kirjanov1-2/+2
Use snprintf to print only valid data. This is similar to the change done for ifstat. Signed-off-by: Denis Kirjanov <dkirjanov@suse.de> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-03nstat: use stack space for history file nameDenis Kirjanov1-5/+6
As the name doesn't require a lot of storage, put it on the stack. Moreover, the memory allocated via malloc was never freed. Signed-off-by: Denis Kirjanov <dkirjanov@suse.de> Signed-off-by: David Ahern <dsahern@kernel.org>
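The two nstat changes above (a stack buffer for the history file name, and snprintf instead of sprintf) combine naturally. The sketch below is illustrative only: `build_hist_name` and the path format are hypothetical and are not copied from nstat.c.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build a per-user history file path in a caller-provided buffer.
 * Returns 0 on success, -1 if formatting failed or the path was truncated.
 */
static int build_hist_name(char *buf, size_t len, const char *dir,
			   const char *tool, unsigned int uid)
{
	int n = snprintf(buf, len, "%s/.%s.u%u", dir, tool, uid);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would typically declare `char hist_name[128];` on the stack and pass `sizeof(hist_name)` as the length, which removes both the malloc() call and the need to free the buffer later.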
2024-03-03nstat: constify name argument in generic_proc_openDenis Kirjanov1-1/+1
the argument passed to the function is always a constant value Signed-off-by: Denis Kirjanov <dkirjanov@suse.de> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-03man8: ioam: add doc for monitor commandJustin Iurman1-0/+5
Add a sentence in the doc to describe what the new "monitor" command does. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-03ip: ioam6: add monitor commandJustin Iurman1-1/+77
Add the "ip ioam monitor" command to be able to read all IOAM data received. This is based on a netlink multicast group. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-03-03Update kernel headersDavid Ahern5-6/+52
Update kernel headers to commit 4b2765ae410a ("Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next") Signed-off-by: David Ahern <dsahern@kernel.org>
2024-02-29uapi: update in6.hStephen Hemminger1-1/+1
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-27Merge remote-tracking branch 'main/main' into nextDavid Ahern6-46/+65
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-02-27Merge branch 'ss-socket-local-storage' into nextDavid Ahern2-9/+407
Quentin Deslandes says:

====================
BPF allows programs to store socket-specific data using BPF_MAP_TYPE_SK_STORAGE maps. The data is attached to the socket itself, and Martin added INET_DIAG_REQ_SK_BPF_STORAGES, so it can be fetched using the INET_DIAG mechanism. Currently, ss doesn't request the socket-local data; this patch aims to fix this.

The first patch requests the socket-local data for the requested map ID (--bpf-map-id=) or all the maps (--bpf-maps). It then prints the map_id in COL_EXT.

Patch #2 uses libbpf and BTF to pretty print the map's content, like `bpftool map dump` would do.

Patch #3 updates ss' man page to explain new options.

While I think it makes sense for ss to provide the socket-local storage content for the sockets, it's difficult to reconcile the column-based output of ss with having readable socket-local data. Hence, the socket-local data is printed in a readable fashion over multiple lines under its socket statistics, independently of the column-based approach.

Here is an example of ss' output with --bpf-maps:

[...]
ESTAB 340116 0 [...]
    map_id: 114 [
        (struct my_sk_storage){
            .field_hh = (char)3,
            (union){
                .a = (int)17,
                .b = (int)17,
            },
        }
    ]

Changed this series to an RFC as the merging window for net-next is closed.

Changes from v8:
* Remove usage of libbpf_bpf_map_type_str() which requires libbpf-1.0+ and provides very little added value (David).
* Use ENABLE_BPF_SKSTORAGE_SUPPORT to gate the BPF socket-local storage support, instead of HAVE_LIBBPF. iproute2 depends on libbpf-0.1, but this change needs libbpf-0.5+. If the requirements are not met, ss can still be compiled and used without BPF socket-local storage support, but a warning will be printed at compile time.

Changes from v7:
* Fix comment format and checkpatch warnings (Stephen, David).
* Replaced Co-authored-by with Co-developed-by + Signed-off-by for Martin's contribution on patch #1 to follow checkpatch requirements, with Martin's approval.

Changes from v6:
* Remove column dedicated to BPF socket-local storage (COL_SKSTOR), use COL_EXT instead (Matthieu).

Changes from v5:
* Add support for --oneline when printing socket-local data.
* Use \t to indent instead of " " to be consistent with other columns.
* Removed Martin's ack on patch #2 due to amount of lines changed.

Changes from v4:
* Fix return code for 2 calls.
* Fix issue when inet_show_netlink() retries a request.
* BPF dump object is created in bpf_map_opts_load_info().

Changes from v3:
* Minor refactoring to reduce number of HAVE_LIBBPF usage.
* Update ss' man page.
* btf_dump structure created to print the socket-local data is cached in bpf_map_opts. Creation of the btf_dump structure is performed if needed, before printing the data.
* If a map can't be pretty-printed, print its ID and a message instead of skipping it.
* If show_all=true, send an empty message to the kernel to retrieve all the maps (as Martin suggested).

Changes from v2:
* bpf_map_opts_is_enabled is not inline anymore.
* Add more #ifdef HAVE_LIBBPF to prevent compilation error if libbpf support is disabled.
* Fix erroneous usage of args instead of _args in vout().
* Add missing btf__free() and close(fd).

Changes from v1:
* Remove the first patch from the series (fix) and submit it separately.
* Remove double allocation of struct rtattr.
* Close BPF map FDs on exit.
* If bpf_map_get_fd_by_id() fails with ENOENT, print an error message and continue to the next map ID.
* Fix typo in new command line option documentation.
* Only use bpf_map_info.btf_value_type_id and ignore bpf_map_info.btf_vmlinux_value_type_id (unused for socket-local storage).
* Use btf_dump__dump_type_data() instead of manually using BTF to pretty-print socket-local storage data. This change alone divides the size of the patch series by 2.
====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2024-02-27ss: update man page to document --bpf-maps and --bpf-map-id=Quentin Deslandes1-0/+6
Document new --bpf-maps and --bpf-map-id= options. Signed-off-by: Quentin Deslandes <qde@naccy.de> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-02-27ss: pretty-print BPF socket-local storageQuentin Deslandes1-11/+137
ss is able to print the map ID(s) for which a given socket has BPF socket-local storage defined (using --bpf-maps or --bpf-map-id=). However, the actual content of the map remains hidden. This change aims to pretty-print the socket-local storage content following the socket details, similar to what `bpftool map dump` would do. The exact output format is inspired by drgn, while the BTF data processing is similar to bpftool's. ss will use libbpf's btf_dump__dump_type_data() to ease pretty-printing of binary data. This requires out_bpf_sk_storage_print_fn() as a print callback function used by btf_dump__dump_type_data(). vout() is also introduced, which is similar to out() but accepts a va_list as parameter. ss' output remains unchanged unless --bpf-maps or --bpf-map-id= is used, in which case each socket containing BPF local storage will be followed by the content of the storage before the next socket's info is displayed. Signed-off-by: Quentin Deslandes <qde@naccy.de> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-02-26ip: Add missing command explanations in man pageYedaya Katsman1-2/+31
There are a few commands missing from the ip command syntax list; add them. They are also missing from the see also section; add them there as well. Note that there isn't an ip-ila man page, so I didn't link to it. Also fix a few punctuation mistakes. Signed-off-by: Yedaya Katsman <yedaya.ka@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-26iptuntap: use TUNDEV macroDenis Kirjanov1-3/+2
The code already has a path to the tun/tap device. Signed-off-by: Denis Kirjanov <dkirjanov@suse.de> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-26ss: add support for BPF socket-local storageQuentin Deslandes1-3/+269
While sock_diag is able to return BPF socket-local storage in response to INET_DIAG_REQ_SK_BPF_STORAGES requests, ss doesn't request it. This change introduces the --bpf-maps and --bpf-map-id= options to request BPF socket-local storage for all SK_STORAGE maps, or only specific ones. The bigger part of this change checks the requested map IDs and ensures they are valid. The socket-local data is printed into the COL_EXT column. When --bpf-maps is used, ss will send an empty INET_DIAG_REQ_SK_BPF_STORAGES request; in return, the kernel will send all the BPF socket-local storage entries for a given socket. The BTF data for each map is loaded on demand, as ss can't predict which map IDs are used. When --bpf-map-id=ID is used, a file descriptor to each requested map is opened to 1) ensure the map doesn't disappear before the data is printed, and 2) ensure the map type is BPF_MAP_TYPE_SK_STORAGE. The BTF data for each requested map is loaded before the request is sent to the kernel. Co-developed-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Quentin Deslandes <qde@naccy.de> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-02-19man: ip-link.8: add a note for gso_ipv4_max_sizeXin Long1-0/+5
As Paolo noticed, a skb->len check against gso_max_size was added in: https://lore.kernel.org/netdev/20231219125331.4127498-1-edumazet@google.com/ gso_max_size needs to be set to a value greater than or equal to gso_ipv4_max_size to make BIG TCP IPv4 work properly. To not break the current setup, this patch just adds a note into its man doc for this. Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com>
2024-02-19ip: Update command usage in man pageYedaya Katsman1-4/+6
The usage in the man page was out of date with the usage help, fix it. Also sort the commands alphabetically, the same as the command usage. Signed-off-by: Yedaya Katsman <yedaya.ka@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-19tc: Support json option in tc-cgroup, tc-flow and tc-routeTakanori Hirano3-36/+48
Fix json corruption when using the "-json" option in some cases Signed-off-by: Takanori Hirano <me@hrntknr.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-19tc: Change of json format in tc-fwTakanori Hirano1-6/+6
In the case of a process such as mapping a json to a structure, it can be difficult if the keys have the same name but different types. Since handle is used in hex string, change it to fw. Signed-off-by: Takanori Hirano <me@hrntknr.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-18ifstat: convert sprintf to snprintfDenis Kirjanov1-6/+6
Use snprintf to print only valid data v2: adjust formatting v3: fix the issue with a buffer length Signed-off-by: Denis Kirjanov <dkirjanov@suse.de> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-02-18netlink: display information from missing type extackStephen Hemminger1-0/+4
The kernel will now send missing type information in error response. Print it if present. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-02-18Merge remote-tracking branch 'main/main' into nextDavid Ahern29-51/+77
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-02-14iproute2: fix type incompatibility in ifstat.cStephen Gallagher1-1/+1
Throughout ifstat.c, ifstat_ent.val is accessed as a long long unsigned type, however it is defined as __u64. This works by coincidence on many systems, however on ppc64le, __u64 is a long unsigned. This patch makes the type definition consistent with all of the places where it is accessed. Fixes: 5a52102b7c8f ("ifstat: Add extended statistics to ifstat") Reviewed-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Gallagher <sgallagh@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
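The mismatch described above can be sketched as follows. This is a minimal illustration, not the actual ifstat.c code: `struct stat_entry` and `format_counter()` are hypothetical names, and the only point is that the field's declared type and the `%llu` conversion agree on every ABI.

```c
#include <stdio.h>

/* Hypothetical stand-in for a statistics entry; not the real ifstat_ent. */
struct stat_entry {
	/* Declared with the exact type that the "%llu" conversions expect. */
	unsigned long long val;
};

/*
 * Format the counter as decimal text. Because val is unsigned long long,
 * "%llu" is correct whether or not a 64-bit typedef such as __u64 happens
 * to be "unsigned long" on the target.
 */
static int format_counter(const struct stat_entry *e, char *buf, size_t len)
{
	return snprintf(buf, len, "%llu", e->val);
}
```

An alternative would be to keep a fixed-width type and cast at each print site; declaring the field with the printed type keeps the call sites unchanged.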
2024-02-10tc: u32: check return value from snprintfStephen Hemminger1-0/+4
Add assertion to check for case of snprintf failing (bad format?) or buffer getting full. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
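A minimal sketch of the kind of check this commit describes, under the assumption that both failure modes are reported to the caller. The commit itself adds an assertion; the hypothetical `format_handle()` below returns an error code instead, which is a swapped-in technique chosen so the example is easy to exercise.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "major:minor" in hex into buf.
 * Returns 0 when the whole text fit, -1 when snprintf() failed
 * (negative return) or the output was truncated (return >= len).
 */
static int format_handle(char *buf, size_t len,
			 unsigned int major, unsigned int minor)
{
	int n = snprintf(buf, len, "%x:%x", major, minor);

	if (n < 0)
		return -1;	/* encoding or format error */
	if ((size_t)n >= len)
		return -1;	/* buffer too small, output truncated */
	return 0;
}
```

snprintf() returns the length the full output would have had, so comparing that value against the buffer size is what detects truncation.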
2024-02-10tc: drop no longer used prototype from tc_util.hStephen Hemminger1-1/+0
Part of the ipt removal missed this. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-10tc: print unknown action on stderrStephen Hemminger1-1/+1
This is an error, and should not go to stdout. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-10tc: bpf: fix extra newline in JSON outputStephen Hemminger1-1/+1
Don't print newline at end of bpf if in JSON mode. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-10tc: Support json option in tc-fw.Takanori Hirano1-7/+15
Fix json corruption when using the "-json" option in cases where tc-fw is set. Signed-off-by: Takanori Hirano <me@hrntknr.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-10tc: u32: errors should be printed on stderrStephen Hemminger1-2/+2
Don't corrupt stdout with error messages, matters if JSON is used. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-09docs, man: fix some typosAndrea Claudi1-1/+1
Fix some typos and spelling errors in iproute2 documentation. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-09treewide: fix typos in various commentsAndrea Claudi3-3/+3
Fix various typos and spelling errors in some iproute2 comments. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-08ip: detect rtnl_listen errors while monitoring netnsStephen Hemminger1-1/+2
If rtnl_listen detects error (such as netlink socket EOF), then exit with status 2 like other iproute2 monitor commands. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-08ip: detect errors in netconf monitor modeStephen Hemminger1-1/+2
If rtnl_listen() returns error while looking for netconf events, then exit with status of 2 as other iproute2 monitor actions do. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-08ctrl: Fix fd leak in ctrl_listen()Maks Mishin1-2/+3
Use the same pattern for handling rtnl_listen() errors that is used across other iproute2 commands. All other commands exit with status of 2 if rtnl_listen fails. Reported-off-by: Maks Mishin <maks.mishinFZ@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-08ctrl: Fix fd leak in ctrl_list()Maks Mishin1-1/+1
if ctrl_list is called with get operation and wrong number of parameters, it would forget to close the local netlink handle. Signed-off-by: Maks Mishin <maks.mishinFZ@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-07ip/bond: add coupled_control supportAahil Awatramani1-1/+17
coupled_control specifies whether the LACP state machine's MUX in the 802.3ad mode should have separate Collecting and Distributing states per IEEE 802.1AX-2008 5.4.15 for coupled and independent control state. By default this setting is on and does not separate the Collecting and Distributing states, maintaining the bond in coupled control. If set to off, it switches to the independent control state machine, which separates the Collecting and Distributing states. Signed-off-by: Aahil Awatramani <aahila@google.com> v2: Dropped uapi header change Use of print_on_off and parse_on_off Signed-off-by: David Ahern <dsahern@kernel.org>
2024-02-07Update kernel headersDavid Ahern3-5/+82
Update kernel headers to commit: 1e8f1477aba5 ("Merge branch 'net-phy-c22-c45-enumeration'") Signed-off-by: David Ahern <dsahern@kernel.org>
2024-02-05ip: Add missing -echo option to usageYedaya Katsman1-1/+1
In commit b264b4c6568c ("ip: add NLM_F_ECHO support") the "-echo" option was added, but not to the options in the usage. Add it. Note there doesn't seem to be any particular order for the options here, so it's placed kind of randomly. Fixes: b264b4c6568c ("ip: add NLM_F_ECHO support") Signed-off-by: Yedaya Katsman <yedaya.ka@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-02-05ip: Add missing stats command to usageYedaya Katsman1-1/+1
The stats command was added in 54d82b0699a0 ("ip: Add a new family of commands, "stats""), but wasn't included in the subcommand list in the help usage. Add it in the right position alphabetically. Fixes: 54d82b0699a0 ("ip: Add a new family of commands, "stats"") Signed-off-by: Yedaya Katsman <yedaya.ka@gmail.com> Reviewed-by: Petr Machata <me@pmachata.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-31ip: remove non-existent amt subcommand from usageYedaya Katsman1-2/+2
Commit 6e15d27aae94 ("ip: add AMT support") added "amt" to the list of "first level" commands list, which isn't correct, as it isn't present in the cmds list. remove it from the usage help. Fixes: 6e15d27aae94 ("ip: add AMT support") Signed-off-by: Yedaya Katsman <yedaya.ka@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-30ifstat: make load_info() more verbose on errorDenis Kirjanov1-8/+16
Convert fprintf calls to perror() so the caller can see the reason for an error. Signed-off-by: Denis Kirjanov <dkirjanov@suse.de> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-30Merge branch 'echo-tc-filter-actions' into nextDavid Ahern4-6/+39
Victor Nogueira says: ==================== Continuing on what Hangbin Liu started [1], this patch set adds support for the NLM_F_ECHO flag for tc actions and filters. For qdiscs it will require some kernel surgery, and we'll send it soon after this surgery is merged. When user space configures the kernel with netlink messages, it can set NLM_F_ECHO flag to request the kernel to send the applied configuration back to the caller. This allows user space to receive back configuration information that is populated by the kernel. Often because there are parameters that can only be set by the kernel which become visible with the echo, or because user space lets the kernel choose a default value. To illustrate a use case where the kernel will give us a default value, the example below shows the user not specifying the action index: tc -echo actions add action mirred egress mirror dev lo total acts 0 Added action action order 1: mirred (Egress Mirror to device lo) pipe index 1 ref 1 bind 0 not_in_hw Note that the echoed response indicates that the kernel gave us a value of index 1 [1] https://lore.kernel.org/netdev/20220916033428.400131-2-liuhangbin@gmail.com/ ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2024-01-30tc: Add NLM_F_ECHO support for filtersVictor Nogueira1-1/+7
If the user specifies this flag for a filter command, the kernel will return the command's result back to user space. For example: tc -echo filter add dev lo parent ffff: protocol ip matchall action ok added filter dev lo parent ffff: protocol ip pref 49152 matchall chain 0 As illustrated above, the kernel will give us a pref of 49152. The same can be done for other filter commands (replace, delete, and change). For example: tc -echo filter del dev lo parent ffff: pref 49152 protocol ip matchall deleted filter dev lo parent ffff: protocol ip pref 49152 matchall chain 0 Signed-off-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-01-30tc: add NLM_F_ECHO support for actionsVictor Nogueira3-5/+32
This patch adds the -echo flag to tc command line and support for it in tc actions. If the user specifies this flag for an action command, the kernel will return the command's result back to user space. For example: tc -echo actions add action mirred egress mirror dev lo total acts 0 Added action action order 1: mirred (Egress Mirror to device lo) pipe index 10 ref 1 bind 0 not_in_hw As illustrated above, the kernel will give us an index of 10 The same can be done for other action commands (replace, change, and delete). For example: tc -echo actions delete action mirred index 10 total acts 0 Deleted action action order 1: mirred (Egress Mirror to device lo) pipe index 10 ref 0 bind 0 not_in_hw Signed-off-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-01-27bpf: fix warning from basename()Stephen Hemminger1-2/+2
The function basename() expects a mutable character string, which now causes a warning: bpf_legacy.c: In function ‘bpf_load_common’: bpf_legacy.c:975:38: warning: passing argument 1 of ‘__xpg_basename’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] 975 | basename(cfg->object), cfg->mode == EBPF_PINNED ? | ~~~^~~~~~~~ In file included from bpf_legacy.c:21: /usr/include/libgen.h:34:36: note: expected ‘char *’ but argument is of type ‘const char *’ 34 | extern char *__xpg_basename (char *__path) __THROW; Fixes: f20ff2f19552 ("bpf: keep parsed program mode in struct bpf_cfg_in") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
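One common way to satisfy the POSIX basename() prototype when only a const string is available is to work on a private copy. The sketch below shows that approach under stated assumptions: `copy_basename()` is a hypothetical helper, and the actual change in bpf_legacy.c may well differ.

```c
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: write the last path component of a const path into
 * out. POSIX basename() may modify its argument and may return a pointer
 * into it, so it is given a mutable copy, and the result is copied out
 * before that copy is freed. Returns 0 on success, -1 on allocation
 * failure or truncation.
 */
static int copy_basename(const char *path, char *out, size_t outlen)
{
	size_t n = strlen(path) + 1;
	char *tmp = malloc(n);
	int written;

	if (!tmp)
		return -1;
	memcpy(tmp, path, n);

	written = snprintf(out, outlen, "%s", basename(tmp));
	free(tmp);

	return (written < 0 || (size_t)written >= outlen) ? -1 : 0;
}
```

The caller's string is never touched, which is the property the discarded-qualifier warning is pointing at.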
2024-01-25spelling fixesStephen Hemminger9-10/+10
Use codespell and ispell to fix some spelling errors in comments and README's. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-25m_mirred: Allow mirred to blockVictor Nogueira2-14/+71
So far the mirred action has dealt with syntax that handles mirror/redirection for netdev. A matching packet is redirected or mirrored to a target netdev. In this patch we enable mirred to mirror to a tc block as well. IOW, the new syntax looks as follows: ... mirred <ingress | egress> <mirror | redirect> [index INDEX] < <blockid BLOCKID> | <dev <devname>> > Examples of mirroring or redirecting to a tc block: $ tc filter add block 22 protocol ip pref 25 \ flower dst_ip 192.168.0.0/16 action mirred egress mirror blockid 22 $ tc filter add block 22 protocol ip pref 25 \ flower dst_ip 10.10.10.10/32 action mirred egress redirect blockid 22 Co-developed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Co-developed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-01-22bpf: include libgen.h for basenamePedro Tammela1-0/+1
In musl basename() is only available via libgen.h Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-22color: handle case where fmt is NULLStephen Hemminger1-0/+3
There are cases where NULL is passed as format string when nothing is to be printed. This is commonly done in the print_bool function when a flag is false. Glibc seems to handle this case nicely but for musl it will cause a segmentation fault Since nothing needs to be printed, in this case; just check for NULL and return. Reported-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
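The guard described above can be sketched with a small variadic wrapper. `print_checked()` is a hypothetical name used for illustration only; it is not the function the commit modifies.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical wrapper around vfprintf(). Passing a NULL format to the
 * printf family is undefined behavior, so when the caller has nothing to
 * print (fmt == NULL) the function returns 0 without touching the stream.
 */
static int print_checked(FILE *fp, const char *fmt, ...)
{
	va_list args;
	int ret;

	if (fmt == NULL)
		return 0;

	va_start(args, fmt);
	ret = vfprintf(fp, fmt, args);
	va_end(args);
	return ret;
}
```

Checking once in the wrapper means callers that pass NULL for "print nothing" do not each need their own test.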
2024-01-22uapi: update virtio_config.hStephen Hemminger1-1/+7
Updated to 6.8.0-rc1. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-22man: fix duplicate words in l2tp, sfb and tipcStephen Hemminger3-4/+4
Doing simple regex found a couple more duplicates. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-22man: correct double word in htbSimon Egli1-1/+1
There was one word too many in the documentation of tc-htb. Signed-off-by: Simon Egli <simon@egli.online> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-22ss: add option to suppress queue columnsChristian Göttsche2-5/+22
Add a new option `-Q/--no-queues` to ss(8) to suppress the two standard columns Send-Q and Recv-Q. This helps to keep the output steady for monitoring purposes (like listening sockets). Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-01-22Merge branch 'main' into nextDavid Ahern20-936/+158
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-01-21tc: better clockid handlingStephen Hemminger1-2/+16
Not all clockid values are available on some older glibc versions. Also, add some comments. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-19tc: unify clockid handlingStephen Hemminger5-127/+44
There are three places in tc which all have same code for handling clockid (copy/paste). Move it into tc_util.c. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-18uapi: remove tc_ipt.hStephen Hemminger1-20/+0
Removed upstream. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-17doc: remove out dated actions-generalStephen Hemminger1-256/+0
This file is rather free form, outdated, and redundant. Everything here should be covered on man pages. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-17doc: remove ifb READMEStephen Hemminger1-125/+0
Most of this document goes back to when IFB was first integrated and covers the motivation. Only of historical interest. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-17man/tc-gact: move generic action documentation to man pageStephen Hemminger3-78/+86
Convert from free form doc to man page. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-17man: get rid of doc/actions/mirred-usageStephen Hemminger2-164/+8
The only bit of information not already on the man page is some of the limitations. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-17ss: show extra info when '--processes' is not usedMatthieu Baerts (NGI0)1-0/+2
A recent modification broke "extra" options for all protocols showing info about the processes when '-p' / '--processes' option was not used as well. In other words, all the additional bits displayed at the end or at the next line were no longer printed if the user didn't ask to show info about processes as well. The reason is that, the "current_field" pointer never switched to the "Ext" column. If the user didn't ask to display the processes, nothing happened when trying to print extra bits using the "out()" function, because the current field was still pointing to the "Process" one, now marked as disabled. Before the commit mentioned below, it was not an issue not to switch to the "Ext" or "Process" columns because they were never marked as "disabled". Here is a quick list of options that were no longer displayed if '-p' / '--processes' was not set: - AF_INET(6): -o, --options -e, --extended --tos --cgroup --inet-sockopt -m, --memory -i, --info - AF_PACKET: -e, --extended - AF_XDP: -e, --extended - AF_UNIX: -m, --memory -e, --extended - TIPC: --tipcinfo That was just by quickly reading the code, I probably missed some. But this shows that the impact can be quite important for all scripts using 'ss' to monitor connections or to report info. Fixes: 1607bf53 ("ss: prevent "Process" column from being printed unless requested") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-17Reapply "ss: prevent "Process" column from being printed unless requested"Stephen Hemminger1-1/+4
This reverts commit f22c49730c3691c25a1147081363eb35aa9d1048. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-15uapi: update headers from 6.8-rc1Stephen Hemminger2-158/+2
Removal of no longer used TC structs. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-13Revert "ss: prevent "Process" column from being printed unless requested"Stephen Hemminger1-4/+1
This reverts commit 1607bf531fd2f984438d227ea97312df80e7cf56. This commit is being reverted because it breaks output of tcp info. The order of the columns enum is order sensitive. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=218372 Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-12man: drop references to ifconfigStephen Hemminger2-6/+0
The documentation does not need to have any references to the legacy command ifconfig. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-11genl: ctrl.c: spelling fix in error messageAndrea Claudi1-1/+1
Canot --> Cannot Fixes: 65018ae43b14 ("This patch adds a generic netlink controller...") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-11iplink_xstats: spelling fix in error messageAndrea Claudi1-1/+1
Cannont --> Cannot Fixes: 2b99748a60bf ("add missing iplink_xstats.c") Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-08Merge remote-tracking branch 'main/main' into nextDavid Ahern35-523/+463
Signed-off-by: David Ahern <dsahern@kernel.org>
2024-01-08v6.7.0v6.7.0Stephen Hemminger1-1/+1
2024-01-08lnstat: Fix deref of null in print_json() functionMaks Mishin1-0/+4
Now pointer `jw` is being checked for NULL before using in function `jsonw_start_object`. Added exit from function when `jw==NULL`. Found by RASU JSC Signed-off-by: Maks Mishin <maks.mishinFZ@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-08rdma: Fix the error of accessing string variable outside the lifecyclewenglianfa8-16/+8
All these SPRINT_BUF(b) definitions are inside the 'if' block, but accessed outside the 'if' block through the pointers 'comm'. This leads to empty 'comm' attribute when querying resource information. So move the definitions to the beginning of the functions to extend their life cycle. Before: $ rdma res show srq dev hns_0 srqn 0 type BASIC lqpn 18 pdn 5 pid 7775 comm After: $ rdma res show srq dev hns_0 srqn 0 type BASIC lqpn 18 pdn 5 pid 7775 comm ib_send_bw Fixes: 1808f002dfdd ("lib/fs: fix memory leak in get_task_name()") Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Acked-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
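The lifetime problem described above can be illustrated with a plain character array. The names `describe_owner()` and `NAME_BUF_SIZE` are hypothetical and the formatting is invented for the example; only the placement of the buffer declaration reflects the fix.

```c
#include <stdio.h>
#include <string.h>

#define NAME_BUF_SIZE 64	/* illustrative size, an assumption */

/*
 * Hypothetical example: comm points either at a locally formatted name or
 * at a caller-supplied one, and is read after the if/else.
 */
static int describe_owner(int pid, const char *kernel_name,
			  char *out, size_t outlen)
{
	/*
	 * Declared at function scope, so it outlives the if block below.
	 * Declaring it inside the block would leave comm dangling once the
	 * block ends, which is the bug the commit message describes.
	 */
	char b[NAME_BUF_SIZE];
	const char *comm;

	if (pid > 0) {
		snprintf(b, sizeof(b), "task-%d", pid);
		comm = b;
	} else {
		comm = kernel_name;
	}

	/* comm is used here, outside the block that filled b. */
	return snprintf(out, outlen, "comm %s", comm);
}
```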
2024-01-08remove support for iptables actionStephen Hemminger14-1898/+0
There is an open upstream kernel patch to remove ipt action from kernel. This is corresponding iproute2 change. - Remove support for ipt and xt in tc. - Remove no longer used header files. - Update man pages. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2024-01-03rdma: do not mix newline and json objectStephen Hemminger16-212/+216
Mixing the semantics of ending lines with the json object leads to several bugs where json object is closed twice, etc. Replace by breaking the meaning of newline() function into two parts. Now, lots of functions were taking the rdma data structure as argument but never using it. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-03rdma: add oneline flagStephen Hemminger7-9/+25
Add oneline output format like other commands. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-03rdma: make supress_errors a bitStephen Hemminger1-1/+1
Like other command line flags supress_errors can be a bit. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-03rdma: make pretty behave like other commandsStephen Hemminger4-11/+5
For tc, ip, etc the -pretty flag only has meaning if json is used. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-03rdma: use standard flag for jsonStephen Hemminger5-15/+14
The other iproute2 utils use variable json as flag. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-03rdma: shorten print_ linesStephen Hemminger9-61/+32
With the shorter form of print_ function some of the lines can now be shortened. Max line length in iproute2 should be 100 characters or less. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-03ip: merge duplicate if clausesStephen Hemminger1-2/+0
The code that handles brief option had two exactly matching if (filter == AF_PACKET) clauses; merge them Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-01configure: avoid un-recommended command substitution formEli Schwartz1-12/+12
The use of backticks to surround commands instead of "$(cmd)" is a legacy of the oldest pre-POSIX shells. It is confusing, unreliable, and hard to read. Its use is not recommended in new programs. Link: http://mywiki.wooledge.org/BashFAQ/082 Signed-off-by: Eli Schwartz <eschwartz93@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2024-01-01rdma: use print_XXX instead of COLOR_NONEStephen Hemminger10-78/+78
The rdma utility should be using same code pattern as rest of iproute2. When printing, color should only be requested when desired; if no color wanted, use the simpler print_XXX instead. Fixes: b0a688a542cd ("rdma: Rewrite custom JSON and prints logic to use common API") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-30bridge: mdb: Add flush supportIdo Schimmel2-1/+203
Implement MDB flush functionality, allowing user space to flush MDB entries from the kernel according to provided parameters. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-30Update kernel headersDavid Ahern6-3/+9
Update kernel headers to commit: 92de776d2090 ("Merge tag 'mlx5-updates-2023-12-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux") Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-22ip-link: use shorter URL to kernel docsStephen Hemminger1-1/+1
Use shorter URL (docs.kernel.org) so that manual entry does not have too long a line. The debian troff checker would fail when doing make check. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: Provide rta_type()Benjamin Poirier3-9/+11
Factor out the repeated code pattern rta_type = attr->rta_type & NLA_TYPE_MASK into a helper which is similar to the existing kernel function nla_type(). Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
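A sketch of such a helper follows. To stay self-contained it uses local stand-ins (prefixed `SK_` and `sk_`, all hypothetical names) that mirror the kernel uapi `NLA_F_NESTED`, `NLA_F_NET_BYTEORDER`, `NLA_TYPE_MASK` and `struct rtattr` definitions, rather than including the Linux headers; it is not the helper as it appears in iproute2.

```c
#include <stdint.h>

/* Local stand-ins mirroring the netlink attribute flag bits. */
#define SK_NLA_F_NESTED		(1 << 15)
#define SK_NLA_F_NET_BYTEORDER	(1 << 14)
#define SK_NLA_TYPE_MASK	(~(SK_NLA_F_NESTED | SK_NLA_F_NET_BYTEORDER))

/* Local stand-in for the attribute header: length, then type plus flags. */
struct sk_rtattr {
	uint16_t rta_len;
	uint16_t rta_type;
};

/* Return the attribute type with the two flag bits masked off. */
static inline uint16_t sk_rta_type(const struct sk_rtattr *rta)
{
	return rta->rta_type & SK_NLA_TYPE_MASK;
}
```

With the helper in place, a nested attribute and a plain attribute of the same type compare equal in a switch statement, without repeating the mask at each call site.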
2023-12-22bridge: Deduplicate print_range()Benjamin Poirier4-28/+17
The two implementations are now identical so keep only one instance and move it to json_print.c where there are already a few other specialized printing functions. The string that's formatted in the "end" buffer is only needed when outputting a range so move the snprintf() call within the condition. The second argument's purpose is better conveyed by calling it "end" rather than "id" so rename it. Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Indent statistics with 2 spacesBenjamin Poirier1-4/+4
`bridge -s vlan` indents statistics with 2 spaces compared to the vlan id column while `bridge -s vni` indents them with 1 space. Change `bridge vni` to match the behavior of `bridge vlan` since that second command predates `bridge vni`. Before: $ bridge -s vni dev vni group/remote vxlan1 4001 RX: bytes 0 pkts 0 drops 0 errors 0 TX: bytes 0 pkts 0 drops 0 errors 0 4002 10.0.0.1 RX: bytes 0 pkts 0 drops 0 errors 0 TX: bytes 0 pkts 0 drops 0 errors 0 vxlan2 100 RX: bytes 0 pkts 0 drops 0 errors 0 TX: bytes 0 pkts 0 drops 0 errors 0 After: $ bridge -s vni dev vni group/remote vxlan1 4001 RX: bytes 0 pkts 0 drops 0 errors 0 TX: bytes 0 pkts 0 drops 0 errors 0 4002 10.0.0.1 RX: bytes 0 pkts 0 drops 0 errors 0 TX: bytes 0 pkts 0 drops 0 errors 0 vxlan2 100 RX: bytes 0 pkts 0 drops 0 errors 0 TX: bytes 0 pkts 0 drops 0 errors 0 Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Align output columnsBenjamin Poirier1-19/+21
Use fixed column widths to improve readability. These changes are similar to commit e0c457b1a5a2 ("bridge: Align output columns"). Before: $ bridge vni dev vni group/remote vxlan1 4001 4002 10.0.0.1 5000-5010 16777214-16777215 10.0.0.2 vxlan2 100 After: $ bridge vni dev vni group/remote vxlan1 4001 4002 10.0.0.1 5000-5010 16777214-16777215 10.0.0.2 vxlan2 100 Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Remove unused argument in open_vni_port()Benjamin Poirier1-2/+2
Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Replace open-coded instance of print_nl()Benjamin Poirier1-1/+1
Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Remove stray newlines after each interfaceBenjamin Poirier1-2/+0
Currently, `bridge vni` outputs an empty line after each interface. This is not consistent with the output style of other iproute2 commands, in particular `bridge vlan`. Therefore, remove the empty lines. If there are scripts that parse the normal text output of `bridge vni`, those scripts might be broken by the removal of the empty lines. This is a secondary concern because those scripts should consume the JSON output instead. Before: $ bridge vni dev vni group/remote vxlan1 4001 5000-5010 vxlan2 100 $ After: $ ./bridge/bridge vni dev vni group/remote vxlan1 4001 5000-5010 vxlan2 100 $ Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Reverse the logic in print_vnifilter_rtm()Benjamin Poirier1-4/+5
print_vnifilter_rtm() is structured similarly to print_vlan_tunnel_info() except that in the former, the open_vni_port() call is guarded by a "if (first)" check whereas in the latter, the open_vlan_port() call is guarded by a "if (!opened)" check. Reverse the logic in one of the functions to have the same structure in both. Since the calls being guarded are "open_...()", "close_...()", use the "opened" logic structure. Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Guard close_vni_port() callBenjamin Poirier1-1/+3
Currently, the call to open_vni_port() within print_vnifilter_rtm() is written in a way that is safe if there is a RTM_{NEW,DEL,GET}TUNNEL message without any VXLAN_VNIFILTER_ENTRY attribute. However the close_vni_port() call is written in a way that assumes there is always at least one VXLAN_VNIFILTER_ENTRY attribute within every RTM_*TUNNEL message. At this time, this assumption is correct. However, the code should be consistent in its assumptions. Choose the safe approach and fix the asymmetry between the open_vni_port() and close_vni_port() calls by guarding the latter call with a check. Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Move open_json_object() within print_vni()Benjamin Poirier1-2/+1
print_vni() is used to output one vni or vni range which, in json output mode, looks like { "vni": 100 } Currently, the closing bracket is handled within the function but the opening bracket is handled by open_json_object() before calling the function. For consistency, move the call to open_json_object() within print_vni(). Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Remove print_vnifilter_rtm_filter()Benjamin Poirier1-6/+1
print_vnifilter_rtm_filter() adds an unnecessary level of indirection so remove it to simplify the code. Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vlan: Remove paranoid checkBenjamin Poirier1-7/+2
To make the code lighter, remove the check on the actual print_range() output width. In the odd case that an out-of-range, wide vlan id is printed, printf() will treat the negative field width as positive and the output will simply be further misaligned. Suggested-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
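The printf() behavior relied on above can be checked in isolation. In standard C, a negative value supplied for a `*` field width is taken as a `-` flag followed by a positive width, so the text is left-justified rather than rejected. `pad_column()` is a hypothetical helper for the demonstration and is not taken from the bridge code.

```c
#include <stdio.h>

/*
 * Hypothetical helper: print text in a field of the given width, followed
 * by '|' so the padding is visible. A negative width left-justifies the
 * text in a field of the absolute width.
 */
static int pad_column(char *buf, size_t len, int width, const char *text)
{
	return snprintf(buf, len, "%*s|", width, text);
}
```

For example, a width of 6 yields `    ab|` for the text `ab`, while a width of -6 yields `ab    |`: the output shifts but nothing is lost, which is why the removed check was only cosmetic.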
2023-12-22bridge: vlan: Use printf() to avoid temporary bufferBenjamin Poirier1-5/+2
Currently, print_vlan_tunnel_info() is first outputting a formatted string to a temporary buffer in order to use print_string() which can handle json or normal text mode. Since this specific string is only output in normal text mode, by calling printf() directly, we can avoid the need to first output to a temporary string buffer. Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Fix vni filter help stringsBenjamin Poirier2-3/+3
Add the missing 'vni' subcommand to the top level `bridge help`. For `bridge vni { add | del } ...`, 'dev' is a mandatory argument. For `bridge vni show`, 'dev' is an optional argument. Fixes: 45cd32f9f7d5 ("bridge: vxlan device vnifilter support") Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Report duplicate vni argument using duparg()Benjamin Poirier1-1/+1
When there is a duplicate 'vni' option, report the error using duparg() instead of the generic invarg(). Before: $ bridge vni add vni 100 vni 101 dev vxlan2 Error: argument "101" is wrong: duplicate vni After: $ ./bridge/bridge vni add vni 100 vni 101 dev vxlan2 Error: duplicate "vni": "101" is the second value. Fixes: 45cd32f9f7d5 ("bridge: vxlan device vnifilter support") Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
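The difference between the two error helpers can be sketched as follows. This is only a Python model of the message formats quoted in the Before/After output above, not the iproute2 implementation; the function names `invarg_message` and `duparg_message` are hypothetical, and the sketch returns the message text rather than printing it and exiting.

```python
def invarg_message(reason: str, value: str) -> str:
    """Generic 'invalid argument' wording, as in the 'Before' output."""
    return f'Error: argument "{value}" is wrong: {reason}'


def duparg_message(key: str, value: str) -> str:
    """Dedicated 'duplicate argument' wording, as in the 'After' output."""
    return f'Error: duplicate "{key}": "{value}" is the second value.'


if __name__ == "__main__":
    print(invarg_message("duplicate vni", "101"))
    print(duparg_message("vni", "101"))
```

The dedicated wording names the repeated keyword and the offending second value, which the generic wording leaves to a free-form reason string.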
2023-12-22bridge: vni: Fix duplicate group and remote error messagesBenjamin Poirier1-8/+8
Consider the following command with a duplicated "remote" argument: $ bridge vni add vni 150 remote 10.0.0.1 remote 10.0.0.2 dev vxlan2 Error: argument "remote" is wrong: duplicate group The error message is misleading because there is no "group" argument. Both of the "group" and "remote" options specify a destination address and are mutually exclusive so change the variable name and error messages accordingly. The result is: $ ./bridge/bridge vni add vni 150 remote 10.0.0.1 remote 10.0.0.2 dev vxlan2 Error: duplicate "destination": "10.0.0.2" is the second value. Fixes: 45cd32f9f7d5 ("bridge: vxlan device vnifilter support") Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Remove dead code in group argument parsingBenjamin Poirier1-5/+0
is_addrtype_inet_not_multi(&daddr) may read an uninitialized "daddr". Even if that is fixed, the error message that follows cannot be reached because the situation would be caught by the previous test (group_present). Therefore, remove this test on daddr. Fixes: 45cd32f9f7d5 ("bridge: vxlan device vnifilter support") Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22bridge: vni: Accept 'del' commandBenjamin Poirier1-1/+2
`bridge vni help` shows "bridge vni { add | del } ..." but currently `bridge vni del ...` errors out unexpectedly: # bridge vni del Command "del" is unknown, try "bridge vni help". Recognize 'del' as a synonym of the original 'delete' command. Fixes: 45cd32f9f7d5 ("bridge: vxlan device vnifilter support") Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-22configure: drop test for ATMStephen Hemminger1-23/+0
The ATM qdisc was removed by: commit 8a20feb6388f ("tc: drop support for ATM qdisc") but configure check was not removed. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-20man: Fix malformatted database file locationsPhil Sutter3-24/+23
The .BR macro does not put spaces in between its arguments. Also it will apply to all arguments. Fixes: 0a0a8f12fa1b ("Read configuration files from /etc and /usr") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-20man: ip-route.8: Fix typo in rt_protos location specPhil Sutter1-1/+1
RTPROTO description erroneously specified /etc/iproute2/rt_protos twice. Fixes: 0a0a8f12fa1b ("Read configuration files from /etc and /usr") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-20ss: Add support for dumping TCP bound-inactive sockets.Guillaume Nault2-1/+26
Make ss aware of the new "bound-inactive" pseudo-state for TCP (see Linux commit 91051f003948 ("tcp: Dump bound-only sockets in inet_diag.")). These are TCP sockets that have been bound, but are neither listening nor connecting. With this patch, these sockets can now be dumped with: * the existing -a (--all) option, to dump all sockets, including bound-inactive ones, * the new -B (--bound-inactive) option, to dump them exclusively, * the new "bound-inactive" state, to be used in a STATE-FILTER. Note that the SS_BOUND_INACTIVE state is a pseudo-state used for queries only. The kernel returns such sockets as SS_CLOSE. The SS_NEW_SYN_RECV pseudo-state is added in this patch only because we have to set its entry in the sstate_namel array (in scan_state()). Care is taken not to make it visible to users. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-08mnl_utils: sanitize incoming netlink payload size in callbacksJiri Pirko1-1/+6
Don't trust the kernel to send payload of certain size. Sanitize that by checking the payload length in mnlu_cb_stop() and mnlu_cb_error() and only access the payload if it is of required size. Note that for mnlu_cb_stop(), this is happening already for example with devlink resource. Kernel sends NLMSG_DONE with zero size payload. Fixes: 049c58539f5d ("devlink: mnlg: Add support for extended ack") Fixes: c934da8aaacb ("devlink: mnlg: Catch returned error value of dumpit commands") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
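The length check described above can be sketched in Python as follows. This is a hedged model, not the libmnl-based C code: the helper name `read_error_code` is hypothetical, and the assumption that the payload begins with a native-endian 32-bit error code is an illustration of "payload of required size", not something stated in the commit message.

```python
import struct

# Assumption for illustration: the payload starts with a 32-bit signed
# error code in native byte order.
ERR_FMT = "=i"
ERR_SIZE = struct.calcsize(ERR_FMT)


def read_error_code(payload: bytes):
    """Return the leading error code, or None if the payload is too short.

    The payload is only unpacked after its length has been checked, so a
    zero-length payload (as with the NLMSG_DONE case mentioned above)
    never causes an out-of-bounds read.
    """
    if len(payload) < ERR_SIZE:
        return None
    (code,) = struct.unpack_from(ERR_FMT, payload)
    return code
```

A caller would treat `None` as "no error code supplied" instead of interpreting whatever bytes happen to follow the header.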
2023-12-08uapi: update stddef.hStephen Hemminger1-1/+1
Change from upstream 6.7-rc4 Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-06ss: prevent "Process" column from being printed unless requestedQuentin Deslandes1-1/+4
Commit 5883c6eba517 ("ss: show header for --processes/-p") added "Process" to the list of columns printed by ss. However, the "Process" header is now printed even if --processes/-p is not used. This change aims to fix this by moving the COL_PROC column ID to the same index as the corresponding column structure in the columns array, and enabling it if --processes/-p is used. Fixes: 5883c6eba517 ("ss: show header for --processes/-p") Signed-off-by: Quentin Deslandes <qde@naccy.de> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-12-06ss: mptcp: print out subflows_total counterGeliang Tang1-0/+2
A new counter mptcpi_subflows_total has been added in mptcpi_flags to count the total amount of subflows from mptcp_info including the initial one into kernel in this commit: 6ebf6f90ab4a ("mptcp: add mptcpi_subflows_total counter") This patch prints out this counter into mptcp_stats output. Acked-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@linux.dev> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-06Update kernel headersDavid Ahern3-16/+554
Update kernel headers to commit: 074ac38d5b95 ("octeontx2-af: cn10k: Increase outstanding LMTST transactions") Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-06configure: Add _GNU_SOURCE to strlcpy configure testSam James1-0/+1
>=glibc-2.38 adds strlcpy but it's guarded under a feature-test macro. Just add _GNU_SOURCE to the configure test because we already pass _GNU_SOURCE unconditionally in the Makefiles when building iproute2. Signed-off-by: Sam James <sam@gentoo.org> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-06ip: require RTM_NEWLINKStephen Hemminger1-455/+37
The kernel support for creating network devices was added back in 2007 and iproute2 has been carrying backward compatibility support since then. After 16 years, enough time has passed to drop the code. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-06Merge branch 'tcp-usec-fq' into nextDavid Ahern3-1/+137
Eric Dumazet says: ==================== Add iproute2 patches to support recent TCP usec timestamps, and FQ changes landed in linux-6.7 ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-06tc: fq: reports stats added in linux-6.7Eric Dumazet1-1/+24
Report new fields added in linux-6.7: - fastpath : Number of packets that have used the fast path. - band[012]_pkts : Number of packets currently queued per band. - band[012]_drops : Counters of dropped packets, per band (only printed if not zero) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-06tc: fq: add TCA_FQ_WEIGHTS handlingEric Dumazet1-0/+43
Linux-6.7 FQ got WRR scheduling. TCA_FQ_WEIGHTS attribute can report/change per-band weights. tc qdisc show dev eth1 ... qdisc fq ... weights 589824 196608 65536 quantum 8364b ... Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-06tc: fq: add TCA_FQ_PRIOMAP handlingEric Dumazet1-0/+59
linux-6.7 FQ packet scheduler gets 3-bands, and the ability to report or program the associated priomap. $ tc qdisc show dev eth0 ... qdisc fq ... bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 $ tc qdisc change dev eth0 ... qdisc fq ... bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-06ss: add report of TCPI_OPT_USEC_TSEric Dumazet1-0/+4
linux-6.7 supports usec resolution in TCP TS values. ss -ie can show if a flow is using this new resolution. $ ss -tie ... State Recv-Q Send-Q Local Address:Port Peer Address:Port Process ESTAB 0 12869632 [2002:a05:6608:295::]:37054 [2002:a05:6608:297::]:35721 ts usec_ts sack bbr2s wscale:12,12 ... Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-06ip route: add support for TCP usec TSEric Dumazet1-0/+7
linux-6.7 got support for TCP usec resolution timestamps, using one bit in the features mask : RTAX_FEATURE_TCP_USEC_TS. ip route add 10/8 ... features tcp_usec_ts Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-12-03ip: require RTM_NEWLINKStephen Hemminger1-455/+37
The kernel support for creating network devices was added back in 2007 and iproute2 has been carrying backward compatibility support since then. After 16 years, enough time has passed to drop the code. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-11-28iplink: spelling fix in error messageStephen Hemminger1-1/+1
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-11-28iproute2: prevent memory leak on error returnheminhong1-0/+2
When rtnl_statsdump_req_filter() or rtnl_dump_filter() fails, simply returning causes a memory leak. Signed-off-by: heminhong <heminhong@kylinos.cn> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-11-22Merge branch 'main' into nextDavid Ahern39-2679/+90
Signed-off-by: David Ahern <dsahern@kernel.org>
2023-11-22Merge branch 'parsing-cleanup' into nextDavid Ahern3-11/+49
Petr Machata says: ==================== Library functions parse_one_of() and parse_on_off() were added about three years ago to unify all the disparate reimplementations of the same basic idea. It used the matches() function to determine whether a string under consideration corresponds to one of the keywords. This reflected many, though not all cases of on/off parsing at the time. This decision has some odd consequences. In particular, "o" can be used as a shorthand for "off", which is not obvious, because "o" is the prefix of both. By sheer luck, the end result actually makes some sense: "on" means on, anything else either means off or errors out. Similar issues are in principle also possible for parse_one_of() uses, though currently this does not come up. Ideally parse_on_off() would accept the strings "on" and "off" and no others. Patch #1 is a cleanup. Patch #2 is shaping the code for the next patches. Patch #3 converts parse_on_off() to strcmp(). See the commit message for the rationale of why the change should be considered acceptable. We'd ideally do parse_one_of() likewise. But the strings this function parses tend to be longer, which means more opportunities for typos and more of a reason to abbreviate things. So instead, patch #4 adds a function parse_one_of_deprecated() for ip macsec to use in one place, where these typos are to be expected, and converts that site to the new function. Then patch #5 changes the behavior of parse_one_of() to accept prefixes like it has so far, but to warn that they are deprecated: # dcb ets set dev swp1 tc-tsa 0:s WARNING: 's' matches 'strict' by prefix. Matching by prefix is deprecated in this context, please use the full string. The idea is that several releases down the line, we might consider switching over to strcmp(), as presumably enough advance warning will have been given. ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2023-11-22lib: utils: Have parse_one_of() warn about prefix matchesPetr Machata1-1/+18
The function parse_one_of() currently uses matches() for string comparison under the hood. Extending matches()-based parsers is tricky, because newly added matches might change the way strings are parsed, if the newly-added string shares a prefix with a string that is matched later in the code. Therefore in this patch, add a twist to parse_one_of() that partial prefix matches yield a warning. This will not disturb standard output or the overall behavior, but will make it obvious that the usage is discouraged and prompt users to update their scripts and habits. An example of output: # dcb ets set dev swp1 tc-tsa 0:s WARNING: 's' matches 'strict' by prefix. Matching by prefix is deprecated in this context, please use the full string. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
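The warn-on-prefix behavior can be modeled with a short Python sketch. It is an illustration under stated assumptions, not the C code: `matches()` is modeled as returning 0 when the first argument is a non-empty prefix of the second, `parse_one_of_warn` is a hypothetical name, the `CHOICES` list is illustrative, and the warning is returned to the caller (who would print it to standard error) rather than printed directly.

```python
from typing import Optional, Sequence, Tuple


def matches(prefix: str, string: str) -> int:
    """Model of prefix matching: 0 on match, 1 otherwise (strcmp-like)."""
    if not prefix:
        return 1  # an empty string is not accepted as a prefix
    return 0 if string.startswith(prefix) else 1


def parse_one_of_warn(value: str, choices: Sequence[str]) -> Tuple[int, Optional[str]]:
    """Return (index of the first matching choice, warning text or None)."""
    for index, choice in enumerate(choices):
        if matches(value, choice) == 0:
            warning = None
            if value != choice:
                # Partial match: still accepted, but flagged as deprecated.
                warning = (
                    f"WARNING: '{value}' matches '{choice}' by prefix.\n"
                    "Matching by prefix is deprecated in this context, "
                    "please use the full string."
                )
            return index, warning
    raise ValueError(f"'{value}' must be one of {list(choices)}")


# Illustrative keyword list, not taken from the dcb sources.
CHOICES = ("strict", "ets")
```

Because the parsed index is unchanged, existing scripts keep working; only the extra warning tells users to switch to the full keyword.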
2023-11-22lib: utils: Introduce parse_one_of_deprecated()Petr Machata3-2/+14
The function parse_one_of() currently uses matches() for string comparison under the hood. Extending matches()-based parsers is tricky, because newly added matches might change the way strings are parsed, if the newly-added string shares a prefix with a string that is matched later in the code. In this patch, introduce a new function, parse_one_of_deprecated(). This will be currently synonymous with parse_one_of(), however the latter will change behavior in the next patch. Use the new function for parsing of the macsec "validate" option. The reason is that the valid strings for that option are "disabled", "check" and "strict". It is not hard to see how "disabled" could be misspelled as "disable", and be baked in some script in this form. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-11-22lib: utils: Convert parse_on_off() to strcmp()Petr Machata1-1/+1
The function parse_on_off() currently uses matches() for string comparison under the hood. This has some odd consequences. In particular, "o" can be used as a shorthand for "off", which is not obvious, because "o" is the prefix of both. In this patch, change parsing to strcmp(). This is a breaking change. The following paragraphs give arguments for why it should be considered acceptable. First and foremost: on/off are very short strings that it makes practically no sense to shorten. Since "o" is the universal prefix, the only unambiguous shortening is "of" for "off". It is doubtful that anyone would intentionally decide to save typing of the second "f" when they already typed the first. It also seems unlikely that the typo of "of" for "off" would not be caught immediately, as missing a third of the word length would likely be noticed. In other words, it seems improbable that the abbreviated variants are used, intentionally or by mistake. Commit 9262ccc3ed32 ("bridge: link: Port over to parse_on_off()") and commit 3e0d2a73ba06 ("ip: iplink_bridge_slave: Port over to parse_on_off()") converted several sites from open-coding strcmp()-based on/off parsing to parse_on_off(), which is itself based on matches(). This made the list of permissible strings more generic, but the behavior was exact match to begin with, and this patch restores it. Commit 5f685d064b03 ("ip: iplink: Convert to use parse_on_off()") has changed from matches()-based parsing, which however had branches in the other order, and "o" would parse to mean on. This indicates that at least in this context, people were not using the shorthand of "o" or the commit would have broken their use case. This supports the thesis that the abbreviations are not really used for on/off parsing. For completeness, commit 82604d28525a ("lib: Add parse_one_of(), parse_on_off()") introduced parse_on_off(), converting several users in the ip link macsec code in the process. 
Those users have always used matches(), and had branches in the same order as the newly-introduced parse_on_off(). A survey of selftests and documentation of Linux kernel (by way of git grep), has not discovered any cases of the involved options getting arguments other than the exact strings on and off. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
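The "o" ambiguity discussed above is easy to reproduce in a small Python model. This is a sketch, not the iproute2 code: the function names are hypothetical, and the keyword order ("off" tested before "on") is an assumption chosen to agree with the statement that "o" acts as a shorthand for "off".

```python
KEYWORDS = ("off", "on")  # assumed order: "off" is tested first


def matches(prefix: str, string: str) -> int:
    """Model of prefix matching: 0 on match, 1 otherwise (strcmp-like)."""
    if not prefix:
        return 1
    return 0 if string.startswith(prefix) else 1


def parse_on_off_prefix(value: str) -> bool:
    """Old behavior: the first keyword that `value` is a prefix of wins."""
    for index, keyword in enumerate(KEYWORDS):
        if matches(value, keyword) == 0:
            return bool(index)
    raise ValueError(f"'{value}' must be one of {list(KEYWORDS)}")


def parse_on_off_exact(value: str) -> bool:
    """New behavior: only the exact strings "on" and "off" are accepted."""
    for index, keyword in enumerate(KEYWORDS):
        if value == keyword:
            return bool(index)
    raise ValueError(f"'{value}' must be one of {list(KEYWORDS)}")
```

With prefix matching, "o" and "of" both parse as off purely because of keyword order; with exact comparison they are rejected, while "on" and "off" behave as before.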
2023-11-22lib: utils: Generalize parse_one_of()Petr Machata1-4/+13
The following patch will change the way parse_one_of() and parse_on_off() parse the strings they are given. To prepare for this change, extract from parse_one_of() the functional core, expressed in terms of a configurable matcher: a pointer to a function that does the string comparison. Then rewrite parse_one_of() and parse_on_off() as wrappers that just pass matches() as the matcher, thereby maintaining the same behavior as they currently have. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
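The refactoring shape described here, a shared core that takes the comparison function as a parameter, might look like this in Python. It is a sketch under assumptions: `parse_one_of_core` and `exact` are hypothetical names, a Python callable stands in for the C function pointer, and the ("off", "on") keyword order is assumed.

```python
from typing import Callable, Sequence

# A matcher returns 0 on match and non-zero otherwise, like strcmp().
Matcher = Callable[[str, str], int]


def matches(prefix: str, string: str) -> int:
    """Prefix matcher: 0 if `prefix` is a non-empty prefix of `string`."""
    if not prefix:
        return 1
    return 0 if string.startswith(prefix) else 1


def exact(value: str, keyword: str) -> int:
    """Exact matcher, shown only to illustrate swapping the strategy."""
    return 0 if value == keyword else 1


def parse_one_of_core(value: str, choices: Sequence[str], matcher: Matcher) -> int:
    """Shared core: index of the first choice accepted by `matcher`."""
    for index, choice in enumerate(choices):
        if matcher(value, choice) == 0:
            return index
    raise ValueError(f"'{value}' must be one of {list(choices)}")


def parse_one_of(value: str, choices: Sequence[str]) -> int:
    """Wrapper that keeps the existing prefix-matching behavior."""
    return parse_one_of_core(value, choices, matches)


def parse_on_off(value: str) -> bool:
    """Wrapper over the same core with a fixed keyword list."""
    return bool(parse_one_of_core(value, ("off", "on"), matches))
```

Keeping the loop in one place means a later change of matching policy only has to swap the matcher passed by each wrapper.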
2023-11-22lib: utils: Switch matches() to returning int againPetr Machata2-5/+5
Since commit 1f420318bda3 ("utils: don't match empty strings as prefixes") the function has pretended to return a boolean. But every user expects it to return zero on success and a non-zero value on failure, like strcmp(). Even the function itself actually returns "true" to mean "no match". This only makes sense if one considers a boolean to be a one-bit unsigned integer with no inherent meaning, which I do not think is reasonable. Switch the prototype back to int, and return 1 instead of true. Cc: Matteo Croce <mcroce@redhat.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-11-22ip, link: Add support for netkitDaniel Borkmann6-4/+238
Add base support for creating/dumping netkit devices. Minimal example usage: # ip link add type netkit # ip -d a [...] 7: nk0@nk1: <BROADCAST,MULTICAST,NOARP,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 netkit mode l3 type peer policy forward numtxqueues 1 numrxqueues 1 [...] 8: nk1@nk0: <BROADCAST,MULTICAST,NOARP,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 netkit mode l3 type primary policy forward numtxqueues 1 numrxqueues 1 [...] Example usage with netns (for BPF examples, see BPF selftests linked below): # ip netns add blue # ip link add nk0 type netkit peer nk1 netns blue # ip link set up nk0 # ip addr add 10.0.0.1/24 dev nk0 # ip -n blue link set up nk1 # ip -n blue addr add 10.0.0.2/24 dev nk1 # ping -c1 10.0.0.2 PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data. 64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.021 ms Example usage with L2 mode and peer blackholing when no BPF is attached: # ip link add foo type netkit mode l2 forward peer blackhole bar # ip -d a [...] 13: bar@foo: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 5e:5b:81:17:02:27 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 netkit mode l2 type peer policy blackhole numtxqueues 1 numrxqueues 1 [...] 14: foo@bar: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether de:01:a5:88:9e:99 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 netkit mode l2 type primary policy forward numtxqueues 1 numrxqueues 1 [...] 
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://git.kernel.org/torvalds/c/35dfaad7188c Link: https://git.kernel.org/torvalds/c/05c31b4ab205 Link: https://git.kernel.org/torvalds/c/ace15f91e569 Signed-off-by: David Ahern <dsahern@kernel.org>
2023-11-19man: allow up to 100 character linesStephen Hemminger1-1/+1
There are some long URLs that cause warnings from the man page checker. Go ahead and allow these even though Debian lintian may still complain. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-11-17man: fix man page errorsStephen Hemminger4-2/+5
Debian is now more picky about man pages. Need to tell man command that tbl is being used on a man page now. Also, font macros need to have proper font. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-11-17ip: move get_failed blocksStephen Hemminger6-42/+42
Rather than doing a goto back into the middle of an earlier if() statement, move the error returns to the end of the functions to follow kernel coding practice. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-11-17iproute2: prevent memory leakheminhong6-6/+12
When the return value of rtnl_talk() is not less than 0, 'answer' will be allocated. The 'answer' should be freed after use, otherwise it will leak memory. Fixes: a066cc6623e1 ("gre/gre6: Unify local/remote endpoint address parsing") Signed-off-by: heminhong <heminhong@kylinos.cn> Reviewed-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-11-16Makefile: use /usr/share/iproute2 for config filesAndrea Claudi1-2/+2
According to FHS: "/usr/lib includes object files and libraries. On some systems, it may also include internal binaries that are not intended to be executed directly by users or shell scripts." A better directory to store config files is /usr/share: "The /usr/share hierarchy is for all read-only architecture independent data files. This hierarchy is intended to be shareable among all architecture platforms of a given OS; thus, for example, a site with i386, Alpha, and PPC platforms might maintain a single /usr/share directory that is centrally-mounted." Accordingly, move configuration files to $(DATADIR)/iproute2. Fixes: 946753a4459b ("Makefile: ensure CONF_USR_DIR honours the libdir config") Reported-by: Luca Boccassi <luca.boccassi@gmail.com> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Acked-by: Luca Boccassi <bluca@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-11-13uapi: update headers from 6.7-rc1Stephen Hemminger2-5/+6
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2023-11-11Merge branch 'devlink-instances' into nextDavid Ahern4-75/+197
Jiri Pirko says: ==================== Print out recently added attributes that expose relationships between devlink instances. This patchset extends the outputs by "nested_devlink" attributes. Examples: $ devlink dev pci/0000:08:00.0: nested_devlink: auxiliary/mlx5_core.eth.0 auxiliary/mlx5_core.eth.0 pci/0000:08:00.1: nested_devlink: auxiliary/mlx5_core.eth.1 auxiliary/mlx5_core.eth.1 $ devlink dev -j -p { "dev": { "pci/0000:08:00.0": { "nested_devlink": { "auxiliary/mlx5_core.eth.0": {} } }, "auxiliary/mlx5_core.eth.0": {}, "pci/0000:08:00.1": { "nested_devlink": { "auxiliary/mlx5_core.eth.1": {} } }, "auxiliary/mlx5_core.eth.1": {} } } $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 106 pci/0000:08:00.0/32768: type eth netdev eth2 flavour pcisf controller 0 pfnum 0 sfnum 106 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached roce enable $ devlink port function set pci/0000:08:00.0/32768 state active $ devlink port show pci/0000:08:00.0/32768 pci/0000:08:00.0/32768: type eth netdev eth2 flavour pcisf controller 0 pfnum 0 sfnum 106 splittable false function: hw_addr 00:00:00:00:00:00 state active opstate attached roce enable nested_devlink: auxiliary/mlx5_core.sf.2 $ devlink port show pci/0000:08:00.0/32768 -j -p { "port": { "pci/0000:08:00.0/32768": { "type": "eth", "netdev": "eth2", "flavour": "pcisf", "controller": 0, "pfnum": 0, "sfnum": 106, "splittable": false, "function": { "hw_addr": "00:00:00:00:00:00", "state": "active", "opstate": "attached", "roce": "enable", "nested_devlink": { "auxiliary/mlx5_core.sf.2": {} } } } } } $ devlink dev reload auxiliary/mlx5_core.sf.2 netns ns1 $ devlink port show pci/0000:08:00.0/32768 pci/0000:08:00.0/32768: type eth netdev eth2 flavour pcisf controller 0 pfnum 0 sfnum 106 splittable false function: hw_addr 00:00:00:00:00:00 state active opstate attached roce enable nested_devlink: auxiliary/mlx5_core.sf.2: netns ns1 $ devlink port show pci/0000:08:00.0/32768 -j -p { 
"port": { "pci/0000:08:00.0/32768": { "type": "eth", "netdev": "eth2", "flavour": "pcisf", "controller": 0, "pfnum": 0, "sfnum": 106, "splittable": false, "function": { "hw_addr": "00:00:00:00:00:00", "state": "active", "opstate": "attached", "roce": "enable", "nested_devlink": { "auxiliary/mlx5_core.sf.2": { "netns": "ns1" } } } } } } ==================== Signed-off-by: David Ahern <dsahern@kernel.org>
2023-11-11devlink: print nested devlink handle for devlink devJiri Pirko1-4/+26
Devlink dev may contain one or more nested devlink instances. Print them using previously introduced pr_out_nested_handle_obj() helper. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
2023-11-11devlink: print nested handle for port functionJiri Pirko1-0/+20
If port function contains nested handle attribute, print it. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>