aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2024-03-12 17:44:08 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2024-03-12 17:44:08 -0700
commit9187210eee7d87eea37b45ea93454a88681894a4 (patch)
tree31b4610e62cdd5e1dfb700014aa619e41145d7d3 /Documentation
parent1f440397665f4241346e4cc6d93f8b73880815d1 (diff)
parented1f164038b50c5864aa85389f3ffd456f050cca (diff)
downloadlinux-9187210eee7d87eea37b45ea93454a88681894a4.tar.gz
Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski: "Core & protocols: - Large effort by Eric to lower rtnl_lock pressure and remove locks: - Make commonly used parts of rtnetlink (address, route dumps etc) lockless, protected by RCU instead of rtnl_lock. - Add a netns exit callback which already holds rtnl_lock, allowing netns exit to take rtnl_lock once in the core instead of once for each driver / callback. - Remove locks / serialization in the socket diag interface. - Remove 6 calls to synchronize_rcu() while holding rtnl_lock. - Remove the dev_base_lock, depend on RCU where necessary. - Support busy polling on a per-epoll context basis. Poll length and budget parameters can be set independently of system defaults. - Introduce struct net_hotdata, to make sure read-mostly global config variables fit in as few cache lines as possible. - Add optional per-nexthop statistics to ease monitoring / debug of ECMP imbalance problems. - Support TCP_NOTSENT_LOWAT in MPTCP. - Ensure that IPv6 temporary addresses' preferred lifetimes are long enough, compared to other configured lifetimes, and at least 2 sec. - Support forwarding of ICMP Error messages in IPSec, per RFC 4301. - Add support for the independent control state machine for bonding per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled control state machine. - Add "network ID" to MCTP socket APIs to support hosts with multiple disjoint MCTP networks. - Re-use the mono_delivery_time skbuff bit for packets which user space wants to be sent at a specified time. Maintain the timing information while traversing veth links, bridge etc. - Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets. - Simplify many places iterating over netdevs by using an xarray instead of a hash table walk (hash table remains in place, for use on fastpaths). - Speed up scanning for expired routes by keeping a dedicated list. - Speed up "generic" XDP by trying harder to avoid large allocations. - Support attaching arbitrary metadata to netconsole messages. Things we sprinkled into general kernel code: - Enforce VM_IOREMAP flag and range in ioremap_page_range and introduce VM_SPARSE kind and vm_area_[un]map_pages (used by bpf_arena). - Rework selftest harness to enable the use of the full range of ksft exit code (pass, fail, skip, xfail, xpass). Netfilter: - Allow userspace to define a table that is exclusively owned by a daemon (via netlink socket aliveness) without auto-removing this table when the userspace program exits. Such table gets marked as orphaned and a restarting management daemon can re-attach/regain ownership. - Speed up element insertions to nftables' concatenated-ranges set type. Compact a few related data structures. BPF: - Add BPF token support for delegating a subset of BPF subsystem functionality from privileged system-wide daemons such as systemd through special mount options for userns-bound BPF fs to a trusted & unprivileged application. - Introduce bpf_arena which is sparse shared memory region between BPF program and user space where structures inside the arena can have pointers to other areas of the arena, and pointers work seamlessly for both user-space programs and BPF programs. - Introduce may_goto instruction that is a contract between the verifier and the program. The verifier allows the program to loop assuming it's behaving well, but reserves the right to terminate it. - Extend the BPF verifier to enable static subprog calls in spin lock critical sections. - Support registration of struct_ops types from modules which helps projects like fuse-bpf that seeks to implement a new struct_ops type. - Add support for retrieval of cookies for perf/kprobe multi links. - Support arbitrary TCP SYN cookie generation / validation in the TC layer with BPF to allow creating SYN flood handling in BPF firewalls. - Add code generation to inline the bpf_kptr_xchg() helper which improves performance when stashing/popping the allocated BPF objects. Wireless: - Add SPP (signaling and payload protected) AMSDU support. - Support wider bandwidth OFDMA, as required for EHT operation. Driver API: - Major overhaul of the Energy Efficient Ethernet internals to support new link modes (2.5GE, 5GE), share more code between drivers (especially those using phylib), and encourage more uniform behavior. Convert and clean up drivers. - Define an API for querying per netdev queue statistics from drivers. - IPSec: account in global stats for fully offloaded sessions. - Create a concept of Ethernet PHY Packages at the Device Tree level, to allow parameterizing the existing PHY package code. - Enable Rx hashing (RSS) on GTP protocol fields. Misc: - Improvements and refactoring all over networking selftests. - Create uniform module aliases for TC classifiers, actions, and packet schedulers to simplify creating modprobe policies. - Address all missing MODULE_DESCRIPTION() warnings in networking. - Extend the Netlink descriptions in YAML to cover message encapsulation or "Netlink polymorphism", where interpretation of nested attributes depends on link type, classifier type or some other "class type". Drivers: - Ethernet high-speed NICs: - Add a new driver for Marvell's Octeon PCI Endpoint NIC VF. - Intel (100G, ice, idpf): - support E825-C devices - nVidia/Mellanox: - support devices with one port and multiple PCIe links - Broadcom (bnxt): - support n-tuple filters - support configuring the RSS key - Wangxun (ngbe/txgbe): - implement irq_domain for TXGBE's sub-interrupts - Pensando/AMD: - support XDP - optimize queue submission and wakeup handling (+17% bps) - optimize struct layout, saving 28% of memory on queues - Ethernet NICs embedded and virtual: - Google cloud vNIC: - refactor driver to perform memory allocations for new queue config before stopping and freeing the old queue memory - Synopsys (stmmac): - obey queueMaxSDU and implement counters required by 802.1Qbv - Renesas (ravb): - support packet checksum offload - suspend to RAM and runtime PM support - Ethernet switches: - nVidia/Mellanox: - support for nexthop group statistics - Microchip: - ksz8: implement PHY loopback - add support for KSZ8567, a 7-port 10/100Mbps switch - PTP: - New driver for RENESAS FemtoClock3 Wireless clock generator. - Support OCP PTP cards designed and built by Adva. - CAN: - Support recvmsg() flags for own, local and remote traffic on CAN BCM sockets. - Support for esd GmbH PCIe/402 CAN device family. - m_can: - Rx/Tx submission coalescing - wake on frame Rx - WiFi: - Intel (iwlwifi): - enable signaling and payload protected A-MSDUs - support wider-bandwidth OFDMA - support for new devices - bump FW API to 89 for AX devices; 90 for BZ/SC devices - MediaTek (mt76): - mt7915: newer ADIE version support - mt7925: radio temperature sensor support - Qualcomm (ath11k): - support 6 GHz station power modes: Low Power Indoor (LPI), Standard Power) SP and Very Low Power (VLP) - QCA6390 & WCN6855: support 2 concurrent station interfaces - QCA2066 support - Qualcomm (ath12k): - refactoring in preparation for Multi-Link Operation (MLO) support - 1024 Block Ack window size support - firmware-2.bin support - support having multiple identical PCI devices (firmware needs to have ATH12K_FW_FEATURE_MULTI_QRTR_ID) - QCN9274: support split-PHY devices - WCN7850: enable Power Save Mode in station mode - WCN7850: P2P support - RealTek: - rtw88: support for more rtw8811cu and rtw8821cu devices - rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL - rtlwifi: speed up USB firmware initialization - rtwl8xxxu: - RTL8188F: concurrent interface support - Channel Switch Announcement (CSA) support in AP mode - Broadcom (brcmfmac): - per-vendor feature support - per-vendor SAE password setup - DMI nvram filename quirk for ACEPC W5 Pro" * tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2255 commits) nexthop: Fix splat with CONFIG_DEBUG_PREEMPT=y nexthop: Fix out-of-bounds access during attribute validation nexthop: Only parse NHA_OP_FLAGS for dump messages that require it nexthop: Only parse NHA_OP_FLAGS for get messages that require it bpf: move sleepable flag from bpf_prog_aux to bpf_prog bpf: hardcode BPF_PROG_PACK_SIZE to 2MB * num_possible_nodes() selftests/bpf: Add kprobe multi triggering benchmarks ptp: Move from simple ida to xarray vxlan: Remove generic .ndo_get_stats64 vxlan: Do not alloc tstats manually devlink: Add comments to use netlink gen tool nfp: flower: handle acti_netdevs allocation failure net/packet: Add getsockopt support for PACKET_COPY_THRESH net/netlink: Add getsockopt support for NETLINK_LISTEN_ALL_NSID selftests/bpf: Add bpf_arena_htab test. selftests/bpf: Add bpf_arena_list test. selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages bpf: Add helper macro bpf_addr_space_cast() libbpf: Recognize __arena global variables. bpftool: Recognize arena map type ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-class-net-queues23
-rw-r--r--Documentation/admin-guide/sysctl/net.rst5
-rw-r--r--Documentation/bpf/kfuncs.rst8
-rw-r--r--Documentation/bpf/map_lpm_trie.rst2
-rw-r--r--Documentation/bpf/standardization/instruction-set.rst594
-rw-r--r--Documentation/bpf/verifier.rst2
-rw-r--r--Documentation/dev-tools/kselftest.rst12
-rw-r--r--Documentation/devicetree/bindings/leds/common.yaml12
-rw-r--r--Documentation/devicetree/bindings/leds/leds-bcm63138.yaml4
-rw-r--r--Documentation/devicetree/bindings/leds/leds-bcm6328.yaml4
-rw-r--r--Documentation/devicetree/bindings/leds/leds-bcm6358.txt2
-rw-r--r--Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml4
-rw-r--r--Documentation/devicetree/bindings/leds/leds-pwm.yaml5
-rw-r--r--Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml4
-rw-r--r--Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml1
-rw-r--r--Documentation/devicetree/bindings/net/can/tcan4x5x.txt3
-rw-r--r--Documentation/devicetree/bindings/net/can/xilinx,can.yaml5
-rw-r--r--Documentation/devicetree/bindings/net/cdns,macb.yaml5
-rw-r--r--Documentation/devicetree/bindings/net/dsa/ar9331.txt147
-rw-r--r--Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml1
-rw-r--r--Documentation/devicetree/bindings/net/dsa/qca,ar9331.yaml161
-rw-r--r--Documentation/devicetree/bindings/net/dsa/realtek.yaml4
-rw-r--r--Documentation/devicetree/bindings/net/ethernet-controller.yaml1
-rw-r--r--Documentation/devicetree/bindings/net/ethernet-phy-package.yaml52
-rw-r--r--Documentation/devicetree/bindings/net/fsl,fec.yaml3
-rw-r--r--Documentation/devicetree/bindings/net/nfc/ti,trf7970a.yaml2
-rw-r--r--Documentation/devicetree/bindings/net/qca,qca808x.yaml54
-rw-r--r--Documentation/devicetree/bindings/net/qcom,ethqos.yaml9
-rw-r--r--Documentation/devicetree/bindings/net/qcom,ipa.yaml2
-rw-r--r--Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml15
-rw-r--r--Documentation/devicetree/bindings/net/qcom,qca807x.yaml184
-rw-r--r--Documentation/devicetree/bindings/net/renesas,etheravb.yaml1
-rw-r--r--Documentation/devicetree/bindings/net/snps,dwmac.yaml17
-rw-r--r--Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml72
-rw-r--r--Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml5
-rw-r--r--Documentation/devicetree/bindings/net/ti,dp83822.yaml34
-rw-r--r--Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml5
-rw-r--r--Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml5
-rw-r--r--Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml33
-rw-r--r--Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml1
-rw-r--r--Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml1
-rw-r--r--Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml1
-rw-r--r--Documentation/netlink/genetlink-c.yaml41
-rw-r--r--Documentation/netlink/genetlink-legacy.yaml41
-rw-r--r--Documentation/netlink/genetlink.yaml21
-rw-r--r--Documentation/netlink/netlink-raw.yaml37
-rw-r--r--Documentation/netlink/specs/devlink.yaml2
-rw-r--r--Documentation/netlink/specs/dpll.yaml40
-rw-r--r--Documentation/netlink/specs/mptcp_pm.yaml3
-rw-r--r--Documentation/netlink/specs/netdev.yaml91
-rw-r--r--Documentation/netlink/specs/nlctrl.yaml206
-rw-r--r--Documentation/netlink/specs/tc.yaml2119
-rw-r--r--Documentation/networking/af_xdp.rst33
-rw-r--r--Documentation/networking/bonding.rst12
-rw-r--r--Documentation/networking/can.rst34
-rw-r--r--Documentation/networking/device_drivers/ethernet/amazon/ena.rst6
-rw-r--r--Documentation/networking/device_drivers/ethernet/index.rst1
-rw-r--r--Documentation/networking/device_drivers/ethernet/intel/ice.rst21
-rw-r--r--Documentation/networking/device_drivers/ethernet/marvell/octeon_ep_vf.rst24
-rw-r--r--Documentation/networking/device_drivers/wwan/t7xx.rst46
-rw-r--r--Documentation/networking/devlink/mlx5.rst9
-rw-r--r--Documentation/networking/index.rst1
-rw-r--r--Documentation/networking/ip-sysctl.rst14
-rw-r--r--Documentation/networking/l2tp.rst135
-rw-r--r--Documentation/networking/multi-pf-netdev.rst174
-rw-r--r--Documentation/networking/netconsole.rst66
-rw-r--r--Documentation/networking/netdevices.rst4
-rw-r--r--Documentation/networking/sfp-phylink.rst147
-rw-r--r--Documentation/networking/statistics.rst15
-rw-r--r--Documentation/networking/xfrm_device.rst4
-rw-r--r--Documentation/userspace-api/ioctl/ioctl-number.rst1
-rw-r--r--Documentation/userspace-api/netlink/netlink-raw.rst42
72 files changed, 4241 insertions, 654 deletions
diff --git a/Documentation/ABI/testing/sysfs-class-net-queues b/Documentation/ABI/testing/sysfs-class-net-queues
index 5bff64d256c207..84aa25e0d14d12 100644
--- a/Documentation/ABI/testing/sysfs-class-net-queues
+++ b/Documentation/ABI/testing/sysfs-class-net-queues
@@ -96,3 +96,26 @@ Description:
Indicates the absolute minimum limit of bytes allowed to be
queued on this network device transmit queue. Default value is
0.
+
+What: /sys/class/net/<iface>/queues/tx-<queue>/byte_queue_limits/stall_thrs
+Date: Jan 2024
+KernelVersion: 6.9
+Contact: netdev@vger.kernel.org
+Description:
+ Tx completion stall detection threshold in ms. Kernel will
+ guarantee to detect all stalls longer than this threshold but
+ may also detect stalls longer than half of the threshold.
+
+What: /sys/class/net/<iface>/queues/tx-<queue>/byte_queue_limits/stall_cnt
+Date: Jan 2024
+KernelVersion: 6.9
+Contact: netdev@vger.kernel.org
+Description:
+ Number of detected Tx completion stalls.
+
+What: /sys/class/net/<iface>/queues/tx-<queue>/byte_queue_limits/stall_max
+Date: Jan 2024
+KernelVersion: 6.9
+Contact: netdev@vger.kernel.org
+Description:
+ Longest detected Tx completion stall. Write 0 to clear.
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 3960916519557f..7250c0542828b4 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -206,6 +206,11 @@ Will increase power usage.
Default: 0 (off)
+mem_pcpu_rsv
+------------
+
+Per-cpu reserved forward alloc cache size in page units. Default 1MB per CPU.
+
rmem_default
------------
diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index 7985c6615f3c2f..a8f5782bd83318 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -177,10 +177,10 @@ In addition to kfuncs' arguments, verifier may need more information about the
type of kfunc(s) being registered with the BPF subsystem. To do so, we define
flags on a set of kfuncs as follows::
- BTF_SET8_START(bpf_task_set)
+ BTF_KFUNCS_START(bpf_task_set)
BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
- BTF_SET8_END(bpf_task_set)
+ BTF_KFUNCS_END(bpf_task_set)
This set encodes the BTF ID of each kfunc listed above, and encodes the flags
along with it. Ofcourse, it is also allowed to specify no flags.
@@ -347,10 +347,10 @@ Once the kfunc is prepared for use, the final step to making it visible is
registering it with the BPF subsystem. Registration is done per BPF program
type. An example is shown below::
- BTF_SET8_START(bpf_task_set)
+ BTF_KFUNCS_START(bpf_task_set)
BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
- BTF_SET8_END(bpf_task_set)
+ BTF_KFUNCS_END(bpf_task_set)
static const struct btf_kfunc_id_set bpf_task_kfunc_set = {
.owner = THIS_MODULE,
diff --git a/Documentation/bpf/map_lpm_trie.rst b/Documentation/bpf/map_lpm_trie.rst
index 74d64a30f50073..f9cd579496c9ce 100644
--- a/Documentation/bpf/map_lpm_trie.rst
+++ b/Documentation/bpf/map_lpm_trie.rst
@@ -17,7 +17,7 @@ significant byte.
LPM tries may be created with a maximum prefix length that is a multiple
of 8, in the range from 8 to 2048. The key used for lookup and update
-operations is a ``struct bpf_lpm_trie_key``, extended by
+operations is a ``struct bpf_lpm_trie_key_u8``, extended by
``max_prefixlen/8`` bytes.
- For IPv4 addresses the data length is 4 bytes
diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
index 245b6defc298c2..a5ab00ac0b1487 100644
--- a/Documentation/bpf/standardization/instruction-set.rst
+++ b/Documentation/bpf/standardization/instruction-set.rst
@@ -1,11 +1,11 @@
.. contents::
.. sectnum::
-=======================================
-BPF Instruction Set Specification, v1.0
-=======================================
+======================================
+BPF Instruction Set Architecture (ISA)
+======================================
-This document specifies version 1.0 of the BPF instruction set.
+This document specifies the BPF instruction set architecture (ISA).
Documentation conventions
=========================
@@ -24,22 +24,22 @@ a type's signedness (`S`) and bit width (`N`), respectively.
.. table:: Meaning of signedness notation.
==== =========
- `S` Meaning
+ S Meaning
==== =========
- `u` unsigned
- `s` signed
+ u unsigned
+ s signed
==== =========
.. table:: Meaning of bit-width notation.
===== =========
- `N` Bit width
+ N Bit width
===== =========
- `8` 8 bits
- `16` 16 bits
- `32` 32 bits
- `64` 64 bits
- `128` 128 bits
+ 8 8 bits
+ 16 16 bits
+ 32 32 bits
+ 64 64 bits
+ 128 128 bits
===== =========
For example, `u32` is a type whose valid values are all the 32-bit unsigned
@@ -48,31 +48,31 @@ numbers.
Functions
---------
-* `htobe16`: Takes an unsigned 16-bit number in host-endian format and
+* htobe16: Takes an unsigned 16-bit number in host-endian format and
returns the equivalent number as an unsigned 16-bit number in big-endian
format.
-* `htobe32`: Takes an unsigned 32-bit number in host-endian format and
+* htobe32: Takes an unsigned 32-bit number in host-endian format and
returns the equivalent number as an unsigned 32-bit number in big-endian
format.
-* `htobe64`: Takes an unsigned 64-bit number in host-endian format and
+* htobe64: Takes an unsigned 64-bit number in host-endian format and
returns the equivalent number as an unsigned 64-bit number in big-endian
format.
-* `htole16`: Takes an unsigned 16-bit number in host-endian format and
+* htole16: Takes an unsigned 16-bit number in host-endian format and
returns the equivalent number as an unsigned 16-bit number in little-endian
format.
-* `htole32`: Takes an unsigned 32-bit number in host-endian format and
+* htole32: Takes an unsigned 32-bit number in host-endian format and
returns the equivalent number as an unsigned 32-bit number in little-endian
format.
-* `htole64`: Takes an unsigned 64-bit number in host-endian format and
+* htole64: Takes an unsigned 64-bit number in host-endian format and
returns the equivalent number as an unsigned 64-bit number in little-endian
format.
-* `bswap16`: Takes an unsigned 16-bit number in either big- or little-endian
+* bswap16: Takes an unsigned 16-bit number in either big- or little-endian
format and returns the equivalent number with the same bit width but
opposite endianness.
-* `bswap32`: Takes an unsigned 32-bit number in either big- or little-endian
+* bswap32: Takes an unsigned 32-bit number in either big- or little-endian
format and returns the equivalent number with the same bit width but
opposite endianness.
-* `bswap64`: Takes an unsigned 64-bit number in either big- or little-endian
+* bswap64: Takes an unsigned 64-bit number in either big- or little-endian
format and returns the equivalent number with the same bit width but
opposite endianness.
@@ -97,40 +97,101 @@ Definitions
A: 10000110
B: 11111111 10000110
+Conformance groups
+------------------
+
+An implementation does not need to support all instructions specified in this
+document (e.g., deprecated instructions). Instead, a number of conformance
+groups are specified. An implementation must support the base32 conformance
+group and may support additional conformance groups, where supporting a
+conformance group means it must support all instructions in that conformance
+group.
+
+The use of named conformance groups enables interoperability between a runtime
+that executes instructions, and tools as such compilers that generate
+instructions for the runtime. Thus, capability discovery in terms of
+conformance groups might be done manually by users or automatically by tools.
+
+Each conformance group has a short ASCII label (e.g., "base32") that
+corresponds to a set of instructions that are mandatory. That is, each
+instruction has one or more conformance groups of which it is a member.
+
+This document defines the following conformance groups:
+
+* base32: includes all instructions defined in this
+ specification unless otherwise noted.
+* base64: includes base32, plus instructions explicitly noted
+ as being in the base64 conformance group.
+* atomic32: includes 32-bit atomic operation instructions (see `Atomic operations`_).
+* atomic64: includes atomic32, plus 64-bit atomic operation instructions.
+* divmul32: includes 32-bit division, multiplication, and modulo instructions.
+* divmul64: includes divmul32, plus 64-bit division, multiplication,
+ and modulo instructions.
+* packet: deprecated packet access instructions.
+
Instruction encoding
====================
BPF has two instruction encodings:
* the basic instruction encoding, which uses 64 bits to encode an instruction
-* the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
- constant) value after the basic instruction for a total of 128 bits.
+* the wide instruction encoding, which appends a second 64 bits
+ after the basic instruction for a total of 128 bits.
-The fields conforming an encoded basic instruction are stored in the
-following order::
+Basic instruction encoding
+--------------------------
- opcode:8 src_reg:4 dst_reg:4 offset:16 imm:32 // In little-endian BPF.
- opcode:8 dst_reg:4 src_reg:4 offset:16 imm:32 // In big-endian BPF.
+A basic instruction is encoded as follows::
-**imm**
- signed integer immediate value
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | opcode | regs | offset |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | imm |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-**offset**
- signed integer offset used with pointer arithmetic
+**opcode**
+ operation to perform, encoded as follows::
-**src_reg**
- the source register number (0-10), except where otherwise specified
- (`64-bit immediate instructions`_ reuse this field for other purposes)
+ +-+-+-+-+-+-+-+-+
+ |specific |class|
+ +-+-+-+-+-+-+-+-+
-**dst_reg**
- destination register number (0-10)
+ **specific**
+ The format of these bits varies by instruction class
-**opcode**
- operation to perform
+ **class**
+ The instruction class (see `Instruction classes`_)
+
+**regs**
+ The source and destination register numbers, encoded as follows
+ on a little-endian host::
+
+ +-+-+-+-+-+-+-+-+
+ |src_reg|dst_reg|
+ +-+-+-+-+-+-+-+-+
+
+ and as follows on a big-endian host::
+
+ +-+-+-+-+-+-+-+-+
+ |dst_reg|src_reg|
+ +-+-+-+-+-+-+-+-+
+
+ **src_reg**
+ the source register number (0-10), except where otherwise specified
+ (`64-bit immediate instructions`_ reuse this field for other purposes)
+
+ **dst_reg**
+ destination register number (0-10)
+
+**offset**
+ signed integer offset used with pointer arithmetic
+
+**imm**
+ signed integer immediate value
-Note that the contents of multi-byte fields ('imm' and 'offset') are
-stored using big-endian byte ordering in big-endian BPF and
-little-endian byte ordering in little-endian BPF.
+Note that the contents of multi-byte fields ('offset' and 'imm') are
+stored using big-endian byte ordering on big-endian hosts and
+little-endian byte ordering on little-endian hosts.
For example::
@@ -143,71 +204,83 @@ For example::
Note that most instructions do not use all of the fields.
Unused fields shall be cleared to zero.
-As discussed below in `64-bit immediate instructions`_, a 64-bit immediate
-instruction uses a 64-bit immediate value that is constructed as follows.
-The 64 bits following the basic instruction contain a pseudo instruction
-using the same format but with opcode, dst_reg, src_reg, and offset all set to zero,
-and imm containing the high 32 bits of the immediate value.
+Wide instruction encoding
+--------------------------
+
+Some instructions are defined to use the wide instruction encoding,
+which uses two 32-bit immediate values. The 64 bits following
+the basic instruction format contain a pseudo instruction
+with 'opcode', 'dst_reg', 'src_reg', and 'offset' all set to zero.
This is depicted in the following figure::
- basic_instruction
- .-----------------------------.
- | |
- code:8 regs:8 offset:16 imm:32 unused:32 imm:32
- | |
- '--------------'
- pseudo instruction
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | opcode | regs | offset |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | imm |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | reserved |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | next_imm |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+**opcode**
+ operation to perform, encoded as explained above
-Thus the 64-bit immediate value is constructed as follows:
+**regs**
+ The source and destination register numbers, encoded as explained above
+
+**offset**
+ signed integer offset used with pointer arithmetic
+
+**imm**
+ signed integer immediate value
- imm64 = (next_imm << 32) | imm
+**reserved**
+ unused, set to zero
-where 'next_imm' refers to the imm value of the pseudo instruction
-following the basic instruction. The unused bytes in the pseudo
-instruction are reserved and shall be cleared to zero.
+**next_imm**
+ second signed integer immediate value
Instruction classes
-------------------
-The three LSB bits of the 'opcode' field store the instruction class:
-
-========= ===== =============================== ===================================
-class value description reference
-========= ===== =============================== ===================================
-BPF_LD 0x00 non-standard load operations `Load and store instructions`_
-BPF_LDX 0x01 load into register operations `Load and store instructions`_
-BPF_ST 0x02 store from immediate operations `Load and store instructions`_
-BPF_STX 0x03 store from register operations `Load and store instructions`_
-BPF_ALU 0x04 32-bit arithmetic operations `Arithmetic and jump instructions`_
-BPF_JMP 0x05 64-bit jump operations `Arithmetic and jump instructions`_
-BPF_JMP32 0x06 32-bit jump operations `Arithmetic and jump instructions`_
-BPF_ALU64 0x07 64-bit arithmetic operations `Arithmetic and jump instructions`_
-========= ===== =============================== ===================================
+The three least significant bits of the 'opcode' field store the instruction class:
+
+===== ===== =============================== ===================================
+class value description reference
+===== ===== =============================== ===================================
+LD 0x0 non-standard load operations `Load and store instructions`_
+LDX 0x1 load into register operations `Load and store instructions`_
+ST 0x2 store from immediate operations `Load and store instructions`_
+STX 0x3 store from register operations `Load and store instructions`_
+ALU 0x4 32-bit arithmetic operations `Arithmetic and jump instructions`_
+JMP 0x5 64-bit jump operations `Arithmetic and jump instructions`_
+JMP32 0x6 32-bit jump operations `Arithmetic and jump instructions`_
+ALU64 0x7 64-bit arithmetic operations `Arithmetic and jump instructions`_
+===== ===== =============================== ===================================
Arithmetic and jump instructions
================================
-For arithmetic and jump instructions (``BPF_ALU``, ``BPF_ALU64``, ``BPF_JMP`` and
-``BPF_JMP32``), the 8-bit 'opcode' field is divided into three parts:
+For arithmetic and jump instructions (``ALU``, ``ALU64``, ``JMP`` and
+``JMP32``), the 8-bit 'opcode' field is divided into three parts::
-============== ====== =================
-4 bits (MSB) 1 bit 3 bits (LSB)
-============== ====== =================
-code source instruction class
-============== ====== =================
+ +-+-+-+-+-+-+-+-+
+ | code |s|class|
+ +-+-+-+-+-+-+-+-+
**code**
the operation code, whose meaning varies by instruction class
-**source**
+**s (source)**
the source operand location, which unless otherwise specified is one of:
====== ===== ==============================================
source value description
====== ===== ==============================================
- BPF_K 0x00 use 32-bit 'imm' value as source operand
- BPF_X 0x08 use 'src_reg' register value as source operand
+ K 0 use 32-bit 'imm' value as source operand
+ X 1 use 'src_reg' register value as source operand
====== ===== ==============================================
**instruction class**
@@ -216,70 +289,75 @@ code source instruction class
Arithmetic instructions
-----------------------
-``BPF_ALU`` uses 32-bit wide operands while ``BPF_ALU64`` uses 64-bit wide operands for
-otherwise identical operations.
+``ALU`` uses 32-bit wide operands while ``ALU64`` uses 64-bit wide operands for
+otherwise identical operations. ``ALU64`` instructions belong to the
+base64 conformance group unless noted otherwise.
The 'code' field encodes the operation as below, where 'src' and 'dst' refer
to the values of the source and destination registers, respectively.
-========= ===== ======= ==========================================================
-code value offset description
-========= ===== ======= ==========================================================
-BPF_ADD 0x00 0 dst += src
-BPF_SUB 0x10 0 dst -= src
-BPF_MUL 0x20 0 dst \*= src
-BPF_DIV 0x30 0 dst = (src != 0) ? (dst / src) : 0
-BPF_SDIV 0x30 1 dst = (src != 0) ? (dst s/ src) : 0
-BPF_OR 0x40 0 dst \|= src
-BPF_AND 0x50 0 dst &= src
-BPF_LSH 0x60 0 dst <<= (src & mask)
-BPF_RSH 0x70 0 dst >>= (src & mask)
-BPF_NEG 0x80 0 dst = -dst
-BPF_MOD 0x90 0 dst = (src != 0) ? (dst % src) : dst
-BPF_SMOD 0x90 1 dst = (src != 0) ? (dst s% src) : dst
-BPF_XOR 0xa0 0 dst ^= src
-BPF_MOV 0xb0 0 dst = src
-BPF_MOVSX 0xb0 8/16/32 dst = (s8,s16,s32)src
-BPF_ARSH 0xc0 0 :term:`sign extending<Sign Extend>` dst >>= (src & mask)
-BPF_END 0xd0 0 byte swap operations (see `Byte swap instructions`_ below)
-========= ===== ======= ==========================================================
+===== ===== ======= ==========================================================
+name code offset description
+===== ===== ======= ==========================================================
+ADD 0x0 0 dst += src
+SUB 0x1 0 dst -= src
+MUL 0x2 0 dst \*= src
+DIV 0x3 0 dst = (src != 0) ? (dst / src) : 0
+SDIV 0x3 1 dst = (src != 0) ? (dst s/ src) : 0
+OR 0x4 0 dst \|= src
+AND 0x5 0 dst &= src
+LSH 0x6 0 dst <<= (src & mask)
+RSH 0x7 0 dst >>= (src & mask)
+NEG 0x8 0 dst = -dst
+MOD 0x9 0 dst = (src != 0) ? (dst % src) : dst
+SMOD 0x9 1 dst = (src != 0) ? (dst s% src) : dst
+XOR 0xa 0 dst ^= src
+MOV 0xb 0 dst = src
+MOVSX 0xb 8/16/32 dst = (s8,s16,s32)src
+ARSH 0xc 0 :term:`sign extending<Sign Extend>` dst >>= (src & mask)
+END 0xd 0 byte swap operations (see `Byte swap instructions`_ below)
+===== ===== ======= ==========================================================
Underflow and overflow are allowed during arithmetic operations, meaning
the 64-bit or 32-bit value will wrap. If BPF program execution would
result in division by zero, the destination register is instead set to zero.
-If execution would result in modulo by zero, for ``BPF_ALU64`` the value of
-the destination register is unchanged whereas for ``BPF_ALU`` the upper
+If execution would result in modulo by zero, for ``ALU64`` the value of
+the destination register is unchanged whereas for ``ALU`` the upper
32 bits of the destination register are zeroed.
-``BPF_ADD | BPF_X | BPF_ALU`` means::
+``{ADD, X, ALU}``, where 'code' = ``ADD``, 'source' = ``X``, and 'class' = ``ALU``, means::
dst = (u32) ((u32) dst + (u32) src)
where '(u32)' indicates that the upper 32 bits are zeroed.
-``BPF_ADD | BPF_X | BPF_ALU64`` means::
+``{ADD, X, ALU64}`` means::
dst = dst + src
-``BPF_XOR | BPF_K | BPF_ALU`` means::
+``{XOR, K, ALU}`` means::
- dst = (u32) dst ^ (u32) imm32
+ dst = (u32) dst ^ (u32) imm
-``BPF_XOR | BPF_K | BPF_ALU64`` means::
+``{XOR, K, ALU64}`` means::
- dst = dst ^ imm32
+ dst = dst ^ imm
Note that most instructions have instruction offset of 0. Only three instructions
-(``BPF_SDIV``, ``BPF_SMOD``, ``BPF_MOVSX``) have a non-zero offset.
+(``SDIV``, ``SMOD``, ``MOVSX``) have a non-zero offset.
+Division, multiplication, and modulo operations for ``ALU`` are part
+of the "divmul32" conformance group, and division, multiplication, and
+modulo operations for ``ALU64`` are part of the "divmul64" conformance
+group.
The division and modulo operations support both unsigned and signed flavors.
-For unsigned operations (``BPF_DIV`` and ``BPF_MOD``), for ``BPF_ALU``,
-'imm' is interpreted as a 32-bit unsigned value. For ``BPF_ALU64``,
+For unsigned operations (``DIV`` and ``MOD``), for ``ALU``,
+'imm' is interpreted as a 32-bit unsigned value. For ``ALU64``,
'imm' is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
interpreted as a 64-bit unsigned value.
-For signed operations (``BPF_SDIV`` and ``BPF_SMOD``), for ``BPF_ALU``,
-'imm' is interpreted as a 32-bit signed value. For ``BPF_ALU64``, 'imm'
+For signed operations (``SDIV`` and ``SMOD``), for ``ALU``,
+'imm' is interpreted as a 32-bit signed value. For ``ALU64``, 'imm'
is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
interpreted as a 64-bit signed value.
@@ -291,11 +369,15 @@ etc. This specification requires that signed modulo use truncated division
a % n = a - n * trunc(a / n)
-The ``BPF_MOVSX`` instruction does a move operation with sign extension.
-``BPF_ALU | BPF_MOVSX`` :term:`sign extends<Sign Extend>` 8-bit and 16-bit operands into 32
+The ``MOVSX`` instruction does a move operation with sign extension.
+``{MOVSX, X, ALU}`` :term:`sign extends<Sign Extend>` 8-bit and 16-bit operands into 32
bit operands, and zeroes the remaining upper 32 bits.
-``BPF_ALU64 | BPF_MOVSX`` :term:`sign extends<Sign Extend>` 8-bit, 16-bit, and 32-bit
-operands into 64 bit operands.
+``{MOVSX, X, ALU64}`` :term:`sign extends<Sign Extend>` 8-bit, 16-bit, and 32-bit
+operands into 64 bit operands. Unlike other arithmetic instructions,
+``MOVSX`` is only defined for register source operands (``X``).
+
+The ``NEG`` instruction is only defined when the source bit is clear
+(``K``).
Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31)
for 32-bit operations.
@@ -303,43 +385,45 @@ for 32-bit operations.
Byte swap instructions
----------------------
-The byte swap instructions use instruction classes of ``BPF_ALU`` and ``BPF_ALU64``
-and a 4-bit 'code' field of ``BPF_END``.
+The byte swap instructions use instruction classes of ``ALU`` and ``ALU64``
+and a 4-bit 'code' field of ``END``.
The byte swap instructions operate on the destination register
only and do not use a separate source register or immediate value.
-For ``BPF_ALU``, the 1-bit source operand field in the opcode is used to
+For ``ALU``, the 1-bit source operand field in the opcode is used to
select what byte order the operation converts from or to. For
-``BPF_ALU64``, the 1-bit source operand field in the opcode is reserved
+``ALU64``, the 1-bit source operand field in the opcode is reserved
and must be set to 0.
-========= ========= ===== =================================================
-class source value description
-========= ========= ===== =================================================
-BPF_ALU BPF_TO_LE 0x00 convert between host byte order and little endian
-BPF_ALU BPF_TO_BE 0x08 convert between host byte order and big endian
-BPF_ALU64 Reserved 0x00 do byte swap unconditionally
-========= ========= ===== =================================================
+===== ======== ===== =================================================
+class source value description
+===== ======== ===== =================================================
+ALU TO_LE 0 convert between host byte order and little endian
+ALU TO_BE 1 convert between host byte order and big endian
+ALU64 Reserved 0 do byte swap unconditionally
+===== ======== ===== =================================================
The 'imm' field encodes the width of the swap operations. The following widths
-are supported: 16, 32 and 64.
+are supported: 16, 32 and 64. Width 64 operations belong to the base64
+conformance group and other swap operations belong to the base32
+conformance group.
Examples:
-``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16/32/64 means::
+``{END, TO_LE, ALU}`` with imm = 16/32/64 means::
dst = htole16(dst)
dst = htole32(dst)
dst = htole64(dst)
-``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 16/32/64 means::
+``{END, TO_BE, ALU}`` with imm = 16/32/64 means::
dst = htobe16(dst)
dst = htobe32(dst)
dst = htobe64(dst)
-``BPF_ALU64 | BPF_TO_LE | BPF_END`` with imm = 16/32/64 means::
+``{END, TO_LE, ALU64}`` with imm = 16/32/64 means::
dst = bswap16(dst)
dst = bswap32(dst)
@@ -348,56 +432,61 @@ Examples:
Jump instructions
-----------------
-``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for
-otherwise identical operations.
+``JMP32`` uses 32-bit wide operands and indicates the base32
+conformance group, while ``JMP`` uses 64-bit wide operands for
+otherwise identical operations, and indicates the base64 conformance
+group unless otherwise specified.
The 'code' field encodes the operation as below:
-======== ===== === =========================================== =========================================
-code value src description notes
-======== ===== === =========================================== =========================================
-BPF_JA 0x0 0x0 PC += offset BPF_JMP class
-BPF_JA 0x0 0x0 PC += imm BPF_JMP32 class
-BPF_JEQ 0x1 any PC += offset if dst == src
-BPF_JGT 0x2 any PC += offset if dst > src unsigned
-BPF_JGE 0x3 any PC += offset if dst >= src unsigned
-BPF_JSET 0x4 any PC += offset if dst & src
-BPF_JNE 0x5 any PC += offset if dst != src
-BPF_JSGT 0x6 any PC += offset if dst > src signed
-BPF_JSGE 0x7 any PC += offset if dst >= src signed
-BPF_CALL 0x8 0x0 call helper function by address see `Helper functions`_
-BPF_CALL 0x8 0x1 call PC += imm see `Program-local functions`_
-BPF_CALL 0x8 0x2 call helper function by BTF ID see `Helper functions`_
-BPF_EXIT 0x9 0x0 return BPF_JMP only
-BPF_JLT 0xa any PC += offset if dst < src unsigned
-BPF_JLE 0xb any PC += offset if dst <= src unsigned
-BPF_JSLT 0xc any PC += offset if dst < src signed
-BPF_JSLE 0xd any PC += offset if dst <= src signed
-======== ===== === =========================================== =========================================
-
-The BPF program needs to store the return value into register R0 before doing a
-``BPF_EXIT``.
+======== ===== ======= =============================== ===================================================
+code value src_reg description notes
+======== ===== ======= =============================== ===================================================
+JA 0x0 0x0 PC += offset {JA, K, JMP} only
+JA 0x0 0x0 PC += imm {JA, K, JMP32} only
+JEQ 0x1 any PC += offset if dst == src
+JGT 0x2 any PC += offset if dst > src unsigned
+JGE 0x3 any PC += offset if dst >= src unsigned
+JSET 0x4 any PC += offset if dst & src
+JNE 0x5 any PC += offset if dst != src
+JSGT 0x6 any PC += offset if dst > src signed
+JSGE 0x7 any PC += offset if dst >= src signed
+CALL 0x8 0x0 call helper function by address {CALL, K, JMP} only, see `Helper functions`_
+CALL 0x8 0x1 call PC += imm {CALL, K, JMP} only, see `Program-local functions`_
+CALL 0x8 0x2 call helper function by BTF ID {CALL, K, JMP} only, see `Helper functions`_
+EXIT 0x9 0x0 return {CALL, K, JMP} only
+JLT 0xa any PC += offset if dst < src unsigned
+JLE 0xb any PC += offset if dst <= src unsigned
+JSLT 0xc any PC += offset if dst < src signed
+JSLE 0xd any PC += offset if dst <= src signed
+======== ===== ======= =============================== ===================================================
+
+The BPF program needs to store the return value into register R0 before doing an
+``EXIT``.
Example:
-``BPF_JSGE | BPF_X | BPF_JMP32`` (0x7e) means::
+``{JSGE, X, JMP32}`` means::
if (s32)dst s>= (s32)src goto +offset
where 's>=' indicates a signed '>=' comparison.
-``BPF_JA | BPF_K | BPF_JMP32`` (0x06) means::
+``{JA, K, JMP32}`` means::
gotol +imm
where 'imm' means the branch offset comes from insn 'imm' field.
-Note that there are two flavors of ``BPF_JA`` instructions. The
-``BPF_JMP`` class permits a 16-bit jump offset specified by the 'offset'
-field, whereas the ``BPF_JMP32`` class permits a 32-bit jump offset
+Note that there are two flavors of ``JA`` instructions. The
+``JMP`` class permits a 16-bit jump offset specified by the 'offset'
+field, whereas the ``JMP32`` class permits a 32-bit jump offset
specified by the 'imm' field. A > 16-bit conditional jump may be
converted to a < 16-bit conditional jump plus a 32-bit unconditional
jump.
+All ``CALL`` and ``JA`` instructions belong to the
+base32 conformance group.
+
Helper functions
~~~~~~~~~~~~~~~~
@@ -416,78 +505,83 @@ Program-local functions
~~~~~~~~~~~~~~~~~~~~~~~
Program-local functions are functions exposed by the same BPF program as the
caller, and are referenced by offset from the call instruction, similar to
-``BPF_JA``. The offset is encoded in the imm field of the call instruction.
-A ``BPF_EXIT`` within the program-local function will return to the caller.
+``JA``. The offset is encoded in the imm field of the call instruction.
+A ``EXIT`` within the program-local function will return to the caller.
Load and store instructions
===========================
-For load and store instructions (``BPF_LD``, ``BPF_LDX``, ``BPF_ST``, and ``BPF_STX``), the
-8-bit 'opcode' field is divided as:
-
-============ ====== =================
-3 bits (MSB) 2 bits 3 bits (LSB)
-============ ====== =================
-mode size instruction class
-============ ====== =================
-
-The mode modifier is one of:
-
- ============= ===== ==================================== =============
- mode modifier value description reference
- ============= ===== ==================================== =============
- BPF_IMM 0x00 64-bit immediate instructions `64-bit immediate instructions`_
- BPF_ABS 0x20 legacy BPF packet access (absolute) `Legacy BPF Packet access instructions`_
- BPF_IND 0x40 legacy BPF packet access (indirect) `Legacy BPF Packet access instructions`_
- BPF_MEM 0x60 regular load and store operations `Regular load and store operations`_
- BPF_MEMSX 0x80 sign-extension load operations `Sign-extension load operations`_
- BPF_ATOMIC 0xc0 atomic operations `Atomic operations`_
- ============= ===== ==================================== =============
-
-The size modifier is one of:
-
- ============= ===== =====================
- size modifier value description
- ============= ===== =====================
- BPF_W 0x00 word (4 bytes)
- BPF_H 0x08 half word (2 bytes)
- BPF_B 0x10 byte
- BPF_DW 0x18 double word (8 bytes)
- ============= ===== =====================
+For load and store instructions (``LD``, ``LDX``, ``ST``, and ``STX``), the
+8-bit 'opcode' field is divided as::
+
+ +-+-+-+-+-+-+-+-+
+ |mode |sz |class|
+ +-+-+-+-+-+-+-+-+
+
+**mode**
+ The mode modifier is one of:
+
+ ============= ===== ==================================== =============
+ mode modifier value description reference
+ ============= ===== ==================================== =============
+ IMM 0 64-bit immediate instructions `64-bit immediate instructions`_
+ ABS 1 legacy BPF packet access (absolute) `Legacy BPF Packet access instructions`_
+ IND 2 legacy BPF packet access (indirect) `Legacy BPF Packet access instructions`_
+ MEM 3 regular load and store operations `Regular load and store operations`_
+ MEMSX 4 sign-extension load operations `Sign-extension load operations`_
+ ATOMIC 6 atomic operations `Atomic operations`_
+ ============= ===== ==================================== =============
+
+**sz (size)**
+ The size modifier is one of:
+
+ ==== ===== =====================
+ size value description
+ ==== ===== =====================
+ W 0 word (4 bytes)
+ H 1 half word (2 bytes)
+ B 2 byte
+ DW 3 double word (8 bytes)
+ ==== ===== =====================
+
+ Instructions using ``DW`` belong to the base64 conformance group.
+
+**class**
+ The instruction class (see `Instruction classes`_)
Regular load and store operations
---------------------------------
-The ``BPF_MEM`` mode modifier is used to encode regular load and store
+The ``MEM`` mode modifier is used to encode regular load and store
instructions that transfer data between a register and memory.
-``BPF_MEM | <size> | BPF_STX`` means::
+``{MEM, <size>, STX}`` means::
*(size *) (dst + offset) = src
-``BPF_MEM | <size> | BPF_ST`` means::
+``{MEM, <size>, ST}`` means::
- *(size *) (dst + offset) = imm32
+ *(size *) (dst + offset) = imm
-``BPF_MEM | <size> | BPF_LDX`` means::
+``{MEM, <size>, LDX}`` means::
dst = *(unsigned size *) (src + offset)
-Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW`` and
-'unsigned size' is one of u8, u16, u32 or u64.
+Where '<size>' is one of: ``B``, ``H``, ``W``, or ``DW``, and
+'unsigned size' is one of: u8, u16, u32, or u64.
Sign-extension load operations
------------------------------
-The ``BPF_MEMSX`` mode modifier is used to encode :term:`sign-extension<Sign Extend>` load
+The ``MEMSX`` mode modifier is used to encode :term:`sign-extension<Sign Extend>` load
instructions that transfer data between a register and memory.
-``BPF_MEMSX | <size> | BPF_LDX`` means::
+``{MEMSX, <size>, LDX}`` means::
dst = *(signed size *) (src + offset)
-Where size is one of: ``BPF_B``, ``BPF_H`` or ``BPF_W``, and
-'signed size' is one of s8, s16 or s32.
+Where size is one of: ``B``, ``H``, or ``W``, and
+'signed size' is one of: s8, s16, or s32.
Atomic operations
-----------------
@@ -497,10 +591,12 @@ interrupted or corrupted by other access to the same memory region
by other BPF programs or means outside of this specification.
All atomic operations supported by BPF are encoded as store operations
-that use the ``BPF_ATOMIC`` mode modifier as follows:
+that use the ``ATOMIC`` mode modifier as follows:
-* ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations
-* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
+* ``{ATOMIC, W, STX}`` for 32-bit operations, which are
+ part of the "atomic32" conformance group.
+* ``{ATOMIC, DW, STX}`` for 64-bit operations, which are
+ part of the "atomic64" conformance group.
* 8-bit and 16-bit wide atomic operations are not supported.
The 'imm' field is used to encode the actual atomic operation.
@@ -510,18 +606,18 @@ arithmetic operations in the 'imm' field to encode the atomic operation:
======== ===== ===========
imm value description
======== ===== ===========
-BPF_ADD 0x00 atomic add
-BPF_OR 0x40 atomic or
-BPF_AND 0x50 atomic and
-BPF_XOR 0xa0 atomic xor
+ADD 0x00 atomic add
+OR 0x40 atomic or
+AND 0x50 atomic and
+XOR 0xa0 atomic xor
======== ===== ===========
-``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means::
+``{ATOMIC, W, STX}`` with 'imm' = ADD means::
*(u32 *)(dst + offset) += src
-``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF ADD means::
+``{ATOMIC, DW, STX}`` with 'imm' = ADD means::
*(u64 *)(dst + offset) += src
@@ -531,20 +627,20 @@ two complex atomic operations:
=========== ================ ===========================
imm value description
=========== ================ ===========================
-BPF_FETCH 0x01 modifier: return old value
-BPF_XCHG 0xe0 | BPF_FETCH atomic exchange
-BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange
+FETCH 0x01 modifier: return old value
+XCHG 0xe0 | FETCH atomic exchange
+CMPXCHG 0xf0 | FETCH atomic compare and exchange
=========== ================ ===========================
-The ``BPF_FETCH`` modifier is optional for simple atomic operations, and
-always set for the complex atomic operations. If the ``BPF_FETCH`` flag
+The ``FETCH`` modifier is optional for simple atomic operations, and
+always set for the complex atomic operations. If the ``FETCH`` flag
is set, then the operation also overwrites ``src`` with the value that
was in memory before it was modified.
-The ``BPF_XCHG`` operation atomically exchanges ``src`` with the value
+The ``XCHG`` operation atomically exchanges ``src`` with the value
addressed by ``dst + offset``.
-The ``BPF_CMPXCHG`` operation atomically compares the value addressed by
+The ``CMPXCHG`` operation atomically compares the value addressed by
``dst + offset`` with ``R0``. If they match, the value addressed by
``dst + offset`` is replaced with ``src``. In either case, the
value that was at ``dst + offset`` before the operation is zero-extended
@@ -553,25 +649,25 @@ and loaded back to ``R0``.
64-bit immediate instructions
-----------------------------
-Instructions with the ``BPF_IMM`` 'mode' modifier use the wide instruction
-encoding defined in `Instruction encoding`_, and use the 'src' field of the
+Instructions with the ``IMM`` 'mode' modifier use the wide instruction
+encoding defined in `Instruction encoding`_, and use the 'src_reg' field of the
basic instruction to hold an opcode subtype.
-The following table defines a set of ``BPF_IMM | BPF_DW | BPF_LD`` instructions
-with opcode subtypes in the 'src' field, using new terms such as "map"
+The following table defines a set of ``{IMM, DW, LD}`` instructions
+with opcode subtypes in the 'src_reg' field, using new terms such as "map"
defined further below:
-========================= ====== === ========================================= =========== ==============
-opcode construction opcode src pseudocode imm type dst type
-========================= ====== === ========================================= =========== ==============
-BPF_IMM | BPF_DW | BPF_LD 0x18 0x0 dst = imm64 integer integer
-BPF_IMM | BPF_DW | BPF_LD 0x18 0x1 dst = map_by_fd(imm) map fd map
-BPF_IMM | BPF_DW | BPF_LD 0x18 0x2 dst = map_val(map_by_fd(imm)) + next_imm map fd data pointer
-BPF_IMM | BPF_DW | BPF_LD 0x18 0x3 dst = var_addr(imm) variable id data pointer
-BPF_IMM | BPF_DW | BPF_LD 0x18 0x4 dst = code_addr(imm) integer code pointer
-BPF_IMM | BPF_DW | BPF_LD 0x18 0x5 dst = map_by_idx(imm) map index map
-BPF_IMM | BPF_DW | BPF_LD 0x18 0x6 dst = map_val(map_by_idx(imm)) + next_imm map index data pointer
-========================= ====== === ========================================= =========== ==============
+======= ========================================= =========== ==============
+src_reg pseudocode imm type dst type
+======= ========================================= =========== ==============
+0x0 dst = (next_imm << 32) | imm integer integer
+0x1 dst = map_by_fd(imm) map fd map
+0x2 dst = map_val(map_by_fd(imm)) + next_imm map fd data pointer
+0x3 dst = var_addr(imm) variable id data pointer
+0x4 dst = code_addr(imm) integer code pointer
+0x5 dst = map_by_idx(imm) map index map
+0x6 dst = map_val(map_by_idx(imm)) + next_imm map index data pointer
+======= ========================================= =========== ==============
where
@@ -609,5 +705,9 @@ Legacy BPF Packet access instructions
-------------------------------------
BPF previously introduced special instructions for access to packet data that were
-carried over from classic BPF. However, these instructions are
-deprecated and should no longer be used.
+carried over from classic BPF. These instructions used an instruction
+class of ``LD``, a size modifier of ``W``, ``H``, or ``B``, and a
+mode modifier of ``ABS`` or ``IND``. The 'dst_reg' and 'offset' fields were
+set to zero, and 'src_reg' was set to zero for ``ABS``. However, these
+instructions are deprecated and should no longer be used. All legacy packet
+access instructions belong to the "packet" conformance group.
diff --git a/Documentation/bpf/verifier.rst b/Documentation/bpf/verifier.rst
index f0ec19db301c69..356894399fbf83 100644
--- a/Documentation/bpf/verifier.rst
+++ b/Documentation/bpf/verifier.rst
@@ -562,7 +562,7 @@ works::
* ``checkpoint[0].r1`` is marked as read;
* At instruction #5 exit is reached and ``checkpoint[0]`` can now be processed
- by ``clean_live_states()``. After this processing ``checkpoint[0].r0`` has a
+ by ``clean_live_states()``. After this processing ``checkpoint[0].r1`` has a
read mark and all other registers and stack slots are marked as ``NOT_INIT``
or ``STACK_INVALID``
diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
index 7f3582a67318be..ff10dc6eef5d91 100644
--- a/Documentation/dev-tools/kselftest.rst
+++ b/Documentation/dev-tools/kselftest.rst
@@ -259,9 +259,21 @@ Contributing new tests (details)
TEST_PROGS_EXTENDED, TEST_GEN_PROGS_EXTENDED mean it is the
executable which is not tested by default.
+
TEST_FILES, TEST_GEN_FILES mean it is the file which is used by
test.
+ TEST_INCLUDES is similar to TEST_FILES, it lists files which should be
+ included when exporting or installing the tests, with the following
+ differences:
+
+ * symlinks to files in other directories are preserved
+ * the part of paths below tools/testing/selftests/ is preserved when
+ copying the files to the output directory
+
+ TEST_INCLUDES is meant to list dependencies located in other directories of
+ the selftests hierarchy.
+
* First use the headers inside the kernel source and/or git repo, and then the
system headers. Headers for the kernel release as opposed to headers
installed by the distro on the system should be the primary focus to be able
diff --git a/Documentation/devicetree/bindings/leds/common.yaml b/Documentation/devicetree/bindings/leds/common.yaml
index 55a8d1385e2104..8a3c2398b10ce0 100644
--- a/Documentation/devicetree/bindings/leds/common.yaml
+++ b/Documentation/devicetree/bindings/leds/common.yaml
@@ -200,6 +200,18 @@ properties:
#trigger-source-cells property in the source node.
$ref: /schemas/types.yaml#/definitions/phandle-array
+ active-low:
+ type: boolean
+ description:
+ Makes LED active low. To turn the LED ON, line needs to be
+ set to low voltage instead of high.
+
+ inactive-high-impedance:
+ type: boolean
+ description:
+ Set LED to high-impedance mode to turn the LED OFF. LED might also
+ describe this mode as tristate.
+
# Required properties for flash LED child nodes:
flash-max-microamp:
description:
diff --git a/Documentation/devicetree/bindings/leds/leds-bcm63138.yaml b/Documentation/devicetree/bindings/leds/leds-bcm63138.yaml
index 52252fb6bb321d..bb20394fca5c38 100644
--- a/Documentation/devicetree/bindings/leds/leds-bcm63138.yaml
+++ b/Documentation/devicetree/bindings/leds/leds-bcm63138.yaml
@@ -52,10 +52,6 @@ patternProperties:
maxItems: 1
description: LED pin number
- active-low:
- type: boolean
- description: Makes LED active low
-
required:
- reg
diff --git a/Documentation/devicetree/bindings/leds/leds-bcm6328.yaml b/Documentation/devicetree/bindings/leds/leds-bcm6328.yaml
index 51cc0d82c12eb8..f3a3ef99292995 100644
--- a/Documentation/devicetree/bindings/leds/leds-bcm6328.yaml
+++ b/Documentation/devicetree/bindings/leds/leds-bcm6328.yaml
@@ -78,10 +78,6 @@ patternProperties:
- maximum: 23
description: LED pin number (only LEDs 0 to 23 are valid).
- active-low:
- type: boolean
- description: Makes LED active low.
-
brcm,hardware-controlled:
type: boolean
description: Makes this LED hardware controlled.
diff --git a/Documentation/devicetree/bindings/leds/leds-bcm6358.txt b/Documentation/devicetree/bindings/leds/leds-bcm6358.txt
index 6e51c6b91ee54c..211ffc3c4a2012 100644
--- a/Documentation/devicetree/bindings/leds/leds-bcm6358.txt
+++ b/Documentation/devicetree/bindings/leds/leds-bcm6358.txt
@@ -25,8 +25,6 @@ LED sub-node required properties:
LED sub-node optional properties:
- label : see Documentation/devicetree/bindings/leds/common.txt
- - active-low : Boolean, makes LED active low.
- Default : false
- default-state : see
Documentation/devicetree/bindings/leds/common.txt
- linux,default-trigger : see
diff --git a/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml b/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml
index bd6ec04a87277f..a31a202afe5ccf 100644
--- a/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml
+++ b/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml
@@ -41,9 +41,7 @@ properties:
pwm-names: true
- active-low:
- description: For PWMs where the LED is wired to supply rather than ground.
- type: boolean
+ active-low: true
color: true
diff --git a/Documentation/devicetree/bindings/leds/leds-pwm.yaml b/Documentation/devicetree/bindings/leds/leds-pwm.yaml
index 7de6da58be3c53..113b7c218303ad 100644
--- a/Documentation/devicetree/bindings/leds/leds-pwm.yaml
+++ b/Documentation/devicetree/bindings/leds/leds-pwm.yaml
@@ -34,11 +34,6 @@ patternProperties:
Maximum brightness possible for the LED
$ref: /schemas/types.yaml#/definitions/uint32
- active-low:
- description:
- For PWMs where the LED is wired to supply rather than ground.
- type: boolean
-
required:
- pwms
- max-brightness
diff --git a/Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml b/Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml
index 75d8138298fbbe..660e2ca42daf50 100644
--- a/Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml
+++ b/Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml
@@ -17,6 +17,10 @@ properties:
oneOf:
- items:
- enum:
+ - brcm,bcm74165b0-asp
+ - const: brcm,asp-v2.2
+ - items:
+ - enum:
- brcm,bcm74165-asp
- const: brcm,asp-v2.1
- items:
diff --git a/Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml b/Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml
index 6684810fcbf01c..23dfe0838dca48 100644
--- a/Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml
+++ b/Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml
@@ -24,6 +24,7 @@ properties:
- brcm,genet-mdio-v5
- brcm,asp-v2.0-mdio
- brcm,asp-v2.1-mdio
+ - brcm,asp-v2.2-mdio
- brcm,unimac-mdio
reg:
diff --git a/Documentation/devicetree/bindings/net/can/tcan4x5x.txt b/Documentation/devicetree/bindings/net/can/tcan4x5x.txt
index 170e23f0610db9..20c0572c985342 100644
--- a/Documentation/devicetree/bindings/net/can/tcan4x5x.txt
+++ b/Documentation/devicetree/bindings/net/can/tcan4x5x.txt
@@ -28,6 +28,8 @@ Optional properties:
available with tcan4552/4553.
- device-wake-gpios: Wake up GPIO to wake up the TCAN device. Not
available with tcan4552/4553.
+ - wakeup-source: Leave the chip running when suspended, and configure
+ the RX interrupt to wake up the device.
Example:
tcan4x5x: tcan4x5x@0 {
@@ -42,4 +44,5 @@ tcan4x5x: tcan4x5x@0 {
device-state-gpios = <&gpio3 21 GPIO_ACTIVE_HIGH>;
device-wake-gpios = <&gpio1 15 GPIO_ACTIVE_HIGH>;
reset-gpios = <&gpio1 27 GPIO_ACTIVE_HIGH>;
+ wakeup-source;
};
diff --git a/Documentation/devicetree/bindings/net/can/xilinx,can.yaml b/Documentation/devicetree/bindings/net/can/xilinx,can.yaml
index 64d57c343e6f0a..8d4e5af6fd6c84 100644
--- a/Documentation/devicetree/bindings/net/can/xilinx,can.yaml
+++ b/Documentation/devicetree/bindings/net/can/xilinx,can.yaml
@@ -49,6 +49,10 @@ properties:
resets:
maxItems: 1
+ xlnx,has-ecc:
+ $ref: /schemas/types.yaml#/definitions/flag
+ description: CAN TX_OL, TX_TL and RX FIFOs have ECC support(AXI CAN)
+
required:
- compatible
- reg
@@ -137,6 +141,7 @@ examples:
interrupts = <GIC_SPI 59 IRQ_TYPE_EDGE_RISING>;
tx-fifo-depth = <0x40>;
rx-fifo-depth = <0x40>;
+ xlnx,has-ecc;
};
- |
diff --git a/Documentation/devicetree/bindings/net/cdns,macb.yaml b/Documentation/devicetree/bindings/net/cdns,macb.yaml
index bf8894a0257e9d..2c71e2cf3a2fd2 100644
--- a/Documentation/devicetree/bindings/net/cdns,macb.yaml
+++ b/Documentation/devicetree/bindings/net/cdns,macb.yaml
@@ -59,6 +59,11 @@ properties:
- cdns,gem # Generic
- cdns,macb # Generic
+ - items:
+ - enum:
+ - microchip,sam9x7-gem # Microchip SAM9X7 gigabit ethernet interface
+ - const: microchip,sama7g5-gem # Microchip SAMA7G5 gigabit ethernet interface
+
reg:
minItems: 1
items:
diff --git a/Documentation/devicetree/bindings/net/dsa/ar9331.txt b/Documentation/devicetree/bindings/net/dsa/ar9331.txt
deleted file mode 100644
index f824fdae0da20a..00000000000000
--- a/Documentation/devicetree/bindings/net/dsa/ar9331.txt
+++ /dev/null
@@ -1,147 +0,0 @@
-Atheros AR9331 built-in switch
-=============================
-
-It is a switch built-in to Atheros AR9331 WiSoC and addressable over internal
-MDIO bus. All PHYs are built-in as well.
-
-Required properties:
-
- - compatible: should be: "qca,ar9331-switch"
- - reg: Address on the MII bus for the switch.
- - resets : Must contain an entry for each entry in reset-names.
- - reset-names : Must include the following entries: "switch"
- - interrupt-parent: Phandle to the parent interrupt controller
- - interrupts: IRQ line for the switch
- - interrupt-controller: Indicates the switch is itself an interrupt
- controller. This is used for the PHY interrupts.
- - #interrupt-cells: must be 1
- - mdio: Container of PHY and devices on the switches MDIO bus.
-
-See Documentation/devicetree/bindings/net/dsa/dsa.txt for a list of additional
-required and optional properties.
-Examples:
-
-eth0: ethernet@19000000 {
- compatible = "qca,ar9330-eth";
- reg = <0x19000000 0x200>;
- interrupts = <4>;
-
- resets = <&rst 9>, <&rst 22>;
- reset-names = "mac", "mdio";
- clocks = <&pll ATH79_CLK_AHB>, <&pll ATH79_CLK_AHB>;
- clock-names = "eth", "mdio";
-
- phy-mode = "mii";
- phy-handle = <&phy_port4>;
-};
-
-eth1: ethernet@1a000000 {
- compatible = "qca,ar9330-eth";
- reg = <0x1a000000 0x200>;
- interrupts = <5>;
- resets = <&rst 13>, <&rst 23>;
- reset-names = "mac", "mdio";
- clocks = <&pll ATH79_CLK_AHB>, <&pll ATH79_CLK_AHB>;
- clock-names = "eth", "mdio";
-
- phy-mode = "gmii";
-
- fixed-link {
- speed = <1000>;
- full-duplex;
- };
-
- mdio {
- #address-cells = <1>;
- #size-cells = <0>;
-
- switch10: switch@10 {
- #address-cells = <1>;
- #size-cells = <0>;
-
- compatible = "qca,ar9331-switch";
- reg = <0x10>;
- resets = <&rst 8>;
- reset-names = "switch";
-
- interrupt-parent = <&miscintc>;
- interrupts = <12>;
-
- interrupt-controller;
- #interrupt-cells = <1>;
-
- ports {
- #address-cells = <1>;
- #size-cells = <0>;
-
- switch_port0: port@0 {
- reg = <0x0>;
- ethernet = <&eth1>;
-
- phy-mode = "gmii";
-
- fixed-link {
- speed = <1000>;
- full-duplex;
- };
- };
-
- switch_port1: port@1 {
- reg = <0x1>;
- phy-handle = <&phy_port0>;
- phy-mode = "internal";
- };
-
- switch_port2: port@2 {
- reg = <0x2>;
- phy-handle = <&phy_port1>;
- phy-mode = "internal";
- };
-
- switch_port3: port@3 {
- reg = <0x3>;
- phy-handle = <&phy_port2>;
- phy-mode = "internal";
- };
-
- switch_port4: port@4 {
- reg = <0x4>;
- phy-handle = <&phy_port3>;
- phy-mode = "internal";
- };
- };
-
- mdio {
- #address-cells = <1>;
- #size-cells = <0>;
-
- interrupt-parent = <&switch10>;
-
- phy_port0: phy@0 {
- reg = <0x0>;
- interrupts = <0>;
- };
-
- phy_port1: phy@1 {
- reg = <0x1>;
- interrupts = <0>;
- };
-
- phy_port2: phy@2 {
- reg = <0x2>;
- interrupts = <0>;
- };
-
- phy_port3: phy@3 {
- reg = <0x3>;
- interrupts = <0>;
- };
-
- phy_port4: phy@4 {
- reg = <0x4>;
- interrupts = <0>;
- };
- };
- };
- };
-};
diff --git a/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml b/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml
index c963dc09e8e12a..52acc15ebcbf36 100644
--- a/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml
+++ b/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml
@@ -31,6 +31,7 @@ properties:
- microchip,ksz9893
- microchip,ksz9563
- microchip,ksz8563
+ - microchip,ksz8567
reset-gpios:
description:
diff --git a/Documentation/devicetree/bindings/net/dsa/qca,ar9331.yaml b/Documentation/devicetree/bindings/net/dsa/qca,ar9331.yaml
new file mode 100644
index 00000000000000..fd9ddc59d38c31
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dsa/qca,ar9331.yaml
@@ -0,0 +1,161 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/dsa/qca,ar9331.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm Atheros AR9331 built-in switch
+
+maintainers:
+ - Oleksij Rempel <o.rempel@pengutronix.de>
+
+description:
+ Qualcomm Atheros AR9331 is a switch built-in to Atheros AR9331 WiSoC and
+ addressable over internal MDIO bus. All PHYs are built-in as well.
+
+properties:
+ compatible:
+ const: qca,ar9331-switch
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ interrupt-controller: true
+
+ '#interrupt-cells':
+ const: 1
+
+ mdio:
+ $ref: /schemas/net/mdio.yaml#
+ unevaluatedProperties: false
+ properties:
+ interrupt-parent: true
+
+ patternProperties:
+ '(ethernet-)?phy@[0-4]+$':
+ type: object
+ unevaluatedProperties: false
+
+ properties:
+ reg: true
+ interrupts:
+ maxItems: 1
+
+ resets:
+ maxItems: 1
+
+ reset-names:
+ items:
+ - const: switch
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - interrupt-controller
+ - '#interrupt-cells'
+ - mdio
+ - ports
+ - resets
+ - reset-names
+
+allOf:
+ - $ref: dsa.yaml#/$defs/ethernet-ports
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ mdio {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ switch10: switch@10 {
+ compatible = "qca,ar9331-switch";
+ reg = <0x10>;
+
+ interrupt-parent = <&miscintc>;
+ interrupts = <12>;
+ interrupt-controller;
+ #interrupt-cells = <1>;
+
+ resets = <&rst 8>;
+ reset-names = "switch";
+
+ ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ port@0 {
+ reg = <0x0>;
+ ethernet = <&eth1>;
+
+ phy-mode = "gmii";
+
+ fixed-link {
+ speed = <1000>;
+ full-duplex;
+ };
+ };
+
+ port@1 {
+ reg = <0x1>;
+ phy-handle = <&phy_port0>;
+ phy-mode = "internal";
+ };
+
+ port@2 {
+ reg = <0x2>;
+ phy-handle = <&phy_port1>;
+ phy-mode = "internal";
+ };
+
+ port@3 {
+ reg = <0x3>;
+ phy-handle = <&phy_port2>;
+ phy-mode = "internal";
+ };
+
+ port@4 {
+ reg = <0x4>;
+ phy-handle = <&phy_port3>;
+ phy-mode = "internal";
+ };
+ };
+
+ mdio {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ interrupt-parent = <&switch10>;
+
+ phy_port0: ethernet-phy@0 {
+ reg = <0x0>;
+ interrupts = <0>;
+ };
+
+ phy_port1: ethernet-phy@1 {
+ reg = <0x1>;
+ interrupts = <0>;
+ };
+
+ phy_port2: ethernet-phy@2 {
+ reg = <0x2>;
+ interrupts = <0>;
+ };
+
+ phy_port3: ethernet-phy@3 {
+ reg = <0x3>;
+ interrupts = <0>;
+ };
+
+ phy_port4: ethernet-phy@4 {
+ reg = <0x4>;
+ interrupts = <0>;
+ };
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/dsa/realtek.yaml b/Documentation/devicetree/bindings/net/dsa/realtek.yaml
index cce692f57b0800..70b6bda3cf98e5 100644
--- a/Documentation/devicetree/bindings/net/dsa/realtek.yaml
+++ b/Documentation/devicetree/bindings/net/dsa/realtek.yaml
@@ -59,6 +59,9 @@ properties:
description: GPIO to be used to reset the whole device
maxItems: 1
+ resets:
+ maxItems: 1
+
realtek,disable-leds:
type: boolean
description: |
@@ -127,7 +130,6 @@ else:
- mdc-gpios
- mdio-gpios
- mdio
- - reset-gpios
required:
- compatible
diff --git a/Documentation/devicetree/bindings/net/ethernet-controller.yaml b/Documentation/devicetree/bindings/net/ethernet-controller.yaml
index d14d123ad7a028..b2785b03139f9d 100644
--- a/Documentation/devicetree/bindings/net/ethernet-controller.yaml
+++ b/Documentation/devicetree/bindings/net/ethernet-controller.yaml
@@ -14,7 +14,6 @@ properties:
pattern: "^ethernet(@.*)?$"
label:
- $ref: /schemas/types.yaml#/definitions/string
description: Human readable label on a port of a box.
local-mac-address:
diff --git a/Documentation/devicetree/bindings/net/ethernet-phy-package.yaml b/Documentation/devicetree/bindings/net/ethernet-phy-package.yaml
new file mode 100644
index 00000000000000..e567101e6f38b1
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/ethernet-phy-package.yaml
@@ -0,0 +1,52 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/ethernet-phy-package.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Ethernet PHY Package Common Properties
+
+maintainers:
+ - Christian Marangi <ansuelsmth@gmail.com>
+
+description:
+ PHY packages are multi-port Ethernet PHY of the same family
+ and each Ethernet PHY is affected by the global configuration
+ of the PHY package.
+
+ Each reg of the PHYs defined in the PHY package node is
+ absolute and describe the real address of the Ethernet PHY on
+ the MDIO bus.
+
+properties:
+ $nodename:
+ pattern: "^ethernet-phy-package@[a-f0-9]+$"
+
+ reg:
+ minimum: 0
+ maximum: 31
+ description:
+ The base ID number for the PHY package.
+ Commonly the ID of the first PHY in the PHY package.
+
+ Some PHY in the PHY package might be not defined but
+ still occupy ID on the device (just not attached to
+ anything) hence the PHY package reg might correspond
+ to a not attached PHY (offset 0).
+
+ '#address-cells':
+ const: 1
+
+ '#size-cells':
+ const: 0
+
+patternProperties:
+ ^ethernet-phy@[a-f0-9]+$:
+ $ref: ethernet-phy.yaml#
+
+required:
+ - reg
+ - '#address-cells'
+ - '#size-cells'
+
+additionalProperties: true
diff --git a/Documentation/devicetree/bindings/net/fsl,fec.yaml b/Documentation/devicetree/bindings/net/fsl,fec.yaml
index 8948a11c994e48..5536c06139cae5 100644
--- a/Documentation/devicetree/bindings/net/fsl,fec.yaml
+++ b/Documentation/devicetree/bindings/net/fsl,fec.yaml
@@ -224,6 +224,9 @@ properties:
Can be omitted thus no delay is observed. Delay is in range of 1ms to 1000ms.
Other delays are invalid.
+ iommus:
+ maxItems: 1
+
required:
- compatible
- reg
diff --git a/Documentation/devicetree/bindings/net/nfc/ti,trf7970a.yaml b/Documentation/devicetree/bindings/net/nfc/ti,trf7970a.yaml
index 9cc236ec42f232..d0332eb76ad263 100644
--- a/Documentation/devicetree/bindings/net/nfc/ti,trf7970a.yaml
+++ b/Documentation/devicetree/bindings/net/nfc/ti,trf7970a.yaml
@@ -73,7 +73,7 @@ examples:
#include <dt-bindings/gpio/gpio.h>
#include <dt-bindings/interrupt-controller/irq.h>
- i2c {
+ spi {
#address-cells = <1>;
#size-cells = <0>;
diff --git a/Documentation/devicetree/bindings/net/qca,qca808x.yaml b/Documentation/devicetree/bindings/net/qca,qca808x.yaml
new file mode 100644
index 00000000000000..e2552655902a38
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/qca,qca808x.yaml
@@ -0,0 +1,54 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/qca,qca808x.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm Atheros QCA808X PHY
+
+maintainers:
+ - Christian Marangi <ansuelsmth@gmail.com>
+
+description:
+ QCA808X PHYs can have up to 3 LEDs attached.
+ All 3 LEDs are disabled by default.
+ 2 LEDs have dedicated pins with the 3rd LED having the
+ double function of Interrupt LEDs/GPIO or additional LED.
+
+ By default this special PIN is set to LED function.
+
+allOf:
+ - $ref: ethernet-phy.yaml#
+
+properties:
+ compatible:
+ enum:
+ - ethernet-phy-id004d.d101
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/leds/common.h>
+
+ mdio {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ ethernet-phy@0 {
+ compatible = "ethernet-phy-id004d.d101";
+ reg = <0>;
+
+ leds {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ led@0 {
+ reg = <0>;
+ color = <LED_COLOR_ID_GREEN>;
+ function = LED_FUNCTION_WAN;
+ default-state = "keep";
+ };
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/qcom,ethqos.yaml b/Documentation/devicetree/bindings/net/qcom,ethqos.yaml
index 7bdb412a018553..69a337c7e345ea 100644
--- a/Documentation/devicetree/bindings/net/qcom,ethqos.yaml
+++ b/Documentation/devicetree/bindings/net/qcom,ethqos.yaml
@@ -37,12 +37,14 @@ properties:
items:
- description: Combined signal for various interrupt events
- description: The interrupt that occurs when Rx exits the LPI state
+ - description: The interrupt that occurs when HW safety error triggered
interrupt-names:
minItems: 1
items:
- const: macirq
- - const: eth_lpi
+ - enum: [eth_lpi, sfty]
+ - const: sfty
clocks:
maxItems: 4
@@ -89,8 +91,9 @@ examples:
<&gcc GCC_ETH_PTP_CLK>,
<&gcc GCC_ETH_RGMII_CLK>;
interrupts = <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>;
- interrupt-names = "macirq", "eth_lpi";
+ <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 782 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "macirq", "eth_lpi", "sfty";
rx-fifo-depth = <4096>;
tx-fifo-depth = <4096>;
diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
index c30218684cfe46..53cae71d99572c 100644
--- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml
+++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
@@ -159,7 +159,7 @@ properties:
when the AP (not the modem) performs early initialization.
firmware-name:
- $ref: /schemas/types.yaml#/definitions/string
+ maxItems: 1
description:
If present, name (or relative path) of the file within the
firmware search path containing the firmware image used when
diff --git a/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml b/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml
index 3407e909e8a7ad..0029e197a8251e 100644
--- a/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml
+++ b/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml
@@ -44,6 +44,21 @@ properties:
items:
- const: gcc_mdio_ahb_clk
+ clock-frequency:
+ description:
+ The MDIO bus clock that must be output by the MDIO bus hardware, if
+ absent, the default hardware values are used.
+
+ MDC rate is feed by an external clock (fixed 100MHz) and is divider
+ internally. The default divider is /256 resulting in the default rate
+ applied of 390KHz.
+
+ To follow 802.3 standard that instruct up to 2.5MHz by default, if
+ this property is not declared and the divider is set to /256, by
+ default 1.5625Mhz is select.
+ enum: [ 390625, 781250, 1562500, 3125000, 6250000, 12500000 ]
+ default: 1562500
+
required:
- compatible
- reg
diff --git a/Documentation/devicetree/bindings/net/qcom,qca807x.yaml b/Documentation/devicetree/bindings/net/qcom,qca807x.yaml
new file mode 100644
index 00000000000000..7290024024f526
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/qcom,qca807x.yaml
@@ -0,0 +1,184 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/qcom,qca807x.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm QCA807x Ethernet PHY
+
+maintainers:
+ - Christian Marangi <ansuelsmth@gmail.com>
+ - Robert Marko <robert.marko@sartura.hr>
+
+description: |
+ Qualcomm QCA8072/5 Ethernet PHY is PHY package of 2 or 5
+ IEEE 802.3 clause 22 compliant 10BASE-Te, 100BASE-TX and
+ 1000BASE-T PHY-s.
+
+ They feature 2 SerDes, one for PSGMII or QSGMII connection with
+ MAC, while second one is SGMII for connection to MAC or fiber.
+
+ Both models have a combo port that supports 1000BASE-X and
+ 100BASE-FX fiber.
+
+ Each PHY inside of QCA807x series has 4 digitally controlled
+ output only pins that natively drive LED-s for up to 2 attached
+ LEDs. Some vendor also use these 4 output for GPIO usage without
+ attaching LEDs.
+
+ Note that output pins can be set to drive LEDs OR GPIO, mixed
+ definition are not accepted.
+
+$ref: ethernet-phy-package.yaml#
+
+properties:
+ compatible:
+ enum:
+ - qcom,qca8072-package
+ - qcom,qca8075-package
+
+ qcom,package-mode:
+ description: |
+ PHY package can be configured in 3 mode following this table:
+
+ First Serdes mode Second Serdes mode
+ Option 1 PSGMII for copper Disabled
+ ports 0-4
+ Option 2 PSGMII for copper 1000BASE-X / 100BASE-FX
+ ports 0-4
+ Option 3 QSGMII for copper SGMII for
+ ports 0-3 copper port 4
+
+ PSGMII mode (option 1 or 2) is configured dynamically based on
+ the presence of a connected SFP device.
+ $ref: /schemas/types.yaml#/definitions/string
+ enum:
+ - qsgmii
+ - psgmii
+ default: psgmii
+
+ qcom,tx-drive-strength-milliwatt:
+ description: set the TX Amplifier value in mv.
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [140, 160, 180, 200, 220,
+ 240, 260, 280, 300, 320,
+ 400, 500, 600]
+ default: 600
+
+patternProperties:
+ ^ethernet-phy@[a-f0-9]+$:
+ $ref: ethernet-phy.yaml#
+
+ properties:
+ qcom,dac-full-amplitude:
+ description:
+ Set Analog MDI driver amplitude to FULL.
+
+ With this not defined, amplitude is set to DSP.
+ (amplitude is adjusted based on cable length)
+
+ With this enabled and qcom,dac-full-bias-current
+ and qcom,dac-disable-bias-current-tweak disabled,
+ bias current is half.
+ type: boolean
+
+ qcom,dac-full-bias-current:
+ description:
+ Set Analog MDI driver bias current to FULL.
+
+ With this not defined, bias current is set to DSP.
+ (bias current is adjusted based on cable length)
+
+ Actual bias current might be different with
+ qcom,dac-disable-bias-current-tweak disabled.
+ type: boolean
+
+ qcom,dac-disable-bias-current-tweak:
+ description: |
+ Set Analog MDI driver bias current to disable tweak
+ to bias current.
+
+ With this not defined, bias current tweak are enabled
+ by default.
+
+ With this enabled the following tweak are NOT applied:
+ - With both FULL amplitude and FULL bias current: bias current
+ is set to half.
+ - With only DSP amplitude: bias current is set to half and
+ is set to 1/4 with cable < 10m.
+ - With DSP bias current (included both DSP amplitude and
+ DSP bias current): bias current is half the detected current
+ with cable < 10m.
+ type: boolean
+
+ gpio-controller: true
+
+ '#gpio-cells':
+ const: 2
+
+ if:
+ required:
+ - gpio-controller
+ then:
+ properties:
+ leds: false
+
+ unevaluatedProperties: false
+
+required:
+ - compatible
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/leds/common.h>
+
+ mdio {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ ethernet-phy-package@0 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ compatible = "qcom,qca8075-package";
+ reg = <0>;
+
+ qcom,package-mode = "qsgmii";
+
+ ethernet-phy@0 {
+ reg = <0>;
+
+ leds {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ led@0 {
+ reg = <0>;
+ color = <LED_COLOR_ID_GREEN>;
+ function = LED_FUNCTION_LAN;
+ default-state = "keep";
+ };
+ };
+ };
+
+ ethernet-phy@1 {
+ reg = <1>;
+ };
+
+ ethernet-phy@2 {
+ reg = <2>;
+
+ gpio-controller;
+ #gpio-cells = <2>;
+ };
+
+ ethernet-phy@3 {
+ reg = <3>;
+ };
+
+ ethernet-phy@4 {
+ reg = <4>;
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/renesas,etheravb.yaml b/Documentation/devicetree/bindings/net/renesas,etheravb.yaml
index 890f7858d0dc4c..de7ba7f345a937 100644
--- a/Documentation/devicetree/bindings/net/renesas,etheravb.yaml
+++ b/Documentation/devicetree/bindings/net/renesas,etheravb.yaml
@@ -46,6 +46,7 @@ properties:
- enum:
- renesas,etheravb-r8a779a0 # R-Car V3U
- renesas,etheravb-r8a779g0 # R-Car V4H
+ - renesas,etheravb-r8a779h0 # R-Car V4M
- const: renesas,etheravb-rcar-gen4 # R-Car Gen4
- items:
diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
index 5c2769dc689af7..6b0341a8e0ea5f 100644
--- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml
+++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
@@ -95,6 +95,7 @@ properties:
- snps,dwmac-5.20
- snps,dwxgmac
- snps,dwxgmac-2.10
+ - starfive,jh7100-dwmac
- starfive,jh7110-dwmac
reg:
@@ -107,13 +108,15 @@ properties:
- description: Combined signal for various interrupt events
- description: The interrupt to manage the remote wake-up packet detection
- description: The interrupt that occurs when Rx exits the LPI state
+ - description: The interrupt that occurs when HW safety error triggered
interrupt-names:
minItems: 1
items:
- const: macirq
- - enum: [eth_wake_irq, eth_lpi]
- - const: eth_lpi
+ - enum: [eth_wake_irq, eth_lpi, sfty]
+ - enum: [eth_wake_irq, eth_lpi, sfty]
+ - enum: [eth_wake_irq, eth_lpi, sfty]
clocks:
minItems: 1
@@ -144,10 +147,12 @@ properties:
- description: AHB reset
reset-names:
- minItems: 1
- items:
- - const: stmmaceth
- - const: ahb
+ oneOf:
+ - items:
+ - enum: [stmmaceth, ahb]
+ - items:
+ - const: stmmaceth
+ - const: ahb
power-domains:
maxItems: 1
diff --git a/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml b/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
index 5e7cfbbebce6cc..0d1962980f57f5 100644
--- a/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
+++ b/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
@@ -16,16 +16,20 @@ select:
compatible:
contains:
enum:
+ - starfive,jh7100-dwmac
- starfive,jh7110-dwmac
required:
- compatible
properties:
compatible:
- items:
- - enum:
- - starfive,jh7110-dwmac
- - const: snps,dwmac-5.20
+ oneOf:
+ - items:
+ - const: starfive,jh7100-dwmac
+ - const: snps,dwmac
+ - items:
+ - const: starfive,jh7110-dwmac
+ - const: snps,dwmac-5.20
reg:
maxItems: 1
@@ -46,24 +50,6 @@ properties:
- const: tx
- const: gtx
- interrupts:
- minItems: 3
- maxItems: 3
-
- interrupt-names:
- minItems: 3
- maxItems: 3
-
- resets:
- items:
- - description: MAC Reset signal.
- - description: AHB Reset signal.
-
- reset-names:
- items:
- - const: stmmaceth
- - const: ahb
-
starfive,tx-use-rgmii-clk:
description:
Tx clock is provided by external rgmii clock.
@@ -94,6 +80,48 @@ required:
allOf:
- $ref: snps,dwmac.yaml#
+ - if:
+ properties:
+ compatible:
+ contains:
+ const: starfive,jh7100-dwmac
+ then:
+ properties:
+ interrupts:
+ minItems: 2
+ maxItems: 2
+
+ interrupt-names:
+ minItems: 2
+ maxItems: 2
+
+ resets:
+ maxItems: 1
+
+ reset-names:
+ const: ahb
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ const: starfive,jh7110-dwmac
+ then:
+ properties:
+ interrupts:
+ minItems: 3
+ maxItems: 3
+
+ interrupt-names:
+ minItems: 3
+ maxItems: 3
+
+ resets:
+ minItems: 2
+
+ reset-names:
+ minItems: 2
+
unevaluatedProperties: false
examples:
diff --git a/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml b/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml
index f07ae3173b03d4..d5bd93ee4dbb16 100644
--- a/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml
+++ b/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml
@@ -7,8 +7,9 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: TI SoC Ethernet Switch Controller (CPSW)
maintainers:
- - Grygorii Strashko <grygorii.strashko@ti.com>
- - Sekhar Nori <nsekhar@ti.com>
+ - Siddharth Vadapalli <s-vadapalli@ti.com>
+ - Ravi Gunasekaran <r-gunasekaran@ti.com>
+ - Roger Quadros <rogerq@kernel.org>
description:
The 3-port switch gigabit ethernet subsystem provides ethernet packet
diff --git a/Documentation/devicetree/bindings/net/ti,dp83822.yaml b/Documentation/devicetree/bindings/net/ti,dp83822.yaml
index db74474207ed47..784866ea392b20 100644
--- a/Documentation/devicetree/bindings/net/ti,dp83822.yaml
+++ b/Documentation/devicetree/bindings/net/ti,dp83822.yaml
@@ -62,6 +62,40 @@ properties:
for the PHY. The internal delay for the PHY is fixed to 3.5ns relative
to transmit data.
+ ti,cfg-dac-minus-one-bp:
+ description: |
+ DP83826 PHY only.
+ Sets the voltage ratio (with respect to the nominal value)
+ of the logical level -1 for the MLT-3 encoded TX data.
+ enum: [5000, 5625, 6250, 6875, 7500, 8125, 8750, 9375, 10000,
+ 10625, 11250, 11875, 12500, 13125, 13750, 14375, 15000]
+ default: 10000
+
+ ti,cfg-dac-plus-one-bp:
+ description: |
+ DP83826 PHY only.
+ Sets the voltage ratio (with respect to the nominal value)
+ of the logical level +1 for the MLT-3 encoded TX data.
+ enum: [5000, 5625, 6250, 6875, 7500, 8125, 8750, 9375, 10000,
+ 10625, 11250, 11875, 12500, 13125, 13750, 14375, 15000]
+ default: 10000
+
+ ti,rmii-mode:
+ description: |
+ If present, select the RMII operation mode. Two modes are
+ available:
+ - RMII master, where the PHY outputs a 50MHz reference clock which can
+ be connected to the MAC.
+ - RMII slave, where the PHY expects a 50MHz reference clock input
+ shared with the MAC.
+ The RMII operation mode can also be configured by its straps.
+ If the strap pin is not set correctly or not set at all, then this can be
+ used to configure it.
+ $ref: /schemas/types.yaml#/definitions/string
+ enum:
+ - master
+ - slave
+
required:
- reg
diff --git a/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml b/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml
index c9c25132d1544a..73ed5951d29695 100644
--- a/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml
+++ b/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml
@@ -7,8 +7,9 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: The TI AM654x/J721E/AM642x SoC Gigabit Ethernet MAC (Media Access Controller)
maintainers:
- - Grygorii Strashko <grygorii.strashko@ti.com>
- - Sekhar Nori <nsekhar@ti.com>
+ - Siddharth Vadapalli <s-vadapalli@ti.com>
+ - Ravi Gunasekaran <r-gunasekaran@ti.com>
+ - Roger Quadros <rogerq@kernel.org>
description:
The TI AM654x/J721E SoC Gigabit Ethernet MAC (CPSW2G NUSS) has two ports
diff --git a/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml b/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml
index 3e910d3b24a08b..b1c875325776d4 100644
--- a/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml
+++ b/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml
@@ -7,8 +7,9 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: The TI AM654x/J721E Common Platform Time Sync (CPTS) module
maintainers:
- - Grygorii Strashko <grygorii.strashko@ti.com>
- - Sekhar Nori <nsekhar@ti.com>
+ - Siddharth Vadapalli <s-vadapalli@ti.com>
+ - Ravi Gunasekaran <r-gunasekaran@ti.com>
+ - Roger Quadros <rogerq@kernel.org>
description: |+
The TI AM654x/J721E CPTS module is used to facilitate host control of time
diff --git a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml
index 252207adbc54c5..eabceb849537c4 100644
--- a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml
+++ b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml
@@ -19,9 +19,6 @@ description: |
Alternatively, it can specify the wireless part of the MT7628/MT7688
or MT7622/MT7986 SoC.
-allOf:
- - $ref: ieee80211.yaml#
-
properties:
compatible:
enum:
@@ -38,7 +35,12 @@ properties:
MT7986 should contain 3 regions consys, dcm, and sku, in this order.
interrupts:
- maxItems: 1
+ minItems: 1
+ items:
+ - description: major interrupt for rings
+ - description: additional interrupt for ring 19
+ - description: additional interrupt for ring 4
+ - description: additional interrupt for ring 5
power-domains:
maxItems: 1
@@ -217,6 +219,24 @@ required:
- compatible
- reg
+allOf:
+ - $ref: ieee80211.yaml#
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - mediatek,mt7981-wmac
+ - mediatek,mt7986-wmac
+ then:
+ properties:
+ interrupts:
+ minItems: 4
+ else:
+ properties:
+ interrupts:
+ maxItems: 1
+
unevaluatedProperties: false
examples:
@@ -293,7 +313,10 @@ examples:
reg = <0x18000000 0x1000000>,
<0x10003000 0x1000>,
<0x11d10000 0x1000>;
- interrupts = <GIC_SPI 213 IRQ_TYPE_LEVEL_HIGH>;
+ interrupts = <GIC_SPI 213 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 214 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 215 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 216 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&topckgen 50>,
<&topckgen 62>;
clock-names = "mcu", "ap2conn";
diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml
index 7758a55dd32866..9b3ef4bc373258 100644
--- a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml
+++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml
@@ -8,6 +8,7 @@ title: Qualcomm Technologies ath10k wireless devices
maintainers:
- Kalle Valo <kvalo@kernel.org>
+ - Jeff Johnson <jjohnson@kernel.org>
description:
Qualcomm Technologies, Inc. IEEE 802.11ac devices.
diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml
index 817f02a8b481d8..41d023797d7d3b 100644
--- a/Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml
+++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml
@@ -9,6 +9,7 @@ title: Qualcomm Technologies ath11k wireless devices (PCIe)
maintainers:
- Kalle Valo <kvalo@kernel.org>
+ - Jeff Johnson <jjohnson@kernel.org>
description: |
Qualcomm Technologies IEEE 802.11ax PCIe devices
diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml
index 7d5f982a3d09db..672282cdfc2fc4 100644
--- a/Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml
+++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml
@@ -9,6 +9,7 @@ title: Qualcomm Technologies ath11k wireless devices
maintainers:
- Kalle Valo <kvalo@kernel.org>
+ - Jeff Johnson <jjohnson@kernel.org>
description: |
These are dt entries for Qualcomm Technologies, Inc. IEEE 802.11ax
diff --git a/Documentation/netlink/genetlink-c.yaml b/Documentation/netlink/genetlink-c.yaml
index c58f7153fcf828..4dfd899a1661ce 100644
--- a/Documentation/netlink/genetlink-c.yaml
+++ b/Documentation/netlink/genetlink-c.yaml
@@ -11,7 +11,7 @@ $defs:
minimum: 0
len-or-define:
type: [ string, integer ]
- pattern: ^[0-9A-Za-z_]+( - 1)?$
+ pattern: ^[0-9A-Za-z_-]+( - 1)?$
minimum: 0
len-or-limit:
# literal int or limit based on fixed-width type e.g. u8-min, u16-max, etc.
@@ -126,8 +126,9 @@ properties:
Prefix for the C enum name of the attributes. Default family[name]-set[name]-a-
type: string
enum-name:
- description: Name for the enum type of the attribute.
- type: string
+ description: |
+ Name for the enum type of the attribute, if empty no name will be used.
+ type: [ string, "null" ]
doc:
description: Documentation of the space.
type: string
@@ -208,6 +209,11 @@ properties:
exact-len:
description: Exact length for a string or a binary attribute.
$ref: '#/$defs/len-or-define'
+ unterminated-ok:
+ description: |
+ For string attributes, do not check whether attribute
+ contains the terminating null character.
+ type: boolean
sub-type: *attr-type
display-hint: &display-hint
description: |
@@ -261,14 +267,16 @@ properties:
the prefix with the upper case name of the command, with dashes replaced by underscores.
type: string
enum-name:
- description: Name for the enum type with commands.
- type: string
+ description: |
+ Name for the enum type with commands, if empty no name will be used.
+ type: [ string, "null" ]
async-prefix:
description: Same as name-prefix but used to render notifications and events to separate enum.
type: string
async-enum:
- description: Name for the enum type with notifications/events.
- type: string
+ description: |
+ Name for the enum type with commands, if empty no name will be used.
+ type: [ string, "null" ]
list:
description: List of commands
type: array
@@ -370,3 +378,22 @@ properties:
type: string
# End genetlink-c
flags: *cmd_flags
+
+ kernel-family:
+ description: Additional global attributes used for kernel C code generation.
+ type: object
+ additionalProperties: False
+ properties:
+ headers:
+ description: |
+ List of extra headers which should be included in the source
+ of the generated code.
+ type: array
+ items:
+ type: string
+ sock-priv:
+ description: |
+ Literal name of the type which is used within the kernel
+ to store the socket state. The type / structure is internal
+ to the kernel, and is not defined in the spec.
+ type: string
diff --git a/Documentation/netlink/genetlink-legacy.yaml b/Documentation/netlink/genetlink-legacy.yaml
index 938703088306ba..b48ad3b1cc32e2 100644
--- a/Documentation/netlink/genetlink-legacy.yaml
+++ b/Documentation/netlink/genetlink-legacy.yaml
@@ -11,7 +11,7 @@ $defs:
minimum: 0
len-or-define:
type: [ string, integer ]
- pattern: ^[0-9A-Za-z_]+( - 1)?$
+ pattern: ^[0-9A-Za-z_-]+( - 1)?$
minimum: 0
len-or-limit:
# literal int or limit based on fixed-width type e.g. u8-min, u16-max, etc.
@@ -168,8 +168,9 @@ properties:
Prefix for the C enum name of the attributes. Default family[name]-set[name]-a-
type: string
enum-name:
- description: Name for the enum type of the attribute.
- type: string
+ description: |
+ Name for the enum type of the attribute, if empty no name will be used.
+ type: [ string, "null" ]
doc:
description: Documentation of the space.
type: string
@@ -251,6 +252,11 @@ properties:
exact-len:
description: Exact length for a string or a binary attribute.
$ref: '#/$defs/len-or-define'
+ unterminated-ok:
+ description: |
+ For string attributes, do not check whether attribute
+ contains the terminating null character.
+ type: boolean
sub-type: *attr-type
display-hint: *display-hint
# Start genetlink-c
@@ -304,14 +310,16 @@ properties:
the prefix with the upper case name of the command, with dashes replaced by underscores.
type: string
enum-name:
- description: Name for the enum type with commands.
- type: string
+ description: |
+ Name for the enum type with commands, if empty no name will be used.
+ type: [ string, "null" ]
async-prefix:
description: Same as name-prefix but used to render notifications and events to separate enum.
type: string
async-enum:
- description: Name for the enum type with notifications/events.
- type: string
+ description: |
+ Name for the enum type with commands, if empty no name will be used.
+ type: [ string, "null" ]
# Start genetlink-legacy
fixed-header: &fixed-header
description: |
@@ -431,3 +439,22 @@ properties:
type: string
# End genetlink-c
flags: *cmd_flags
+
+ kernel-family:
+ description: Additional global attributes used for kernel C code generation.
+ type: object
+ additionalProperties: False
+ properties:
+ headers:
+ description: |
+ List of extra headers which should be included in the source
+ of the generated code.
+ type: array
+ items:
+ type: string
+ sock-priv:
+ description: |
+ Literal name of the type which is used within the kernel
+ to store the socket state. The type / structure is internal
+ to the kernel, and is not defined in the spec.
+ type: string
diff --git a/Documentation/netlink/genetlink.yaml b/Documentation/netlink/genetlink.yaml
index 3283bf458ff13d..ebd6ee743fccc2 100644
--- a/Documentation/netlink/genetlink.yaml
+++ b/Documentation/netlink/genetlink.yaml
@@ -11,7 +11,7 @@ $defs:
minimum: 0
len-or-define:
type: [ string, integer ]
- pattern: ^[0-9A-Za-z_]+( - 1)?$
+ pattern: ^[0-9A-Za-z_-]+( - 1)?$
minimum: 0
len-or-limit:
# literal int or limit based on fixed-width type e.g. u8-min, u16-max, etc.
@@ -328,3 +328,22 @@ properties:
The name for the group, used to form the define and the value of the define.
type: string
flags: *cmd_flags
+
+ kernel-family:
+ description: Additional global attributes used for kernel C code generation.
+ type: object
+ additionalProperties: False
+ properties:
+ headers:
+ description: |
+ List of extra headers which should be included in the source
+ of the generated code.
+ type: array
+ items:
+ type: string
+ sock-priv:
+ description: |
+ Literal name of the type which is used within the kernel
+ to store the socket state. The type / structure is internal
+ to the kernel, and is not defined in the spec.
+ type: string
diff --git a/Documentation/netlink/netlink-raw.yaml b/Documentation/netlink/netlink-raw.yaml
index 04b92f1a5cd6ef..a76e54cbadbc44 100644
--- a/Documentation/netlink/netlink-raw.yaml
+++ b/Documentation/netlink/netlink-raw.yaml
@@ -11,7 +11,7 @@ $defs:
minimum: 0
len-or-define:
type: [ string, integer ]
- pattern: ^[0-9A-Za-z_]+( - 1)?$
+ pattern: ^[0-9A-Za-z_-]+( - 1)?$
minimum: 0
# Schema for specs
@@ -152,14 +152,23 @@ properties:
the right formatting mechanism when displaying values of this
type.
enum: [ hex, mac, fddi, ipv4, ipv6, uuid ]
+ struct:
+ description: Name of the nested struct type.
+ type: string
if:
properties:
type:
- oneOf:
- - const: binary
- - const: pad
+ const: pad
then:
required: [ len ]
+ if:
+ properties:
+ type:
+ const: binary
+ then:
+ oneOf:
+ - required: [ len ]
+ - required: [ struct ]
# End genetlink-legacy
attribute-sets:
@@ -180,8 +189,9 @@ properties:
Prefix for the C enum name of the attributes. Default family[name]-set[name]-a-
type: string
enum-name:
- description: Name for the enum type of the attribute.
- type: string
+ description: |
+ Name for the enum type of the attribute, if empty no name will be used.
+ type: [ string, "null" ]
doc:
description: Documentation of the space.
type: string
@@ -261,6 +271,11 @@ properties:
exact-len:
description: Exact length for a string or a binary attribute.
$ref: '#/$defs/len-or-define'
+ unterminated-ok:
+ description: |
+ For string attributes, do not check whether attribute
+ contains the terminating null character.
+ type: boolean
sub-type: *attr-type
display-hint: *display-hint
# Start genetlink-c
@@ -362,14 +377,16 @@ properties:
the prefix with the upper case name of the command, with dashes replaced by underscores.
type: string
enum-name:
- description: Name for the enum type with commands.
- type: string
+ description: |
+ Name for the enum type with commands, if empty no name will be used.
+ type: [ string, "null" ]
async-prefix:
description: Same as name-prefix but used to render notifications and events to separate enum.
type: string
async-enum:
- description: Name for the enum type with notifications/events.
- type: string
+ description: |
+ Name for the enum type with commands, if empty no name will be used.
+ type: [ string, "null" ]
# Start genetlink-legacy
fixed-header: &fixed-header
description: |
diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
index cf6eaa0da821b3..09fbb4c03fc8d8 100644
--- a/Documentation/netlink/specs/devlink.yaml
+++ b/Documentation/netlink/specs/devlink.yaml
@@ -290,7 +290,7 @@ attribute-sets:
enum: eswitch-mode
-
name: eswitch-inline-mode
- type: u16
+ type: u8
enum: eswitch-inline-mode
-
name: dpipe-tables
diff --git a/Documentation/netlink/specs/dpll.yaml b/Documentation/netlink/specs/dpll.yaml
index 3dcc9ece272aad..95b0eb1486bf7e 100644
--- a/Documentation/netlink/specs/dpll.yaml
+++ b/Documentation/netlink/specs/dpll.yaml
@@ -52,6 +52,40 @@ definitions:
dpll's lock-state shall remain DPLL_LOCK_STATUS_UNLOCKED)
render-max: true
-
+ type: enum
+ name: lock-status-error
+ doc: |
+ if previous status change was done due to a failure, this provides
+ information of dpll device lock status error.
+ Valid values for DPLL_A_LOCK_STATUS_ERROR attribute
+ entries:
+ -
+ name: none
+ doc: |
+ dpll device lock status was changed without any error
+ value: 1
+ -
+ name: undefined
+ doc: |
+ dpll device lock status was changed due to undefined error.
+ Driver fills this value up in case it is not able
+ to obtain suitable exact error type.
+ -
+ name: media-down
+ doc: |
+ dpll device lock status was changed because of associated
+ media got down.
+ This may happen for example if dpll device was previously
+ locked on an input pin of type PIN_TYPE_SYNCE_ETH_PORT.
+ -
+ name: fractional-frequency-offset-too-high
+ doc: |
+ the FFO (Fractional Frequency Offset) between the RX and TX
+ symbol rate on the media got too high.
+ This may happen for example if dpll device was previously
+ locked on an input pin of type PIN_TYPE_SYNCE_ETH_PORT.
+ render-max: true
+ -
type: const
name: temp-divider
value: 1000
@@ -214,6 +248,10 @@ attribute-sets:
name: type
type: u32
enum: type
+ -
+ name: lock-status-error
+ type: u32
+ enum: lock-status-error
-
name: pin
enum-name: dpll_a_pin
@@ -274,6 +312,7 @@ attribute-sets:
-
name: capabilities
type: u32
+ enum: pin-capabilities
-
name: parent-device
type: nest
@@ -379,6 +418,7 @@ operations:
- mode
- mode-supported
- lock-status
+ - lock-status-error
- temp
- clock-id
- type
diff --git a/Documentation/netlink/specs/mptcp_pm.yaml b/Documentation/netlink/specs/mptcp_pm.yaml
index 49f90cfb469894..af525ed2979234 100644
--- a/Documentation/netlink/specs/mptcp_pm.yaml
+++ b/Documentation/netlink/specs/mptcp_pm.yaml
@@ -292,13 +292,14 @@ operations:
-
name: get-addr
doc: Get endpoint information
- attribute-set: endpoint
+ attribute-set: attr
dont-validate: [ strict ]
flags: [ uns-admin-perm ]
do: &get-addr-attrs
request:
attributes:
- addr
+ - token
reply:
attributes:
- addr
diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index 3addac97068048..76352dbd2be4eb 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -74,6 +74,10 @@ definitions:
name: queue-type
type: enum
entries: [ rx, tx ]
+ -
+ name: qstats-scope
+ type: flags
+ entries: [ queue ]
attribute-sets:
-
@@ -265,6 +269,73 @@ attribute-sets:
doc: ID of the NAPI instance which services this queue.
type: u32
+ -
+ name: qstats
+ doc: |
+ Get device statistics, scoped to a device or a queue.
+ These statistics extend (and partially duplicate) statistics available
+ in struct rtnl_link_stats64.
+ Value of the `scope` attribute determines how statistics are
+ aggregated. When aggregated for the entire device the statistics
+ represent the total number of events since last explicit reset of
+ the device (i.e. not a reconfiguration like changing queue count).
+ When reported per-queue, however, the statistics may not add
+ up to the total number of events, will only be reported for currently
+ active objects, and will likely report the number of events since last
+ reconfiguration.
+ attributes:
+ -
+ name: ifindex
+ doc: ifindex of the netdevice to which stats belong.
+ type: u32
+ checks:
+ min: 1
+ -
+ name: queue-type
+ doc: Queue type as rx, tx, for queue-id.
+ type: u32
+ enum: queue-type
+ -
+ name: queue-id
+ doc: Queue ID, if stats are scoped to a single queue instance.
+ type: u32
+ -
+ name: scope
+ doc: |
+ What object type should be used to iterate over the stats.
+ type: uint
+ enum: qstats-scope
+ -
+ name: rx-packets
+ doc: |
+ Number of wire packets successfully received and passed to the stack.
+ For drivers supporting XDP, XDP is considered the first layer
+ of the stack, so packets consumed by XDP are still counted here.
+ type: uint
+ value: 8 # reserve some attr ids in case we need more metadata later
+ -
+ name: rx-bytes
+ doc: Successfully received bytes, see `rx-packets`.
+ type: uint
+ -
+ name: tx-packets
+ doc: |
+ Number of wire packets successfully sent. Packet is considered to be
+ successfully sent once it is in device memory (usually this means
+ the device has issued a DMA completion for the packet).
+ type: uint
+ -
+ name: tx-bytes
+ doc: Successfully sent bytes, see `tx-packets`.
+ type: uint
+ -
+ name: rx-alloc-fail
+ doc: |
+ Number of times skb or buffer allocation failed on the Rx datapath.
+ Allocation failure may, or may not result in a packet drop, depending
+ on driver implementation and whether system recovers quickly.
+ type: uint
+
operations:
list:
-
@@ -405,6 +476,26 @@ operations:
attributes:
- ifindex
reply: *napi-get-op
+ -
+ name: qstats-get
+ doc: |
+ Get / dump fine grained statistics. Which statistics are reported
+ depends on the device and the driver, and whether the driver stores
+ software counters per-queue.
+ attribute-set: qstats
+ dump:
+ request:
+ attributes:
+ - scope
+ reply:
+ attributes:
+ - ifindex
+ - queue-type
+ - queue-id
+ - rx-packets
+ - rx-bytes
+ - tx-packets
+ - tx-bytes
mcast-groups:
list:
diff --git a/Documentation/netlink/specs/nlctrl.yaml b/Documentation/netlink/specs/nlctrl.yaml
new file mode 100644
index 00000000000000..b1632b95f725fe
--- /dev/null
+++ b/Documentation/netlink/specs/nlctrl.yaml
@@ -0,0 +1,206 @@
+# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
+
+name: nlctrl
+protocol: genetlink-legacy
+uapi-header: linux/genetlink.h
+
+doc: |
+ genetlink meta-family that exposes information about all genetlink
+ families registered in the kernel (including itself).
+
+definitions:
+ -
+ name: op-flags
+ type: flags
+ enum-name:
+ entries:
+ - admin-perm
+ - cmd-cap-do
+ - cmd-cap-dump
+ - cmd-cap-haspol
+ - uns-admin-perm
+ -
+ name: attr-type
+ enum-name: netlink-attribute-type
+ type: enum
+ entries:
+ - invalid
+ - flag
+ - u8
+ - u16
+ - u32
+ - u64
+ - s8
+ - s16
+ - s32
+ - s64
+ - binary
+ - string
+ - nul-string
+ - nested
+ - nested-array
+ - bitfield32
+ - sint
+ - uint
+
+attribute-sets:
+ -
+ name: ctrl-attrs
+ name-prefix: ctrl-attr-
+ attributes:
+ -
+ name: family-id
+ type: u16
+ -
+ name: family-name
+ type: string
+ -
+ name: version
+ type: u32
+ -
+ name: hdrsize
+ type: u32
+ -
+ name: maxattr
+ type: u32
+ -
+ name: ops
+ type: array-nest
+ nested-attributes: op-attrs
+ -
+ name: mcast-groups
+ type: array-nest
+ nested-attributes: mcast-group-attrs
+ -
+ name: policy
+ type: nest-type-value
+ type-value: [ policy-id, attr-id ]
+ nested-attributes: policy-attrs
+ -
+ name: op-policy
+ type: nest-type-value
+ type-value: [ op-id ]
+ nested-attributes: op-policy-attrs
+ -
+ name: op
+ type: u32
+ -
+ name: mcast-group-attrs
+ name-prefix: ctrl-attr-mcast-grp-
+ enum-name:
+ attributes:
+ -
+ name: name
+ type: string
+ -
+ name: id
+ type: u32
+ -
+ name: op-attrs
+ name-prefix: ctrl-attr-op-
+ enum-name:
+ attributes:
+ -
+ name: id
+ type: u32
+ -
+ name: flags
+ type: u32
+ enum: op-flags
+ enum-as-flags: true
+ -
+ name: policy-attrs
+ name-prefix: nl-policy-type-attr-
+ enum-name:
+ attributes:
+ -
+ name: type
+ type: u32
+ enum: attr-type
+ -
+ name: min-value-s
+ type: s64
+ -
+ name: max-value-s
+ type: s64
+ -
+ name: min-value-u
+ type: u64
+ -
+ name: max-value-u
+ type: u64
+ -
+ name: min-length
+ type: u32
+ -
+ name: max-length
+ type: u32
+ -
+ name: policy-idx
+ type: u32
+ -
+ name: policy-maxtype
+ type: u32
+ -
+ name: bitfield32-mask
+ type: u32
+ -
+ name: mask
+ type: u64
+ -
+ name: pad
+ type: pad
+ -
+ name: op-policy-attrs
+ name-prefix: ctrl-attr-policy-
+ enum-name:
+ attributes:
+ -
+ name: do
+ type: u32
+ -
+ name: dump
+ type: u32
+
+operations:
+ enum-model: directional
+ name-prefix: ctrl-cmd-
+ list:
+ -
+ name: getfamily
+ doc: Get / dump genetlink families
+ attribute-set: ctrl-attrs
+ do:
+ request:
+ value: 3
+ attributes:
+ - family-name
+ reply: &all-attrs
+ value: 1
+ attributes:
+ - family-id
+ - family-name
+ - hdrsize
+ - maxattr
+ - mcast-groups
+ - ops
+ - version
+ dump:
+ reply: *all-attrs
+ -
+ name: getpolicy
+ doc: Get / dump genetlink policies
+ attribute-set: ctrl-attrs
+ dump:
+ request:
+ value: 10
+ attributes:
+ - family-name
+ - family-id
+ - op
+ reply:
+ value: 10
+ attributes:
+ - family-id
+ - op-policy
+ - policy
diff --git a/Documentation/netlink/specs/tc.yaml b/Documentation/netlink/specs/tc.yaml
index 4346fa402fc91d..324fa182cd1491 100644
--- a/Documentation/netlink/specs/tc.yaml
+++ b/Documentation/netlink/specs/tc.yaml
@@ -48,21 +48,28 @@ definitions:
-
name: bytes
type: u64
+ doc: Number of enqueued bytes
-
name: packets
type: u32
+ doc: Number of enqueued packets
-
name: drops
type: u32
+ doc: Packets dropped because of lack of resources
-
name: overlimits
type: u32
+ doc: |
+ Number of throttle events when this flow goes out of allocated bandwidth
-
name: bps
type: u32
+ doc: Current flow byte rate
-
name: pps
type: u32
+ doc: Current flow packet rate
-
name: qlen
type: u32
@@ -112,6 +119,7 @@ definitions:
-
name: limit
type: u32
+ doc: Queue length; bytes for bfifo, packets for pfifo
-
name: tc-htb-opt
type: struct
@@ -119,11 +127,11 @@ definitions:
-
name: rate
type: binary
- len: 12
+ struct: tc-ratespec
-
name: ceil
type: binary
- len: 12
+ struct: tc-ratespec
-
name: buffer
type: u32
@@ -149,15 +157,19 @@ definitions:
-
name: rate2quantum
type: u32
+ doc: bps->quantum divisor
-
name: defcls
type: u32
+ doc: Default class number
-
name: debug
type: u32
+ doc: Debug flags
-
name: direct-pkts
type: u32
+ doc: Count of non shaped packets
-
name: tc-gred-qopt
type: struct
@@ -165,15 +177,19 @@ definitions:
-
name: limit
type: u32
+ doc: HARD maximal queue length in bytes
-
name: qth-min
type: u32
+ doc: Min average length threshold in bytes
-
name: qth-max
type: u32
+ doc: Max average length threshold in bytes
-
name: DP
type: u32
+ doc: Up to 2^32 DPs
-
name: backlog
type: u32
@@ -195,15 +211,19 @@ definitions:
-
name: Wlog
type: u8
+ doc: log(W)
-
name: Plog
type: u8
+ doc: log(P_max / (qth-max - qth-min))
-
name: Scell_log
type: u8
+ doc: cell size for idle damping
-
name: prio
type: u8
+ doc: Priority of this VQ
-
name: packets
type: u32
@@ -266,9 +286,11 @@ definitions:
-
name: bands
type: u16
+ doc: Number of bands
-
name: max-bands
type: u16
+ doc: Maximum number of queues
-
name: tc-netem-qopt
type: struct
@@ -276,21 +298,138 @@ definitions:
-
name: latency
type: u32
+ doc: Added delay in microseconds
-
name: limit
type: u32
+ doc: Fifo limit in packets
-
name: loss
type: u32
+ doc: Random packet loss (0=none, ~0=100%)
-
name: gap
type: u32
+ doc: Re-ordering gap (0 for none)
-
name: duplicate
type: u32
+ doc: Random packet duplication (0=none, ~0=100%)
-
name: jitter
type: u32
+ doc: Random jitter latency in microseconds
+ -
+ name: tc-netem-gimodel
+ doc: State transition probabilities for 4 state model
+ type: struct
+ members:
+ -
+ name: p13
+ type: u32
+ -
+ name: p31
+ type: u32
+ -
+ name: p32
+ type: u32
+ -
+ name: p14
+ type: u32
+ -
+ name: p23
+ type: u32
+ -
+ name: tc-netem-gemodel
+ doc: Gilbert-Elliot models
+ type: struct
+ members:
+ -
+ name: p
+ type: u32
+ -
+ name: r
+ type: u32
+ -
+ name: h
+ type: u32
+ -
+ name: k1
+ type: u32
+ -
+ name: tc-netem-corr
+ type: struct
+ members:
+ -
+ name: delay-corr
+ type: u32
+ doc: Delay correlation
+ -
+ name: loss-corr
+ type: u32
+ doc: Packet loss correlation
+ -
+ name: dup-corr
+ type: u32
+ doc: Duplicate correlation
+ -
+ name: tc-netem-reorder
+ type: struct
+ members:
+ -
+ name: probability
+ type: u32
+ -
+ name: correlation
+ type: u32
+ -
+ name: tc-netem-corrupt
+ type: struct
+ members:
+ -
+ name: probability
+ type: u32
+ -
+ name: correlation
+ type: u32
+ -
+ name: tc-netem-rate
+ type: struct
+ members:
+ -
+ name: rate
+ type: u32
+ -
+ name: packet-overhead
+ type: s32
+ -
+ name: cell-size
+ type: u32
+ -
+ name: cell-overhead
+ type: s32
+ -
+ name: tc-netem-slot
+ type: struct
+ members:
+ -
+ name: min-delay
+ type: s64
+ -
+ name: max-delay
+ type: s64
+ -
+ name: max-packets
+ type: s32
+ -
+ name: max-bytes
+ type: s32
+ -
+ name: dist-delay
+ type: s64
+ -
+ name: dist-jitter
+ type: s64
-
name: tc-plug-qopt
type: struct
@@ -307,11 +446,13 @@ definitions:
members:
-
name: bands
- type: u16
+ type: u32
+ doc: Number of bands
-
name: priomap
type: binary
len: 16
+ doc: Map of logical priority -> PRIO band
-
name: tc-red-qopt
type: struct
@@ -319,21 +460,27 @@ definitions:
-
name: limit
type: u32
+ doc: Hard queue length in packets
-
name: qth-min
type: u32
+ doc: Min average threshold in packets
-
name: qth-max
type: u32
+ doc: Max average threshold in packets
-
name: Wlog
type: u8
+ doc: log(W)
-
name: Plog
type: u8
+ doc: log(P_max / (qth-max - qth-min))
-
name: Scell-log
type: u8
+ doc: Cell size for idle damping
-
name: flags
type: u8
@@ -369,71 +516,128 @@ definitions:
name: penalty-burst
type: u32
-
- name: tc-sfq-qopt-v1 # TODO nested structs
+ name: tc-sfq-qopt
type: struct
members:
-
name: quantum
type: u32
+ doc: Bytes per round allocated to flow
-
name: perturb-period
type: s32
+ doc: Period of hash perturbation
-
name: limit
type: u32
+ doc: Maximal packets in queue
-
name: divisor
type: u32
+ doc: Hash divisor
-
name: flows
type: u32
+ doc: Maximal number of flows
+ -
+ name: tc-sfqred-stats
+ type: struct
+ members:
+ -
+ name: prob-drop
+ type: u32
+ doc: Early drops, below max threshold
+ -
+ name: forced-drop
+ type: u32
+ doc: Early drops, after max threshold
+ -
+ name: prob-mark
+ type: u32
+ doc: Marked packets, below max threshold
+ -
+ name: forced-mark
+ type: u32
+ doc: Marked packets, after max threshold
+ -
+ name: prob-mark-head
+ type: u32
+ doc: Marked packets, below max threshold
+ -
+ name: forced-mark-head
+ type: u32
+ doc: Marked packets, after max threshold
+ -
+ name: tc-sfq-qopt-v1
+ type: struct
+ members:
+ -
+ name: v0
+ type: binary
+ struct: tc-sfq-qopt
-
name: depth
type: u32
+ doc: Maximum number of packets per flow
-
name: headdrop
type: u32
-
name: limit
type: u32
+ doc: HARD maximal flow queue length in bytes
-
name: qth-min
type: u32
+ doc: Min average length threshold in bytes
-
- name: qth-mac
+ name: qth-max
type: u32
+ doc: Max average length threshold in bytes
-
name: Wlog
type: u8
+ doc: log(W)
-
name: Plog
type: u8
+ doc: log(P_max / (qth-max - qth-min))
-
name: Scell-log
type: u8
+ doc: Cell size for idle damping
-
name: flags
type: u8
-
name: max-P
type: u32
+ doc: probabilty, high resolution
-
- name: prob-drop
- type: u32
+ name: stats
+ type: binary
+ struct: tc-sfqred-stats
+ -
+ name: tc-ratespec
+ type: struct
+ members:
-
- name: forced-drop
- type: u32
+ name: cell-log
+ type: u8
-
- name: prob-mark
- type: u32
+ name: linklayer
+ type: u8
-
- name: forced-mark
- type: u32
+ name: overhead
+ type: u8
-
- name: prob-mark-head
- type: u32
+ name: cell-align
+ type: u8
-
- name: forced-mark-head
+ name: mpu
+ type: u8
+ -
+ name: rate
type: u32
-
name: tc-tbf-qopt
@@ -441,12 +645,12 @@ definitions:
members:
-
name: rate
- type: binary # TODO nested struct tc_ratespec
- len: 12
+ type: binary
+ struct: tc-ratespec
-
name: peakrate
- type: binary # TODO nested struct tc_ratespec
- len: 12
+ type: binary
+ struct: tc-ratespec
-
name: limit
type: u32
@@ -491,9 +695,663 @@ definitions:
-
name: interval
type: s8
+ doc: Sampling period
-
name: ewma-log
type: u8
+ doc: The log() of measurement window weight
+ -
+ name: tc-choke-xstats
+ type: struct
+ members:
+ -
+ name: early
+ type: u32
+ doc: Early drops
+ -
+ name: pdrop
+ type: u32
+ doc: Drops due to queue limits
+ -
+ name: other
+ type: u32
+ doc: Drops due to drop() calls
+ -
+ name: marked
+ type: u32
+ doc: Marked packets
+ -
+ name: matched
+ type: u32
+ doc: Drops due to flow match
+ -
+ name: tc-codel-xstats
+ type: struct
+ members:
+ -
+ name: maxpacket
+ type: u32
+ doc: Largest packet we've seen so far
+ -
+ name: count
+ type: u32
+ doc: How many drops we've done since the last time we entered dropping state
+ -
+ name: lastcount
+ type: u32
+ doc: Count at entry to dropping state
+ -
+ name: ldelay
+ type: u32
+ doc: in-queue delay seen by most recently dequeued packet
+ -
+ name: drop-next
+ type: s32
+ doc: Time to drop next packet
+ -
+ name: drop-overlimit
+ type: u32
+ doc: Number of times max qdisc packet limit was hit
+ -
+ name: ecn-mark
+ type: u32
+ doc: Number of packets we've ECN marked instead of dropped
+ -
+ name: dropping
+ type: u32
+ doc: Are we in a dropping state?
+ -
+ name: ce-mark
+ type: u32
+ doc: Number of CE marked packets because of ce-threshold
+ -
+ name: tc-fq-codel-xstats
+ type: struct
+ members:
+ -
+ name: type
+ type: u32
+ -
+ name: maxpacket
+ type: u32
+ doc: Largest packet we've seen so far
+ -
+ name: drop-overlimit
+ type: u32
+ doc: Number of times max qdisc packet limit was hit
+ -
+ name: ecn-mark
+ type: u32
+ doc: Number of packets we ECN marked instead of being dropped
+ -
+ name: new-flow-count
+ type: u32
+ doc: Number of times packets created a new flow
+ -
+ name: new-flows-len
+ type: u32
+ doc: Count of flows in new list
+ -
+ name: old-flows-len
+ type: u32
+ doc: Count of flows in old list
+ -
+ name: ce-mark
+ type: u32
+ doc: Packets above ce-threshold
+ -
+ name: memory-usage
+ type: u32
+ doc: Memory usage in bytes
+ -
+ name: drop-overmemory
+ type: u32
+ -
+ name: tc-fq-pie-xstats
+ type: struct
+ members:
+ -
+ name: packets-in
+ type: u32
+ doc: Total number of packets enqueued
+ -
+ name: dropped
+ type: u32
+ doc: Packets dropped due to fq_pie_action
+ -
+ name: overlimit
+ type: u32
+ doc: Dropped due to lack of space in queue
+ -
+ name: overmemory
+ type: u32
+ doc: Dropped due to lack of memory in queue
+ -
+ name: ecn-mark
+ type: u32
+ doc: Packets marked with ecn
+ -
+ name: new-flow-count
+ type: u32
+ doc: Count of new flows created by packets
+ -
+ name: new-flows-len
+ type: u32
+ doc: Count of flows in new list
+ -
+ name: old-flows-len
+ type: u32
+ doc: Count of flows in old list
+ -
+ name: memory-usage
+ type: u32
+ doc: Total memory across all queues
+ -
+ name: tc-fq-qd-stats
+ type: struct
+ members:
+ -
+ name: gc-flows
+ type: u64
+ -
+ name: highprio-packets
+ type: u64
+ doc: obsolete
+ -
+ name: tcp-retrans
+ type: u64
+ doc: obsolete
+ -
+ name: throttled
+ type: u64
+ -
+ name: flows-plimit
+ type: u64
+ -
+ name: pkts-too-long
+ type: u64
+ -
+ name: allocation-errors
+ type: u64
+ -
+ name: time-next-delayed-flow
+ type: s64
+ -
+ name: flows
+ type: u32
+ -
+ name: inactive-flows
+ type: u32
+ -
+ name: throttled-flows
+ type: u32
+ -
+ name: unthrottle-latency-ns
+ type: u32
+ -
+ name: ce-mark
+ type: u64
+ doc: Packets above ce-threshold
+ -
+ name: horizon-drops
+ type: u64
+ -
+ name: horizon-caps
+ type: u64
+ -
+ name: fastpath-packets
+ type: u64
+ -
+ name: band-drops
+ type: binary
+ len: 24
+ -
+ name: band-pkt-count
+ type: binary
+ len: 12
+ -
+ name: pad
+ type: pad
+ len: 4
+ -
+ name: tc-hhf-xstats
+ type: struct
+ members:
+ -
+ name: drop-overlimit
+ type: u32
+ doc: Number of times max qdisc packet limit was hit
+ -
+ name: hh-overlimit
+ type: u32
+ doc: Number of times max heavy-hitters was hit
+ -
+ name: hh-tot-count
+ type: u32
+ doc: Number of captured heavy-hitters so far
+ -
+ name: hh-cur-count
+ type: u32
+ doc: Number of current heavy-hitters
+ -
+ name: tc-pie-xstats
+ type: struct
+ members:
+ -
+ name: prob
+ type: u64
+ doc: Current probability
+ -
+ name: delay
+ type: u32
+ doc: Current delay in ms
+ -
+ name: avg-dq-rate
+ type: u32
+ doc: Current average dq rate in bits/pie-time
+ -
+ name: dq-rate-estimating
+ type: u32
+ doc: Is avg-dq-rate being calculated?
+ -
+ name: packets-in
+ type: u32
+ doc: Total number of packets enqueued
+ -
+ name: dropped
+ type: u32
+ doc: Packets dropped due to pie action
+ -
+ name: overlimit
+ type: u32
+ doc: Dropped due to lack of space in queue
+ -
+ name: maxq
+ type: u32
+ doc: Maximum queue size
+ -
+ name: ecn-mark
+ type: u32
+ doc: Packets marked with ecn
+ -
+ name: tc-red-xstats
+ type: struct
+ members:
+ -
+ name: early
+ type: u32
+ doc: Early drops
+ -
+ name: pdrop
+ type: u32
+ doc: Drops due to queue limits
+ -
+ name: other
+ type: u32
+ doc: Drops due to drop() calls
+ -
+ name: marked
+ type: u32
+ doc: Marked packets
+ -
+ name: tc-sfb-xstats
+ type: struct
+ members:
+ -
+ name: earlydrop
+ type: u32
+ -
+ name: penaltydrop
+ type: u32
+ -
+ name: bucketdrop
+ type: u32
+ -
+ name: queuedrop
+ type: u32
+ -
+ name: childdrop
+ type: u32
+ doc: drops in child qdisc
+ -
+ name: marked
+ type: u32
+ -
+ name: maxqlen
+ type: u32
+ -
+ name: maxprob
+ type: u32
+ -
+ name: avgprob
+ type: u32
+ -
+ name: tc-sfq-xstats
+ type: struct
+ members:
+ -
+ name: allot
+ type: s32
+ -
+ name: gnet-stats-basic
+ type: struct
+ members:
+ -
+ name: bytes
+ type: u64
+ -
+ name: packets
+ type: u32
+ -
+ name: gnet-stats-rate-est
+ type: struct
+ members:
+ -
+ name: bps
+ type: u32
+ -
+ name: pps
+ type: u32
+ -
+ name: gnet-stats-rate-est64
+ type: struct
+ members:
+ -
+ name: bps
+ type: u64
+ -
+ name: pps
+ type: u64
+ -
+ name: gnet-stats-queue
+ type: struct
+ members:
+ -
+ name: qlen
+ type: u32
+ -
+ name: backlog
+ type: u32
+ -
+ name: drops
+ type: u32
+ -
+ name: requeues
+ type: u32
+ -
+ name: overlimits
+ type: u32
+ -
+ name: tc-u32-key
+ type: struct
+ members:
+ -
+ name: mask
+ type: u32
+ byte-order: big-endian
+ -
+ name: val
+ type: u32
+ byte-order: big-endian
+ -
+ name: "off"
+ type: s32
+ -
+ name: offmask
+ type: s32
+ -
+ name: tc-u32-sel
+ type: struct
+ members:
+ -
+ name: flags
+ type: u8
+ -
+ name: offshift
+ type: u8
+ -
+ name: nkeys
+ type: u8
+ -
+ name: offmask
+ type: u16
+ byte-order: big-endian
+ -
+ name: "off"
+ type: u16
+ -
+ name: offoff
+ type: s16
+ -
+ name: hoff
+ type: s16
+ -
+ name: hmask
+ type: u32
+ byte-order: big-endian
+ -
+ name: keys
+ type: binary
+ struct: tc-u32-key # TODO: array
+ -
+ name: tc-u32-pcnt
+ type: struct
+ members:
+ -
+ name: rcnt
+ type: u64
+ -
+ name: rhit
+ type: u64
+ -
+ name: kcnts
+ type: u64 # TODO: array
+ -
+ name: tcf-t
+ type: struct
+ members:
+ -
+ name: install
+ type: u64
+ -
+ name: lastuse
+ type: u64
+ -
+ name: expires
+ type: u64
+ -
+ name: firstuse
+ type: u64
+ -
+ name: tc-gen
+ type: struct
+ members:
+ -
+ name: index
+ type: u32
+ -
+ name: capab
+ type: u32
+ -
+ name: action
+ type: s32
+ -
+ name: refcnt
+ type: s32
+ -
+ name: bindcnt
+ type: s32
+ -
+ name: tc-gact-p
+ type: struct
+ members:
+ -
+ name: ptype
+ type: u16
+ -
+ name: pval
+ type: u16
+ -
+ name: paction
+ type: s32
+ -
+ name: tcf-ematch-tree-hdr
+ type: struct
+ members:
+ -
+ name: nmatches
+ type: u16
+ -
+ name: progid
+ type: u16
+ -
+ name: tc-basic-pcnt
+ type: struct
+ members:
+ -
+ name: rcnt
+ type: u64
+ -
+ name: rhit
+ type: u64
+ -
+ name: tc-matchall-pcnt
+ type: struct
+ members:
+ -
+ name: rhit
+ type: u64
+ -
+ name: tc-mpls
+ type: struct
+ members:
+ -
+ name: index
+ type: u32
+ -
+ name: capab
+ type: u32
+ -
+ name: action
+ type: s32
+ -
+ name: refcnt
+ type: s32
+ -
+ name: bindcnt
+ type: s32
+ -
+ name: m-action
+ type: s32
+ -
+ name: tc-police
+ type: struct
+ members:
+ -
+ name: index
+ type: u32
+ -
+ name: action
+ type: s32
+ -
+ name: limit
+ type: u32
+ -
+ name: burst
+ type: u32
+ -
+ name: mtu
+ type: u32
+ -
+ name: rate
+ type: binary
+ struct: tc-ratespec
+ -
+ name: peakrate
+ type: binary
+ struct: tc-ratespec
+ -
+ name: refcnt
+ type: s32
+ -
+ name: bindcnt
+ type: s32
+ -
+ name: capab
+ type: u32
+ -
+ name: tc-pedit-sel
+ type: struct
+ members:
+ -
+ name: index
+ type: u32
+ -
+ name: capab
+ type: u32
+ -
+ name: action
+ type: s32
+ -
+ name: refcnt
+ type: s32
+ -
+ name: bindcnt
+ type: s32
+ -
+ name: nkeys
+ type: u8
+ -
+ name: flags
+ type: u8
+ -
+ name: keys
+ type: binary
+ struct: tc-pedit-key # TODO: array
+ -
+ name: tc-pedit-key
+ type: struct
+ members:
+ -
+ name: mask
+ type: u32
+ -
+ name: val
+ type: u32
+ -
+ name: "off"
+ type: u32
+ -
+ name: at
+ type: u32
+ -
+ name: offmask
+ type: u32
+ -
+ name: shift
+ type: u32
+ -
+ name: tc-vlan
+ type: struct
+ members:
+ -
+ name: index
+ type: u32
+ -
+ name: capab
+ type: u32
+ -
+ name: action
+ type: s32
+ -
+ name: refcnt
+ type: s32
+ -
+ name: bindcnt
+ type: s32
+ -
+ name: v-action
+ type: s32
attribute-sets:
-
name: tc-attrs
@@ -512,7 +1370,9 @@ attribute-sets:
struct: tc-stats
-
name: xstats
- type: binary
+ type: sub-message
+ sub-message: tca-stats-app-msg
+ selector: kind
-
name: rate
type: binary
@@ -553,6 +1413,582 @@ attribute-sets:
name: ext-warn-msg
type: string
-
+ name: tc-act-attrs
+ attributes:
+ -
+ name: kind
+ type: string
+ -
+ name: options
+ type: sub-message
+ sub-message: tc-act-options-msg
+ selector: kind
+ -
+ name: index
+ type: u32
+ -
+ name: stats
+ type: nest
+ nested-attributes: tc-act-stats-attrs
+ -
+ name: pad
+ type: pad
+ -
+ name: cookie
+ type: binary
+ -
+ name: flags
+ type: bitfield32
+ -
+ name: hw-stats
+ type: bitfield32
+ -
+ name: used-hw-stats
+ type: bitfield32
+ -
+ name: in-hw-count
+ type: u32
+ -
+ name: tc-act-stats-attrs
+ attributes:
+ -
+ name: basic
+ type: binary
+ struct: gnet-stats-basic
+ -
+ name: rate-est
+ type: binary
+ struct: gnet-stats-rate-est
+ -
+ name: queue
+ type: binary
+ struct: gnet-stats-queue
+ -
+ name: app
+ type: binary
+ -
+ name: rate-est64
+ type: binary
+ struct: gnet-stats-rate-est64
+ -
+ name: pad
+ type: pad
+ -
+ name: basic-hw
+ type: binary
+ struct: gnet-stats-basic
+ -
+ name: pkt64
+ type: u64
+ -
+ name: tc-act-bpf-attrs
+ attributes:
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: parms
+ type: binary
+ -
+ name: ops-len
+ type: u16
+ -
+ name: ops
+ type: binary
+ -
+ name: fd
+ type: u32
+ -
+ name: name
+ type: string
+ -
+ name: pad
+ type: pad
+ -
+ name: tag
+ type: binary
+ -
+ name: id
+ type: binary
+ -
+ name: tc-act-connmark-attrs
+ attributes:
+ -
+ name: parms
+ type: binary
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: pad
+ type: pad
+ -
+ name: tc-act-csum-attrs
+ attributes:
+ -
+ name: parms
+ type: binary
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: pad
+ type: pad
+ -
+ name: tc-act-ct-attrs
+ attributes:
+ -
+ name: parms
+ type: binary
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: action
+ type: u16
+ -
+ name: zone
+ type: u16
+ -
+ name: mark
+ type: u32
+ -
+ name: mark-mask
+ type: u32
+ -
+ name: labels
+ type: binary
+ -
+ name: labels-mask
+ type: binary
+ -
+ name: nat-ipv4-min
+ type: u32
+ byte-order: big-endian
+ -
+ name: nat-ipv4-max
+ type: u32
+ byte-order: big-endian
+ -
+ name: nat-ipv6-min
+ type: binary
+ -
+ name: nat-ipv6-max
+ type: binary
+ -
+ name: nat-port-min
+ type: u16
+ byte-order: big-endian
+ -
+ name: nat-port-max
+ type: u16
+ byte-order: big-endian
+ -
+ name: pad
+ type: pad
+ -
+ name: helper-name
+ type: string
+ -
+ name: helper-family
+ type: u8
+ -
+ name: helper-proto
+ type: u8
+ -
+ name: tc-act-ctinfo-attrs
+ attributes:
+ -
+ name: pad
+ type: pad
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: act
+ type: binary
+ -
+ name: zone
+ type: u16
+ -
+ name: parms-dscp-mask
+ type: u32
+ -
+ name: parms-dscp-statemask
+ type: u32
+ -
+ name: parms-cpmark-mask
+ type: u32
+ -
+ name: stats-dscp-set
+ type: u64
+ -
+ name: stats-dscp-error
+ type: u64
+ -
+ name: stats-cpmark-set
+ type: u64
+ -
+ name: tc-act-gate-attrs
+ attributes:
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: parms
+ type: binary
+ -
+ name: pad
+ type: pad
+ -
+ name: priority
+ type: s32
+ -
+ name: entry-list
+ type: binary
+ -
+ name: base-time
+ type: u64
+ -
+ name: cycle-time
+ type: u64
+ -
+ name: cycle-time-ext
+ type: u64
+ -
+ name: flags
+ type: u32
+ -
+ name: clockid
+ type: s32
+ -
+ name: tc-act-ife-attrs
+ attributes:
+ -
+ name: parms
+ type: binary
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: dmac
+ type: binary
+ -
+ name: smac
+ type: binary
+ -
+ name: type
+ type: u16
+ -
+ name: metalst
+ type: binary
+ -
+ name: pad
+ type: pad
+ -
+ name: tc-act-mirred-attrs
+ attributes:
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: parms
+ type: binary
+ -
+ name: pad
+ type: pad
+ -
+ name: blockid
+ type: binary
+ -
+ name: tc-act-mpls-attrs
+ attributes:
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: parms
+ type: binary
+ struct: tc-mpls
+ -
+ name: pad
+ type: pad
+ -
+ name: proto
+ type: u16
+ byte-order: big-endian
+ -
+ name: label
+ type: u32
+ -
+ name: tc
+ type: u8
+ -
+ name: ttl
+ type: u8
+ -
+ name: bos
+ type: u8
+ -
+ name: tc-act-nat-attrs
+ attributes:
+ -
+ name: parms
+ type: binary
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: pad
+ type: pad
+ -
+ name: tc-act-pedit-attrs
+ attributes:
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: parms
+ type: binary
+ struct: tc-pedit-sel
+ -
+ name: pad
+ type: pad
+ -
+ name: parms-ex
+ type: binary
+ -
+ name: keys-ex
+ type: binary
+ -
+ name: key-ex
+ type: binary
+ -
+ name: tc-act-simple-attrs
+ attributes:
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: parms
+ type: binary
+ -
+ name: data
+ type: binary
+ -
+ name: pad
+ type: pad
+ -
+ name: tc-act-skbedit-attrs
+ attributes:
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: parms
+ type: binary
+ -
+ name: priority
+ type: u32
+ -
+ name: queue-mapping
+ type: u16
+ -
+ name: mark
+ type: u32
+ -
+ name: pad
+ type: pad
+ -
+ name: ptype
+ type: u16
+ -
+ name: mask
+ type: u32
+ -
+ name: flags
+ type: u64
+ -
+ name: queue-mapping-max
+ type: u16
+ -
+ name: tc-act-skbmod-attrs
+ attributes:
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: parms
+ type: binary
+ -
+ name: dmac
+ type: binary
+ -
+ name: smac
+ type: binary
+ -
+ name: etype
+ type: binary
+ -
+ name: pad
+ type: pad
+ -
+ name: tc-act-tunnel-key-attrs
+ attributes:
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: parms
+ type: binary
+ -
+ name: enc-ipv4-src
+ type: u32
+ byte-order: big-endian
+ -
+ name: enc-ipv4-dst
+ type: u32
+ byte-order: big-endian
+ -
+ name: enc-ipv6-src
+ type: binary
+ -
+ name: enc-ipv6-dst
+ type: binary
+ -
+ name: enc-key-id
+ type: u64
+ byte-order: big-endian
+ -
+ name: pad
+ type: pad
+ -
+ name: enc-dst-port
+ type: u16
+ byte-order: big-endian
+ -
+ name: no-csum
+ type: u8
+ -
+ name: enc-opts
+ type: binary
+ -
+ name: enc-tos
+ type: u8
+ -
+ name: enc-ttl
+ type: u8
+ -
+ name: no-frag
+ type: flag
+ -
+ name: tc-act-vlan-attrs
+ attributes:
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: parms
+ type: binary
+ struct: tc-vlan
+ -
+ name: push-vlan-id
+ type: u16
+ -
+ name: push-vlan-protocol
+ type: u16
+ -
+ name: pad
+ type: pad
+ -
+ name: push-vlan-priority
+ type: u8
+ -
+ name: push-eth-dst
+ type: binary
+ -
+ name: push-eth-src
+ type: binary
+ -
+ name: tc-basic-attrs
+ attributes:
+ -
+ name: classid
+ type: u32
+ -
+ name: ematches
+ type: nest
+ nested-attributes: tc-ematch-attrs
+ -
+ name: act
+ type: array-nest
+ nested-attributes: tc-act-attrs
+ -
+ name: police
+ type: nest
+ nested-attributes: tc-police-attrs
+ -
+ name: pcnt
+ type: binary
+ struct: tc-basic-pcnt
+ -
+ name: pad
+ type: pad
+ -
+ name: tc-bpf-attrs
+ attributes:
+ -
+ name: act
+ type: nest
+ nested-attributes: tc-act-attrs
+ -
+ name: police
+ type: nest
+ nested-attributes: tc-police-attrs
+ -
+ name: classid
+ type: u32
+ -
+ name: ops-len
+ type: u16
+ -
+ name: ops
+ type: binary
+ -
+ name: fd
+ type: u32
+ -
+ name: name
+ type: string
+ -
+ name: flags
+ type: u32
+ -
+ name: flags-gen
+ type: u32
+ -
+ name: tag
+ type: binary
+ -
+ name: id
+ type: u32
+ -
name: tc-cake-attrs
attributes:
-
@@ -641,7 +2077,8 @@ attribute-sets:
type: u32
-
name: tin-stats
- type: binary
+ type: array-nest
+ nested-attributes: tc-cake-tin-stats-attrs
-
name: deficit
type: s32
@@ -661,6 +2098,84 @@ attribute-sets:
name: blue-timer-us
type: s32
-
+ name: tc-cake-tin-stats-attrs
+ attributes:
+ -
+ name: pad
+ type: pad
+ -
+ name: sent-packets
+ type: u32
+ -
+ name: sent-bytes64
+ type: u64
+ -
+ name: dropped-packets
+ type: u32
+ -
+ name: dropped-bytes64
+ type: u64
+ -
+ name: acks-dropped-packets
+ type: u32
+ -
+ name: acks-dropped-bytes64
+ type: u64
+ -
+ name: ecn-marked-packets
+ type: u32
+ -
+ name: ecn-marked-bytes64
+ type: u64
+ -
+ name: backlog-packets
+ type: u32
+ -
+ name: backlog-bytes
+ type: u32
+ -
+ name: threshold-rate64
+ type: u64
+ -
+ name: target-us
+ type: u32
+ -
+ name: interval-us
+ type: u32
+ -
+ name: way-indirect-hits
+ type: u32
+ -
+ name: way-misses
+ type: u32
+ -
+ name: way-collisions
+ type: u32
+ -
+ name: peak-delay-us
+ type: u32
+ -
+ name: avg-delay-us
+ type: u32
+ -
+ name: base-delay-us
+ type: u32
+ -
+ name: sparse-flows
+ type: u32
+ -
+ name: bulk-flows
+ type: u32
+ -
+ name: unresponsive-flows
+ type: u32
+ -
+ name: max-skblen
+ type: u32
+ -
+ name: flow-quantum
+ type: u32
+ -
name: tc-cbs-attrs
attributes:
-
@@ -668,6 +2183,20 @@ attribute-sets:
type: binary
struct: tc-cbs-qopt
-
+ name: tc-cgroup-attrs
+ attributes:
+ -
+ name: act
+ type: nest
+ nested-attributes: tc-act-attrs
+ -
+ name: police
+ type: nest
+ nested-attributes: tc-police-attrs
+ -
+ name: ematches
+ type: binary
+ -
name: tc-choke-attrs
attributes:
-
@@ -677,6 +2206,9 @@ attribute-sets:
-
name: stab
type: binary
+ checks:
+ min-len: 256
+ max-len: 256
-
name: max-p
type: u32
@@ -705,6 +2237,56 @@ attribute-sets:
name: quantum
type: u32
-
+ name: tc-ematch-attrs
+ attributes:
+ -
+ name: tree-hdr
+ type: binary
+ struct: tcf-ematch-tree-hdr
+ -
+ name: tree-list
+ type: binary
+ -
+ name: tc-flow-attrs
+ attributes:
+ -
+ name: keys
+ type: u32
+ -
+ name: mode
+ type: u32
+ -
+ name: baseclass
+ type: u32
+ -
+ name: rshift
+ type: u32
+ -
+ name: addend
+ type: u32
+ -
+ name: mask
+ type: u32
+ -
+ name: xor
+ type: u32
+ -
+ name: divisor
+ type: u32
+ -
+ name: act
+ type: binary
+ -
+ name: police
+ type: nest
+ nested-attributes: tc-police-attrs
+ -
+ name: ematches
+ type: binary
+ -
+ name: perturb
+ type: u32
+ -
name: tc-flower-attrs
attributes:
-
@@ -953,15 +2535,19 @@ attribute-sets:
-
name: key-arp-sha
type: binary
+ display-hint: mac
-
name: key-arp-sha-mask
type: binary
+ display-hint: mac
-
name: key-arp-tha
type: binary
+ display-hint: mac
-
name: key-arp-tha-mask
type: binary
+ display-hint: mac
-
name: key-mpls-ttl
type: u8
@@ -1020,10 +2606,12 @@ attribute-sets:
type: u8
-
name: key-enc-opts
- type: binary
+ type: nest
+ nested-attributes: tc-flower-key-enc-opts-attrs
-
name: key-enc-opts-mask
- type: binary
+ type: nest
+ nested-attributes: tc-flower-key-enc-opts-attrs
-
name: in-hw-count
type: u32
@@ -1069,7 +2657,8 @@ attribute-sets:
type: binary
-
name: key-mpls-opts
- type: binary
+ type: nest
+ nested-attributes: tc-flower-key-mpls-opt-attrs
-
name: key-hash
type: u32
@@ -1091,6 +2680,129 @@ attribute-sets:
name: key-l2-tpv3-sid
type: u32
byte-order: big-endian
+ -
+ name: l2-miss
+ type: u8
+ -
+ name: key-cfm
+ type: nest
+ nested-attributes: tc-flower-key-cfm-attrs
+ -
+ name: key-spi
+ type: u32
+ byte-order: big-endian
+ -
+ name: key-spi-mask
+ type: u32
+ byte-order: big-endian
+ -
+ name: tc-flower-key-enc-opts-attrs
+ attributes:
+ -
+ name: geneve
+ type: nest
+ nested-attributes: tc-flower-key-enc-opt-geneve-attrs
+ -
+ name: vxlan
+ type: nest
+ nested-attributes: tc-flower-key-enc-opt-vxlan-attrs
+ -
+ name: erspan
+ type: nest
+ nested-attributes: tc-flower-key-enc-opt-erspan-attrs
+ -
+ name: gtp
+ type: nest
+ nested-attributes: tc-flower-key-enc-opt-gtp-attrs
+ -
+ name: tc-flower-key-enc-opt-geneve-attrs
+ attributes:
+ -
+ name: class
+ type: u16
+ -
+ name: type
+ type: u8
+ -
+ name: data
+ type: binary
+ -
+ name: tc-flower-key-enc-opt-vxlan-attrs
+ attributes:
+ -
+ name: gbp
+ type: u32
+ -
+ name: tc-flower-key-enc-opt-erspan-attrs
+ attributes:
+ -
+ name: ver
+ type: u8
+ -
+ name: index
+ type: u32
+ -
+ name: dir
+ type: u8
+ -
+ name: hwid
+ type: u8
+ -
+ name: tc-flower-key-enc-opt-gtp-attrs
+ attributes:
+ -
+ name: pdu-type
+ type: u8
+ -
+ name: qfi
+ type: u8
+ -
+ name: tc-flower-key-mpls-opt-attrs
+ attributes:
+ -
+ name: lse-depth
+ type: u8
+ -
+ name: lse-ttl
+ type: u8
+ -
+ name: lse-bos
+ type: u8
+ -
+ name: lse-tc
+ type: u8
+ -
+ name: lse-label
+ type: u32
+ -
+ name: tc-flower-key-cfm-attrs
+ attributes:
+ -
+ name: md-level
+ type: u8
+ -
+ name: opcode
+ type: u8
+ -
+ name: tc-fw-attrs
+ attributes:
+ -
+ name: classid
+ type: u32
+ -
+ name: police
+ type: nest
+ nested-attributes: tc-police-attrs
+ -
+ name: indev
+ type: string
+ -
+ name: act
+ type: array-nest
+ nested-attributes: tc-act-attrs
+ -
+ name: mask
+ type: u32
-
name: tc-gred-attrs
attributes:
@@ -1135,7 +2847,7 @@ attribute-sets:
type: u32
-
name: stat-bytes
- type: u32
+ type: u64
-
name: stat-packets
type: u32
@@ -1232,40 +2944,25 @@ attribute-sets:
name: offload
type: flag
-
- name: tc-act-attrs
+ name: tc-matchall-attrs
attributes:
-
- name: kind
- type: string
+ name: classid
+ type: u32
-
- name: options
- type: sub-message
- sub-message: tc-act-options-msg
- selector: kind
+ name: act
+ type: array-nest
+ nested-attributes: tc-act-attrs
-
- name: index
+ name: flags
type: u32
-
- name: stats
+ name: pcnt
type: binary
+ struct: tc-matchall-pcnt
-
name: pad
type: pad
- -
- name: cookie
- type: binary
- -
- name: flags
- type: bitfield32
- -
- name: hw-stats
- type: bitfield32
- -
- name: used-hw-stats
- type: bitfield32
- -
- name: in-hw-count
- type: u32
-
name: tc-etf-attrs
attributes:
@@ -1304,48 +3001,71 @@ attribute-sets:
-
name: plimit
type: u32
+ doc: Limit of total number of packets in queue
-
name: flow-plimit
type: u32
+ doc: Limit of packets per flow
-
name: quantum
type: u32
+ doc: RR quantum
-
name: initial-quantum
type: u32
+ doc: RR quantum for new flow
-
name: rate-enable
type: u32
+ doc: Enable / disable rate limiting
-
name: flow-default-rate
type: u32
+ doc: Obsolete, do not use
-
name: flow-max-rate
type: u32
+ doc: Per flow max rate
-
name: buckets-log
type: u32
+ doc: log2(number of buckets)
-
name: flow-refill-delay
type: u32
+ doc: Flow credit refill delay in usec
-
name: orphan-mask
type: u32
+ doc: Mask applied to orphaned skb hashes
-
name: low-rate-threshold
type: u32
+ doc: Per packet delay under this rate
-
name: ce-threshold
type: u32
+ doc: DCTCP-like CE marking threshold
-
name: timer-slack
type: u32
-
name: horizon
type: u32
+ doc: Time horizon in usec
-
name: horizon-drop
type: u8
+ doc: Drop packets beyond horizon, or cap their EDT
+ -
+ name: priomap
+ type: binary
+ struct: tc-prio-qopt
+ -
+ name: weights
+ type: binary
+ sub-type: s32
+ doc: Weights for each band
-
name: tc-fq-codel-attrs
attributes:
@@ -1427,6 +3147,7 @@ attribute-sets:
-
name: corr
type: binary
+ struct: tc-netem-corr
-
name: delay-dist
type: binary
@@ -1434,15 +3155,19 @@ attribute-sets:
-
name: reorder
type: binary
+ struct: tc-netem-reorder
-
name: corrupt
type: binary
+ struct: tc-netem-corrupt
-
name: loss
- type: binary
+ type: nest
+ nested-attributes: tc-netem-loss-attrs
-
name: rate
type: binary
+ struct: tc-netem-rate
-
name: ecn
type: u32
@@ -1461,10 +3186,27 @@ attribute-sets:
-
name: slot
type: binary
+ struct: tc-netem-slot
-
name: slot-dist
type: binary
sub-type: s16
+ -
+ name: prng-seed
+ type: u64
+ -
+ name: tc-netem-loss-attrs
+ attributes:
+ -
+ name: gi
+ type: binary
+ doc: General Intuitive - 4 state model
+ struct: tc-netem-gimodel
+ -
+ name: ge
+ type: binary
+ doc: Gilbert Elliot models
+ struct: tc-netem-gemodel
-
name: tc-pie-attrs
attributes:
@@ -1493,6 +3235,44 @@ attribute-sets:
name: dq-rate-estimator
type: u32
-
+ name: tc-police-attrs
+ attributes:
+ -
+ name: tbf
+ type: binary
+ struct: tc-police
+ -
+ name: rate
+ type: binary
+ -
+ name: peakrate
+ type: binary
+ -
+ name: avrate
+ type: u32
+ -
+ name: result
+ type: u32
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: pad
+ type: pad
+ -
+ name: rate64
+ type: u64
+ -
+ name: peakrate64
+ type: u64
+ -
+ name: pktrate64
+ type: u64
+ -
+ name: pktburst64
+ type: u64
+ -
name: tc-qfq-attrs
attributes:
-
@@ -1516,7 +3296,7 @@ attribute-sets:
type: u32
-
name: flags
- type: binary
+ type: bitfield32
-
name: early-drop-block
type: u32
@@ -1524,6 +3304,29 @@ attribute-sets:
name: mark-block
type: u32
-
+ name: tc-route-attrs
+ attributes:
+ -
+ name: classid
+ type: u32
+ -
+ name: to
+ type: u32
+ -
+ name: from
+ type: u32
+ -
+ name: iif
+ type: u32
+ -
+ name: police
+ type: nest
+ nested-attributes: tc-police-attrs
+ -
+ name: act
+ type: array-nest
+ nested-attributes: tc-act-attrs
+ -
name: tc-taprio-attrs
attributes:
-
@@ -1573,6 +3376,7 @@ attribute-sets:
name: entry
type: nest
nested-attributes: tc-taprio-sched-entry
+ multi-attr: true
-
name: tc-taprio-sched-entry
attributes:
@@ -1629,17 +3433,43 @@ attribute-sets:
name: pad
type: pad
-
- name: tca-gact-attrs
+ name: tc-act-sample-attrs
attributes:
-
name: tm
type: binary
+ struct: tcf-t
-
name: parms
type: binary
+ struct: tc-gen
+ -
+ name: rate
+ type: u32
+ -
+ name: trunc-size
+ type: u32
+ -
+ name: psample-group
+ type: u32
+ -
+ name: pad
+ type: pad
+ -
+ name: tc-act-gact-attrs
+ attributes:
+ -
+ name: tm
+ type: binary
+ struct: tcf-t
+ -
+ name: parms
+ type: binary
+ struct: tc-gen
-
name: prob
type: binary
+ struct: tc-gact-p
-
name: pad
type: pad
@@ -1659,35 +3489,90 @@ attribute-sets:
-
name: basic
type: binary
+ struct: gnet-stats-basic
-
name: rate-est
type: binary
+ struct: gnet-stats-rate-est
-
name: queue
type: binary
+ struct: gnet-stats-queue
-
name: app
- type: binary # TODO sub-message needs 2+ level deep lookup
+ type: sub-message
sub-message: tca-stats-app-msg
selector: kind
-
name: rate-est64
type: binary
+ struct: gnet-stats-rate-est64
-
name: pad
type: pad
-
name: basic-hw
type: binary
+ struct: gnet-stats-basic
-
name: pkt64
+ type: u64
+ -
+ name: tc-u32-attrs
+ attributes:
+ -
+ name: classid
+ type: u32
+ -
+ name: hash
+ type: u32
+ -
+ name: link
+ type: u32
+ -
+ name: divisor
+ type: u32
+ -
+ name: sel
type: binary
+ struct: tc-u32-sel
+ -
+ name: police
+ type: nest
+ nested-attributes: tc-police-attrs
+ -
+ name: act
+ type: array-nest
+ nested-attributes: tc-act-attrs
+ -
+ name: indev
+ type: string
+ -
+ name: pcnt
+ type: binary
+ struct: tc-u32-pcnt
+ -
+ name: mark
+ type: binary
+ struct: tc-u32-mark
+ -
+ name: flags
+ type: u32
+ -
+ name: pad
+ type: pad
sub-messages:
-
name: tc-options-msg
formats:
-
+ value: basic
+ attribute-set: tc-basic-attrs
+ -
+ value: bpf
+ attribute-set: tc-bpf-attrs
+ -
value: bfifo
fixed-header: tc-fifo-qopt
-
@@ -1697,6 +3582,9 @@ sub-messages:
value: cbs
attribute-set: tc-cbs-attrs
-
+ value: cgroup
+ attribute-set: tc-cgroup-attrs
+ -
value: choke
attribute-set: tc-choke-attrs
-
@@ -1714,6 +3602,12 @@ sub-messages:
value: ets
attribute-set: tc-ets-attrs
-
+ value: flow
+ attribute-set: tc-flow-attrs
+ -
+ value: flower
+ attribute-set: tc-flower-attrs
+ -
value: fq
attribute-set: tc-fq-attrs
-
@@ -1723,8 +3617,8 @@ sub-messages:
value: fq_pie
attribute-set: tc-fq-pie-attrs
-
- value: flower
- attribute-set: tc-flower-attrs
+ value: fw
+ attribute-set: tc-fw-attrs
-
value: gred
attribute-set: tc-gred-attrs
@@ -1740,6 +3634,9 @@ sub-messages:
-
value: ingress # no content
-
+ value: matchall
+ attribute-set: tc-matchall-attrs
+ -
value: mq # no content
-
value: mqprio
@@ -1776,6 +3673,9 @@ sub-messages:
value: red
attribute-set: tc-red-attrs
-
+ value: route
+ attribute-set: tc-route-attrs
+ -
value: sfb
fixed-header: tc-sfb-qopt
-
@@ -1787,88 +3687,105 @@ sub-messages:
-
value: tbf
attribute-set: tc-tbf-attrs
+ -
+ value: u32
+ attribute-set: tc-u32-attrs
-
name: tc-act-options-msg
formats:
-
- value: gact
- attribute-set: tca-gact-attrs
- -
- name: tca-stats-app-msg
- formats:
+ value: bpf
+ attribute-set: tc-act-bpf-attrs
-
- value: bfifo
+ value: connmark
+ attribute-set: tc-act-connmark-attrs
-
- value: blackhole
+ value: csum
+ attribute-set: tc-act-csum-attrs
-
- value: cake
- attribute-set: tc-cake-stats-attrs
+ value: ct
+ attribute-set: tc-act-ct-attrs
-
- value: cbs
+ value: ctinfo
+ attribute-set: tc-act-ctinfo-attrs
-
- value: choke
+ value: gact
+ attribute-set: tc-act-gact-attrs
-
- value: clsact
+ value: gate
+ attribute-set: tc-act-gate-attrs
-
- value: codel
+ value: ife
+ attribute-set: tc-act-ife-attrs
-
- value: drr
+ value: mirred
+ attribute-set: tc-act-mirred-attrs
-
- value: etf
+ value: mpls
+ attribute-set: tc-act-mpls-attrs
-
- value: ets
- -
- value: fq
- -
- value: fq_codel
+ value: nat
+ attribute-set: tc-act-nat-attrs
-
- value: fq_pie
+ value: pedit
+ attribute-set: tc-act-pedit-attrs
-
- value: flower
+ value: police
+ attribute-set: tc-act-police-attrs
-
- value: gred
+ value: sample
+ attribute-set: tc-act-sample-attrs
-
- value: hfsc
+ value: simple
+ attribute-set: tc-act-simple-attrs
-
- value: hhf
+ value: skbedit
+ attribute-set: tc-act-skbedit-attrs
-
- value: htb
+ value: skbmod
+ attribute-set: tc-act-skbmod-attrs
-
- value: ingress
+ value: tunnel_key
+ attribute-set: tc-act-tunnel-key-attrs
-
- value: mq
+ value: vlan
+ attribute-set: tc-act-vlan-attrs
+ -
+ name: tca-stats-app-msg
+ formats:
-
- value: mqprio
+ value: cake
+ attribute-set: tc-cake-stats-attrs
-
- value: multiq
+ value: choke
+ fixed-header: tc-choke-xstats
-
- value: netem
+ value: codel
+ fixed-header: tc-codel-xstats
-
- value: noqueue
+ value: fq
+ fixed-header: tc-fq-qd-stats
-
- value: pfifo
+ value: fq_codel
+ fixed-header: tc-fq-codel-xstats
-
- value: pfifo_fast
+ value: fq_pie
+ fixed-header: tc-fq-pie-xstats
-
- value: pfifo_head_drop
+ value: hhf
+ fixed-header: tc-hhf-xstats
-
value: pie
- -
- value: plug
- -
- value: prio
- -
- value: qfq
+ fixed-header: tc-pie-xstats
-
value: red
+ fixed-header: tc-red-xstats
-
value: sfb
+ fixed-header: tc-sfb-xstats
-
value: sfq
- -
- value: taprio
- -
- value: tbf
+ fixed-header: tc-sfq-xstats
operations:
enum-model: directional
diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
index dceeb0d763aa23..72da7057e4cf96 100644
--- a/Documentation/networking/af_xdp.rst
+++ b/Documentation/networking/af_xdp.rst
@@ -329,23 +329,24 @@ XDP_SHARED_UMEM option and provide the initial socket's fd in the
sxdp_shared_umem_fd field as you registered the UMEM on that
socket. These two sockets will now share one and the same UMEM.
-There is no need to supply an XDP program like the one in the previous
-case where sockets were bound to the same queue id and
-device. Instead, use the NIC's packet steering capabilities to steer
-the packets to the right queue. In the previous example, there is only
-one queue shared among sockets, so the NIC cannot do this steering. It
-can only steer between queues.
-
-In libbpf, you need to use the xsk_socket__create_shared() API as it
-takes a reference to a FILL ring and a COMPLETION ring that will be
-created for you and bound to the shared UMEM. You can use this
-function for all the sockets you create, or you can use it for the
-second and following ones and use xsk_socket__create() for the first
-one. Both methods yield the same result.
+In this case, it is possible to use the NIC's packet steering
+capabilities to steer the packets to the right queue. This is not
+possible in the previous example as there is only one queue shared
+among sockets, so the NIC cannot do this steering as it can only steer
+between queues.
+
+In libxdp (or libbpf prior to version 1.0), you need to use the
+xsk_socket__create_shared() API as it takes a reference to a FILL ring
+and a COMPLETION ring that will be created for you and bound to the
+shared UMEM. You can use this function for all the sockets you create,
+or you can use it for the second and following ones and use
+xsk_socket__create() for the first one. Both methods yield the same
+result.
Note that a UMEM can be shared between sockets on the same queue id
and device, as well as between queues on the same device and between
-devices at the same time.
+devices at the same time. It is also possible to redirect to any
+socket as long as it is bound to the same umem with XDP_SHARED_UMEM.
XDP_USE_NEED_WAKEUP bind flag
-----------------------------
@@ -822,6 +823,10 @@ A: The short answer is no, that is not supported at the moment. The
switch, or other distribution mechanism, in your NIC to direct
traffic to the correct queue id and socket.
+ Note that if you are using the XDP_SHARED_UMEM option, it is
+ possible to switch traffic between any socket bound to the same
+ umem.
+
Q: My packets are sometimes corrupted. What is wrong?
A: Care has to be taken not to feed the same buffer in the UMEM into
diff --git a/Documentation/networking/bonding.rst b/Documentation/networking/bonding.rst
index f7a73421eb76a1..e774b48de9f511 100644
--- a/Documentation/networking/bonding.rst
+++ b/Documentation/networking/bonding.rst
@@ -444,6 +444,18 @@ arp_missed_max
The default value is 2, and the allowable range is 1 - 255.
+coupled_control
+
+ Specifies whether the LACP state machine's MUX in the 802.3ad mode
+ should have separate Collecting and Distributing states.
+
+ This is by implementing the independent control state machine per
+ IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled control
+ state machine.
+
+ The default value is 1. This setting does not separate the Collecting
+ and Distributing states, maintaining the bond in coupled control.
+
downdelay
Specifies the time, in milliseconds, to wait before disabling
diff --git a/Documentation/networking/can.rst b/Documentation/networking/can.rst
index d7e1ada905b2d3..62519d38c58bad 100644
--- a/Documentation/networking/can.rst
+++ b/Documentation/networking/can.rst
@@ -444,6 +444,24 @@ definitions are specified for CAN specific MTUs in include/linux/can.h:
#define CANFD_MTU (sizeof(struct canfd_frame)) == 72 => CAN FD frame
+Returned Message Flags
+----------------------
+
+When using the system call recvmsg(2) on a RAW or a BCM socket, the
+msg->msg_flags field may contain the following flags:
+
+MSG_DONTROUTE:
+ set when the received frame was created on the local host.
+
+MSG_CONFIRM:
+ set when the frame was sent via the socket it is received on.
+ This flag can be interpreted as a 'transmission confirmation' when the
+ CAN driver supports the echo of frames on driver level, see
+ :ref:`socketcan-local-loopback1` and :ref:`socketcan-local-loopback2`.
+ (Note: In order to receive such messages on a RAW socket,
+ CAN_RAW_RECV_OWN_MSGS must be set.)
+
+
.. _socketcan-raw-sockets:
RAW Protocol Sockets with can_filters (SOCK_RAW)
@@ -693,22 +711,6 @@ where the CAN_INV_FILTER flag is set in order to notch single CAN IDs or
CAN ID ranges from the incoming traffic.
-RAW Socket Returned Message Flags
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-When using recvmsg() call, the msg->msg_flags may contain following flags:
-
-MSG_DONTROUTE:
- set when the received frame was created on the local host.
-
-MSG_CONFIRM:
- set when the frame was sent via the socket it is received on.
- This flag can be interpreted as a 'transmission confirmation' when the
- CAN driver supports the echo of frames on driver level, see
- :ref:`socketcan-local-loopback1` and :ref:`socketcan-local-loopback2`.
- In order to receive such messages, CAN_RAW_RECV_OWN_MSGS must be set.
-
-
Broadcast Manager Protocol Sockets (SOCK_DGRAM)
-----------------------------------------------
diff --git a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
index b842bcb14255b5..a4c7d0c65fd7e5 100644
--- a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
+++ b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
@@ -211,10 +211,16 @@ Documentation/networking/net_dim.rst
RX copybreak
============
+
The rx_copybreak is initialized by default to ENA_DEFAULT_RX_COPYBREAK
and can be configured by the ETHTOOL_STUNABLE command of the
SIOCETHTOOL ioctl.
+This option controls the maximum packet length for which the RX
+descriptor it was received on would be recycled. When a packet smaller
+than RX copybreak bytes is received, it is copied into a new memory
+buffer and the RX descriptor is returned to HW.
+
Statistics
==========
diff --git a/Documentation/networking/device_drivers/ethernet/index.rst b/Documentation/networking/device_drivers/ethernet/index.rst
index 43de285b8a923a..6932d8c043c2c1 100644
--- a/Documentation/networking/device_drivers/ethernet/index.rst
+++ b/Documentation/networking/device_drivers/ethernet/index.rst
@@ -42,6 +42,7 @@ Contents:
intel/ice
marvell/octeontx2
marvell/octeon_ep
+ marvell/octeon_ep_vf
mellanox/mlx5/index
microsoft/netvsc
neterion/s2io
diff --git a/Documentation/networking/device_drivers/ethernet/intel/ice.rst b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
index 5038e54586af66..934752f675ba40 100644
--- a/Documentation/networking/device_drivers/ethernet/intel/ice.rst
+++ b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
@@ -368,15 +368,28 @@ more options for Receive Side Scaling (RSS) hash byte configuration.
# ethtool -N <ethX> rx-flow-hash <type> <option>
Where <type> is:
- tcp4 signifying TCP over IPv4
- udp4 signifying UDP over IPv4
- tcp6 signifying TCP over IPv6
- udp6 signifying UDP over IPv6
+ tcp4 signifying TCP over IPv4
+ udp4 signifying UDP over IPv4
+ gtpc4 signifying GTP-C over IPv4
+ gtpc4t signifying GTP-C (include TEID) over IPv4
+ gtpu4 signifying GTP-U over IPV4
+ gtpu4e signifying GTP-U and Extension Header over IPV4
+ gtpu4u signifying GTP-U PSC Uplink over IPV4
+ gtpu4d signifying GTP-U PSC Downlink over IPV4
+ tcp6 signifying TCP over IPv6
+ udp6 signifying UDP over IPv6
+ gtpc6 signifying GTP-C over IPv6
+ gtpc6t signifying GTP-C (include TEID) over IPv6
+ gtpu6 signifying GTP-U over IPV6
+ gtpu6e signifying GTP-U and Extension Header over IPV6
+ gtpu6u signifying GTP-U PSC Uplink over IPV6
+ gtpu6d signifying GTP-U PSC Downlink over IPV6
And <option> is one or more of:
s Hash on the IP source address of the Rx packet.
d Hash on the IP destination address of the Rx packet.
f Hash on bytes 0 and 1 of the Layer 4 header of the Rx packet.
n Hash on bytes 2 and 3 of the Layer 4 header of the Rx packet.
+ e Hash on GTP Packet on TEID (4bytes) of the Rx packet.
Accelerated Receive Flow Steering (aRFS)
diff --git a/Documentation/networking/device_drivers/ethernet/marvell/octeon_ep_vf.rst b/Documentation/networking/device_drivers/ethernet/marvell/octeon_ep_vf.rst
new file mode 100644
index 00000000000000..603133d0b92f88
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/marvell/octeon_ep_vf.rst
@@ -0,0 +1,24 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+=======================================================================
+Linux kernel networking driver for Marvell's Octeon PCI Endpoint NIC VF
+=======================================================================
+
+Network driver for Marvell's Octeon PCI EndPoint NIC VF.
+Copyright (c) 2020 Marvell International Ltd.
+
+Overview
+========
+This driver implements networking functionality of Marvell's Octeon PCI
+EndPoint NIC VF.
+
+Supported Devices
+=================
+Currently, this driver support following devices:
+ * Network controller: Cavium, Inc. Device b203
+ * Network controller: Cavium, Inc. Device b403
+ * Network controller: Cavium, Inc. Device b103
+ * Network controller: Cavium, Inc. Device b903
+ * Network controller: Cavium, Inc. Device ba03
+ * Network controller: Cavium, Inc. Device bc03
+ * Network controller: Cavium, Inc. Device bd03
diff --git a/Documentation/networking/device_drivers/wwan/t7xx.rst b/Documentation/networking/device_drivers/wwan/t7xx.rst
index dd5b731957ca40..f346f5f85f154e 100644
--- a/Documentation/networking/device_drivers/wwan/t7xx.rst
+++ b/Documentation/networking/device_drivers/wwan/t7xx.rst
@@ -39,6 +39,34 @@ command and receive response:
- open the AT control channel using a UART tool or a special user tool
+Sysfs
+=====
+The driver provides sysfs interfaces to userspace.
+
+t7xx_mode
+---------
+The sysfs interface provides userspace with access to the device mode, this interface
+supports read and write operations.
+
+Device mode:
+
+- ``unknown`` represents that device in unknown status
+- ``ready`` represents that device in ready status
+- ``reset`` represents that device in reset status
+- ``fastboot_switching`` represents that device in fastboot switching status
+- ``fastboot_download`` represents that device in fastboot download status
+- ``fastboot_dump`` represents that device in fastboot dump status
+
+Read from userspace to get the current device mode.
+
+::
+ $ cat /sys/bus/pci/devices/${bdf}/t7xx_mode
+
+Write from userspace to set the device mode.
+
+::
+ $ echo fastboot_switching > /sys/bus/pci/devices/${bdf}/t7xx_mode
+
Management application development
==================================
The driver and userspace interfaces are described below. The MBIM protocol is
@@ -97,6 +125,20 @@ The driver exposes an AT port by implementing AT WWAN Port.
The userspace end of the control port is a /dev/wwan0at0 character
device. Application shall use this interface to issue AT commands.
+fastboot port userspace ABI
+---------------------------
+
+/dev/wwan0fastboot0 character device
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The driver exposes a fastboot protocol interface by implementing
+fastboot WWAN Port. The userspace end of the fastboot channel pipe is a
+/dev/wwan0fastboot0 character device. Application shall use this interface for
+fastboot protocol communication.
+
+Please note that driver needs to be reloaded to export /dev/wwan0fastboot0
+port, because device needs a cold reset after enter ``fastboot_switching``
+mode.
+
The MediaTek's T700 modem supports the 3GPP TS 27.007 [4] specification.
References
@@ -118,3 +160,7 @@ speak the Mobile Interface Broadband Model (MBIM) protocol"*
[4] *Specification # 27.007 - 3GPP*
- https://www.3gpp.org/DynaReport/27007.htm
+
+[5] *fastboot "a mechanism for communicating with bootloaders"*
+
+- https://android.googlesource.com/platform/system/core/+/refs/heads/main/fastboot/README.md
diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index 702f204a3dbd35..456985407475f1 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -97,6 +97,10 @@ parameters.
When metadata is disabled, the above use cases will fail to initialize if
users try to enable them.
+
+ Note: Setting this parameter does not take effect immediately. Setting
+ must happen in legacy mode and eswitch port metadata takes effect after
+ enabling switchdev mode.
* - ``hairpin_num_queues``
- u32
- driverinit
@@ -246,7 +250,7 @@ them in realtime.
Description of the vnic counters:
-- total_q_under_processor_handle
+- total_error_queues
number of queues in an error state due to
an async error or errored command.
- send_queue_priority_update_flow
@@ -255,7 +259,8 @@ Description of the vnic counters:
number of times CQ entered an error state due to an overflow.
- async_eq_overrun
number of times an EQ mapped to async events was overrun.
- comp_eq_overrun number of times an EQ mapped to completion events was
+- comp_eq_overrun
+ number of times an EQ mapped to completion events was
overrun.
- quota_exceeded_command
number of commands issued and failed due to quota exceeded.
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 69f3d6dcd9fd81..473d72c36d6175 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -74,6 +74,7 @@ Contents:
mpls-sysctl
mptcp-sysctl
multiqueue
+ multi-pf-netdev
napi
net_cachelines/index
netconsole
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 7afff42612e949..bd50df6a5a42e6 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -2503,7 +2503,7 @@ use_tempaddr - INTEGER
temp_valid_lft - INTEGER
valid lifetime (in seconds) for temporary addresses. If less than the
- minimum required lifetime (typically 5 seconds), temporary addresses
+ minimum required lifetime (typically 5-7 seconds), temporary addresses
will not be created.
Default: 172800 (2 days)
@@ -2511,7 +2511,7 @@ temp_valid_lft - INTEGER
temp_prefered_lft - INTEGER
Preferred lifetime (in seconds) for temporary addresses. If
temp_prefered_lft is less than the minimum required lifetime (typically
- 5 seconds), temporary addresses will not be created. If
+ 5-7 seconds), the preferred lifetime is the minimum required. If
temp_prefered_lft is greater than temp_valid_lft, the preferred lifetime
is temp_valid_lft.
@@ -2535,6 +2535,16 @@ max_desync_factor - INTEGER
Default: 600
+regen_min_advance - INTEGER
+ How far in advance (in seconds), at minimum, to create a new temporary
+ address before the current one is deprecated. This value is added to
+ the amount of time that may be required for duplicate address detection
+ to determine when to create a new address. Linux permits setting this
+ value to less than the default of 2 seconds, but a value less than 2
+ does not conform to RFC 8981.
+
+ Default: 2
+
regen_max_retry - INTEGER
Number of attempts before give up attempting to generate
valid temporary addresses.
diff --git a/Documentation/networking/l2tp.rst b/Documentation/networking/l2tp.rst
index 7f383e99dbada9..8496b467dea492 100644
--- a/Documentation/networking/l2tp.rst
+++ b/Documentation/networking/l2tp.rst
@@ -386,12 +386,19 @@ Sample userspace code:
- Create session PPPoX data socket::
+ /* Input: the L2TP tunnel UDP socket `tunnel_fd`, which needs to be
+ * bound already (both sockname and peername), otherwise it will not be
+ * ready.
+ */
+
struct sockaddr_pppol2tp sax;
- int fd;
+ int session_fd;
+ int ret;
+
+ session_fd = socket(AF_PPPOX, SOCK_DGRAM, PX_PROTO_OL2TP);
+ if (session_fd < 0)
+ return -errno;
- /* Note, the tunnel socket must be bound already, else it
- * will not be ready
- */
sax.sa_family = AF_PPPOX;
sax.sa_protocol = PX_PROTO_OL2TP;
sax.pppol2tp.fd = tunnel_fd;
@@ -406,12 +413,128 @@ Sample userspace code:
/* session_fd is the fd of the session's PPPoL2TP socket.
* tunnel_fd is the fd of the tunnel UDP / L2TPIP socket.
*/
- fd = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax));
- if (fd < 0 ) {
+ ret = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax));
+ if (ret < 0 ) {
+ close(session_fd);
+ return -errno;
+ }
+
+ return session_fd;
+
+L2TP control packets will still be available for read on `tunnel_fd`.
+
+ - Create PPP channel::
+
+ /* Input: the session PPPoX data socket `session_fd` which was created
+ * as described above.
+ */
+
+ int ppp_chan_fd;
+ int chindx;
+ int ret;
+
+ ret = ioctl(session_fd, PPPIOCGCHAN, &chindx);
+ if (ret < 0)
+ return -errno;
+
+ ppp_chan_fd = open("/dev/ppp", O_RDWR);
+ if (ppp_chan_fd < 0)
+ return -errno;
+
+ ret = ioctl(ppp_chan_fd, PPPIOCATTCHAN, &chindx);
+ if (ret < 0) {
+ close(ppp_chan_fd);
return -errno;
}
+
+ return ppp_chan_fd;
+
+LCP PPP frames will be available for read on `ppp_chan_fd`.
+
+ - Create PPP interface::
+
+ /* Input: the PPP channel `ppp_chan_fd` which was created as described
+ * above.
+ */
+
+ int ifunit = -1;
+ int ppp_if_fd;
+ int ret;
+
+ ppp_if_fd = open("/dev/ppp", O_RDWR);
+ if (ppp_if_fd < 0)
+ return -errno;
+
+ ret = ioctl(ppp_if_fd, PPPIOCNEWUNIT, &ifunit);
+ if (ret < 0) {
+ close(ppp_if_fd);
+ return -errno;
+ }
+
+ ret = ioctl(ppp_chan_fd, PPPIOCCONNECT, &ifunit);
+ if (ret < 0) {
+ close(ppp_if_fd);
+ return -errno;
+ }
+
+ return ppp_if_fd;
+
+IPCP/IPv6CP PPP frames will be available for read on `ppp_if_fd`.
+
+The ppp<ifunit> interface can then be configured as usual with netlink's
+RTM_NEWLINK, RTM_NEWADDR, RTM_NEWROUTE, or ioctl's SIOCSIFMTU, SIOCSIFADDR,
+SIOCSIFDSTADDR, SIOCSIFNETMASK, SIOCSIFFLAGS, or with the `ip` command.
+
+ - Bridging L2TP sessions which have PPP pseudowire types (this is also called
+ L2TP tunnel switching or L2TP multihop) is supported by bridging the PPP
+ channels of the two L2TP sessions to be bridged::
+
+ /* Input: the session PPPoX data sockets `session_fd1` and `session_fd2`
+ * which were created as described further above.
+ */
+
+ int ppp_chan_fd;
+ int chindx1;
+ int chindx2;
+ int ret;
+
+ ret = ioctl(session_fd1, PPPIOCGCHAN, &chindx1);
+ if (ret < 0)
+ return -errno;
+
+ ret = ioctl(session_fd2, PPPIOCGCHAN, &chindx2);
+ if (ret < 0)
+ return -errno;
+
+ ppp_chan_fd = open("/dev/ppp", O_RDWR);
+ if (ppp_chan_fd < 0)
+ return -errno;
+
+ ret = ioctl(ppp_chan_fd, PPPIOCATTCHAN, &chindx1);
+ if (ret < 0) {
+ close(ppp_chan_fd);
+ return -errno;
+ }
+
+ ret = ioctl(ppp_chan_fd, PPPIOCBRIDGECHAN, &chindx2);
+ close(ppp_chan_fd);
+ if (ret < 0)
+ return -errno;
+
return 0;
+It can be noted that when bridging PPP channels, the PPP session is not locally
+terminated, and no local PPP interface is created. PPP frames arriving on one
+channel are directly passed to the other channel, and vice versa.
+
+The PPP channel does not need to be kept open. Only the session PPPoX data
+sockets need to be kept open.
+
+More generally, it is also possible in the same way to e.g. bridge a PPPoL2TP
+PPP channel with other types of PPP channels, such as PPPoE.
+
+See more details for the PPP side in ppp_generic.rst.
+
Old L2TPv2-only API
-------------------
diff --git a/Documentation/networking/multi-pf-netdev.rst b/Documentation/networking/multi-pf-netdev.rst
new file mode 100644
index 00000000000000..be8e4bcadf1190
--- /dev/null
+++ b/Documentation/networking/multi-pf-netdev.rst
@@ -0,0 +1,174 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+===============
+Multi-PF Netdev
+===============
+
+Contents
+========
+
+- `Background`_
+- `Overview`_
+- `mlx5 implementation`_
+- `Channels distribution`_
+- `Observability`_
+- `Steering`_
+- `Mutually exclusive features`_
+
+Background
+==========
+
+The Multi-PF NIC technology enables several CPUs within a multi-socket server to connect directly to
+the network, each through its own dedicated PCIe interface. Through either a connection harness that
+splits the PCIe lanes between two cards or by bifurcating a PCIe slot for a single card. This
+results in eliminating the network traffic traversing over the internal bus between the sockets,
+significantly reducing overhead and latency, in addition to reducing CPU utilization and increasing
+network throughput.
+
+Overview
+========
+
+The feature adds support for combining multiple PFs of the same port in a Multi-PF environment under
+one netdev instance. It is implemented in the netdev layer. Lower-layer instances like pci func,
+sysfs entry, and devlink are kept separate.
+Passing traffic through different devices belonging to different NUMA sockets saves cross-NUMA
+traffic and allows apps running on the same netdev from different NUMAs to still feel a sense of
+proximity to the device and achieve improved performance.
+
+mlx5 implementation
+===================
+
+Multi-PF or Socket-direct in mlx5 is achieved by grouping PFs together which belong to the same
+NIC and has the socket-direct property enabled, once all PFs are probed, we create a single netdev
+to represent all of them, symmetrically, we destroy the netdev whenever any of the PFs is removed.
+
+The netdev network channels are distributed between all devices, a proper configuration would utilize
+the correct close NUMA node when working on a certain app/CPU.
+
+We pick one PF to be a primary (leader), and it fills a special role. The other devices
+(secondaries) are disconnected from the network at the chip level (set to silent mode). In silent
+mode, no south <-> north traffic flowing directly through a secondary PF. It needs the assistance of
+the leader PF (east <-> west traffic) to function. All Rx/Tx traffic is steered through the primary
+to/from the secondaries.
+
+Currently, we limit the support to PFs only, and up to two PFs (sockets).
+
+Channels distribution
+=====================
+
+We distribute the channels between the different PFs to achieve local NUMA node performance
+on multiple NUMA nodes.
+
+Each combined channel works against one specific PF, creating all its datapath queues against it. We
+distribute channels to PFs in a round-robin policy.
+
+::
+
+ Example for 2 PFs and 5 channels:
+ +--------+--------+
+ | ch idx | PF idx |
+ +--------+--------+
+ | 0 | 0 |
+ | 1 | 1 |
+ | 2 | 0 |
+ | 3 | 1 |
+ | 4 | 0 |
+ +--------+--------+
+
+
+The reason we prefer round-robin is, it is less influenced by changes in the number of channels. The
+mapping between a channel index and a PF is fixed, no matter how many channels the user configures.
+As the channel stats are persistent across channel's closure, changing the mapping every single time
+would turn the accumulative stats less representing of the channel's history.
+
+This is achieved by using the correct core device instance (mdev) in each channel, instead of them
+all using the same instance under "priv->mdev".
+
+Observability
+=============
+The relation between PF, irq, napi, and queue can be observed via netlink spec:
+
+$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml --dump queue-get --json='{"ifindex": 13}'
+[{'id': 0, 'ifindex': 13, 'napi-id': 539, 'type': 'rx'},
+ {'id': 1, 'ifindex': 13, 'napi-id': 540, 'type': 'rx'},
+ {'id': 2, 'ifindex': 13, 'napi-id': 541, 'type': 'rx'},
+ {'id': 3, 'ifindex': 13, 'napi-id': 542, 'type': 'rx'},
+ {'id': 4, 'ifindex': 13, 'napi-id': 543, 'type': 'rx'},
+ {'id': 0, 'ifindex': 13, 'napi-id': 539, 'type': 'tx'},
+ {'id': 1, 'ifindex': 13, 'napi-id': 540, 'type': 'tx'},
+ {'id': 2, 'ifindex': 13, 'napi-id': 541, 'type': 'tx'},
+ {'id': 3, 'ifindex': 13, 'napi-id': 542, 'type': 'tx'},
+ {'id': 4, 'ifindex': 13, 'napi-id': 543, 'type': 'tx'}]
+
+$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml --dump napi-get --json='{"ifindex": 13}'
+[{'id': 543, 'ifindex': 13, 'irq': 42},
+ {'id': 542, 'ifindex': 13, 'irq': 41},
+ {'id': 541, 'ifindex': 13, 'irq': 40},
+ {'id': 540, 'ifindex': 13, 'irq': 39},
+ {'id': 539, 'ifindex': 13, 'irq': 36}]
+
+Here you can clearly observe our channels distribution policy:
+
+$ ls /proc/irq/{36,39,40,41,42}/mlx5* -d -1
+/proc/irq/36/mlx5_comp1@pci:0000:08:00.0
+/proc/irq/39/mlx5_comp1@pci:0000:09:00.0
+/proc/irq/40/mlx5_comp2@pci:0000:08:00.0
+/proc/irq/41/mlx5_comp2@pci:0000:09:00.0
+/proc/irq/42/mlx5_comp3@pci:0000:08:00.0
+
+Steering
+========
+Secondary PFs are set to "silent" mode, meaning they are disconnected from the network.
+
+In Rx, the steering tables belong to the primary PF only, and it is its role to distribute incoming
+traffic to other PFs, via cross-vhca steering capabilities. Still maintain a single default RSS table,
+that is capable of pointing to the receive queues of a different PF.
+
+In Tx, the primary PF creates a new Tx flow table, which is aliased by the secondaries, so they can
+go out to the network through it.
+
+In addition, we set default XPS configuration that, based on the CPU, selects an SQ belonging to the
+PF on the same node as the CPU.
+
+XPS default config example:
+
+NUMA node(s): 2
+NUMA node0 CPU(s): 0-11
+NUMA node1 CPU(s): 12-23
+
+PF0 on node0, PF1 on node1.
+
+- /sys/class/net/eth2/queues/tx-0/xps_cpus:000001
+- /sys/class/net/eth2/queues/tx-1/xps_cpus:001000
+- /sys/class/net/eth2/queues/tx-2/xps_cpus:000002
+- /sys/class/net/eth2/queues/tx-3/xps_cpus:002000
+- /sys/class/net/eth2/queues/tx-4/xps_cpus:000004
+- /sys/class/net/eth2/queues/tx-5/xps_cpus:004000
+- /sys/class/net/eth2/queues/tx-6/xps_cpus:000008
+- /sys/class/net/eth2/queues/tx-7/xps_cpus:008000
+- /sys/class/net/eth2/queues/tx-8/xps_cpus:000010
+- /sys/class/net/eth2/queues/tx-9/xps_cpus:010000
+- /sys/class/net/eth2/queues/tx-10/xps_cpus:000020
+- /sys/class/net/eth2/queues/tx-11/xps_cpus:020000
+- /sys/class/net/eth2/queues/tx-12/xps_cpus:000040
+- /sys/class/net/eth2/queues/tx-13/xps_cpus:040000
+- /sys/class/net/eth2/queues/tx-14/xps_cpus:000080
+- /sys/class/net/eth2/queues/tx-15/xps_cpus:080000
+- /sys/class/net/eth2/queues/tx-16/xps_cpus:000100
+- /sys/class/net/eth2/queues/tx-17/xps_cpus:100000
+- /sys/class/net/eth2/queues/tx-18/xps_cpus:000200
+- /sys/class/net/eth2/queues/tx-19/xps_cpus:200000
+- /sys/class/net/eth2/queues/tx-20/xps_cpus:000400
+- /sys/class/net/eth2/queues/tx-21/xps_cpus:400000
+- /sys/class/net/eth2/queues/tx-22/xps_cpus:000800
+- /sys/class/net/eth2/queues/tx-23/xps_cpus:800000
+
+Mutually exclusive features
+===========================
+
+The nature of Multi-PF, where different channels work with different PFs, conflicts with
+stateful features where the state is maintained in one of the PFs.
+For example, in the TLS device-offload feature, special context objects are created per connection
+and maintained in the PF. Transitioning between different RQs/SQs would break the feature. Hence,
+we disable this combination for now.
diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst
index 390730a74332d7..d55c2a22ec7af0 100644
--- a/Documentation/networking/netconsole.rst
+++ b/Documentation/networking/netconsole.rst
@@ -15,6 +15,8 @@ Extended console support by Tejun Heo <tj@kernel.org>, May 1 2015
Release prepend support by Breno Leitao <leitao@debian.org>, Jul 7 2023
+Userdata append support by Matthew Wood <thepacketgeek@gmail.com>, Jan 22 2024
+
Please send bug reports to Matt Mackall <mpm@selenic.com>
Satyam Sharma <satyam.sharma@gmail.com>, and Cong Wang <xiyou.wangcong@gmail.com>
@@ -171,6 +173,70 @@ You can modify these targets in runtime by creating the following targets::
cat cmdline1/remote_ip
10.0.0.3
+Append User Data
+----------------
+
+Custom user data can be appended to the end of messages with netconsole
+dynamic configuration enabled. User data entries can be modified without
+changing the "enabled" attribute of a target.
+
+Directories (keys) under `userdata` are limited to 53 character length, and
+data in `userdata/<key>/value` are limited to 200 bytes::
+
+ cd /sys/kernel/config/netconsole && mkdir cmdline0
+ cd cmdline0
+ mkdir userdata/foo
+ echo bar > userdata/foo/value
+ mkdir userdata/qux
+ echo baz > userdata/qux/value
+
+Messages will now include this additional user data::
+
+ echo "This is a message" > /dev/kmsg
+
+Sends::
+
+ 12,607,22085407756,-;This is a message
+ foo=bar
+ qux=baz
+
+Preview the userdata that will be appended with::
+
+ cd /sys/kernel/config/netconsole/cmdline0/userdata
+ for f in `ls userdata`; do echo $f=$(cat userdata/$f/value); done
+
+If a `userdata` entry is created but no data is written to the `value` file,
+the entry will be omitted from netconsole messages::
+
+ cd /sys/kernel/config/netconsole && mkdir cmdline0
+ cd cmdline0
+ mkdir userdata/foo
+ echo bar > userdata/foo/value
+ mkdir userdata/qux
+
+The `qux` key is omitted since it has no value::
+
+ echo "This is a message" > /dev/kmsg
+ 12,607,22085407756,-;This is a message
+ foo=bar
+
+Delete `userdata` entries with `rmdir`::
+
+ rmdir /sys/kernel/config/netconsole/cmdline0/userdata/qux
+
+.. warning::
+ When writing strings to user data values, input is broken up per line in
+ configfs store calls and this can cause confusing behavior::
+
+ mkdir userdata/testing
+ printf "val1\nval2" > userdata/testing/value
+ # userdata store value is called twice, first with "val1\n" then "val2"
+ # so "val2" is stored, being the last value stored
+ cat userdata/testing/value
+ val2
+
+ It is recommended to not write user data values with newlines.
+
Extended console:
=================
diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index 9e4cccb90b8700..c2476917a6c37d 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -252,8 +252,8 @@ ndo_eth_ioctl:
Context: process
ndo_get_stats:
- Synchronization: rtnl_lock() semaphore, dev_base_lock rwlock, or RCU.
- Context: atomic (can't sleep under rwlock or RCU)
+ Synchronization: rtnl_lock() semaphore, or RCU.
+ Context: atomic (can't sleep under RCU)
ndo_start_xmit:
Synchronization: __netif_tx_lock spinlock.
diff --git a/Documentation/networking/sfp-phylink.rst b/Documentation/networking/sfp-phylink.rst
index 8054d33f449f56..5bf285d73e8a1a 100644
--- a/Documentation/networking/sfp-phylink.rst
+++ b/Documentation/networking/sfp-phylink.rst
@@ -231,16 +231,136 @@ this documentation.
For further information on these methods, please see the inline
documentation in :c:type:`struct phylink_mac_ops <phylink_mac_ops>`.
-9. Remove calls to of_parse_phandle() for the PHY,
- of_phy_register_fixed_link() for fixed links etc. from the probe
- function, and replace with:
+9. Fill-in the :c:type:`struct phylink_config <phylink_config>` fields with
+ a reference to the :c:type:`struct device <device>` associated to your
+ :c:type:`struct net_device <net_device>`:
.. code-block:: c
- struct phylink *phylink;
priv->phylink_config.dev = &dev.dev;
priv->phylink_config.type = PHYLINK_NETDEV;
+ Fill-in the various speeds, pause and duplex modes your MAC can handle:
+
+ .. code-block:: c
+
+ priv->phylink_config.mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100 | MAC_1000FD;
+
+10. Some Ethernet controllers work in pair with a PCS (Physical Coding Sublayer)
+ block, that can handle among other things the encoding/decoding, link
+ establishment detection and autonegotiation. While some MACs have internal
+ PCS whose operation is transparent, some other require dedicated PCS
+ configuration for the link to become functional. In that case, phylink
+ provides a PCS abstraction through :c:type:`struct phylink_pcs <phylink_pcs>`.
+
+ Identify if your driver has one or more internal PCS blocks, and/or if
+ your controller can use an external PCS block that might be internally
+ connected to your controller.
+
+ If your controller doesn't have any internal PCS, you can go to step 11.
+
+ If your Ethernet controller contains one or several PCS blocks, create
+ one :c:type:`struct phylink_pcs <phylink_pcs>` instance per PCS block within
+ your driver's private data structure:
+
+ .. code-block:: c
+
+ struct phylink_pcs pcs;
+
+ Populate the relevant :c:type:`struct phylink_pcs_ops <phylink_pcs_ops>` to
+ configure your PCS. Create a :c:func:`pcs_get_state` function that reports
+ the inband link state, a :c:func:`pcs_config` function to configure your
+ PCS according to phylink-provided parameters, and a :c:func:`pcs_validate`
+ function that report to phylink all accepted configuration parameters for
+ your PCS:
+
+ .. code-block:: c
+
+ struct phylink_pcs_ops foo_pcs_ops = {
+ .pcs_validate = foo_pcs_validate,
+ .pcs_get_state = foo_pcs_get_state,
+ .pcs_config = foo_pcs_config,
+ };
+
+ Arrange for PCS link state interrupts to be forwarded into
+ phylink, via:
+
+ .. code-block:: c
+
+ phylink_pcs_change(pcs, link_is_up);
+
+ where ``link_is_up`` is true if the link is currently up or false
+ otherwise. If a PCS is unable to provide these interrupts, then
+ it should set ``pcs->pcs_poll = true;`` when creating the PCS.
+
+11. If your controller relies on, or accepts the presence of an external PCS
+ controlled through its own driver, add a pointer to a phylink_pcs instance
+ in your driver private data structure:
+
+ .. code-block:: c
+
+ struct phylink_pcs *pcs;
+
+ The way of getting an instance of the actual PCS depends on the platform,
+ some PCS sit on an MDIO bus and are grabbed by passing a pointer to the
+ corresponding :c:type:`struct mii_bus <mii_bus>` and the PCS's address on
+ that bus. In this example, we assume the controller attaches to a Lynx PCS
+ instance:
+
+ .. code-block:: c
+
+ priv->pcs = lynx_pcs_create_mdiodev(bus, 0);
+
+ Some PCS can be recovered based on firmware information:
+
+ .. code-block:: c
+
+ priv->pcs = lynx_pcs_create_fwnode(of_fwnode_handle(node));
+
+12. Populate the :c:func:`mac_select_pcs` callback and add it to your
+ :c:type:`struct phylink_mac_ops <phylink_mac_ops>` set of ops. This function
+ must return a pointer to the relevant :c:type:`struct phylink_pcs <phylink_pcs>`
+ that will be used for the requested link configuration:
+
+ .. code-block:: c
+
+ static struct phylink_pcs *foo_select_pcs(struct phylink_config *config,
+ phy_interface_t interface)
+ {
+ struct foo_priv *priv = container_of(config, struct foo_priv,
+ phylink_config);
+
+ if ( /* 'interface' needs a PCS to function */ )
+ return priv->pcs;
+
+ return NULL;
+ }
+
+ See :c:func:`mvpp2_select_pcs` for an example of a driver that has multiple
+ internal PCS.
+
+13. Fill-in all the :c:type:`phy_interface_t <phy_interface_t>` (i.e. all MAC to
+ PHY link modes) that your MAC can output. The following example shows a
+ configuration for a MAC that can handle all RGMII modes, SGMII and 1000BaseX.
+ You must adjust these according to what your MAC and all PCS associated
+ with this MAC are capable of, and not just the interface you wish to use:
+
+ .. code-block:: c
+
+ phy_interface_set_rgmii(priv->phylink_config.supported_interfaces);
+ __set_bit(PHY_INTERFACE_MODE_SGMII,
+ priv->phylink_config.supported_interfaces);
+ __set_bit(PHY_INTERFACE_MODE_1000BASEX,
+ priv->phylink_config.supported_interfaces);
+
+14. Remove calls to of_parse_phandle() for the PHY,
+ of_phy_register_fixed_link() for fixed links etc. from the probe
+ function, and replace with:
+
+ .. code-block:: c
+
+ struct phylink *phylink;
+
phylink = phylink_create(&priv->phylink_config, node, phy_mode, &phylink_ops);
if (IS_ERR(phylink)) {
err = PTR_ERR(phylink);
@@ -249,14 +369,14 @@ this documentation.
priv->phylink = phylink;
- and arrange to destroy the phylink in the probe failure path as
- appropriate and the removal path too by calling:
+ and arrange to destroy the phylink in the probe failure path as
+ appropriate and the removal path too by calling:
- .. code-block:: c
+ .. code-block:: c
phylink_destroy(priv->phylink);
-10. Arrange for MAC link state interrupts to be forwarded into
+15. Arrange for MAC link state interrupts to be forwarded into
phylink, via:
.. code-block:: c
@@ -264,17 +384,16 @@ this documentation.
phylink_mac_change(priv->phylink, link_is_up);
where ``link_is_up`` is true if the link is currently up or false
- otherwise. If a MAC is unable to provide these interrupts, then
- it should set ``priv->phylink_config.pcs_poll = true;`` in step 9.
+ otherwise.
-11. Verify that the driver does not call::
+16. Verify that the driver does not call::
netif_carrier_on()
netif_carrier_off()
- as these will interfere with phylink's tracking of the link state,
- and cause phylink to omit calls via the :c:func:`mac_link_up` and
- :c:func:`mac_link_down` methods.
+ as these will interfere with phylink's tracking of the link state,
+ and cause phylink to omit calls via the :c:func:`mac_link_up` and
+ :c:func:`mac_link_down` methods.
Network drivers should call phylink_stop() and phylink_start() via their
suspend/resume paths, which ensures that the appropriate
diff --git a/Documentation/networking/statistics.rst b/Documentation/networking/statistics.rst
index 551b3cc29a4132..75e017dfa8251d 100644
--- a/Documentation/networking/statistics.rst
+++ b/Documentation/networking/statistics.rst
@@ -41,6 +41,15 @@ If `-s` is specified once the detailed errors won't be shown.
`ip` supports JSON formatting via the `-j` option.
+Queue statistics
+~~~~~~~~~~~~~~~~
+
+Queue statistics are accessible via the netdev netlink family.
+
+Currently no widely distributed CLI exists to access those statistics.
+Kernel development tools (ynl) can be used to experiment with them,
+see `Documentation/userspace-api/netlink/intro-specs.rst`.
+
Protocol-specific statistics
----------------------------
@@ -147,6 +156,12 @@ Statistics are reported both in the responses to link information
requests (`RTM_GETLINK`) and statistic requests (`RTM_GETSTATS`,
when `IFLA_STATS_LINK_64` bit is set in the `.filter_mask` of the request).
+netdev (netlink)
+~~~~~~~~~~~~~~~~
+
+`netdev` generic netlink family allows accessing page pool and per queue
+statistics.
+
ethtool
-------
diff --git a/Documentation/networking/xfrm_device.rst b/Documentation/networking/xfrm_device.rst
index 535077cbeb07df..bfea9d8579ede0 100644
--- a/Documentation/networking/xfrm_device.rst
+++ b/Documentation/networking/xfrm_device.rst
@@ -71,9 +71,9 @@ Callbacks to implement
bool (*xdo_dev_offload_ok) (struct sk_buff *skb,
struct xfrm_state *x);
void (*xdo_dev_state_advance_esn) (struct xfrm_state *x);
+ void (*xdo_dev_state_update_stats) (struct xfrm_state *x);
/* Solely packet offload callbacks */
- void (*xdo_dev_state_update_curlft) (struct xfrm_state *x);
int (*xdo_dev_policy_add) (struct xfrm_policy *x, struct netlink_ext_ack *extack);
void (*xdo_dev_policy_delete) (struct xfrm_policy *x);
void (*xdo_dev_policy_free) (struct xfrm_policy *x);
@@ -191,6 +191,6 @@ xdo_dev_policy_free() on any remaining offloaded states.
Outcome of HW handling packets, the XFRM core can't count hard, soft limits.
The HW/driver are responsible to perform it and provide accurate data when
-xdo_dev_state_update_curlft() is called. In case of one of these limits
+xdo_dev_state_update_stats() is called. In case of one of these limits
occuried, the driver needs to call to xfrm_state_check_expire() to make sure
that XFRM performs rekeying sequence.
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 3731ecf1e4370d..c472423412bf27 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -310,6 +310,7 @@ Code Seq# Include File Comments
0x89 0B-DF linux/sockios.h
0x89 E0-EF linux/sockios.h SIOCPROTOPRIVATE range
0x89 F0-FF linux/sockios.h SIOCDEVPRIVATE range
+0x8A 00-1F linux/eventpoll.h
0x8B all linux/wireless.h
0x8C 00-3F WiNRADiO driver
<http://www.winradio.com.au/>
diff --git a/Documentation/userspace-api/netlink/netlink-raw.rst b/Documentation/userspace-api/netlink/netlink-raw.rst
index 1e14f5f22b8e0f..1990eea772d081 100644
--- a/Documentation/userspace-api/netlink/netlink-raw.rst
+++ b/Documentation/userspace-api/netlink/netlink-raw.rst
@@ -150,3 +150,45 @@ attributes from an ``attribute-set``. For example the following
Note that a selector attribute must appear in a netlink message before any
sub-message attributes that depend on it.
+
+If an attribute such as ``kind`` is defined at more than one nest level, then a
+sub-message selector will be resolved using the value 'closest' to the selector.
+For example, if the same attribute name is defined in a nested ``attribute-set``
+alongside a sub-message selector and also in a top level ``attribute-set``, then
+the selector will be resolved using the value 'closest' to the selector. If the
+value is not present in the message at the same level as defined in the spec
+then this is an error.
+
+Nested struct definitions
+-------------------------
+
+Many raw netlink families such as :doc:`tc<../../networking/netlink_spec/tc>`
+make use of nested struct definitions. The ``netlink-raw`` schema makes it
+possible to embed a struct within a struct definition using the ``struct``
+property. For example, the following struct definition embeds the
+``tc-ratespec`` struct definition for both the ``rate`` and the ``peakrate``
+members of ``struct tc-tbf-qopt``.
+
+.. code-block:: yaml
+
+ -
+ name: tc-tbf-qopt
+ type: struct
+ members:
+ -
+ name: rate
+ type: binary
+ struct: tc-ratespec
+ -
+ name: peakrate
+ type: binary
+ struct: tc-ratespec
+ -
+ name: limit
+ type: u32
+ -
+ name: buffer
+ type: u32
+ -
+ name: mtu
+ type: u32