From 81777efbf59305fa145bede97dd4abdc35540578 Mon Sep 17 00:00:00 2001
From: Dave Thaler
Date: Mon, 8 Jan 2024 13:42:31 -0800
Subject: Introduce concept of conformance groups

The discussion of what the actual conformance groups should be
is still in progress, so this is just part 1 which only uses
"legacy" for deprecated instructions and "basic" for everything
else. Subsequent patches will add more groups as discussion
continues.

Signed-off-by: Dave Thaler
Acked-by: David Vernet
Link: https://lore.kernel.org/r/20240108214231.5280-1-dthaler1968@gmail.com
Signed-off-by: Alexei Starovoitov
---
 .../bpf/standardization/instruction-set.rst | 26 +++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
index 245b6defc298c2..eb0f234a800144 100644
--- a/Documentation/bpf/standardization/instruction-set.rst
+++ b/Documentation/bpf/standardization/instruction-set.rst
@@ -97,6 +97,28 @@ Definitions
     A:          10000110
     B: 11111111 10000110
 
+Conformance groups
+------------------
+
+An implementation does not need to support all instructions specified in this
+document (e.g., deprecated instructions). Instead, a number of conformance
+groups are specified. An implementation must support the "basic" conformance
+group and may support additional conformance groups, where supporting a
+conformance group means it must support all instructions in that conformance
+group.
+
+The use of named conformance groups enables interoperability between a runtime
+that executes instructions, and tools such as compilers that generate
+instructions for the runtime. Thus, capability discovery in terms of
+conformance groups might be done manually by users or automatically by tools.
+
+Each conformance group has a short ASCII label (e.g., "basic") that
+corresponds to a set of instructions that are mandatory. That is, each
+instruction has one or more conformance groups of which it is a member.
+
+The "basic" conformance group includes all instructions defined in this
+specification unless otherwise noted.
+
 Instruction encoding
 ====================
 
@@ -610,4 +632,6 @@ Legacy BPF Packet access instructions
 
 BPF previously introduced special instructions for access to packet data that were
 carried over from classic BPF. However, these instructions are
-deprecated and should no longer be used.
+deprecated and should no longer be used. All legacy packet access
+instructions belong to the "legacy" conformance group instead of the "basic"
+conformance group.
-- 
cgit 1.2.3-korg

From 88031b929c01fe3686d34a848c413c2e51e6a7c8 Mon Sep 17 00:00:00 2001
From: Yonghong Song
Date: Wed, 10 Jan 2024 21:21:36 -0800
Subject: docs/bpf: Fix an incorrect statement in verifier.rst

In verifier.rst, I found an incorrect statement (maybe a typo) in section
'Liveness marks tracking'. Basically, the wrong register is attributed
to have a read mark. This may confuse the user.

Signed-off-by: Yonghong Song
Acked-by: Eduard Zingerman
Link: https://lore.kernel.org/r/20240111052136.3440417-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov
---
 Documentation/bpf/verifier.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/bpf/verifier.rst b/Documentation/bpf/verifier.rst
index f0ec19db301c69..356894399fbf83 100644
--- a/Documentation/bpf/verifier.rst
+++ b/Documentation/bpf/verifier.rst
@@ -562,7 +562,7 @@ works::
   * ``checkpoint[0].r1`` is marked as read;
 
 * At instruction #5 exit is reached and ``checkpoint[0]`` can now be processed
-  by ``clean_live_states()``. After this processing ``checkpoint[0].r0`` has a
+  by ``clean_live_states()``. After this processing ``checkpoint[0].r1`` has a
   read mark and all other registers and stack slots are marked as ``NOT_INIT``
   or ``STACK_INVALID``
 
-- 
cgit 1.2.3-korg

From 20e109ea9842158a153b24ef42ec5cc3d44e9485 Mon Sep 17 00:00:00 2001
From: Dave Thaler
Date: Thu, 18 Jan 2024 15:29:54 -0800
Subject: bpf, docs: Clarify that MOVSX is only for BPF_X not BPF_K

Per discussion on the mailing list at
https://mailarchive.ietf.org/arch/msg/bpf/uQiqhURdtxV_ZQOTgjCdm-seh74/
the MOVSX operation is only defined to support register extension.

The document didn't previously state this and incorrectly implied
that one could use an immediate value.

Signed-off-by: Dave Thaler
Acked-by: David Vernet
Acked-by: Yonghong Song
Link: https://lore.kernel.org/r/20240118232954.27206-1-dthaler1968@gmail.com
Signed-off-by: Alexei Starovoitov
---
 Documentation/bpf/standardization/instruction-set.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
index eb0f234a800144..d17a96c6254fd7 100644
--- a/Documentation/bpf/standardization/instruction-set.rst
+++ b/Documentation/bpf/standardization/instruction-set.rst
@@ -317,7 +317,8 @@ The ``BPF_MOVSX`` instruction does a move operation with sign extension.
 ``BPF_ALU | BPF_MOVSX`` :term:`sign extends` 8-bit and 16-bit operands into 32
 bit operands, and zeroes the remaining upper 32 bits.
 ``BPF_ALU64 | BPF_MOVSX`` :term:`sign extends` 8-bit, 16-bit, and 32-bit
-operands into 64 bit operands.
+operands into 64 bit operands. Unlike other arithmetic instructions,
+``BPF_MOVSX`` is only defined for register source operands (``BPF_X``).
 
 Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31)
 for 32-bit operations.
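
Reviewer's illustration, not part of the patch: the two rules quoted in the hunk above (``BPF_MOVSX`` sign extension and the shift masks) can be modelled with a short Python sketch. The helper names (``sign_extend``, ``movsx32``, ``movsx64``, ``lsh32``, ``lsh64``) are hypothetical and do not come from the kernel or the specification. Registers are represented as plain Python integers, and the mask is assumed to apply to the shift amount.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def movsx32(src: int, bits: int) -> int:
    """Model of BPF_ALU | BPF_MOVSX: 8/16-bit source, 32-bit result, upper 32 bits zero."""
    assert bits in (8, 16)
    return sign_extend(src, bits) & MASK32


def movsx64(src: int, bits: int) -> int:
    """Model of BPF_ALU64 | BPF_MOVSX: 8/16/32-bit source, 64-bit result."""
    assert bits in (8, 16, 32)
    return sign_extend(src, bits) & MASK64


def lsh64(dst: int, amount: int) -> int:
    """64-bit left shift; the shift amount is masked with 0x3F (assumed reading)."""
    return (dst << (amount & 0x3F)) & MASK64


def lsh32(dst: int, amount: int) -> int:
    """32-bit left shift; the shift amount is masked with 0x1F (assumed reading)."""
    return (dst << (amount & 0x1F)) & MASK32
```

In this model a shift by 65 on a 64-bit operand behaves like a shift by 1, and the source is always a register value, which matches the ``BPF_X``-only restriction the patch adds for ``BPF_MOVSX``.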
-- 
cgit 1.2.3-korg

From e48f0f4a9bfed8947e4d1123e8b6a15c18ee1708 Mon Sep 17 00:00:00 2001
From: Dave Thaler
Date: Thu, 25 Jan 2024 20:00:50 -0800
Subject: bpf, docs: Clarify definitions of various instructions

Clarify definitions of several instructions:

* BPF_NEG does not support BPF_X
* BPF_CALL does not support BPF_JMP32 or BPF_X
* BPF_EXIT does not support BPF_X
* BPF_JA does not support BPF_X (was implied but not explicitly stated)

Also fix a typo in the wide instruction figure where the field is
actually named "opcode" not "code".

Signed-off-by: Dave Thaler
Signed-off-by: Daniel Borkmann
Acked-by: Yonghong Song
Link: https://lore.kernel.org/bpf/20240126040050.8464-1-dthaler1968@gmail.com
---
 .../bpf/standardization/instruction-set.rst | 51 ++++++++++++----------
 1 file changed, 27 insertions(+), 24 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
index d17a96c6254fd7..af43227b6ee49b 100644
--- a/Documentation/bpf/standardization/instruction-set.rst
+++ b/Documentation/bpf/standardization/instruction-set.rst
@@ -174,12 +174,12 @@ and imm containing the high 32 bits of the immediate value.
 This is depicted in the following figure::
 
         basic_instruction
-  .-----------------------------.
-  |                             |
-  code:8 regs:8 offset:16 imm:32 unused:32 imm:32
-                                 |              |
-                                 '--------------'
-                                pseudo instruction
+  .------------------------------.
+  |                              |
+  opcode:8 regs:8 offset:16 imm:32 unused:32 imm:32
+                                   |              |
+                                   '--------------'
+                                  pseudo instruction
 
 Thus the 64-bit immediate value is constructed as follows:
 
@@ -320,6 +320,9 @@ bit operands, and zeroes the remaining upper 32 bits.
 operands into 64 bit operands. Unlike other arithmetic instructions,
 ``BPF_MOVSX`` is only defined for register source operands (``BPF_X``).
 
+The ``BPF_NEG`` instruction is only defined when the source bit is clear
+(``BPF_K``).
+
 Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31)
 for 32-bit operations.
 
@@ -375,27 +378,27 @@ Jump instructions
 otherwise identical operations.
 The 'code' field encodes the operation as below:
 
-========  =====  ===  ===========================================  =========================================
-code      value  src  description                                  notes
-========  =====  ===  ===========================================  =========================================
-BPF_JA    0x0    0x0  PC += offset                                 BPF_JMP class
-BPF_JA    0x0    0x0  PC += imm                                    BPF_JMP32 class
+========  =====  ===  ===============================  =============================================
+code      value  src  description                      notes
+========  =====  ===  ===============================  =============================================
+BPF_JA    0x0    0x0  PC += offset                     BPF_JMP | BPF_K only
+BPF_JA    0x0    0x0  PC += imm                        BPF_JMP32 | BPF_K only
 BPF_JEQ   0x1    any  PC += offset if dst == src
-BPF_JGT   0x2    any  PC += offset if dst > src                    unsigned
-BPF_JGE   0x3    any  PC += offset if dst >= src                   unsigned
+BPF_JGT   0x2    any  PC += offset if dst > src        unsigned
+BPF_JGE   0x3    any  PC += offset if dst >= src       unsigned
 BPF_JSET  0x4    any  PC += offset if dst & src
 BPF_JNE   0x5    any  PC += offset if dst != src
-BPF_JSGT  0x6    any  PC += offset if dst > src                    signed
-BPF_JSGE  0x7    any  PC += offset if dst >= src                   signed
-BPF_CALL  0x8    0x0  call helper function by address              see `Helper functions`_
-BPF_CALL  0x8    0x1  call PC += imm                               see `Program-local functions`_
-BPF_CALL  0x8    0x2  call helper function by BTF ID               see `Helper functions`_
-BPF_EXIT  0x9    0x0  return                                       BPF_JMP only
-BPF_JLT   0xa    any  PC += offset if dst < src                    unsigned
-BPF_JLE   0xb    any  PC += offset if dst <= src                   unsigned
-BPF_JSLT  0xc    any  PC += offset if dst < src                    signed
-BPF_JSLE  0xd    any  PC += offset if dst <= src                   signed
-========  =====  ===  ===========================================  =========================================
+BPF_JSGT  0x6    any  PC += offset if dst > src        signed
+BPF_JSGE  0x7    any  PC += offset if dst >= src       signed
+BPF_CALL  0x8    0x0  call helper function by address  BPF_JMP | BPF_K only, see `Helper functions`_
+BPF_CALL  0x8    0x1  call PC += imm                   BPF_JMP | BPF_K only, see `Program-local functions`_
+BPF_CALL  0x8    0x2  call helper function by BTF ID   BPF_JMP | BPF_K only, see `Helper functions`_
+BPF_EXIT  0x9    0x0  return                           BPF_JMP | BPF_K only
+BPF_JLT   0xa    any  PC += offset if dst < src        unsigned
+BPF_JLE   0xb    any  PC += offset if dst <= src       unsigned
+BPF_JSLT  0xc    any  PC += offset if dst < src        signed
+BPF_JSLE  0xd    any  PC += offset if dst <= src       signed
+========  =====  ===  ===============================  =============================================
 
 The BPF program needs to store the return value into register R0 before doing a
 ``BPF_EXIT``.
-- 
cgit 1.2.3-korg

From c94d1783136eb66f2a464a6891a32eeb55eaeacc Mon Sep 17 00:00:00 2001
From: Christian Marangi
Date: Thu, 25 Jan 2024 21:36:57 +0100
Subject: dt-bindings: net: phy: Make LED active-low property common

Move LED active-low property to common.yaml. This property is currently
defined multiple times by bcm LEDs. This property will now be supported
in a generic way for PHY LEDs with the use of a generic function.

With active-low bool property not defined, active-high is always
assumed.

Signed-off-by: Christian Marangi
Reviewed-by: Andrew Lunn
Acked-by: Lee Jones
Reviewed-by: Rob Herring
Link: https://lore.kernel.org/r/20240125203702.4552-2-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski
---
 Documentation/devicetree/bindings/leds/common.yaml | 6 ++++++
 Documentation/devicetree/bindings/leds/leds-bcm63138.yaml | 4 ----
 Documentation/devicetree/bindings/leds/leds-bcm6328.yaml | 4 ----
 Documentation/devicetree/bindings/leds/leds-bcm6358.txt | 2 --
 Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml | 4 ----
 Documentation/devicetree/bindings/leds/leds-pwm.yaml | 5 -----
 6 files changed, 6 insertions(+), 19 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/devicetree/bindings/leds/common.yaml b/Documentation/devicetree/bindings/leds/common.yaml
index 55a8d1385e2104..5633e0aa6bdfdc 100644
--- a/Documentation/devicetree/bindings/leds/common.yaml
+++ b/Documentation/devicetree/bindings/leds/common.yaml
@@ -200,6 +200,12 @@ properties:
       #trigger-source-cells property in the source node.
     $ref: /schemas/types.yaml#/definitions/phandle-array
 
+  active-low:
+    type: boolean
+    description:
+      Makes LED active low. To turn the LED ON, line needs to be
+      set to low voltage instead of high.
+
   # Required properties for flash LED child nodes:
   flash-max-microamp:
     description:
diff --git a/Documentation/devicetree/bindings/leds/leds-bcm63138.yaml b/Documentation/devicetree/bindings/leds/leds-bcm63138.yaml
index 52252fb6bb321d..bb20394fca5c38 100644
--- a/Documentation/devicetree/bindings/leds/leds-bcm63138.yaml
+++ b/Documentation/devicetree/bindings/leds/leds-bcm63138.yaml
@@ -52,10 +52,6 @@ patternProperties:
         maxItems: 1
         description: LED pin number
 
-      active-low:
-        type: boolean
-        description: Makes LED active low
-
     required:
       - reg
 
diff --git a/Documentation/devicetree/bindings/leds/leds-bcm6328.yaml b/Documentation/devicetree/bindings/leds/leds-bcm6328.yaml
index 51cc0d82c12eb8..f3a3ef99292995 100644
--- a/Documentation/devicetree/bindings/leds/leds-bcm6328.yaml
+++ b/Documentation/devicetree/bindings/leds/leds-bcm6328.yaml
@@ -78,10 +78,6 @@ patternProperties:
           - maximum: 23
         description: LED pin number (only LEDs 0 to 23 are valid).
 
-      active-low:
-        type: boolean
-        description: Makes LED active low.
-
       brcm,hardware-controlled:
         type: boolean
         description: Makes this LED hardware controlled.
diff --git a/Documentation/devicetree/bindings/leds/leds-bcm6358.txt b/Documentation/devicetree/bindings/leds/leds-bcm6358.txt
index 6e51c6b91ee54c..211ffc3c4a2012 100644
--- a/Documentation/devicetree/bindings/leds/leds-bcm6358.txt
+++ b/Documentation/devicetree/bindings/leds/leds-bcm6358.txt
@@ -25,8 +25,6 @@ LED sub-node required properties:
 
 LED sub-node optional properties:
   - label : see Documentation/devicetree/bindings/leds/common.txt
-  - active-low : Boolean, makes LED active low.
-    Default : false
   - default-state : see
     Documentation/devicetree/bindings/leds/common.txt
   - linux,default-trigger : see
diff --git a/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml b/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml
index bd6ec04a87277f..5edfbe347341cd 100644
--- a/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml
+++ b/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml
@@ -41,10 +41,6 @@ properties:
 
           pwm-names: true
 
-          active-low:
-            description: For PWMs where the LED is wired to supply rather than ground.
-            type: boolean
-
           color: true
 
         required:
diff --git a/Documentation/devicetree/bindings/leds/leds-pwm.yaml b/Documentation/devicetree/bindings/leds/leds-pwm.yaml
index 7de6da58be3c53..113b7c218303ad 100644
--- a/Documentation/devicetree/bindings/leds/leds-pwm.yaml
+++ b/Documentation/devicetree/bindings/leds/leds-pwm.yaml
@@ -34,11 +34,6 @@ patternProperties:
           Maximum brightness possible for the LED
         $ref: /schemas/types.yaml#/definitions/uint32
 
-      active-low:
-        description:
-          For PWMs where the LED is wired to supply rather than ground.
-        type: boolean
-
     required:
       - pwms
       - max-brightness
-- 
cgit 1.2.3-korg

From 355c6dc37efa7fe6a64d155254cec8e180e5e6cb Mon Sep 17 00:00:00 2001
From: Christian Marangi
Date: Thu, 25 Jan 2024 21:36:58 +0100
Subject: dt-bindings: net: phy: Document LED inactive high impedance mode

Document LED inactive high impedance mode to set the LED to require high
impedance configuration to be turned OFF.

Signed-off-by: Christian Marangi
Reviewed-by: Andrew Lunn
Acked-by: Lee Jones
Reviewed-by: Rob Herring
Link: https://lore.kernel.org/r/20240125203702.4552-3-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski
---
 Documentation/devicetree/bindings/leds/common.yaml | 6 ++++++
 1 file changed, 6 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/devicetree/bindings/leds/common.yaml b/Documentation/devicetree/bindings/leds/common.yaml
index 5633e0aa6bdfdc..8a3c2398b10ce0 100644
--- a/Documentation/devicetree/bindings/leds/common.yaml
+++ b/Documentation/devicetree/bindings/leds/common.yaml
@@ -206,6 +206,12 @@ properties:
       Makes LED active low. To turn the LED ON, line needs to be
       set to low voltage instead of high.
 
+  inactive-high-impedance:
+    type: boolean
+    description:
+      Set LED to high-impedance mode to turn the LED OFF. LED might also
+      describe this mode as tristate.
+
   # Required properties for flash LED child nodes:
   flash-max-microamp:
     description:
-- 
cgit 1.2.3-korg

From 91e893b43d1c8e8b6f4ba0737b597091423024f3 Mon Sep 17 00:00:00 2001
From: Christian Marangi
Date: Thu, 25 Jan 2024 21:37:00 +0100
Subject: dt-bindings: net: Document QCA808x PHYs

Add Documentation for QCA808x PHYs for the additional LED configuration
for this PHY.

Signed-off-by: Christian Marangi
Reviewed-by: Rob Herring
Link: https://lore.kernel.org/r/20240125203702.4552-5-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski
---
 .../devicetree/bindings/net/qca,qca808x.yaml | 54 ++++++++++++++++++++++
 1 file changed, 54 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/qca,qca808x.yaml

(limited to 'Documentation')

diff --git a/Documentation/devicetree/bindings/net/qca,qca808x.yaml b/Documentation/devicetree/bindings/net/qca,qca808x.yaml
new file mode 100644
index 00000000000000..e2552655902a38
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/qca,qca808x.yaml
@@ -0,0 +1,54 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/qca,qca808x.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm Atheros QCA808X PHY
+
+maintainers:
+  - Christian Marangi
+
+description:
+  QCA808X PHYs can have up to 3 LEDs attached.
+  All 3 LEDs are disabled by default.
+  2 LEDs have dedicated pins with the 3rd LED having the
+  double function of Interrupt LEDs/GPIO or additional LED.
+
+  By default this special PIN is set to LED function.
+
+allOf:
+  - $ref: ethernet-phy.yaml#
+
+properties:
+  compatible:
+    enum:
+      - ethernet-phy-id004d.d101
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    #include
+
+    mdio {
+        #address-cells = <1>;
+        #size-cells = <0>;
+
+        ethernet-phy@0 {
+            compatible = "ethernet-phy-id004d.d101";
+            reg = <0>;
+
+            leds {
+                #address-cells = <1>;
+                #size-cells = <0>;
+
+                led@0 {
+                    reg = <0>;
+                    color = ;
+                    function = LED_FUNCTION_WAN;
+                    default-state = "keep";
+                };
+            };
+        };
+    };
-- 
cgit 1.2.3-korg

From 6f83b62283edc295be1cfa18dd49d4f278575118 Mon Sep 17 00:00:00 2001
From: William Tu
Date: Wed, 24 Jan 2024 20:00:41 -0800
Subject: Documentation: mlx5.rst: Add note for eswitch MD

Add a note when using esw_port_metadata. The parameter has runtime
mode but setting it does not take effect immediately.
Setting it must happen in legacy mode, and the port metadata takes effect
when the switchdev mode is enabled.

Disable eswitch port metadata::
  $ devlink dev param set pci/0000:06:00.0 name esw_port_metadata value \
    false cmode runtime
Change eswitch mode to switchdev mode after choosing the metadata value::
  $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev

Note that other mlx5 devlink runtime parameters, esw_multiport and
flow_steering_mode, do not have this limitation.

Signed-off-by: William Tu
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller
---
 Documentation/networking/devlink/mlx5.rst | 4 ++++
 1 file changed, 4 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index 702f204a3dbd35..b9587b3400b903 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -97,6 +97,10 @@ parameters.
 
        When metadata is disabled, the above use cases will fail to initialize if
        users try to enable them.
+
+       Note: Setting this parameter does not take effect immediately. Setting
+       must happen in legacy mode and eswitch port metadata takes effect after
+       enabling switchdev mode.
    * - ``hairpin_num_queues``
      - u32
      - driverinit
-- 
cgit 1.2.3-korg

From 9e1aa985d61eacd5931496b80fbd1c2d2cdeece5 Mon Sep 17 00:00:00 2001
From: Tobias Schramm
Date: Thu, 25 Jan 2024 21:15:05 +0100
Subject: dt-bindings: nfc: ti,trf7970a: fix usage example

The TRF7970A is a SPI device, not I2C.

Signed-off-by: Tobias Schramm
Reviewed-by: Krzysztof Kozlowski
Signed-off-by: David S. Miller
---
 Documentation/devicetree/bindings/net/nfc/ti,trf7970a.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/devicetree/bindings/net/nfc/ti,trf7970a.yaml b/Documentation/devicetree/bindings/net/nfc/ti,trf7970a.yaml
index 9cc236ec42f232..d0332eb76ad263 100644
--- a/Documentation/devicetree/bindings/net/nfc/ti,trf7970a.yaml
+++ b/Documentation/devicetree/bindings/net/nfc/ti,trf7970a.yaml
@@ -73,7 +73,7 @@ examples:
     #include
     #include
 
-    i2c {
+    spi {
         #address-cells = <1>;
         #size-cells = <0>;
 
-- 
cgit 1.2.3-korg

From ced33f2cfa21a14a292a00e31dc9f85c1bfbda1c Mon Sep 17 00:00:00 2001
From: Yonghong Song
Date: Sat, 27 Jan 2024 11:46:29 -0800
Subject: docs/bpf: Improve documentation of 64-bit immediate instructions

For 64-bit immediate instruction, 'BPF_IMM | BPF_DW | BPF_LD' and
src_reg=[0-6], the current documentation describes the 64-bit
immediate is constructed by:

  imm64 = (next_imm << 32) | imm

But actually imm64 is only used when src_reg=0. For all other
variants (src_reg != 0), 'imm' and 'next_imm' have separate special
encoding requirement and imm64 cannot be easily used to describe
instruction semantics.

This patch clarifies that 64-bit immediate instructions use
two 32-bit immediate values instead of a 64-bit immediate value,
so later describing individual 64-bit immediate instructions
becomes less confusing.
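
Reviewer's illustration, not part of the commit: the distinction drawn above, two separate 32-bit immediates that are only combined into one 64-bit value when src_reg=0, can be sketched in Python. The function names (``encode_wide_insn``, ``decode_imms``, ``imm64``) are hypothetical, and the little-endian byte order and the single ``regs`` byte are assumptions made for the sketch. Only the field layout (opcode:8 regs:8 offset:16 imm:32, then unused:32 imm:32) and the opcode 0x18 come from the quoted documentation.

```python
import struct

# One 8-byte slot: opcode:8 regs:8 offset:16 imm:32 (little-endian assumed here).
SLOT = "<BBhI"
OPCODE_LDDW = 0x18  # BPF_IMM | BPF_DW | BPF_LD


def encode_wide_insn(regs: int, imm: int, next_imm: int) -> bytes:
    """Pack the basic instruction followed by the pseudo instruction."""
    basic = struct.pack(SLOT, OPCODE_LDDW, regs, 0, imm & 0xFFFFFFFF)
    # In the pseudo instruction everything except imm is cleared to zero.
    pseudo = struct.pack(SLOT, 0, 0, 0, next_imm & 0xFFFFFFFF)
    return basic + pseudo


def decode_imms(insn: bytes) -> tuple[int, int]:
    """Return (imm, next_imm) as two independent 32-bit values."""
    imm = struct.unpack_from(SLOT, insn, 0)[3]
    next_imm = struct.unpack_from(SLOT, insn, 8)[3]
    return imm, next_imm


def imm64(insn: bytes) -> int:
    """Combine the two immediates; only meaningful for the src_reg=0 variant."""
    imm, next_imm = decode_imms(insn)
    return (next_imm << 32) | imm
```

For the other src_reg variants a decoder would stop at ``decode_imms`` and interpret ``imm`` and ``next_imm`` separately, for example as a map fd and an offset.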

Signed-off-by: Yonghong Song
Signed-off-by: Daniel Borkmann
Acked-by: Dave Thaler
Link: https://lore.kernel.org/bpf/20240127194629.737589-1-yonghong.song@linux.dev
---
 Documentation/bpf/standardization/instruction-set.rst | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
index af43227b6ee49b..fceacca4629961 100644
--- a/Documentation/bpf/standardization/instruction-set.rst
+++ b/Documentation/bpf/standardization/instruction-set.rst
@@ -166,7 +166,7 @@ Note that most instructions do not use all of the fields.
 Unused fields shall be cleared to zero.
 
 As discussed below in `64-bit immediate instructions`_, a 64-bit immediate
-instruction uses a 64-bit immediate value that is constructed as follows.
+instruction uses two 32-bit immediate values that are constructed as follows.
 The 64 bits following the basic instruction contain a pseudo instruction
 using the same format but with opcode, dst_reg, src_reg, and offset all set to zero,
 and imm containing the high 32 bits of the immediate value.
@@ -181,13 +181,8 @@ This is depicted in the following figure::
                                    '--------------'
                                   pseudo instruction
 
-Thus the 64-bit immediate value is constructed as follows:
-
-  imm64 = (next_imm << 32) | imm
-
-where 'next_imm' refers to the imm value of the pseudo instruction
-following the basic instruction. The unused bytes in the pseudo
-instruction are reserved and shall be cleared to zero.
+Here, the imm value of the pseudo instruction is called 'next_imm'. The unused
+bytes in the pseudo instruction are reserved and shall be cleared to zero.
 
 Instruction classes
 -------------------
@@ -590,7 +585,7 @@ defined further below:
 =========================  ======  ===  =========================================  ===========  ==============
 opcode construction        opcode  src  pseudocode                                 imm type     dst type
 =========================  ======  ===  =========================================  ===========  ==============
-BPF_IMM | BPF_DW | BPF_LD  0x18    0x0  dst = imm64                                integer      integer
+BPF_IMM | BPF_DW | BPF_LD  0x18    0x0  dst = (next_imm << 32) | imm               integer      integer
 BPF_IMM | BPF_DW | BPF_LD  0x18    0x1  dst = map_by_fd(imm)                       map fd       map
 BPF_IMM | BPF_DW | BPF_LD  0x18    0x2  dst = map_val(map_by_fd(imm)) + next_imm   map fd       data pointer
 BPF_IMM | BPF_DW | BPF_LD  0x18    0x3  dst = var_addr(imm)                        variable id  data pointer
-- 
cgit 1.2.3-korg

From 53e41b76a8ff27f4969e3816c0ce3a1af8156091 Mon Sep 17 00:00:00 2001
From: Cristian Ciocaltea
Date: Fri, 26 Jan 2024 21:21:25 +0200
Subject: dt-bindings: net: starfive,jh7110-dwmac: Add JH7100 SoC compatible

The Synopsys DesignWare MAC found on StarFive JH7100 SoC is mostly
similar to the newer JH7110, but it requires only two interrupts and a
single reset line, which is 'ahb' instead of the commonly used
'stmmaceth'.

Since the common binding 'snps,dwmac' allows selecting 'ahb' only in
conjunction with 'stmmaceth', extend the logic to also permit exclusive
usage of the 'ahb' reset name. This ensures the following use cases are
supported:

  JH7110: reset-names = "stmmaceth", "ahb";
  JH7100: reset-names = "ahb";
  other:  reset-names = "stmmaceth";

Also note the need to use a different dwmac fallback, as v5.20 applies
to JH7110 only, while JH7100 relies on v3.7x.

Additionally, drop the reset description items from top-level binding as
they are already provided by the included snps,dwmac schema.

Signed-off-by: Cristian Ciocaltea
Reviewed-by: Jacob Keller
Reviewed-by: Rob Herring
Reviewed-by: Krzysztof Kozlowski
Signed-off-by: David S. Miller
---
 .../devicetree/bindings/net/snps,dwmac.yaml | 11 ++--
 .../bindings/net/starfive,jh7110-dwmac.yaml | 72 +++++++++++++++-------
 2 files changed, 57 insertions(+), 26 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
index 5c2769dc689af7..90c4db178c676c 100644
--- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml
+++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
@@ -95,6 +95,7 @@ properties:
         - snps,dwmac-5.20
         - snps,dwxgmac
         - snps,dwxgmac-2.10
+        - starfive,jh7100-dwmac
         - starfive,jh7110-dwmac
 
   reg:
@@ -144,10 +145,12 @@ properties:
       - description: AHB reset
 
   reset-names:
-    minItems: 1
-    items:
-      - const: stmmaceth
-      - const: ahb
+    oneOf:
+      - items:
+          - enum: [stmmaceth, ahb]
+      - items:
+          - const: stmmaceth
+          - const: ahb
 
   power-domains:
     maxItems: 1
diff --git a/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml b/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
index 5e7cfbbebce6cc..0d1962980f57f5 100644
--- a/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
+++ b/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
@@ -16,16 +16,20 @@ select:
     compatible:
       contains:
         enum:
+          - starfive,jh7100-dwmac
           - starfive,jh7110-dwmac
   required:
     - compatible
 
 properties:
   compatible:
-    items:
-      - enum:
-          - starfive,jh7110-dwmac
-      - const: snps,dwmac-5.20
+    oneOf:
+      - items:
+          - const: starfive,jh7100-dwmac
+          - const: snps,dwmac
+      - items:
+          - const: starfive,jh7110-dwmac
+          - const: snps,dwmac-5.20
 
   reg:
     maxItems: 1
@@ -46,24 +50,6 @@ properties:
       - const: tx
       - const: gtx
 
-  interrupts:
-    minItems: 3
-    maxItems: 3
-
-  interrupt-names:
-    minItems: 3
-    maxItems: 3
-
-  resets:
-    items:
-      - description: MAC Reset signal.
-      - description: AHB Reset signal.
-
-  reset-names:
-    items:
-      - const: stmmaceth
-      - const: ahb
-
   starfive,tx-use-rgmii-clk:
     description:
       Tx clock is provided by external rgmii clock.
@@ -94,6 +80,48 @@ required:
 
 allOf:
   - $ref: snps,dwmac.yaml#
+  - if:
+      properties:
+        compatible:
+          contains:
+            const: starfive,jh7100-dwmac
+    then:
+      properties:
+        interrupts:
+          minItems: 2
+          maxItems: 2
+
+        interrupt-names:
+          minItems: 2
+          maxItems: 2
+
+        resets:
+          maxItems: 1
+
+        reset-names:
+          const: ahb
+
+  - if:
+      properties:
+        compatible:
+          contains:
+            const: starfive,jh7110-dwmac
+    then:
+      properties:
+        interrupts:
+          minItems: 3
+          maxItems: 3
+
+        interrupt-names:
+          minItems: 3
+          maxItems: 3
+
+        resets:
+          minItems: 2
+
+        reset-names:
+          minItems: 2
+
 unevaluatedProperties: false
 
 examples:
-- 
cgit 1.2.3-korg

From 2a0683be5b4c9829e8335e494a21d1148e832822 Mon Sep 17 00:00:00 2001
From: Benjamin Poirier
Date: Fri, 26 Jan 2024 18:21:18 -0500
Subject: selftests: Introduce Makefile variable to list shared bash scripts

Some tests written in bash source other files in a parent directory. For
example, drivers/net/bonding/dev_addr_lists.sh sources
net/forwarding/lib.sh. If a subset of tests is exported and run outside the
source tree (for example by using `make -C tools/testing/selftests gen_tar
TARGETS="drivers/net/bonding"`), these other files must be made available
as well.

Commit ae108c48b5d2 ("selftests: net: Fix cross-tree inclusion of scripts")
addressed this problem by symlinking and copying the sourced files but this
only works for direct dependencies. Commit 25ae948b4478 ("selftests/net:
add lib.sh") changed net/forwarding/lib.sh to source net/lib.sh. As a
result, that latter file must be included as well when the former is
exported. This was not handled and was reverted in commit 2114e83381d3
("selftests: forwarding: Avoid failures to source net/lib.sh").
In order to allow reinstating the inclusion of net/lib.sh from
net/forwarding/lib.sh, add a mechanism to list dependent files in a new
Makefile variable and export them. This allows sourcing those files using
the same expression whether tests are run in-tree or exported.

Dependencies are not resolved recursively so transitive dependencies must
be listed in TEST_INCLUDES. For example, if net/forwarding/lib.sh sources
net/lib.sh, the Makefile related to a test that sources
net/forwarding/lib.sh from a parent directory must list:

TEST_INCLUDES := \
	../../../net/forwarding/lib.sh \
	../../../net/lib.sh

v2:
Fix rst syntax in Documentation/dev-tools/kselftest.rst (Jakub Kicinski)

v1 (from RFC):
* changed TEST_INCLUDES to take relative paths, like other TEST_* variables
  (Vladimir Oltean)
* preserved common "$(MAKE) OUTPUT=... -C ... target" ordering in Makefile
  (Petr Machata)

Signed-off-by: Benjamin Poirier
Signed-off-by: David S. Miller
---
 Documentation/dev-tools/kselftest.rst | 12 ++++++++++++
 tools/testing/selftests/Makefile | 7 ++++++-
 tools/testing/selftests/lib.mk | 19 +++++++++++++++++++
 3 files changed, 37 insertions(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
index ab376b316c36d6..522214c7b43ba2 100644
--- a/Documentation/dev-tools/kselftest.rst
+++ b/Documentation/dev-tools/kselftest.rst
@@ -255,9 +255,21 @@ Contributing new tests (details)
 
    TEST_PROGS_EXTENDED, TEST_GEN_PROGS_EXTENDED mean it is the
    executable which is not tested by default.
+
    TEST_FILES, TEST_GEN_FILES mean it is the file which is used by
    test.
 
+   TEST_INCLUDES is similar to TEST_FILES, it lists files which should be
+   included when exporting or installing the tests, with the following
+   differences:
+
+    * symlinks to files in other directories are preserved
+    * the part of paths below tools/testing/selftests/ is preserved when
+      copying the files to the output directory
+
+   TEST_INCLUDES is meant to list dependencies located in other directories of
+   the selftests hierarchy.
+
  * First use the headers inside the kernel source and/or git repo, and then the
    system headers. Headers for the kernel release as opposed to headers
    installed by the distro on the system should be the primary focus to be able
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 15b6a111c3beaa..082db6b68060d0 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -191,6 +191,8 @@ run_tests: all
 	@for TARGET in $(TARGETS); do \
 		BUILD_TARGET=$$BUILD/$$TARGET;	\
 		$(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET run_tests \
+				SRC_PATH=$(shell readlink -e $$(pwd)) \
+				OBJ_PATH=$(BUILD) \
 				O=$(abs_objtree);		    \
 	done;
 
@@ -241,7 +243,10 @@ ifdef INSTALL_PATH
 	@ret=1;	\
 	for TARGET in $(TARGETS); do \
 		BUILD_TARGET=$$BUILD/$$TARGET;	\
-		$(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET INSTALL_PATH=$(INSTALL_PATH)/$$TARGET install \
+		$(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET install \
+				INSTALL_PATH=$(INSTALL_PATH)/$$TARGET \
+				SRC_PATH=$(shell readlink -e $$(pwd)) \
+				OBJ_PATH=$(INSTALL_PATH) \
 				O=$(abs_objtree)		\
 				$(if $(FORCE_TARGETS),|| exit);	\
 		ret=$$((ret * $$?));		\
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index aa646e0661f36c..087fee22dd5312 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -69,11 +69,29 @@ define RUN_TESTS
 	run_many $(1)
 endef
 
+define INSTALL_INCLUDES
+	$(if $(TEST_INCLUDES), \
+		relative_files=""; \
+		for entry in $(TEST_INCLUDES); do \
+			entry_dir=$$(readlink -e "$$(dirname "$$entry")"); \
entry_name=$$(basename "$$entry"); \ + relative_dir=$${entry_dir#"$$SRC_PATH"/}; \ + if [ "$$relative_dir" = "$$entry_dir" ]; then \ + echo "Error: TEST_INCLUDES entry \"$$entry\" not located inside selftests directory ($$SRC_PATH)" >&2; \ + exit 1; \ + fi; \ + relative_files="$$relative_files $$relative_dir/$$entry_name"; \ + done; \ + cd $(SRC_PATH) && rsync -aR $$relative_files $(OBJ_PATH)/ \ + ) +endef + run_tests: all ifdef building_out_of_srctree @if [ "X$(TEST_PROGS)$(TEST_PROGS_EXTENDED)$(TEST_FILES)" != "X" ]; then \ rsync -aq --copy-unsafe-links $(TEST_PROGS) $(TEST_PROGS_EXTENDED) $(TEST_FILES) $(OUTPUT); \ fi + @$(INSTALL_INCLUDES) @if [ "X$(TEST_PROGS)" != "X" ]; then \ $(call RUN_TESTS, $(TEST_GEN_PROGS) $(TEST_CUSTOM_PROGS) \ $(addprefix $(OUTPUT)/,$(TEST_PROGS))) ; \ @@ -103,6 +121,7 @@ endef install: all ifdef INSTALL_PATH $(INSTALL_RULE) + $(INSTALL_INCLUDES) else $(error Error: set INSTALL_PATH to use install) endif -- cgit 1.2.3-korg From 6f3189f38a3e995232e028a4c341164c4aca1b20 Mon Sep 17 00:00:00 2001 From: Daniel Xu Date: Sun, 28 Jan 2024 18:24:08 -0700 Subject: bpf: treewide: Annotate BPF kfuncs in BTF This commit marks kfuncs as such inside the .BTF_ids section. The upshot of these annotations is that we'll be able to automatically generate kfunc prototypes for downstream users. The process is as follows: 1. In source, use BTF_KFUNCS_START/END macro pair to mark kfuncs 2. During build, pahole injects into BTF a "bpf_kfunc" BTF_DECL_TAG for each function inside BTF_KFUNCS sets 3. At runtime, vmlinux or module BTF is made available in sysfs 4. At runtime, bpftool (or similar) can look at provided BTF and generate appropriate prototypes for functions with "bpf_kfunc" tag To ensure future kfunc are similarly tagged, we now also return error inside kfunc registration for untagged kfuncs. For vmlinux kfuncs, we also WARN(), as initcall machinery does not handle errors. 
Signed-off-by: Daniel Xu Acked-by: Benjamin Tissoires Link: https://lore.kernel.org/r/e55150ceecbf0a5d961e608941165c0bee7bc943.1706491398.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov --- Documentation/bpf/kfuncs.rst | 8 ++++---- drivers/hid/bpf/hid_bpf_dispatch.c | 8 ++++---- fs/verity/measure.c | 4 ++-- kernel/bpf/btf.c | 8 ++++++++ kernel/bpf/cpumask.c | 4 ++-- kernel/bpf/helpers.c | 8 ++++---- kernel/bpf/map_iter.c | 4 ++-- kernel/cgroup/rstat.c | 4 ++-- kernel/trace/bpf_trace.c | 8 ++++---- net/bpf/test_run.c | 8 ++++---- net/core/filter.c | 20 ++++++++++---------- net/core/xdp.c | 4 ++-- net/ipv4/bpf_tcp_ca.c | 4 ++-- net/ipv4/fou_bpf.c | 4 ++-- net/ipv4/tcp_bbr.c | 4 ++-- net/ipv4/tcp_cubic.c | 4 ++-- net/ipv4/tcp_dctcp.c | 4 ++-- net/netfilter/nf_conntrack_bpf.c | 4 ++-- net/netfilter/nf_nat_bpf.c | 4 ++-- net/xfrm/xfrm_interface_bpf.c | 4 ++-- net/xfrm/xfrm_state_bpf.c | 4 ++-- .../testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 8 ++++---- 22 files changed, 70 insertions(+), 62 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst index 7985c6615f3c2f..a8f5782bd83318 100644 --- a/Documentation/bpf/kfuncs.rst +++ b/Documentation/bpf/kfuncs.rst @@ -177,10 +177,10 @@ In addition to kfuncs' arguments, verifier may need more information about the type of kfunc(s) being registered with the BPF subsystem. To do so, we define flags on a set of kfuncs as follows:: - BTF_SET8_START(bpf_task_set) + BTF_KFUNCS_START(bpf_task_set) BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE) - BTF_SET8_END(bpf_task_set) + BTF_KFUNCS_END(bpf_task_set) This set encodes the BTF ID of each kfunc listed above, and encodes the flags along with it. Ofcourse, it is also allowed to specify no flags. @@ -347,10 +347,10 @@ Once the kfunc is prepared for use, the final step to making it visible is registering it with the BPF subsystem. 
Registration is done per BPF program type. An example is shown below:: - BTF_SET8_START(bpf_task_set) + BTF_KFUNCS_START(bpf_task_set) BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE) - BTF_SET8_END(bpf_task_set) + BTF_KFUNCS_END(bpf_task_set) static const struct btf_kfunc_id_set bpf_task_kfunc_set = { .owner = THIS_MODULE, diff --git a/drivers/hid/bpf/hid_bpf_dispatch.c b/drivers/hid/bpf/hid_bpf_dispatch.c index d9ef45fcaeab13..02c441aaa21751 100644 --- a/drivers/hid/bpf/hid_bpf_dispatch.c +++ b/drivers/hid/bpf/hid_bpf_dispatch.c @@ -172,9 +172,9 @@ hid_bpf_get_data(struct hid_bpf_ctx *ctx, unsigned int offset, const size_t rdwr * The following set contains all functions we agree BPF programs * can use. */ -BTF_SET8_START(hid_bpf_kfunc_ids) +BTF_KFUNCS_START(hid_bpf_kfunc_ids) BTF_ID_FLAGS(func, hid_bpf_get_data, KF_RET_NULL) -BTF_SET8_END(hid_bpf_kfunc_ids) +BTF_KFUNCS_END(hid_bpf_kfunc_ids) static const struct btf_kfunc_id_set hid_bpf_kfunc_set = { .owner = THIS_MODULE, @@ -440,12 +440,12 @@ static const struct btf_kfunc_id_set hid_bpf_fmodret_set = { }; /* for syscall HID-BPF */ -BTF_SET8_START(hid_bpf_syscall_kfunc_ids) +BTF_KFUNCS_START(hid_bpf_syscall_kfunc_ids) BTF_ID_FLAGS(func, hid_bpf_attach_prog) BTF_ID_FLAGS(func, hid_bpf_allocate_context, KF_ACQUIRE | KF_RET_NULL) BTF_ID_FLAGS(func, hid_bpf_release_context, KF_RELEASE) BTF_ID_FLAGS(func, hid_bpf_hw_request) -BTF_SET8_END(hid_bpf_syscall_kfunc_ids) +BTF_KFUNCS_END(hid_bpf_syscall_kfunc_ids) static const struct btf_kfunc_id_set hid_bpf_syscall_kfunc_set = { .owner = THIS_MODULE, diff --git a/fs/verity/measure.c b/fs/verity/measure.c index bf7a5f4cccaf04..3969d54158d128 100644 --- a/fs/verity/measure.c +++ b/fs/verity/measure.c @@ -159,9 +159,9 @@ __bpf_kfunc int bpf_get_fsverity_digest(struct file *file, struct bpf_dynptr_ker __bpf_kfunc_end_defs(); -BTF_SET8_START(fsverity_set_ids) +BTF_KFUNCS_START(fsverity_set_ids) BTF_ID_FLAGS(func, 
bpf_get_fsverity_digest, KF_TRUSTED_ARGS) -BTF_SET8_END(fsverity_set_ids) +BTF_KFUNCS_END(fsverity_set_ids) static int bpf_get_fsverity_digest_filter(const struct bpf_prog *prog, u32 kfunc_id) { diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index c8c6e6cf18e7f4..ef380e5469521b 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -8124,6 +8124,14 @@ int register_btf_kfunc_id_set(enum bpf_prog_type prog_type, { enum btf_kfunc_hook hook; + /* All kfuncs need to be tagged as such in BTF. + * WARN() for initcall registrations that do not check errors. + */ + if (!(kset->set->flags & BTF_SET8_KFUNCS)) { + WARN_ON(!kset->owner); + return -EINVAL; + } + hook = bpf_prog_type_to_kfunc_hook(prog_type); return __register_btf_kfunc_id_set(hook, kset); } diff --git a/kernel/bpf/cpumask.c b/kernel/bpf/cpumask.c index 2e73533a3811cd..dad0fb1c8e876f 100644 --- a/kernel/bpf/cpumask.c +++ b/kernel/bpf/cpumask.c @@ -424,7 +424,7 @@ __bpf_kfunc u32 bpf_cpumask_weight(const struct cpumask *cpumask) __bpf_kfunc_end_defs(); -BTF_SET8_START(cpumask_kfunc_btf_ids) +BTF_KFUNCS_START(cpumask_kfunc_btf_ids) BTF_ID_FLAGS(func, bpf_cpumask_create, KF_ACQUIRE | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_cpumask_release, KF_RELEASE) BTF_ID_FLAGS(func, bpf_cpumask_acquire, KF_ACQUIRE | KF_TRUSTED_ARGS) @@ -450,7 +450,7 @@ BTF_ID_FLAGS(func, bpf_cpumask_copy, KF_RCU) BTF_ID_FLAGS(func, bpf_cpumask_any_distribute, KF_RCU) BTF_ID_FLAGS(func, bpf_cpumask_any_and_distribute, KF_RCU) BTF_ID_FLAGS(func, bpf_cpumask_weight, KF_RCU) -BTF_SET8_END(cpumask_kfunc_btf_ids) +BTF_KFUNCS_END(cpumask_kfunc_btf_ids) static const struct btf_kfunc_id_set cpumask_kfunc_set = { .owner = THIS_MODULE, diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index bcb951a2ecf4b9..4db1c658254c17 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -2544,7 +2544,7 @@ __bpf_kfunc void bpf_throw(u64 cookie) __bpf_kfunc_end_defs(); -BTF_SET8_START(generic_btf_ids) +BTF_KFUNCS_START(generic_btf_ids) #ifdef 
CONFIG_KEXEC_CORE BTF_ID_FLAGS(func, crash_kexec, KF_DESTRUCTIVE) #endif @@ -2573,7 +2573,7 @@ BTF_ID_FLAGS(func, bpf_task_get_cgroup1, KF_ACQUIRE | KF_RCU | KF_RET_NULL) #endif BTF_ID_FLAGS(func, bpf_task_from_pid, KF_ACQUIRE | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_throw) -BTF_SET8_END(generic_btf_ids) +BTF_KFUNCS_END(generic_btf_ids) static const struct btf_kfunc_id_set generic_kfunc_set = { .owner = THIS_MODULE, @@ -2589,7 +2589,7 @@ BTF_ID(struct, cgroup) BTF_ID(func, bpf_cgroup_release_dtor) #endif -BTF_SET8_START(common_btf_ids) +BTF_KFUNCS_START(common_btf_ids) BTF_ID_FLAGS(func, bpf_cast_to_kern_ctx) BTF_ID_FLAGS(func, bpf_rdonly_cast) BTF_ID_FLAGS(func, bpf_rcu_read_lock) @@ -2618,7 +2618,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null) BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) BTF_ID_FLAGS(func, bpf_dynptr_size) BTF_ID_FLAGS(func, bpf_dynptr_clone) -BTF_SET8_END(common_btf_ids) +BTF_KFUNCS_END(common_btf_ids) static const struct btf_kfunc_id_set common_kfunc_set = { .owner = THIS_MODULE, diff --git a/kernel/bpf/map_iter.c b/kernel/bpf/map_iter.c index 6abd7c5df4b39e..9575314f40a692 100644 --- a/kernel/bpf/map_iter.c +++ b/kernel/bpf/map_iter.c @@ -213,9 +213,9 @@ __bpf_kfunc s64 bpf_map_sum_elem_count(const struct bpf_map *map) __bpf_kfunc_end_defs(); -BTF_SET8_START(bpf_map_iter_kfunc_ids) +BTF_KFUNCS_START(bpf_map_iter_kfunc_ids) BTF_ID_FLAGS(func, bpf_map_sum_elem_count, KF_TRUSTED_ARGS) -BTF_SET8_END(bpf_map_iter_kfunc_ids) +BTF_KFUNCS_END(bpf_map_iter_kfunc_ids) static const struct btf_kfunc_id_set bpf_map_iter_kfunc_set = { .owner = THIS_MODULE, diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c index a8350d2d63e6b1..07e2284bb49971 100644 --- a/kernel/cgroup/rstat.c +++ b/kernel/cgroup/rstat.c @@ -562,10 +562,10 @@ void cgroup_base_stat_cputime_show(struct seq_file *seq) } /* Add bpf kfuncs for cgroup_rstat_updated() and cgroup_rstat_flush() */ -BTF_SET8_START(bpf_rstat_kfunc_ids) +BTF_KFUNCS_START(bpf_rstat_kfunc_ids) BTF_ID_FLAGS(func, 
cgroup_rstat_updated) BTF_ID_FLAGS(func, cgroup_rstat_flush, KF_SLEEPABLE) -BTF_SET8_END(bpf_rstat_kfunc_ids) +BTF_KFUNCS_END(bpf_rstat_kfunc_ids) static const struct btf_kfunc_id_set bpf_rstat_kfunc_set = { .owner = THIS_MODULE, diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 64fdaf79d11365..241ddf5e38953e 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -1412,14 +1412,14 @@ __bpf_kfunc int bpf_verify_pkcs7_signature(struct bpf_dynptr_kern *data_ptr, __bpf_kfunc_end_defs(); -BTF_SET8_START(key_sig_kfunc_set) +BTF_KFUNCS_START(key_sig_kfunc_set) BTF_ID_FLAGS(func, bpf_lookup_user_key, KF_ACQUIRE | KF_RET_NULL | KF_SLEEPABLE) BTF_ID_FLAGS(func, bpf_lookup_system_key, KF_ACQUIRE | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_key_put, KF_RELEASE) #ifdef CONFIG_SYSTEM_DATA_VERIFICATION BTF_ID_FLAGS(func, bpf_verify_pkcs7_signature, KF_SLEEPABLE) #endif -BTF_SET8_END(key_sig_kfunc_set) +BTF_KFUNCS_END(key_sig_kfunc_set) static const struct btf_kfunc_id_set bpf_key_sig_kfunc_set = { .owner = THIS_MODULE, @@ -1475,9 +1475,9 @@ __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str, __bpf_kfunc_end_defs(); -BTF_SET8_START(fs_kfunc_set_ids) +BTF_KFUNCS_START(fs_kfunc_set_ids) BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS) -BTF_SET8_END(fs_kfunc_set_ids) +BTF_KFUNCS_END(fs_kfunc_set_ids) static int bpf_get_file_xattr_filter(const struct bpf_prog *prog, u32 kfunc_id) { diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index dfd91937401783..5535f9adc6589d 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -617,21 +617,21 @@ CFI_NOSEAL(bpf_kfunc_call_memb_release_dtor); __bpf_kfunc_end_defs(); -BTF_SET8_START(bpf_test_modify_return_ids) +BTF_KFUNCS_START(bpf_test_modify_return_ids) BTF_ID_FLAGS(func, bpf_modify_return_test) BTF_ID_FLAGS(func, bpf_modify_return_test2) BTF_ID_FLAGS(func, bpf_fentry_test1, KF_SLEEPABLE) -BTF_SET8_END(bpf_test_modify_return_ids) 
+BTF_KFUNCS_END(bpf_test_modify_return_ids) static const struct btf_kfunc_id_set bpf_test_modify_return_set = { .owner = THIS_MODULE, .set = &bpf_test_modify_return_ids, }; -BTF_SET8_START(test_sk_check_kfunc_ids) +BTF_KFUNCS_START(test_sk_check_kfunc_ids) BTF_ID_FLAGS(func, bpf_kfunc_call_test_release, KF_RELEASE) BTF_ID_FLAGS(func, bpf_kfunc_call_memb_release, KF_RELEASE) -BTF_SET8_END(test_sk_check_kfunc_ids) +BTF_KFUNCS_END(test_sk_check_kfunc_ids) static void *bpf_test_init(const union bpf_attr *kattr, u32 user_size, u32 size, u32 headroom, u32 tailroom) diff --git a/net/core/filter.c b/net/core/filter.c index 358870408a51e6..524adf1fa6d019 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -11982,21 +11982,21 @@ int bpf_dynptr_from_skb_rdonly(struct sk_buff *skb, u64 flags, return 0; } -BTF_SET8_START(bpf_kfunc_check_set_skb) +BTF_KFUNCS_START(bpf_kfunc_check_set_skb) BTF_ID_FLAGS(func, bpf_dynptr_from_skb) -BTF_SET8_END(bpf_kfunc_check_set_skb) +BTF_KFUNCS_END(bpf_kfunc_check_set_skb) -BTF_SET8_START(bpf_kfunc_check_set_xdp) +BTF_KFUNCS_START(bpf_kfunc_check_set_xdp) BTF_ID_FLAGS(func, bpf_dynptr_from_xdp) -BTF_SET8_END(bpf_kfunc_check_set_xdp) +BTF_KFUNCS_END(bpf_kfunc_check_set_xdp) -BTF_SET8_START(bpf_kfunc_check_set_sock_addr) +BTF_KFUNCS_START(bpf_kfunc_check_set_sock_addr) BTF_ID_FLAGS(func, bpf_sock_addr_set_sun_path) -BTF_SET8_END(bpf_kfunc_check_set_sock_addr) +BTF_KFUNCS_END(bpf_kfunc_check_set_sock_addr) -BTF_SET8_START(bpf_kfunc_check_set_tcp_reqsk) +BTF_KFUNCS_START(bpf_kfunc_check_set_tcp_reqsk) BTF_ID_FLAGS(func, bpf_sk_assign_tcp_reqsk, KF_TRUSTED_ARGS) -BTF_SET8_END(bpf_kfunc_check_set_tcp_reqsk) +BTF_KFUNCS_END(bpf_kfunc_check_set_tcp_reqsk) static const struct btf_kfunc_id_set bpf_kfunc_set_skb = { .owner = THIS_MODULE, @@ -12075,9 +12075,9 @@ __bpf_kfunc int bpf_sock_destroy(struct sock_common *sock) __bpf_kfunc_end_defs(); -BTF_SET8_START(bpf_sk_iter_kfunc_ids) +BTF_KFUNCS_START(bpf_sk_iter_kfunc_ids) BTF_ID_FLAGS(func, 
bpf_sock_destroy, KF_TRUSTED_ARGS) -BTF_SET8_END(bpf_sk_iter_kfunc_ids) +BTF_KFUNCS_END(bpf_sk_iter_kfunc_ids) static int tracing_iter_filter(const struct bpf_prog *prog, u32 kfunc_id) { diff --git a/net/core/xdp.c b/net/core/xdp.c index 4869c1c2d8f3d9..034fb80f3fbe9b 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -771,11 +771,11 @@ __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx, __bpf_kfunc_end_defs(); -BTF_SET8_START(xdp_metadata_kfunc_ids) +BTF_KFUNCS_START(xdp_metadata_kfunc_ids) #define XDP_METADATA_KFUNC(_, __, name, ___) BTF_ID_FLAGS(func, name, KF_TRUSTED_ARGS) XDP_METADATA_KFUNC_xxx #undef XDP_METADATA_KFUNC -BTF_SET8_END(xdp_metadata_kfunc_ids) +BTF_KFUNCS_END(xdp_metadata_kfunc_ids) static const struct btf_kfunc_id_set xdp_metadata_kfunc_set = { .owner = THIS_MODULE, diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c index 834edc18463ac1..7f518ea5f4ac7d 100644 --- a/net/ipv4/bpf_tcp_ca.c +++ b/net/ipv4/bpf_tcp_ca.c @@ -201,13 +201,13 @@ bpf_tcp_ca_get_func_proto(enum bpf_func_id func_id, } } -BTF_SET8_START(bpf_tcp_ca_check_kfunc_ids) +BTF_KFUNCS_START(bpf_tcp_ca_check_kfunc_ids) BTF_ID_FLAGS(func, tcp_reno_ssthresh) BTF_ID_FLAGS(func, tcp_reno_cong_avoid) BTF_ID_FLAGS(func, tcp_reno_undo_cwnd) BTF_ID_FLAGS(func, tcp_slow_start) BTF_ID_FLAGS(func, tcp_cong_avoid_ai) -BTF_SET8_END(bpf_tcp_ca_check_kfunc_ids) +BTF_KFUNCS_END(bpf_tcp_ca_check_kfunc_ids) static const struct btf_kfunc_id_set bpf_tcp_ca_kfunc_set = { .owner = THIS_MODULE, diff --git a/net/ipv4/fou_bpf.c b/net/ipv4/fou_bpf.c index 4da03bf45c9b75..06e5572f296f1e 100644 --- a/net/ipv4/fou_bpf.c +++ b/net/ipv4/fou_bpf.c @@ -100,10 +100,10 @@ __bpf_kfunc int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, __bpf_kfunc_end_defs(); -BTF_SET8_START(fou_kfunc_set) +BTF_KFUNCS_START(fou_kfunc_set) BTF_ID_FLAGS(func, bpf_skb_set_fou_encap) BTF_ID_FLAGS(func, bpf_skb_get_fou_encap) -BTF_SET8_END(fou_kfunc_set) +BTF_KFUNCS_END(fou_kfunc_set) static const struct 
btf_kfunc_id_set fou_bpf_kfunc_set = { .owner = THIS_MODULE, diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c index 22358032dd484b..05dc2d05bc7cbb 100644 --- a/net/ipv4/tcp_bbr.c +++ b/net/ipv4/tcp_bbr.c @@ -1155,7 +1155,7 @@ static struct tcp_congestion_ops tcp_bbr_cong_ops __read_mostly = { .set_state = bbr_set_state, }; -BTF_SET8_START(tcp_bbr_check_kfunc_ids) +BTF_KFUNCS_START(tcp_bbr_check_kfunc_ids) #ifdef CONFIG_X86 #ifdef CONFIG_DYNAMIC_FTRACE BTF_ID_FLAGS(func, bbr_init) @@ -1168,7 +1168,7 @@ BTF_ID_FLAGS(func, bbr_min_tso_segs) BTF_ID_FLAGS(func, bbr_set_state) #endif #endif -BTF_SET8_END(tcp_bbr_check_kfunc_ids) +BTF_KFUNCS_END(tcp_bbr_check_kfunc_ids) static const struct btf_kfunc_id_set tcp_bbr_kfunc_set = { .owner = THIS_MODULE, diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c index 0fd78ecb67e756..44869ea089e346 100644 --- a/net/ipv4/tcp_cubic.c +++ b/net/ipv4/tcp_cubic.c @@ -485,7 +485,7 @@ static struct tcp_congestion_ops cubictcp __read_mostly = { .name = "cubic", }; -BTF_SET8_START(tcp_cubic_check_kfunc_ids) +BTF_KFUNCS_START(tcp_cubic_check_kfunc_ids) #ifdef CONFIG_X86 #ifdef CONFIG_DYNAMIC_FTRACE BTF_ID_FLAGS(func, cubictcp_init) @@ -496,7 +496,7 @@ BTF_ID_FLAGS(func, cubictcp_cwnd_event) BTF_ID_FLAGS(func, cubictcp_acked) #endif #endif -BTF_SET8_END(tcp_cubic_check_kfunc_ids) +BTF_KFUNCS_END(tcp_cubic_check_kfunc_ids) static const struct btf_kfunc_id_set tcp_cubic_kfunc_set = { .owner = THIS_MODULE, diff --git a/net/ipv4/tcp_dctcp.c b/net/ipv4/tcp_dctcp.c index bb23bb5b387a0c..e33fbe4933e42f 100644 --- a/net/ipv4/tcp_dctcp.c +++ b/net/ipv4/tcp_dctcp.c @@ -260,7 +260,7 @@ static struct tcp_congestion_ops dctcp_reno __read_mostly = { .name = "dctcp-reno", }; -BTF_SET8_START(tcp_dctcp_check_kfunc_ids) +BTF_KFUNCS_START(tcp_dctcp_check_kfunc_ids) #ifdef CONFIG_X86 #ifdef CONFIG_DYNAMIC_FTRACE BTF_ID_FLAGS(func, dctcp_init) @@ -271,7 +271,7 @@ BTF_ID_FLAGS(func, dctcp_cwnd_undo) BTF_ID_FLAGS(func, dctcp_state) #endif #endif 
-BTF_SET8_END(tcp_dctcp_check_kfunc_ids) +BTF_KFUNCS_END(tcp_dctcp_check_kfunc_ids) static const struct btf_kfunc_id_set tcp_dctcp_kfunc_set = { .owner = THIS_MODULE, diff --git a/net/netfilter/nf_conntrack_bpf.c b/net/netfilter/nf_conntrack_bpf.c index 475358ec821296..d2492d050fe601 100644 --- a/net/netfilter/nf_conntrack_bpf.c +++ b/net/netfilter/nf_conntrack_bpf.c @@ -467,7 +467,7 @@ __bpf_kfunc int bpf_ct_change_status(struct nf_conn *nfct, u32 status) __bpf_kfunc_end_defs(); -BTF_SET8_START(nf_ct_kfunc_set) +BTF_KFUNCS_START(nf_ct_kfunc_set) BTF_ID_FLAGS(func, bpf_xdp_ct_alloc, KF_ACQUIRE | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_xdp_ct_lookup, KF_ACQUIRE | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_skb_ct_alloc, KF_ACQUIRE | KF_RET_NULL) @@ -478,7 +478,7 @@ BTF_ID_FLAGS(func, bpf_ct_set_timeout, KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_ct_change_timeout, KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_ct_set_status, KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_ct_change_status, KF_TRUSTED_ARGS) -BTF_SET8_END(nf_ct_kfunc_set) +BTF_KFUNCS_END(nf_ct_kfunc_set) static const struct btf_kfunc_id_set nf_conntrack_kfunc_set = { .owner = THIS_MODULE, diff --git a/net/netfilter/nf_nat_bpf.c b/net/netfilter/nf_nat_bpf.c index 6e3b2f58855fc0..481be15609b16a 100644 --- a/net/netfilter/nf_nat_bpf.c +++ b/net/netfilter/nf_nat_bpf.c @@ -54,9 +54,9 @@ __bpf_kfunc int bpf_ct_set_nat_info(struct nf_conn___init *nfct, __bpf_kfunc_end_defs(); -BTF_SET8_START(nf_nat_kfunc_set) +BTF_KFUNCS_START(nf_nat_kfunc_set) BTF_ID_FLAGS(func, bpf_ct_set_nat_info, KF_TRUSTED_ARGS) -BTF_SET8_END(nf_nat_kfunc_set) +BTF_KFUNCS_END(nf_nat_kfunc_set) static const struct btf_kfunc_id_set nf_bpf_nat_kfunc_set = { .owner = THIS_MODULE, diff --git a/net/xfrm/xfrm_interface_bpf.c b/net/xfrm/xfrm_interface_bpf.c index 7d5e920141e9b7..5ea15037ebd104 100644 --- a/net/xfrm/xfrm_interface_bpf.c +++ b/net/xfrm/xfrm_interface_bpf.c @@ -93,10 +93,10 @@ __bpf_kfunc int bpf_skb_set_xfrm_info(struct __sk_buff *skb_ctx, const struct bp 
__bpf_kfunc_end_defs(); -BTF_SET8_START(xfrm_ifc_kfunc_set) +BTF_KFUNCS_START(xfrm_ifc_kfunc_set) BTF_ID_FLAGS(func, bpf_skb_get_xfrm_info) BTF_ID_FLAGS(func, bpf_skb_set_xfrm_info) -BTF_SET8_END(xfrm_ifc_kfunc_set) +BTF_KFUNCS_END(xfrm_ifc_kfunc_set) static const struct btf_kfunc_id_set xfrm_interface_kfunc_set = { .owner = THIS_MODULE, diff --git a/net/xfrm/xfrm_state_bpf.c b/net/xfrm/xfrm_state_bpf.c index 9e20d4a377f7eb..2248eda741f8e0 100644 --- a/net/xfrm/xfrm_state_bpf.c +++ b/net/xfrm/xfrm_state_bpf.c @@ -117,10 +117,10 @@ __bpf_kfunc void bpf_xdp_xfrm_state_release(struct xfrm_state *x) __bpf_kfunc_end_defs(); -BTF_SET8_START(xfrm_state_kfunc_set) +BTF_KFUNCS_START(xfrm_state_kfunc_set) BTF_ID_FLAGS(func, bpf_xdp_get_xfrm_state, KF_RET_NULL | KF_ACQUIRE) BTF_ID_FLAGS(func, bpf_xdp_xfrm_state_release, KF_RELEASE) -BTF_SET8_END(xfrm_state_kfunc_set) +BTF_KFUNCS_END(xfrm_state_kfunc_set) static const struct btf_kfunc_id_set xfrm_state_xdp_kfunc_set = { .owner = THIS_MODULE, diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c index 6f163a0f1c94cc..4754c662b39ff8 100644 --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c @@ -343,12 +343,12 @@ static struct bin_attribute bin_attr_bpf_testmod_file __ro_after_init = { .write = bpf_testmod_test_write, }; -BTF_SET8_START(bpf_testmod_common_kfunc_ids) +BTF_KFUNCS_START(bpf_testmod_common_kfunc_ids) BTF_ID_FLAGS(func, bpf_iter_testmod_seq_new, KF_ITER_NEW) BTF_ID_FLAGS(func, bpf_iter_testmod_seq_next, KF_ITER_NEXT | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_iter_testmod_seq_destroy, KF_ITER_DESTROY) BTF_ID_FLAGS(func, bpf_kfunc_common_test) -BTF_SET8_END(bpf_testmod_common_kfunc_ids) +BTF_KFUNCS_END(bpf_testmod_common_kfunc_ids) static const struct btf_kfunc_id_set bpf_testmod_common_kfunc_set = { .owner = THIS_MODULE, @@ -494,7 +494,7 @@ __bpf_kfunc static u32 
bpf_kfunc_call_test_static_unused_arg(u32 arg, u32 unused return arg; } -BTF_SET8_START(bpf_testmod_check_kfunc_ids) +BTF_KFUNCS_START(bpf_testmod_check_kfunc_ids) BTF_ID_FLAGS(func, bpf_testmod_test_mod_kfunc) BTF_ID_FLAGS(func, bpf_kfunc_call_test1) BTF_ID_FLAGS(func, bpf_kfunc_call_test2) @@ -520,7 +520,7 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_test_ref, KF_TRUSTED_ARGS | KF_RCU) BTF_ID_FLAGS(func, bpf_kfunc_call_test_destructive, KF_DESTRUCTIVE) BTF_ID_FLAGS(func, bpf_kfunc_call_test_static_unused_arg) BTF_ID_FLAGS(func, bpf_kfunc_call_test_offset) -BTF_SET8_END(bpf_testmod_check_kfunc_ids) +BTF_KFUNCS_END(bpf_testmod_check_kfunc_ids) static int bpf_testmod_ops_init(struct btf *btf) { -- cgit 1.2.3-korg From 78d23416979500c749049d5d20bac457bcca2fb5 Mon Sep 17 00:00:00 2001 From: Donald Hunter Date: Mon, 29 Jan 2024 22:34:48 +0000 Subject: doc/netlink: Describe sub-message selector resolution Update the netlink-raw docs to add a description of sub-message selector resolution to explain that selector resolution is constrained by the spec. Signed-off-by: Donald Hunter Reviewed-by: Jiri Pirko Link: https://lore.kernel.org/r/20240129223458.52046-4-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski --- Documentation/userspace-api/netlink/netlink-raw.rst | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'Documentation') diff --git a/Documentation/userspace-api/netlink/netlink-raw.rst b/Documentation/userspace-api/netlink/netlink-raw.rst index 1e14f5f22b8e0f..32197f3cb40e5a 100644 --- a/Documentation/userspace-api/netlink/netlink-raw.rst +++ b/Documentation/userspace-api/netlink/netlink-raw.rst @@ -150,3 +150,11 @@ attributes from an ``attribute-set``. For example the following Note that a selector attribute must appear in a netlink message before any sub-message attributes that depend on it. + +If an attribute such as ``kind`` is defined at more than one nest level, then a +sub-message selector will be resolved using the value 'closest' to the selector. 
+For example, if the same attribute name is defined in a nested ``attribute-set`` +alongside a sub-message selector and also in a top level ``attribute-set``, then +the selector will be resolved using the value 'closest' to the selector. If the +value is not present in the message at the same level as defined in the spec +then this is an error. -- cgit 1.2.3-korg From bf08f32c8cedb12a23efcdc2c9584601d7030e16 Mon Sep 17 00:00:00 2001 From: Donald Hunter Date: Mon, 29 Jan 2024 22:34:55 +0000 Subject: tools/net/ynl: Add support for nested structs Make it possible for struct definitions to reference other struct definitions for binary members. For example, the tbf qdisc uses this struct definition for its parms attribute: - name: tc-tbf-qopt type: struct members: - name: rate type: binary struct: tc-ratespec - name: peakrate type: binary struct: tc-ratespec - name: limit type: u32 - name: buffer type: u32 - name: mtu type: u32 This adds the necessary schema changes and adds nested struct encoding and decoding to ynl. Signed-off-by: Donald Hunter Reviewed-by: Jiri Pirko Link: https://lore.kernel.org/r/20240129223458.52046-11-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/netlink-raw.yaml | 15 ++++++++++++--- tools/net/ynl/lib/nlspec.py | 2 ++ tools/net/ynl/lib/ynl.py | 26 ++++++++++++++++++++------ 3 files changed, 34 insertions(+), 9 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/netlink-raw.yaml b/Documentation/netlink/netlink-raw.yaml index 04b92f1a5cd6ef..ac4e05415f2f5e 100644 --- a/Documentation/netlink/netlink-raw.yaml +++ b/Documentation/netlink/netlink-raw.yaml @@ -152,14 +152,23 @@ properties: the right formatting mechanism when displaying values of this type. enum: [ hex, mac, fddi, ipv4, ipv6, uuid ] + struct: + description: Name of the nested struct type. 
+ type: string if: properties: type: - oneOf: - - const: binary - - const: pad + const: pad then: required: [ len ] + if: + properties: + type: + const: binary + then: + oneOf: + - required: [ len ] + - required: [ struct ] # End genetlink-legacy attribute-sets: diff --git a/tools/net/ynl/lib/nlspec.py b/tools/net/ynl/lib/nlspec.py index 44f13e383e8a94..5d197a12ab8dc3 100644 --- a/tools/net/ynl/lib/nlspec.py +++ b/tools/net/ynl/lib/nlspec.py @@ -248,6 +248,7 @@ class SpecStructMember(SpecElement): len integer, optional byte length of binary types display_hint string, hint to help choose format specifier when displaying the value + struct string, name of nested struct type """ def __init__(self, family, yaml): super().__init__(family, yaml) @@ -256,6 +257,7 @@ class SpecStructMember(SpecElement): self.enum = yaml.get('enum') self.len = yaml.get('len') self.display_hint = yaml.get('display-hint') + self.struct = yaml.get('struct') class SpecStruct(SpecElement): diff --git a/tools/net/ynl/lib/ynl.py b/tools/net/ynl/lib/ynl.py index 2b0ca61deaf8ec..0f4193cc2e3b5b 100644 --- a/tools/net/ynl/lib/ynl.py +++ b/tools/net/ynl/lib/ynl.py @@ -674,7 +674,10 @@ class YnlFamily(SpecFamily): size = 0 for m in members: if m.type in ['pad', 'binary']: - size += m.len + if m.struct: + size += self._struct_size(m.struct) + else: + size += m.len else: format = NlAttr.get_format(m.type, m.byte_order) size += format.size @@ -691,8 +694,14 @@ class YnlFamily(SpecFamily): if m.type == 'pad': offset += m.len elif m.type == 'binary': - value = data[offset : offset + m.len] - offset += m.len + if m.struct: + len = self._struct_size(m.struct) + value = self._decode_struct(data[offset : offset + len], + m.struct) + offset += len + else: + value = data[offset : offset + m.len] + offset += m.len else: format = NlAttr.get_format(m.type, m.byte_order) [ value ] = format.unpack_from(data, offset) @@ -713,10 +722,15 @@ class YnlFamily(SpecFamily): if m.type == 'pad': attr_payload += bytearray(m.len) 
elif m.type == 'binary': - if value is None: - attr_payload += bytearray(m.len) + if m.struct: + if value is None: + value = dict() + attr_payload += self._encode_struct(m.struct, value) else: - attr_payload += bytes.fromhex(value) + if value is None: + attr_payload += bytearray(m.len) + else: + attr_payload += bytes.fromhex(value) else: if value is None: value = 0 -- cgit 1.2.3-korg From 9d6429c33976fcce0d46124a9151314137687e0f Mon Sep 17 00:00:00 2001 From: Donald Hunter Date: Mon, 29 Jan 2024 22:34:56 +0000 Subject: doc/netlink: Describe nested structs in netlink raw docs Add a description and example of nested struct definitions to the netlink raw documentation. Signed-off-by: Donald Hunter Reviewed-by: Jiri Pirko Link: https://lore.kernel.org/r/20240129223458.52046-12-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski --- .../userspace-api/netlink/netlink-raw.rst | 34 ++++++++++++++++++++++ 1 file changed, 34 insertions(+) (limited to 'Documentation') diff --git a/Documentation/userspace-api/netlink/netlink-raw.rst b/Documentation/userspace-api/netlink/netlink-raw.rst index 32197f3cb40e5a..1990eea772d081 100644 --- a/Documentation/userspace-api/netlink/netlink-raw.rst +++ b/Documentation/userspace-api/netlink/netlink-raw.rst @@ -158,3 +158,37 @@ alongside a sub-message selector and also in a top level ``attribute-set``, then the selector will be resolved using the value 'closest' to the selector. If the value is not present in the message at the same level as defined in the spec then this is an error. + +Nested struct definitions +------------------------- + +Many raw netlink families such as :doc:`tc<../../networking/netlink_spec/tc>` +make use of nested struct definitions. The ``netlink-raw`` schema makes it +possible to embed a struct within a struct definition using the ``struct`` +property. 
For example, the following struct definition embeds the +``tc-ratespec`` struct definition for both the ``rate`` and the ``peakrate`` +members of ``struct tc-tbf-qopt``. + +.. code-block:: yaml + + - + name: tc-tbf-qopt + type: struct + members: + - + name: rate + type: binary + struct: tc-ratespec + - + name: peakrate + type: binary + struct: tc-ratespec + - + name: limit + type: u32 + - + name: buffer + type: u32 + - + name: mtu + type: u32 -- cgit 1.2.3-korg From 2267672a6190cfb0349e95a70e09dc6a973007c1 Mon Sep 17 00:00:00 2001 From: Donald Hunter Date: Mon, 29 Jan 2024 22:34:58 +0000 Subject: doc/netlink/specs: Update the tc spec Fill in many of the gaps in the tc netlink spec, including stats attrs, classes and actions. Many documentation strings have also been added. This is still a work in progress, albeit fairly complete: - there are still many attributes left as binary blobs. - actions have not had much testing Signed-off-by: Donald Hunter Reviewed-by: Jiri Pirko Link: https://lore.kernel.org/r/20240129223458.52046-14-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/tc.yaml | 2218 ++++++++++++++++++++++++++++++++--- 1 file changed, 2067 insertions(+), 151 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/tc.yaml b/Documentation/netlink/specs/tc.yaml index 4346fa402fc91d..4b21b00dbebe6c 100644 --- a/Documentation/netlink/specs/tc.yaml +++ b/Documentation/netlink/specs/tc.yaml @@ -48,21 +48,28 @@ definitions: - name: bytes type: u64 + doc: Number of enqueued bytes - name: packets type: u32 + doc: Number of enqueued packets - name: drops type: u32 + doc: Packets dropped because of lack of resources - name: overlimits type: u32 + doc: | + Number of throttle events when this flow goes out of allocated bandwidth - name: bps type: u32 + doc: Current flow byte rate - name: pps type: u32 + doc: Current flow packet rate - name: qlen type: u32 @@ -112,6 +119,7 @@ definitions: - name: limit type: 
u32 + doc: Queue length; bytes for bfifo, packets for pfifo - name: tc-htb-opt type: struct @@ -119,11 +127,11 @@ definitions: - name: rate type: binary - len: 12 + struct: tc-ratespec - name: ceil type: binary - len: 12 + struct: tc-ratespec - name: buffer type: u32 @@ -149,15 +157,19 @@ definitions: - name: rate2quantum type: u32 + doc: bps->quantum divisor - name: defcls type: u32 + doc: Default class number - name: debug type: u32 + doc: Debug flags - name: direct-pkts type: u32 + doc: Count of non shaped packets - name: tc-gred-qopt type: struct @@ -165,15 +177,19 @@ definitions: - name: limit type: u32 + doc: HARD maximal queue length in bytes - name: qth-min type: u32 + doc: Min average length threshold in bytes - name: qth-max type: u32 + doc: Max average length threshold in bytes - name: DP type: u32 + doc: Up to 2^32 DPs - name: backlog type: u32 @@ -195,15 +211,19 @@ definitions: - name: Wlog type: u8 + doc: log(W) - name: Plog type: u8 + doc: log(P_max / (qth-max - qth-min)) - name: Scell_log type: u8 + doc: cell size for idle damping - name: prio type: u8 + doc: Priority of this VQ - name: packets type: u32 @@ -266,9 +286,11 @@ definitions: - name: bands type: u16 + doc: Number of bands - name: max-bands type: u16 + doc: Maximum number of queues - name: tc-netem-qopt type: struct @@ -276,21 +298,138 @@ definitions: - name: latency type: u32 + doc: Added delay in microseconds - name: limit type: u32 + doc: Fifo limit in packets - name: loss type: u32 + doc: Random packet loss (0=none, ~0=100%) - name: gap type: u32 + doc: Re-ordering gap (0 for none) - name: duplicate type: u32 + doc: Random packet duplication (0=none, ~0=100%) - name: jitter type: u32 + doc: Random jitter latency in microseconds + - + name: tc-netem-gimodel + doc: State transition probabilities for 4 state model + type: struct + members: + - + name: p13 + type: u32 + - + name: p31 + type: u32 + - + name: p32 + type: u32 + - + name: p14 + type: u32 + - + name: p23 + type: u32 + - + 
name: tc-netem-gemodel + doc: Gilbert-Elliot models + type: struct + members: + - + name: p + type: u32 + - + name: r + type: u32 + - + name: h + type: u32 + - + name: k1 + type: u32 + - + name: tc-netem-corr + type: struct + members: + - + name: delay-corr + type: u32 + doc: Delay correlation + - + name: loss-corr + type: u32 + doc: Packet loss correlation + - + name: dup-corr + type: u32 + doc: Duplicate correlation + - + name: tc-netem-reorder + type: struct + members: + - + name: probability + type: u32 + - + name: correlation + type: u32 + - + name: tc-netem-corrupt + type: struct + members: + - + name: probability + type: u32 + - + name: correlation + type: u32 + - + name: tc-netem-rate + type: struct + members: + - + name: rate + type: u32 + - + name: packet-overhead + type: s32 + - + name: cell-size + type: u32 + - + name: cell-overhead + type: s32 + - + name: tc-netem-slot + type: struct + members: + - + name: min-delay + type: s64 + - + name: max-delay + type: s64 + - + name: max-packets + type: s32 + - + name: max-bytes + type: s32 + - + name: dist-delay + type: s64 + - + name: dist-jitter + type: s64 - name: tc-plug-qopt type: struct @@ -307,11 +446,13 @@ definitions: members: - name: bands - type: u16 + type: u32 + doc: Number of bands - name: priomap type: binary len: 16 + doc: Map of logical priority -> PRIO band - name: tc-red-qopt type: struct @@ -319,21 +460,27 @@ definitions: - name: limit type: u32 + doc: Hard queue length in packets - name: qth-min type: u32 + doc: Min average threshold in packets - name: qth-max type: u32 + doc: Max average threshold in packets - name: Wlog type: u8 + doc: log(W) - name: Plog type: u8 + doc: log(P_max / (qth-max - qth-min)) - name: Scell-log type: u8 + doc: Cell size for idle damping - name: flags type: u8 @@ -369,71 +516,128 @@ definitions: name: penalty-burst type: u32 - - name: tc-sfq-qopt-v1 # TODO nested structs + name: tc-sfq-qopt type: struct members: - name: quantum type: u32 + doc: Bytes per round 
allocated to flow - name: perturb-period type: s32 + doc: Period of hash perturbation - name: limit type: u32 + doc: Maximal packets in queue - name: divisor type: u32 + doc: Hash divisor - name: flows type: u32 + doc: Maximal number of flows + - + name: tc-sfqred-stats + type: struct + members: + - + name: prob-drop + type: u32 + doc: Early drops, below max threshold + - + name: forced-drop + type: u32 + doc: Early drops, after max threshold + - + name: prob-mark + type: u32 + doc: Marked packets, below max threshold + - + name: forced-mark + type: u32 + doc: Marked packets, after max threshold + - + name: prob-mark-head + type: u32 + doc: Marked packets, below max threshold + - + name: forced-mark-head + type: u32 + doc: Marked packets, after max threshold + - + name: tc-sfq-qopt-v1 + type: struct + members: + - + name: v0 + type: binary + struct: tc-sfq-qopt - name: depth type: u32 + doc: Maximum number of packets per flow - name: headdrop type: u32 - name: limit type: u32 + doc: HARD maximal flow queue length in bytes - name: qth-min type: u32 + doc: Min average length threshold in bytes - - name: qth-mac + name: qth-max type: u32 + doc: Max average length threshold in bytes - name: Wlog type: u8 + doc: log(W) - name: Plog type: u8 + doc: log(P_max / (qth-max - qth-min)) - name: Scell-log type: u8 + doc: Cell size for idle damping - name: flags type: u8 - name: max-P type: u32 + doc: probabilty, high resolution - - name: prob-drop - type: u32 + name: stats + type: binary + struct: tc-sfqred-stats + - + name: tc-ratespec + type: struct + members: - - name: forced-drop - type: u32 + name: cell-log + type: u8 - - name: prob-mark - type: u32 + name: linklayer + type: u8 - - name: forced-mark - type: u32 + name: overhead + type: u8 - - name: prob-mark-head - type: u32 + name: cell-align + type: u8 - - name: forced-mark-head + name: mpu + type: u8 + - + name: rate type: u32 - name: tc-tbf-qopt @@ -441,12 +645,12 @@ definitions: members: - name: rate - type: binary # 
TODO nested struct tc_ratespec - len: 12 + type: binary + struct: tc-ratespec - name: peakrate - type: binary # TODO nested struct tc_ratespec - len: 12 + type: binary + struct: tc-ratespec - name: limit type: u32 @@ -491,67 +695,1299 @@ definitions: - name: interval type: s8 + doc: Sampling period - name: ewma-log type: u8 -attribute-sets: + doc: The log() of measurement window weight - - name: tc-attrs + name: tc-choke-xstats + type: struct + members: + - + name: early + type: u32 + doc: Early drops + - + name: pdrop + type: u32 + doc: Drops due to queue limits + - + name: other + type: u32 + doc: Drops due to drop() calls + - + name: marked + type: u32 + doc: Marked packets + - + name: matched + type: u32 + doc: Drops due to flow match + - + name: tc-codel-xstats + type: struct + members: + - + name: maxpacket + type: u32 + doc: Largest packet we've seen so far + - + name: count + type: u32 + doc: How many drops we've done since the last time we entered dropping state + - + name: lastcount + type: u32 + doc: Count at entry to dropping state + - + name: ldelay + type: u32 + doc: in-queue delay seen by most recently dequeued packet + - + name: drop-next + type: s32 + doc: Time to drop next packet + - + name: drop-overlimit + type: u32 + doc: Number of times max qdisc packet limit was hit + - + name: ecn-mark + type: u32 + doc: Number of packets we've ECN marked instead of dropped + - + name: dropping + type: u32 + doc: Are we in a dropping state? 
+ - + name: ce-mark + type: u32 + doc: Number of CE marked packets because of ce-threshold + - + name: tc-fq-codel-xstats + type: struct + members: + - + name: type + type: u32 + - + name: maxpacket + type: u32 + doc: Largest packet we've seen so far + - + name: drop-overlimit + type: u32 + doc: Number of times max qdisc packet limit was hit + - + name: ecn-mark + type: u32 + doc: Number of packets we ECN marked instead of being dropped + - + name: new-flow-count + type: u32 + doc: Number of times packets created a new flow + - + name: new-flows-len + type: u32 + doc: Count of flows in new list + - + name: old-flows-len + type: u32 + doc: Count of flows in old list + - + name: ce-mark + type: u32 + doc: Packets above ce-threshold + - + name: memory-usage + type: u32 + doc: Memory usage in bytes + - + name: drop-overmemory + type: u32 + - + name: tc-fq-pie-xstats + type: struct + members: + - + name: packets-in + type: u32 + doc: Total number of packets enqueued + - + name: dropped + type: u32 + doc: Packets dropped due to fq_pie_action + - + name: overlimit + type: u32 + doc: Dropped due to lack of space in queue + - + name: overmemory + type: u32 + doc: Dropped due to lack of memory in queue + - + name: ecn-mark + type: u32 + doc: Packets marked with ecn + - + name: new-flow-count + type: u32 + doc: Count of new flows created by packets + - + name: new-flows-len + type: u32 + doc: Count of flows in new list + - + name: old-flows-len + type: u32 + doc: Count of flows in old list + - + name: memory-usage + type: u32 + doc: Total memory across all queues + - + name: tc-fq-qd-stats + type: struct + members: + - + name: gc-flows + type: u64 + - + name: highprio-packets + type: u64 + doc: obsolete + - + name: tcp-retrans + type: u64 + doc: obsolete + - + name: throttled + type: u64 + - + name: flows-plimit + type: u64 + - + name: pkts-too-long + type: u64 + - + name: allocation-errors + type: u64 + - + name: time-next-delayed-flow + type: s64 + - + name: flows + type: 
u32 + - + name: inactive-flows + type: u32 + - + name: throttled-flows + type: u32 + - + name: unthrottle-latency-ns + type: u32 + - + name: ce-mark + type: u64 + doc: Packets above ce-threshold + - + name: horizon-drops + type: u64 + - + name: horizon-caps + type: u64 + - + name: fastpath-packets + type: u64 + - + name: band-drops + type: binary + len: 24 + - + name: band-pkt-count + type: binary + len: 12 + - + name: pad + type: pad + len: 4 + - + name: tc-hhf-xstats + type: struct + members: + - + name: drop-overlimit + type: u32 + doc: Number of times max qdisc packet limit was hit + - + name: hh-overlimit + type: u32 + doc: Number of times max heavy-hitters was hit + - + name: hh-tot-count + type: u32 + doc: Number of captured heavy-hitters so far + - + name: hh-cur-count + type: u32 + doc: Number of current heavy-hitters + - + name: tc-pie-xstats + type: struct + members: + - + name: prob + type: u64 + doc: Current probability + - + name: delay + type: u32 + doc: Current delay in ms + - + name: avg-dq-rate + type: u32 + doc: Current average dq rate in bits/pie-time + - + name: dq-rate-estimating + type: u32 + doc: Is avg-dq-rate being calculated? 
+ - + name: packets-in + type: u32 + doc: Total number of packets enqueued + - + name: dropped + type: u32 + doc: Packets dropped due to pie action + - + name: overlimit + type: u32 + doc: Dropped due to lack of space in queue + - + name: maxq + type: u32 + doc: Maximum queue size + - + name: ecn-mark + type: u32 + doc: Packets marked with ecn + - + name: tc-red-xstats + type: struct + members: + - + name: early + type: u32 + doc: Early drops + - + name: pdrop + type: u32 + doc: Drops due to queue limits + - + name: other + type: u32 + doc: Drops due to drop() calls + - + name: marked + type: u32 + doc: Marked packets + - + name: tc-sfb-xstats + type: struct + members: + - + name: earlydrop + type: u32 + - + name: penaltydrop + type: u32 + - + name: bucketdrop + type: u32 + - + name: queuedrop + type: u32 + - + name: childdrop + type: u32 + doc: drops in child qdisc + - + name: marked + type: u32 + - + name: maxqlen + type: u32 + - + name: maxprob + type: u32 + - + name: avgprob + type: u32 + - + name: tc-sfq-xstats + type: struct + members: + - + name: allot + type: s32 + - + name: gnet-stats-basic + type: struct + members: + - + name: bytes + type: u64 + - + name: packets + type: u32 + - + name: gnet-stats-rate-est + type: struct + members: + - + name: bps + type: u32 + - + name: pps + type: u32 + - + name: gnet-stats-rate-est64 + type: struct + members: + - + name: bps + type: u64 + - + name: pps + type: u64 + - + name: gnet-stats-queue + type: struct + members: + - + name: qlen + type: u32 + - + name: backlog + type: u32 + - + name: drops + type: u32 + - + name: requeues + type: u32 + - + name: overlimits + type: u32 + - + name: tc-u32-key + type: struct + members: + - + name: mask + type: u32 + byte-order: big-endian + - + name: val + type: u32 + byte-order: big-endian + - + name: "off" + type: s32 + - + name: offmask + type: s32 + - + name: tc-u32-sel + type: struct + members: + - + name: flags + type: u8 + - + name: offshift + type: u8 + - + name: nkeys + 
type: u8 + - + name: offmask + type: u16 + byte-order: big-endian + - + name: "off" + type: u16 + - + name: offoff + type: s16 + - + name: hoff + type: s16 + - + name: hmask + type: u32 + byte-order: big-endian + - + name: keys + type: binary + struct: tc-u32-key # TODO: array + - + name: tc-u32-pcnt + type: struct + members: + - + name: rcnt + type: u64 + - + name: rhit + type: u64 + - + name: kcnts + type: u64 # TODO: array + - + name: tcf-t + type: struct + members: + - + name: install + type: u64 + - + name: lastuse + type: u64 + - + name: expires + type: u64 + - + name: firstuse + type: u64 + - + name: tc-gen + type: struct + members: + - + name: index + type: u32 + - + name: capab + type: u32 + - + name: action + type: s32 + - + name: refcnt + type: s32 + - + name: bindcnt + type: s32 + - + name: tc-gact-p + type: struct + members: + - + name: ptype + type: u16 + - + name: pval + type: u16 + - + name: paction + type: s32 + - + name: tcf-ematch-tree-hdr + type: struct + members: + - + name: nmatches + type: u16 + - + name: progid + type: u16 + - + name: tc-basic-pcnt + type: struct + members: + - + name: rcnt + type: u64 + - + name: rhit + type: u64 + - + name: tc-matchall-pcnt + type: struct + members: + - + name: rhit + type: u64 + - + name: tc-mpls + type: struct + members: + - + name: index + type: u32 + - + name: capab + type: u32 + - + name: action + type: s32 + - + name: refcnt + type: s32 + - + name: bindcnt + type: s32 + - + name: m-action + type: s32 + - + name: tc-police + type: struct + members: + - + name: index + type: u32 + - + name: action + type: s32 + - + name: limit + type: u32 + - + name: burst + type: u32 + - + name: mtu + type: u32 + - + name: rate + type: binary + struct: tc-ratespec + - + name: peakrate + type: binary + struct: tc-ratespec + - + name: refcnt + type: s32 + - + name: bindcnt + type: s32 + - + name: capab + type: u32 + - + name: tc-pedit-sel + type: struct + members: + - + name: index + type: u32 + - + name: capab + type: 
u32 + - + name: action + type: s32 + - + name: refcnt + type: s32 + - + name: bindcnt + type: s32 + - + name: nkeys + type: u8 + - + name: flags + type: u8 + - + name: keys + type: binary + struct: tc-pedit-key # TODO: array + - + name: tc-pedit-key + type: struct + members: + - + name: mask + type: u32 + - + name: val + type: u32 + - + name: "off" + type: u32 + - + name: at + type: u32 + - + name: offmask + type: u32 + - + name: shift + type: u32 + - + name: tc-vlan + type: struct + members: + - + name: index + type: u32 + - + name: capab + type: u32 + - + name: action + type: s32 + - + name: refcnt + type: s32 + - + name: bindcnt + type: s32 + - + name: v-action + type: s32 +attribute-sets: + - + name: tc-attrs + attributes: + - + name: kind + type: string + - + name: options + type: sub-message + sub-message: tc-options-msg + selector: kind + - + name: stats + type: binary + struct: tc-stats + - + name: xstats + type: sub-message + sub-message: tca-stats-app-msg + selector: kind + - + name: rate + type: binary + struct: gnet-estimator + - + name: fcnt + type: u32 + - + name: stats2 + type: nest + nested-attributes: tca-stats-attrs + - + name: stab + type: nest + nested-attributes: tca-stab-attrs + - + name: pad + type: pad + - + name: dump-invisible + type: flag + - + name: chain + type: u32 + - + name: hw-offload + type: u8 + - + name: ingress-block + type: u32 + - + name: egress-block + type: u32 + - + name: dump-flags + type: bitfield32 + - + name: ext-warn-msg + type: string + - + name: tc-act-attrs + attributes: + - + name: kind + type: string + - + name: options + type: sub-message + sub-message: tc-act-options-msg + selector: kind + - + name: index + type: u32 + - + name: stats + type: nest + nested-attributes: tc-act-stats-attrs + - + name: pad + type: pad + - + name: cookie + type: binary + - + name: flags + type: bitfield32 + - + name: hw-stats + type: bitfield32 + - + name: used-hw-stats + type: bitfield32 + - + name: in-hw-count + type: u32 + - + 
name: tc-act-stats-attrs + attributes: + - + name: basic + type: binary + struct: gnet-stats-basic + - + name: rate-est + type: binary + struct: gnet-stats-rate-est + - + name: queue + type: binary + struct: gnet-stats-queue + - + name: app + type: binary + - + name: rate-est64 + type: binary + struct: gnet-stats-rate-est64 + - + name: pad + type: pad + - + name: basic-hw + type: binary + struct: gnet-stats-basic + - + name: pkt64 + type: u64 + - + name: tc-act-bpf-attrs + attributes: + - + name: tm + type: binary + struct: tcf-t + - + name: parms + type: binary + - + name: ops-len + type: u16 + - + name: ops + type: binary + - + name: fd + type: u32 + - + name: name + type: string + - + name: pad + type: pad + - + name: tag + type: binary + - + name: id + type: binary + - + name: tc-act-connmark-attrs + attributes: + - + name: parms + type: binary + - + name: tm + type: binary + struct: tcf-t + - + name: pad + type: pad + - + name: tc-act-csum-attrs + attributes: + - + name: parms + type: binary + - + name: tm + type: binary + struct: tcf-t + - + name: pad + type: pad + - + name: tc-act-ct-attrs + attributes: + - + name: parms + type: binary + - + name: tm + type: binary + struct: tcf-t + - + name: action + type: u16 + - + name: zone + type: u16 + - + name: mark + type: u32 + - + name: mark-mask + type: u32 + - + name: labels + type: binary + - + name: labels-mask + type: binary + - + name: nat-ipv4-min + type: u32 + byte-order: big-endian + - + name: nat-ipv4-max + type: u32 + byte-order: big-endian + - + name: nat-ipv6-min + type: binary + - + name: nat-ipv6-max + type: binary + - + name: nat-port-min + type: u16 + byte-order: big-endian + - + name: nat-port-max + type: u16 + byte-order: big-endian + - + name: pad + type: pad + - + name: helper-name + type: string + - + name: helper-family + type: u8 + - + name: helper-proto + type: u8 + - + name: tc-act-ctinfo-attrs + attributes: + - + name: pad + type: pad + - + name: tm + type: binary + struct: tcf-t + - + 
name: act + type: binary + - + name: zone + type: u16 + - + name: parms-dscp-mask + type: u32 + - + name: parms-dscp-statemask + type: u32 + - + name: parms-cpmark-mask + type: u32 + - + name: stats-dscp-set + type: u64 + - + name: stats-dscp-error + type: u64 + - + name: stats-cpmark-set + type: u64 + - + name: tc-act-gate-attrs + attributes: + - + name: tm + type: binary + struct: tcf-t + - + name: parms + type: binary + - + name: pad + type: pad + - + name: priority + type: s32 + - + name: entry-list + type: binary + - + name: base-time + type: u64 + - + name: cycle-time + type: u64 + - + name: cycle-time-ext + type: u64 + - + name: flags + type: u32 + - + name: clockid + type: s32 + - + name: tc-act-ife-attrs + attributes: + - + name: parms + type: binary + - + name: tm + type: binary + struct: tcf-t + - + name: dmac + type: binary + - + name: smac + type: binary + - + name: type + type: u16 + - + name: metalst + type: binary + - + name: pad + type: pad + - + name: tc-act-mirred-attrs + attributes: + - + name: tm + type: binary + struct: tcf-t + - + name: parms + type: binary + - + name: pad + type: pad + - + name: blockid + type: binary + - + name: tc-act-mpls-attrs + attributes: + - + name: tm + type: binary + struct: tcf-t + - + name: parms + type: binary + struct: tc-mpls + - + name: pad + type: pad + - + name: proto + type: u16 + byte-order: big-endian + - + name: label + type: u32 + - + name: tc + type: u8 + - + name: ttl + type: u8 + - + name: bos + type: u8 + - + name: tc-act-nat-attrs + attributes: + - + name: parms + type: binary + - + name: tm + type: binary + struct: tcf-t + - + name: pad + type: pad + - + name: tc-act-pedit-attrs + attributes: + - + name: tm + type: binary + struct: tcf-t + - + name: parms + type: binary + struct: tc-pedit-sel + - + name: pad + type: pad + - + name: parms-ex + type: binary + - + name: keys-ex + type: binary + - + name: key-ex + type: binary + - + name: tc-act-simple-attrs + attributes: + - + name: tm + type: binary 
+ struct: tcf-t + - + name: parms + type: binary + - + name: data + type: binary + - + name: pad + type: pad + - + name: tc-act-skbedit-attrs + attributes: + - + name: tm + type: binary + struct: tcf-t + - + name: parms + type: binary + - + name: priority + type: u32 + - + name: queue-mapping + type: u16 + - + name: mark + type: u32 + - + name: pad + type: pad + - + name: ptype + type: u16 + - + name: mask + type: u32 + - + name: flags + type: u64 + - + name: queue-mapping-max + type: u16 + - + name: tc-act-skbmod-attrs + attributes: + - + name: tm + type: binary + struct: tcf-t + - + name: parms + type: binary + - + name: dmac + type: binary + - + name: smac + type: binary + - + name: etype + type: binary + - + name: pad + type: pad + - + name: tc-act-tunnel-key-attrs + attributes: + - + name: tm + type: binary + struct: tcf-t + - + name: parms + type: binary + - + name: enc-ipv4-src + type: u32 + byte-order: big-endian + - + name: enc-ipv4-dst + type: u32 + byte-order: big-endian + - + name: enc-ipv6-src + type: binary + - + name: enc-ipv6-dst + type: binary + - + name: enc-key-id + type: u64 + byte-order: big-endian + - + name: pad + type: pad + - + name: enc-dst-port + type: u16 + byte-order: big-endian + - + name: no-csum + type: u8 + - + name: enc-opts + type: binary + - + name: enc-tos + type: u8 + - + name: enc-ttl + type: u8 + - + name: no-frag + type: flag + - + name: tc-act-vlan-attrs attributes: - - name: kind - type: string - - - name: options - type: sub-message - sub-message: tc-options-msg - selector: kind + name: tm + type: binary + struct: tcf-t - - name: stats + name: parms type: binary - struct: tc-stats + struct: tc-vlan - - name: xstats + name: push-vlan-id + type: u16 + - + name: push-vlan-protocol + type: u16 + - + name: pad + type: pad + - + name: push-vlan-priority + type: u8 + - + name: push-eth-dst type: binary - - name: rate + name: push-eth-src type: binary - struct: gnet-estimator + - + name: tc-basic-attrs + attributes: - - name: 
fcnt + name: classid type: u32 - - name: stats2 + name: ematches type: nest - nested-attributes: tca-stats-attrs + nested-attributes: tc-ematch-attrs - - name: stab + name: act + type: array-nest + nested-attributes: tc-act-attrs + - + name: police type: nest - nested-attributes: tca-stab-attrs + nested-attributes: tc-police-attrs + - + name: pcnt + type: binary + struct: tc-basic-pcnt - name: pad type: pad + - + name: tc-bpf-attrs + attributes: - - name: dump-invisible - type: flag + name: act + type: nest + nested-attributes: tc-act-attrs - - name: chain + name: police + type: nest + nested-attributes: tc-police-attrs + - + name: classid type: u32 - - name: hw-offload - type: u8 + name: ops-len + type: u16 - - name: ingress-block + name: ops + type: binary + - + name: fd type: u32 - - name: egress-block + name: name + type: string + - + name: flags type: u32 - - name: dump-flags - type: bitfield32 + name: flags-gen + type: u32 - - name: ext-warn-msg - type: string + name: tag + type: binary + - + name: id + type: u32 - name: tc-cake-attrs attributes: @@ -641,7 +2077,8 @@ attribute-sets: type: u32 - name: tin-stats - type: binary + type: array-nest + nested-attributes: tc-cake-tin-stats-attrs - name: deficit type: s32 @@ -660,6 +2097,84 @@ attribute-sets: - name: blue-timer-us type: s32 + - + name: tc-cake-tin-stats-attrs + attributes: + - + name: pad + type: pad + - + name: sent-packets + type: u32 + - + name: sent-bytes64 + type: u64 + - + name: dropped-packets + type: u32 + - + name: dropped-bytes64 + type: u64 + - + name: acks-dropped-packets + type: u32 + - + name: acks-dropped-bytes64 + type: u64 + - + name: ecn-marked-packets + type: u32 + - + name: ecn-marked-bytes64 + type: u64 + - + name: backlog-packets + type: u32 + - + name: backlog-bytes + type: u32 + - + name: threshold-rate64 + type: u64 + - + name: target-us + type: u32 + - + name: interval-us + type: u32 + - + name: way-indirect-hits + type: u32 + - + name: way-misses + type: u32 + - + name: 
way-collisions + type: u32 + - + name: peak-delay-us + type: u32 + - + name: avg-delay-us + type: u32 + - + name: base-delay-us + type: u32 + - + name: sparse-flows + type: u32 + - + name: bulk-flows + type: u32 + - + name: unresponsive-flows + type: u32 + - + name: max-skblen + type: u32 + - + name: flow-quantum + type: u32 - name: tc-cbs-attrs attributes: @@ -667,6 +2182,20 @@ attribute-sets: name: parms type: binary struct: tc-cbs-qopt + - + name: tc-cgroup-attrs + attributes: + - + name: act + type: nest + nested-attributes: tc-act-attrs + - + name: police + type: nest + nested-attributes: tc-police-attrs + - + name: ematches + type: binary - name: tc-choke-attrs attributes: @@ -677,6 +2206,9 @@ attribute-sets: - name: stab type: binary + checks: + min-len: 256 + max-len: 256 - name: max-p type: u32 @@ -704,6 +2236,56 @@ attribute-sets: - name: quantum type: u32 + - + name: tc-ematch-attrs + attributes: + - + name: tree-hdr + type: binary + struct: tcf-ematch-tree-hdr + - + name: tree-list + type: binary + - + name: tc-flow-attrs + attributes: + - + name: keys + type: u32 + - + name: mode + type: u32 + - + name: baseclass + type: u32 + - + name: rshift + type: u32 + - + name: addend + type: u32 + - + name: mask + type: u32 + - + name: xor + type: u32 + - + name: divisor + type: u32 + - + name: act + type: binary + - + name: police + type: nest + nested-attributes: tc-police-attrs + - + name: ematches + type: binary + - + name: perturb + type: u32 - name: tc-flower-attrs attributes: @@ -953,15 +2535,19 @@ attribute-sets: - name: key-arp-sha type: binary + display-hint: mac - name: key-arp-sha-mask type: binary + display-hint: mac - name: key-arp-tha type: binary + display-hint: mac - name: key-arp-tha-mask type: binary + display-hint: mac - name: key-mpls-ttl type: u8 @@ -1020,10 +2606,12 @@ attribute-sets: type: u8 - name: key-enc-opts - type: binary + type: nest + nested-attributes: tc-flower-key-enc-opts-attrs - name: key-enc-opts-mask - type: binary + type: 
nest + nested-attributes: tc-flower-key-enc-opts-attrs - name: in-hw-count type: u32 @@ -1056,41 +2644,165 @@ attribute-sets: name: key-ct-zone-mask type: u16 - - name: key-ct-mark - type: u32 + name: key-ct-mark + type: u32 + - + name: key-ct-mark-mask + type: u32 + - + name: key-ct-labels + type: binary + - + name: key-ct-labels-mask + type: binary + - + name: key-mpls-opts + type: nest + nested-attributes: tc-flower-key-mpls-opt-attrs + - + name: key-hash + type: u32 + - + name: key-hash-mask + type: u32 + - + name: key-num-of-vlans + type: u8 + - + name: key-pppoe-sid + type: u16 + byte-order: big-endian + - + name: key-ppp-proto + type: u16 + byte-order: big-endian + - + name: key-l2-tpv3-sid + type: u32 + byte-order: big-endian + - + name: l2-miss + type: u8 + - + name: key-cfm + type: nest + nested-attributes: tc-flower-key-cfm-attrs + - + name: key-spi + type: u32 + byte-order: big-endian + - + name: key-spi-mask + type: u32 + byte-order: big-endian + - + name: tc-flower-key-enc-opts-attrs + attributes: + - + name: geneve + type: nest + nested-attributes: tc-flower-key-enc-opt-geneve-attrs + - + name: vxlan + type: nest + nested-attributes: tc-flower-key-enc-opt-vxlan-attrs + - + name: erspan + type: nest + nested-attributes: tc-flower-key-enc-opt-erspan-attrs + - + name: gtp + type: nest + nested-attributes: tc-flower-key-enc-opt-gtp-attrs + - + name: tc-flower-key-enc-opt-geneve-attrs + attributes: + - + name: class + type: u16 + - + name: type + type: u8 + - + name: data + type: binary + - + name: tc-flower-key-enc-opt-vxlan-attrs + attributes: + - + name: gbp + type: u32 + - + name: tc-flower-key-enc-opt-erspan-attrs + attributes: + - + name: ver + type: u8 + - + name: index + type: u32 + - + name: dir + type: u8 + - + name: hwid + type: u8 + - + name: tc-flower-key-enc-opt-gtp-attrs + attributes: + - + name: pdu-type + type: u8 - - name: key-ct-mark-mask - type: u32 + name: qfi + type: u8 + - + name: tc-flower-key-mpls-opt-attrs + attributes: - - name: 
key-ct-labels - type: binary + name: lse-depth + type: u8 - - name: key-ct-labels-mask - type: binary + name: lse-ttl + type: u8 - - name: key-mpls-opts - type: binary + name: lse-bos + type: u8 - - name: key-hash - type: u32 + name: lse-tc + type: u8 - - name: key-hash-mask + name: lse-label type: u32 + - + name: tc-flower-key-cfm-attrs + attributes: - - name: key-num-of-vlans + name: md-level type: u8 - - name: key-pppoe-sid - type: u16 - byte-order: big-endian + name: opcode + type: u8 + - + name: tc-fw-attrs + attributes: - - name: key-ppp-proto - type: u16 - byte-order: big-endian + name: classid + type: u32 - - name: key-l2-tpv3-sid + name: police + type: nest + nested-attributes: tc-police-attrs + - + name: indev + type: string + - + name: act + type: array-nest + nested-attributes: tc-act-attrs + - + name: mask type: u32 - byte-order: big-endian - name: tc-gred-attrs attributes: @@ -1135,7 +2847,7 @@ attribute-sets: type: u32 - name: stat-bytes - type: u32 + type: u64 - name: stat-packets type: u32 @@ -1232,40 +2944,25 @@ attribute-sets: name: offload type: flag - - name: tc-act-attrs + name: tc-matchall-attrs attributes: - - name: kind - type: string + name: classid + type: u32 - - name: options - type: sub-message - sub-message: tc-act-options-msg - selector: kind + name: act + type: array-nest + nested-attributes: tc-act-attrs - - name: index + name: flags type: u32 - - name: stats + name: pcnt type: binary + struct: tc-matchall-pcnt - name: pad type: pad - - - name: cookie - type: binary - - - name: flags - type: bitfield32 - - - name: hw-stats - type: bitfield32 - - - name: used-hw-stats - type: bitfield32 - - - name: in-hw-count - type: u32 - name: tc-etf-attrs attributes: @@ -1304,48 +3001,71 @@ attribute-sets: - name: plimit type: u32 + doc: Limit of total number of packets in queue - name: flow-plimit type: u32 + doc: Limit of packets per flow - name: quantum type: u32 + doc: RR quantum - name: initial-quantum type: u32 + doc: RR quantum for new 
flow - name: rate-enable type: u32 + doc: Enable / disable rate limiting - name: flow-default-rate type: u32 + doc: Obsolete, do not use - name: flow-max-rate type: u32 + doc: Per flow max rate - name: buckets-log type: u32 + doc: log2(number of buckets) - name: flow-refill-delay type: u32 + doc: Flow credit refill delay in usec - name: orphan-mask type: u32 + doc: Mask applied to orphaned skb hashes - name: low-rate-threshold type: u32 + doc: Per packet delay under this rate - name: ce-threshold type: u32 + doc: DCTCP-like CE marking threshold - name: timer-slack type: u32 - name: horizon type: u32 + doc: Time horizon in usec - name: horizon-drop type: u8 + doc: Drop packets beyond horizon, or cap their EDT + - + name: priomap + type: binary + struct: tc-prio-qopt + - + name: weights + type: binary + sub-type: s32 + doc: Weights for each band - name: tc-fq-codel-attrs attributes: @@ -1427,6 +3147,7 @@ attribute-sets: - name: corr type: binary + struct: tc-netem-corr - name: delay-dist type: binary @@ -1434,15 +3155,19 @@ attribute-sets: - name: reorder type: binary + struct: tc-netem-reorder - name: corrupt type: binary + struct: tc-netem-corrupt - name: loss - type: binary + type: nest + nested-attributes: tc-netem-loss-attrs - name: rate type: binary + struct: tc-netem-rate - name: ecn type: u32 @@ -1461,10 +3186,27 @@ attribute-sets: - name: slot type: binary + struct: tc-netem-slot - name: slot-dist type: binary sub-type: s16 + - + name: prng-seed + type: u64 + - + name: tc-netem-loss-attrs + attributes: + - + name: gi + type: binary + doc: General Intuitive - 4 state model + struct: tc-netem-gimodel + - + name: ge + type: binary + doc: Gilbert Elliot models + struct: tc-netem-gemodel - name: tc-pie-attrs attributes: @@ -1492,6 +3234,44 @@ attribute-sets: - name: dq-rate-estimator type: u32 + - + name: tc-police-attrs + attributes: + - + name: tbf + type: binary + struct: tc-police + - + name: rate + type: binary + - + name: peakrate + type: binary + - + name: 
avrate + type: u32 + - + name: result + type: u32 + - + name: tm + type: binary + struct: tcf-t + - + name: pad + type: pad + - + name: rate64 + type: u64 + - + name: peakrate64 + type: u64 + - + name: pktrate64 + type: u64 + - + name: pktburst64 + type: u64 - name: tc-qfq-attrs attributes: @@ -1516,13 +3296,36 @@ attribute-sets: type: u32 - name: flags - type: binary + type: bitfield32 - name: early-drop-block type: u32 - name: mark-block type: u32 + - + name: tc-route-attrs + attributes: + - + name: classid + type: u32 + - + name: to + type: u32 + - + name: from + type: u32 + - + name: iif + type: u32 + - + name: police + type: nest + nested-attributes: tc-police-attrs + - + name: act + type: array-nest + nested-attributes: tc-act-attrs - name: tc-taprio-attrs attributes: @@ -1629,17 +3432,43 @@ attribute-sets: name: pad type: pad - - name: tca-gact-attrs + name: tc-act-sample-attrs + attributes: + - + name: tm + type: binary + struct: tcf-t + - + name: parms + type: binary + struct: tc-gen + - + name: rate + type: u32 + - + name: trunc-size + type: u32 + - + name: psample-group + type: u32 + - + name: pad + type: pad + - + name: tc-act-gact-attrs attributes: - name: tm type: binary + struct: tcf-t - name: parms type: binary + struct: tc-gen - name: prob type: binary + struct: tc-gact-p - name: pad type: pad @@ -1659,34 +3488,89 @@ attribute-sets: - name: basic type: binary + struct: gnet-stats-basic - name: rate-est type: binary + struct: gnet-stats-rate-est - name: queue type: binary + struct: gnet-stats-queue - name: app - type: binary # TODO sub-message needs 2+ level deep lookup + type: sub-message sub-message: tca-stats-app-msg selector: kind - name: rate-est64 type: binary + struct: gnet-stats-rate-est64 - name: pad type: pad - name: basic-hw type: binary + struct: gnet-stats-basic - name: pkt64 + type: u64 + - + name: tc-u32-attrs + attributes: + - + name: classid + type: u32 + - + name: hash + type: u32 + - + name: link + type: u32 + - + name: divisor + 
type: u32 + - + name: sel + type: binary + struct: tc-u32-sel + - + name: police + type: nest + nested-attributes: tc-police-attrs + - + name: act + type: array-nest + nested-attributes: tc-act-attrs + - + name: indev + type: string + - + name: pcnt + type: binary + struct: tc-u32-pcnt + - + name: mark type: binary + struct: tc-u32-mark + - + name: flags + type: u32 + - + name: pad + type: pad sub-messages: - name: tc-options-msg formats: + - + value: basic + attribute-set: tc-basic-attrs + - + value: bpf + attribute-set: tc-bpf-attrs - value: bfifo fixed-header: tc-fifo-qopt @@ -1696,6 +3580,9 @@ sub-messages: - value: cbs attribute-set: tc-cbs-attrs + - + value: cgroup + attribute-set: tc-cgroup-attrs - value: choke attribute-set: tc-choke-attrs @@ -1713,6 +3600,12 @@ sub-messages: - value: ets attribute-set: tc-ets-attrs + - + value: flow + attribute-set: tc-flow-attrs + - + value: flower + attribute-set: tc-flower-attrs - value: fq attribute-set: tc-fq-attrs @@ -1723,8 +3616,8 @@ sub-messages: value: fq_pie attribute-set: tc-fq-pie-attrs - - value: flower - attribute-set: tc-flower-attrs + value: fw + attribute-set: tc-fw-attrs - value: gred attribute-set: tc-gred-attrs @@ -1739,6 +3632,9 @@ sub-messages: attribute-set: tc-htb-attrs - value: ingress # no content + - + value: matchall + attribute-set: tc-matchall-attrs - value: mq # no content - @@ -1775,6 +3671,9 @@ sub-messages: - value: red attribute-set: tc-red-attrs + - + value: route + attribute-set: tc-route-attrs - value: sfb fixed-header: tc-sfb-qopt @@ -1787,88 +3686,105 @@ sub-messages: - value: tbf attribute-set: tc-tbf-attrs - - - name: tc-act-options-msg - formats: - - value: gact - attribute-set: tca-gact-attrs + value: u32 + attribute-set: tc-u32-attrs - - name: tca-stats-app-msg + name: tc-act-options-msg formats: - - value: bfifo - - - value: blackhole + value: bpf + attribute-set: tc-act-bpf-attrs - - value: cake - attribute-set: tc-cake-stats-attrs + value: connmark + attribute-set: 
tc-act-connmark-attrs - - value: cbs + value: csum + attribute-set: tc-act-csum-attrs - - value: choke + value: ct + attribute-set: tc-act-ct-attrs - - value: clsact + value: ctinfo + attribute-set: tc-act-ctinfo-attrs - - value: codel + value: gact + attribute-set: tc-act-gact-attrs - - value: drr + value: gate + attribute-set: tc-act-gate-attrs - - value: etf + value: ife + attribute-set: tc-act-ife-attrs - - value: ets + value: mirred + attribute-set: tc-act-mirred-attrs - - value: fq + value: mpls + attribute-set: tc-act-mpls-attrs - - value: fq_codel + value: nat + attribute-set: tc-act-nat-attrs - - value: fq_pie + value: pedit + attribute-set: tc-act-pedit-attrs - - value: flower + value: police + attribute-set: tc-act-police-attrs - - value: gred + value: sample + attribute-set: tc-act-sample-attrs - - value: hfsc + value: simple + attribute-set: tc-act-simple-attrs - - value: hhf + value: skbedit + attribute-set: tc-act-skbedit-attrs - - value: htb + value: skbmod + attribute-set: tc-act-skbmod-attrs - - value: ingress + value: tunnel_key + attribute-set: tc-act-tunnel-key-attrs - - value: mq + value: vlan + attribute-set: tc-act-vlan-attrs + - + name: tca-stats-app-msg + formats: - - value: mqprio + value: cake + attribute-set: tc-cake-stats-attrs - - value: multiq + value: choke + fixed-header: tc-choke-xstats - - value: netem + value: codel + fixed-header: tc-codel-xstats - - value: noqueue + value: fq + fixed-header: tc-fq-qd-stats - - value: pfifo + value: fq_codel + fixed-header: tc-fq-codel-xstats - - value: pfifo_fast + value: fq_pie + fixed-header: tc-fq-pie-xstats - - value: pfifo_head_drop + value: hhf + fixed-header: tc-hhf-xstats - value: pie - - - value: plug - - - value: prio - - - value: qfq + fixed-header: tc-pie-xstats - value: red + fixed-header: tc-red-xstats - value: sfb + fixed-header: tc-sfb-xstats - value: sfq - - - value: taprio - - - value: tbf + fixed-header: tc-sfq-xstats operations: enum-model: directional -- cgit 1.2.3-korg 
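Reviewer note on the sub-message tables above: the `app` attribute is now declared as `type: sub-message` with `selector: kind`, so a parser picks the payload layout by matching the selector value against the `value` entries of the named sub-message. A minimal sketch of that lookup follows. It assumes a small hand-copied subset of the tables in this patch; `SUB_MESSAGES`, `resolve_format` and `describe` are hypothetical names for illustration only, not part of any existing tool.

```python
# Hypothetical in-memory copy of a few entries from the spec in this patch.
SUB_MESSAGES = {
    "tc-options-msg": [
        {"value": "bfifo", "fixed-header": "tc-fifo-qopt"},
        {"value": "cbs", "attribute-set": "tc-cbs-attrs"},
        {"value": "ingress"},  # no content
        {"value": "u32", "attribute-set": "tc-u32-attrs"},
    ],
    "tca-stats-app-msg": [
        {"value": "cake", "attribute-set": "tc-cake-stats-attrs"},
        {"value": "choke", "fixed-header": "tc-choke-xstats"},
    ],
}


def resolve_format(sub_message, selector_value):
    """Return the format entry whose 'value' matches the selector, or None."""
    for fmt in SUB_MESSAGES[sub_message]:
        if fmt["value"] == selector_value:
            return fmt
    return None


def describe(sub_message, selector_value):
    """Summarise how the payload for this selector value would be decoded."""
    fmt = resolve_format(sub_message, selector_value)
    if fmt is None:
        return "unknown selector value"
    parts = []
    if "fixed-header" in fmt:
        parts.append("fixed header " + fmt["fixed-header"])
    if "attribute-set" in fmt:
        parts.append("attributes " + fmt["attribute-set"])
    # An entry with neither key carries no payload ("# no content" in the spec).
    return ", ".join(parts) if parts else "no content"
```

With these tables, a `kind` of `choke` selects the `tc-choke-xstats` struct for the stats payload, while the same `kind` in `tc-options-msg` would be looked up in a different table. That is why the selector and the sub-message name are declared together on the attribute.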
From b2005bb756e1d0ef400a79f3e1bce4f3870415a9 Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Mon, 29 Jan 2024 15:21:21 +0100 Subject: dt-bindings: net: qcom,ipa: do not override firmware-name $ref dtschema package defines firmware-name as string-array, so individual bindings should not make it a string but instead just narrow the number of expected firmware file names. Signed-off-by: Krzysztof Kozlowski Acked-by: Conor Dooley Acked-by: Alex Elder Link: https://lore.kernel.org/r/20240129142121.102450-1-krzysztof.kozlowski@linaro.org Signed-off-by: Paolo Abeni --- Documentation/devicetree/bindings/net/qcom,ipa.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml b/Documentation/devicetree/bindings/net/qcom,ipa.yaml index c30218684cfe46..53cae71d99572c 100644 --- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml +++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml @@ -159,7 +159,7 @@ properties: when the AP (not the modem) performs early initialization. firmware-name: - $ref: /schemas/types.yaml#/definitions/string + maxItems: 1 description: If present, name (or relative path) of the file within the firmware search path containing the firmware image used when -- cgit 1.2.3-korg From 5f8066d4578241a2d9d63428e6a604807c2ab226 Mon Sep 17 00:00:00 2001 From: Philippe Schenker Date: Tue, 30 Jan 2024 09:34:18 +0100 Subject: dt-bindings: net: dsa: Add KSZ8567 switch support This commit adds the dt-binding for KSZ8567, a robust 7-port Ethernet switch. The KSZ8567 features two RGMII/MII/RMII interfaces, each capable of gigabit speeds, complemented by five 10/100 Mbps MAC/PHYs. This binding is necessary to set specific capabilities for this switch chip that are necessary due to the ksz dsa driver only accepting specific chip ids. The KSZ8567 is very similar to KSZ9567 however only containing 100 Mbps phys on its downstream ports. 
Signed-off-by: Philippe Schenker Acked-by: Conor Dooley Reviewed-by: Andrew Lunn Reviewed-by: Florian Fainelli Link: https://lore.kernel.org/r/20240130083419.135763-1-dev@pschenker.ch Signed-off-by: Paolo Abeni --- Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml b/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml index c963dc09e8e12a..52acc15ebcbf36 100644 --- a/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml +++ b/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml @@ -31,6 +31,7 @@ properties: - microchip,ksz9893 - microchip,ksz9563 - microchip,ksz8563 + - microchip,ksz8567 reset-gpios: description: -- cgit 1.2.3-korg From 088a464ed53feeab9632c6748b9f25354639e2bd Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Tue, 30 Jan 2024 19:37:59 -0800 Subject: bpf, docs: Clarify which legacy packet instructions existed As discussed on the BPF IETF mailing list (see link), this patch updates the "Legacy BPF Packet access instructions" section to clarify which instructions are deprecated (vs which were never defined and so are not deprecated). 
Signed-off-by: Dave Thaler Signed-off-by: Daniel Borkmann Acked-by: Yonghong Song Acked-by: David Vernet Link: https://mailarchive.ietf.org/arch/msg/bpf/5LnnKm093cGpOmDI9TnLQLBXyys Link: https://lore.kernel.org/bpf/20240131033759.3634-1-dthaler1968@gmail.com --- Documentation/bpf/standardization/instruction-set.rst | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst index fceacca4629961..dcbc9193c66f23 100644 --- a/Documentation/bpf/standardization/instruction-set.rst +++ b/Documentation/bpf/standardization/instruction-set.rst @@ -630,7 +630,9 @@ Legacy BPF Packet access instructions ------------------------------------- BPF previously introduced special instructions for access to packet data that were -carried over from classic BPF. However, these instructions are +carried over from classic BPF. These instructions used an instruction +class of BPF_LD, a size modifier of BPF_W, BPF_H, or BPF_B, and a +mode modifier of BPF_ABS or BPF_IND. However, these instructions are deprecated and should no longer be used. All legacy packet access instructions belong to the "legacy" conformance group instead of the "basic" conformance group. -- cgit 1.2.3-korg From bd765cc910127ee8ed6cd83dae0f0bfbca69d71e Mon Sep 17 00:00:00 2001 From: David Arinzon Date: Tue, 30 Jan 2024 09:53:44 +0000 Subject: net: ena: Add more documentation for RX copybreak This patch contains more details about the functionality of RX copybreak. 
Signed-off-by: Shay Agroskin Signed-off-by: David Arinzon Signed-off-by: Paolo Abeni --- Documentation/networking/device_drivers/ethernet/amazon/ena.rst | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'Documentation') diff --git a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst index b842bcb14255b5..a4c7d0c65fd7e5 100644 --- a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst +++ b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst @@ -211,10 +211,16 @@ Documentation/networking/net_dim.rst RX copybreak ============ + The rx_copybreak is initialized by default to ENA_DEFAULT_RX_COPYBREAK and can be configured by the ETHTOOL_STUNABLE command of the SIOCETHTOOL ioctl. +This option controls the maximum packet length for which the RX +descriptor it was received on would be recycled. When a packet smaller +than RX copybreak bytes is received, it is copied into a new memory +buffer and the RX descriptor is returned to HW. + Statistics ========== -- cgit 1.2.3-korg From cf4f0f1e1c465da7c1f6bc89c3ff50bf42f0ab02 Mon Sep 17 00:00:00 2001 From: Jiri Pirko Date: Tue, 30 Jan 2024 13:08:29 +0100 Subject: dpll: extend uapi by lock status error attribute If the dpll devices goes to state "unlocked" or "holdover", it may be caused by an error. In that case, allow user to see what the error was. Introduce a new attribute and values it can carry. 
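Reviewer note: a userspace consumer of the new attribute might mirror the values as sketched below. This assumes the numbering given in the spec in this patch ("none" is 1 and the remaining entries count up from there); the class and helper names are hypothetical and only illustrate decoding a raw u32.

```python
from enum import IntEnum


class LockStatusError(IntEnum):
    # Values follow the spec in this patch: "none" is 1, the rest count up.
    NONE = 1
    UNDEFINED = 2
    MEDIA_DOWN = 3
    FRACTIONAL_FREQUENCY_OFFSET_TOO_HIGH = 4


def describe_lock_status_error(raw):
    """Map a raw u32 attribute value to the spec's entry name (hypothetical helper)."""
    try:
        return LockStatusError(raw).name.lower().replace("_", "-")
    except ValueError:
        # Values outside the known range are reported rather than guessed at.
        return "unknown({})".format(raw)
```

Because the enum starts at 1, a value of 0 is reported as unknown by this sketch rather than being treated as "no error".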
Signed-off-by: Jiri Pirko Acked-by: Vadim Fedorenko Reviewed-by: Simon Horman Signed-off-by: Paolo Abeni --- Documentation/netlink/specs/dpll.yaml | 39 +++++++++++++++++++++++++++++++++++ include/uapi/linux/dpll.h | 30 +++++++++++++++++++++++++++ 2 files changed, 69 insertions(+) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/dpll.yaml b/Documentation/netlink/specs/dpll.yaml index b14aed18065f43..1755066d8308c3 100644 --- a/Documentation/netlink/specs/dpll.yaml +++ b/Documentation/netlink/specs/dpll.yaml @@ -51,6 +51,40 @@ definitions: if dpll lock-state was not DPLL_LOCK_STATUS_LOCKED_HO_ACQ, the dpll's lock-state shall remain DPLL_LOCK_STATUS_UNLOCKED) render-max: true + - + type: enum + name: lock-status-error + doc: | + if previous status change was done due to a failure, this provides + information of dpll device lock status error. + Valid values for DPLL_A_LOCK_STATUS_ERROR attribute + entries: + - + name: none + doc: | + dpll device lock status was changed without any error + value: 1 + - + name: undefined + doc: | + dpll device lock status was changed due to undefined error. + Driver fills this value up in case it is not able + to obtain suitable exact error type. + - + name: media-down + doc: | + dpll device lock status was changed because of associated + media got down. + This may happen for example if dpll device was previously + locked on an input pin of type PIN_TYPE_SYNCE_ETH_PORT. + - + name: fractional-frequency-offset-too-high + doc: | + the FFO (Fractional Frequency Offset) between the RX and TX + symbol rate on the media got too high. + This may happen for example if dpll device was previously + locked on an input pin of type PIN_TYPE_SYNCE_ETH_PORT. 
+ render-max: true - type: const name: temp-divider @@ -214,6 +248,10 @@ attribute-sets: name: type type: u32 enum: type + - + name: lock-status-error + type: u32 + enum: lock-status-error - name: pin enum-name: dpll_a_pin @@ -379,6 +417,7 @@ operations: - mode - mode-supported - lock-status + - lock-status-error - temp - clock-id - type diff --git a/include/uapi/linux/dpll.h b/include/uapi/linux/dpll.h index b4e947f9bfbcdb..0c13d7f1a1bc3b 100644 --- a/include/uapi/linux/dpll.h +++ b/include/uapi/linux/dpll.h @@ -50,6 +50,35 @@ enum dpll_lock_status { DPLL_LOCK_STATUS_MAX = (__DPLL_LOCK_STATUS_MAX - 1) }; +/** + * enum dpll_lock_status_error - if previous status change was done due to a + * failure, this provides information of dpll device lock status error. Valid + * values for DPLL_A_LOCK_STATUS_ERROR attribute + * @DPLL_LOCK_STATUS_ERROR_NONE: dpll device lock status was changed without + * any error + * @DPLL_LOCK_STATUS_ERROR_UNDEFINED: dpll device lock status was changed due + * to undefined error. Driver fills this value up in case it is not able to + * obtain suitable exact error type. + * @DPLL_LOCK_STATUS_ERROR_MEDIA_DOWN: dpll device lock status was changed + * because of associated media got down. This may happen for example if dpll + * device was previously locked on an input pin of type + * PIN_TYPE_SYNCE_ETH_PORT. + * @DPLL_LOCK_STATUS_ERROR_FRACTIONAL_FREQUENCY_OFFSET_TOO_HIGH: the FFO + * (Fractional Frequency Offset) between the RX and TX symbol rate on the + * media got too high. This may happen for example if dpll device was + * previously locked on an input pin of type PIN_TYPE_SYNCE_ETH_PORT. 
+ */ +enum dpll_lock_status_error { + DPLL_LOCK_STATUS_ERROR_NONE = 1, + DPLL_LOCK_STATUS_ERROR_UNDEFINED, + DPLL_LOCK_STATUS_ERROR_MEDIA_DOWN, + DPLL_LOCK_STATUS_ERROR_FRACTIONAL_FREQUENCY_OFFSET_TOO_HIGH, + + /* private: */ + __DPLL_LOCK_STATUS_ERROR_MAX, + DPLL_LOCK_STATUS_ERROR_MAX = (__DPLL_LOCK_STATUS_ERROR_MAX - 1) +}; + #define DPLL_TEMP_DIVIDER 1000 /** @@ -150,6 +179,7 @@ enum dpll_a { DPLL_A_LOCK_STATUS, DPLL_A_TEMP, DPLL_A_TYPE, + DPLL_A_LOCK_STATUS_ERROR, __DPLL_A_MAX, DPLL_A_MAX = (__DPLL_A_MAX - 1) -- cgit 1.2.3-korg From 9484b9555de04ed16952dda6518b324f61a6fd6a Mon Sep 17 00:00:00 2001 From: Christian Marangi Date: Wed, 31 Jan 2024 03:26:03 +0100 Subject: dt-bindings: net: ipq4019-mdio: document now supported clock-frequency Document support for clock-frequency and add details on why this property is needed and what values are supported. From internal documentation, while other values are supported, the correct function of the MDIO bus is not assured hence add only the suggested supported values to the property enum. Signed-off-by: Christian Marangi Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller --- .../devicetree/bindings/net/qcom,ipq4019-mdio.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml b/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml index 3407e909e8a7ad..0029e197a8251e 100644 --- a/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml +++ b/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml @@ -44,6 +44,21 @@ properties: items: - const: gcc_mdio_ahb_clk + clock-frequency: + description: + The MDIO bus clock that must be output by the MDIO bus hardware, if + absent, the default hardware values are used. + + MDC rate is feed by an external clock (fixed 100MHz) and is divider + internally. The default divider is /256 resulting in the default rate + applied of 390KHz. 
+ + To follow 802.3 standard that instruct up to 2.5MHz by default, if + this property is not declared and the divider is set to /256, by + default 1.5625Mhz is select. + enum: [ 390625, 781250, 1562500, 3125000, 6250000, 12500000 ] + default: 1562500 + required: - compatible - reg -- cgit 1.2.3-korg From 84f90efd5076525a581e3f923f6c86579f41e713 Mon Sep 17 00:00:00 2001 From: Ravi Gunasekaran Date: Wed, 31 Jan 2024 14:23:51 +0530 Subject: dt-bindings: net: ti: Update maintainers list Update the list with the current maintainers of TI's CPSW ethernet peripheral. Signed-off-by: Ravi Gunasekaran Acked-by: Roger Quadros Signed-off-by: David S. Miller --- Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml | 5 +++-- Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml | 5 +++-- Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml | 5 +++-- 3 files changed, 9 insertions(+), 6 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml b/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml index f07ae3173b03d4..d5bd93ee4dbb16 100644 --- a/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml +++ b/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml @@ -7,8 +7,9 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: TI SoC Ethernet Switch Controller (CPSW) maintainers: - - Grygorii Strashko - - Sekhar Nori + - Siddharth Vadapalli + - Ravi Gunasekaran + - Roger Quadros description: The 3-port switch gigabit ethernet subsystem provides ethernet packet diff --git a/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml b/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml index c9c25132d1544a..73ed5951d29695 100644 --- a/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml +++ b/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml @@ -7,8 +7,9 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: The TI 
AM654x/J721E/AM642x SoC Gigabit Ethernet MAC (Media Access Controller) maintainers: - - Grygorii Strashko - - Sekhar Nori + - Siddharth Vadapalli + - Ravi Gunasekaran + - Roger Quadros description: The TI AM654x/J721E SoC Gigabit Ethernet MAC (CPSW2G NUSS) has two ports diff --git a/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml b/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml index 3e910d3b24a08b..b1c875325776d4 100644 --- a/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml +++ b/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml @@ -7,8 +7,9 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: The TI AM654x/J721E Common Platform Time Sync (CPTS) module maintainers: - - Grygorii Strashko - - Sekhar Nori + - Siddharth Vadapalli + - Ravi Gunasekaran + - Roger Quadros description: |+ The TI AM654x/J721E CPTS module is used to facilitate host control of time -- cgit 1.2.3-korg From 2d9a925d0fbf0dae99af148adaf4f5cadf1be5e0 Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Fri, 2 Feb 2024 14:11:10 -0800 Subject: bpf, docs: Expand set of initial conformance groups This patch attempts to update the ISA specification according to the latest mailing list discussion about conformance groups, in a way that is intended to be consistent with IANA registry processes and IETF 118 WG meeting discussion. It does the following: * Split basic into base32 and base64 for 32-bit vs 64-bit base instructions * Split division/multiplication/modulo instructions out of base groups * Split atomic instructions out of base groups There may be additional changes as discussion continues, but there seems to be consensus on the principles above. 
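Reviewer note: the "includes" relations between the groups in this patch (base64 includes base32, atomic64 includes atomic32, divmul64 includes divmul32) can be checked mechanically. The sketch below is one possible way a tool might compare what a program needs against what a runtime advertises. The patch does not define any discovery mechanism, so `INCLUDES`, `expand` and `can_run` are assumptions made purely for illustration.

```python
# Direct "includes" relations as listed in the patch; keys are the group labels.
INCLUDES = {
    "base32": set(),
    "base64": {"base32"},
    "atomic32": set(),
    "atomic64": {"atomic32"},
    "divmul32": set(),
    "divmul64": {"divmul32"},
    "legacy": set(),
}


def expand(groups):
    """Return the given groups plus every group they include."""
    result = set()
    stack = list(groups)
    while stack:
        group = stack.pop()
        if group not in result:
            result.add(group)
            stack.extend(INCLUDES[group])
    return result


def can_run(runtime_groups, program_groups):
    """True if every group the program needs is covered by the runtime."""
    return expand(program_groups) <= expand(runtime_groups)
```

For example, a runtime advertising base64 and atomic32 covers a program that needs base32 and atomic32, but not one that needs divmul64, because no divmul group is advertised.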
v1->v2: fixed typo pointed out by David Vernet v2->v3: Moved multiplication to same groups as division/modulo Signed-off-by: Dave Thaler Acked-by: David Vernet Link: https://lore.kernel.org/r/20240202221110.3872-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov --- .../bpf/standardization/instruction-set.rst | 48 ++++++++++++++++------ 1 file changed, 36 insertions(+), 12 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst index dcbc9193c66f23..1c4258f1ce9306 100644 --- a/Documentation/bpf/standardization/instruction-set.rst +++ b/Documentation/bpf/standardization/instruction-set.rst @@ -102,7 +102,7 @@ Conformance groups An implementation does not need to support all instructions specified in this document (e.g., deprecated instructions). Instead, a number of conformance -groups are specified. An implementation must support the "basic" conformance +groups are specified. An implementation must support the base32 conformance group and may support additional conformance groups, where supporting a conformance group means it must support all instructions in that conformance group. @@ -112,12 +112,21 @@ that executes instructions, and tools as such compilers that generate instructions for the runtime. Thus, capability discovery in terms of conformance groups might be done manually by users or automatically by tools. -Each conformance group has a short ASCII label (e.g., "basic") that +Each conformance group has a short ASCII label (e.g., "base32") that corresponds to a set of instructions that are mandatory. That is, each instruction has one or more conformance groups of which it is a member. -The "basic" conformance group includes all instructions defined in this -specification unless otherwise noted. +This document defines the following conformance groups: +* base32: includes all instructions defined in this + specification unless otherwise noted. 
+* base64: includes base32, plus instructions explicitly noted + as being in the base64 conformance group. +* atomic32: includes 32-bit atomic operation instructions (see `Atomic operations`_). +* atomic64: includes atomic32, plus 64-bit atomic operation instructions. +* divmul32: includes 32-bit division, multiplication, and modulo instructions. +* divmul64: includes divmul32, plus 64-bit division, multiplication, + and modulo instructions. +* legacy: deprecated packet access instructions. Instruction encoding ==================== @@ -234,7 +243,8 @@ Arithmetic instructions ----------------------- ``BPF_ALU`` uses 32-bit wide operands while ``BPF_ALU64`` uses 64-bit wide operands for -otherwise identical operations. +otherwise identical operations. ``BPF_ALU64`` instructions belong to the +base64 conformance group unless noted otherwise. The 'code' field encodes the operation as below, where 'src' and 'dst' refer to the values of the source and destination registers, respectively. @@ -288,6 +298,10 @@ where '(u32)' indicates that the upper 32 bits are zeroed. Note that most instructions have instruction offset of 0. Only three instructions (``BPF_SDIV``, ``BPF_SMOD``, ``BPF_MOVSX``) have a non-zero offset. +Division, multiplication, and modulo operations for ``BPF_ALU`` are part +of the "divmul32" conformance group, and division, multiplication, and +modulo operations for ``BPF_ALU64`` are part of the "divmul64" conformance +group. The division and modulo operations support both unsigned and signed flavors. For unsigned operations (``BPF_DIV`` and ``BPF_MOD``), for ``BPF_ALU``, @@ -344,7 +358,9 @@ BPF_ALU64 Reserved 0x00 do byte swap unconditionally ========= ========= ===== ================================================= The 'imm' field encodes the width of the swap operations. The following widths -are supported: 16, 32 and 64. +are supported: 16, 32 and 64. 
Width 64 operations belong to the base64 +conformance group and other swap operations belong to the base32 +conformance group. Examples: @@ -369,8 +385,10 @@ Examples: Jump instructions ----------------- -``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for -otherwise identical operations. +``BPF_JMP32`` uses 32-bit wide operands and indicates the base32 +conformance group, while ``BPF_JMP`` uses 64-bit wide operands for +otherwise identical operations, and indicates the base64 conformance +group unless otherwise specified. The 'code' field encodes the operation as below: ======== ===== === =============================== ============================================= @@ -419,6 +437,9 @@ specified by the 'imm' field. A > 16-bit conditional jump may be converted to a < 16-bit conditional jump plus a 32-bit unconditional jump. +All ``BPF_CALL`` and ``BPF_JA`` instructions belong to the +base32 conformance group. + Helper functions ~~~~~~~~~~~~~~~~ @@ -476,6 +497,8 @@ The size modifier is one of: BPF_DW 0x18 double word (8 bytes) ============= ===== ===================== +Instructions using ``BPF_DW`` belong to the base64 conformance group. + Regular load and store operations --------------------------------- @@ -520,8 +543,10 @@ by other BPF programs or means outside of this specification. All atomic operations supported by BPF are encoded as store operations that use the ``BPF_ATOMIC`` mode modifier as follows: -* ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations -* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations +* ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations, which are + part of the "atomic32" conformance group. +* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations, which are + part of the "atomic64" conformance group. * 8-bit and 16-bit wide atomic operations are not supported. The 'imm' field is used to encode the actual atomic operation. @@ -634,5 +659,4 @@ carried over from classic BPF. 
These instructions used an instruction class of BPF_LD, a size modifier of BPF_W, BPF_H, or BPF_B, and a mode modifier of BPF_ABS or BPF_IND. However, these instructions are deprecated and should no longer be used. All legacy packet access -instructions belong to the "legacy" conformance group instead of the "basic" -conformance group. +instructions belong to the "legacy" conformance group. -- cgit 1.2.3-korg From fd2bc4195d5107f88c1b90e1ec935888ccbfc5c0 Mon Sep 17 00:00:00 2001 From: Leon Romanovsky Date: Tue, 3 Oct 2023 20:57:20 +0300 Subject: xfrm: generalize xdo_dev_state_update_curlft to allow statistics update In order to allow drivers to fill all statistics, change the name of xdo_dev_state_update_curlft to be xdo_dev_state_update_stats. Acked-by: Steffen Klassert Signed-off-by: Leon Romanovsky Signed-off-by: Saeed Mahameed --- Documentation/networking/xfrm_device.rst | 4 ++-- drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 7 ++++--- include/linux/netdevice.h | 2 +- include/net/xfrm.h | 11 ++++------- net/xfrm/xfrm_state.c | 4 ++-- net/xfrm/xfrm_user.c | 2 +- 6 files changed, 14 insertions(+), 16 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/xfrm_device.rst b/Documentation/networking/xfrm_device.rst index 535077cbeb07df..bfea9d8579ede0 100644 --- a/Documentation/networking/xfrm_device.rst +++ b/Documentation/networking/xfrm_device.rst @@ -71,9 +71,9 @@ Callbacks to implement bool (*xdo_dev_offload_ok) (struct sk_buff *skb, struct xfrm_state *x); void (*xdo_dev_state_advance_esn) (struct xfrm_state *x); + void (*xdo_dev_state_update_stats) (struct xfrm_state *x); /* Solely packet offload callbacks */ - void (*xdo_dev_state_update_curlft) (struct xfrm_state *x); int (*xdo_dev_policy_add) (struct xfrm_policy *x, struct netlink_ext_ack *extack); void (*xdo_dev_policy_delete) (struct xfrm_policy *x); void (*xdo_dev_policy_free) (struct xfrm_policy *x); @@ -191,6 +191,6 @@ xdo_dev_policy_free() on any remaining 
offloaded states. Outcome of HW handling packets, the XFRM core can't count hard, soft limits. The HW/driver are responsible to perform it and provide accurate data when -xdo_dev_state_update_curlft() is called. In case of one of these limits +xdo_dev_state_update_stats() is called. In case of one of these limits occuried, the driver needs to call to xfrm_state_check_expire() to make sure that XFRM performs rekeying sequence. diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c index 05612d9c6080c7..f160522fbe7573 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c @@ -984,7 +984,7 @@ static void mlx5e_xfrm_advance_esn_state(struct xfrm_state *x) queue_work(sa_entry->ipsec->wq, &work->work); } -static void mlx5e_xfrm_update_curlft(struct xfrm_state *x) +static void mlx5e_xfrm_update_stats(struct xfrm_state *x) { struct mlx5e_ipsec_sa_entry *sa_entry = to_ipsec_sa_entry(x); struct mlx5e_ipsec_rule *ipsec_rule = &sa_entry->ipsec_rule; @@ -993,7 +993,8 @@ static void mlx5e_xfrm_update_curlft(struct xfrm_state *x) lockdep_assert(lockdep_is_held(&x->lock) || lockdep_is_held(&dev_net(x->xso.real_dev)->xfrm.xfrm_cfg_mutex)); - if (x->xso.flags & XFRM_DEV_OFFLOAD_FLAG_ACQ) + if (x->xso.flags & XFRM_DEV_OFFLOAD_FLAG_ACQ || + x->xso.type != XFRM_DEV_OFFLOAD_PACKET) return; mlx5_fc_query_cached(ipsec_rule->fc, &bytes, &packets, &lastuse); @@ -1156,7 +1157,7 @@ static const struct xfrmdev_ops mlx5e_ipsec_xfrmdev_ops = { .xdo_dev_offload_ok = mlx5e_ipsec_offload_ok, .xdo_dev_state_advance_esn = mlx5e_xfrm_advance_esn_state, - .xdo_dev_state_update_curlft = mlx5e_xfrm_update_curlft, + .xdo_dev_state_update_stats = mlx5e_xfrm_update_stats, .xdo_dev_policy_add = mlx5e_xfrm_add_policy, .xdo_dev_policy_delete = mlx5e_xfrm_del_policy, .xdo_dev_policy_free = mlx5e_xfrm_free_policy, diff --git a/include/linux/netdevice.h 
b/include/linux/netdevice.h index 118c40258d07b7..9538576dbebcd6 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1062,7 +1062,7 @@ struct xfrmdev_ops { bool (*xdo_dev_offload_ok) (struct sk_buff *skb, struct xfrm_state *x); void (*xdo_dev_state_advance_esn) (struct xfrm_state *x); - void (*xdo_dev_state_update_curlft) (struct xfrm_state *x); + void (*xdo_dev_state_update_stats) (struct xfrm_state *x); int (*xdo_dev_policy_add) (struct xfrm_policy *x, struct netlink_ext_ack *extack); void (*xdo_dev_policy_delete) (struct xfrm_policy *x); void (*xdo_dev_policy_free) (struct xfrm_policy *x); diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 1d107241b90187..4ca2f3205190c9 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -1578,21 +1578,18 @@ struct xfrm_state *xfrm_state_lookup_byspi(struct net *net, __be32 spi, unsigned short family); int xfrm_state_check_expire(struct xfrm_state *x); #ifdef CONFIG_XFRM_OFFLOAD -static inline void xfrm_dev_state_update_curlft(struct xfrm_state *x) +static inline void xfrm_dev_state_update_stats(struct xfrm_state *x) { struct xfrm_dev_offload *xdo = &x->xso; struct net_device *dev = xdo->dev; - if (x->xso.type != XFRM_DEV_OFFLOAD_PACKET) - return; - if (dev && dev->xfrmdev_ops && - dev->xfrmdev_ops->xdo_dev_state_update_curlft) - dev->xfrmdev_ops->xdo_dev_state_update_curlft(x); + dev->xfrmdev_ops->xdo_dev_state_update_stats) + dev->xfrmdev_ops->xdo_dev_state_update_stats(x); } #else -static inline void xfrm_dev_state_update_curlft(struct xfrm_state *x) {} +static inline void xfrm_dev_state_update_stats(struct xfrm_state *x) {} #endif void xfrm_state_insert(struct xfrm_state *x); int xfrm_state_add(struct xfrm_state *x); diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index bda5327bf34dff..d8701b2d0d57b9 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -570,7 +570,7 @@ static enum hrtimer_restart xfrm_timer_handler(struct hrtimer *me) int err = 0; 
spin_lock(&x->lock); - xfrm_dev_state_update_curlft(x); + xfrm_dev_state_update_stats(x); if (x->km.state == XFRM_STATE_DEAD) goto out; @@ -1935,7 +1935,7 @@ EXPORT_SYMBOL(xfrm_state_update); int xfrm_state_check_expire(struct xfrm_state *x) { - xfrm_dev_state_update_curlft(x); + xfrm_dev_state_update_stats(x); if (!READ_ONCE(x->curlft.use_time)) WRITE_ONCE(x->curlft.use_time, ktime_get_real_seconds()); diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index ad01997c3aa9dd..dc4f9b8d7cb0fb 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -902,7 +902,7 @@ static void copy_to_user_state(struct xfrm_state *x, struct xfrm_usersa_info *p) memcpy(&p->sel, &x->sel, sizeof(p->sel)); memcpy(&p->lft, &x->lft, sizeof(p->lft)); if (x->xso.dev) - xfrm_dev_state_update_curlft(x); + xfrm_dev_state_update_stats(x); memcpy(&p->curlft, &x->curlft, sizeof(p->curlft)); put_unaligned(x->stats.replay_window, &p->stats.replay_window); put_unaligned(x->stats.replay, &p->stats.replay); -- cgit 1.2.3-korg From 21e16fa5dc6c6f69d9f2cf84e6d8f147ec4c1fbe Mon Sep 17 00:00:00 2001 From: Moshe Shemesh Date: Fri, 26 Jan 2024 15:05:58 +0200 Subject: Documentation: Fix counter name of mlx5 vnic reporter Fix counter name in documentation of mlx5 vnic health reporter diagnose output: total_error_queues. While here fix alignment in the documentation file of another counter, comp_eq_overrun, as it should have its own line and not be part of another counter's description. 
Example:
$ devlink health diagnose pci/0000:00:04.0 reporter vnic
 vNIC env counters:
    total_error_queues: 0
    send_queue_priority_update_flow: 0
    comp_eq_overrun: 0
    async_eq_overrun: 0
    cq_overrun: 0
    invalid_command: 0
    quota_exceeded_command: 0
    nic_receive_steering_discard: 0

Signed-off-by: Moshe Shemesh
Signed-off-by: Saeed Mahameed
---
 Documentation/networking/devlink/mlx5.rst | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index b9587b3400b903..456985407475f1 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -250,7 +250,7 @@ them in realtime.
 
 Description of the vnic counters:
 
-- total_q_under_processor_handle
+- total_error_queues
         number of queues in an error state due to an async error or errored
         command.
 - send_queue_priority_update_flow
@@ -259,7 +259,8 @@ Description of the vnic counters:
         number of times CQ entered an error state due to an overflow.
 - async_eq_overrun
         number of times an EQ mapped to async events was overrun.
-        comp_eq_overrun number of times an EQ mapped to completion events was
+- comp_eq_overrun
+        number of times an EQ mapped to completion events was
         overrun.
 - quota_exceeded_command
         number of commands issued and failed due to quota exceeded.
-- 
cgit 1.2.3-korg

From 968595a93669b6b4f6d1fcf80cf2d97956b6868f Mon Sep 17 00:00:00 2001
From: Magnus Karlsson
Date: Mon, 5 Feb 2024 13:35:51 +0100
Subject: xsk: document ability to redirect to any socket bound to the same umem

Document the ability to redirect to any socket bound to the same umem.
Signed-off-by: Magnus Karlsson Link: https://lore.kernel.org/r/20240205123553.22180-3-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov --- Documentation/networking/af_xdp.rst | 33 +++++++++++++++++++-------------- 1 file changed, 19 insertions(+), 14 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst index dceeb0d763aa23..72da7057e4cf96 100644 --- a/Documentation/networking/af_xdp.rst +++ b/Documentation/networking/af_xdp.rst @@ -329,23 +329,24 @@ XDP_SHARED_UMEM option and provide the initial socket's fd in the sxdp_shared_umem_fd field as you registered the UMEM on that socket. These two sockets will now share one and the same UMEM. -There is no need to supply an XDP program like the one in the previous -case where sockets were bound to the same queue id and -device. Instead, use the NIC's packet steering capabilities to steer -the packets to the right queue. In the previous example, there is only -one queue shared among sockets, so the NIC cannot do this steering. It -can only steer between queues. - -In libbpf, you need to use the xsk_socket__create_shared() API as it -takes a reference to a FILL ring and a COMPLETION ring that will be -created for you and bound to the shared UMEM. You can use this -function for all the sockets you create, or you can use it for the -second and following ones and use xsk_socket__create() for the first -one. Both methods yield the same result. +In this case, it is possible to use the NIC's packet steering +capabilities to steer the packets to the right queue. This is not +possible in the previous example as there is only one queue shared +among sockets, so the NIC cannot do this steering as it can only steer +between queues. + +In libxdp (or libbpf prior to version 1.0), you need to use the +xsk_socket__create_shared() API as it takes a reference to a FILL ring +and a COMPLETION ring that will be created for you and bound to the +shared UMEM. 
You can use this function for all the sockets you create, +or you can use it for the second and following ones and use +xsk_socket__create() for the first one. Both methods yield the same +result. Note that a UMEM can be shared between sockets on the same queue id and device, as well as between queues on the same device and between -devices at the same time. +devices at the same time. It is also possible to redirect to any +socket as long as it is bound to the same umem with XDP_SHARED_UMEM. XDP_USE_NEED_WAKEUP bind flag ----------------------------- @@ -822,6 +823,10 @@ A: The short answer is no, that is not supported at the moment. The switch, or other distribution mechanism, in your NIC to direct traffic to the correct queue id and socket. + Note that if you are using the XDP_SHARED_UMEM option, it is + possible to switch traffic between any socket bound to the same + umem. + Q: My packets are sometimes corrupted. What is wrong? A: Care has to be taken not to feed the same buffer in the UMEM into -- cgit 1.2.3-korg From 240fd405528bbf7fafa0559202ca7aa524c9cd96 Mon Sep 17 00:00:00 2001 From: Aahil Awatramani Date: Fri, 2 Feb 2024 17:58:58 +0000 Subject: bonding: Add independent control state machine Add support for the independent control state machine per IEEE 802.1AX-2008 5.4.15 in addition to the existing implementation of the coupled control state machine. Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in the LACP MUX state machine for separated handling of an initial Collecting state before the Collecting and Distributing state. This enables a port to be in a state where it can receive incoming packets while not still distributing. This is useful for reducing packet loss when a port begins distributing before its partner is able to collect. Added new functions such as bond_set_slave_tx_disabled_flags and bond_set_slave_rx_enabled_flags to precisely manage the port's collecting and distributing states. 
Previously, there was no dedicated method to disable TX while keeping RX enabled, which this patch addresses. Note that the regular flow process in the kernel's bonding driver remains unaffected by this patch. The extension requires explicit opt-in by the user (in order to ensure no disruptions for existing setups) via netlink support using the new bonding parameter coupled_control. The default value for coupled_control is set to 1 so as to preserve existing behaviour. Signed-off-by: Aahil Awatramani Reviewed-by: Hangbin Liu Link: https://lore.kernel.org/r/20240202175858.1573852-1-aahila@google.com Signed-off-by: Paolo Abeni --- Documentation/networking/bonding.rst | 12 +++ drivers/net/bonding/bond_3ad.c | 157 +++++++++++++++++++++++++++++++++-- drivers/net/bonding/bond_main.c | 1 + drivers/net/bonding/bond_netlink.c | 16 ++++ drivers/net/bonding/bond_options.c | 28 ++++++- include/net/bond_3ad.h | 2 + include/net/bond_options.h | 1 + include/net/bonding.h | 23 +++++ include/uapi/linux/if_link.h | 1 + tools/include/uapi/linux/if_link.h | 1 + 10 files changed, 234 insertions(+), 8 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/bonding.rst b/Documentation/networking/bonding.rst index f7a73421eb76a1..e774b48de9f511 100644 --- a/Documentation/networking/bonding.rst +++ b/Documentation/networking/bonding.rst @@ -444,6 +444,18 @@ arp_missed_max The default value is 2, and the allowable range is 1 - 255. +coupled_control + + Specifies whether the LACP state machine's MUX in the 802.3ad mode + should have separate Collecting and Distributing states. + + This is by implementing the independent control state machine per + IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled control + state machine. + + The default value is 1. This setting does not separate the Collecting + and Distributing states, maintaining the bond in coupled control. 
+ downdelay Specifies the time, in milliseconds, to wait before disabling diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c index c99ffe6c683a38..f2942e8c6c91b4 100644 --- a/drivers/net/bonding/bond_3ad.c +++ b/drivers/net/bonding/bond_3ad.c @@ -106,6 +106,9 @@ static void ad_agg_selection_logic(struct aggregator *aggregator, static void ad_clear_agg(struct aggregator *aggregator); static void ad_initialize_agg(struct aggregator *aggregator); static void ad_initialize_port(struct port *port, int lacp_fast); +static void ad_enable_collecting(struct port *port); +static void ad_disable_distributing(struct port *port, + bool *update_slave_arr); static void ad_enable_collecting_distributing(struct port *port, bool *update_slave_arr); static void ad_disable_collecting_distributing(struct port *port, @@ -171,9 +174,38 @@ static inline int __agg_has_partner(struct aggregator *agg) return !is_zero_ether_addr(agg->partner_system.mac_addr_value); } +/** + * __disable_distributing_port - disable the port's slave for distributing. + * Port will still be able to collect. + * @port: the port we're looking at + * + * This will disable only distributing on the port's slave. + */ +static void __disable_distributing_port(struct port *port) +{ + bond_set_slave_tx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER); +} + +/** + * __enable_collecting_port - enable the port's slave for collecting, + * if it's up + * @port: the port we're looking at + * + * This will enable only collecting on the port's slave. + */ +static void __enable_collecting_port(struct port *port) +{ + struct slave *slave = port->slave; + + if (slave->link == BOND_LINK_UP && bond_slave_is_up(slave)) + bond_set_slave_rx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER); +} + /** * __disable_port - disable the port's slave * @port: the port we're looking at + * + * This will disable both collecting and distributing on the port's slave. 
*/ static inline void __disable_port(struct port *port) { @@ -183,6 +215,8 @@ static inline void __disable_port(struct port *port) /** * __enable_port - enable the port's slave, if it's up * @port: the port we're looking at + * + * This will enable both collecting and distributing on the port's slave. */ static inline void __enable_port(struct port *port) { @@ -193,10 +227,27 @@ static inline void __enable_port(struct port *port) } /** - * __port_is_enabled - check if the port's slave is in active state + * __port_move_to_attached_state - check if port should transition back to attached + * state. + * @port: the port we're looking at + */ +static bool __port_move_to_attached_state(struct port *port) +{ + if (!(port->sm_vars & AD_PORT_SELECTED) || + (port->sm_vars & AD_PORT_STANDBY) || + !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) || + !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) + port->sm_mux_state = AD_MUX_ATTACHED; + + return port->sm_mux_state == AD_MUX_ATTACHED; +} + +/** + * __port_is_collecting_distributing - check if the port's slave is in the + * combined collecting/distributing state * @port: the port we're looking at */ -static inline int __port_is_enabled(struct port *port) +static int __port_is_collecting_distributing(struct port *port) { return bond_is_active_slave(port->slave); } @@ -942,6 +993,7 @@ static int ad_marker_send(struct port *port, struct bond_marker *marker) */ static void ad_mux_machine(struct port *port, bool *update_slave_arr) { + struct bonding *bond = __get_bond_by_port(port); mux_states_t last_state; /* keep current State Machine state to compare later if it was @@ -999,9 +1051,13 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr) if ((port->sm_vars & AD_PORT_SELECTED) && (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) && !__check_agg_selection_timer(port)) { - if (port->aggregator->is_active) - port->sm_mux_state = - AD_MUX_COLLECTING_DISTRIBUTING; + if 
(port->aggregator->is_active) { + int state = AD_MUX_COLLECTING_DISTRIBUTING; + + if (!bond->params.coupled_control) + state = AD_MUX_COLLECTING; + port->sm_mux_state = state; + } } else if (!(port->sm_vars & AD_PORT_SELECTED) || (port->sm_vars & AD_PORT_STANDBY)) { /* if UNSELECTED or STANDBY */ @@ -1019,11 +1075,45 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr) } break; case AD_MUX_COLLECTING_DISTRIBUTING: + if (!__port_move_to_attached_state(port)) { + /* if port state hasn't changed make + * sure that a collecting distributing + * port in an active aggregator is enabled + */ + if (port->aggregator->is_active && + !__port_is_collecting_distributing(port)) { + __enable_port(port); + *update_slave_arr = true; + } + } + break; + case AD_MUX_COLLECTING: + if (!__port_move_to_attached_state(port)) { + if ((port->sm_vars & AD_PORT_SELECTED) && + (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) && + (port->partner_oper.port_state & LACP_STATE_COLLECTING)) { + port->sm_mux_state = AD_MUX_DISTRIBUTING; + } else { + /* If port state hasn't changed, make sure that a collecting + * port is enabled for an active aggregator. 
+ */ + struct slave *slave = port->slave; + + if (port->aggregator->is_active && + bond_is_slave_rx_disabled(slave)) { + ad_enable_collecting(port); + *update_slave_arr = true; + } + } + } + break; + case AD_MUX_DISTRIBUTING: if (!(port->sm_vars & AD_PORT_SELECTED) || (port->sm_vars & AD_PORT_STANDBY) || + !(port->partner_oper.port_state & LACP_STATE_COLLECTING) || !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) || !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) { - port->sm_mux_state = AD_MUX_ATTACHED; + port->sm_mux_state = AD_MUX_COLLECTING; } else { /* if port state hasn't changed make * sure that a collecting distributing @@ -1031,7 +1121,7 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr) */ if (port->aggregator && port->aggregator->is_active && - !__port_is_enabled(port)) { + !__port_is_collecting_distributing(port)) { __enable_port(port); *update_slave_arr = true; } @@ -1082,6 +1172,20 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr) update_slave_arr); port->ntt = true; break; + case AD_MUX_COLLECTING: + port->actor_oper_port_state |= LACP_STATE_COLLECTING; + port->actor_oper_port_state &= ~LACP_STATE_DISTRIBUTING; + port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION; + ad_enable_collecting(port); + ad_disable_distributing(port, update_slave_arr); + port->ntt = true; + break; + case AD_MUX_DISTRIBUTING: + port->actor_oper_port_state |= LACP_STATE_DISTRIBUTING; + port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION; + ad_enable_collecting_distributing(port, + update_slave_arr); + break; default: break; } @@ -1906,6 +2010,45 @@ static void ad_initialize_port(struct port *port, int lacp_fast) } } +/** + * ad_enable_collecting - enable a port's receive + * @port: the port we're looking at + * + * Enable @port if it's in an active aggregator + */ +static void ad_enable_collecting(struct port *port) +{ + if (port->aggregator->is_active) { + struct slave *slave = port->slave; + + 
slave_dbg(slave->bond->dev, slave->dev, + "Enabling collecting on port %d (LAG %d)\n", + port->actor_port_number, + port->aggregator->aggregator_identifier); + __enable_collecting_port(port); + } +} + +/** + * ad_disable_distributing - disable a port's transmit + * @port: the port we're looking at + * @update_slave_arr: Does slave array need update? + */ +static void ad_disable_distributing(struct port *port, bool *update_slave_arr) +{ + if (port->aggregator && + !MAC_ADDRESS_EQUAL(&port->aggregator->partner_system, + &(null_mac_addr))) { + slave_dbg(port->slave->bond->dev, port->slave->dev, + "Disabling distributing on port %d (LAG %d)\n", + port->actor_port_number, + port->aggregator->aggregator_identifier); + __disable_distributing_port(port); + /* Slave array needs an update */ + *update_slave_arr = true; + } +} + /** * ad_enable_collecting_distributing - enable a port's transmit/receive * @port: the port we're looking at diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 4e0600c7b050f2..ae9d32c0faf40c 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -6306,6 +6306,7 @@ static int __init bond_check_params(struct bond_params *params) params->ad_actor_sys_prio = ad_actor_sys_prio; eth_zero_addr(params->ad_actor_system); params->ad_user_port_key = ad_user_port_key; + params->coupled_control = 1; if (packets_per_slave > 0) { params->reciprocal_packets_per_slave = reciprocal_value(packets_per_slave); diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c index cfa74cf8bb1a95..29b4c3d1b9b6ff 100644 --- a/drivers/net/bonding/bond_netlink.c +++ b/drivers/net/bonding/bond_netlink.c @@ -122,6 +122,7 @@ static const struct nla_policy bond_policy[IFLA_BOND_MAX + 1] = { [IFLA_BOND_PEER_NOTIF_DELAY] = NLA_POLICY_FULL_RANGE(NLA_U32, &delay_range), [IFLA_BOND_MISSED_MAX] = { .type = NLA_U8 }, [IFLA_BOND_NS_IP6_TARGET] = { .type = NLA_NESTED }, + [IFLA_BOND_COUPLED_CONTROL] = { 
.type = NLA_U8 }, }; static const struct nla_policy bond_slave_policy[IFLA_BOND_SLAVE_MAX + 1] = { @@ -549,6 +550,16 @@ static int bond_changelink(struct net_device *bond_dev, struct nlattr *tb[], return err; } + if (data[IFLA_BOND_COUPLED_CONTROL]) { + int coupled_control = nla_get_u8(data[IFLA_BOND_COUPLED_CONTROL]); + + bond_opt_initval(&newval, coupled_control); + err = __bond_opt_set(bond, BOND_OPT_COUPLED_CONTROL, &newval, + data[IFLA_BOND_COUPLED_CONTROL], extack); + if (err) + return err; + } + return 0; } @@ -615,6 +626,7 @@ static size_t bond_get_size(const struct net_device *bond_dev) /* IFLA_BOND_NS_IP6_TARGET */ nla_total_size(sizeof(struct nlattr)) + nla_total_size(sizeof(struct in6_addr)) * BOND_MAX_NS_TARGETS + + nla_total_size(sizeof(u8)) + /* IFLA_BOND_COUPLED_CONTROL */ 0; } @@ -774,6 +786,10 @@ static int bond_fill_info(struct sk_buff *skb, bond->params.missed_max)) goto nla_put_failure; + if (nla_put_u8(skb, IFLA_BOND_COUPLED_CONTROL, + bond->params.coupled_control)) + goto nla_put_failure; + if (BOND_MODE(bond) == BOND_MODE_8023AD) { struct ad_info info; diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c index f3f27f0bd2a6cd..4cdbc7e084f4b4 100644 --- a/drivers/net/bonding/bond_options.c +++ b/drivers/net/bonding/bond_options.c @@ -84,7 +84,8 @@ static int bond_option_ad_user_port_key_set(struct bonding *bond, const struct bond_opt_value *newval); static int bond_option_missed_max_set(struct bonding *bond, const struct bond_opt_value *newval); - +static int bond_option_coupled_control_set(struct bonding *bond, + const struct bond_opt_value *newval); static const struct bond_opt_value bond_mode_tbl[] = { { "balance-rr", BOND_MODE_ROUNDROBIN, BOND_VALFLAG_DEFAULT}, @@ -232,6 +233,12 @@ static const struct bond_opt_value bond_missed_max_tbl[] = { { NULL, -1, 0}, }; +static const struct bond_opt_value bond_coupled_control_tbl[] = { + { "on", 1, BOND_VALFLAG_DEFAULT}, + { "off", 0, 0}, + { NULL, -1, 0}, +}; + 
static const struct bond_option bond_opts[BOND_OPT_LAST] = { [BOND_OPT_MODE] = { .id = BOND_OPT_MODE, @@ -496,6 +503,15 @@ static const struct bond_option bond_opts[BOND_OPT_LAST] = { .desc = "Delay between each peer notification on failover event, in milliseconds", .values = bond_peer_notif_delay_tbl, .set = bond_option_peer_notif_delay_set + }, + [BOND_OPT_COUPLED_CONTROL] = { + .id = BOND_OPT_COUPLED_CONTROL, + .name = "coupled_control", + .desc = "Opt into using coupled control MUX for LACP states", + .unsuppmodes = BOND_MODE_ALL_EX(BIT(BOND_MODE_8023AD)), + .flags = BOND_OPTFLAG_IFDOWN, + .values = bond_coupled_control_tbl, + .set = bond_option_coupled_control_set, } }; @@ -1692,3 +1708,13 @@ static int bond_option_ad_user_port_key_set(struct bonding *bond, bond->params.ad_user_port_key = newval->value; return 0; } + +static int bond_option_coupled_control_set(struct bonding *bond, + const struct bond_opt_value *newval) +{ + netdev_info(bond->dev, "Setting coupled_control to %s (%llu)\n", + newval->string, newval->value); + + bond->params.coupled_control = newval->value; + return 0; +} diff --git a/include/net/bond_3ad.h b/include/net/bond_3ad.h index c5e57c6bd87367..9ce5ac2bfbad9e 100644 --- a/include/net/bond_3ad.h +++ b/include/net/bond_3ad.h @@ -54,6 +54,8 @@ typedef enum { AD_MUX_DETACHED, /* mux machine */ AD_MUX_WAITING, /* mux machine */ AD_MUX_ATTACHED, /* mux machine */ + AD_MUX_COLLECTING, /* mux machine */ + AD_MUX_DISTRIBUTING, /* mux machine */ AD_MUX_COLLECTING_DISTRIBUTING /* mux machine */ } mux_states_t; diff --git a/include/net/bond_options.h b/include/net/bond_options.h index 69292ecc03257f..473a0147769eb9 100644 --- a/include/net/bond_options.h +++ b/include/net/bond_options.h @@ -76,6 +76,7 @@ enum { BOND_OPT_MISSED_MAX, BOND_OPT_NS_TARGETS, BOND_OPT_PRIO, + BOND_OPT_COUPLED_CONTROL, BOND_OPT_LAST }; diff --git a/include/net/bonding.h b/include/net/bonding.h index 5b8b1b644a2dbf..b61fb1aa3a56b5 100644 --- a/include/net/bonding.h +++ 
b/include/net/bonding.h @@ -148,6 +148,7 @@ struct bond_params { #if IS_ENABLED(CONFIG_IPV6) struct in6_addr ns_targets[BOND_MAX_NS_TARGETS]; #endif + int coupled_control; /* 2 bytes of padding : see ether_addr_equal_64bits() */ u8 ad_actor_system[ETH_ALEN + 2]; @@ -167,6 +168,7 @@ struct slave { u8 backup:1, /* indicates backup slave. Value corresponds with BOND_STATE_ACTIVE and BOND_STATE_BACKUP */ inactive:1, /* indicates inactive slave */ + rx_disabled:1, /* indicates whether slave's Rx is disabled */ should_notify:1, /* indicates whether the state changed */ should_notify_link:1; /* indicates whether the link changed */ u8 duplex; @@ -568,6 +570,14 @@ static inline void bond_set_slave_inactive_flags(struct slave *slave, bond_set_slave_state(slave, BOND_STATE_BACKUP, notify); if (!slave->bond->params.all_slaves_active) slave->inactive = 1; + if (BOND_MODE(slave->bond) == BOND_MODE_8023AD) + slave->rx_disabled = 1; +} + +static inline void bond_set_slave_tx_disabled_flags(struct slave *slave, + bool notify) +{ + bond_set_slave_state(slave, BOND_STATE_BACKUP, notify); } static inline void bond_set_slave_active_flags(struct slave *slave, @@ -575,6 +585,14 @@ static inline void bond_set_slave_active_flags(struct slave *slave, { bond_set_slave_state(slave, BOND_STATE_ACTIVE, notify); slave->inactive = 0; + if (BOND_MODE(slave->bond) == BOND_MODE_8023AD) + slave->rx_disabled = 0; +} + +static inline void bond_set_slave_rx_enabled_flags(struct slave *slave, + bool notify) +{ + slave->rx_disabled = 0; } static inline bool bond_is_slave_inactive(struct slave *slave) @@ -582,6 +600,11 @@ static inline bool bond_is_slave_inactive(struct slave *slave) return slave->inactive; } +static inline bool bond_is_slave_rx_disabled(struct slave *slave) +{ + return slave->rx_disabled; +} + static inline void bond_propose_link_state(struct slave *slave, int state) { slave->link_new_state = state; diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index 
ab9bcff96e4da1..ffa637b38c93bc 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -1505,6 +1505,7 @@ enum { IFLA_BOND_AD_LACP_ACTIVE, IFLA_BOND_MISSED_MAX, IFLA_BOND_NS_IP6_TARGET, + IFLA_BOND_COUPLED_CONTROL, __IFLA_BOND_MAX, }; diff --git a/tools/include/uapi/linux/if_link.h b/tools/include/uapi/linux/if_link.h index a0aa05a28cf29c..f0d71b2a3f1e1a 100644 --- a/tools/include/uapi/linux/if_link.h +++ b/tools/include/uapi/linux/if_link.h @@ -974,6 +974,7 @@ enum { IFLA_BOND_AD_LACP_ACTIVE, IFLA_BOND_MISSED_MAX, IFLA_BOND_NS_IP6_TARGET, + IFLA_BOND_COUPLED_CONTROL, __IFLA_BOND_MAX, }; -- cgit 1.2.3-korg From 563918a0e3afd97bcfb680b72c52ec080c82aea6 Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Mon, 5 Feb 2024 20:51:46 -0800 Subject: bpf, docs: Fix typos in instructions-set.rst * "imm32" should just be "imm" * Add blank line to fix formatting error reported by Stephen Rothwell [0] [0]: https://lore.kernel.org/bpf/20240206153301.4ead0bad@canb.auug.org.au/T/#u Signed-off-by: Dave Thaler Acked-by: David Vernet Link: https://lore.kernel.org/r/20240206045146.4965-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/standardization/instruction-set.rst | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst index 1c4258f1ce9306..bdfe0cd0e49952 100644 --- a/Documentation/bpf/standardization/instruction-set.rst +++ b/Documentation/bpf/standardization/instruction-set.rst @@ -117,6 +117,7 @@ corresponds to a set of instructions that are mandatory. That is, each instruction has one or more conformance groups of which it is a member. This document defines the following conformance groups: + * base32: includes all instructions defined in this specification unless otherwise noted. 
 * base64: includes base32, plus instructions explicitly noted
@@ -289,11 +290,11 @@ where '(u32)' indicates that the upper 32 bits are zeroed.
 
 ``BPF_XOR | BPF_K | BPF_ALU`` means::
 
-  dst = (u32) dst ^ (u32) imm32
+  dst = (u32) dst ^ (u32) imm
 
 ``BPF_XOR | BPF_K | BPF_ALU64`` means::
 
-  dst = dst ^ imm32
+  dst = dst ^ imm
 
 Note that most instructions have instruction offset of 0. Only three instructions
 (``BPF_SDIV``, ``BPF_SMOD``, ``BPF_MOVSX``) have a non-zero offset.
@@ -511,7 +512,7 @@ instructions that transfer data between a register and memory.
 
 ``BPF_MEM | | BPF_ST`` means::
 
-  *(size *) (dst + offset) = imm32
+  *(size *) (dst + offset) = imm
 
 ``BPF_MEM | | BPF_LDX`` means::
 
-- 
cgit 1.2.3-korg

From 70ff9a91e86850103f71d5920eff6bee81bd2a0d Mon Sep 17 00:00:00 2001
From: Alessandro Marcolini
Date: Sat, 3 Feb 2024 14:16:52 +0100
Subject: doc: netlink: specs: tc: add multi-attr to tc-taprio-sched-entry

Add multi-attr attribute to tc-taprio-sched-entry to specify multiple entries.

Signed-off-by: Alessandro Marcolini
Reviewed-by: Donald Hunter
Reviewed-by: Jakub Kicinski
Link: https://lore.kernel.org/r/0ba5088ea715103a2bce83b12e2dcbdaa08da6ac.1706962013.git.alessandromarcolini99@gmail.com
Signed-off-by: Jakub Kicinski
---
 Documentation/netlink/specs/tc.yaml | 1 +
 1 file changed, 1 insertion(+)

(limited to 'Documentation')

diff --git a/Documentation/netlink/specs/tc.yaml b/Documentation/netlink/specs/tc.yaml
index 4b21b00dbebe6c..324fa182cd1491 100644
--- a/Documentation/netlink/specs/tc.yaml
+++ b/Documentation/netlink/specs/tc.yaml
@@ -3376,6 +3376,7 @@ attribute-sets:
         name: entry
         type: nest
         nested-attributes: tc-taprio-sched-entry
+        multi-attr: true
   -
     name: tc-taprio-sched-entry
     attributes:
-- 
cgit 1.2.3-korg

From aa7b608d69ea9dfd65ef3694b22a2017d54e3d5b Mon Sep 17 00:00:00 2001
From: Matthew Wood
Date: Sun, 4 Feb 2024 15:27:35 -0800
Subject: net: netconsole: add docs for appending netconsole user data

Add a new User Data section to the netconsole docs to describe the
appending of user data capability (for netconsole dynamic configuration)
with usage and netconsole output examples.

Co-developed-by: Breno Leitao
Signed-off-by: Breno Leitao
Signed-off-by: Matthew Wood
Signed-off-by: David S. Miller
---
 Documentation/networking/netconsole.rst | 66 +++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst
index 390730a74332d7..b28c525e5d1e22 100644
--- a/Documentation/networking/netconsole.rst
+++ b/Documentation/networking/netconsole.rst
@@ -15,6 +15,8 @@ Extended console support by Tejun Heo , May 1 2015
 
 Release prepend support by Breno Leitao , Jul 7 2023
 
+Userdata append support by Matthew Wood , Jan 22 2024
+
 Please send bug reports to Matt Mackall
 Satyam Sharma , and Cong Wang
 
@@ -171,6 +173,70 @@ You can modify these targets in runtime by creating the following targets::
  cat cmdline1/remote_ip
  10.0.0.3
 
+Append User Data
+----------------
+
+Custom user data can be appended to the end of messages with netconsole
+dynamic configuration enabled. User data entries can be modified without
+changing the "enabled" attribute of a target.
+
+Directories (keys) under `userdata` are limited to 54 character length, and
+data in `userdata//value` are limited to 200 bytes::
+
+ cd /sys/kernel/config/netconsole && mkdir cmdline0
+ cd cmdline0
+ mkdir userdata/foo
+ echo bar > userdata/foo/value
+ mkdir userdata/qux
+ echo baz > userdata/qux/value
+
+Messages will now include this additional user data::
+
+ echo "This is a message" > /dev/kmsg
+
+Sends::
+
+ 12,607,22085407756,-;This is a message
+ foo=bar
+ qux=baz
+
+Preview the userdata that will be appended with::
+
+ cd /sys/kernel/config/netconsole/cmdline0/userdata
+ for f in `ls userdata`; do echo $f=$(cat userdata/$f/value); done
+
+If a `userdata` entry is created but no data is written to the `value` file,
+the entry will be omitted from netconsole messages::
+
+ cd /sys/kernel/config/netconsole && mkdir cmdline0
+ cd cmdline0
+ mkdir userdata/foo
+ echo bar > userdata/foo/value
+ mkdir userdata/qux
+
+The `qux` key is omitted since it has no value::
+
+ echo "This is a message" > /dev/kmsg
+ 12,607,22085407756,-;This is a message
+ foo=bar
+
+Delete `userdata` entries with `rmdir`::
+
+ rmdir /sys/kernel/config/netconsole/cmdline0/userdata/qux
+
+.. warning::
+   When writing strings to user data values, input is broken up per line in
+   configfs store calls and this can cause confusing behavior::
+
+     mkdir userdata/testing
+     printf "val1\nval2" > userdata/testing/value
+     # userdata store value is called twice, first with "val1\n" then "val2"
+     # so "val2" is stored, being the last value stored
+     cat userdata/testing/value
+     val2
+
+   It is recommended to not write user data values with newlines.
+
 Extended console:
 =================
 
-- 
cgit 1.2.3-korg

From 409c38d4f156740bf3165fd6ceae4fa6425eebf4 Mon Sep 17 00:00:00 2001
From: Jinjian Song
Date: Mon, 5 Feb 2024 18:22:28 +0800
Subject: net: wwan: t7xx: Add sysfs attribute for device state machine

Add support for userspace to get/set the device mode,
device's state machine changes between (unknown/ready/reset/fastboot).

Get the device state mode:
 - 'cat /sys/bus/pci/devices/${bdf}/t7xx_mode'

Set the device state mode:
 - reset(cold reset): 'echo reset > /sys/bus/pci/devices/${bdf}/t7xx_mode'
 - fastboot: 'echo fastboot_switching > /sys/bus/pci/devices/${bdf}/t7xx_mode'

Reload driver to get the new device state after setting operation.

Signed-off-by: Jinjian Song
Signed-off-by: David S. Miller
---
 .../networking/device_drivers/wwan/t7xx.rst | 28 ++++++
 drivers/net/wwan/t7xx/t7xx_modem_ops.c | 6 ++
 drivers/net/wwan/t7xx/t7xx_modem_ops.h | 1 +
 drivers/net/wwan/t7xx/t7xx_pci.c | 101 ++++++++++++++++++++-
 drivers/net/wwan/t7xx/t7xx_pci.h | 14 ++-
 drivers/net/wwan/t7xx/t7xx_state_monitor.c | 1 +
 6 files changed, 145 insertions(+), 6 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/networking/device_drivers/wwan/t7xx.rst b/Documentation/networking/device_drivers/wwan/t7xx.rst
index dd5b731957ca40..8429b992734177 100644
--- a/Documentation/networking/device_drivers/wwan/t7xx.rst
+++ b/Documentation/networking/device_drivers/wwan/t7xx.rst
@@ -39,6 +39,34 @@ command and receive response:
 
 - open the AT control channel using a UART tool or a special user tool
 
+Sysfs
+=====
+The driver provides sysfs interfaces to userspace.
+
+t7xx_mode
+---------
+The sysfs interface provides userspace with access to the device mode, this interface
+supports read and write operations.
+ +Device mode: + +- ``unknown`` represents that device in unknown status +- ``ready`` represents that device in ready status +- ``reset`` represents that device in reset status +- ``fastboot_switching`` represents that device in fastboot switching status +- ``fastboot_download`` represents that device in fastboot download status +- ``fastboot_dump`` represents that device in fastboot dump status + +Read from userspace to get the current device mode. + +:: + $ cat /sys/bus/pci/devices/${bdf}/t7xx_mode + +Write from userspace to set the device mode. + +:: + $ echo fastboot_switching > /sys/bus/pci/devices/${bdf}/t7xx_mode + Management application development ================================== The driver and userspace interfaces are described below. The MBIM protocol is diff --git a/drivers/net/wwan/t7xx/t7xx_modem_ops.c b/drivers/net/wwan/t7xx/t7xx_modem_ops.c index 24e7d491468e0a..ca262d2961ed7f 100644 --- a/drivers/net/wwan/t7xx/t7xx_modem_ops.c +++ b/drivers/net/wwan/t7xx/t7xx_modem_ops.c @@ -177,6 +177,11 @@ int t7xx_acpi_fldr_func(struct t7xx_pci_dev *t7xx_dev) return t7xx_acpi_reset(t7xx_dev, "_RST"); } +int t7xx_acpi_pldr_func(struct t7xx_pci_dev *t7xx_dev) +{ + return t7xx_acpi_reset(t7xx_dev, "MRST._RST"); +} + static void t7xx_reset_device_via_pmic(struct t7xx_pci_dev *t7xx_dev) { u32 val; @@ -192,6 +197,7 @@ static irqreturn_t t7xx_rgu_isr_thread(int irq, void *data) { struct t7xx_pci_dev *t7xx_dev = data; + t7xx_mode_update(t7xx_dev, T7XX_RESET); msleep(RGU_RESET_DELAY_MS); t7xx_reset_device_via_pmic(t7xx_dev); return IRQ_HANDLED; diff --git a/drivers/net/wwan/t7xx/t7xx_modem_ops.h b/drivers/net/wwan/t7xx/t7xx_modem_ops.h index abe633cf7adc01..b39e945a92e017 100644 --- a/drivers/net/wwan/t7xx/t7xx_modem_ops.h +++ b/drivers/net/wwan/t7xx/t7xx_modem_ops.h @@ -85,6 +85,7 @@ int t7xx_md_init(struct t7xx_pci_dev *t7xx_dev); void t7xx_md_exit(struct t7xx_pci_dev *t7xx_dev); void t7xx_clear_rgu_irq(struct t7xx_pci_dev *t7xx_dev); int t7xx_acpi_fldr_func(struct 
t7xx_pci_dev *t7xx_dev); +int t7xx_acpi_pldr_func(struct t7xx_pci_dev *t7xx_dev); int t7xx_pci_mhccif_isr(struct t7xx_pci_dev *t7xx_dev); #endif /* __T7XX_MODEM_OPS_H__ */ diff --git a/drivers/net/wwan/t7xx/t7xx_pci.c b/drivers/net/wwan/t7xx/t7xx_pci.c index 91256e005b846f..f99eb21cb8ccb0 100644 --- a/drivers/net/wwan/t7xx/t7xx_pci.c +++ b/drivers/net/wwan/t7xx/t7xx_pci.c @@ -52,6 +52,81 @@ #define PM_RESOURCE_POLL_TIMEOUT_US 10000 #define PM_RESOURCE_POLL_STEP_US 100 +static const char * const t7xx_mode_names[] = { + [T7XX_UNKNOWN] = "unknown", + [T7XX_READY] = "ready", + [T7XX_RESET] = "reset", + [T7XX_FASTBOOT_SWITCHING] = "fastboot_switching", + [T7XX_FASTBOOT_DOWNLOAD] = "fastboot_download", + [T7XX_FASTBOOT_DUMP] = "fastboot_dump", +}; + +static_assert(ARRAY_SIZE(t7xx_mode_names) == T7XX_MODE_LAST); + +static ssize_t t7xx_mode_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct t7xx_pci_dev *t7xx_dev; + struct pci_dev *pdev; + int index = 0; + + pdev = to_pci_dev(dev); + t7xx_dev = pci_get_drvdata(pdev); + if (!t7xx_dev) + return -ENODEV; + + index = sysfs_match_string(t7xx_mode_names, buf); + if (index == T7XX_FASTBOOT_SWITCHING) { + WRITE_ONCE(t7xx_dev->mode, T7XX_FASTBOOT_SWITCHING); + } else if (index == T7XX_RESET) { + WRITE_ONCE(t7xx_dev->mode, T7XX_RESET); + t7xx_acpi_pldr_func(t7xx_dev); + } + + return count; +}; + +static ssize_t t7xx_mode_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + enum t7xx_mode mode = T7XX_UNKNOWN; + struct t7xx_pci_dev *t7xx_dev; + struct pci_dev *pdev; + + pdev = to_pci_dev(dev); + t7xx_dev = pci_get_drvdata(pdev); + if (!t7xx_dev) + return -ENODEV; + + mode = READ_ONCE(t7xx_dev->mode); + if (mode < T7XX_MODE_LAST) + return sysfs_emit(buf, "%s\n", t7xx_mode_names[mode]); + + return sysfs_emit(buf, "%s\n", t7xx_mode_names[T7XX_UNKNOWN]); +} + +static DEVICE_ATTR_RW(t7xx_mode); + +static struct attribute *t7xx_mode_attr[] = { + 
&dev_attr_t7xx_mode.attr, + NULL +}; + +static const struct attribute_group t7xx_mode_attribute_group = { + .attrs = t7xx_mode_attr, +}; + +void t7xx_mode_update(struct t7xx_pci_dev *t7xx_dev, enum t7xx_mode mode) +{ + if (!t7xx_dev) + return; + + WRITE_ONCE(t7xx_dev->mode, mode); + sysfs_notify(&t7xx_dev->pdev->dev.kobj, NULL, "t7xx_mode"); +} + enum t7xx_pm_state { MTK_PM_EXCEPTION, MTK_PM_INIT, /* Device initialized, but handshake not completed */ @@ -279,7 +354,8 @@ static int __t7xx_pci_pm_suspend(struct pci_dev *pdev) int ret; t7xx_dev = pci_get_drvdata(pdev); - if (atomic_read(&t7xx_dev->md_pm_state) <= MTK_PM_INIT) { + if (atomic_read(&t7xx_dev->md_pm_state) <= MTK_PM_INIT || + READ_ONCE(t7xx_dev->mode) != T7XX_READY) { dev_err(&pdev->dev, "[PM] Exiting suspend, modem in invalid state\n"); return -EFAULT; } @@ -729,16 +805,28 @@ static int t7xx_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) t7xx_pcie_mac_interrupts_dis(t7xx_dev); + ret = sysfs_create_group(&t7xx_dev->pdev->dev.kobj, + &t7xx_mode_attribute_group); + if (ret) + goto err_md_exit; + ret = t7xx_interrupt_init(t7xx_dev); - if (ret) { - t7xx_md_exit(t7xx_dev); - return ret; - } + if (ret) + goto err_remove_group; + t7xx_pcie_mac_set_int(t7xx_dev, MHCCIF_INT); t7xx_pcie_mac_interrupts_en(t7xx_dev); return 0; + +err_remove_group: + sysfs_remove_group(&t7xx_dev->pdev->dev.kobj, + &t7xx_mode_attribute_group); + +err_md_exit: + t7xx_md_exit(t7xx_dev); + return ret; } static void t7xx_pci_remove(struct pci_dev *pdev) @@ -747,6 +835,9 @@ static void t7xx_pci_remove(struct pci_dev *pdev) int i; t7xx_dev = pci_get_drvdata(pdev); + + sysfs_remove_group(&t7xx_dev->pdev->dev.kobj, + &t7xx_mode_attribute_group); t7xx_md_exit(t7xx_dev); for (i = 0; i < EXT_INT_NUM; i++) { diff --git a/drivers/net/wwan/t7xx/t7xx_pci.h b/drivers/net/wwan/t7xx/t7xx_pci.h index f08f1ab7446917..49a11586d8d844 100644 --- a/drivers/net/wwan/t7xx/t7xx_pci.h +++ b/drivers/net/wwan/t7xx/t7xx_pci.h @@ -43,6 +43,16 @@ 
struct t7xx_addr_base { typedef irqreturn_t (*t7xx_intr_callback)(int irq, void *param); +enum t7xx_mode { + T7XX_UNKNOWN, + T7XX_READY, + T7XX_RESET, + T7XX_FASTBOOT_SWITCHING, + T7XX_FASTBOOT_DOWNLOAD, + T7XX_FASTBOOT_DUMP, + T7XX_MODE_LAST, /* must always be last */ +}; + /* struct t7xx_pci_dev - MTK device context structure * @intr_handler: array of handler function for request_threaded_irq * @intr_thread: array of thread_fn for request_threaded_irq @@ -59,6 +69,7 @@ typedef irqreturn_t (*t7xx_intr_callback)(int irq, void *param); * @md_pm_lock: protects PCIe sleep lock * @sleep_disable_count: PCIe L1.2 lock counter * @sleep_lock_acquire: indicates that sleep has been disabled + * @mode: indicates the device mode */ struct t7xx_pci_dev { t7xx_intr_callback intr_handler[EXT_INT_NUM]; @@ -82,6 +93,7 @@ struct t7xx_pci_dev { #ifdef CONFIG_WWAN_DEBUGFS struct dentry *debugfs_dir; #endif + u32 mode; }; enum t7xx_pm_id { @@ -120,5 +132,5 @@ int t7xx_pci_pm_entity_register(struct t7xx_pci_dev *t7xx_dev, struct md_pm_enti int t7xx_pci_pm_entity_unregister(struct t7xx_pci_dev *t7xx_dev, struct md_pm_entity *pm_entity); void t7xx_pci_pm_init_late(struct t7xx_pci_dev *t7xx_dev); void t7xx_pci_pm_exp_detected(struct t7xx_pci_dev *t7xx_dev); - +void t7xx_mode_update(struct t7xx_pci_dev *t7xx_dev, enum t7xx_mode mode); #endif /* __T7XX_PCI_H__ */ diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.c b/drivers/net/wwan/t7xx/t7xx_state_monitor.c index 0bc97430211bf2..c5d46f45fa6237 100644 --- a/drivers/net/wwan/t7xx/t7xx_state_monitor.c +++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.c @@ -272,6 +272,7 @@ static void fsm_routine_ready(struct t7xx_fsm_ctl *ctl) ctl->curr_state = FSM_STATE_READY; t7xx_fsm_broadcast_ready_state(ctl); + t7xx_mode_update(md->t7xx_dev, T7XX_READY); t7xx_md_event_notify(md, FSM_READY); } -- cgit 1.2.3-korg From 2dac6381c3da50d4b2525fd0514e41e8041ad974 Mon Sep 17 00:00:00 2001 From: Jinjian Song Date: Mon, 5 Feb 2024 18:22:30 +0800 Subject: net: 
wwan: t7xx: Add fastboot WWAN port On early detection of wwan device in fastboot mode, driver sets up CLDMA0 HW tx/rx queues for raw data transfer and then create fastboot port to userspace. Application can use this port to flash firmware and collect core dump by fastboot protocol commands. E.g., flash firmware through fastboot port: - "download:%08x": write data to memory with the download size. - "flash:%s": write the previously downloaded image to the named partition. - "reboot": reboot the device. Link: https://android.googlesource.com/platform/system/core/+/refs/heads/main/fastboot/README.md Signed-off-by: Jinjian Song Signed-off-by: David S. Miller --- .../networking/device_drivers/wwan/t7xx.rst | 18 ++++ drivers/net/wwan/t7xx/t7xx_port_proxy.c | 3 + drivers/net/wwan/t7xx/t7xx_port_wwan.c | 116 ++++++++++++++++----- drivers/net/wwan/t7xx/t7xx_state_monitor.c | 4 + 4 files changed, 115 insertions(+), 26 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/device_drivers/wwan/t7xx.rst b/Documentation/networking/device_drivers/wwan/t7xx.rst index 8429b992734177..f346f5f85f154e 100644 --- a/Documentation/networking/device_drivers/wwan/t7xx.rst +++ b/Documentation/networking/device_drivers/wwan/t7xx.rst @@ -125,6 +125,20 @@ The driver exposes an AT port by implementing AT WWAN Port. The userspace end of the control port is a /dev/wwan0at0 character device. Application shall use this interface to issue AT commands. +fastboot port userspace ABI +--------------------------- + +/dev/wwan0fastboot0 character device +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The driver exposes a fastboot protocol interface by implementing +fastboot WWAN Port. The userspace end of the fastboot channel pipe is a +/dev/wwan0fastboot0 character device. Application shall use this interface for +fastboot protocol communication. 
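Note: the commit message above quotes three fastboot commands ("download:%08x", "flash:%s", "reboot") that an application would send over the fastboot port. A minimal userspace sketch of how those command strings could be built is given here. It is an illustration only, not code from this patch: the helper names are hypothetical, and it only formats the strings, leaving the actual write to the character device to the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the t7xx driver): format the
 * "download:%08x" command quoted in the commit message. The size is
 * written as exactly eight lowercase hex digits.
 * Returns the command length, or -1 if the buffer is too small.
 */
int sketch_fastboot_download_cmd(char *buf, size_t buflen, uint32_t size)
{
	int n = snprintf(buf, buflen, "download:%08x", (unsigned int)size);

	if (n < 0 || (size_t)n >= buflen)
		return -1;
	return n;
}

/*
 * Hypothetical helper: format the "flash:%s" command for a named
 * partition. The partition name is supplied by the caller.
 * Returns the command length, or -1 if the buffer is too small.
 */
int sketch_fastboot_flash_cmd(char *buf, size_t buflen, const char *partition)
{
	int n = snprintf(buf, buflen, "flash:%s", partition);

	if (n < 0 || (size_t)n >= buflen)
		return -1;
	return n;
}
```

An application would presumably write the resulting string to /dev/wwan0fastboot0 with write(2) and then read the device's reply; the reply format is defined by the fastboot protocol README linked in the commit message, not by this sketch.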
+ +Please note that driver needs to be reloaded to export /dev/wwan0fastboot0 +port, because device needs a cold reset after enter ``fastboot_switching`` +mode. + The MediaTek's T700 modem supports the 3GPP TS 27.007 [4] specification. References @@ -146,3 +160,7 @@ speak the Mobile Interface Broadband Model (MBIM) protocol"* [4] *Specification # 27.007 - 3GPP* - https://www.3gpp.org/DynaReport/27007.htm + +[5] *fastboot "a mechanism for communicating with bootloaders"* + +- https://android.googlesource.com/platform/system/core/+/refs/heads/main/fastboot/README.md diff --git a/drivers/net/wwan/t7xx/t7xx_port_proxy.c b/drivers/net/wwan/t7xx/t7xx_port_proxy.c index e53a152faee408..8f5e01705af295 100644 --- a/drivers/net/wwan/t7xx/t7xx_port_proxy.c +++ b/drivers/net/wwan/t7xx/t7xx_port_proxy.c @@ -112,6 +112,9 @@ static const struct t7xx_port_conf t7xx_early_port_conf[] = { .txq_exp_index = CLDMA_Q_IDX_DUMP, .rxq_exp_index = CLDMA_Q_IDX_DUMP, .path_id = CLDMA_ID_AP, + .ops = &wwan_sub_port_ops, + .name = "fastboot", + .port_type = WWAN_PORT_FASTBOOT, }, }; diff --git a/drivers/net/wwan/t7xx/t7xx_port_wwan.c b/drivers/net/wwan/t7xx/t7xx_port_wwan.c index ddc20ddfa7347a..4b23ba693f3f1d 100644 --- a/drivers/net/wwan/t7xx/t7xx_port_wwan.c +++ b/drivers/net/wwan/t7xx/t7xx_port_wwan.c @@ -2,6 +2,7 @@ /* * Copyright (c) 2021, MediaTek Inc. * Copyright (c) 2021-2022, Intel Corporation. + * Copyright (c) 2024, Fibocom Wireless Inc. 
* * Authors: * Amir Hanania @@ -15,6 +16,7 @@ * Chiranjeevi Rapolu * Eliot Lee * Sreehari Kancharla + * Jinjian Song */ #include @@ -33,7 +35,7 @@ #include "t7xx_port_proxy.h" #include "t7xx_state_monitor.h" -static int t7xx_port_ctrl_start(struct wwan_port *port) +static int t7xx_port_wwan_start(struct wwan_port *port) { struct t7xx_port *port_mtk = wwan_port_get_drvdata(port); @@ -44,30 +46,60 @@ static int t7xx_port_ctrl_start(struct wwan_port *port) return 0; } -static void t7xx_port_ctrl_stop(struct wwan_port *port) +static void t7xx_port_wwan_stop(struct wwan_port *port) { struct t7xx_port *port_mtk = wwan_port_get_drvdata(port); atomic_dec(&port_mtk->usage_cnt); } -static int t7xx_port_ctrl_tx(struct wwan_port *port, struct sk_buff *skb) +static int t7xx_port_fastboot_tx(struct t7xx_port *port, struct sk_buff *skb) +{ + struct sk_buff *cur = skb, *tx_skb; + size_t actual, len, offset = 0; + int txq_mtu; + int ret; + + txq_mtu = t7xx_get_port_mtu(port); + if (txq_mtu < 0) + return -EINVAL; + + actual = cur->len; + while (actual) { + len = min_t(size_t, actual, txq_mtu); + tx_skb = __dev_alloc_skb(len, GFP_KERNEL); + if (!tx_skb) + return -ENOMEM; + + skb_put_data(tx_skb, cur->data + offset, len); + + ret = t7xx_port_send_raw_skb(port, tx_skb); + if (ret) { + dev_kfree_skb(tx_skb); + dev_err(port->dev, "Write error on fastboot port, %d\n", ret); + break; + } + offset += len; + actual -= len; + } + + dev_kfree_skb(skb); + return 0; +} + +static int t7xx_port_ctrl_tx(struct t7xx_port *port, struct sk_buff *skb) { - struct t7xx_port *port_private = wwan_port_get_drvdata(port); const struct t7xx_port_conf *port_conf; struct sk_buff *cur = skb, *cloned; struct t7xx_fsm_ctl *ctl; enum md_state md_state; int cnt = 0, ret; - if (!port_private->chan_enable) - return -EINVAL; - - port_conf = port_private->port_conf; - ctl = port_private->t7xx_dev->md->fsm_ctl; + port_conf = port->port_conf; + ctl = port->t7xx_dev->md->fsm_ctl; md_state = t7xx_fsm_get_md_state(ctl); if 
(md_state == MD_STATE_WAITING_FOR_HS1 || md_state == MD_STATE_WAITING_FOR_HS2) { - dev_warn(port_private->dev, "Cannot write to %s port when md_state=%d\n", + dev_warn(port->dev, "Cannot write to %s port when md_state=%d\n", port_conf->name, md_state); return -ENODEV; } @@ -75,10 +107,10 @@ static int t7xx_port_ctrl_tx(struct wwan_port *port, struct sk_buff *skb) while (cur) { cloned = skb_clone(cur, GFP_KERNEL); cloned->len = skb_headlen(cur); - ret = t7xx_port_send_skb(port_private, cloned, 0, 0); + ret = t7xx_port_send_skb(port, cloned, 0, 0); if (ret) { dev_kfree_skb(cloned); - dev_err(port_private->dev, "Write error on %s port, %d\n", + dev_err(port->dev, "Write error on %s port, %d\n", port_conf->name, ret); return cnt ? cnt + ret : ret; } @@ -93,14 +125,53 @@ static int t7xx_port_ctrl_tx(struct wwan_port *port, struct sk_buff *skb) return 0; } +static int t7xx_port_wwan_tx(struct wwan_port *port, struct sk_buff *skb) +{ + struct t7xx_port *port_private = wwan_port_get_drvdata(port); + const struct t7xx_port_conf *port_conf = port_private->port_conf; + int ret; + + if (!port_private->chan_enable) + return -EINVAL; + + if (port_conf->port_type != WWAN_PORT_FASTBOOT) + ret = t7xx_port_ctrl_tx(port_private, skb); + else + ret = t7xx_port_fastboot_tx(port_private, skb); + + return ret; +} + static const struct wwan_port_ops wwan_ops = { - .start = t7xx_port_ctrl_start, - .stop = t7xx_port_ctrl_stop, - .tx = t7xx_port_ctrl_tx, + .start = t7xx_port_wwan_start, + .stop = t7xx_port_wwan_stop, + .tx = t7xx_port_wwan_tx, }; +static void t7xx_port_wwan_create(struct t7xx_port *port) +{ + const struct t7xx_port_conf *port_conf = port->port_conf; + unsigned int header_len = sizeof(struct ccci_header), mtu; + struct wwan_port_caps caps; + + if (!port->wwan.wwan_port) { + mtu = t7xx_get_port_mtu(port); + caps.frag_len = mtu - header_len; + caps.headroom_len = header_len; + port->wwan.wwan_port = wwan_create_port(port->dev, port_conf->port_type, + &wwan_ops, &caps, port); + 
if (IS_ERR(port->wwan.wwan_port)) + dev_err(port->dev, "Unable to create WWAN port %s", port_conf->name); + } +} + static int t7xx_port_wwan_init(struct t7xx_port *port) { + const struct t7xx_port_conf *port_conf = port->port_conf; + + if (port_conf->port_type == WWAN_PORT_FASTBOOT) + t7xx_port_wwan_create(port); + port->rx_length_th = RX_QUEUE_MAXLEN; return 0; } @@ -152,21 +223,14 @@ static int t7xx_port_wwan_disable_chl(struct t7xx_port *port) static void t7xx_port_wwan_md_state_notify(struct t7xx_port *port, unsigned int state) { const struct t7xx_port_conf *port_conf = port->port_conf; - unsigned int header_len = sizeof(struct ccci_header), mtu; - struct wwan_port_caps caps; + + if (port_conf->port_type == WWAN_PORT_FASTBOOT) + return; if (state != MD_STATE_READY) return; - if (!port->wwan.wwan_port) { - mtu = t7xx_get_port_mtu(port); - caps.frag_len = mtu - header_len; - caps.headroom_len = header_len; - port->wwan.wwan_port = wwan_create_port(port->dev, port_conf->port_type, - &wwan_ops, &caps, port); - if (IS_ERR(port->wwan.wwan_port)) - dev_err(port->dev, "Unable to create WWWAN port %s", port_conf->name); - } + t7xx_port_wwan_create(port); } struct port_ops wwan_sub_port_ops = { diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.c b/drivers/net/wwan/t7xx/t7xx_state_monitor.c index 038377fed1028a..9889ca4621cf58 100644 --- a/drivers/net/wwan/t7xx/t7xx_state_monitor.c +++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.c @@ -229,6 +229,7 @@ static void t7xx_lk_stage_event_handling(struct t7xx_fsm_ctl *ctl, unsigned int struct cldma_ctrl *md_ctrl; enum lk_event_id lk_event; struct device *dev; + struct t7xx_port *port; dev = &md->t7xx_dev->pdev->dev; lk_event = FIELD_GET(MISC_LK_EVENT_MASK, status); @@ -244,6 +245,9 @@ static void t7xx_lk_stage_event_handling(struct t7xx_fsm_ctl *ctl, unsigned int t7xx_cldma_stop(md_ctrl); t7xx_cldma_switch_cfg(md_ctrl, CLDMA_DEDICATED_Q_CFG); + port = &ctl->md->port_prox->ports[0]; + port->port_conf->ops->enable_chl(port); + 
t7xx_cldma_start(md_ctrl); if (lk_event == LK_EVENT_CREATE_POST_DL_PORT) -- cgit 1.2.3-korg From 8453c88c7a150a5ae52382b0bfda00a4b0a643ef Mon Sep 17 00:00:00 2001 From: Christian Marangi Date: Tue, 6 Feb 2024 18:31:04 +0100 Subject: dt-bindings: net: document ethernet PHY package nodes Document ethernet PHY package nodes used to describe PHY shipped in bundle of 2-5 PHY. The special node describe a container of PHY that share common properties. This is a generic schema and PHY package should create specialized version with the required additional shared properties. Example are PHY packages that have some regs only in one PHY of the package and will affect every other PHY in the package, for example related to PHY interface mode calibration or global PHY mode selection. The PHY package node MUST declare the base address used by the PHY driver for global configuration by calculating the offsets of the global PHY based on the base address of the PHY package. Each reg of the PHYs defined in the PHY package node is absolute and describe the real address of the Ethernet PHY on the bus. Signed-off-by: Christian Marangi Signed-off-by: David S. 
Miller --- .../bindings/net/ethernet-phy-package.yaml | 52 ++++++++++++++++++++++ 1 file changed, 52 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/ethernet-phy-package.yaml (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/ethernet-phy-package.yaml b/Documentation/devicetree/bindings/net/ethernet-phy-package.yaml new file mode 100644 index 00000000000000..e567101e6f38b1 --- /dev/null +++ b/Documentation/devicetree/bindings/net/ethernet-phy-package.yaml @@ -0,0 +1,52 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/ethernet-phy-package.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Ethernet PHY Package Common Properties + +maintainers: + - Christian Marangi + +description: + PHY packages are multi-port Ethernet PHY of the same family + and each Ethernet PHY is affected by the global configuration + of the PHY package. + + Each reg of the PHYs defined in the PHY package node is + absolute and describe the real address of the Ethernet PHY on + the MDIO bus. + +properties: + $nodename: + pattern: "^ethernet-phy-package@[a-f0-9]+$" + + reg: + minimum: 0 + maximum: 31 + description: + The base ID number for the PHY package. + Commonly the ID of the first PHY in the PHY package. + + Some PHY in the PHY package might be not defined but + still occupy ID on the device (just not attached to + anything) hence the PHY package reg might correspond + to a not attached PHY (offset 0). 
+ + '#address-cells': + const: 1 + + '#size-cells': + const: 0 + +patternProperties: + ^ethernet-phy@[a-f0-9]+$: + $ref: ethernet-phy.yaml# + +required: + - reg + - '#address-cells' + - '#size-cells' + +additionalProperties: true -- cgit 1.2.3-korg From dd87eaa137870bfc7aab38953384768bf1c87a3f Mon Sep 17 00:00:00 2001 From: Christian Marangi Date: Tue, 6 Feb 2024 18:31:08 +0100 Subject: dt-bindings: net: Document Qcom QCA807x PHY package Document Qcom QCA807x PHY package. Qualcomm QCA807X Ethernet PHY is PHY package of 2 or 5 IEEE 802.3 clause 22 compliant 10BASE-Te, 100BASE-TX and 1000BASE-T PHY-s. Document the required property to make the PHY package correctly configure and work. Signed-off-by: Christian Marangi Reviewed-by: Conor Dooley Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller --- .../devicetree/bindings/net/qcom,qca807x.yaml | 184 +++++++++++++++++++++ 1 file changed, 184 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/qcom,qca807x.yaml (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/qcom,qca807x.yaml b/Documentation/devicetree/bindings/net/qcom,qca807x.yaml new file mode 100644 index 00000000000000..7290024024f526 --- /dev/null +++ b/Documentation/devicetree/bindings/net/qcom,qca807x.yaml @@ -0,0 +1,184 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/qcom,qca807x.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Qualcomm QCA807x Ethernet PHY + +maintainers: + - Christian Marangi + - Robert Marko + +description: | + Qualcomm QCA8072/5 Ethernet PHY is PHY package of 2 or 5 + IEEE 802.3 clause 22 compliant 10BASE-Te, 100BASE-TX and + 1000BASE-T PHY-s. + + They feature 2 SerDes, one for PSGMII or QSGMII connection with + MAC, while second one is SGMII for connection to MAC or fiber. + + Both models have a combo port that supports 1000BASE-X and + 100BASE-FX fiber. 
+ + Each PHY inside of QCA807x series has 4 digitally controlled + output only pins that natively drive LED-s for up to 2 attached + LEDs. Some vendor also use these 4 output for GPIO usage without + attaching LEDs. + + Note that output pins can be set to drive LEDs OR GPIO, mixed + definition are not accepted. + +$ref: ethernet-phy-package.yaml# + +properties: + compatible: + enum: + - qcom,qca8072-package + - qcom,qca8075-package + + qcom,package-mode: + description: | + PHY package can be configured in 3 mode following this table: + + First Serdes mode Second Serdes mode + Option 1 PSGMII for copper Disabled + ports 0-4 + Option 2 PSGMII for copper 1000BASE-X / 100BASE-FX + ports 0-4 + Option 3 QSGMII for copper SGMII for + ports 0-3 copper port 4 + + PSGMII mode (option 1 or 2) is configured dynamically based on + the presence of a connected SFP device. + $ref: /schemas/types.yaml#/definitions/string + enum: + - qsgmii + - psgmii + default: psgmii + + qcom,tx-drive-strength-milliwatt: + description: set the TX Amplifier value in mv. + $ref: /schemas/types.yaml#/definitions/uint32 + enum: [140, 160, 180, 200, 220, + 240, 260, 280, 300, 320, + 400, 500, 600] + default: 600 + +patternProperties: + ^ethernet-phy@[a-f0-9]+$: + $ref: ethernet-phy.yaml# + + properties: + qcom,dac-full-amplitude: + description: + Set Analog MDI driver amplitude to FULL. + + With this not defined, amplitude is set to DSP. + (amplitude is adjusted based on cable length) + + With this enabled and qcom,dac-full-bias-current + and qcom,dac-disable-bias-current-tweak disabled, + bias current is half. + type: boolean + + qcom,dac-full-bias-current: + description: + Set Analog MDI driver bias current to FULL. + + With this not defined, bias current is set to DSP. + (bias current is adjusted based on cable length) + + Actual bias current might be different with + qcom,dac-disable-bias-current-tweak disabled. 
+ type: boolean + + qcom,dac-disable-bias-current-tweak: + description: | + Set Analog MDI driver bias current to disable tweak + to bias current. + + With this not defined, bias current tweak are enabled + by default. + + With this enabled the following tweak are NOT applied: + - With both FULL amplitude and FULL bias current: bias current + is set to half. + - With only DSP amplitude: bias current is set to half and + is set to 1/4 with cable < 10m. + - With DSP bias current (included both DSP amplitude and + DSP bias current): bias current is half the detected current + with cable < 10m. + type: boolean + + gpio-controller: true + + '#gpio-cells': + const: 2 + + if: + required: + - gpio-controller + then: + properties: + leds: false + + unevaluatedProperties: false + +required: + - compatible + +unevaluatedProperties: false + +examples: + - | + #include + + mdio { + #address-cells = <1>; + #size-cells = <0>; + + ethernet-phy-package@0 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "qcom,qca8075-package"; + reg = <0>; + + qcom,package-mode = "qsgmii"; + + ethernet-phy@0 { + reg = <0>; + + leds { + #address-cells = <1>; + #size-cells = <0>; + + led@0 { + reg = <0>; + color = ; + function = LED_FUNCTION_LAN; + default-state = "keep"; + }; + }; + }; + + ethernet-phy@1 { + reg = <1>; + }; + + ethernet-phy@2 { + reg = <2>; + + gpio-controller; + #gpio-cells = <2>; + }; + + ethernet-phy@3 { + reg = <3>; + }; + + ethernet-phy@4 { + reg = <4>; + }; + }; + }; -- cgit 1.2.3-korg From cb7dd712189f2fe5eb34ac2f904e28283f055cae Mon Sep 17 00:00:00 2001 From: Shinas Rasheed Date: Thu, 8 Feb 2024 02:18:33 -0800 Subject: octeon_ep_vf: Add driver framework and device initialization Add driver framework and device setup and initialization for Octeon PCI Endpoint NIC VF. Add implementation to load module, initialize, register network device, cleanup and unload module. Signed-off-by: Shinas Rasheed Signed-off-by: David S. 
Miller --- .../networking/device_drivers/ethernet/index.rst | 1 + .../ethernet/marvell/octeon_ep_vf.rst | 24 + drivers/net/ethernet/marvell/Kconfig | 1 + drivers/net/ethernet/marvell/Makefile | 1 + drivers/net/ethernet/marvell/octeon_ep_vf/Kconfig | 19 + drivers/net/ethernet/marvell/octeon_ep_vf/Makefile | 9 + .../ethernet/marvell/octeon_ep_vf/octep_vf_cn9k.c | 157 +++++++ .../ethernet/marvell/octeon_ep_vf/octep_vf_cnxk.c | 158 +++++++ .../marvell/octeon_ep_vf/octep_vf_config.h | 160 +++++++ .../ethernet/marvell/octeon_ep_vf/octep_vf_main.c | 516 +++++++++++++++++++++ .../ethernet/marvell/octeon_ep_vf/octep_vf_main.h | 331 +++++++++++++ .../ethernet/marvell/octeon_ep_vf/octep_vf_mbox.c | 96 ++++ .../ethernet/marvell/octeon_ep_vf/octep_vf_mbox.h | 25 + .../marvell/octeon_ep_vf/octep_vf_regs_cn9k.h | 154 ++++++ .../marvell/octeon_ep_vf/octep_vf_regs_cnxk.h | 162 +++++++ .../ethernet/marvell/octeon_ep_vf/octep_vf_rx.c | 42 ++ .../ethernet/marvell/octeon_ep_vf/octep_vf_rx.h | 224 +++++++++ .../ethernet/marvell/octeon_ep_vf/octep_vf_tx.c | 43 ++ .../ethernet/marvell/octeon_ep_vf/octep_vf_tx.h | 276 +++++++++++ 19 files changed, 2399 insertions(+) create mode 100644 Documentation/networking/device_drivers/ethernet/marvell/octeon_ep_vf.rst create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/Kconfig create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/Makefile create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_cn9k.c create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_cnxk.c create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_config.h create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.h create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_mbox.c create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_mbox.h create mode 100644 
drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_regs_cn9k.h create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_regs_cnxk.h create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_rx.c create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_rx.h create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_tx.c create mode 100644 drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_tx.h (limited to 'Documentation') diff --git a/Documentation/networking/device_drivers/ethernet/index.rst b/Documentation/networking/device_drivers/ethernet/index.rst index 43de285b8a923a..6932d8c043c2c1 100644 --- a/Documentation/networking/device_drivers/ethernet/index.rst +++ b/Documentation/networking/device_drivers/ethernet/index.rst @@ -42,6 +42,7 @@ Contents: intel/ice marvell/octeontx2 marvell/octeon_ep + marvell/octeon_ep_vf mellanox/mlx5/index microsoft/netvsc neterion/s2io diff --git a/Documentation/networking/device_drivers/ethernet/marvell/octeon_ep_vf.rst b/Documentation/networking/device_drivers/ethernet/marvell/octeon_ep_vf.rst new file mode 100644 index 00000000000000..603133d0b92f88 --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/marvell/octeon_ep_vf.rst @@ -0,0 +1,24 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +======================================================================= +Linux kernel networking driver for Marvell's Octeon PCI Endpoint NIC VF +======================================================================= + +Network driver for Marvell's Octeon PCI EndPoint NIC VF. +Copyright (c) 2020 Marvell International Ltd. + +Overview +======== +This driver implements networking functionality of Marvell's Octeon PCI +EndPoint NIC VF. + +Supported Devices +================= +Currently, this driver support following devices: + * Network controller: Cavium, Inc. Device b203 + * Network controller: Cavium, Inc. Device b403 + * Network controller: Cavium, Inc. 
Device b103 + * Network controller: Cavium, Inc. Device b903 + * Network controller: Cavium, Inc. Device ba03 + * Network controller: Cavium, Inc. Device bc03 + * Network controller: Cavium, Inc. Device bd03 diff --git a/drivers/net/ethernet/marvell/Kconfig b/drivers/net/ethernet/marvell/Kconfig index 884d64114bff3c..837295fecd178a 100644 --- a/drivers/net/ethernet/marvell/Kconfig +++ b/drivers/net/ethernet/marvell/Kconfig @@ -180,6 +180,7 @@ config SKY2_DEBUG source "drivers/net/ethernet/marvell/octeontx2/Kconfig" source "drivers/net/ethernet/marvell/octeon_ep/Kconfig" +source "drivers/net/ethernet/marvell/octeon_ep_vf/Kconfig" source "drivers/net/ethernet/marvell/prestera/Kconfig" endif # NET_VENDOR_MARVELL diff --git a/drivers/net/ethernet/marvell/Makefile b/drivers/net/ethernet/marvell/Makefile index ceba4aa4f02671..a399defe25fdcd 100644 --- a/drivers/net/ethernet/marvell/Makefile +++ b/drivers/net/ethernet/marvell/Makefile @@ -12,5 +12,6 @@ obj-$(CONFIG_PXA168_ETH) += pxa168_eth.o obj-$(CONFIG_SKGE) += skge.o obj-$(CONFIG_SKY2) += sky2.o obj-y += octeon_ep/ +obj-y += octeon_ep_vf/ obj-y += octeontx2/ obj-y += prestera/ diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/Kconfig b/drivers/net/ethernet/marvell/octeon_ep_vf/Kconfig new file mode 100644 index 00000000000000..dbd1267bda0c00 --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/Kconfig @@ -0,0 +1,19 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# Marvell's Octeon PCI Endpoint NIC VF Driver Configuration +# + +config OCTEON_EP_VF + tristate "Marvell Octeon PCI Endpoint NIC VF Driver" + depends on 64BIT + depends on PCI + help + This driver supports networking functionality of Marvell's + Octeon PCI Endpoint NIC VF. + + To know the list of devices supported by this driver, refer + documentation in + . + + To compile this drivers as a module, choose M here. Name of the + module is octeon_ep_vf. 
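Note: the "Supported Devices" list in octeon_ep_vf.rst above names the VF devices by PCI device ID only. As a rough illustration, a userspace tool could check a vendor:device pair against that list as sketched below. The helper and macro names are hypothetical, and the vendor ID 0x177d for "Cavium, Inc." is an assumption taken from the usual PCI ID assignments rather than from this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: PCI vendor ID that lspci reports as "Cavium, Inc.". */
#define SKETCH_VENDOR_ID_CAVIUM 0x177d

/* Device IDs copied from the "Supported Devices" list in octeon_ep_vf.rst. */
const uint16_t sketch_octep_vf_device_ids[] = {
	0xb203, 0xb403, 0xb103, 0xb903, 0xba03, 0xbc03, 0xbd03,
};

/*
 * Hypothetical helper (not part of the driver): return true if the
 * given vendor:device pair is one of the VF devices listed above.
 */
bool sketch_is_octep_vf(uint16_t vendor, uint16_t device)
{
	size_t count = sizeof(sketch_octep_vf_device_ids) /
		       sizeof(sketch_octep_vf_device_ids[0]);
	size_t i;

	if (vendor != SKETCH_VENDOR_ID_CAVIUM)
		return false;

	for (i = 0; i < count; i++) {
		if (sketch_octep_vf_device_ids[i] == device)
			return true;
	}
	return false;
}
```

Inside the kernel the same matching is normally expressed as a PCI device ID table registered with the PCI core; the linear search here is only meant to make the list above concrete.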
diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/Makefile b/drivers/net/ethernet/marvell/octeon_ep_vf/Makefile new file mode 100644 index 00000000000000..694eb9b46e99f9 --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/Makefile @@ -0,0 +1,9 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Network driver for Marvell's Octeon PCI Endpoint NIC VF +# + +obj-$(CONFIG_OCTEON_EP_VF) += octeon_ep_vf.o + +octeon_ep_vf-y := octep_vf_main.o octep_vf_cn9k.o octep_vf_cnxk.o \ + octep_vf_tx.o octep_vf_rx.o octep_vf_mbox.o diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_cn9k.c b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_cn9k.c new file mode 100644 index 00000000000000..c24ef2265205c0 --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_cn9k.c @@ -0,0 +1,157 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. + * + */ + +#include +#include +#include + +#include "octep_vf_config.h" +#include "octep_vf_main.h" +#include "octep_vf_regs_cn9k.h" + +/* Reset all hardware Tx/Rx queues */ +static void octep_vf_reset_io_queues_cn93(struct octep_vf_device *oct) +{ +} + +/* Initialize configuration limits and initial active config */ +static void octep_vf_init_config_cn93_vf(struct octep_vf_device *oct) +{ + struct octep_vf_config *conf = oct->conf; + u64 reg_val; + + reg_val = octep_vf_read_csr64(oct, CN93_VF_SDP_R_IN_CONTROL(0)); + conf->ring_cfg.max_io_rings = (reg_val >> CN93_VF_R_IN_CTL_RPVF_POS) & + CN93_VF_R_IN_CTL_RPVF_MASK; + conf->ring_cfg.active_io_rings = conf->ring_cfg.max_io_rings; + + conf->iq.num_descs = OCTEP_VF_IQ_MAX_DESCRIPTORS; + conf->iq.instr_type = OCTEP_VF_64BYTE_INSTR; + conf->iq.db_min = OCTEP_VF_DB_MIN; + conf->iq.intr_threshold = OCTEP_VF_IQ_INTR_THRESHOLD; + + conf->oq.num_descs = OCTEP_VF_OQ_MAX_DESCRIPTORS; + conf->oq.buf_size = OCTEP_VF_OQ_BUF_SIZE; + conf->oq.refill_threshold = OCTEP_VF_OQ_REFILL_THRESHOLD; + 
conf->oq.oq_intr_pkt = OCTEP_VF_OQ_INTR_PKT_THRESHOLD; + conf->oq.oq_intr_time = OCTEP_VF_OQ_INTR_TIME_THRESHOLD; + + conf->msix_cfg.ioq_msix = conf->ring_cfg.active_io_rings; +} + +/* Setup registers for a hardware Tx Queue */ +static void octep_vf_setup_iq_regs_cn93(struct octep_vf_device *oct, int iq_no) +{ +} + +/* Setup registers for a hardware Rx Queue */ +static void octep_vf_setup_oq_regs_cn93(struct octep_vf_device *oct, int oq_no) +{ +} + +/* Setup registers for a VF mailbox */ +static void octep_vf_setup_mbox_regs_cn93(struct octep_vf_device *oct, int q_no) +{ +} + +/* Tx/Rx queue interrupt handler */ +static irqreturn_t octep_vf_ioq_intr_handler_cn93(void *data) +{ + return IRQ_HANDLED; +} + +/* Re-initialize Octeon hardware registers */ +static void octep_vf_reinit_regs_cn93(struct octep_vf_device *oct) +{ +} + +/* Enable all interrupts */ +static void octep_vf_enable_interrupts_cn93(struct octep_vf_device *oct) +{ +} + +/* Disable all interrupts */ +static void octep_vf_disable_interrupts_cn93(struct octep_vf_device *oct) +{ +} + +/* Get new Octeon Read Index: index of descriptor that Octeon reads next. 
*/ +static u32 octep_vf_update_iq_read_index_cn93(struct octep_vf_iq *iq) +{ + return 0; +} + +/* Enable a hardware Tx Queue */ +static void octep_vf_enable_iq_cn93(struct octep_vf_device *oct, int iq_no) +{ +} + +/* Enable a hardware Rx Queue */ +static void octep_vf_enable_oq_cn93(struct octep_vf_device *oct, int oq_no) +{ +} + +/* Enable all hardware Tx/Rx Queues assigned to VF */ +static void octep_vf_enable_io_queues_cn93(struct octep_vf_device *oct) +{ +} + +/* Disable a hardware Tx Queue assigned to VF */ +static void octep_vf_disable_iq_cn93(struct octep_vf_device *oct, int iq_no) +{ +} + +/* Disable a hardware Rx Queue assigned to VF */ +static void octep_vf_disable_oq_cn93(struct octep_vf_device *oct, int oq_no) +{ +} + +/* Disable all hardware Tx/Rx Queues assigned to VF */ +static void octep_vf_disable_io_queues_cn93(struct octep_vf_device *oct) +{ +} + +/* Dump hardware registers (including Tx/Rx queues) for debugging. */ +static void octep_vf_dump_registers_cn93(struct octep_vf_device *oct) +{ +} + +/** + * octep_vf_device_setup_cn93() - Setup Octeon device. + * + * @oct: Octeon device private data structure. + * + * - initialize hardware operations. + * - get target side pcie port number for the device. + * - set initial configuration and max limits. 
+ */ +void octep_vf_device_setup_cn93(struct octep_vf_device *oct) +{ + oct->hw_ops.setup_iq_regs = octep_vf_setup_iq_regs_cn93; + oct->hw_ops.setup_oq_regs = octep_vf_setup_oq_regs_cn93; + oct->hw_ops.setup_mbox_regs = octep_vf_setup_mbox_regs_cn93; + + oct->hw_ops.ioq_intr_handler = octep_vf_ioq_intr_handler_cn93; + oct->hw_ops.reinit_regs = octep_vf_reinit_regs_cn93; + + oct->hw_ops.enable_interrupts = octep_vf_enable_interrupts_cn93; + oct->hw_ops.disable_interrupts = octep_vf_disable_interrupts_cn93; + + oct->hw_ops.update_iq_read_idx = octep_vf_update_iq_read_index_cn93; + + oct->hw_ops.enable_iq = octep_vf_enable_iq_cn93; + oct->hw_ops.enable_oq = octep_vf_enable_oq_cn93; + oct->hw_ops.enable_io_queues = octep_vf_enable_io_queues_cn93; + + oct->hw_ops.disable_iq = octep_vf_disable_iq_cn93; + oct->hw_ops.disable_oq = octep_vf_disable_oq_cn93; + oct->hw_ops.disable_io_queues = octep_vf_disable_io_queues_cn93; + oct->hw_ops.reset_io_queues = octep_vf_reset_io_queues_cn93; + + oct->hw_ops.dump_registers = octep_vf_dump_registers_cn93; + octep_vf_init_config_cn93_vf(oct); +} diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_cnxk.c b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_cnxk.c new file mode 100644 index 00000000000000..af07a4a6edc5be --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_cnxk.c @@ -0,0 +1,158 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. 
+ * + */ + +#include +#include +#include + +#include "octep_vf_config.h" +#include "octep_vf_main.h" +#include "octep_vf_regs_cnxk.h" + +/* Reset all hardware Tx/Rx queues */ +static void octep_vf_reset_io_queues_cnxk(struct octep_vf_device *oct) +{ +} + +/* Initialize configuration limits and initial active config */ +static void octep_vf_init_config_cnxk_vf(struct octep_vf_device *oct) +{ + struct octep_vf_config *conf = oct->conf; + u64 reg_val; + + reg_val = octep_vf_read_csr64(oct, CNXK_VF_SDP_R_IN_CONTROL(0)); + conf->ring_cfg.max_io_rings = (reg_val >> CNXK_VF_R_IN_CTL_RPVF_POS) & + CNXK_VF_R_IN_CTL_RPVF_MASK; + conf->ring_cfg.active_io_rings = conf->ring_cfg.max_io_rings; + + conf->iq.num_descs = OCTEP_VF_IQ_MAX_DESCRIPTORS; + conf->iq.instr_type = OCTEP_VF_64BYTE_INSTR; + conf->iq.db_min = OCTEP_VF_DB_MIN; + conf->iq.intr_threshold = OCTEP_VF_IQ_INTR_THRESHOLD; + + conf->oq.num_descs = OCTEP_VF_OQ_MAX_DESCRIPTORS; + conf->oq.buf_size = OCTEP_VF_OQ_BUF_SIZE; + conf->oq.refill_threshold = OCTEP_VF_OQ_REFILL_THRESHOLD; + conf->oq.oq_intr_pkt = OCTEP_VF_OQ_INTR_PKT_THRESHOLD; + conf->oq.oq_intr_time = OCTEP_VF_OQ_INTR_TIME_THRESHOLD; + conf->oq.wmark = OCTEP_VF_OQ_WMARK_MIN; + + conf->msix_cfg.ioq_msix = conf->ring_cfg.active_io_rings; +} + +/* Setup registers for a hardware Tx Queue */ +static void octep_vf_setup_iq_regs_cnxk(struct octep_vf_device *oct, int iq_no) +{ +} + +/* Setup registers for a hardware Rx Queue */ +static void octep_vf_setup_oq_regs_cnxk(struct octep_vf_device *oct, int oq_no) +{ +} + +/* Setup registers for a VF mailbox */ +static void octep_vf_setup_mbox_regs_cnxk(struct octep_vf_device *oct, int q_no) +{ +} + +/* Tx/Rx queue interrupt handler */ +static irqreturn_t octep_vf_ioq_intr_handler_cnxk(void *data) +{ + return IRQ_HANDLED; +} + +/* Re-initialize Octeon hardware registers */ +static void octep_vf_reinit_regs_cnxk(struct octep_vf_device *oct) +{ +} + +/* Enable all interrupts */ +static void 
octep_vf_enable_interrupts_cnxk(struct octep_vf_device *oct) +{ +} + +/* Disable all interrupts */ +static void octep_vf_disable_interrupts_cnxk(struct octep_vf_device *oct) +{ +} + +/* Get new Octeon Read Index: index of descriptor that Octeon reads next. */ +static u32 octep_vf_update_iq_read_index_cnxk(struct octep_vf_iq *iq) +{ + return 0; +} + +/* Enable a hardware Tx Queue */ +static void octep_vf_enable_iq_cnxk(struct octep_vf_device *oct, int iq_no) +{ +} + +/* Enable a hardware Rx Queue */ +static void octep_vf_enable_oq_cnxk(struct octep_vf_device *oct, int oq_no) +{ +} + +/* Enable all hardware Tx/Rx Queues assigned to VF */ +static void octep_vf_enable_io_queues_cnxk(struct octep_vf_device *oct) +{ +} + +/* Disable a hardware Tx Queue assigned to VF */ +static void octep_vf_disable_iq_cnxk(struct octep_vf_device *oct, int iq_no) +{ +} + +/* Disable a hardware Rx Queue assigned to VF */ +static void octep_vf_disable_oq_cnxk(struct octep_vf_device *oct, int oq_no) +{ +} + +/* Disable all hardware Tx/Rx Queues assigned to VF */ +static void octep_vf_disable_io_queues_cnxk(struct octep_vf_device *oct) +{ +} + +/* Dump hardware registers (including Tx/Rx queues) for debugging. */ +static void octep_vf_dump_registers_cnxk(struct octep_vf_device *oct) +{ +} + +/** + * octep_vf_device_setup_cnxk() - Setup Octeon device. + * + * @oct: Octeon device private data structure. + * + * - initialize hardware operations. + * - get target side pcie port number for the device. + * - set initial configuration and max limits. 
+ */ +void octep_vf_device_setup_cnxk(struct octep_vf_device *oct) +{ + oct->hw_ops.setup_iq_regs = octep_vf_setup_iq_regs_cnxk; + oct->hw_ops.setup_oq_regs = octep_vf_setup_oq_regs_cnxk; + oct->hw_ops.setup_mbox_regs = octep_vf_setup_mbox_regs_cnxk; + + oct->hw_ops.ioq_intr_handler = octep_vf_ioq_intr_handler_cnxk; + oct->hw_ops.reinit_regs = octep_vf_reinit_regs_cnxk; + + oct->hw_ops.enable_interrupts = octep_vf_enable_interrupts_cnxk; + oct->hw_ops.disable_interrupts = octep_vf_disable_interrupts_cnxk; + + oct->hw_ops.update_iq_read_idx = octep_vf_update_iq_read_index_cnxk; + + oct->hw_ops.enable_iq = octep_vf_enable_iq_cnxk; + oct->hw_ops.enable_oq = octep_vf_enable_oq_cnxk; + oct->hw_ops.enable_io_queues = octep_vf_enable_io_queues_cnxk; + + oct->hw_ops.disable_iq = octep_vf_disable_iq_cnxk; + oct->hw_ops.disable_oq = octep_vf_disable_oq_cnxk; + oct->hw_ops.disable_io_queues = octep_vf_disable_io_queues_cnxk; + oct->hw_ops.reset_io_queues = octep_vf_reset_io_queues_cnxk; + + oct->hw_ops.dump_registers = octep_vf_dump_registers_cnxk; + octep_vf_init_config_cnxk_vf(oct); +} diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_config.h b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_config.h new file mode 100644 index 00000000000000..e03a647b011043 --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_config.h @@ -0,0 +1,160 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. 
+ * + */ + +#ifndef _OCTEP_VF_CONFIG_H_ +#define _OCTEP_VF_CONFIG_H_ + +/* Tx instruction types by length */ +#define OCTEP_VF_32BYTE_INSTR 32 +#define OCTEP_VF_64BYTE_INSTR 64 + +/* Tx Queue: maximum descriptors per ring */ +#define OCTEP_VF_IQ_MAX_DESCRIPTORS 1024 +/* Minimum input (Tx) requests to be enqueued to ring doorbell */ +#define OCTEP_VF_DB_MIN 8 +/* Packet threshold for Tx queue interrupt */ +#define OCTEP_VF_IQ_INTR_THRESHOLD 0x0 + +/* Minimum watermark for backpressure */ +#define OCTEP_VF_OQ_WMARK_MIN 256 + +/* Rx Queue: maximum descriptors per ring */ +#define OCTEP_VF_OQ_MAX_DESCRIPTORS 1024 + +/* Rx buffer size: Use page size buffers. + * Build skb from allocated page buffer once the packet is received. + * When a gathered packet is received, make head page as skb head and + * page buffers in consecutive Rx descriptors as fragments. + */ +#define OCTEP_VF_OQ_BUF_SIZE (SKB_WITH_OVERHEAD(PAGE_SIZE)) +#define OCTEP_VF_OQ_PKTS_PER_INTR 128 +#define OCTEP_VF_OQ_REFILL_THRESHOLD (OCTEP_VF_OQ_MAX_DESCRIPTORS / 4) + +#define OCTEP_VF_OQ_INTR_PKT_THRESHOLD 1 +#define OCTEP_VF_OQ_INTR_TIME_THRESHOLD 10 + +#define OCTEP_VF_MSIX_NAME_SIZE (IFNAMSIZ + 32) + +/* Tx Queue wake threshold + * wakeup a stopped Tx queue if minimum 2 descriptors are available. + * Even a skb with fragments consume only one Tx queue descriptor entry. 
+ */ +#define OCTEP_VF_WAKE_QUEUE_THRESHOLD 2 + +/* Minimum MTU supported by Octeon network interface */ +#define OCTEP_VF_MIN_MTU ETH_MIN_MTU +/* Maximum MTU supported by Octeon interface*/ +#define OCTEP_VF_MAX_MTU (10000 - (ETH_HLEN + ETH_FCS_LEN)) +/* Default MTU */ +#define OCTEP_VF_DEFAULT_MTU 1500 + +/* Macros to get octeon config params */ +#define CFG_GET_IQ_CFG(cfg) ((cfg)->iq) +#define CFG_GET_IQ_NUM_DESC(cfg) ((cfg)->iq.num_descs) +#define CFG_GET_IQ_INSTR_TYPE(cfg) ((cfg)->iq.instr_type) +#define CFG_GET_IQ_INSTR_SIZE(cfg) (64) +#define CFG_GET_IQ_DB_MIN(cfg) ((cfg)->iq.db_min) +#define CFG_GET_IQ_INTR_THRESHOLD(cfg) ((cfg)->iq.intr_threshold) + +#define CFG_GET_OQ_NUM_DESC(cfg) ((cfg)->oq.num_descs) +#define CFG_GET_OQ_BUF_SIZE(cfg) ((cfg)->oq.buf_size) +#define CFG_GET_OQ_REFILL_THRESHOLD(cfg) ((cfg)->oq.refill_threshold) +#define CFG_GET_OQ_INTR_PKT(cfg) ((cfg)->oq.oq_intr_pkt) +#define CFG_GET_OQ_INTR_TIME(cfg) ((cfg)->oq.oq_intr_time) +#define CFG_GET_OQ_WMARK(cfg) ((cfg)->oq.wmark) + +#define CFG_GET_PORTS_ACTIVE_IO_RINGS(cfg) ((cfg)->ring_cfg.active_io_rings) +#define CFG_GET_PORTS_MAX_IO_RINGS(cfg) ((cfg)->ring_cfg.max_io_rings) + +#define CFG_GET_CORE_TICS_PER_US(cfg) ((cfg)->core_cfg.core_tics_per_us) +#define CFG_GET_COPROC_TICS_PER_US(cfg) ((cfg)->core_cfg.coproc_tics_per_us) + +#define CFG_GET_IOQ_MSIX(cfg) ((cfg)->msix_cfg.ioq_msix) + +/* Hardware Tx Queue configuration. */ +struct octep_vf_iq_config { + /* Size of the Input queue (number of commands) */ + u16 num_descs; + + /* Command size - 32 or 64 bytes */ + u16 instr_type; + + /* Minimum number of commands pending to be posted to Octeon before driver + * hits the Input queue doorbell. + */ + u16 db_min; + + /* Trigger the IQ interrupt when processed cmd count reaches + * this level. + */ + u32 intr_threshold; +}; + +/* Hardware Rx Queue configuration. 
*/ +struct octep_vf_oq_config { + /* Size of Output queue (number of descriptors) */ + u16 num_descs; + + /* Size of buffer in this Output queue. */ + u16 buf_size; + + /* The number of buffers that were consumed during packet processing + * by the driver on this Output queue before the driver attempts to + * replenish the descriptor ring with new buffers. + */ + u16 refill_threshold; + + /* Interrupt Coalescing (Packet Count). Octeon will interrupt the host + * only if it sent as many packets as specified by this field. + * The driver usually does not use packet count interrupt coalescing. + */ + u32 oq_intr_pkt; + + /* Interrupt Coalescing (Time Interval). Octeon will interrupt the host + * if at least one packet was sent in the time interval specified by + * this field. The driver uses time interval interrupt coalescing by + * default. The time is specified in microseconds. + */ + u32 oq_intr_time; + + /* Water mark for backpressure. + * Output queue sends backpressure signal to source when + * free buffer count falls below wmark. + */ + u32 wmark; +}; + +/* Tx/Rx configuration */ +struct octep_vf_ring_config { + /* Max number of IOQs */ + u16 max_io_rings; + + /* Number of active IOQs */ + u16 active_io_rings; +}; + +/* Octeon MSI-x config. */ +struct octep_vf_msix_config { + /* Number of IOQ interrupts */ + u16 ioq_msix; +}; + +/* Data Structure to hold configuration limits and active config */ +struct octep_vf_config { + /* Input Queue attributes. */ + struct octep_vf_iq_config iq; + + /* Output Queue attributes. 
*/ + struct octep_vf_oq_config oq; + + /* MSI-X interrupt config */ + struct octep_vf_msix_config msix_cfg; + + /* NIC VF ring Configuration */ + struct octep_vf_ring_config ring_cfg; +}; +#endif /* _OCTEP_VF_CONFIG_H_ */ diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c new file mode 100644 index 00000000000000..2ade88698f65cc --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.c @@ -0,0 +1,516 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "octep_vf_config.h" +#include "octep_vf_main.h" + +struct workqueue_struct *octep_vf_wq; + +/* Supported Devices */ +static const struct pci_device_id octep_vf_pci_id_tbl[] = { + {PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, OCTEP_PCI_DEVICE_ID_CN93_VF)}, + {PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, OCTEP_PCI_DEVICE_ID_CNF95N_VF)}, + {PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, OCTEP_PCI_DEVICE_ID_CN98_VF)}, + {PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, OCTEP_PCI_DEVICE_ID_CN10KA_VF)}, + {PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, OCTEP_PCI_DEVICE_ID_CNF10KA_VF)}, + {PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, OCTEP_PCI_DEVICE_ID_CNF10KB_VF)}, + {PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, OCTEP_PCI_DEVICE_ID_CN10KB_VF)}, + {0, }, +}; +MODULE_DEVICE_TABLE(pci, octep_vf_pci_id_tbl); + +MODULE_AUTHOR("Veerasenareddy Burru "); +MODULE_DESCRIPTION(OCTEP_VF_DRV_STRING); +MODULE_LICENSE("GPL"); + +static void octep_vf_link_up(struct net_device *netdev) +{ + netif_carrier_on(netdev); + netif_tx_start_all_queues(netdev); +} + +static void octep_vf_set_rx_state(struct octep_vf_device *oct, bool up) +{ + int err; + + err = octep_vf_mbox_set_rx_state(oct, up); + if (err) + netdev_err(oct->netdev, "Set Rx state to %d failed with err:%d\n", up, err); +} + +static int octep_vf_get_link_status(struct octep_vf_device *oct) +{ + int 
err; + + err = octep_vf_mbox_get_link_status(oct, &oct->link_info.oper_up); + if (err) + netdev_err(oct->netdev, "Get link status failed with err:%d\n", err); + return oct->link_info.oper_up; +} + +static void octep_vf_set_link_status(struct octep_vf_device *oct, bool up) +{ + int err; + + err = octep_vf_mbox_set_link_status(oct, up); + if (err) { + netdev_err(oct->netdev, "Set link status to %d failed with err:%d\n", up, err); + return; + } + oct->link_info.oper_up = up; +} + +/** + * octep_vf_open() - start the octeon network device. + * + * @netdev: pointer to kernel network device. + * + * setup Tx/Rx queues, interrupts and enable hardware operation of Tx/Rx queues + * and interrupts. + * + * Return: 0, on successfully setting up the device and bringing it up. + * -1, on any error. + */ +static int octep_vf_open(struct net_device *netdev) +{ + struct octep_vf_device *oct = netdev_priv(netdev); + int err, ret; + + netdev_info(netdev, "Starting netdev ...\n"); + netif_carrier_off(netdev); + + oct->hw_ops.reset_io_queues(oct); + + if (octep_vf_setup_iqs(oct)) + goto setup_iq_err; + if (octep_vf_setup_oqs(oct)) + goto setup_oq_err; + + err = netif_set_real_num_tx_queues(netdev, oct->num_oqs); + if (err) + goto set_queues_err; + err = netif_set_real_num_rx_queues(netdev, oct->num_iqs); + if (err) + goto set_queues_err; + + oct->link_info.admin_up = 1; + octep_vf_set_rx_state(oct, true); + + ret = octep_vf_get_link_status(oct); + if (!ret) + octep_vf_set_link_status(oct, true); + + /* Enable the input and output queues for this Octeon device */ + oct->hw_ops.enable_io_queues(oct); + + /* Enable Octeon device interrupts */ + oct->hw_ops.enable_interrupts(oct); + + octep_vf_oq_dbell_init(oct); + + ret = octep_vf_get_link_status(oct); + if (ret) + octep_vf_link_up(netdev); + + return 0; + +set_queues_err: + octep_vf_free_oqs(oct); +setup_oq_err: + octep_vf_free_iqs(oct); +setup_iq_err: + return -1; +} + +/** + * octep_vf_stop() - stop the octeon network device. 
+ * + * @netdev: pointer to kernel network device. + * + * stop the device Tx/Rx operations, bring down the link and + * free up all resources allocated for Tx/Rx queues and interrupts. + */ +static int octep_vf_stop(struct net_device *netdev) +{ + struct octep_vf_device *oct = netdev_priv(netdev); + + netdev_info(netdev, "Stopping the device ...\n"); + + /* Stop Tx from stack */ + netif_carrier_off(netdev); + netif_tx_disable(netdev); + + octep_vf_set_link_status(oct, false); + octep_vf_set_rx_state(oct, false); + + oct->link_info.admin_up = 0; + oct->link_info.oper_up = 0; + + oct->hw_ops.disable_interrupts(oct); + + octep_vf_clean_iqs(oct); + + oct->hw_ops.disable_io_queues(oct); + oct->hw_ops.reset_io_queues(oct); + octep_vf_free_oqs(oct); + octep_vf_free_iqs(oct); + netdev_info(netdev, "Device stopped !!\n"); + return 0; +} + +/** + * octep_vf_start_xmit() - Enqueue packet to Octeon hardware Tx Queue. + * + * @skb: packet skbuff pointer. + * @netdev: kernel network device. + * + * Return: NETDEV_TX_BUSY, if Tx Queue is full. + * NETDEV_TX_OK, if successfully enqueued to hardware Tx queue. + */ +static netdev_tx_t octep_vf_start_xmit(struct sk_buff *skb, + struct net_device *netdev) +{ + return NETDEV_TX_OK; +} + +/** + * octep_vf_tx_timeout_task - work queue task to Handle Tx queue timeout. + * + * @work: pointer to Tx queue timeout work_struct + * + * Stop and start the device so that it frees up all queue resources + * and restarts the queues, that potentially clears a Tx queue timeout + * condition. + **/ +static void octep_vf_tx_timeout_task(struct work_struct *work) +{ + struct octep_vf_device *oct = container_of(work, struct octep_vf_device, + tx_timeout_task); + struct net_device *netdev = oct->netdev; + + rtnl_lock(); + if (netif_running(netdev)) { + octep_vf_stop(netdev); + octep_vf_open(netdev); + } + rtnl_unlock(); + netdev_put(netdev, NULL); +} + +/** + * octep_vf_tx_timeout() - Handle Tx Queue timeout. 
+ * + * @netdev: pointer to kernel network device. + * @txqueue: Timed out Tx queue number. + * + * Schedule a work to handle Tx queue timeout. + */ +static void octep_vf_tx_timeout(struct net_device *netdev, unsigned int txqueue) +{ + struct octep_vf_device *oct = netdev_priv(netdev); + + netdev_hold(netdev, NULL, GFP_ATOMIC); + schedule_work(&oct->tx_timeout_task); +} + +static const struct net_device_ops octep_vf_netdev_ops = { + .ndo_open = octep_vf_open, + .ndo_stop = octep_vf_stop, + .ndo_start_xmit = octep_vf_start_xmit, + .ndo_tx_timeout = octep_vf_tx_timeout, +}; + +static const char *octep_vf_devid_to_str(struct octep_vf_device *oct) +{ + switch (oct->chip_id) { + case OCTEP_PCI_DEVICE_ID_CN93_VF: + return "CN93XX"; + case OCTEP_PCI_DEVICE_ID_CNF95N_VF: + return "CNF95N"; + case OCTEP_PCI_DEVICE_ID_CN10KA_VF: + return "CN10KA"; + case OCTEP_PCI_DEVICE_ID_CNF10KA_VF: + return "CNF10KA"; + case OCTEP_PCI_DEVICE_ID_CNF10KB_VF: + return "CNF10KB"; + case OCTEP_PCI_DEVICE_ID_CN10KB_VF: + return "CN10KB"; + default: + return "Unsupported"; + } +} + +/** + * octep_vf_device_setup() - Setup Octeon Device. + * + * @oct: Octeon device private data structure. + * + * Setup Octeon device hardware operations, configuration, etc ... 
+ */ +int octep_vf_device_setup(struct octep_vf_device *oct) +{ + struct pci_dev *pdev = oct->pdev; + + /* allocate memory for oct->conf */ + oct->conf = kzalloc(sizeof(*oct->conf), GFP_KERNEL); + if (!oct->conf) + return -ENOMEM; + + /* Map BAR region 0 */ + oct->mmio.hw_addr = ioremap(pci_resource_start(oct->pdev, 0), + pci_resource_len(oct->pdev, 0)); + if (!oct->mmio.hw_addr) { + dev_err(&pdev->dev, + "Failed to remap BAR0; start=0x%llx len=0x%llx\n", + pci_resource_start(oct->pdev, 0), + pci_resource_len(oct->pdev, 0)); + goto ioremap_err; + } + oct->mmio.mapped = 1; + + oct->chip_id = pdev->device; + oct->rev_id = pdev->revision; + dev_info(&pdev->dev, "chip_id = 0x%x\n", pdev->device); + + switch (oct->chip_id) { + case OCTEP_PCI_DEVICE_ID_CN93_VF: + case OCTEP_PCI_DEVICE_ID_CNF95N_VF: + case OCTEP_PCI_DEVICE_ID_CN98_VF: + dev_info(&pdev->dev, "Setting up OCTEON %s VF PASS%d.%d\n", + octep_vf_devid_to_str(oct), OCTEP_VF_MAJOR_REV(oct), + OCTEP_VF_MINOR_REV(oct)); + octep_vf_device_setup_cn93(oct); + break; + case OCTEP_PCI_DEVICE_ID_CNF10KA_VF: + case OCTEP_PCI_DEVICE_ID_CN10KA_VF: + case OCTEP_PCI_DEVICE_ID_CNF10KB_VF: + case OCTEP_PCI_DEVICE_ID_CN10KB_VF: + dev_info(&pdev->dev, "Setting up OCTEON %s VF PASS%d.%d\n", + octep_vf_devid_to_str(oct), OCTEP_VF_MAJOR_REV(oct), + OCTEP_VF_MINOR_REV(oct)); + octep_vf_device_setup_cnxk(oct); + break; + default: + dev_err(&pdev->dev, "Unsupported device\n"); + goto unsupported_dev; + } + + return 0; + +unsupported_dev: + iounmap(oct->mmio.hw_addr); +ioremap_err: + kfree(oct->conf); + return -EOPNOTSUPP; +} + +/** + * octep_vf_device_cleanup() - Cleanup Octeon Device. + * + * @oct: Octeon device private data structure. + * + * Cleanup Octeon device allocated resources. 
+ */ +static void octep_vf_device_cleanup(struct octep_vf_device *oct) +{ + dev_info(&oct->pdev->dev, "Cleaning up Octeon Device ...\n"); + + if (oct->mmio.mapped) + iounmap(oct->mmio.hw_addr); + + kfree(oct->conf); + oct->conf = NULL; +} + +static int octep_vf_get_mac_addr(struct octep_vf_device *oct, u8 *addr) +{ + return octep_vf_mbox_get_mac_addr(oct, addr); +} + +/** + * octep_vf_probe() - Octeon PCI device probe handler. + * + * @pdev: PCI device structure. + * @ent: entry in Octeon PCI device ID table. + * + * Initializes and enables the Octeon PCI device for network operations. + * Initializes Octeon private data structure and registers a network device. + */ +static int octep_vf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) +{ + struct octep_vf_device *octep_vf_dev; + struct net_device *netdev; + int err; + + err = pci_enable_device(pdev); + if (err) { + dev_err(&pdev->dev, "Failed to enable PCI device\n"); + return err; + } + + err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)); + if (err) { + dev_err(&pdev->dev, "Failed to set DMA mask !!\n"); + goto disable_pci_device; + } + + err = pci_request_mem_regions(pdev, OCTEP_VF_DRV_NAME); + if (err) { + dev_err(&pdev->dev, "Failed to map PCI memory regions\n"); + goto disable_pci_device; + } + + pci_set_master(pdev); + + netdev = alloc_etherdev_mq(sizeof(struct octep_vf_device), + OCTEP_VF_MAX_QUEUES); + if (!netdev) { + dev_err(&pdev->dev, "Failed to allocate netdev\n"); + err = -ENOMEM; + goto mem_regions_release; + } + SET_NETDEV_DEV(netdev, &pdev->dev); + + octep_vf_dev = netdev_priv(netdev); + octep_vf_dev->netdev = netdev; + octep_vf_dev->pdev = pdev; + octep_vf_dev->dev = &pdev->dev; + pci_set_drvdata(pdev, octep_vf_dev); + + err = octep_vf_device_setup(octep_vf_dev); + if (err) { + dev_err(&pdev->dev, "Device setup failed\n"); + goto netdevice_free; + } + INIT_WORK(&octep_vf_dev->tx_timeout_task, octep_vf_tx_timeout_task); + + netdev->netdev_ops = &octep_vf_netdev_ops; + 
netif_carrier_off(netdev); + + if (octep_vf_setup_mbox(octep_vf_dev)) { + dev_err(&pdev->dev, "VF Mailbox setup failed\n"); + err = -ENOMEM; + goto device_cleanup; + } + + if (octep_vf_mbox_version_check(octep_vf_dev)) { + dev_err(&pdev->dev, "PF VF Mailbox version mismatch\n"); + err = -EINVAL; + goto delete_mbox; + } + + netdev->hw_features = NETIF_F_SG; + netdev->min_mtu = OCTEP_VF_MIN_MTU; + netdev->max_mtu = OCTEP_VF_MAX_MTU; + netdev->mtu = OCTEP_VF_DEFAULT_MTU; + + netdev->features |= netdev->hw_features; + octep_vf_get_mac_addr(octep_vf_dev, octep_vf_dev->mac_addr); + eth_hw_addr_set(netdev, octep_vf_dev->mac_addr); + err = register_netdev(netdev); + if (err) { + dev_err(&pdev->dev, "Failed to register netdev\n"); + goto delete_mbox; + } + dev_info(&pdev->dev, "Device probe successful\n"); + return 0; + +delete_mbox: + octep_vf_delete_mbox(octep_vf_dev); +device_cleanup: + octep_vf_device_cleanup(octep_vf_dev); +netdevice_free: + free_netdev(netdev); +mem_regions_release: + pci_release_mem_regions(pdev); +disable_pci_device: + pci_disable_device(pdev); + dev_err(&pdev->dev, "Device probe failed\n"); + return err; +} + +/** + * octep_vf_remove() - Remove Octeon PCI device from driver control. + * + * @pdev: PCI device structure of the Octeon device. + * + * Cleanup all resources allocated for the Octeon device. + * Unregister from network device and disable the PCI device. 
+ */ +static void octep_vf_remove(struct pci_dev *pdev) +{ + struct octep_vf_device *oct = pci_get_drvdata(pdev); + struct net_device *netdev; + + if (!oct) + return; + + octep_vf_mbox_dev_remove(oct); + cancel_work_sync(&oct->tx_timeout_task); + netdev = oct->netdev; + if (netdev->reg_state == NETREG_REGISTERED) + unregister_netdev(netdev); + octep_vf_delete_mbox(oct); + octep_vf_device_cleanup(oct); + pci_release_mem_regions(pdev); + free_netdev(netdev); + pci_disable_device(pdev); +} + +static struct pci_driver octep_vf_driver = { + .name = OCTEP_VF_DRV_NAME, + .id_table = octep_vf_pci_id_tbl, + .probe = octep_vf_probe, + .remove = octep_vf_remove, +}; + +/** + * octep_vf_init_module() - Module initialization. + * + * create common resource for the driver and register PCI driver. + */ +static int __init octep_vf_init_module(void) +{ + int ret; + + pr_info("%s: Loading %s ...\n", OCTEP_VF_DRV_NAME, OCTEP_VF_DRV_STRING); + + ret = pci_register_driver(&octep_vf_driver); + if (ret < 0) { + pr_err("%s: Failed to register PCI driver; err=%d\n", + OCTEP_VF_DRV_NAME, ret); + return ret; + } + + return ret; +} + +/** + * octep_vf_exit_module() - Module exit routine. + * + * unregister the driver with PCI subsystem and cleanup common resources. + */ +static void __exit octep_vf_exit_module(void) +{ + pr_info("%s: Unloading ...\n", OCTEP_VF_DRV_NAME); + + pci_unregister_driver(&octep_vf_driver); + + pr_info("%s: Unloading complete\n", OCTEP_VF_DRV_NAME); +} + +module_init(octep_vf_init_module); +module_exit(octep_vf_exit_module); diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.h b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.h new file mode 100644 index 00000000000000..4359e0e585ecaf --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_main.h @@ -0,0 +1,331 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. 
+ * + */ + +#ifndef _OCTEP_VF_MAIN_H_ +#define _OCTEP_VF_MAIN_H_ + +#include "octep_vf_tx.h" +#include "octep_vf_rx.h" +#include "octep_vf_mbox.h" + +#define OCTEP_VF_DRV_NAME "octeon_ep_vf" +#define OCTEP_VF_DRV_STRING "Marvell Octeon EndPoint NIC VF Driver" + +#define OCTEP_PCI_DEVICE_ID_CN93_VF 0xB203 //93xx VF +#define OCTEP_PCI_DEVICE_ID_CNF95N_VF 0xB403 //95N VF +#define OCTEP_PCI_DEVICE_ID_CN98_VF 0xB103 +#define OCTEP_PCI_DEVICE_ID_CN10KA_VF 0xB903 +#define OCTEP_PCI_DEVICE_ID_CNF10KA_VF 0xBA03 +#define OCTEP_PCI_DEVICE_ID_CNF10KB_VF 0xBC03 +#define OCTEP_PCI_DEVICE_ID_CN10KB_VF 0xBD03 + +#define OCTEP_VF_MAX_QUEUES 63 +#define OCTEP_VF_MAX_IQ OCTEP_VF_MAX_QUEUES +#define OCTEP_VF_MAX_OQ OCTEP_VF_MAX_QUEUES + +#define OCTEP_VF_MAX_MSIX_VECTORS OCTEP_VF_MAX_OQ + +#define OCTEP_VF_IQ_INTR_RESEND_BIT 59 +#define OCTEP_VF_OQ_INTR_RESEND_BIT 59 + +#define IQ_INSTR_PENDING(iq) ({ typeof(iq) iq__ = (iq); \ + ((iq__)->host_write_index - (iq__)->flush_index) & \ + (iq__)->ring_size_mask; \ + }) +#define IQ_INSTR_SPACE(iq) ({ typeof(iq) iq_ = (iq); \ + (iq_)->max_count - IQ_INSTR_PENDING(iq_); \ + }) + +/* PCI address space mapping information. + * Each of the 3 address spaces given by BAR0, BAR2 and BAR4 of + * Octeon gets mapped to different physical address spaces in + * the kernel. + */ +struct octep_vf_mmio { + /* The physical address to which the PCI address space is mapped. */ + u8 __iomem *hw_addr; + + /* Flag indicating the mapping was successful. 
*/ + int mapped; +}; + +struct octep_vf_hw_ops { + void (*setup_iq_regs)(struct octep_vf_device *oct, int q); + void (*setup_oq_regs)(struct octep_vf_device *oct, int q); + void (*setup_mbox_regs)(struct octep_vf_device *oct, int mbox); + + irqreturn_t (*non_ioq_intr_handler)(void *ioq_vector); + irqreturn_t (*ioq_intr_handler)(void *ioq_vector); + void (*reinit_regs)(struct octep_vf_device *oct); + u32 (*update_iq_read_idx)(struct octep_vf_iq *iq); + + void (*enable_interrupts)(struct octep_vf_device *oct); + void (*disable_interrupts)(struct octep_vf_device *oct); + + void (*enable_io_queues)(struct octep_vf_device *oct); + void (*disable_io_queues)(struct octep_vf_device *oct); + void (*enable_iq)(struct octep_vf_device *oct, int q); + void (*disable_iq)(struct octep_vf_device *oct, int q); + void (*enable_oq)(struct octep_vf_device *oct, int q); + void (*disable_oq)(struct octep_vf_device *oct, int q); + void (*reset_io_queues)(struct octep_vf_device *oct); + void (*dump_registers)(struct octep_vf_device *oct); +}; + +/* Octeon mailbox data */ +struct octep_vf_mbox_data { + /* Holds the offset of received data via mailbox. */ + u32 data_index; + + /* Holds the received data via mailbox. */ + u8 recv_data[OCTEP_PFVF_MBOX_MAX_DATA_BUF_SIZE]; +}; + +/* wrappers around work structs */ +struct octep_vf_mbox_wk { + struct work_struct work; + void *ctxptr; +}; + +/* Octeon device mailbox */ +struct octep_vf_mbox { + /* A mutex to protect access to this q_mbox. */ + struct mutex lock; + + u32 state; + + /* SLI_MAC_PF_MBOX_INT for PF, SLI_PKT_MBOX_INT for VF. */ + u8 __iomem *mbox_int_reg; + + /* SLI_PKT_PF_VF_MBOX_SIG(0) for PF, + * SLI_PKT_PF_VF_MBOX_SIG(1) for VF. + */ + u8 __iomem *mbox_write_reg; + + /* SLI_PKT_PF_VF_MBOX_SIG(1) for PF, + * SLI_PKT_PF_VF_MBOX_SIG(0) for VF. 
+ */ + u8 __iomem *mbox_read_reg; + + /* Octeon mailbox data */ + struct octep_vf_mbox_data mbox_data; + + /* Octeon mailbox work handler to process Mbox messages */ + struct octep_vf_mbox_wk wk; +}; + +/* Tx/Rx queue vector per interrupt. */ +struct octep_vf_ioq_vector { + char name[OCTEP_VF_MSIX_NAME_SIZE]; + struct napi_struct napi; + struct octep_vf_device *octep_vf_dev; + struct octep_vf_iq *iq; + struct octep_vf_oq *oq; + cpumask_t affinity_mask; +}; + +/* Octeon hardware/firmware offload capability flags. */ +#define OCTEP_VF_CAP_TX_CHECKSUM BIT(0) +#define OCTEP_VF_CAP_RX_CHECKSUM BIT(1) +#define OCTEP_VF_CAP_TSO BIT(2) + +/* Link modes */ +enum octep_vf_link_mode_bit_indices { + OCTEP_VF_LINK_MODE_10GBASE_T = 0, + OCTEP_VF_LINK_MODE_10GBASE_R, + OCTEP_VF_LINK_MODE_10GBASE_CR, + OCTEP_VF_LINK_MODE_10GBASE_KR, + OCTEP_VF_LINK_MODE_10GBASE_LR, + OCTEP_VF_LINK_MODE_10GBASE_SR, + OCTEP_VF_LINK_MODE_25GBASE_CR, + OCTEP_VF_LINK_MODE_25GBASE_KR, + OCTEP_VF_LINK_MODE_25GBASE_SR, + OCTEP_VF_LINK_MODE_40GBASE_CR4, + OCTEP_VF_LINK_MODE_40GBASE_KR4, + OCTEP_VF_LINK_MODE_40GBASE_LR4, + OCTEP_VF_LINK_MODE_40GBASE_SR4, + OCTEP_VF_LINK_MODE_50GBASE_CR2, + OCTEP_VF_LINK_MODE_50GBASE_KR2, + OCTEP_VF_LINK_MODE_50GBASE_SR2, + OCTEP_VF_LINK_MODE_50GBASE_CR, + OCTEP_VF_LINK_MODE_50GBASE_KR, + OCTEP_VF_LINK_MODE_50GBASE_LR, + OCTEP_VF_LINK_MODE_50GBASE_SR, + OCTEP_VF_LINK_MODE_100GBASE_CR4, + OCTEP_VF_LINK_MODE_100GBASE_KR4, + OCTEP_VF_LINK_MODE_100GBASE_LR4, + OCTEP_VF_LINK_MODE_100GBASE_SR4, + OCTEP_VF_LINK_MODE_NBITS +}; + +/* Hardware interface link state information. */ +struct octep_vf_iface_link_info { + /* Bitmap of Supported link speeds/modes. */ + u64 supported_modes; + + /* Bitmap of Advertised link speeds/modes. */ + u64 advertised_modes; + + /* Negotiated link speed in Mbps. */ + u32 speed; + + /* MTU */ + u16 mtu; + + /* Autonegotiation state. 
*/ +#define OCTEP_VF_LINK_MODE_AUTONEG_SUPPORTED BIT(0) +#define OCTEP_VF_LINK_MODE_AUTONEG_ADVERTISED BIT(1) + u8 autoneg; + + /* Pause frames setting. */ +#define OCTEP_VF_LINK_MODE_PAUSE_SUPPORTED BIT(0) +#define OCTEP_VF_LINK_MODE_PAUSE_ADVERTISED BIT(1) + u8 pause; + + /* Admin state of the link (ifconfig up/down) */ + u8 admin_up; + + /* Operational state of the link: physical link is up/down */ + u8 oper_up; +}; + +/* Hardware interface stats information. */ +struct octep_vf_iface_rxtx_stats { + /* Hardware Interface Rx statistics */ + struct octep_vf_iface_rx_stats iface_rx_stats; + + /* Hardware Interface Tx statistics */ + struct octep_vf_iface_tx_stats iface_tx_stats; +}; + +struct octep_vf_fw_info { + /* pkind value to be used in every Tx hardware descriptor */ + u8 pkind; + /* front size data */ + u8 fsz; + /* supported rx offloads OCTEP_VF_RX_OFFLOAD_* */ + u16 rx_ol_flags; + /* supported tx offloads OCTEP_VF_TX_OFFLOAD_* */ + u16 tx_ol_flags; +}; + +/* The Octeon device specific private data structure. + * Each Octeon device has this structure to represent all its components. + */ +struct octep_vf_device { + struct octep_vf_config *conf; + + /* Octeon Chip type.
*/ + u16 chip_id; + u16 rev_id; + + /* Device capabilities enabled */ + u64 caps_enabled; + /* Device capabilities supported */ + u64 caps_supported; + + /* Pointer to basic Linux device */ + struct device *dev; + /* Linux PCI device pointer */ + struct pci_dev *pdev; + /* Netdev corresponding to the Octeon device */ + struct net_device *netdev; + + /* memory mapped io range */ + struct octep_vf_mmio mmio; + + /* MAC address */ + u8 mac_addr[ETH_ALEN]; + + /* Tx queues (IQ: Instruction Queue) */ + u16 num_iqs; + /* Pointers to Octeon Tx queues */ + struct octep_vf_iq *iq[OCTEP_VF_MAX_IQ]; + + /* Rx queues (OQ: Output Queue) */ + u16 num_oqs; + /* Pointers to Octeon Rx queues */ + struct octep_vf_oq *oq[OCTEP_VF_MAX_OQ]; + + /* Hardware port number of the PCIe interface */ + u16 pcie_port; + + /* Hardware operations */ + struct octep_vf_hw_ops hw_ops; + + /* IRQ info */ + u16 num_irqs; + u16 num_non_ioq_irqs; + char *non_ioq_irq_names; + struct msix_entry *msix_entries; + /* IOq information of it's corresponding MSI-X interrupt. */ + struct octep_vf_ioq_vector *ioq_vector[OCTEP_VF_MAX_QUEUES]; + + /* Hardware Interface Tx statistics */ + struct octep_vf_iface_tx_stats iface_tx_stats; + /* Hardware Interface Rx statistics */ + struct octep_vf_iface_rx_stats iface_rx_stats; + + /* Hardware Interface Link info like supported modes, aneg support */ + struct octep_vf_iface_link_info link_info; + + /* Mailbox to talk to VFs */ + struct octep_vf_mbox *mbox; + + /* Work entry to handle Tx timeout */ + struct work_struct tx_timeout_task; + + /* offset for iface stats */ + u32 ctrl_mbox_ifstats_offset; + + /* Negotiated Mbox version */ + u32 mbox_neg_ver; + + /* firmware info */ + struct octep_vf_fw_info fw_info; +}; + +static inline u16 OCTEP_VF_MAJOR_REV(struct octep_vf_device *oct) +{ + u16 rev = (oct->rev_id & 0xC) >> 2; + + return (rev == 0) ? 
1 : rev; +} + +static inline u16 OCTEP_VF_MINOR_REV(struct octep_vf_device *oct) +{ + return (oct->rev_id & 0x3); +} + +/* Octeon CSR read/write access APIs */ +#define octep_vf_write_csr(octep_vf_dev, reg_off, value) \ + writel(value, (octep_vf_dev)->mmio.hw_addr + (reg_off)) + +#define octep_vf_write_csr64(octep_vf_dev, reg_off, val64) \ + writeq(val64, (octep_vf_dev)->mmio.hw_addr + (reg_off)) + +#define octep_vf_read_csr(octep_vf_dev, reg_off) \ + readl((octep_vf_dev)->mmio.hw_addr + (reg_off)) + +#define octep_vf_read_csr64(octep_vf_dev, reg_off) \ + readq((octep_vf_dev)->mmio.hw_addr + (reg_off)) + +extern struct workqueue_struct *octep_vf_wq; + +int octep_vf_device_setup(struct octep_vf_device *oct); +int octep_vf_setup_iqs(struct octep_vf_device *oct); +void octep_vf_free_iqs(struct octep_vf_device *oct); +void octep_vf_clean_iqs(struct octep_vf_device *oct); +int octep_vf_setup_oqs(struct octep_vf_device *oct); +void octep_vf_free_oqs(struct octep_vf_device *oct); +void octep_vf_oq_dbell_init(struct octep_vf_device *oct); +void octep_vf_device_setup_cn93(struct octep_vf_device *oct); +void octep_vf_device_setup_cnxk(struct octep_vf_device *oct); +int octep_vf_iq_process_completions(struct octep_vf_iq *iq, u16 budget); +int octep_vf_oq_process_rx(struct octep_vf_oq *oq, int budget); +void octep_vf_mbox_work(struct work_struct *work); +#endif /* _OCTEP_VF_MAIN_H_ */ diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_mbox.c b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_mbox.c new file mode 100644 index 00000000000000..1c1fe293fc509f --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_mbox.c @@ -0,0 +1,96 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. 
+ * + */ +#include +#include +#include +#include "octep_vf_config.h" +#include "octep_vf_main.h" + +int octep_vf_setup_mbox(struct octep_vf_device *oct) +{ + int ring = 0; + + oct->mbox = vzalloc(sizeof(*oct->mbox)); + if (!oct->mbox) + return -1; + + mutex_init(&oct->mbox->lock); + + oct->hw_ops.setup_mbox_regs(oct, ring); + INIT_WORK(&oct->mbox->wk.work, octep_vf_mbox_work); + oct->mbox->wk.ctxptr = oct; + dev_info(&oct->pdev->dev, "setup vf mbox successfully\n"); + return 0; +} + +void octep_vf_delete_mbox(struct octep_vf_device *oct) +{ + if (oct->mbox) { + if (work_pending(&oct->mbox->wk.work)) + cancel_work_sync(&oct->mbox->wk.work); + + mutex_destroy(&oct->mbox->lock); + vfree(oct->mbox); + oct->mbox = NULL; + dev_info(&oct->pdev->dev, "Deleted vf mbox successfully\n"); + } +} + +int octep_vf_mbox_version_check(struct octep_vf_device *oct) +{ + return 0; +} + +void octep_vf_mbox_work(struct work_struct *work) +{ +} + +int octep_vf_mbox_set_mtu(struct octep_vf_device *oct, int mtu) +{ + return 0; +} + +int octep_vf_mbox_set_mac_addr(struct octep_vf_device *oct, char *mac_addr) +{ + return 0; +} + +int octep_vf_mbox_get_mac_addr(struct octep_vf_device *oct, char *mac_addr) +{ + return 0; +} + +int octep_vf_mbox_set_rx_state(struct octep_vf_device *oct, bool state) +{ + return 0; +} + +int octep_vf_mbox_set_link_status(struct octep_vf_device *oct, bool status) +{ + return 0; +} + +int octep_vf_mbox_get_link_status(struct octep_vf_device *oct, u8 *oper_up) +{ + return 0; +} + +int octep_vf_mbox_dev_remove(struct octep_vf_device *oct) +{ + return 0; +} + +int octep_vf_mbox_get_fw_info(struct octep_vf_device *oct) +{ + return 0; +} + +int octep_vf_mbox_set_offloads(struct octep_vf_device *oct, u16 tx_offloads, + u16 rx_offloads) +{ + return 0; +} diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_mbox.h b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_mbox.h new file mode 100644 index 00000000000000..14f4fb19445b4b --- /dev/null +++ 
b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_mbox.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Marvell Octeon EP (EndPoint) Ethernet Driver + * + * Copyright (C) 2020 Marvell. + * + */ +#ifndef _OCTEP_VF_MBOX_H_ +#define _OCTEP_VF_MBOX_H_ + +#define OCTEP_PFVF_MBOX_MAX_DATA_BUF_SIZE 256 + +int octep_vf_setup_mbox(struct octep_vf_device *oct); +void octep_vf_delete_mbox(struct octep_vf_device *oct); +int octep_vf_mbox_set_mtu(struct octep_vf_device *oct, int mtu); +int octep_vf_mbox_set_mac_addr(struct octep_vf_device *oct, char *mac_addr); +int octep_vf_mbox_get_mac_addr(struct octep_vf_device *oct, char *mac_addr); +int octep_vf_mbox_version_check(struct octep_vf_device *oct); +int octep_vf_mbox_set_rx_state(struct octep_vf_device *oct, bool state); +int octep_vf_mbox_set_link_status(struct octep_vf_device *oct, bool status); +int octep_vf_mbox_get_link_status(struct octep_vf_device *oct, u8 *oper_up); +int octep_vf_mbox_dev_remove(struct octep_vf_device *oct); +int octep_vf_mbox_get_fw_info(struct octep_vf_device *oct); +int octep_vf_mbox_set_offloads(struct octep_vf_device *oct, u16 tx_offloads, u16 rx_offloads); + +#endif diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_regs_cn9k.h b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_regs_cn9k.h new file mode 100644 index 00000000000000..25e2a876ebba46 --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_regs_cn9k.h @@ -0,0 +1,154 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. 
+ * + */ +#ifndef _OCTEP_VF_REGS_CN9K_H_ +#define _OCTEP_VF_REGS_CN9K_H_ + +/*############################ RST #########################*/ +#define CN93_VF_CONFIG_XPANSION_BAR 0x38 +#define CN93_VF_CONFIG_PCIE_CAP 0x70 +#define CN93_VF_CONFIG_PCIE_DEVCAP 0x74 +#define CN93_VF_CONFIG_PCIE_DEVCTL 0x78 +#define CN93_VF_CONFIG_PCIE_LINKCAP 0x7C +#define CN93_VF_CONFIG_PCIE_LINKCTL 0x80 +#define CN93_VF_CONFIG_PCIE_SLOTCAP 0x84 +#define CN93_VF_CONFIG_PCIE_SLOTCTL 0x88 + +#define CN93_VF_RING_OFFSET BIT_ULL(17) + +/*###################### RING IN REGISTERS #########################*/ +#define CN93_VF_SDP_R_IN_CONTROL_START 0x10000 +#define CN93_VF_SDP_R_IN_ENABLE_START 0x10010 +#define CN93_VF_SDP_R_IN_INSTR_BADDR_START 0x10020 +#define CN93_VF_SDP_R_IN_INSTR_RSIZE_START 0x10030 +#define CN93_VF_SDP_R_IN_INSTR_DBELL_START 0x10040 +#define CN93_VF_SDP_R_IN_CNTS_START 0x10050 +#define CN93_VF_SDP_R_IN_INT_LEVELS_START 0x10060 +#define CN93_VF_SDP_R_IN_PKT_CNT_START 0x10080 +#define CN93_VF_SDP_R_IN_BYTE_CNT_START 0x10090 + +#define CN93_VF_SDP_R_IN_CONTROL(ring) \ + (CN93_VF_SDP_R_IN_CONTROL_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_IN_ENABLE(ring) \ + (CN93_VF_SDP_R_IN_ENABLE_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_IN_INSTR_BADDR(ring) \ + (CN93_VF_SDP_R_IN_INSTR_BADDR_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_IN_INSTR_RSIZE(ring) \ + (CN93_VF_SDP_R_IN_INSTR_RSIZE_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_IN_INSTR_DBELL(ring) \ + (CN93_VF_SDP_R_IN_INSTR_DBELL_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_IN_CNTS(ring) \ + (CN93_VF_SDP_R_IN_CNTS_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_IN_INT_LEVELS(ring) \ + (CN93_VF_SDP_R_IN_INT_LEVELS_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_IN_PKT_CNT(ring) \ + (CN93_VF_SDP_R_IN_PKT_CNT_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_IN_BYTE_CNT(ring) \ + 
(CN93_VF_SDP_R_IN_BYTE_CNT_START + ((ring) * CN93_VF_RING_OFFSET)) + +/*------------------ R_IN Masks ----------------*/ + +/** Rings per Virtual Function **/ +#define CN93_VF_R_IN_CTL_RPVF_MASK (0xF) +#define CN93_VF_R_IN_CTL_RPVF_POS (48) + +/* Number of instructions to be read in one MAC read request. + * setting to Max value(4) + **/ +#define CN93_VF_R_IN_CTL_IDLE BIT_ULL(28) +#define CN93_VF_R_IN_CTL_RDSIZE (0x3ULL << 25) +#define CN93_VF_R_IN_CTL_IS_64B BIT_ULL(24) +#define CN93_VF_R_IN_CTL_D_NSR BIT_ULL(8) +#define CN93_VF_R_IN_CTL_D_ESR BIT_ULL(6) +#define CN93_VF_R_IN_CTL_D_ROR BIT_ULL(5) +#define CN93_VF_R_IN_CTL_NSR BIT_ULL(3) +#define CN93_VF_R_IN_CTL_ESR BIT_ULL(1) +#define CN93_VF_R_IN_CTL_ROR BIT_ULL(0) + +#define CN93_VF_R_IN_CTL_MASK (CN93_VF_R_IN_CTL_RDSIZE | CN93_VF_R_IN_CTL_IS_64B) + +/*###################### RING OUT REGISTERS #########################*/ +#define CN93_VF_SDP_R_OUT_CNTS_START 0x10100 +#define CN93_VF_SDP_R_OUT_INT_LEVELS_START 0x10110 +#define CN93_VF_SDP_R_OUT_SLIST_BADDR_START 0x10120 +#define CN93_VF_SDP_R_OUT_SLIST_RSIZE_START 0x10130 +#define CN93_VF_SDP_R_OUT_SLIST_DBELL_START 0x10140 +#define CN93_VF_SDP_R_OUT_CONTROL_START 0x10150 +#define CN93_VF_SDP_R_OUT_ENABLE_START 0x10160 +#define CN93_VF_SDP_R_OUT_PKT_CNT_START 0x10180 +#define CN93_VF_SDP_R_OUT_BYTE_CNT_START 0x10190 + +#define CN93_VF_SDP_R_OUT_CONTROL(ring) \ + (CN93_VF_SDP_R_OUT_CONTROL_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_OUT_ENABLE(ring) \ + (CN93_VF_SDP_R_OUT_ENABLE_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_OUT_SLIST_BADDR(ring) \ + (CN93_VF_SDP_R_OUT_SLIST_BADDR_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_OUT_SLIST_RSIZE(ring) \ + (CN93_VF_SDP_R_OUT_SLIST_RSIZE_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_OUT_SLIST_DBELL(ring) \ + (CN93_VF_SDP_R_OUT_SLIST_DBELL_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_OUT_CNTS(ring) \ + 
(CN93_VF_SDP_R_OUT_CNTS_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_OUT_INT_LEVELS(ring) \ + (CN93_VF_SDP_R_OUT_INT_LEVELS_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_OUT_PKT_CNT(ring) \ + (CN93_VF_SDP_R_OUT_PKT_CNT_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_OUT_BYTE_CNT(ring) \ + (CN93_VF_SDP_R_OUT_BYTE_CNT_START + ((ring) * CN93_VF_RING_OFFSET)) + +/*------------------ R_OUT Masks ----------------*/ +#define CN93_VF_R_OUT_INT_LEVELS_BMODE BIT_ULL(63) +#define CN93_VF_R_OUT_INT_LEVELS_TIMET (32) + +#define CN93_VF_R_OUT_CTL_IDLE BIT_ULL(40) +#define CN93_VF_R_OUT_CTL_ES_I BIT_ULL(34) +#define CN93_VF_R_OUT_CTL_NSR_I BIT_ULL(33) +#define CN93_VF_R_OUT_CTL_ROR_I BIT_ULL(32) +#define CN93_VF_R_OUT_CTL_ES_D BIT_ULL(30) +#define CN93_VF_R_OUT_CTL_NSR_D BIT_ULL(29) +#define CN93_VF_R_OUT_CTL_ROR_D BIT_ULL(28) +#define CN93_VF_R_OUT_CTL_ES_P BIT_ULL(26) +#define CN93_VF_R_OUT_CTL_NSR_P BIT_ULL(25) +#define CN93_VF_R_OUT_CTL_ROR_P BIT_ULL(24) +#define CN93_VF_R_OUT_CTL_IMODE BIT_ULL(23) + +/* ##################### Mail Box Registers ########################## */ +/* SDP PF to VF Mailbox Data Register */ +#define CN93_VF_SDP_R_MBOX_PF_VF_DATA_START 0x10210 +/* SDP Packet PF to VF Mailbox Interrupt Register */ +#define CN93_VF_SDP_R_MBOX_PF_VF_INT_START 0x10220 +/* SDP VF to PF Mailbox Data Register */ +#define CN93_VF_SDP_R_MBOX_VF_PF_DATA_START 0x10230 + +#define CN93_VF_SDP_R_MBOX_PF_VF_INT_ENAB BIT_ULL(1) +#define CN93_VF_SDP_R_MBOX_PF_VF_INT_STATUS BIT_ULL(0) + +#define CN93_VF_SDP_R_MBOX_PF_VF_DATA(ring) \ + (CN93_VF_SDP_R_MBOX_PF_VF_DATA_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_MBOX_PF_VF_INT(ring) \ + (CN93_VF_SDP_R_MBOX_PF_VF_INT_START + ((ring) * CN93_VF_RING_OFFSET)) + +#define CN93_VF_SDP_R_MBOX_VF_PF_DATA(ring) \ + (CN93_VF_SDP_R_MBOX_VF_PF_DATA_START + ((ring) * CN93_VF_RING_OFFSET)) +#endif /* _OCTEP_VF_REGS_CN9K_H_ */ diff --git 
a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_regs_cnxk.h b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_regs_cnxk.h new file mode 100644 index 00000000000000..2e156745ef6406 --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_regs_cnxk.h @@ -0,0 +1,162 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. + * + */ +#ifndef _OCTEP_VF_REGS_CNXK_H_ +#define _OCTEP_VF_REGS_CNXK_H_ + +/*############################ RST #########################*/ +#define CNXK_VF_CONFIG_XPANSION_BAR 0x38 +#define CNXK_VF_CONFIG_PCIE_CAP 0x70 +#define CNXK_VF_CONFIG_PCIE_DEVCAP 0x74 +#define CNXK_VF_CONFIG_PCIE_DEVCTL 0x78 +#define CNXK_VF_CONFIG_PCIE_LINKCAP 0x7C +#define CNXK_VF_CONFIG_PCIE_LINKCTL 0x80 +#define CNXK_VF_CONFIG_PCIE_SLOTCAP 0x84 +#define CNXK_VF_CONFIG_PCIE_SLOTCTL 0x88 + +#define CNXK_VF_RING_OFFSET (0x1ULL << 17) + +/*###################### RING IN REGISTERS #########################*/ +#define CNXK_VF_SDP_R_IN_CONTROL_START 0x10000 +#define CNXK_VF_SDP_R_IN_ENABLE_START 0x10010 +#define CNXK_VF_SDP_R_IN_INSTR_BADDR_START 0x10020 +#define CNXK_VF_SDP_R_IN_INSTR_RSIZE_START 0x10030 +#define CNXK_VF_SDP_R_IN_INSTR_DBELL_START 0x10040 +#define CNXK_VF_SDP_R_IN_CNTS_START 0x10050 +#define CNXK_VF_SDP_R_IN_INT_LEVELS_START 0x10060 +#define CNXK_VF_SDP_R_IN_PKT_CNT_START 0x10080 +#define CNXK_VF_SDP_R_IN_BYTE_CNT_START 0x10090 +#define CNXK_VF_SDP_R_ERR_TYPE_START 0x10400 + +#define CNXK_VF_SDP_R_ERR_TYPE(ring) \ + (CNXK_VF_SDP_R_ERR_TYPE_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_IN_CONTROL(ring) \ + (CNXK_VF_SDP_R_IN_CONTROL_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_IN_ENABLE(ring) \ + (CNXK_VF_SDP_R_IN_ENABLE_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_IN_INSTR_BADDR(ring) \ + (CNXK_VF_SDP_R_IN_INSTR_BADDR_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_IN_INSTR_RSIZE(ring) \ + 
(CNXK_VF_SDP_R_IN_INSTR_RSIZE_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_IN_INSTR_DBELL(ring) \ + (CNXK_VF_SDP_R_IN_INSTR_DBELL_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_IN_CNTS(ring) \ + (CNXK_VF_SDP_R_IN_CNTS_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_IN_INT_LEVELS(ring) \ + (CNXK_VF_SDP_R_IN_INT_LEVELS_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_IN_PKT_CNT(ring) \ + (CNXK_VF_SDP_R_IN_PKT_CNT_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_IN_BYTE_CNT(ring) \ + (CNXK_VF_SDP_R_IN_BYTE_CNT_START + ((ring) * CNXK_VF_RING_OFFSET)) + +/*------------------ R_IN Masks ----------------*/ + +/** Rings per Virtual Function **/ +#define CNXK_VF_R_IN_CTL_RPVF_MASK (0xF) +#define CNXK_VF_R_IN_CTL_RPVF_POS (48) + +/* Number of instructions to be read in one MAC read request. + * setting to Max value(4) + **/ +#define CNXK_VF_R_IN_CTL_IDLE (0x1ULL << 28) +#define CNXK_VF_R_IN_CTL_RDSIZE (0x3ULL << 25) +#define CNXK_VF_R_IN_CTL_IS_64B (0x1ULL << 24) +#define CNXK_VF_R_IN_CTL_D_NSR (0x1ULL << 8) +#define CNXK_VF_R_IN_CTL_D_ESR (0x1ULL << 6) +#define CNXK_VF_R_IN_CTL_D_ROR (0x1ULL << 5) +#define CNXK_VF_R_IN_CTL_NSR (0x1ULL << 3) +#define CNXK_VF_R_IN_CTL_ESR (0x1ULL << 1) +#define CNXK_VF_R_IN_CTL_ROR (0x1ULL << 0) + +#define CNXK_VF_R_IN_CTL_MASK (CNXK_VF_R_IN_CTL_RDSIZE | CNXK_VF_R_IN_CTL_IS_64B) + +/*###################### RING OUT REGISTERS #########################*/ +#define CNXK_VF_SDP_R_OUT_CNTS_START 0x10100 +#define CNXK_VF_SDP_R_OUT_INT_LEVELS_START 0x10110 +#define CNXK_VF_SDP_R_OUT_SLIST_BADDR_START 0x10120 +#define CNXK_VF_SDP_R_OUT_SLIST_RSIZE_START 0x10130 +#define CNXK_VF_SDP_R_OUT_SLIST_DBELL_START 0x10140 +#define CNXK_VF_SDP_R_OUT_CONTROL_START 0x10150 +#define CNXK_VF_SDP_R_OUT_WMARK_START 0x10160 +#define CNXK_VF_SDP_R_OUT_ENABLE_START 0x10170 +#define CNXK_VF_SDP_R_OUT_PKT_CNT_START 0x10180 +#define CNXK_VF_SDP_R_OUT_BYTE_CNT_START 0x10190 + +#define 
CNXK_VF_SDP_R_OUT_CONTROL(ring) \ + (CNXK_VF_SDP_R_OUT_CONTROL_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_OUT_ENABLE(ring) \ + (CNXK_VF_SDP_R_OUT_ENABLE_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_OUT_SLIST_BADDR(ring) \ + (CNXK_VF_SDP_R_OUT_SLIST_BADDR_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_OUT_SLIST_RSIZE(ring) \ + (CNXK_VF_SDP_R_OUT_SLIST_RSIZE_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_OUT_SLIST_DBELL(ring) \ + (CNXK_VF_SDP_R_OUT_SLIST_DBELL_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_OUT_WMARK(ring) \ + (CNXK_VF_SDP_R_OUT_WMARK_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_OUT_CNTS(ring) \ + (CNXK_VF_SDP_R_OUT_CNTS_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_OUT_INT_LEVELS(ring) \ + (CNXK_VF_SDP_R_OUT_INT_LEVELS_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_OUT_PKT_CNT(ring) \ + (CNXK_VF_SDP_R_OUT_PKT_CNT_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_OUT_BYTE_CNT(ring) \ + (CNXK_VF_SDP_R_OUT_BYTE_CNT_START + ((ring) * CNXK_VF_RING_OFFSET)) + +/*------------------ R_OUT Masks ----------------*/ +#define CNXK_VF_R_OUT_INT_LEVELS_BMODE BIT_ULL(63) +#define CNXK_VF_R_OUT_INT_LEVELS_TIMET (32) + +#define CNXK_VF_R_OUT_CTL_IDLE BIT_ULL(40) +#define CNXK_VF_R_OUT_CTL_ES_I BIT_ULL(34) +#define CNXK_VF_R_OUT_CTL_NSR_I BIT_ULL(33) +#define CNXK_VF_R_OUT_CTL_ROR_I BIT_ULL(32) +#define CNXK_VF_R_OUT_CTL_ES_D BIT_ULL(30) +#define CNXK_VF_R_OUT_CTL_NSR_D BIT_ULL(29) +#define CNXK_VF_R_OUT_CTL_ROR_D BIT_ULL(28) +#define CNXK_VF_R_OUT_CTL_ES_P BIT_ULL(26) +#define CNXK_VF_R_OUT_CTL_NSR_P BIT_ULL(25) +#define CNXK_VF_R_OUT_CTL_ROR_P BIT_ULL(24) +#define CNXK_VF_R_OUT_CTL_IMODE BIT_ULL(23) + +/* ##################### Mail Box Registers ########################## */ +/* SDP PF to VF Mailbox Data Register */ +#define CNXK_VF_SDP_R_MBOX_PF_VF_DATA_START 0x10210 +/* SDP Packet PF to VF Mailbox 
Interrupt Register */ +#define CNXK_VF_SDP_R_MBOX_PF_VF_INT_START 0x10220 +/* SDP VF to PF Mailbox Data Register */ +#define CNXK_VF_SDP_R_MBOX_VF_PF_DATA_START 0x10230 + +#define CNXK_VF_SDP_R_MBOX_PF_VF_INT_ENAB BIT_ULL(1) +#define CNXK_VF_SDP_R_MBOX_PF_VF_INT_STATUS BIT_ULL(0) + +#define CNXK_VF_SDP_R_MBOX_PF_VF_DATA(ring) \ + (CNXK_VF_SDP_R_MBOX_PF_VF_DATA_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_MBOX_PF_VF_INT(ring) \ + (CNXK_VF_SDP_R_MBOX_PF_VF_INT_START + ((ring) * CNXK_VF_RING_OFFSET)) + +#define CNXK_VF_SDP_R_MBOX_VF_PF_DATA(ring) \ + (CNXK_VF_SDP_R_MBOX_VF_PF_DATA_START + ((ring) * CNXK_VF_RING_OFFSET)) +#endif /* _OCTEP_VF_REGS_CNXK_H_ */ diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_rx.c b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_rx.c new file mode 100644 index 00000000000000..4f1a8157ce3900 --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_rx.c @@ -0,0 +1,42 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. + * + */ + +#include +#include + +#include "octep_vf_config.h" +#include "octep_vf_main.h" + +/** + * octep_vf_setup_oqs() - setup resources for all Rx queues. + * + * @oct: Octeon device private data structure. + */ +int octep_vf_setup_oqs(struct octep_vf_device *oct) +{ + return -1; +} + +/** + * octep_vf_oq_dbell_init() - Initialize Rx queue doorbell. + * + * @oct: Octeon device private data structure. + * + * Write number of descriptors to Rx queue doorbell register. + */ +void octep_vf_oq_dbell_init(struct octep_vf_device *oct) +{ +} + +/** + * octep_vf_free_oqs() - Free resources of all Rx queues. + * + * @oct: Octeon device private data structure. 
+ */ +void octep_vf_free_oqs(struct octep_vf_device *oct) +{ +} diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_rx.h b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_rx.h new file mode 100644 index 00000000000000..fe46838b5200ff --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_rx.h @@ -0,0 +1,224 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. + * + */ + +#ifndef _OCTEP_VF_RX_H_ +#define _OCTEP_VF_RX_H_ + +/* struct octep_vf_oq_desc_hw - Octeon Hardware OQ descriptor format. + * + * The descriptor ring is made of descriptors which have 2 64-bit values: + * + * @buffer_ptr: DMA address of the skb->data + * @info_ptr: DMA address of host memory, used to update pkt count by hw. + * This is currently unused to save pci writes. + */ +struct octep_vf_oq_desc_hw { + dma_addr_t buffer_ptr; + u64 info_ptr; +}; + +static_assert(sizeof(struct octep_vf_oq_desc_hw) == 16); + +#define OCTEP_VF_OQ_DESC_SIZE (sizeof(struct octep_vf_oq_desc_hw)) + +/* Rx offload flags */ +#define OCTEP_VF_RX_OFFLOAD_VLAN_STRIP BIT(0) +#define OCTEP_VF_RX_OFFLOAD_IPV4_CKSUM BIT(1) +#define OCTEP_VF_RX_OFFLOAD_UDP_CKSUM BIT(2) +#define OCTEP_VF_RX_OFFLOAD_TCP_CKSUM BIT(3) + +#define OCTEP_VF_RX_OFFLOAD_CKSUM (OCTEP_VF_RX_OFFLOAD_IPV4_CKSUM | \ + OCTEP_VF_RX_OFFLOAD_UDP_CKSUM | \ + OCTEP_VF_RX_OFFLOAD_TCP_CKSUM) + +#define OCTEP_VF_RX_IP_CSUM(flags) ((flags) & \ + (OCTEP_VF_RX_OFFLOAD_IPV4_CKSUM | \ + OCTEP_VF_RX_OFFLOAD_TCP_CKSUM | \ + OCTEP_VF_RX_OFFLOAD_UDP_CKSUM)) + +/* bit 0 is vlan strip */ +#define OCTEP_VF_RX_CSUM_IP_VERIFIED BIT(1) +#define OCTEP_VF_RX_CSUM_L4_VERIFIED BIT(2) + +#define OCTEP_VF_RX_CSUM_VERIFIED(flags) ((flags) & \ + (OCTEP_VF_RX_CSUM_L4_VERIFIED | \ + OCTEP_VF_RX_CSUM_IP_VERIFIED)) + +/* Extended Response Header in packet data received from Hardware. + * Includes metadata like checksum status. 
+ * this is valid only if hardware/firmware published support for this. + * This is at offset 0 of packet data (skb->data). + */ +struct octep_vf_oq_resp_hw_ext { + /* Reserved. */ + u64 rsvd:48; + + /* rx offload flags */ + u16 rx_ol_flags; +}; + +static_assert(sizeof(struct octep_vf_oq_resp_hw_ext) == 8); + +#define OCTEP_VF_OQ_RESP_HW_EXT_SIZE (sizeof(struct octep_vf_oq_resp_hw_ext)) + +/* Length of Rx packet DMA'ed by Octeon to Host. + * this is in bigendian; so need to be converted to cpu endian. + * Octeon writes this at the beginning of Rx buffer (skb->data). + */ +struct octep_vf_oq_resp_hw { + /* The Length of the packet. */ + __be64 length; +}; + +static_assert(sizeof(struct octep_vf_oq_resp_hw) == 8); + +#define OCTEP_VF_OQ_RESP_HW_SIZE (sizeof(struct octep_vf_oq_resp_hw)) + +/* Pointer to data buffer. + * Driver keeps a pointer to the data buffer that it made available to + * the Octeon device. Since the descriptor ring keeps physical (bus) + * addresses, this field is required for the driver to keep track of + * the virtual address pointers. The fields are operated by + * OS-dependent routines. + */ +struct octep_vf_rx_buffer { + struct page *page; + + /* length from rx hardware descriptor after converting to cpu endian */ + u64 len; +}; + +#define OCTEP_VF_OQ_RECVBUF_SIZE (sizeof(struct octep_vf_rx_buffer)) + +/* Output Queue statistics. Each output queue has four stats fields. */ +struct octep_vf_oq_stats { + /* Number of packets received from the Device. */ + u64 packets; + + /* Number of bytes received from the Device. */ + u64 bytes; + + /* Number of times failed to allocate buffers. 
*/ + u64 alloc_failures; +}; + +#define OCTEP_VF_OQ_STATS_SIZE (sizeof(struct octep_vf_oq_stats)) + +/* Hardware interface Rx statistics */ +struct octep_vf_iface_rx_stats { + /* Received packets */ + u64 pkts; + + /* Octets of received packets */ + u64 octets; + + /* Received PAUSE and Control packets */ + u64 pause_pkts; + + /* Received PAUSE and Control octets */ + u64 pause_octets; + + /* Filtered DMAC0 packets */ + u64 dmac0_pkts; + + /* Filtered DMAC0 octets */ + u64 dmac0_octets; + + /* Packets dropped due to RX FIFO full */ + u64 dropped_pkts_fifo_full; + + /* Octets dropped due to RX FIFO full */ + u64 dropped_octets_fifo_full; + + /* Error packets */ + u64 err_pkts; + + /* Filtered DMAC1 packets */ + u64 dmac1_pkts; + + /* Filtered DMAC1 octets */ + u64 dmac1_octets; + + /* NCSI-bound packets dropped */ + u64 ncsi_dropped_pkts; + + /* NCSI-bound octets dropped */ + u64 ncsi_dropped_octets; + + /* Multicast packets received. */ + u64 mcast_pkts; + + /* Broadcast packets received. */ + u64 bcast_pkts; + +}; + +/* The Descriptor Ring Output Queue structure. + * This structure has all the information required to implement a + * Octeon OQ. + */ +struct octep_vf_oq { + u32 q_no; + + struct octep_vf_device *octep_vf_dev; + struct net_device *netdev; + struct device *dev; + + struct napi_struct *napi; + + /* The receive buffer list. This list has the virtual addresses + * of the buffers. + */ + struct octep_vf_rx_buffer *buff_info; + + /* Pointer to the mapped packet credit register. + * Host writes number of info/buffer ptrs available to this register + */ + u8 __iomem *pkts_credit_reg; + + /* Pointer to the mapped packet sent register. + * Octeon writes the number of packets DMA'ed to host memory + * in this register. + */ + u8 __iomem *pkts_sent_reg; + + /* Statistics for this OQ. 
*/ + struct octep_vf_oq_stats stats; + + /* Packets pending to be processed */ + u32 pkts_pending; + u32 last_pkt_count; + + /* Index in the ring where the driver should read the next packet */ + u32 host_read_idx; + + /* Number of descriptors in this ring. */ + u32 max_count; + u32 ring_size_mask; + + /* The number of descriptors pending refill. */ + u32 refill_count; + + /* Index in the ring where the driver will refill the + * descriptor's buffer + */ + u32 host_refill_idx; + u32 refill_threshold; + + /* The size of each buffer pointed by the buffer pointer. */ + u32 buffer_size; + u32 max_single_buffer_size; + + /* The 8B aligned descriptor ring starts at this address. */ + struct octep_vf_oq_desc_hw *desc_ring; + + /* DMA mapped address of the OQ descriptor ring. */ + dma_addr_t desc_ring_dma; +}; + +#define OCTEP_VF_OQ_SIZE (sizeof(struct octep_vf_oq)) +#endif /* _OCTEP_VF_RX_H_ */ diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_tx.c b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_tx.c new file mode 100644 index 00000000000000..232ba479ecf6e5 --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_tx.c @@ -0,0 +1,43 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. + * + */ + +#include +#include + +#include "octep_vf_config.h" +#include "octep_vf_main.h" + +/** + * octep_vf_clean_iqs() - Clean Tx queues to shutdown the device. + * + * @oct: Octeon device private data structure. + * + * Free the buffers in Tx queue descriptors pending completion and + * reset queue indices + */ +void octep_vf_clean_iqs(struct octep_vf_device *oct) +{ +} + +/** + * octep_vf_setup_iqs() - setup resources for all Tx queues. + * + * @oct: Octeon device private data structure. + */ +int octep_vf_setup_iqs(struct octep_vf_device *oct) +{ + return -1; +} + +/** + * octep_vf_free_iqs() - Free resources of all Tx queues. + * + * @oct: Octeon device private data structure. 
+ */ +void octep_vf_free_iqs(struct octep_vf_device *oct) +{ +} diff --git a/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_tx.h b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_tx.h new file mode 100644 index 00000000000000..f338b975103c30 --- /dev/null +++ b/drivers/net/ethernet/marvell/octeon_ep_vf/octep_vf_tx.h @@ -0,0 +1,276 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Marvell Octeon EP (EndPoint) VF Ethernet Driver + * + * Copyright (C) 2020 Marvell. + * + */ + +#ifndef _OCTEP_VF_TX_H_ +#define _OCTEP_VF_TX_H_ + +#define IQ_SEND_OK 0 +#define IQ_SEND_STOP 1 +#define IQ_SEND_FAILED -1 + +#define TX_BUFTYPE_NONE 0 +#define TX_BUFTYPE_NET 1 +#define TX_BUFTYPE_NET_SG 2 +#define NUM_TX_BUFTYPES 3 + +/* Hardware format for Scatter/Gather list + * + * 63 48|47 32|31 16|15 0 + * ----------------------------------------- + * | Len 0 | Len 1 | Len 2 | Len 3 | + * ----------------------------------------- + * | Ptr 0 | + * ----------------------------------------- + * | Ptr 1 | + * ----------------------------------------- + * | Ptr 2 | + * ----------------------------------------- + * | Ptr 3 | + * ----------------------------------------- + */ +struct octep_vf_tx_sglist_desc { + u16 len[4]; + dma_addr_t dma_ptr[4]; +}; + +static_assert(sizeof(struct octep_vf_tx_sglist_desc) == 40); + +/* Each Scatter/Gather entry sent to hardware holds four pointers. + * So, number of entries required is (MAX_SKB_FRAGS + 1)/4, where '+1' + * is for main skb which also goes as a gather buffer to Octeon hardware. + * To allocate sufficient SGLIST entries for a packet with max fragments, + * align by adding 3 before calculating max SGLIST entries per packet.
+ */ +#define OCTEP_VF_SGLIST_ENTRIES_PER_PKT ((MAX_SKB_FRAGS + 1 + 3) / 4) +#define OCTEP_VF_SGLIST_SIZE_PER_PKT \ + (OCTEP_VF_SGLIST_ENTRIES_PER_PKT * sizeof(struct octep_vf_tx_sglist_desc)) + +struct octep_vf_tx_buffer { + struct sk_buff *skb; + dma_addr_t dma; + struct octep_vf_tx_sglist_desc *sglist; + dma_addr_t sglist_dma; + u8 gather; +}; + +#define OCTEP_VF_IQ_TXBUFF_INFO_SIZE (sizeof(struct octep_vf_tx_buffer)) + +/* VF Hardware interface Tx statistics */ +struct octep_vf_iface_tx_stats { + /* Total frames sent on the interface */ + u64 pkts; + + /* Total octets sent on the interface */ + u64 octs; + + /* Packets sent to a broadcast DMAC */ + u64 bcst; + + /* Packets sent to the multicast DMAC */ + u64 mcst; + + /* Packets dropped */ + u64 dropped; + + /* Reserved */ + u64 reserved[13]; +}; + +/* VF Input Queue statistics */ +struct octep_vf_iq_stats { + /* Instructions posted to this queue. */ + u64 instr_posted; + + /* Instructions copied by hardware for processing. */ + u64 instr_completed; + + /* Instructions that could not be processed. */ + u64 instr_dropped; + + /* Bytes sent through this queue. */ + u64 bytes_sent; + + /* Gather entries sent through this queue. */ + u64 sgentry_sent; + + /* Number of transmit failures due to TX_BUSY */ + u64 tx_busy; + + /* Number of times the queue is restarted */ + u64 restart_cnt; +}; + +/* The instruction (input) queue. + * The input queue is used to post raw (instruction) mode data or packet + * data to Octeon device from the host. Each input queue (up to 4) for + * an Octeon device has one such structure to represent it.
+ */ +struct octep_vf_iq { + u32 q_no; + + struct octep_vf_device *octep_vf_dev; + struct net_device *netdev; + struct device *dev; + struct netdev_queue *netdev_q; + + /* Index in input ring where driver should write the next packet */ + u16 host_write_index; + + /* Index in input ring where Octeon is expected to read next packet */ + u16 octep_vf_read_index; + + /* This index aids in finding the window in the queue where Octeon + * has read the commands. + */ + u16 flush_index; + + /* Statistics for this input queue. */ + struct octep_vf_iq_stats stats; + + /* Pointer to the Virtual Base addr of the input ring. */ + struct octep_vf_tx_desc_hw *desc_ring; + + /* DMA mapped base address of the input descriptor ring. */ + dma_addr_t desc_ring_dma; + + /* Info of Tx buffers pending completion. */ + struct octep_vf_tx_buffer *buff_info; + + /* Base pointer to Scatter/Gather lists for all ring descriptors. */ + struct octep_vf_tx_sglist_desc *sglist; + + /* DMA mapped addr of Scatter Gather Lists */ + dma_addr_t sglist_dma; + + /* Octeon doorbell register for the ring. */ + u8 __iomem *doorbell_reg; + + /* Octeon instruction count register for this ring. */ + u8 __iomem *inst_cnt_reg; + + /* interrupt level register for this ring */ + u8 __iomem *intr_lvl_reg; + + /* Maximum no. of instructions in this queue. */ + u32 max_count; + u32 ring_size_mask; + + u32 pkt_in_done; + u32 pkts_processed; + + u32 status; + + /* Number of instructions pending to be posted to Octeon. */ + u32 fill_cnt; + + /* The max. number of instructions that can be held pending by the + * driver before ringing doorbell. + */ + u32 fill_threshold; +}; + +/* Hardware Tx Instruction Header */ +struct octep_vf_instr_hdr { + /* Data Len */ + u64 tlen:16; + + /* Reserved */ + u64 rsvd:20; + + /* PKIND for SDP */ + u64 pkind:6; + + /* Front Data size */ + u64 fsz:6; + + /* No. 
of entries in gather list */ + u64 gsz:14; + + /* Gather indicator 1=gather*/ + u64 gather:1; + + /* Reserved3 */ + u64 reserved3:1; +}; + +static_assert(sizeof(struct octep_vf_instr_hdr) == 8); + +/* Tx offload flags */ +#define OCTEP_VF_TX_OFFLOAD_VLAN_INSERT BIT(0) +#define OCTEP_VF_TX_OFFLOAD_IPV4_CKSUM BIT(1) +#define OCTEP_VF_TX_OFFLOAD_UDP_CKSUM BIT(2) +#define OCTEP_VF_TX_OFFLOAD_TCP_CKSUM BIT(3) +#define OCTEP_VF_TX_OFFLOAD_SCTP_CKSUM BIT(4) +#define OCTEP_VF_TX_OFFLOAD_TCP_TSO BIT(5) +#define OCTEP_VF_TX_OFFLOAD_UDP_TSO BIT(6) + +#define OCTEP_VF_TX_OFFLOAD_CKSUM (OCTEP_VF_TX_OFFLOAD_IPV4_CKSUM | \ + OCTEP_VF_TX_OFFLOAD_UDP_CKSUM | \ + OCTEP_VF_TX_OFFLOAD_TCP_CKSUM) + +#define OCTEP_VF_TX_OFFLOAD_TSO (OCTEP_VF_TX_OFFLOAD_TCP_TSO | \ + OCTEP_VF_TX_OFFLOAD_UDP_TSO) + +#define OCTEP_VF_TX_IP_CSUM(flags) ((flags) & \ + (OCTEP_VF_TX_OFFLOAD_IPV4_CKSUM | \ + OCTEP_VF_TX_OFFLOAD_TCP_CKSUM | \ + OCTEP_VF_TX_OFFLOAD_UDP_CKSUM)) + +#define OCTEP_VF_TX_TSO(flags) ((flags) & \ + (OCTEP_VF_TX_OFFLOAD_TCP_TSO | \ + OCTEP_VF_TX_OFFLOAD_UDP_TSO)) + +struct tx_mdata { + /* offload flags */ + u16 ol_flags; + + /* gso size */ + u16 gso_size; + + /* gso flags */ + u16 gso_segs; + + /* reserved */ + u16 rsvd1; + + /* reserved */ + u64 rsvd2; +}; + +static_assert(sizeof(struct tx_mdata) == 16); + +/* 64-byte Tx instruction format. + * Format of instruction for a 64-byte mode input queue. + * + * only first 16-bytes (dptr and ih) are mandatory; rest are optional + * and filled by the driver based on firmware/hardware capabilities. + * These optional headers together called Front Data and its size is + * described by ih->fsz. + */ +struct octep_vf_tx_desc_hw { + /* Pointer where the input data is available. */ + u64 dptr; + + /* Instruction Header. */ + union { + struct octep_vf_instr_hdr ih; + u64 ih64; + }; + + union { + u64 txm64[2]; + struct tx_mdata txm; + }; + + /* Additional headers available in a 64-byte instruction. 
*/ + u64 exhdr[4]; +}; + +static_assert(sizeof(struct octep_vf_tx_desc_hw) == 64); + +#define OCTEP_VF_IQ_DESC_SIZE (sizeof(struct octep_vf_tx_desc_hw)) +#endif /* _OCTEP_VF_TX_H_ */ -- cgit 1.2.3-korg From 1963e65b3dfee3f42dcb5d40b28764ec9939792c Mon Sep 17 00:00:00 2001 From: Suraj Jaiswal Date: Fri, 9 Feb 2024 14:20:11 +0530 Subject: dt-bindings: net: qcom,ethqos: add binding doc for safety IRQ for sa8775p Add binding doc for safety IRQ. The safety IRQ will be triggered for ECC(error correction code), DPP(data path parity), FSM(finite state machine) error. Signed-off-by: Suraj Jaiswal Reviewed-by: Rob Herring Signed-off-by: David S. Miller --- Documentation/devicetree/bindings/net/qcom,ethqos.yaml | 9 ++++++--- Documentation/devicetree/bindings/net/snps,dwmac.yaml | 6 ++++-- 2 files changed, 10 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/qcom,ethqos.yaml b/Documentation/devicetree/bindings/net/qcom,ethqos.yaml index 7bdb412a018553..69a337c7e345ea 100644 --- a/Documentation/devicetree/bindings/net/qcom,ethqos.yaml +++ b/Documentation/devicetree/bindings/net/qcom,ethqos.yaml @@ -37,12 +37,14 @@ properties: items: - description: Combined signal for various interrupt events - description: The interrupt that occurs when Rx exits the LPI state + - description: The interrupt that occurs when HW safety error triggered interrupt-names: minItems: 1 items: - const: macirq - - const: eth_lpi + - enum: [eth_lpi, sfty] + - const: sfty clocks: maxItems: 4 @@ -89,8 +91,9 @@ examples: <&gcc GCC_ETH_PTP_CLK>, <&gcc GCC_ETH_RGMII_CLK>; interrupts = , - ; - interrupt-names = "macirq", "eth_lpi"; + , + ; + interrupt-names = "macirq", "eth_lpi", "sfty"; rx-fifo-depth = <4096>; tx-fifo-depth = <4096>; diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml index 90c4db178c676c..6b0341a8e0ea5f 100644 --- 
a/Documentation/devicetree/bindings/net/snps,dwmac.yaml +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml @@ -108,13 +108,15 @@ properties: - description: Combined signal for various interrupt events - description: The interrupt to manage the remote wake-up packet detection - description: The interrupt that occurs when Rx exits the LPI state + - description: The interrupt that occurs when HW safety error triggered interrupt-names: minItems: 1 items: - const: macirq - - enum: [eth_wake_irq, eth_lpi] - - const: eth_lpi + - enum: [eth_wake_irq, eth_lpi, sfty] + - enum: [eth_wake_irq, eth_lpi, sfty] + - enum: [eth_wake_irq, eth_lpi, sfty] clocks: minItems: 1 -- cgit 1.2.3-korg From fec846fa7eddf7bb651bf88bd78c7db1410ae3b1 Mon Sep 17 00:00:00 2001 From: Nicolas Maier Date: Sat, 20 Jan 2024 09:10:18 +0100 Subject: can: bcm: add recvmsg flags for own, local and remote traffic CAN RAW sockets allow userspace to tell if a received CAN frame comes from the same socket, another socket on the same host, or another host. See commit 1e55659ce6dd ("can-raw: add msg_flags to distinguish local traffic"). However, this feature is missing in CAN BCM sockets. Add the same feature to CAN BCM sockets. When reading a received frame (opcode RX_CHANGED) using recvmsg, two flags in msg->msg_flags may be set following the previous convention (from CAN RAW), to distinguish between 'own', 'local' and 'remote' CAN traffic. Update the documentation to reflect this change. 
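As a rough userspace illustration of the convention described above (a sketch only, not part of this patch): after recvmsg(2) returns on a CAN RAW or CAN BCM socket, msg_flags can be mapped to the origin of the frame. The helper name can_frame_origin() and the enum are hypothetical and exist only for this example; MSG_CONFIRM and MSG_DONTROUTE are the standard socket flags named in the documentation update.

```c
#include <sys/socket.h>

/* Hypothetical classification used only in this sketch. */
enum can_origin {
	CAN_ORIGIN_REMOTE,	/* frame came from another host */
	CAN_ORIGIN_LOCAL,	/* frame was created on this host by another socket */
	CAN_ORIGIN_OWN,		/* frame was sent via the receiving socket itself */
};

/* Map msg_flags, as filled in by recvmsg(2), to the frame origin.
 * Per the documented convention, MSG_CONFIRM marks 'own' traffic and
 * MSG_DONTROUTE marks any frame created on the local host, so the
 * MSG_CONFIRM check has to come first.
 */
static enum can_origin can_frame_origin(int msg_flags)
{
	if (msg_flags & MSG_CONFIRM)
		return CAN_ORIGIN_OWN;
	if (msg_flags & MSG_DONTROUTE)
		return CAN_ORIGIN_LOCAL;
	return CAN_ORIGIN_REMOTE;
}
```

A caller would typically pass msg.msg_flags from the struct msghdr it handed to recvmsg(). For BCM sockets the flags are only meaningful for RX_CHANGED messages that carry a single frame.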
Signed-off-by: Nicolas Maier Signed-off-by: Oliver Hartkopp Link: https://lore.kernel.org/all/20240120081018.2319-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde --- Documentation/networking/can.rst | 34 ++++++++++---------- net/can/bcm.c | 69 +++++++++++++++++++++++++++++++++------- 2 files changed, 75 insertions(+), 28 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/can.rst b/Documentation/networking/can.rst index d7e1ada905b2d3..62519d38c58bad 100644 --- a/Documentation/networking/can.rst +++ b/Documentation/networking/can.rst @@ -444,6 +444,24 @@ definitions are specified for CAN specific MTUs in include/linux/can.h: #define CANFD_MTU (sizeof(struct canfd_frame)) == 72 => CAN FD frame +Returned Message Flags +---------------------- + +When using the system call recvmsg(2) on a RAW or a BCM socket, the +msg->msg_flags field may contain the following flags: + +MSG_DONTROUTE: + set when the received frame was created on the local host. + +MSG_CONFIRM: + set when the frame was sent via the socket it is received on. + This flag can be interpreted as a 'transmission confirmation' when the + CAN driver supports the echo of frames on driver level, see + :ref:`socketcan-local-loopback1` and :ref:`socketcan-local-loopback2`. + (Note: In order to receive such messages on a RAW socket, + CAN_RAW_RECV_OWN_MSGS must be set.) + + .. _socketcan-raw-sockets: RAW Protocol Sockets with can_filters (SOCK_RAW) @@ -693,22 +711,6 @@ where the CAN_INV_FILTER flag is set in order to notch single CAN IDs or CAN ID ranges from the incoming traffic. -RAW Socket Returned Message Flags -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -When using recvmsg() call, the msg->msg_flags may contain following flags: - -MSG_DONTROUTE: - set when the received frame was created on the local host. - -MSG_CONFIRM: - set when the frame was sent via the socket it is received on. 
- This flag can be interpreted as a 'transmission confirmation' when the - CAN driver supports the echo of frames on driver level, see - :ref:`socketcan-local-loopback1` and :ref:`socketcan-local-loopback2`. - In order to receive such messages, CAN_RAW_RECV_OWN_MSGS must be set. - - Broadcast Manager Protocol Sockets (SOCK_DGRAM) ----------------------------------------------- diff --git a/net/can/bcm.c b/net/can/bcm.c index 9168114fc87f7b..27d5fcf0eac9dd 100644 --- a/net/can/bcm.c +++ b/net/can/bcm.c @@ -72,9 +72,11 @@ #define BCM_TIMER_SEC_MAX (400 * 24 * 60 * 60) /* use of last_frames[index].flags */ +#define RX_LOCAL 0x10 /* frame was created on the local host */ +#define RX_OWN 0x20 /* frame was sent via the socket it was received on */ #define RX_RECV 0x40 /* received data for this element */ #define RX_THR 0x80 /* element not been sent due to throttle feature */ -#define BCM_CAN_FLAGS_MASK 0x3F /* to clean private flags after usage */ +#define BCM_CAN_FLAGS_MASK 0x0F /* to clean private flags after usage */ /* get best masking value for can_rx_register() for a given single can_id */ #define REGMASK(id) ((id & CAN_EFF_FLAG) ? \ @@ -138,6 +140,16 @@ static LIST_HEAD(bcm_notifier_list); static DEFINE_SPINLOCK(bcm_notifier_lock); static struct bcm_sock *bcm_busy_notifier; +/* Return pointer to store the extra msg flags for bcm_recvmsg(). + * We use the space of one unsigned int beyond the 'struct sockaddr_can' + * in skb->cb. 
+ */ +static inline unsigned int *bcm_flags(struct sk_buff *skb) +{ + /* return pointer after struct sockaddr_can */ + return (unsigned int *)(&((struct sockaddr_can *)skb->cb)[1]); +} + static inline struct bcm_sock *bcm_sk(const struct sock *sk) { return (struct bcm_sock *)sk; @@ -325,6 +337,7 @@ static void bcm_send_to_user(struct bcm_op *op, struct bcm_msg_head *head, struct sock *sk = op->sk; unsigned int datalen = head->nframes * op->cfsiz; int err; + unsigned int *pflags; skb = alloc_skb(sizeof(*head) + datalen, gfp_any()); if (!skb) @@ -332,6 +345,14 @@ static void bcm_send_to_user(struct bcm_op *op, struct bcm_msg_head *head, skb_put_data(skb, head, sizeof(*head)); + /* ensure space for sockaddr_can and msg flags */ + sock_skb_cb_check_size(sizeof(struct sockaddr_can) + + sizeof(unsigned int)); + + /* initialize msg flags */ + pflags = bcm_flags(skb); + *pflags = 0; + if (head->nframes) { /* CAN frames starting here */ firstframe = (struct canfd_frame *)skb_tail_pointer(skb); @@ -344,8 +365,14 @@ static void bcm_send_to_user(struct bcm_op *op, struct bcm_msg_head *head, * relevant for updates that are generated by the * BCM, where nframes is 1 */ - if (head->nframes == 1) + if (head->nframes == 1) { + if (firstframe->flags & RX_LOCAL) + *pflags |= MSG_DONTROUTE; + if (firstframe->flags & RX_OWN) + *pflags |= MSG_CONFIRM; + firstframe->flags &= BCM_CAN_FLAGS_MASK; + } } if (has_timestamp) { @@ -360,7 +387,6 @@ static void bcm_send_to_user(struct bcm_op *op, struct bcm_msg_head *head, * containing the interface index. 
*/ - sock_skb_cb_check_size(sizeof(struct sockaddr_can)); addr = (struct sockaddr_can *)skb->cb; memset(addr, 0, sizeof(*addr)); addr->can_family = AF_CAN; @@ -444,7 +470,7 @@ static void bcm_rx_changed(struct bcm_op *op, struct canfd_frame *data) op->frames_filtered = op->frames_abs = 0; /* this element is not throttled anymore */ - data->flags &= (BCM_CAN_FLAGS_MASK|RX_RECV); + data->flags &= ~RX_THR; memset(&head, 0, sizeof(head)); head.opcode = RX_CHANGED; @@ -465,13 +491,17 @@ static void bcm_rx_changed(struct bcm_op *op, struct canfd_frame *data) */ static void bcm_rx_update_and_send(struct bcm_op *op, struct canfd_frame *lastdata, - const struct canfd_frame *rxdata) + const struct canfd_frame *rxdata, + unsigned char traffic_flags) { memcpy(lastdata, rxdata, op->cfsiz); /* mark as used and throttled by default */ lastdata->flags |= (RX_RECV|RX_THR); + /* add own/local/remote traffic flags */ + lastdata->flags |= traffic_flags; + /* throttling mode inactive ? */ if (!op->kt_ival2) { /* send RX_CHANGED to the user immediately */ @@ -508,7 +538,8 @@ rx_changed_settime: * received data stored in op->last_frames[] */ static void bcm_rx_cmp_to_index(struct bcm_op *op, unsigned int index, - const struct canfd_frame *rxdata) + const struct canfd_frame *rxdata, + unsigned char traffic_flags) { struct canfd_frame *cf = op->frames + op->cfsiz * index; struct canfd_frame *lcf = op->last_frames + op->cfsiz * index; @@ -521,7 +552,7 @@ static void bcm_rx_cmp_to_index(struct bcm_op *op, unsigned int index, if (!(lcf->flags & RX_RECV)) { /* received data for the first time => send update to user */ - bcm_rx_update_and_send(op, lcf, rxdata); + bcm_rx_update_and_send(op, lcf, rxdata, traffic_flags); return; } @@ -529,7 +560,7 @@ static void bcm_rx_cmp_to_index(struct bcm_op *op, unsigned int index, for (i = 0; i < rxdata->len; i += 8) { if ((get_u64(cf, i) & get_u64(rxdata, i)) != (get_u64(cf, i) & get_u64(lcf, i))) { - bcm_rx_update_and_send(op, lcf, rxdata); + 
bcm_rx_update_and_send(op, lcf, rxdata, traffic_flags); return; } } @@ -537,7 +568,7 @@ static void bcm_rx_cmp_to_index(struct bcm_op *op, unsigned int index, if (op->flags & RX_CHECK_DLC) { /* do a real check in CAN frame length */ if (rxdata->len != lcf->len) { - bcm_rx_update_and_send(op, lcf, rxdata); + bcm_rx_update_and_send(op, lcf, rxdata, traffic_flags); return; } } @@ -644,6 +675,7 @@ static void bcm_rx_handler(struct sk_buff *skb, void *data) struct bcm_op *op = (struct bcm_op *)data; const struct canfd_frame *rxframe = (struct canfd_frame *)skb->data; unsigned int i; + unsigned char traffic_flags; if (op->can_id != rxframe->can_id) return; @@ -673,15 +705,24 @@ static void bcm_rx_handler(struct sk_buff *skb, void *data) return; } + /* compute flags to distinguish between own/local/remote CAN traffic */ + traffic_flags = 0; + if (skb->sk) { + traffic_flags |= RX_LOCAL; + if (skb->sk == op->sk) + traffic_flags |= RX_OWN; + } + if (op->flags & RX_FILTER_ID) { /* the easiest case */ - bcm_rx_update_and_send(op, op->last_frames, rxframe); + bcm_rx_update_and_send(op, op->last_frames, rxframe, + traffic_flags); goto rx_starttimer; } if (op->nframes == 1) { /* simple compare with index 0 */ - bcm_rx_cmp_to_index(op, 0, rxframe); + bcm_rx_cmp_to_index(op, 0, rxframe, traffic_flags); goto rx_starttimer; } @@ -698,7 +739,8 @@ static void bcm_rx_handler(struct sk_buff *skb, void *data) if ((get_u64(op->frames, 0) & get_u64(rxframe, 0)) == (get_u64(op->frames, 0) & get_u64(op->frames + op->cfsiz * i, 0))) { - bcm_rx_cmp_to_index(op, i, rxframe); + bcm_rx_cmp_to_index(op, i, rxframe, + traffic_flags); break; } } @@ -1675,6 +1717,9 @@ static int bcm_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, memcpy(msg->msg_name, skb->cb, msg->msg_namelen); } + /* assign the flags that have been recorded in bcm_send_to_user() */ + msg->msg_flags |= *(bcm_flags(skb)); + skb_free_datagram(sk, skb); return size; -- cgit 1.2.3-korg From 
dc8543b597c282643a433e9a8af0459ed3046908 Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Thu, 8 Feb 2024 14:14:49 -0800 Subject: bpf, docs: Update ISA document title * Use "Instruction Set Architecture (ISA)" instead of "Instruction Set Specification" * Remove version number As previously discussed on the mailing list at https://mailarchive.ietf.org/arch/msg/bpf/SEpn3OL9TabNRn-4rDX9A6XVbjM/ Signed-off-by: Dave Thaler Signed-off-by: Daniel Borkmann Acked-by: David Vernet Link: https://lore.kernel.org/bpf/20240208221449.12274-1-dthaler1968@gmail.com --- Documentation/bpf/standardization/instruction-set.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst index bdfe0cd0e49952..868d9f6177e36c 100644 --- a/Documentation/bpf/standardization/instruction-set.rst +++ b/Documentation/bpf/standardization/instruction-set.rst @@ -1,11 +1,11 @@ .. contents:: .. sectnum:: -======================================= -BPF Instruction Set Specification, v1.0 -======================================= +====================================== +BPF Instruction Set Architecture (ISA) +====================================== -This document specifies version 1.0 of the BPF instruction set. +This document specifies the BPF instruction set architecture (ISA). Documentation conventions ========================= -- cgit 1.2.3-korg From 4a78f0173be2673cbdadf91023085982888474a6 Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Mon, 12 Feb 2024 19:29:11 +0100 Subject: dt-bindings: net: qca,ar9331: convert to DT schema Convert the Qualcomm Atheros AR9331 built-in switch bindings to DT schema. 
Signed-off-by: Krzysztof Kozlowski Reviewed-by: Conor Dooley Reviewed-by: Oleksij Rempel Link: https://lore.kernel.org/r/20240212182911.233819-1-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski --- .../devicetree/bindings/net/dsa/ar9331.txt | 147 ------------------- .../devicetree/bindings/net/dsa/qca,ar9331.yaml | 161 +++++++++++++++++++++ 2 files changed, 161 insertions(+), 147 deletions(-) delete mode 100644 Documentation/devicetree/bindings/net/dsa/ar9331.txt create mode 100644 Documentation/devicetree/bindings/net/dsa/qca,ar9331.yaml (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/dsa/ar9331.txt b/Documentation/devicetree/bindings/net/dsa/ar9331.txt deleted file mode 100644 index f824fdae0da20a..00000000000000 --- a/Documentation/devicetree/bindings/net/dsa/ar9331.txt +++ /dev/null @@ -1,147 +0,0 @@ -Atheros AR9331 built-in switch -============================= - -It is a switch built-in to Atheros AR9331 WiSoC and addressable over internal -MDIO bus. All PHYs are built-in as well. - -Required properties: - - - compatible: should be: "qca,ar9331-switch" - - reg: Address on the MII bus for the switch. - - resets : Must contain an entry for each entry in reset-names. - - reset-names : Must include the following entries: "switch" - - interrupt-parent: Phandle to the parent interrupt controller - - interrupts: IRQ line for the switch - - interrupt-controller: Indicates the switch is itself an interrupt - controller. This is used for the PHY interrupts. - - #interrupt-cells: must be 1 - - mdio: Container of PHY and devices on the switches MDIO bus. - -See Documentation/devicetree/bindings/net/dsa/dsa.txt for a list of additional -required and optional properties. 
-Examples: - -eth0: ethernet@19000000 { - compatible = "qca,ar9330-eth"; - reg = <0x19000000 0x200>; - interrupts = <4>; - - resets = <&rst 9>, <&rst 22>; - reset-names = "mac", "mdio"; - clocks = <&pll ATH79_CLK_AHB>, <&pll ATH79_CLK_AHB>; - clock-names = "eth", "mdio"; - - phy-mode = "mii"; - phy-handle = <&phy_port4>; -}; - -eth1: ethernet@1a000000 { - compatible = "qca,ar9330-eth"; - reg = <0x1a000000 0x200>; - interrupts = <5>; - resets = <&rst 13>, <&rst 23>; - reset-names = "mac", "mdio"; - clocks = <&pll ATH79_CLK_AHB>, <&pll ATH79_CLK_AHB>; - clock-names = "eth", "mdio"; - - phy-mode = "gmii"; - - fixed-link { - speed = <1000>; - full-duplex; - }; - - mdio { - #address-cells = <1>; - #size-cells = <0>; - - switch10: switch@10 { - #address-cells = <1>; - #size-cells = <0>; - - compatible = "qca,ar9331-switch"; - reg = <0x10>; - resets = <&rst 8>; - reset-names = "switch"; - - interrupt-parent = <&miscintc>; - interrupts = <12>; - - interrupt-controller; - #interrupt-cells = <1>; - - ports { - #address-cells = <1>; - #size-cells = <0>; - - switch_port0: port@0 { - reg = <0x0>; - ethernet = <&eth1>; - - phy-mode = "gmii"; - - fixed-link { - speed = <1000>; - full-duplex; - }; - }; - - switch_port1: port@1 { - reg = <0x1>; - phy-handle = <&phy_port0>; - phy-mode = "internal"; - }; - - switch_port2: port@2 { - reg = <0x2>; - phy-handle = <&phy_port1>; - phy-mode = "internal"; - }; - - switch_port3: port@3 { - reg = <0x3>; - phy-handle = <&phy_port2>; - phy-mode = "internal"; - }; - - switch_port4: port@4 { - reg = <0x4>; - phy-handle = <&phy_port3>; - phy-mode = "internal"; - }; - }; - - mdio { - #address-cells = <1>; - #size-cells = <0>; - - interrupt-parent = <&switch10>; - - phy_port0: phy@0 { - reg = <0x0>; - interrupts = <0>; - }; - - phy_port1: phy@1 { - reg = <0x1>; - interrupts = <0>; - }; - - phy_port2: phy@2 { - reg = <0x2>; - interrupts = <0>; - }; - - phy_port3: phy@3 { - reg = <0x3>; - interrupts = <0>; - }; - - phy_port4: phy@4 { - reg = <0x4>; -
interrupts = <0>; - }; - }; - }; - }; -}; diff --git a/Documentation/devicetree/bindings/net/dsa/qca,ar9331.yaml b/Documentation/devicetree/bindings/net/dsa/qca,ar9331.yaml new file mode 100644 index 00000000000000..fd9ddc59d38c31 --- /dev/null +++ b/Documentation/devicetree/bindings/net/dsa/qca,ar9331.yaml @@ -0,0 +1,161 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/dsa/qca,ar9331.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Qualcomm Atheros AR9331 built-in switch + +maintainers: + - Oleksij Rempel + +description: + Qualcomm Atheros AR9331 is a switch built-in to Atheros AR9331 WiSoC and + addressable over internal MDIO bus. All PHYs are built-in as well. + +properties: + compatible: + const: qca,ar9331-switch + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + interrupt-controller: true + + '#interrupt-cells': + const: 1 + + mdio: + $ref: /schemas/net/mdio.yaml# + unevaluatedProperties: false + properties: + interrupt-parent: true + + patternProperties: + '(ethernet-)?phy@[0-4]+$': + type: object + unevaluatedProperties: false + + properties: + reg: true + interrupts: + maxItems: 1 + + resets: + maxItems: 1 + + reset-names: + items: + - const: switch + +required: + - compatible + - reg + - interrupts + - interrupt-controller + - '#interrupt-cells' + - mdio + - ports + - resets + - reset-names + +allOf: + - $ref: dsa.yaml#/$defs/ethernet-ports + +unevaluatedProperties: false + +examples: + - | + mdio { + #address-cells = <1>; + #size-cells = <0>; + + switch10: switch@10 { + compatible = "qca,ar9331-switch"; + reg = <0x10>; + + interrupt-parent = <&miscintc>; + interrupts = <12>; + interrupt-controller; + #interrupt-cells = <1>; + + resets = <&rst 8>; + reset-names = "switch"; + + ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0x0>; + ethernet = <&eth1>; + + phy-mode = "gmii"; + + fixed-link { + speed = <1000>; + full-duplex; + }; + }; + +
port@1 { + reg = <0x1>; + phy-handle = <&phy_port0>; + phy-mode = "internal"; + }; + + port@2 { + reg = <0x2>; + phy-handle = <&phy_port1>; + phy-mode = "internal"; + }; + + port@3 { + reg = <0x3>; + phy-handle = <&phy_port2>; + phy-mode = "internal"; + }; + + port@4 { + reg = <0x4>; + phy-handle = <&phy_port3>; + phy-mode = "internal"; + }; + }; + + mdio { + #address-cells = <1>; + #size-cells = <0>; + + interrupt-parent = <&switch10>; + + phy_port0: ethernet-phy@0 { + reg = <0x0>; + interrupts = <0>; + }; + + phy_port1: ethernet-phy@1 { + reg = <0x1>; + interrupts = <0>; + }; + + phy_port2: ethernet-phy@2 { + reg = <0x2>; + interrupts = <0>; + }; + + phy_port3: ethernet-phy@3 { + reg = <0x3>; + interrupts = <0>; + }; + + phy_port4: ethernet-phy@4 { + reg = <0x4>; + interrupts = <0>; + }; + }; + }; + }; -- cgit 1.2.3-korg From 18e2bf0edf4dd88d9656ec92395aa47392e85b61 Mon Sep 17 00:00:00 2001 From: Joe Damato Date: Tue, 13 Feb 2024 06:16:45 +0000 Subject: eventpoll: Add epoll ioctl for epoll_params Add an ioctl for getting and setting epoll_params. User programs can use this ioctl to get and set the busy poll usec time, packet budget, and prefer busy poll params for a specific epoll context. Parameters are limited: - busy_poll_usecs is limited to <= s32_max - busy_poll_budget is limited to <= NAPI_POLL_WEIGHT by unprivileged users (!capable(CAP_NET_ADMIN)) - prefer_busy_poll must be 0 or 1 - __pad must be 0 Signed-off-by: Joe Damato Acked-by: Stanislav Fomichev Reviewed-by: Jiri Slaby Reviewed-by: Eric Dumazet Signed-off-by: David S. 
Miller --- Documentation/userspace-api/ioctl/ioctl-number.rst | 1 + fs/eventpoll.c | 73 ++++++++++++++++++++++ include/uapi/linux/eventpoll.h | 13 ++++ 3 files changed, 87 insertions(+) (limited to 'Documentation') diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst index 457e16f06e04de..b33918232f7869 100644 --- a/Documentation/userspace-api/ioctl/ioctl-number.rst +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst @@ -309,6 +309,7 @@ Code Seq# Include File Comments 0x89 0B-DF linux/sockios.h 0x89 E0-EF linux/sockios.h SIOCPROTOPRIVATE range 0x89 F0-FF linux/sockios.h SIOCDEVPRIVATE range +0x8A 00-1F linux/eventpoll.h 0x8B all linux/wireless.h 0x8C 00-3F WiNRADiO driver diff --git a/fs/eventpoll.c b/fs/eventpoll.c index 1b8d01af0c2cf1..df2ed3af486ef9 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -37,6 +37,7 @@ #include #include #include +#include #include /* @@ -494,6 +495,49 @@ static inline void ep_set_busy_poll_napi_id(struct epitem *epi) ep->napi_id = napi_id; } +static long ep_eventpoll_bp_ioctl(struct file *file, unsigned int cmd, + unsigned long arg) +{ + struct eventpoll *ep = file->private_data; + void __user *uarg = (void __user *)arg; + struct epoll_params epoll_params; + + switch (cmd) { + case EPIOCSPARAMS: + if (copy_from_user(&epoll_params, uarg, sizeof(epoll_params))) + return -EFAULT; + + /* pad byte must be zero */ + if (epoll_params.__pad) + return -EINVAL; + + if (epoll_params.busy_poll_usecs > S32_MAX) + return -EINVAL; + + if (epoll_params.prefer_busy_poll > 1) + return -EINVAL; + + if (epoll_params.busy_poll_budget > NAPI_POLL_WEIGHT && + !capable(CAP_NET_ADMIN)) + return -EPERM; + + WRITE_ONCE(ep->busy_poll_usecs, epoll_params.busy_poll_usecs); + WRITE_ONCE(ep->busy_poll_budget, epoll_params.busy_poll_budget); + WRITE_ONCE(ep->prefer_busy_poll, epoll_params.prefer_busy_poll); + return 0; + case EPIOCGPARAMS: + memset(&epoll_params, 0, sizeof(epoll_params)); + 
epoll_params.busy_poll_usecs = READ_ONCE(ep->busy_poll_usecs); + epoll_params.busy_poll_budget = READ_ONCE(ep->busy_poll_budget); + epoll_params.prefer_busy_poll = READ_ONCE(ep->prefer_busy_poll); + if (copy_to_user(uarg, &epoll_params, sizeof(epoll_params))) + return -EFAULT; + return 0; + default: + return -ENOIOCTLCMD; + } +} + #else static inline bool ep_busy_loop(struct eventpoll *ep, int nonblock) @@ -505,6 +549,12 @@ static inline void ep_set_busy_poll_napi_id(struct epitem *epi) { } +static long ep_eventpoll_bp_ioctl(struct file *file, unsigned int cmd, + unsigned long arg) +{ + return -EOPNOTSUPP; +} + #endif /* CONFIG_NET_RX_BUSY_POLL */ /* @@ -864,6 +914,27 @@ static void ep_clear_and_put(struct eventpoll *ep) ep_free(ep); } +static long ep_eventpoll_ioctl(struct file *file, unsigned int cmd, + unsigned long arg) +{ + int ret; + + if (!is_file_epoll(file)) + return -EINVAL; + + switch (cmd) { + case EPIOCSPARAMS: + case EPIOCGPARAMS: + ret = ep_eventpoll_bp_ioctl(file, cmd, arg); + break; + default: + ret = -EINVAL; + break; + } + + return ret; +} + static int ep_eventpoll_release(struct inode *inode, struct file *file) { struct eventpoll *ep = file->private_data; @@ -970,6 +1041,8 @@ static const struct file_operations eventpoll_fops = { .release = ep_eventpoll_release, .poll = ep_eventpoll_poll, .llseek = noop_llseek, + .unlocked_ioctl = ep_eventpoll_ioctl, + .compat_ioctl = compat_ptr_ioctl, }; /* diff --git a/include/uapi/linux/eventpoll.h b/include/uapi/linux/eventpoll.h index cfbcc4cc49acb9..4f4b948ef38113 100644 --- a/include/uapi/linux/eventpoll.h +++ b/include/uapi/linux/eventpoll.h @@ -85,4 +85,17 @@ struct epoll_event { __u64 data; } EPOLL_PACKED; +struct epoll_params { + __u32 busy_poll_usecs; + __u16 busy_poll_budget; + __u8 prefer_busy_poll; + + /* pad the struct to a multiple of 64bits */ + __u8 __pad; +}; + +#define EPOLL_IOC_TYPE 0x8A +#define EPIOCSPARAMS _IOW(EPOLL_IOC_TYPE, 0x01, struct epoll_params) +#define EPIOCGPARAMS 
_IOR(EPOLL_IOC_TYPE, 0x02, struct epoll_params) + #endif /* _UAPI_LINUX_EVENTPOLL_H */ -- cgit 1.2.3-korg From 328771deab16fcac55763309bb59e28b1c050853 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Tue, 13 Feb 2024 06:32:41 +0000 Subject: net: remove stale mentions of dev_base_lock in comments Change comments incorrectly mentioning dev_base_lock. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller --- Documentation/networking/netdevices.rst | 4 ++-- drivers/net/ethernet/cisco/enic/enic_main.c | 2 +- drivers/net/ethernet/nvidia/forcedeth.c | 4 ++-- drivers/net/ethernet/sfc/efx_common.c | 2 +- drivers/net/ethernet/sfc/falcon/efx.c | 2 +- drivers/net/ethernet/sfc/siena/efx_common.c | 2 +- 6 files changed, 8 insertions(+), 8 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst index 9e4cccb90b8700..c2476917a6c37d 100644 --- a/Documentation/networking/netdevices.rst +++ b/Documentation/networking/netdevices.rst @@ -252,8 +252,8 @@ ndo_eth_ioctl: Context: process ndo_get_stats: - Synchronization: rtnl_lock() semaphore, dev_base_lock rwlock, or RCU. - Context: atomic (can't sleep under rwlock or RCU) + Synchronization: rtnl_lock() semaphore, or RCU. + Context: atomic (can't sleep under RCU) ndo_start_xmit: Synchronization: __netif_tx_lock spinlock. 
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c index 37bd38d772e809..d266a87297a5e3 100644 --- a/drivers/net/ethernet/cisco/enic/enic_main.c +++ b/drivers/net/ethernet/cisco/enic/enic_main.c @@ -872,7 +872,7 @@ error: return NETDEV_TX_OK; } -/* dev_base_lock rwlock held, nominally process context */ +/* rcu_read_lock potentially held, nominally process context */ static void enic_get_stats(struct net_device *netdev, struct rtnl_link_stats64 *net_stats) { diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c index 7a549b834e970a..31f896c4aa2660 100644 --- a/drivers/net/ethernet/nvidia/forcedeth.c +++ b/drivers/net/ethernet/nvidia/forcedeth.c @@ -1761,7 +1761,7 @@ static void nv_get_stats(int cpu, struct fe_priv *np, /* * nv_get_stats64: dev->ndo_get_stats64 function * Get latest stats value from the nic. - * Called with read_lock(&dev_base_lock) held for read - + * Called with rcu_read_lock() held - * only synchronized against unregister_netdevice. */ static void @@ -3090,7 +3090,7 @@ static void set_bufsize(struct net_device *dev) /* * nv_change_mtu: dev->change_mtu function - * Called with dev_base_lock held for read. + * Called with RTNL held for read. */ static int nv_change_mtu(struct net_device *dev, int new_mtu) { diff --git a/drivers/net/ethernet/sfc/efx_common.c b/drivers/net/ethernet/sfc/efx_common.c index 175bd9cdfdac3a..551f890db90a60 100644 --- a/drivers/net/ethernet/sfc/efx_common.c +++ b/drivers/net/ethernet/sfc/efx_common.c @@ -595,7 +595,7 @@ void efx_stop_all(struct efx_nic *efx) efx_stop_datapath(efx); } -/* Context: process, dev_base_lock or RTNL held, non-blocking. */ +/* Context: process, rcu_read_lock or RTNL held, non-blocking. 
*/ void efx_net_stats(struct net_device *net_dev, struct rtnl_link_stats64 *stats) { struct efx_nic *efx = efx_netdev_priv(net_dev); diff --git a/drivers/net/ethernet/sfc/falcon/efx.c b/drivers/net/ethernet/sfc/falcon/efx.c index e001f27085c661..1cb32aedd89c73 100644 --- a/drivers/net/ethernet/sfc/falcon/efx.c +++ b/drivers/net/ethernet/sfc/falcon/efx.c @@ -2085,7 +2085,7 @@ int ef4_net_stop(struct net_device *net_dev) return 0; } -/* Context: process, dev_base_lock or RTNL held, non-blocking. */ +/* Context: process, rcu_read_lock or RTNL held, non-blocking. */ static void ef4_net_stats(struct net_device *net_dev, struct rtnl_link_stats64 *stats) { diff --git a/drivers/net/ethernet/sfc/siena/efx_common.c b/drivers/net/ethernet/sfc/siena/efx_common.c index e4b294b8e9acb1..88e5bc347a44ce 100644 --- a/drivers/net/ethernet/sfc/siena/efx_common.c +++ b/drivers/net/ethernet/sfc/siena/efx_common.c @@ -605,7 +605,7 @@ static size_t efx_siena_update_stats_atomic(struct efx_nic *efx, u64 *full_stats return efx->type->update_stats(efx, full_stats, core_stats); } -/* Context: process, dev_base_lock or RTNL held, non-blocking. */ +/* Context: process, rcu_read_lock or RTNL held, non-blocking. */ void efx_siena_net_stats(struct net_device *net_dev, struct rtnl_link_stats64 *stats) { -- cgit 1.2.3-korg From ed1d7dac08c532a23dd1da62451b40dbe1305dbd Mon Sep 17 00:00:00 2001 From: Catalin Popescu Date: Tue, 13 Feb 2024 09:07:04 +0100 Subject: dt-bindings: net: dp83826: support TX data voltage tuning Add properties ti,cfg-dac-minus-one-bp/ti,cfg-dac-plus-one-bp to support voltage tuning of logical levels -1/+1 of the MLT-3 encoded TX data. Reviewed-by: Krzysztof Kozlowski Signed-off-by: Catalin Popescu Signed-off-by: David S. 
Miller --- Documentation/devicetree/bindings/net/ti,dp83822.yaml | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/ti,dp83822.yaml b/Documentation/devicetree/bindings/net/ti,dp83822.yaml index db74474207ed47..8f4350be689c57 100644 --- a/Documentation/devicetree/bindings/net/ti,dp83822.yaml +++ b/Documentation/devicetree/bindings/net/ti,dp83822.yaml @@ -62,6 +62,24 @@ properties: for the PHY. The internal delay for the PHY is fixed to 3.5ns relative to transmit data. + ti,cfg-dac-minus-one-bp: + description: | + DP83826 PHY only. + Sets the voltage ratio (with respect to the nominal value) + of the logical level -1 for the MLT-3 encoded TX data. + enum: [5000, 5625, 6250, 6875, 7500, 8125, 8750, 9375, 10000, + 10625, 11250, 11875, 12500, 13125, 13750, 14375, 15000] + default: 10000 + + ti,cfg-dac-plus-one-bp: + description: | + DP83826 PHY only. + Sets the voltage ratio (with respect to the nominal value) + of the logical level +1 for the MLT-3 encoded TX data. + enum: [5000, 5625, 6250, 6875, 7500, 8125, 8750, 9375, 10000, + 10625, 11250, 11875, 12500, 13125, 13750, 14375, 15000] + default: 10000 + required: - reg -- cgit 1.2.3-korg From b00cf4f62969eb7067bc44851de9519953a7ae01 Mon Sep 17 00:00:00 2001 From: Martin Hundebøll Date: Mon, 13 Nov 2023 14:14:52 +0100 Subject: dt-bindings: can: tcan4x5x: Document the wakeup-source flag MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Let it be known that the tcan4x5x device can now be configured to wake the host from suspend when a can frame is received. 
Signed-off-by: Martin Hundebøll Acked-by: Conor Dooley [mkl: make first the first patch] Signed-off-by: Marc Kleine-Budde --- Documentation/devicetree/bindings/net/can/tcan4x5x.txt | 3 +++ 1 file changed, 3 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/can/tcan4x5x.txt b/Documentation/devicetree/bindings/net/can/tcan4x5x.txt index 170e23f0610db9..20c0572c985342 100644 --- a/Documentation/devicetree/bindings/net/can/tcan4x5x.txt +++ b/Documentation/devicetree/bindings/net/can/tcan4x5x.txt @@ -28,6 +28,8 @@ Optional properties: available with tcan4552/4553. - device-wake-gpios: Wake up GPIO to wake up the TCAN device. Not available with tcan4552/4553. + - wakeup-source: Leave the chip running when suspended, and configure + the RX interrupt to wake up the device. Example: tcan4x5x: tcan4x5x@0 { @@ -42,4 +44,5 @@ tcan4x5x: tcan4x5x@0 { device-state-gpios = <&gpio3 21 GPIO_ACTIVE_HIGH>; device-wake-gpios = <&gpio1 15 GPIO_ACTIVE_HIGH>; reset-gpios = <&gpio1 27 GPIO_ACTIVE_HIGH>; + wakeup-source; }; -- cgit 1.2.3-korg From 2aa8f155b09519814e449dc19adacf01fd1367ee Mon Sep 17 00:00:00 2001 From: Alex Henrie Date: Tue, 13 Feb 2024 23:26:30 -0700 Subject: net: ipv6/addrconf: ensure that regen_advance is at least 2 seconds RFC 8981 defines REGEN_ADVANCE as follows: REGEN_ADVANCE = 2 + (TEMP_IDGEN_RETRIES * DupAddrDetectTransmits * RetransTimer / 1000) Thus, allowing it to be less than 2 seconds is technically a protocol violation. 
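For illustration only (not part of this change): the formula above can be checked with a minimal C sketch. The helper name regen_advance_secs and its millisecond argument are assumptions made for this note, not kernel API. The example inputs are the usual protocol defaults (TEMP_IDGEN_RETRIES = 3, DupAddrDetectTransmits = 1, RetransTimer = 1000 ms), which give the "typically 5 seconds" figure quoted in ip-sysctl.rst.

```c
#include <assert.h>

/* Constant term of REGEN_ADVANCE in RFC 8981, in seconds. */
#define REGEN_ADVANCE_BASE_SECS 2UL

/*
 * Hypothetical helper, not kernel code: evaluates
 *   REGEN_ADVANCE = 2 + (TEMP_IDGEN_RETRIES * DupAddrDetectTransmits *
 *                        RetransTimer / 1000)
 * with RetransTimer given in milliseconds. The product is formed first
 * and the integer division truncates, so sub-second DAD time is dropped.
 */
unsigned long regen_advance_secs(unsigned long temp_idgen_retries,
				 unsigned long dad_transmits,
				 unsigned long retrans_timer_ms)
{
	return REGEN_ADVANCE_BASE_SECS +
	       temp_idgen_retries * dad_transmits * retrans_timer_ms / 1000;
}

/* Sanity checks using the assumed default inputs described above. */
void regen_advance_selfcheck(void)
{
	/* 2 + 3 * 1 * 1000 / 1000 = 5 seconds */
	assert(regen_advance_secs(3, 1, 1000) == 5);
	/* With DAD disabled only the constant 2 seconds remains. */
	assert(regen_advance_secs(3, 0, 1000) == 2);
}
```

Whatever the inputs, the result never drops below 2 seconds, which is the lower bound this patch restores.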
Link: https://datatracker.ietf.org/doc/html/rfc8981#name-defined-protocol-parameters Signed-off-by: Alex Henrie Reviewed-by: David Ahern Signed-off-by: Paolo Abeni --- Documentation/networking/ip-sysctl.rst | 4 ++-- net/ipv6/addrconf.c | 15 +++++++++------ 2 files changed, 11 insertions(+), 8 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 7afff42612e949..45830593134536 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -2503,7 +2503,7 @@ use_tempaddr - INTEGER temp_valid_lft - INTEGER valid lifetime (in seconds) for temporary addresses. If less than the - minimum required lifetime (typically 5 seconds), temporary addresses + minimum required lifetime (typically 5-7 seconds), temporary addresses will not be created. Default: 172800 (2 days) @@ -2511,7 +2511,7 @@ temp_valid_lft - INTEGER temp_prefered_lft - INTEGER Preferred lifetime (in seconds) for temporary addresses. If temp_prefered_lft is less than the minimum required lifetime (typically - 5 seconds), temporary addresses will not be created. If + 5-7 seconds), temporary addresses will not be created. If temp_prefered_lft is greater than temp_valid_lft, the preferred lifetime is temp_valid_lft. 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 2c1ed642e3f456..65e886d7d80cc9 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1339,6 +1339,13 @@ out: in6_ifa_put(ifp); } +static unsigned long ipv6_get_regen_advance(struct inet6_dev *idev) +{ + return 2 + idev->cnf.regen_max_retry * + idev->cnf.dad_transmits * + max(NEIGH_VAR(idev->nd_parms, RETRANS_TIME), HZ/100) / HZ; +} + static int ipv6_create_tempaddr(struct inet6_ifaddr *ifp, bool block) { struct inet6_dev *idev = ifp->idev; @@ -1380,9 +1387,7 @@ retry: age = (now - ifp->tstamp) / HZ; - regen_advance = idev->cnf.regen_max_retry * - idev->cnf.dad_transmits * - max(NEIGH_VAR(idev->nd_parms, RETRANS_TIME), HZ/100) / HZ; + regen_advance = ipv6_get_regen_advance(idev); /* recalculate max_desync_factor each time and update * idev->desync_factor if it's larger @@ -4595,9 +4600,7 @@ restart: !ifp->regen_count && ifp->ifpub) { /* This is a non-regenerated temporary addr. */ - unsigned long regen_advance = ifp->idev->cnf.regen_max_retry * - ifp->idev->cnf.dad_transmits * - max(NEIGH_VAR(ifp->idev->nd_parms, RETRANS_TIME), HZ/100) / HZ; + unsigned long regen_advance = ipv6_get_regen_advance(ifp->idev); if (age + regen_advance >= ifp->prefered_lft) { struct inet6_ifaddr *ifpub = ifp->ifpub; -- cgit 1.2.3-korg From a5fcea2d2f790aa90b6e996d411ae2cf8db55186 Mon Sep 17 00:00:00 2001 From: Alex Henrie Date: Tue, 13 Feb 2024 23:26:31 -0700 Subject: net: ipv6/addrconf: introduce a regen_min_advance sysctl In RFC 8981, REGEN_ADVANCE cannot be less than 2 seconds, and the RFC does not permit the creation of temporary addresses with lifetimes shorter than that: > When processing a Router Advertisement with a > Prefix Information option carrying a prefix for the purposes of > address autoconfiguration (i.e., the A bit is set), the host MUST > perform the following steps: > 5. A temporary address is created only if this calculated preferred > lifetime is greater than REGEN_ADVANCE time units. 
However, some users want to change their IPv6 address as frequently as possible regardless of the RFC's arbitrary minimum lifetime. For the benefit of those users, add a regen_min_advance sysctl parameter that can be set to below or above 2 seconds. Link: https://datatracker.ietf.org/doc/html/rfc8981 Signed-off-by: Alex Henrie Reviewed-by: David Ahern Signed-off-by: Paolo Abeni --- Documentation/networking/ip-sysctl.rst | 10 ++++++++++ include/linux/ipv6.h | 1 + include/net/addrconf.h | 5 +++-- net/ipv6/addrconf.c | 11 ++++++++++- 4 files changed, 24 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 45830593134536..407d917d1a367e 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -2535,6 +2535,16 @@ max_desync_factor - INTEGER Default: 600 +regen_min_advance - INTEGER + How far in advance (in seconds), at minimum, to create a new temporary + address before the current one is deprecated. This value is added to + the amount of time that may be required for duplicate address detection + to determine when to create a new address. Linux permits setting this + value to less than the default of 2 seconds, but a value less than 2 + does not conform to RFC 8981. + + Default: 2 + regen_max_retry - INTEGER Number of attempts before give up attempting to generate valid temporary addresses. 
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h index 5e605e384aac81..ef3aa060a289ea 100644 --- a/include/linux/ipv6.h +++ b/include/linux/ipv6.h @@ -27,6 +27,7 @@ struct ipv6_devconf { __s32 use_tempaddr; __s32 temp_valid_lft; __s32 temp_prefered_lft; + __s32 regen_min_advance; __s32 regen_max_retry; __s32 max_desync_factor; __s32 max_addresses; diff --git a/include/net/addrconf.h b/include/net/addrconf.h index 61ebe723ee4d50..30d6f1e84e465e 100644 --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -8,8 +8,9 @@ #define MIN_VALID_LIFETIME (2*3600) /* 2 hours */ -#define TEMP_VALID_LIFETIME (7*86400) -#define TEMP_PREFERRED_LIFETIME (86400) +#define TEMP_VALID_LIFETIME (7*86400) /* 1 week */ +#define TEMP_PREFERRED_LIFETIME (86400) /* 24 hours */ +#define REGEN_MIN_ADVANCE (2) /* 2 seconds */ #define REGEN_MAX_RETRY (3) #define MAX_DESYNC_FACTOR (600) diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 65e886d7d80cc9..283823fba96a84 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -195,6 +195,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = { .use_tempaddr = 0, .temp_valid_lft = TEMP_VALID_LIFETIME, .temp_prefered_lft = TEMP_PREFERRED_LIFETIME, + .regen_min_advance = REGEN_MIN_ADVANCE, .regen_max_retry = REGEN_MAX_RETRY, .max_desync_factor = MAX_DESYNC_FACTOR, .max_addresses = IPV6_MAX_ADDRESSES, @@ -257,6 +258,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = { .use_tempaddr = 0, .temp_valid_lft = TEMP_VALID_LIFETIME, .temp_prefered_lft = TEMP_PREFERRED_LIFETIME, + .regen_min_advance = REGEN_MIN_ADVANCE, .regen_max_retry = REGEN_MAX_RETRY, .max_desync_factor = MAX_DESYNC_FACTOR, .max_addresses = IPV6_MAX_ADDRESSES, @@ -1341,7 +1343,7 @@ out: static unsigned long ipv6_get_regen_advance(struct inet6_dev *idev) { - return 2 + idev->cnf.regen_max_retry * + return idev->cnf.regen_min_advance + idev->cnf.regen_max_retry * idev->cnf.dad_transmits * max(NEIGH_VAR(idev->nd_parms, RETRANS_TIME), HZ/100) / 
HZ; } @@ -6819,6 +6821,13 @@ static const struct ctl_table addrconf_sysctl[] = { .mode = 0644, .proc_handler = proc_dointvec, }, + { + .procname = "regen_min_advance", + .data = &ipv6_devconf.regen_min_advance, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec, + }, { .procname = "regen_max_retry", .data = &ipv6_devconf.regen_max_retry, -- cgit 1.2.3-korg From f4bcbf360ac8dc424dc4d2b384b528e69b6f34d9 Mon Sep 17 00:00:00 2001 From: Alex Henrie Date: Tue, 13 Feb 2024 23:26:32 -0700 Subject: net: ipv6/addrconf: clamp preferred_lft to the minimum required If the preferred lifetime was less than the minimum required lifetime, ipv6_create_tempaddr would error out without creating any new address. On my machine and network, this error happened immediately with the preferred lifetime set to 5 seconds or less, after a few minutes with the preferred lifetime set to 6 seconds, and not at all with the preferred lifetime set to 7 seconds. During my investigation, I found a Stack Exchange post from another person who seems to have had the same problem: They stopped getting new addresses if they lowered the preferred lifetime below 3 seconds, and they didn't really know why. The preferred lifetime is a preference, not a hard requirement. The kernel does not strictly forbid new connections on a deprecated address, nor does it guarantee that the address will be disposed of the instant its total valid lifetime expires. So rather than disable IPv6 privacy extensions altogether if the minimum required lifetime swells above the preferred lifetime, it is more in keeping with the user's intent to increase the temporary address's lifetime to the minimum necessary for the current network conditions. With these fixes, setting the preferred lifetime to 5 or 6 seconds "just works" because the extra fraction of a second is practically unnoticeable. 
It's even possible to reduce the time before deprecation to 1 or 2 seconds by setting /proc/sys/net/ipv6/conf/*/regen_min_advance and /proc/sys/net/ipv6/conf/*/dad_transmits to 0. I realize that that is a pretty niche use case, but I know at least one person who would gladly sacrifice performance and convenience to be sure that they are getting the maximum possible level of privacy. Link: https://serverfault.com/a/1031168/310447 Signed-off-by: Alex Henrie Reviewed-by: David Ahern Signed-off-by: Paolo Abeni --- Documentation/networking/ip-sysctl.rst | 2 +- net/ipv6/addrconf.c | 43 +++++++++++++++++++++++++++------- 2 files changed, 35 insertions(+), 10 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 407d917d1a367e..bd50df6a5a42e6 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -2511,7 +2511,7 @@ temp_valid_lft - INTEGER temp_prefered_lft - INTEGER Preferred lifetime (in seconds) for temporary addresses. If temp_prefered_lft is less than the minimum required lifetime (typically - 5-7 seconds), temporary addresses will not be created. If + 5-7 seconds), the preferred lifetime is the minimum required. If temp_prefered_lft is greater than temp_valid_lft, the preferred lifetime is temp_valid_lft. 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 283823fba96a84..d3f4b7b9cf1fe3 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1354,6 +1354,7 @@ static int ipv6_create_tempaddr(struct inet6_ifaddr *ifp, bool block) unsigned long tmp_tstamp, age; unsigned long regen_advance; unsigned long now = jiffies; + u32 if_public_preferred_lft; s32 cnf_temp_preferred_lft; struct inet6_ifaddr *ift; struct ifa6_config cfg; @@ -1409,11 +1410,13 @@ retry: } } + if_public_preferred_lft = ifp->prefered_lft; + memset(&cfg, 0, sizeof(cfg)); cfg.valid_lft = min_t(__u32, ifp->valid_lft, idev->cnf.temp_valid_lft + age); cfg.preferred_lft = cnf_temp_preferred_lft + age - idev->desync_factor; - cfg.preferred_lft = min_t(__u32, ifp->prefered_lft, cfg.preferred_lft); + cfg.preferred_lft = min_t(__u32, if_public_preferred_lft, cfg.preferred_lft); cfg.preferred_lft = min_t(__u32, cfg.valid_lft, cfg.preferred_lft); cfg.plen = ifp->prefix_len; @@ -1422,19 +1425,41 @@ retry: write_unlock_bh(&idev->lock); - /* A temporary address is created only if this calculated Preferred - * Lifetime is greater than REGEN_ADVANCE time units. In particular, - * an implementation must not create a temporary address with a zero - * Preferred Lifetime. + /* From RFC 4941: + * + * A temporary address is created only if this calculated Preferred + * Lifetime is greater than REGEN_ADVANCE time units. In + * particular, an implementation must not create a temporary address + * with a zero Preferred Lifetime. + * + * ... + * + * When creating a temporary address, the lifetime values MUST be + * derived from the corresponding prefix as follows: + * + * ... + * + * * Its Preferred Lifetime is the lower of the Preferred Lifetime + * of the public address or TEMP_PREFERRED_LIFETIME - + * DESYNC_FACTOR. + * + * To comply with the RFC's requirements, clamp the preferred lifetime + * to a minimum of regen_advance, unless that would exceed valid_lft or + * ifp->prefered_lft. 
+ * * Use age calculation as in addrconf_verify to avoid unnecessary * temporary addresses being generated. */ age = (now - tmp_tstamp + ADDRCONF_TIMER_FUZZ_MINUS) / HZ; if (cfg.preferred_lft <= regen_advance + age) { - in6_ifa_put(ifp); - in6_dev_put(idev); - ret = -1; - goto out; + cfg.preferred_lft = regen_advance + age + 1; + if (cfg.preferred_lft > cfg.valid_lft || + cfg.preferred_lft > if_public_preferred_lft) { + in6_ifa_put(ifp); + in6_dev_put(idev); + ret = -1; + goto out; + } } cfg.ifa_flags = IFA_F_TEMPORARY; -- cgit 1.2.3-korg From 7075d733b8e431c011d30c219012d40ea0c92e1d Mon Sep 17 00:00:00 2001 From: Srinivas Goud Date: Tue, 13 Feb 2024 11:36:43 +0100 Subject: dt-bindings: can: xilinx_can: Add 'xlnx,has-ecc' optional property ECC feature added to CAN TX_OL, TX_TL and RX FIFOs of Xilinx AXI CAN Controller. ECC is an IP configuration option where counter registers are added in IP for 1bit/2bit ECC errors. 'xlnx,has-ecc' is an optional property and added to Xilinx AXI CAN Controller node if ECC block enabled in the HW Acked-by: Conor Dooley Signed-off-by: Srinivas Goud Link: https://lore.kernel.org/all/20240213-xilinx_ecc-v8-1-8d75f8b80771@pengutronix.de Signed-off-by: Marc Kleine-Budde --- Documentation/devicetree/bindings/net/can/xilinx,can.yaml | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/can/xilinx,can.yaml b/Documentation/devicetree/bindings/net/can/xilinx,can.yaml index 64d57c343e6f0a..8d4e5af6fd6c84 100644 --- a/Documentation/devicetree/bindings/net/can/xilinx,can.yaml +++ b/Documentation/devicetree/bindings/net/can/xilinx,can.yaml @@ -49,6 +49,10 @@ properties: resets: maxItems: 1 + xlnx,has-ecc: + $ref: /schemas/types.yaml#/definitions/flag + description: CAN TX_OL, TX_TL and RX FIFOs have ECC support(AXI CAN) + required: - compatible - reg @@ -137,6 +141,7 @@ examples: interrupts = ; tx-fifo-depth = <0x40>; rx-fifo-depth = <0x40>; + xlnx,has-ecc; }; - | -- cgit 
1.2.3-korg From 5983e5df86303564f0968e6e4108ca08e00828ee Mon Sep 17 00:00:00 2001 From: Frank Li Date: Thu, 1 Feb 2024 15:22:42 -0500 Subject: dt-bindings: net: fec: add iommus property iMX8QM has an IOMMU. Add property 'iommus'. Signed-off-by: Frank Li Acked-by: Krzysztof Kozlowski Link: https://lore.kernel.org/r/20240201-8qm_smmu-v2-2-3d12a80201a3@nxp.com Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/fsl,fec.yaml | 3 +++ 1 file changed, 3 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/fsl,fec.yaml b/Documentation/devicetree/bindings/net/fsl,fec.yaml index 8948a11c994e48..5536c06139cae5 100644 --- a/Documentation/devicetree/bindings/net/fsl,fec.yaml +++ b/Documentation/devicetree/bindings/net/fsl,fec.yaml @@ -224,6 +224,9 @@ properties: Can be omitted thus no delay is observed. Delay is in range of 1ms to 1000ms. Other delays are invalid. + iommus: + maxItems: 1 + required: - compatible - reg -- cgit 1.2.3-korg From 6d5c36565c169376e3c5fc54d01d7c6819381465 Mon Sep 17 00:00:00 2001 From: Samuel Thibault Date: Sat, 17 Feb 2024 22:14:25 +0100 Subject: PPPoL2TP: Add more code snippets The existing documentation did not mention that one has to create a PPP channel and a PPP interface to get PPPoL2TP data offloading working. Also, tunnel switching was not mentioned, so people assumed it was not supported, although it actually is.
Signed-off-by: Samuel Thibault Acked-by: Tom Parkin Link: https://lore.kernel.org/r/20240217211425.qj576u3jmaa6yidf@begin Signed-off-by: Jakub Kicinski --- Documentation/networking/l2tp.rst | 135 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 129 insertions(+), 6 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/l2tp.rst b/Documentation/networking/l2tp.rst index 7f383e99dbada9..8496b467dea492 100644 --- a/Documentation/networking/l2tp.rst +++ b/Documentation/networking/l2tp.rst @@ -386,12 +386,19 @@ Sample userspace code: - Create session PPPoX data socket:: + /* Input: the L2TP tunnel UDP socket `tunnel_fd`, which needs to be + * bound already (both sockname and peername), otherwise it will not be + * ready. + */ + struct sockaddr_pppol2tp sax; - int fd; + int session_fd; + int ret; + + session_fd = socket(AF_PPPOX, SOCK_DGRAM, PX_PROTO_OL2TP); + if (session_fd < 0) + return -errno; - /* Note, the tunnel socket must be bound already, else it - * will not be ready - */ sax.sa_family = AF_PPPOX; sax.sa_protocol = PX_PROTO_OL2TP; sax.pppol2tp.fd = tunnel_fd; @@ -406,12 +413,128 @@ Sample userspace code: /* session_fd is the fd of the session's PPPoL2TP socket. * tunnel_fd is the fd of the tunnel UDP / L2TPIP socket. */ - fd = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax)); - if (fd < 0 ) { + ret = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax)); + if (ret < 0 ) { + close(session_fd); + return -errno; + } + + return session_fd; + +L2TP control packets will still be available for read on `tunnel_fd`. + + - Create PPP channel:: + + /* Input: the session PPPoX data socket `session_fd` which was created + * as described above. 
+ */ + + int ppp_chan_fd; + int chindx; + int ret; + + ret = ioctl(session_fd, PPPIOCGCHAN, &chindx); + if (ret < 0) + return -errno; + + ppp_chan_fd = open("/dev/ppp", O_RDWR); + if (ppp_chan_fd < 0) + return -errno; + + ret = ioctl(ppp_chan_fd, PPPIOCATTCHAN, &chindx); + if (ret < 0) { + close(ppp_chan_fd); return -errno; } + + return ppp_chan_fd; + +LCP PPP frames will be available for read on `ppp_chan_fd`. + + - Create PPP interface:: + + /* Input: the PPP channel `ppp_chan_fd` which was created as described + * above. + */ + + int ifunit = -1; + int ppp_if_fd; + int ret; + + ppp_if_fd = open("/dev/ppp", O_RDWR); + if (ppp_if_fd < 0) + return -errno; + + ret = ioctl(ppp_if_fd, PPPIOCNEWUNIT, &ifunit); + if (ret < 0) { + close(ppp_if_fd); + return -errno; + } + + ret = ioctl(ppp_chan_fd, PPPIOCCONNECT, &ifunit); + if (ret < 0) { + close(ppp_if_fd); + return -errno; + } + + return ppp_if_fd; + +IPCP/IPv6CP PPP frames will be available for read on `ppp_if_fd`. + +The ppp interface can then be configured as usual with netlink's +RTM_NEWLINK, RTM_NEWADDR, RTM_NEWROUTE, or ioctl's SIOCSIFMTU, SIOCSIFADDR, +SIOCSIFDSTADDR, SIOCSIFNETMASK, SIOCSIFFLAGS, or with the `ip` command. + + - Bridging L2TP sessions which have PPP pseudowire types (this is also called + L2TP tunnel switching or L2TP multihop) is supported by bridging the PPP + channels of the two L2TP sessions to be bridged:: + + /* Input: the session PPPoX data sockets `session_fd1` and `session_fd2` + * which were created as described further above. 
+ */ + + int ppp_chan_fd; + int chindx1; + int chindx2; + int ret; + + ret = ioctl(session_fd1, PPPIOCGCHAN, &chindx1); + if (ret < 0) + return -errno; + + ret = ioctl(session_fd2, PPPIOCGCHAN, &chindx2); + if (ret < 0) + return -errno; + + ppp_chan_fd = open("/dev/ppp", O_RDWR); + if (ppp_chan_fd < 0) + return -errno; + + ret = ioctl(ppp_chan_fd, PPPIOCATTCHAN, &chindx1); + if (ret < 0) { + close(ppp_chan_fd); + return -errno; + } + + ret = ioctl(ppp_chan_fd, PPPIOCBRIDGECHAN, &chindx2); + close(ppp_chan_fd); + if (ret < 0) + return -errno; + return 0; +It can be noted that when bridging PPP channels, the PPP session is not locally +terminated, and no local PPP interface is created. PPP frames arriving on one +channel are directly passed to the other channel, and vice versa. + +The PPP channel does not need to be kept open. Only the session PPPoX data +sockets need to be kept open. + +More generally, it is also possible in the same way to e.g. bridge a PPPoL2TP +PPP channel with other types of PPP channels, such as PPPoE. + +See more details for the PPP side in ppp_generic.rst. + Old L2TPv2-only API ------------------- -- cgit 1.2.3-korg From 5302615954e3fb5d9d06281578ae975372304248 Mon Sep 17 00:00:00 2001 From: Peter Chiu Date: Thu, 21 Dec 2023 11:26:48 +0800 Subject: dt-bindings: net: wireless: mt76: add interrupts description for MT7986 The mt7986 can support four interrupts to distribute the interrupts to different CPUs. 
Signed-off-by: Peter Chiu Reviewed-by: Krzysztof Kozlowski Signed-off-by: Felix Fietkau --- .../bindings/net/wireless/mediatek,mt76.yaml | 32 ++++++++++++++++++---- 1 file changed, 27 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml index 252207adbc54c5..0c6835db397f4d 100644 --- a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml +++ b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml @@ -19,9 +19,6 @@ description: | Alternatively, it can specify the wireless part of the MT7628/MT7688 or MT7622/MT7986 SoC. -allOf: - - $ref: ieee80211.yaml# - properties: compatible: enum: @@ -38,7 +35,12 @@ properties: MT7986 should contain 3 regions consys, dcm, and sku, in this order. interrupts: - maxItems: 1 + minItems: 1 + items: + - description: major interrupt for rings + - description: additional interrupt for ring 19 + - description: additional interrupt for ring 4 + - description: additional interrupt for ring 5 power-domains: maxItems: 1 @@ -217,6 +219,23 @@ required: - compatible - reg +allOf: + - $ref: ieee80211.yaml# + - if: + properties: + compatible: + contains: + enum: + - mediatek,mt7986-wmac + then: + properties: + interrupts: + minItems: 4 + else: + properties: + interrupts: + maxItems: 1 + unevaluatedProperties: false examples: @@ -293,7 +312,10 @@ examples: reg = <0x18000000 0x1000000>, <0x10003000 0x1000>, <0x11d10000 0x1000>; - interrupts = ; + interrupts = , + , + , + ; clocks = <&topckgen 50>, <&topckgen 62>; clock-names = "mcu", "ap2conn"; -- cgit 1.2.3-korg From 8fa556045696fffd78fe5c3386c6e77d5a368098 Mon Sep 17 00:00:00 2001 From: Rafał Miłecki Date: Wed, 21 Feb 2024 09:12:38 +0100 Subject: dt-bindings: net: wireless: mt76: allow all 4 interrupts for MT7981 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit MT7981 (Filogic
820) is a low cost version of MT7986 (Filogic 830) with a similar wireless controller that also supports four interrupts. Cc: Peter Chiu Cc: Felix Fietkau Signed-off-by: Rafał Miłecki Acked-by: Conor Dooley Signed-off-by: Felix Fietkau --- Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml index 0c6835db397f4d..eabceb849537c4 100644 --- a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml +++ b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml @@ -226,6 +226,7 @@ allOf: compatible: contains: enum: + - mediatek,mt7981-wmac - mediatek,mt7986-wmac then: properties: -- cgit 1.2.3-korg From c1bb68f6b2f6be5297c5fbad5caebf67d0dd3034 Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Wed, 21 Feb 2024 09:35:35 -0800 Subject: bpf, docs: Fix typos in instruction-set.rst * "BPF ADD" should be "BPF_ADD". * "src" should be "src_reg" in several places. The latter is the field name in the instruction. The former refers to the value of the register, or the immediate. * Add '' around field names in one sentence, for consistency with the rest of the document. Signed-off-by: Dave Thaler Acked-by: David Vernet Link: https://lore.kernel.org/r/20240221173535.16601-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov --- .../bpf/standardization/instruction-set.rst | 74 +++++++++++----------- 1 file changed, 37 insertions(+), 37 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst index 868d9f6177e36c..45cffe94c75205 100644 --- a/Documentation/bpf/standardization/instruction-set.rst +++ b/Documentation/bpf/standardization/instruction-set.rst @@ -178,8 +178,8 @@ Unused fields shall be cleared to zero.
As discussed below in `64-bit immediate instructions`_, a 64-bit immediate instruction uses two 32-bit immediate values that are constructed as follows. The 64 bits following the basic instruction contain a pseudo instruction -using the same format but with opcode, dst_reg, src_reg, and offset all set to zero, -and imm containing the high 32 bits of the immediate value. +using the same format but with 'opcode', 'dst_reg', 'src_reg', and 'offset' all +set to zero, and imm containing the high 32 bits of the immediate value. This is depicted in the following figure:: @@ -392,27 +392,27 @@ otherwise identical operations, and indicates the base64 conformance group unless otherwise specified. The 'code' field encodes the operation as below: -======== ===== === =============================== ============================================= -code value src description notes -======== ===== === =============================== ============================================= -BPF_JA 0x0 0x0 PC += offset BPF_JMP | BPF_K only -BPF_JA 0x0 0x0 PC += imm BPF_JMP32 | BPF_K only -BPF_JEQ 0x1 any PC += offset if dst == src -BPF_JGT 0x2 any PC += offset if dst > src unsigned -BPF_JGE 0x3 any PC += offset if dst >= src unsigned -BPF_JSET 0x4 any PC += offset if dst & src -BPF_JNE 0x5 any PC += offset if dst != src -BPF_JSGT 0x6 any PC += offset if dst > src signed -BPF_JSGE 0x7 any PC += offset if dst >= src signed -BPF_CALL 0x8 0x0 call helper function by address BPF_JMP | BPF_K only, see `Helper functions`_ -BPF_CALL 0x8 0x1 call PC += imm BPF_JMP | BPF_K only, see `Program-local functions`_ -BPF_CALL 0x8 0x2 call helper function by BTF ID BPF_JMP | BPF_K only, see `Helper functions`_ -BPF_EXIT 0x9 0x0 return BPF_JMP | BPF_K only -BPF_JLT 0xa any PC += offset if dst < src unsigned -BPF_JLE 0xb any PC += offset if dst <= src unsigned -BPF_JSLT 0xc any PC += offset if dst < src signed -BPF_JSLE 0xd any PC += offset if dst <= src signed -======== ===== === =============================== 
============================================= +======== ===== ======= =============================== ============================================= +code value src_reg description notes +======== ===== ======= =============================== ============================================= +BPF_JA 0x0 0x0 PC += offset BPF_JMP | BPF_K only +BPF_JA 0x0 0x0 PC += imm BPF_JMP32 | BPF_K only +BPF_JEQ 0x1 any PC += offset if dst == src +BPF_JGT 0x2 any PC += offset if dst > src unsigned +BPF_JGE 0x3 any PC += offset if dst >= src unsigned +BPF_JSET 0x4 any PC += offset if dst & src +BPF_JNE 0x5 any PC += offset if dst != src +BPF_JSGT 0x6 any PC += offset if dst > src signed +BPF_JSGE 0x7 any PC += offset if dst >= src signed +BPF_CALL 0x8 0x0 call helper function by address BPF_JMP | BPF_K only, see `Helper functions`_ +BPF_CALL 0x8 0x1 call PC += imm BPF_JMP | BPF_K only, see `Program-local functions`_ +BPF_CALL 0x8 0x2 call helper function by BTF ID BPF_JMP | BPF_K only, see `Helper functions`_ +BPF_EXIT 0x9 0x0 return BPF_JMP | BPF_K only +BPF_JLT 0xa any PC += offset if dst < src unsigned +BPF_JLE 0xb any PC += offset if dst <= src unsigned +BPF_JSLT 0xc any PC += offset if dst < src signed +BPF_JSLE 0xd any PC += offset if dst <= src signed +======== ===== ======= =============================== ============================================= The BPF program needs to store the return value into register R0 before doing a ``BPF_EXIT``. @@ -568,7 +568,7 @@ BPF_XOR 0xa0 atomic xor *(u32 *)(dst + offset) += src -``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF ADD means:: +``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF_ADD means:: *(u64 *)(dst + offset) += src @@ -601,24 +601,24 @@ and loaded back to ``R0``. 
----------------------------- Instructions with the ``BPF_IMM`` 'mode' modifier use the wide instruction -encoding defined in `Instruction encoding`_, and use the 'src' field of the +encoding defined in `Instruction encoding`_, and use the 'src_reg' field of the basic instruction to hold an opcode subtype. The following table defines a set of ``BPF_IMM | BPF_DW | BPF_LD`` instructions -with opcode subtypes in the 'src' field, using new terms such as "map" +with opcode subtypes in the 'src_reg' field, using new terms such as "map" defined further below: -========================= ====== === ========================================= =========== ============== -opcode construction opcode src pseudocode imm type dst type -========================= ====== === ========================================= =========== ============== -BPF_IMM | BPF_DW | BPF_LD 0x18 0x0 dst = (next_imm << 32) | imm integer integer -BPF_IMM | BPF_DW | BPF_LD 0x18 0x1 dst = map_by_fd(imm) map fd map -BPF_IMM | BPF_DW | BPF_LD 0x18 0x2 dst = map_val(map_by_fd(imm)) + next_imm map fd data pointer -BPF_IMM | BPF_DW | BPF_LD 0x18 0x3 dst = var_addr(imm) variable id data pointer -BPF_IMM | BPF_DW | BPF_LD 0x18 0x4 dst = code_addr(imm) integer code pointer -BPF_IMM | BPF_DW | BPF_LD 0x18 0x5 dst = map_by_idx(imm) map index map -BPF_IMM | BPF_DW | BPF_LD 0x18 0x6 dst = map_val(map_by_idx(imm)) + next_imm map index data pointer -========================= ====== === ========================================= =========== ============== +========================= ====== ======= ========================================= =========== ============== +opcode construction opcode src_reg pseudocode imm type dst type +========================= ====== ======= ========================================= =========== ============== +BPF_IMM | BPF_DW | BPF_LD 0x18 0x0 dst = (next_imm << 32) | imm integer integer +BPF_IMM | BPF_DW | BPF_LD 0x18 0x1 dst = map_by_fd(imm) map fd map +BPF_IMM | BPF_DW | BPF_LD 0x18 0x2 dst = 
map_val(map_by_fd(imm)) + next_imm map fd data pointer +BPF_IMM | BPF_DW | BPF_LD 0x18 0x3 dst = var_addr(imm) variable id data pointer +BPF_IMM | BPF_DW | BPF_LD 0x18 0x4 dst = code_addr(imm) integer code pointer +BPF_IMM | BPF_DW | BPF_LD 0x18 0x5 dst = map_by_idx(imm) map index map +BPF_IMM | BPF_DW | BPF_LD 0x18 0x6 dst = map_val(map_by_idx(imm)) + next_imm map index data pointer +========================= ====== ======= ========================================= =========== ============== where -- cgit 1.2.3-korg From 89ee838130f470afcd02b30ca868f236a3f3b1d2 Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Wed, 21 Feb 2024 09:54:19 -0800 Subject: bpf, docs: specify which BPF_ABS and BPF_IND fields were zero Specifying which fields were unused allows IANA to only list as deprecated instructions that were actually used, leaving the rest as unassigned and possibly available for future use for something else. Signed-off-by: Dave Thaler Acked-by: David Vernet Link: https://lore.kernel.org/r/20240221175419.16843-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/standardization/instruction-set.rst | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst index 45cffe94c75205..f3269d6dd5e024 100644 --- a/Documentation/bpf/standardization/instruction-set.rst +++ b/Documentation/bpf/standardization/instruction-set.rst @@ -658,6 +658,7 @@ Legacy BPF Packet access instructions BPF previously introduced special instructions for access to packet data that were carried over from classic BPF. These instructions used an instruction class of BPF_LD, a size modifier of BPF_W, BPF_H, or BPF_B, and a -mode modifier of BPF_ABS or BPF_IND. However, these instructions are -deprecated and should no longer be used. All legacy packet access -instructions belong to the "legacy" conformance group. 
+mode modifier of BPF_ABS or BPF_IND. The 'dst_reg' and 'offset' fields were +set to zero, and 'src_reg' was set to zero for BPF_ABS. However, these +instructions are deprecated and should no longer be used. All legacy packet +access instructions belong to the "legacy" conformance group. -- cgit 1.2.3-korg From 1098eb62433cd4e1a7d256c042528336e4e7bd45 Mon Sep 17 00:00:00 2001 From: Jeff Johnson Date: Fri, 23 Feb 2024 17:20:34 +0200 Subject: dt-bindings: net: wireless: qcom: Update maintainers Add Jeff Johnson as a maintainer of the qcom,ath1*k.yaml files. Signed-off-by: Jeff Johnson Acked-by: Krzysztof Kozlowski Signed-off-by: Kalle Valo Link: https://msgid.link/20240217-ath1xk-maintainer-v1-1-9f7ff5fb6bf4@quicinc.com --- Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml | 1 + Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml | 1 + Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml | 1 + 3 files changed, 3 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml index 7758a55dd32866..9b3ef4bc373258 100644 --- a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml +++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml @@ -8,6 +8,7 @@ title: Qualcomm Technologies ath10k wireless devices maintainers: - Kalle Valo + - Jeff Johnson description: Qualcomm Technologies, Inc. IEEE 802.11ac devices. 
diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml index 817f02a8b481d8..41d023797d7d3b 100644 --- a/Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml +++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml @@ -9,6 +9,7 @@ title: Qualcomm Technologies ath11k wireless devices (PCIe) maintainers: - Kalle Valo + - Jeff Johnson description: | Qualcomm Technologies IEEE 802.11ax PCIe devices diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml index 7d5f982a3d09db..672282cdfc2fc4 100644 --- a/Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml +++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml @@ -9,6 +9,7 @@ title: Qualcomm Technologies ath11k wireless devices maintainers: - Kalle Valo + - Jeff Johnson description: | These are dt entries for Qualcomm Technologies, Inc. IEEE 802.11ax -- cgit 1.2.3-korg From 95f4fa1f459a69827d752bd55205af7c55b76e4e Mon Sep 17 00:00:00 2001 From: Jérémie Dautheribes Date: Thu, 22 Feb 2024 11:31:15 +0100 Subject: dt-bindings: net: dp83822: support configuring RMII master/slave mode MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add property ti,rmii-mode to support selecting the RMII operation mode between: - master mode (PHY operates from a 25MHz clock reference) - slave mode (PHY operates from a 50MHz clock reference) If not set, the operation mode is configured by hardware straps. Signed-off-by: Jérémie Dautheribes Acked-by: Krzysztof Kozlowski Signed-off-by: David S. 
Miller --- Documentation/devicetree/bindings/net/ti,dp83822.yaml | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/ti,dp83822.yaml b/Documentation/devicetree/bindings/net/ti,dp83822.yaml index 8f4350be689c57..8f23254c0458f7 100644 --- a/Documentation/devicetree/bindings/net/ti,dp83822.yaml +++ b/Documentation/devicetree/bindings/net/ti,dp83822.yaml @@ -80,6 +80,22 @@ properties: 10625, 11250, 11875, 12500, 13125, 13750, 14375, 15000] default: 10000 + ti,rmii-mode: + description: | + If present, select the RMII operation mode. Two modes are + available: + - RMII master, where the PHY operates from a 25MHz clock reference, + provided by a crystal or a CMOS-level oscillator + - RMII slave, where the PHY operates from a 50MHz clock reference, + provided by a CMOS-level oscillator + The RMII operation mode can also be configured by its straps. + If the strap pin is not set correctly or not set at all, then this can be + used to configure it. + $ref: /schemas/types.yaml#/definitions/string + enum: + - master + - slave + required: - reg -- cgit 1.2.3-korg From 5c237967e632c81db0504cffa26eaa19e7940650 Mon Sep 17 00:00:00 2001 From: Varshini Rajendran Date: Fri, 23 Feb 2024 22:52:28 +0530 Subject: dt-bindings: net: cdns,macb: add sam9x7 ethernet interface Add documentation for sam9x7 ethernet interface. 
Signed-off-by: Varshini Rajendran Acked-by: Rob Herring Link: https://lore.kernel.org/r/20240223172228.671553-1-varshini.rajendran@microchip.com Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/cdns,macb.yaml | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/cdns,macb.yaml b/Documentation/devicetree/bindings/net/cdns,macb.yaml index bf8894a0257e9d..2c71e2cf3a2fd2 100644 --- a/Documentation/devicetree/bindings/net/cdns,macb.yaml +++ b/Documentation/devicetree/bindings/net/cdns,macb.yaml @@ -59,6 +59,11 @@ properties: - cdns,gem # Generic - cdns,macb # Generic + - items: + - enum: + - microchip,sam9x7-gem # Microchip SAM9X7 gigabit ethernet interface + - const: microchip,sama7g5-gem # Microchip SAMA7G5 gigabit ethernet interface + reg: minItems: 1 items: -- cgit 1.2.3-korg From 28001bb1955fcfa63e535848c4289fcd7bb88daf Mon Sep 17 00:00:00 2001 From: Luiz Angelo Daros de Luca Date: Sun, 25 Feb 2024 13:29:53 -0300 Subject: dt-bindings: net: dsa: realtek: reset-gpios is not required MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The 'reset-gpios' should not be mandatory. although they might be required for some devices if the switch reset was left asserted by a previous driver, such as the bootloader. Signed-off-by: Luiz Angelo Daros de Luca Cc: devicetree@vger.kernel.org Acked-by: Arınç ÜNAL Acked-by: Rob Herring Reviewed-by: Linus Walleij Reviewed-by: Alvin Šipraga Signed-off-by: David S.
Miller --- Documentation/devicetree/bindings/net/dsa/realtek.yaml | 1 - 1 file changed, 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/dsa/realtek.yaml b/Documentation/devicetree/bindings/net/dsa/realtek.yaml index cce692f57b0800..46e113df77c8e2 100644 --- a/Documentation/devicetree/bindings/net/dsa/realtek.yaml +++ b/Documentation/devicetree/bindings/net/dsa/realtek.yaml @@ -127,7 +127,6 @@ else: - mdc-gpios - mdio-gpios - mdio - - reset-gpios required: - compatible -- cgit 1.2.3-korg From 5fc2d68fc81801162188995e4d3dc0b26747dd76 Mon Sep 17 00:00:00 2001 From: Luiz Angelo Daros de Luca Date: Sun, 25 Feb 2024 13:29:54 -0300 Subject: dt-bindings: net: dsa: realtek: add reset controller MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Realtek switches can use a reset controller instead of reset-gpios. Signed-off-by: Luiz Angelo Daros de Luca Cc: devicetree@vger.kernel.org Acked-by: Arınç ÜNAL Reviewed-by: Linus Walleij Reviewed-by: Alvin Šipraga Acked-by: Rob Herring Signed-off-by: David S. Miller --- Documentation/devicetree/bindings/net/dsa/realtek.yaml | 3 +++ 1 file changed, 3 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/dsa/realtek.yaml b/Documentation/devicetree/bindings/net/dsa/realtek.yaml index 46e113df77c8e2..70b6bda3cf98e5 100644 --- a/Documentation/devicetree/bindings/net/dsa/realtek.yaml +++ b/Documentation/devicetree/bindings/net/dsa/realtek.yaml @@ -59,6 +59,9 @@ properties: description: GPIO to be used to reset the whole device maxItems: 1 + resets: + maxItems: 1 + realtek,disable-leds: type: boolean description: | -- cgit 1.2.3-korg From 12a686c2e761f1f1f6e6e2117a9ab9c6de2ac8a7 Mon Sep 17 00:00:00 2001 From: Adam Li Date: Mon, 26 Feb 2024 02:24:52 +0000 Subject: net: make SK_MEMORY_PCPU_RESERV tunable This patch adds /proc/sys/net/core/mem_pcpu_rsv sysctl file, to make SK_MEMORY_PCPU_RESERV tunable.
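As a rough illustration of what such a threshold controls, the following user-space sketch models, for a single CPU, how a local reserve batches updates to a shared counter. It is not the kernel code: every model_* name is invented for this example, the 4 KiB page size is an assumption, and the real implementation relies on per-cpu variables, preemption control and an atomic global counter.

```c
#include <assert.h>

/* Assumption for this sketch: 4 KiB pages, i.e. a page shift of 12. */
#define MODEL_PAGE_SHIFT 12
/* 1 MB expressed in page units (256 pages with 4 KiB pages). */
#define MODEL_PCPU_RESERVE (1 << (20 - MODEL_PAGE_SHIFT))

/* Hypothetical stand-ins for the tunable threshold, the per-cpu
 * reserve and the shared global counter.
 */
static int model_mem_pcpu_rsv = MODEL_PCPU_RESERVE;
static int model_local_reserve;
static long model_global_allocated;
static unsigned int model_global_updates;

/* Set the threshold and reset the model; values below the 1 MB
 * default are rejected, mirroring a minimum enforced on the sysctl.
 */
static void model_set_rsv(int rsv)
{
	assert(rsv >= MODEL_PCPU_RESERVE);
	model_mem_pcpu_rsv = rsv;
	model_local_reserve = 0;
	model_global_allocated = 0;
	model_global_updates = 0;
}

/* Fold the local reserve into the shared counter. */
static void model_flush(void)
{
	model_global_allocated += model_local_reserve;
	model_local_reserve = 0;
	model_global_updates++;
}

/* Charge amt pages; touch the shared counter only at the threshold. */
static void model_allocated_add(int amt)
{
	model_local_reserve += amt;
	if (model_local_reserve >= model_mem_pcpu_rsv)
		model_flush();
}

/* Uncharge amt pages, with the symmetric negative threshold. */
static void model_allocated_sub(int amt)
{
	model_local_reserve -= amt;
	if (model_local_reserve <= -model_mem_pcpu_rsv)
		model_flush();
}
```

Under these assumptions, 1024 single-page charges reach the shared counter four times at the default threshold of 256 pages, and not at all once the threshold is raised to 4096 pages (16 MB).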
Commit 3cd3399dd7a8 ("net: implement per-cpu reserves for memory_allocated") introduced per-cpu forward alloc cache: "Implement a per-cpu cache of +1/-1 MB, to reduce number of changes to sk->sk_prot->memory_allocated, which would otherwise be cause of false sharing." sk_prot->memory_allocated points to global atomic variable: atomic_long_t tcp_memory_allocated ____cacheline_aligned_in_smp; If increasing the per-cpu cache size from 1MB to e.g. 16MB, changes to sk->sk_prot->memory_allocated can be further reduced. Performance may be improved on system with many cores. Signed-off-by: Adam Li Reviewed-by: Christoph Lameter (Ampere) Signed-off-by: David S. Miller --- Documentation/admin-guide/sysctl/net.rst | 5 +++++ include/net/sock.h | 5 +++-- net/core/sock.c | 1 + net/core/sysctl_net_core.c | 9 +++++++++ 4 files changed, 18 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst index 3960916519557f..7250c0542828b4 100644 --- a/Documentation/admin-guide/sysctl/net.rst +++ b/Documentation/admin-guide/sysctl/net.rst @@ -206,6 +206,11 @@ Will increase power usage. Default: 0 (off) +mem_pcpu_rsv +------------ + +Per-cpu reserved forward alloc cache size in page units. Default 1MB per CPU. 
+ rmem_default ------------ diff --git a/include/net/sock.h b/include/net/sock.h index 796a902cf4c193..09a0cde8bf5228 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1443,6 +1443,7 @@ sk_memory_allocated(const struct sock *sk) /* 1 MB per cpu, in page units */ #define SK_MEMORY_PCPU_RESERVE (1 << (20 - PAGE_SHIFT)) +extern int sysctl_mem_pcpu_rsv; static inline void sk_memory_allocated_add(struct sock *sk, int amt) @@ -1451,7 +1452,7 @@ sk_memory_allocated_add(struct sock *sk, int amt) preempt_disable(); local_reserve = __this_cpu_add_return(*sk->sk_prot->per_cpu_fw_alloc, amt); - if (local_reserve >= SK_MEMORY_PCPU_RESERVE) { + if (local_reserve >= READ_ONCE(sysctl_mem_pcpu_rsv)) { __this_cpu_sub(*sk->sk_prot->per_cpu_fw_alloc, local_reserve); atomic_long_add(local_reserve, sk->sk_prot->memory_allocated); } @@ -1465,7 +1466,7 @@ sk_memory_allocated_sub(struct sock *sk, int amt) preempt_disable(); local_reserve = __this_cpu_sub_return(*sk->sk_prot->per_cpu_fw_alloc, amt); - if (local_reserve <= -SK_MEMORY_PCPU_RESERVE) { + if (local_reserve <= -READ_ONCE(sysctl_mem_pcpu_rsv)) { __this_cpu_sub(*sk->sk_prot->per_cpu_fw_alloc, local_reserve); atomic_long_add(local_reserve, sk->sk_prot->memory_allocated); } diff --git a/net/core/sock.c b/net/core/sock.c index 8d86886e39fa69..df2ac54a8f74e7 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -283,6 +283,7 @@ __u32 sysctl_rmem_max __read_mostly = SK_RMEM_MAX; EXPORT_SYMBOL(sysctl_rmem_max); __u32 sysctl_wmem_default __read_mostly = SK_WMEM_MAX; __u32 sysctl_rmem_default __read_mostly = SK_RMEM_MAX; +int sysctl_mem_pcpu_rsv __read_mostly = SK_MEMORY_PCPU_RESERVE; int sysctl_tstamp_allow_data __read_mostly = 1; diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c index 0f0cb1465e0892..986f15e5d6c412 100644 --- a/net/core/sysctl_net_core.c +++ b/net/core/sysctl_net_core.c @@ -30,6 +30,7 @@ static int int_3600 = 3600; static int min_sndbuf = SOCK_MIN_SNDBUF; static int min_rcvbuf = 
SOCK_MIN_RCVBUF; static int max_skb_frags = MAX_SKB_FRAGS; +static int min_mem_pcpu_rsv = SK_MEMORY_PCPU_RESERVE; static int net_msg_warn; /* Unused, but still a sysctl */ @@ -407,6 +408,14 @@ static struct ctl_table net_core_table[] = { .proc_handler = proc_dointvec_minmax, .extra1 = &min_rcvbuf, }, + { + .procname = "mem_pcpu_rsv", + .data = &sysctl_mem_pcpu_rsv, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_minmax, + .extra1 = &min_mem_pcpu_rsv, + }, { .procname = "dev_weight", .data = &weight_p, -- cgit 1.2.3-korg From 3e46ec180ed91a2833f6cd637f919dcf2b53408c Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Mon, 26 Feb 2024 13:29:13 +0100 Subject: dt-bindings: net: ethernet-controller: drop redundant type from label dtschema defines label as string, so $ref in other bindings is redundant. Signed-off-by: Krzysztof Kozlowski Acked-by: Rob Herring Signed-off-by: David S. Miller --- Documentation/devicetree/bindings/net/ethernet-controller.yaml | 1 - 1 file changed, 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/ethernet-controller.yaml b/Documentation/devicetree/bindings/net/ethernet-controller.yaml index d14d123ad7a028..b2785b03139f9d 100644 --- a/Documentation/devicetree/bindings/net/ethernet-controller.yaml +++ b/Documentation/devicetree/bindings/net/ethernet-controller.yaml @@ -14,7 +14,6 @@ properties: pattern: "^ethernet(@.*)?$" label: - $ref: /schemas/types.yaml#/definitions/string description: Human readable label on a port of a box. local-mac-address: -- cgit 1.2.3-korg From 896880ff30866f386ebed14ab81ce1ad3710cfc4 Mon Sep 17 00:00:00 2001 From: Kees Cook Date: Thu, 22 Feb 2024 07:56:15 -0800 Subject: bpf: Replace bpf_lpm_trie_key 0-length array with flexible array Replace deprecated 0-length array in struct bpf_lpm_trie_key with flexible array. 
Found with GCC 13: ../kernel/bpf/lpm_trie.c:207:51: warning: array subscript i is outside array bounds of 'const __u8[0]' {aka 'const unsigned char[]'} [-Warray-bounds=] 207 | *(__be16 *)&key->data[i]); | ^~~~~~~~~~~~~ ../include/uapi/linux/swab.h:102:54: note: in definition of macro '__swab16' 102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x)) | ^ ../include/linux/byteorder/generic.h:97:21: note: in expansion of macro '__be16_to_cpu' 97 | #define be16_to_cpu __be16_to_cpu | ^~~~~~~~~~~~~ ../kernel/bpf/lpm_trie.c:206:28: note: in expansion of macro 'be16_to_cpu' 206 | u16 diff = be16_to_cpu(*(__be16 *)&node->data[i] ^ | ^~~~~~~~~~~ In file included from ../include/linux/bpf.h:7: ../include/uapi/linux/bpf.h:82:17: note: while referencing 'data' 82 | __u8 data[0]; /* Arbitrary size */ | ^~~~ And found at run-time under CONFIG_FORTIFY_SOURCE: UBSAN: array-index-out-of-bounds in kernel/bpf/lpm_trie.c:218:49 index 0 is out of range for type '__u8 [*]' Changing struct bpf_lpm_trie_key is difficult since it has been used by userspace. For example, in Cilium: struct egress_gw_policy_key { struct bpf_lpm_trie_key lpm_key; __u32 saddr; __u32 daddr; }; While direct references to the "data" member haven't been found, there are static initializers that include the final member. For example, the "{}" here: struct egress_gw_policy_key in_key = { .lpm_key = { 32 + 24, {} }, .saddr = CLIENT_IP, .daddr = EXTERNAL_SVC_IP & 0Xffffff, }; To avoid the build time and run time warnings seen with a 0-sized trailing array for struct bpf_lpm_trie_key, introduce a new struct that correctly uses a flexible array for the trailing bytes, struct bpf_lpm_trie_key_u8. As part of this, include the "header" portion (which is just the "prefixlen" member), so it can be used by anything building a bpf_lpm_trie_key that has trailing members that aren't a u8 flexible array (like the self-test[1]), which is named struct bpf_lpm_trie_key_hdr.
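The split between a byte-addressed key and a bare header can be sketched in user space as follows. This is only an illustration: the example_* types are local, hypothetical mirrors of the layout described in this message, defined here so the snippet builds without updated UAPI headers, and the IPv4-sized payload is an assumption chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of the header-only type: just the prefix length. */
struct example_lpm_key_hdr {
	uint32_t prefixlen;
};

/* Hypothetical mirror of the byte-addressed key. The flexible array
 * adds no size of its own, so the key data must be allocated after it.
 */
struct example_lpm_key_u8 {
	union {
		struct example_lpm_key_hdr hdr;
		uint32_t prefixlen;
	};
	uint8_t data[];
};

/* A key whose trailing data is not a u8 array embeds only the header,
 * which avoids nesting a flexible array inside another struct.
 */
struct example_ipv4_key {
	struct example_lpm_key_hdr hdr;
	uint32_t addr;
};

/* Allocate a byte-addressed key carrying data_len trailing bytes.
 * The caller owns the result and releases it with free().
 */
static struct example_lpm_key_u8 *
example_key_alloc(uint32_t prefixlen, const uint8_t *data, size_t data_len)
{
	struct example_lpm_key_u8 *key;

	assert(prefixlen <= data_len * 8);
	key = malloc(sizeof(*key) + data_len);
	if (!key)
		return NULL;
	key->prefixlen = prefixlen;
	memcpy(key->data, data, data_len);
	return key;
}
```

In this sketch the prefix length is reachable both as `key->prefixlen` and as `key->hdr.prefixlen`, since both names alias the same four bytes through the anonymous union.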
Unfortunately, C++ refuses to parse the __struct_group() helper, so it is not possible to define struct bpf_lpm_trie_key_hdr directly in struct bpf_lpm_trie_key_u8, so we must open-code the union directly. Adjust the kernel code to use struct bpf_lpm_trie_key_u8 through-out, and for the selftest to use struct bpf_lpm_trie_key_hdr. Add a comment to the UAPI header directing folks to the two new options. Reported-by: Mark Rutland Signed-off-by: Kees Cook Signed-off-by: Daniel Borkmann Acked-by: Gustavo A. R. Silva Closes: https://paste.debian.net/hidden/ca500597/ Link: https://lore.kernel.org/all/202206281009.4332AA33@keescook/ [1] Link: https://lore.kernel.org/bpf/20240222155612.it.533-kees@kernel.org --- Documentation/bpf/map_lpm_trie.rst | 2 +- include/uapi/linux/bpf.h | 19 ++++++++++++++++++- kernel/bpf/lpm_trie.c | 20 ++++++++++---------- samples/bpf/map_perf_test_user.c | 2 +- samples/bpf/xdp_router_ipv4_user.c | 2 +- tools/include/uapi/linux/bpf.h | 19 ++++++++++++++++++- tools/testing/selftests/bpf/progs/map_ptr_kern.c | 2 +- tools/testing/selftests/bpf/test_lpm_map.c | 18 +++++++++--------- 8 files changed, 59 insertions(+), 25 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/map_lpm_trie.rst b/Documentation/bpf/map_lpm_trie.rst index 74d64a30f50073..f9cd579496c9ce 100644 --- a/Documentation/bpf/map_lpm_trie.rst +++ b/Documentation/bpf/map_lpm_trie.rst @@ -17,7 +17,7 @@ significant byte. LPM tries may be created with a maximum prefix length that is a multiple of 8, in the range from 8 to 2048. The key used for lookup and update -operations is a ``struct bpf_lpm_trie_key``, extended by +operations is a ``struct bpf_lpm_trie_key_u8``, extended by ``max_prefixlen/8`` bytes. 
- For IPv4 addresses the data length is 4 bytes diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index d2e6c5fcec0198..a241f407c23414 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -77,12 +77,29 @@ struct bpf_insn { __s32 imm; /* signed immediate constant */ }; -/* Key of an a BPF_MAP_TYPE_LPM_TRIE entry */ +/* Deprecated: use struct bpf_lpm_trie_key_u8 (when the "data" member is needed for + * byte access) or struct bpf_lpm_trie_key_hdr (when using an alternative type for + * the trailing flexible array member) instead. + */ struct bpf_lpm_trie_key { __u32 prefixlen; /* up to 32 for AF_INET, 128 for AF_INET6 */ __u8 data[0]; /* Arbitrary size */ }; +/* Header for bpf_lpm_trie_key structs */ +struct bpf_lpm_trie_key_hdr { + __u32 prefixlen; +}; + +/* Key of an a BPF_MAP_TYPE_LPM_TRIE entry, with trailing byte array. */ +struct bpf_lpm_trie_key_u8 { + union { + struct bpf_lpm_trie_key_hdr hdr; + __u32 prefixlen; + }; + __u8 data[]; /* Arbitrary size */ +}; + struct bpf_cgroup_storage_key { __u64 cgroup_inode_id; /* cgroup inode id */ __u32 attach_type; /* program attach type (enum bpf_attach_type) */ diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index b32be680da6cdc..050fe1ebf0f7d1 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -164,13 +164,13 @@ static inline int extract_bit(const u8 *data, size_t index) */ static size_t longest_prefix_match(const struct lpm_trie *trie, const struct lpm_trie_node *node, - const struct bpf_lpm_trie_key *key) + const struct bpf_lpm_trie_key_u8 *key) { u32 limit = min(node->prefixlen, key->prefixlen); u32 prefixlen = 0, i = 0; BUILD_BUG_ON(offsetof(struct lpm_trie_node, data) % sizeof(u32)); - BUILD_BUG_ON(offsetof(struct bpf_lpm_trie_key, data) % sizeof(u32)); + BUILD_BUG_ON(offsetof(struct bpf_lpm_trie_key_u8, data) % sizeof(u32)); #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && defined(CONFIG_64BIT) @@ -229,7 +229,7 @@ static void 
*trie_lookup_elem(struct bpf_map *map, void *_key) { struct lpm_trie *trie = container_of(map, struct lpm_trie, map); struct lpm_trie_node *node, *found = NULL; - struct bpf_lpm_trie_key *key = _key; + struct bpf_lpm_trie_key_u8 *key = _key; if (key->prefixlen > trie->max_prefixlen) return NULL; @@ -309,7 +309,7 @@ static long trie_update_elem(struct bpf_map *map, struct lpm_trie *trie = container_of(map, struct lpm_trie, map); struct lpm_trie_node *node, *im_node = NULL, *new_node = NULL; struct lpm_trie_node __rcu **slot; - struct bpf_lpm_trie_key *key = _key; + struct bpf_lpm_trie_key_u8 *key = _key; unsigned long irq_flags; unsigned int next_bit; size_t matchlen = 0; @@ -437,7 +437,7 @@ out: static long trie_delete_elem(struct bpf_map *map, void *_key) { struct lpm_trie *trie = container_of(map, struct lpm_trie, map); - struct bpf_lpm_trie_key *key = _key; + struct bpf_lpm_trie_key_u8 *key = _key; struct lpm_trie_node __rcu **trim, **trim2; struct lpm_trie_node *node, *parent; unsigned long irq_flags; @@ -536,7 +536,7 @@ out: sizeof(struct lpm_trie_node)) #define LPM_VAL_SIZE_MIN 1 -#define LPM_KEY_SIZE(X) (sizeof(struct bpf_lpm_trie_key) + (X)) +#define LPM_KEY_SIZE(X) (sizeof(struct bpf_lpm_trie_key_u8) + (X)) #define LPM_KEY_SIZE_MAX LPM_KEY_SIZE(LPM_DATA_SIZE_MAX) #define LPM_KEY_SIZE_MIN LPM_KEY_SIZE(LPM_DATA_SIZE_MIN) @@ -565,7 +565,7 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr) /* copy mandatory map attributes */ bpf_map_init_from_attr(&trie->map, attr); trie->data_size = attr->key_size - - offsetof(struct bpf_lpm_trie_key, data); + offsetof(struct bpf_lpm_trie_key_u8, data); trie->max_prefixlen = trie->data_size * 8; spin_lock_init(&trie->lock); @@ -616,7 +616,7 @@ static int trie_get_next_key(struct bpf_map *map, void *_key, void *_next_key) { struct lpm_trie_node *node, *next_node = NULL, *parent, *search_root; struct lpm_trie *trie = container_of(map, struct lpm_trie, map); - struct bpf_lpm_trie_key *key = _key, *next_key = _next_key; + 
struct bpf_lpm_trie_key_u8 *key = _key, *next_key = _next_key; struct lpm_trie_node **node_stack = NULL; int err = 0, stack_ptr = -1; unsigned int next_bit; @@ -703,7 +703,7 @@ find_leftmost: } do_copy: next_key->prefixlen = next_node->prefixlen; - memcpy((void *)next_key + offsetof(struct bpf_lpm_trie_key, data), + memcpy((void *)next_key + offsetof(struct bpf_lpm_trie_key_u8, data), next_node->data, trie->data_size); free_stack: kfree(node_stack); @@ -715,7 +715,7 @@ static int trie_check_btf(const struct bpf_map *map, const struct btf_type *key_type, const struct btf_type *value_type) { - /* Keys must have struct bpf_lpm_trie_key embedded. */ + /* Keys must have struct bpf_lpm_trie_key_u8 embedded. */ return BTF_INFO_KIND(key_type->info) != BTF_KIND_STRUCT ? -EINVAL : 0; } diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c index d2fbcf963cdf6d..07ff471ed6aee0 100644 --- a/samples/bpf/map_perf_test_user.c +++ b/samples/bpf/map_perf_test_user.c @@ -370,7 +370,7 @@ static void run_perf_test(int tasks) static void fill_lpm_trie(void) { - struct bpf_lpm_trie_key *key; + struct bpf_lpm_trie_key_u8 *key; unsigned long value = 0; unsigned int i; int r; diff --git a/samples/bpf/xdp_router_ipv4_user.c b/samples/bpf/xdp_router_ipv4_user.c index 9d41db09c4800f..266fdd0b025dc6 100644 --- a/samples/bpf/xdp_router_ipv4_user.c +++ b/samples/bpf/xdp_router_ipv4_user.c @@ -91,7 +91,7 @@ static int recv_msg(struct sockaddr_nl sock_addr, int sock) static void read_route(struct nlmsghdr *nh, int nll) { char dsts[24], gws[24], ifs[16], dsts_len[24], metrics[24]; - struct bpf_lpm_trie_key *prefix_key; + struct bpf_lpm_trie_key_u8 *prefix_key; struct rtattr *rt_attr; struct rtmsg *rt_msg; int rtm_family; diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index d2e6c5fcec0198..a241f407c23414 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -77,12 +77,29 @@ struct bpf_insn { __s32 imm; /* signed 
immediate constant */ }; -/* Key of an a BPF_MAP_TYPE_LPM_TRIE entry */ +/* Deprecated: use struct bpf_lpm_trie_key_u8 (when the "data" member is needed for + * byte access) or struct bpf_lpm_trie_key_hdr (when using an alternative type for + * the trailing flexible array member) instead. + */ struct bpf_lpm_trie_key { __u32 prefixlen; /* up to 32 for AF_INET, 128 for AF_INET6 */ __u8 data[0]; /* Arbitrary size */ }; +/* Header for bpf_lpm_trie_key structs */ +struct bpf_lpm_trie_key_hdr { + __u32 prefixlen; +}; + +/* Key of an a BPF_MAP_TYPE_LPM_TRIE entry, with trailing byte array. */ +struct bpf_lpm_trie_key_u8 { + union { + struct bpf_lpm_trie_key_hdr hdr; + __u32 prefixlen; + }; + __u8 data[]; /* Arbitrary size */ +}; + struct bpf_cgroup_storage_key { __u64 cgroup_inode_id; /* cgroup inode id */ __u32 attach_type; /* program attach type (enum bpf_attach_type) */ diff --git a/tools/testing/selftests/bpf/progs/map_ptr_kern.c b/tools/testing/selftests/bpf/progs/map_ptr_kern.c index 3325da17ec81af..efaf622c28ddec 100644 --- a/tools/testing/selftests/bpf/progs/map_ptr_kern.c +++ b/tools/testing/selftests/bpf/progs/map_ptr_kern.c @@ -316,7 +316,7 @@ struct lpm_trie { } __attribute__((preserve_access_index)); struct lpm_key { - struct bpf_lpm_trie_key trie_key; + struct bpf_lpm_trie_key_hdr trie_key; __u32 data; }; diff --git a/tools/testing/selftests/bpf/test_lpm_map.c b/tools/testing/selftests/bpf/test_lpm_map.c index c028d621c744da..d98c72dc563eaf 100644 --- a/tools/testing/selftests/bpf/test_lpm_map.c +++ b/tools/testing/selftests/bpf/test_lpm_map.c @@ -211,7 +211,7 @@ static void test_lpm_map(int keysize) volatile size_t n_matches, n_matches_after_delete; size_t i, j, n_nodes, n_lookups; struct tlpm_node *t, *list = NULL; - struct bpf_lpm_trie_key *key; + struct bpf_lpm_trie_key_u8 *key; uint8_t *data, *value; int r, map; @@ -331,8 +331,8 @@ static void test_lpm_map(int keysize) static void test_lpm_ipaddr(void) { LIBBPF_OPTS(bpf_map_create_opts, opts, 
.map_flags = BPF_F_NO_PREALLOC); - struct bpf_lpm_trie_key *key_ipv4; - struct bpf_lpm_trie_key *key_ipv6; + struct bpf_lpm_trie_key_u8 *key_ipv4; + struct bpf_lpm_trie_key_u8 *key_ipv6; size_t key_size_ipv4; size_t key_size_ipv6; int map_fd_ipv4; @@ -423,7 +423,7 @@ static void test_lpm_ipaddr(void) static void test_lpm_delete(void) { LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = BPF_F_NO_PREALLOC); - struct bpf_lpm_trie_key *key; + struct bpf_lpm_trie_key_u8 *key; size_t key_size; int map_fd; __u64 value; @@ -532,7 +532,7 @@ static void test_lpm_delete(void) static void test_lpm_get_next_key(void) { LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = BPF_F_NO_PREALLOC); - struct bpf_lpm_trie_key *key_p, *next_key_p; + struct bpf_lpm_trie_key_u8 *key_p, *next_key_p; size_t key_size; __u32 value = 0; int map_fd; @@ -693,9 +693,9 @@ static void *lpm_test_command(void *arg) { int i, j, ret, iter, key_size; struct lpm_mt_test_info *info = arg; - struct bpf_lpm_trie_key *key_p; + struct bpf_lpm_trie_key_u8 *key_p; - key_size = sizeof(struct bpf_lpm_trie_key) + sizeof(__u32); + key_size = sizeof(*key_p) + sizeof(__u32); key_p = alloca(key_size); for (iter = 0; iter < info->iter; iter++) for (i = 0; i < MAX_TEST_KEYS; i++) { @@ -717,7 +717,7 @@ static void *lpm_test_command(void *arg) ret = bpf_map_lookup_elem(info->map_fd, key_p, &value); assert(ret == 0 || errno == ENOENT); } else { - struct bpf_lpm_trie_key *next_key_p = alloca(key_size); + struct bpf_lpm_trie_key_u8 *next_key_p = alloca(key_size); ret = bpf_map_get_next_key(info->map_fd, key_p, next_key_p); assert(ret == 0 || errno == ENOENT || errno == ENOMEM); } @@ -752,7 +752,7 @@ static void test_lpm_multi_thread(void) /* create a trie */ value_size = sizeof(__u32); - key_size = sizeof(struct bpf_lpm_trie_key) + value_size; + key_size = sizeof(struct bpf_lpm_trie_key_hdr) + value_size; map_fd = bpf_map_create(BPF_MAP_TYPE_LPM_TRIE, NULL, key_size, value_size, 100, &opts); /* create 4 threads to test update, 
delete, lookup and get_next_key */ -- cgit 1.2.3-korg From edac4b1132972cf086d59f3919febecc1430ebca Mon Sep 17 00:00:00 2001 From: Justin Chen Date: Wed, 28 Feb 2024 14:53:55 -0800 Subject: dt-bindings: net: brcm,unimac-mdio: Add asp-v2.2 The ASP 2.2 Ethernet controller uses a brcm unimac. Signed-off-by: Justin Chen Acked-by: Krzysztof Kozlowski Acked-by: Florian Fainelli Signed-off-by: David S. Miller --- Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml b/Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml index 6684810fcbf01c..23dfe0838dca48 100644 --- a/Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml +++ b/Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml @@ -24,6 +24,7 @@ properties: - brcm,genet-mdio-v5 - brcm,asp-v2.0-mdio - brcm,asp-v2.1-mdio + - brcm,asp-v2.2-mdio - brcm,unimac-mdio reg: -- cgit 1.2.3-korg From 5682a878e7f1f2c559bb09993181ed1a05315331 Mon Sep 17 00:00:00 2001 From: Justin Chen Date: Wed, 28 Feb 2024 14:53:56 -0800 Subject: dt-bindings: net: brcm,asp-v2.0: Add asp-v2.2 Add support for ASP 2.2. Signed-off-by: Justin Chen Acked-by: Florian Fainelli Acked-by: Krzysztof Kozlowski Signed-off-by: David S. 
Miller --- Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml b/Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml index 75d8138298fbbe..660e2ca42daf50 100644 --- a/Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml +++ b/Documentation/devicetree/bindings/net/brcm,asp-v2.0.yaml @@ -15,6 +15,10 @@ description: Broadcom Ethernet controller first introduced with 72165 properties: compatible: oneOf: + - items: + - enum: + - brcm,bcm74165b0-asp + - const: brcm,asp-v2.2 - items: - enum: - brcm,bcm74165-asp -- cgit 1.2.3-korg From 4e73e1bc1abf3181d57d6b8f1ab2a9f62a6a1a52 Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Fri, 1 Mar 2024 14:23:37 -0800 Subject: bpf, docs: Use IETF format for field definitions in instruction-set.rst In preparation for publication as an IETF RFC, the WG chairs asked me to convert the document to use IETF packet format for field layout, so this patch attempts to make it consistent with other IETF documents. Some fields that are not byte aligned were previously inconsistent in how values were defined. Some were defined as the value of the byte containing the field (like 0x20 for a field holding the high four bits of the byte), and others were defined as the value of the field itself (like 0x2). This PR makes them be consistent in using just the values of the field itself, which is IETF convention. As a result, some of the defines that used BPF_* would no longer match the value in the spec, and so this patch also drops the BPF_* prefix to avoid confusion with the defines that are the full-byte equivalent values. For consistency, BPF_* is then dropped from other fields too. BPF_ is thus the Linux implementation-specific define for as it appears in the BPF ISA specification. 
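As an illustration of the two conventions described above, here is a minimal sketch in C, not part of the patch. It packs field values into the full opcode byte, assuming the layout the updated document gives for arithmetic and jump instructions (4-bit code, 1-bit source, 3-bit class). The ISA_* names and both functions are hypothetical and exist only for this example. The field values and the two full-byte results (0x7e and 0x06) are taken from the tables and examples in instruction-set.rst.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch; values are the field values
 * (not the full-byte BPF_* defines) listed in instruction-set.rst. */
enum { ISA_ALU = 0x4, ISA_JMP32 = 0x6, ISA_ALU64 = 0x7 };  /* class  */
enum { ISA_K = 0, ISA_X = 1 };                             /* source */
enum { ISA_ADD = 0x0, ISA_SUB = 0x1 };                     /* ALU codes */
enum { ISA_JA = 0x0, ISA_JSGE = 0x7 };                     /* jump codes */

/* Pack {code, source, class} into one opcode byte:
 * code in bits 7..4, source in bit 3, class in bits 2..0. */
static inline uint8_t isa_opcode(unsigned code, unsigned source, unsigned cls)
{
	return (uint8_t)(((code & 0xf) << 4) | ((source & 0x1) << 3) | (cls & 0x7));
}

/* Check the packed bytes against the full-byte values the old text quoted. */
static inline void isa_opcode_selfcheck(void)
{
	assert(isa_opcode(ISA_JSGE, ISA_X, ISA_JMP32) == 0x7e); /* BPF_JSGE | BPF_X | BPF_JMP32 */
	assert(isa_opcode(ISA_JA, ISA_K, ISA_JMP32) == 0x06);   /* BPF_JA | BPF_K | BPF_JMP32 */
}
```

Under this reading, a full-byte define such as BPF_SUB (0x10) is the field value SUB (0x1) shifted into the high four bits, and BPF_X (0x08) is the source bit X (1) shifted to bit 3.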
The syntax BPF_ADD | BPF_X | BPF_ALU only worked for full-byte values so the convention {ADD, X, ALU} is proposed for referring to field values instead. Also replace the redundant "LSB bits" with "least significant bits". A preview of what the resulting Internet Draft would look like can be seen at: https://htmlpreview.github.io/?https://raw.githubusercontent.com/dthaler/ebpf-docs-1/format/draft-ietf-bpf-isa.html v1->v2: Fix sphinx issue as recommended by David Vernet Signed-off-by: Dave Thaler Acked-by: David Vernet Link: https://lore.kernel.org/r/20240301222337.15931-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov --- .../bpf/standardization/instruction-set.rst | 531 +++++++++++---------- 1 file changed, 290 insertions(+), 241 deletions(-) (limited to 'Documentation') diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst index f3269d6dd5e024..ffcba257e004d4 100644 --- a/Documentation/bpf/standardization/instruction-set.rst +++ b/Documentation/bpf/standardization/instruction-set.rst @@ -24,22 +24,22 @@ a type's signedness (`S`) and bit width (`N`), respectively. .. table:: Meaning of signedness notation. ==== ========= - `S` Meaning + S Meaning ==== ========= - `u` unsigned - `s` signed + u unsigned + s signed ==== ========= .. table:: Meaning of bit-width notation. ===== ========= - `N` Bit width + N Bit width ===== ========= - `8` 8 bits - `16` 16 bits - `32` 32 bits - `64` 64 bits - `128` 128 bits + 8 8 bits + 16 16 bits + 32 32 bits + 64 64 bits + 128 128 bits ===== ========= For example, `u32` is a type whose valid values are all the 32-bit unsigned @@ -48,31 +48,31 @@ numbers. Functions --------- -* `htobe16`: Takes an unsigned 16-bit number in host-endian format and +* htobe16: Takes an unsigned 16-bit number in host-endian format and returns the equivalent number as an unsigned 16-bit number in big-endian format. 
-* `htobe32`: Takes an unsigned 32-bit number in host-endian format and +* htobe32: Takes an unsigned 32-bit number in host-endian format and returns the equivalent number as an unsigned 32-bit number in big-endian format. -* `htobe64`: Takes an unsigned 64-bit number in host-endian format and +* htobe64: Takes an unsigned 64-bit number in host-endian format and returns the equivalent number as an unsigned 64-bit number in big-endian format. -* `htole16`: Takes an unsigned 16-bit number in host-endian format and +* htole16: Takes an unsigned 16-bit number in host-endian format and returns the equivalent number as an unsigned 16-bit number in little-endian format. -* `htole32`: Takes an unsigned 32-bit number in host-endian format and +* htole32: Takes an unsigned 32-bit number in host-endian format and returns the equivalent number as an unsigned 32-bit number in little-endian format. -* `htole64`: Takes an unsigned 64-bit number in host-endian format and +* htole64: Takes an unsigned 64-bit number in host-endian format and returns the equivalent number as an unsigned 64-bit number in little-endian format. -* `bswap16`: Takes an unsigned 16-bit number in either big- or little-endian +* bswap16: Takes an unsigned 16-bit number in either big- or little-endian format and returns the equivalent number with the same bit width but opposite endianness. -* `bswap32`: Takes an unsigned 32-bit number in either big- or little-endian +* bswap32: Takes an unsigned 32-bit number in either big- or little-endian format and returns the equivalent number with the same bit width but opposite endianness. -* `bswap64`: Takes an unsigned 64-bit number in either big- or little-endian +* bswap64: Takes an unsigned 64-bit number in either big- or little-endian format and returns the equivalent number with the same bit width but opposite endianness. 
@@ -135,34 +135,63 @@ Instruction encoding BPF has two instruction encodings: * the basic instruction encoding, which uses 64 bits to encode an instruction -* the wide instruction encoding, which appends a second 64-bit immediate (i.e., - constant) value after the basic instruction for a total of 128 bits. +* the wide instruction encoding, which appends a second 64 bits + after the basic instruction for a total of 128 bits. -The fields conforming an encoded basic instruction are stored in the -following order:: +Basic instruction encoding +-------------------------- - opcode:8 src_reg:4 dst_reg:4 offset:16 imm:32 // In little-endian BPF. - opcode:8 dst_reg:4 src_reg:4 offset:16 imm:32 // In big-endian BPF. +A basic instruction is encoded as follows:: -**imm** - signed integer immediate value + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | opcode | regs | offset | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | imm | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -**offset** - signed integer offset used with pointer arithmetic +**opcode** + operation to perform, encoded as follows:: -**src_reg** - the source register number (0-10), except where otherwise specified - (`64-bit immediate instructions`_ reuse this field for other purposes) + +-+-+-+-+-+-+-+-+ + |specific |class| + +-+-+-+-+-+-+-+-+ -**dst_reg** - destination register number (0-10) + **specific** + The format of these bits varies by instruction class -**opcode** - operation to perform + **class** + The instruction class (see `Instruction classes`_) + +**regs** + The source and destination register numbers, encoded as follows + on a little-endian host:: + + +-+-+-+-+-+-+-+-+ + |src_reg|dst_reg| + +-+-+-+-+-+-+-+-+ + + and as follows on a big-endian host:: + + +-+-+-+-+-+-+-+-+ + |dst_reg|src_reg| + +-+-+-+-+-+-+-+-+ + + **src_reg** + the source register number (0-10), except where otherwise specified + (`64-bit immediate instructions`_ 
reuse this field for other purposes) + + **dst_reg** + destination register number (0-10) + +**offset** + signed integer offset used with pointer arithmetic + +**imm** + signed integer immediate value -Note that the contents of multi-byte fields ('imm' and 'offset') are -stored using big-endian byte ordering in big-endian BPF and -little-endian byte ordering in little-endian BPF. +Note that the contents of multi-byte fields ('offset' and 'imm') are +stored using big-endian byte ordering on big-endian hosts and +little-endian byte ordering on little-endian hosts. For example:: @@ -175,66 +204,83 @@ For example:: Note that most instructions do not use all of the fields. Unused fields shall be cleared to zero. -As discussed below in `64-bit immediate instructions`_, a 64-bit immediate -instruction uses two 32-bit immediate values that are constructed as follows. -The 64 bits following the basic instruction contain a pseudo instruction -using the same format but with 'opcode', 'dst_reg', 'src_reg', and 'offset' all -set to zero, and imm containing the high 32 bits of the immediate value. +Wide instruction encoding +-------------------------- + +Some instructions are defined to use the wide instruction encoding, +which uses two 32-bit immediate values. The 64 bits following +the basic instruction format contain a pseudo instruction +with 'opcode', 'dst_reg', 'src_reg', and 'offset' all set to zero. This is depicted in the following figure:: - basic_instruction - .------------------------------. 
- | | - opcode:8 regs:8 offset:16 imm:32 unused:32 imm:32 - | | - '--------------' - pseudo instruction + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | opcode | regs | offset | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | imm | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | reserved | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | next_imm | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +**opcode** + operation to perform, encoded as explained above + +**regs** + The source and destination register numbers, encoded as explained above + +**offset** + signed integer offset used with pointer arithmetic + +**imm** + signed integer immediate value + +**reserved** + unused, set to zero -Here, the imm value of the pseudo instruction is called 'next_imm'. The unused -bytes in the pseudo instruction are reserved and shall be cleared to zero. +**next_imm** + second signed integer immediate value Instruction classes ------------------- -The three LSB bits of the 'opcode' field store the instruction class: - -========= ===== =============================== =================================== -class value description reference -========= ===== =============================== =================================== -BPF_LD 0x00 non-standard load operations `Load and store instructions`_ -BPF_LDX 0x01 load into register operations `Load and store instructions`_ -BPF_ST 0x02 store from immediate operations `Load and store instructions`_ -BPF_STX 0x03 store from register operations `Load and store instructions`_ -BPF_ALU 0x04 32-bit arithmetic operations `Arithmetic and jump instructions`_ -BPF_JMP 0x05 64-bit jump operations `Arithmetic and jump instructions`_ -BPF_JMP32 0x06 32-bit jump operations `Arithmetic and jump instructions`_ -BPF_ALU64 0x07 64-bit arithmetic operations `Arithmetic and jump instructions`_ -========= ===== =============================== 
=================================== +The three least significant bits of the 'opcode' field store the instruction class: + +===== ===== =============================== =================================== +class value description reference +===== ===== =============================== =================================== +LD 0x0 non-standard load operations `Load and store instructions`_ +LDX 0x1 load into register operations `Load and store instructions`_ +ST 0x2 store from immediate operations `Load and store instructions`_ +STX 0x3 store from register operations `Load and store instructions`_ +ALU 0x4 32-bit arithmetic operations `Arithmetic and jump instructions`_ +JMP 0x5 64-bit jump operations `Arithmetic and jump instructions`_ +JMP32 0x6 32-bit jump operations `Arithmetic and jump instructions`_ +ALU64 0x7 64-bit arithmetic operations `Arithmetic and jump instructions`_ +===== ===== =============================== =================================== Arithmetic and jump instructions ================================ -For arithmetic and jump instructions (``BPF_ALU``, ``BPF_ALU64``, ``BPF_JMP`` and -``BPF_JMP32``), the 8-bit 'opcode' field is divided into three parts: +For arithmetic and jump instructions (``ALU``, ``ALU64``, ``JMP`` and +``JMP32``), the 8-bit 'opcode' field is divided into three parts:: -============== ====== ================= -4 bits (MSB) 1 bit 3 bits (LSB) -============== ====== ================= -code source instruction class -============== ====== ================= + +-+-+-+-+-+-+-+-+ + | code |s|class| + +-+-+-+-+-+-+-+-+ **code** the operation code, whose meaning varies by instruction class -**source** +**s (source)** the source operand location, which unless otherwise specified is one of: ====== ===== ============================================== source value description ====== ===== ============================================== - BPF_K 0x00 use 32-bit 'imm' value as source operand - BPF_X 0x08 use 'src_reg' register value as source 
operand + K 0 use 32-bit 'imm' value as source operand + X 1 use 'src_reg' register value as source operand ====== ===== ============================================== **instruction class** @@ -243,75 +289,75 @@ code source instruction class Arithmetic instructions ----------------------- -``BPF_ALU`` uses 32-bit wide operands while ``BPF_ALU64`` uses 64-bit wide operands for -otherwise identical operations. ``BPF_ALU64`` instructions belong to the +``ALU`` uses 32-bit wide operands while ``ALU64`` uses 64-bit wide operands for +otherwise identical operations. ``ALU64`` instructions belong to the base64 conformance group unless noted otherwise. The 'code' field encodes the operation as below, where 'src' and 'dst' refer to the values of the source and destination registers, respectively. -========= ===== ======= ========================================================== -code value offset description -========= ===== ======= ========================================================== -BPF_ADD 0x00 0 dst += src -BPF_SUB 0x10 0 dst -= src -BPF_MUL 0x20 0 dst \*= src -BPF_DIV 0x30 0 dst = (src != 0) ? (dst / src) : 0 -BPF_SDIV 0x30 1 dst = (src != 0) ? (dst s/ src) : 0 -BPF_OR 0x40 0 dst \|= src -BPF_AND 0x50 0 dst &= src -BPF_LSH 0x60 0 dst <<= (src & mask) -BPF_RSH 0x70 0 dst >>= (src & mask) -BPF_NEG 0x80 0 dst = -dst -BPF_MOD 0x90 0 dst = (src != 0) ? (dst % src) : dst -BPF_SMOD 0x90 1 dst = (src != 0) ? 
(dst s% src) : dst -BPF_XOR 0xa0 0 dst ^= src -BPF_MOV 0xb0 0 dst = src -BPF_MOVSX 0xb0 8/16/32 dst = (s8,s16,s32)src -BPF_ARSH 0xc0 0 :term:`sign extending` dst >>= (src & mask) -BPF_END 0xd0 0 byte swap operations (see `Byte swap instructions`_ below) -========= ===== ======= ========================================================== +===== ===== ======= ========================================================== +name code offset description +===== ===== ======= ========================================================== +ADD 0x0 0 dst += src +SUB 0x1 0 dst -= src +MUL 0x2 0 dst \*= src +DIV 0x3 0 dst = (src != 0) ? (dst / src) : 0 +SDIV 0x3 1 dst = (src != 0) ? (dst s/ src) : 0 +OR 0x4 0 dst \|= src +AND 0x5 0 dst &= src +LSH 0x6 0 dst <<= (src & mask) +RSH 0x7 0 dst >>= (src & mask) +NEG 0x8 0 dst = -dst +MOD 0x9 0 dst = (src != 0) ? (dst % src) : dst +SMOD 0x9 1 dst = (src != 0) ? (dst s% src) : dst +XOR 0xa 0 dst ^= src +MOV 0xb 0 dst = src +MOVSX 0xb 8/16/32 dst = (s8,s16,s32)src +ARSH 0xc 0 :term:`sign extending` dst >>= (src & mask) +END 0xd 0 byte swap operations (see `Byte swap instructions`_ below) +===== ===== ======= ========================================================== Underflow and overflow are allowed during arithmetic operations, meaning the 64-bit or 32-bit value will wrap. If BPF program execution would result in division by zero, the destination register is instead set to zero. -If execution would result in modulo by zero, for ``BPF_ALU64`` the value of -the destination register is unchanged whereas for ``BPF_ALU`` the upper +If execution would result in modulo by zero, for ``ALU64`` the value of +the destination register is unchanged whereas for ``ALU`` the upper 32 bits of the destination register are zeroed. -``BPF_ADD | BPF_X | BPF_ALU`` means:: +``{ADD, X, ALU}``, where 'code' = ``ADD``, 'source' = ``X``, and 'class' = ``ALU``, means:: dst = (u32) ((u32) dst + (u32) src) where '(u32)' indicates that the upper 32 bits are zeroed. 
-``BPF_ADD | BPF_X | BPF_ALU64`` means:: +``{ADD, X, ALU64}`` means:: dst = dst + src -``BPF_XOR | BPF_K | BPF_ALU`` means:: +``{XOR, K, ALU}`` means:: dst = (u32) dst ^ (u32) imm -``BPF_XOR | BPF_K | BPF_ALU64`` means:: +``{XOR, K, ALU64}`` means:: dst = dst ^ imm Note that most instructions have instruction offset of 0. Only three instructions -(``BPF_SDIV``, ``BPF_SMOD``, ``BPF_MOVSX``) have a non-zero offset. +(``SDIV``, ``SMOD``, ``MOVSX``) have a non-zero offset. -Division, multiplication, and modulo operations for ``BPF_ALU`` are part +Division, multiplication, and modulo operations for ``ALU`` are part of the "divmul32" conformance group, and division, multiplication, and -modulo operations for ``BPF_ALU64`` are part of the "divmul64" conformance +modulo operations for ``ALU64`` are part of the "divmul64" conformance group. The division and modulo operations support both unsigned and signed flavors. -For unsigned operations (``BPF_DIV`` and ``BPF_MOD``), for ``BPF_ALU``, -'imm' is interpreted as a 32-bit unsigned value. For ``BPF_ALU64``, +For unsigned operations (``DIV`` and ``MOD``), for ``ALU``, +'imm' is interpreted as a 32-bit unsigned value. For ``ALU64``, 'imm' is first :term:`sign extended` from 32 to 64 bits, and then interpreted as a 64-bit unsigned value. -For signed operations (``BPF_SDIV`` and ``BPF_SMOD``), for ``BPF_ALU``, -'imm' is interpreted as a 32-bit signed value. For ``BPF_ALU64``, 'imm' +For signed operations (``SDIV`` and ``SMOD``), for ``ALU``, +'imm' is interpreted as a 32-bit signed value. For ``ALU64``, 'imm' is first :term:`sign extended` from 32 to 64 bits, and then interpreted as a 64-bit signed value. @@ -323,15 +369,15 @@ etc. This specification requires that signed modulo use truncated division a % n = a - n * trunc(a / n) -The ``BPF_MOVSX`` instruction does a move operation with sign extension. 
-``BPF_ALU | BPF_MOVSX`` :term:`sign extends` 8-bit and 16-bit operands into 32 +The ``MOVSX`` instruction does a move operation with sign extension. +``{MOVSX, X, ALU}`` :term:`sign extends` 8-bit and 16-bit operands into 32 bit operands, and zeroes the remaining upper 32 bits. -``BPF_ALU64 | BPF_MOVSX`` :term:`sign extends` 8-bit, 16-bit, and 32-bit +``{MOVSX, X, ALU64}`` :term:`sign extends` 8-bit, 16-bit, and 32-bit operands into 64 bit operands. Unlike other arithmetic instructions, -``BPF_MOVSX`` is only defined for register source operands (``BPF_X``). +``MOVSX`` is only defined for register source operands (``X``). -The ``BPF_NEG`` instruction is only defined when the source bit is clear -(``BPF_K``). +The ``NEG`` instruction is only defined when the source bit is clear +(``K``). Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31) for 32-bit operations. @@ -339,24 +385,24 @@ for 32-bit operations. Byte swap instructions ---------------------- -The byte swap instructions use instruction classes of ``BPF_ALU`` and ``BPF_ALU64`` -and a 4-bit 'code' field of ``BPF_END``. +The byte swap instructions use instruction classes of ``ALU`` and ``ALU64`` +and a 4-bit 'code' field of ``END``. The byte swap instructions operate on the destination register only and do not use a separate source register or immediate value. -For ``BPF_ALU``, the 1-bit source operand field in the opcode is used to +For ``ALU``, the 1-bit source operand field in the opcode is used to select what byte order the operation converts from or to. For -``BPF_ALU64``, the 1-bit source operand field in the opcode is reserved +``ALU64``, the 1-bit source operand field in the opcode is reserved and must be set to 0. 
-========= ========= ===== ================================================= -class source value description -========= ========= ===== ================================================= -BPF_ALU BPF_TO_LE 0x00 convert between host byte order and little endian -BPF_ALU BPF_TO_BE 0x08 convert between host byte order and big endian -BPF_ALU64 Reserved 0x00 do byte swap unconditionally -========= ========= ===== ================================================= +===== ======== ===== ================================================= +class source value description +===== ======== ===== ================================================= +ALU TO_LE 0 convert between host byte order and little endian +ALU TO_BE 1 convert between host byte order and big endian +ALU64 Reserved 0 do byte swap unconditionally +===== ======== ===== ================================================= The 'imm' field encodes the width of the swap operations. The following widths are supported: 16, 32 and 64. Width 64 operations belong to the base64 @@ -365,19 +411,19 @@ conformance group. 
Examples: -``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16/32/64 means:: +``{END, TO_LE, ALU}`` with imm = 16/32/64 means:: dst = htole16(dst) dst = htole32(dst) dst = htole64(dst) -``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 16/32/64 means:: +``{END, TO_BE, ALU}`` with imm = 16/32/64 means:: dst = htobe16(dst) dst = htobe32(dst) dst = htobe64(dst) -``BPF_ALU64 | BPF_TO_LE | BPF_END`` with imm = 16/32/64 means:: +``{END, TO_LE, ALU64}`` with imm = 16/32/64 means:: dst = bswap16(dst) dst = bswap32(dst) @@ -386,59 +432,59 @@ Examples: Jump instructions ----------------- -``BPF_JMP32`` uses 32-bit wide operands and indicates the base32 -conformance group, while ``BPF_JMP`` uses 64-bit wide operands for +``JMP32`` uses 32-bit wide operands and indicates the base32 +conformance group, while ``JMP`` uses 64-bit wide operands for otherwise identical operations, and indicates the base64 conformance group unless otherwise specified. The 'code' field encodes the operation as below: -======== ===== ======= =============================== ============================================= +======== ===== ======= =============================== =================================================== code value src_reg description notes -======== ===== ======= =============================== ============================================= -BPF_JA 0x0 0x0 PC += offset BPF_JMP | BPF_K only -BPF_JA 0x0 0x0 PC += imm BPF_JMP32 | BPF_K only -BPF_JEQ 0x1 any PC += offset if dst == src -BPF_JGT 0x2 any PC += offset if dst > src unsigned -BPF_JGE 0x3 any PC += offset if dst >= src unsigned -BPF_JSET 0x4 any PC += offset if dst & src -BPF_JNE 0x5 any PC += offset if dst != src -BPF_JSGT 0x6 any PC += offset if dst > src signed -BPF_JSGE 0x7 any PC += offset if dst >= src signed -BPF_CALL 0x8 0x0 call helper function by address BPF_JMP | BPF_K only, see `Helper functions`_ -BPF_CALL 0x8 0x1 call PC += imm BPF_JMP | BPF_K only, see `Program-local functions`_ -BPF_CALL 0x8 0x2 call helper function by 
BTF ID BPF_JMP | BPF_K only, see `Helper functions`_ -BPF_EXIT 0x9 0x0 return BPF_JMP | BPF_K only -BPF_JLT 0xa any PC += offset if dst < src unsigned -BPF_JLE 0xb any PC += offset if dst <= src unsigned -BPF_JSLT 0xc any PC += offset if dst < src signed -BPF_JSLE 0xd any PC += offset if dst <= src signed -======== ===== ======= =============================== ============================================= - -The BPF program needs to store the return value into register R0 before doing a -``BPF_EXIT``. +======== ===== ======= =============================== =================================================== +JA 0x0 0x0 PC += offset {JA, K, JMP} only +JA 0x0 0x0 PC += imm {JA, K, JMP32} only +JEQ 0x1 any PC += offset if dst == src +JGT 0x2 any PC += offset if dst > src unsigned +JGE 0x3 any PC += offset if dst >= src unsigned +JSET 0x4 any PC += offset if dst & src +JNE 0x5 any PC += offset if dst != src +JSGT 0x6 any PC += offset if dst > src signed +JSGE 0x7 any PC += offset if dst >= src signed +CALL 0x8 0x0 call helper function by address {CALL, K, JMP} only, see `Helper functions`_ +CALL 0x8 0x1 call PC += imm {CALL, K, JMP} only, see `Program-local functions`_ +CALL 0x8 0x2 call helper function by BTF ID {CALL, K, JMP} only, see `Helper functions`_ +EXIT 0x9 0x0 return {CALL, K, JMP} only +JLT 0xa any PC += offset if dst < src unsigned +JLE 0xb any PC += offset if dst <= src unsigned +JSLT 0xc any PC += offset if dst < src signed +JSLE 0xd any PC += offset if dst <= src signed +======== ===== ======= =============================== =================================================== + +The BPF program needs to store the return value into register R0 before doing an +``EXIT``. Example: -``BPF_JSGE | BPF_X | BPF_JMP32`` (0x7e) means:: +``{JSGE, X, JMP32}`` means:: if (s32)dst s>= (s32)src goto +offset where 's>=' indicates a signed '>=' comparison. 
-``BPF_JA | BPF_K | BPF_JMP32`` (0x06) means:: +``{JA, K, JMP32}`` means:: gotol +imm where 'imm' means the branch offset comes from insn 'imm' field. -Note that there are two flavors of ``BPF_JA`` instructions. The -``BPF_JMP`` class permits a 16-bit jump offset specified by the 'offset' -field, whereas the ``BPF_JMP32`` class permits a 32-bit jump offset +Note that there are two flavors of ``JA`` instructions. The +``JMP`` class permits a 16-bit jump offset specified by the 'offset' +field, whereas the ``JMP32`` class permits a 32-bit jump offset specified by the 'imm' field. A > 16-bit conditional jump may be converted to a < 16-bit conditional jump plus a 32-bit unconditional jump. -All ``BPF_CALL`` and ``BPF_JA`` instructions belong to the +All ``CALL`` and ``JA`` instructions belong to the base32 conformance group. Helper functions @@ -459,80 +505,83 @@ Program-local functions ~~~~~~~~~~~~~~~~~~~~~~~ Program-local functions are functions exposed by the same BPF program as the caller, and are referenced by offset from the call instruction, similar to -``BPF_JA``. The offset is encoded in the imm field of the call instruction. -A ``BPF_EXIT`` within the program-local function will return to the caller. +``JA``. The offset is encoded in the imm field of the call instruction. +A ``EXIT`` within the program-local function will return to the caller. 
Load and store instructions =========================== -For load and store instructions (``BPF_LD``, ``BPF_LDX``, ``BPF_ST``, and ``BPF_STX``), the -8-bit 'opcode' field is divided as: - -============ ====== ================= -3 bits (MSB) 2 bits 3 bits (LSB) -============ ====== ================= -mode size instruction class -============ ====== ================= - -The mode modifier is one of: - - ============= ===== ==================================== ============= - mode modifier value description reference - ============= ===== ==================================== ============= - BPF_IMM 0x00 64-bit immediate instructions `64-bit immediate instructions`_ - BPF_ABS 0x20 legacy BPF packet access (absolute) `Legacy BPF Packet access instructions`_ - BPF_IND 0x40 legacy BPF packet access (indirect) `Legacy BPF Packet access instructions`_ - BPF_MEM 0x60 regular load and store operations `Regular load and store operations`_ - BPF_MEMSX 0x80 sign-extension load operations `Sign-extension load operations`_ - BPF_ATOMIC 0xc0 atomic operations `Atomic operations`_ - ============= ===== ==================================== ============= - -The size modifier is one of: - - ============= ===== ===================== - size modifier value description - ============= ===== ===================== - BPF_W 0x00 word (4 bytes) - BPF_H 0x08 half word (2 bytes) - BPF_B 0x10 byte - BPF_DW 0x18 double word (8 bytes) - ============= ===== ===================== - -Instructions using ``BPF_DW`` belong to the base64 conformance group. 
+For load and store instructions (``LD``, ``LDX``, ``ST``, and ``STX``), the +8-bit 'opcode' field is divided as:: + + +-+-+-+-+-+-+-+-+ + |mode |sz |class| + +-+-+-+-+-+-+-+-+ + +**mode** + The mode modifier is one of: + + ============= ===== ==================================== ============= + mode modifier value description reference + ============= ===== ==================================== ============= + IMM 0 64-bit immediate instructions `64-bit immediate instructions`_ + ABS 1 legacy BPF packet access (absolute) `Legacy BPF Packet access instructions`_ + IND 2 legacy BPF packet access (indirect) `Legacy BPF Packet access instructions`_ + MEM 3 regular load and store operations `Regular load and store operations`_ + MEMSX 4 sign-extension load operations `Sign-extension load operations`_ + ATOMIC 6 atomic operations `Atomic operations`_ + ============= ===== ==================================== ============= + +**sz (size)** + The size modifier is one of: + + ==== ===== ===================== + size value description + ==== ===== ===================== + W 0 word (4 bytes) + H 1 half word (2 bytes) + B 2 byte + DW 3 double word (8 bytes) + ==== ===== ===================== + + Instructions using ``DW`` belong to the base64 conformance group. + +**class** + The instruction class (see `Instruction classes`_) Regular load and store operations --------------------------------- -The ``BPF_MEM`` mode modifier is used to encode regular load and store +The ``MEM`` mode modifier is used to encode regular load and store instructions that transfer data between a register and memory. 
-``BPF_MEM | <size> | BPF_STX`` means:: +``{MEM, <size>, STX}`` means:: *(size *) (dst + offset) = src -``BPF_MEM | <size> | BPF_ST`` means:: +``{MEM, <size>, ST}`` means:: *(size *) (dst + offset) = imm -``BPF_MEM | <size> | BPF_LDX`` means:: +``{MEM, <size>, LDX}`` means:: dst = *(unsigned size *) (src + offset) -Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW`` and -'unsigned size' is one of u8, u16, u32 or u64. +Where '<size>' is one of: ``B``, ``H``, ``W``, or ``DW``, and +'unsigned size' is one of: u8, u16, u32, or u64. Sign-extension load operations ------------------------------ -The ``BPF_MEMSX`` mode modifier is used to encode :term:`sign-extension` load +The ``MEMSX`` mode modifier is used to encode :term:`sign-extension` load instructions that transfer data between a register and memory. -``BPF_MEMSX | <size> | BPF_LDX`` means:: +``{MEMSX, <size>, LDX}`` means:: dst = *(signed size *) (src + offset) -Where size is one of: ``BPF_B``, ``BPF_H`` or ``BPF_W``, and -'signed size' is one of s8, s16 or s32. +Where size is one of: ``B``, ``H``, or ``W``, and +'signed size' is one of: s8, s16, or s32. Atomic operations ----------------- @@ -542,11 +591,11 @@ interrupted or corrupted by other access to the same memory region by other BPF programs or means outside of this specification. All atomic operations supported by BPF are encoded as store operations -that use the ``BPF_ATOMIC`` mode modifier as follows: +that use the ``ATOMIC`` mode modifier as follows: -* ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations, which are +* ``{ATOMIC, W, STX}`` for 32-bit operations, which are part of the "atomic32" conformance group. -* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations, which are +* ``{ATOMIC, DW, STX}`` for 64-bit operations, which are part of the "atomic64" conformance group. * 8-bit and 16-bit wide atomic operations are not supported. 
@@ -557,18 +606,18 @@ arithmetic operations in the 'imm' field to encode the atomic operation: ======== ===== =========== imm value description ======== ===== =========== -BPF_ADD 0x00 atomic add -BPF_OR 0x40 atomic or -BPF_AND 0x50 atomic and -BPF_XOR 0xa0 atomic xor +ADD 0x00 atomic add +OR 0x40 atomic or +AND 0x50 atomic and +XOR 0xa0 atomic xor ======== ===== =========== -``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means:: +``{ATOMIC, W, STX}`` with 'imm' = ADD means:: *(u32 *)(dst + offset) += src -``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF_ADD means:: +``{ATOMIC, DW, STX}`` with 'imm' = ADD means:: *(u64 *)(dst + offset) += src @@ -578,20 +627,20 @@ two complex atomic operations: =========== ================ =========================== imm value description =========== ================ =========================== -BPF_FETCH 0x01 modifier: return old value -BPF_XCHG 0xe0 | BPF_FETCH atomic exchange -BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange +FETCH 0x01 modifier: return old value +XCHG 0xe0 | FETCH atomic exchange +CMPXCHG 0xf0 | FETCH atomic compare and exchange =========== ================ =========================== -The ``BPF_FETCH`` modifier is optional for simple atomic operations, and -always set for the complex atomic operations. If the ``BPF_FETCH`` flag +The ``FETCH`` modifier is optional for simple atomic operations, and +always set for the complex atomic operations. If the ``FETCH`` flag is set, then the operation also overwrites ``src`` with the value that was in memory before it was modified. -The ``BPF_XCHG`` operation atomically exchanges ``src`` with the value +The ``XCHG`` operation atomically exchanges ``src`` with the value addressed by ``dst + offset``. -The ``BPF_CMPXCHG`` operation atomically compares the value addressed by +The ``CMPXCHG`` operation atomically compares the value addressed by ``dst + offset`` with ``R0``. If they match, the value addressed by ``dst + offset`` is replaced with ``src``. 
 In either case, the
 value that was at ``dst + offset`` before the operation is zero-extended
@@ -600,25 +649,25 @@ and loaded back to ``R0``.
 64-bit immediate instructions
 -----------------------------

-Instructions with the ``BPF_IMM`` 'mode' modifier use the wide instruction
+Instructions with the ``IMM`` 'mode' modifier use the wide instruction
 encoding defined in `Instruction encoding`_, and use the 'src_reg' field of the
 basic instruction to hold an opcode subtype.

-The following table defines a set of ``BPF_IMM | BPF_DW | BPF_LD`` instructions
+The following table defines a set of ``{IMM, DW, LD}`` instructions
 with opcode subtypes in the 'src_reg' field, using new terms such as "map"
 defined further below:

-=========================  ======  =======  =========================================  ===========  ==============
-opcode construction        opcode  src_reg  pseudocode                                 imm type     dst type
-=========================  ======  =======  =========================================  ===========  ==============
-BPF_IMM | BPF_DW | BPF_LD  0x18    0x0      dst = (next_imm << 32) | imm               integer      integer
-BPF_IMM | BPF_DW | BPF_LD  0x18    0x1      dst = map_by_fd(imm)                       map fd       map
-BPF_IMM | BPF_DW | BPF_LD  0x18    0x2      dst = map_val(map_by_fd(imm)) + next_imm   map fd       data pointer
-BPF_IMM | BPF_DW | BPF_LD  0x18    0x3      dst = var_addr(imm)                        variable id  data pointer
-BPF_IMM | BPF_DW | BPF_LD  0x18    0x4      dst = code_addr(imm)                       integer      code pointer
-BPF_IMM | BPF_DW | BPF_LD  0x18    0x5      dst = map_by_idx(imm)                      map index    map
-BPF_IMM | BPF_DW | BPF_LD  0x18    0x6      dst = map_val(map_by_idx(imm)) + next_imm  map index    data pointer
-=========================  ======  =======  =========================================  ===========  ==============
+=======  =========================================  ===========  ==============
+src_reg  pseudocode                                 imm type     dst type
+=======  =========================================  ===========  ==============
+0x0      dst = (next_imm << 32) | imm               integer      integer
+0x1      dst = map_by_fd(imm)                       map fd       map
+0x2      dst = map_val(map_by_fd(imm)) + next_imm   map fd       data pointer
+0x3      dst = var_addr(imm)                        variable id  data pointer
+0x4      dst = code_addr(imm)                       integer      code pointer
+0x5      dst = map_by_idx(imm)                      map index    map
+0x6      dst = map_val(map_by_idx(imm)) + next_imm  map index    data pointer
+=======  =========================================  ===========  ==============

 where

@@ -657,8 +706,8 @@ Legacy BPF Packet access instructions

 BPF previously introduced special instructions for access to packet data that
 were carried over from classic BPF. These instructions used an instruction
-class of BPF_LD, a size modifier of BPF_W, BPF_H, or BPF_B, and a
-mode modifier of BPF_ABS or BPF_IND. The 'dst_reg' and 'offset' fields were
-set to zero, and 'src_reg' was set to zero for BPF_ABS. However, these
+class of ``LD``, a size modifier of ``W``, ``H``, or ``B``, and a
+mode modifier of ``ABS`` or ``IND``. The 'dst_reg' and 'offset' fields were
+set to zero, and 'src_reg' was set to zero for ``ABS``. However, these
 instructions are deprecated and should no longer be used. All legacy packet
 access instructions belong to the "legacy" conformance group.
--
cgit 1.2.3-korg

From df620d7fabe984accf6567c846e4188fbd8add4d Mon Sep 17 00:00:00 2001
From: Conor Dooley
Date: Thu, 29 Feb 2024 18:24:00 +0000
Subject: dt-bindings: leds: pwm-multicolour: re-allow active-low

active-low was lifted to the common schema for leds, but it went
unnoticed that the leds-multicolour binding had "additionalProperties:
false" where the other users had "unevaluatedProperties: false", thereby
disallowing active-low for multicolour leds. Explicitly permit it again.

Fixes: c94d1783136e ("dt-bindings: net: phy: Make LED active-low property common")
Acked-by: Rob Herring
Signed-off-by: Conor Dooley
Acked-by: Lee Jones
Signed-off-by: David S. Miller
---
 Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml | 2 ++
 1 file changed, 2 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml b/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml
index 5edfbe347341cd..a31a202afe5ccf 100644
--- a/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml
+++ b/Documentation/devicetree/bindings/leds/leds-pwm-multicolor.yaml
@@ -41,6 +41,8 @@ properties:

           pwm-names: true

+          active-low: true
+
           color: true

         required:
--
cgit 1.2.3-korg

From 9e6c88e2f05b05892113fc56fabc652ac1ee7043 Mon Sep 17 00:00:00 2001
From: Geliang Tang
Date: Fri, 1 Mar 2024 19:18:28 +0100
Subject: mptcp: add token for get-addr in yaml

This patch adds token parameter together with addr in get-addr section in
mptcp_pm.yaml, then use the following commands to update mptcp_pm_gen.c and
mptcp_pm_gen.h:

./tools/net/ynl/ynl-gen-c.py --mode kernel \
        --spec Documentation/netlink/specs/mptcp_pm.yaml --source \
        -o net/mptcp/mptcp_pm_gen.c
./tools/net/ynl/ynl-gen-c.py --mode kernel \
        --spec Documentation/netlink/specs/mptcp_pm.yaml --header \
        -o net/mptcp/mptcp_pm_gen.h

Signed-off-by: Geliang Tang
Reviewed-by: Matthieu Baerts (NGI0)
Reviewed-by: Mat Martineau
Signed-off-by: Matthieu Baerts (NGI0)
Signed-off-by: David S. Miller
---
 Documentation/netlink/specs/mptcp_pm.yaml | 3 ++-
 net/mptcp/mptcp_pm_gen.c                  | 7 ++++---
 net/mptcp/mptcp_pm_gen.h                  | 2 +-
 3 files changed, 7 insertions(+), 5 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/netlink/specs/mptcp_pm.yaml b/Documentation/netlink/specs/mptcp_pm.yaml
index 49f90cfb469894..af525ed2979234 100644
--- a/Documentation/netlink/specs/mptcp_pm.yaml
+++ b/Documentation/netlink/specs/mptcp_pm.yaml
@@ -292,13 +292,14 @@ operations:
     -
       name: get-addr
       doc: Get endpoint information
-      attribute-set: endpoint
+      attribute-set: attr
       dont-validate: [ strict ]
       flags: [ uns-admin-perm ]
       do: &get-addr-attrs
         request:
           attributes:
             - addr
+            - token
         reply:
           attributes:
             - addr
diff --git a/net/mptcp/mptcp_pm_gen.c b/net/mptcp/mptcp_pm_gen.c
index 670da7822e6c91..c30a2a90a19252 100644
--- a/net/mptcp/mptcp_pm_gen.c
+++ b/net/mptcp/mptcp_pm_gen.c
@@ -32,8 +32,9 @@ const struct nla_policy mptcp_pm_del_addr_nl_policy[MPTCP_PM_ENDPOINT_ADDR + 1]
 };

 /* MPTCP_PM_CMD_GET_ADDR - do */
-const struct nla_policy mptcp_pm_get_addr_nl_policy[MPTCP_PM_ENDPOINT_ADDR + 1] = {
-	[MPTCP_PM_ENDPOINT_ADDR] = NLA_POLICY_NESTED(mptcp_pm_address_nl_policy),
+const struct nla_policy mptcp_pm_get_addr_nl_policy[MPTCP_PM_ATTR_TOKEN + 1] = {
+	[MPTCP_PM_ATTR_ADDR] = NLA_POLICY_NESTED(mptcp_pm_address_nl_policy),
+	[MPTCP_PM_ATTR_TOKEN] = { .type = NLA_U32, },
 };

 /* MPTCP_PM_CMD_FLUSH_ADDRS - do */
@@ -110,7 +111,7 @@ const struct genl_ops mptcp_pm_nl_ops[11] = {
 		.doit		= mptcp_pm_nl_get_addr_doit,
 		.dumpit		= mptcp_pm_nl_get_addr_dumpit,
 		.policy		= mptcp_pm_get_addr_nl_policy,
-		.maxattr	= MPTCP_PM_ENDPOINT_ADDR,
+		.maxattr	= MPTCP_PM_ATTR_TOKEN,
 		.flags		= GENL_UNS_ADMIN_PERM,
 	},
 	{
diff --git a/net/mptcp/mptcp_pm_gen.h b/net/mptcp/mptcp_pm_gen.h
index ac9fc7225b6a09..e24258f6f819bb 100644
--- a/net/mptcp/mptcp_pm_gen.h
+++ b/net/mptcp/mptcp_pm_gen.h
@@ -18,7 +18,7 @@ extern const struct nla_policy mptcp_pm_add_addr_nl_policy[MPTCP_PM_ENDPOINT_ADD

 extern const struct nla_policy mptcp_pm_del_addr_nl_policy[MPTCP_PM_ENDPOINT_ADDR + 1];

-extern const struct nla_policy mptcp_pm_get_addr_nl_policy[MPTCP_PM_ENDPOINT_ADDR + 1];
+extern const struct nla_policy mptcp_pm_get_addr_nl_policy[MPTCP_PM_ATTR_TOKEN + 1];

 extern const struct nla_policy mptcp_pm_flush_addrs_nl_policy[MPTCP_PM_ENDPOINT_ADDR + 1];

--
cgit 1.2.3-korg

From 0ef05e258b5e15c254534d9dd382ad4c3173dce0 Mon Sep 17 00:00:00 2001
From: Dave Thaler
Date: Fri, 1 Mar 2024 17:22:29 -0800
Subject: bpf, docs: Rename legacy conformance group to packet

There could be other legacy conformance groups in the future,
so use a more descriptive name. The status of the conformance
group in the IANA registry is what designates it as legacy,
not the name of the group.

Signed-off-by: Dave Thaler
Link: https://lore.kernel.org/r/20240302012229.16452-1-dthaler1968@gmail.com
Signed-off-by: Alexei Starovoitov
Signed-off-by: Daniel Borkmann
---
 Documentation/bpf/standardization/instruction-set.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
index ffcba257e004d4..a5ab00ac0b1487 100644
--- a/Documentation/bpf/standardization/instruction-set.rst
+++ b/Documentation/bpf/standardization/instruction-set.rst
@@ -127,7 +127,7 @@ This document defines the following conformance groups:
 * divmul32: includes 32-bit division, multiplication, and modulo instructions.
 * divmul64: includes divmul32, plus 64-bit division, multiplication,
   and modulo instructions.
-* legacy: deprecated packet access instructions.
+* packet: deprecated packet access instructions.

 Instruction encoding
 ====================
@@ -710,4 +710,4 @@ class of ``LD``, a size modifier of ``W``, ``H``, or ``B``, and a
 mode modifier of ``ABS`` or ``IND``. The 'dst_reg' and 'offset' fields were
 set to zero, and 'src_reg' was set to zero for ``ABS``.
 However, these
 instructions are deprecated and should no longer be used. All legacy packet
-access instructions belong to the "legacy" conformance group.
+access instructions belong to the "packet" conformance group.
--
cgit 1.2.3-korg

From a6d63bbf2c52d0a9d1550cd9a5ba58ea6371991b Mon Sep 17 00:00:00 2001
From: Takeru Hayasaka
Date: Mon, 12 Feb 2024 02:04:05 +0000
Subject: ice: Implement RSS settings for GTP using ethtool

Following the addition of new GTP RSS hash options to ethtool.h,
this patch implements the corresponding RSS settings for GTP packets in the
Intel ice driver. It enables users to configure RSS for GTP-U and GTP-C
traffic over IPv4 and IPv6, utilizing the newly defined hash options.

The implementation covers the handling of gtpu(4|6), gtpc(4|6), gtpc(4|6)t,
gtpu(4|6)e, gtpu(4|6)u, and gtpu(4|6)d traffic, providing enhanced load
distribution for GTP traffic across multiple processing units.

Signed-off-by: Takeru Hayasaka
Reviewed-by: Marcin Szycik
Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen
---
 .../device_drivers/ethernet/intel/ice.rst    | 21 ++++--
 drivers/net/ethernet/intel/ice/ice_ethtool.c | 82 ++++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_flow.h    | 31 ++++++--
 drivers/net/ethernet/intel/ice/ice_lib.c     | 37 ++++++++++
 4 files changed, 162 insertions(+), 9 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/networking/device_drivers/ethernet/intel/ice.rst b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
index 5038e54586af66..934752f675ba40 100644
--- a/Documentation/networking/device_drivers/ethernet/intel/ice.rst
+++ b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
@@ -368,15 +368,28 @@ more options for Receive Side Scaling (RSS) hash byte configuration.

   # ethtool -N rx-flow-hash