diff --git a/.mailmap b/.mailmap index b78aa092b4bb85..bd6a72f29d9cc5 100644 --- a/.mailmap +++ b/.mailmap @@ -682,6 +682,7 @@ Peter A Jonsson Peter Hilber Peter Oruba Peter Oruba +Peter Rosin Pierre-Louis Bossart Pratyush Anand Pratyush Yadav @@ -856,6 +857,7 @@ Tobias Klauser Tobias Klauser Tobias Klauser Todor Tomov +Tomasz Jeznach Tony Luck Trilok Soni TripleX Chung diff --git a/Documentation/ABI/obsolete/sysfs-selinux-user b/Documentation/ABI/removed/sysfs-selinux-user similarity index 100% rename from Documentation/ABI/obsolete/sysfs-selinux-user rename to Documentation/ABI/removed/sysfs-selinux-user diff --git a/Documentation/admin-guide/cgroup-v1/memcg_test.rst b/Documentation/admin-guide/cgroup-v1/memcg_test.rst index 9f8e27355cba54..7c7cd457cf6952 100644 --- a/Documentation/admin-guide/cgroup-v1/memcg_test.rst +++ b/Documentation/admin-guide/cgroup-v1/memcg_test.rst @@ -47,21 +47,19 @@ Please note that implementation details can be changed. Called when swp_entry's refcnt goes down to 0. A charge against swap disappears. -3. charge-commit-cancel +3. charge-commit ======================= Memcg pages are charged in two steps: - mem_cgroup_try_charge() - - mem_cgroup_commit_charge() or mem_cgroup_cancel_charge() + - commit_charge() At try_charge(), there are no flags to say "this page is charged". at this point, usage += PAGE_SIZE. At commit(), the page is associated with the memcg. - At cancel(), simply usage -= PAGE_SIZE. - Under below explanation, we assume CONFIG_SWAP=y. 4. Anonymous diff --git a/Documentation/arch/riscv/cmodx.rst b/Documentation/arch/riscv/cmodx.rst index 40ba53bed5dffd..cbfa812a11b455 100644 --- a/Documentation/arch/riscv/cmodx.rst +++ b/Documentation/arch/riscv/cmodx.rst @@ -21,13 +21,13 @@ call at each patchable function entry, and patches it dynamically at runtime to enable or disable the redirection. In the case of RISC-V, 2 instructions, AUIPC + JALR, are required to compose a function call. However, it is impossible to patch 2 instructions and expect that a concurrent read-side executes them -without a race condition. This series makes atmoic code patching possible in +without a race condition. This series makes atomic code patching possible in RISC-V ftrace. Kernel preemption makes things even worse as it allows the old state to persist across the patching process with stop_machine(). In order to get rid of stop_machine() and run dynamic ftrace with full kernel preemption, we partially initialize each patchable function entry at boot-time, -setting the first instruction to AUIPC, and the second to NOP. Now, atmoic +setting the first instruction to AUIPC, and the second to NOP. Now, atomic patching is possible because the kernel only has to update one instruction. According to Ziccif, as long as an instruction is naturally aligned, the ISA guarantee an atomic update. @@ -36,8 +36,8 @@ By fixing down the first instruction, AUIPC, the range of the ftrace trampoline is limited to +-2K from the predetermined target, ftrace_caller, due to the lack of immediate encoding space in RISC-V. To address the issue, we introduce CALL_OPS, where an 8B naturally align metadata is added in front of each -pacthable function. The metadata is resolved at the first trampoline, then the -execution can be derect to another custom trampoline. +patchable function. The metadata is resolved at the first trampoline, then the +execution can be directed to another custom trampoline. CMODX in the User Space ----------------------- diff --git a/Documentation/arch/riscv/zicfilp.rst b/Documentation/arch/riscv/zicfilp.rst index ab7d8e62ddaffb..12b35969d17ad2 100644 --- a/Documentation/arch/riscv/zicfilp.rst +++ b/Documentation/arch/riscv/zicfilp.rst @@ -78,7 +78,7 @@ the program. Per-task indirect branch tracking state can be monitored and controlled via the :c:macro:`PR_GET_CFI` and :c:macro:`PR_SET_CFI` -``prctl()` arguments (respectively), by supplying +``prctl()`` arguments (respectively), by supplying :c:macro:`PR_CFI_BRANCH_LANDING_PADS` as the second argument. These are architecture-agnostic, and will return -EINVAL if the underlying functionality is not supported. diff --git a/Documentation/devicetree/bindings/i2c/amlogic,meson6-i2c.yaml b/Documentation/devicetree/bindings/i2c/amlogic,meson6-i2c.yaml index c4cc8af1828078..7b59b60b62e5bb 100644 --- a/Documentation/devicetree/bindings/i2c/amlogic,meson6-i2c.yaml +++ b/Documentation/devicetree/bindings/i2c/amlogic,meson6-i2c.yaml @@ -16,10 +16,15 @@ allOf: properties: compatible: - enum: - - amlogic,meson6-i2c # Meson6, Meson8 and compatible SoCs - - amlogic,meson-gxbb-i2c # GXBB and compatible SoCs - - amlogic,meson-axg-i2c # AXG and compatible SoCs + oneOf: + - items: + - enum: + - amlogic,t7-i2c + - const: amlogic,meson-axg-i2c + - enum: + - amlogic,meson6-i2c # Meson6, Meson8 and compatible SoCs + - amlogic,meson-gxbb-i2c # GXBB and compatible SoCs + - amlogic,meson-axg-i2c # AXG and compatible SoCs reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/i2c/apple,i2c.yaml b/Documentation/devicetree/bindings/i2c/apple,i2c.yaml index 500a965bdb7a84..9e59200ad37b63 100644 --- a/Documentation/devicetree/bindings/i2c/apple,i2c.yaml +++ b/Documentation/devicetree/bindings/i2c/apple,i2c.yaml @@ -22,7 +22,9 @@ properties: compatible: oneOf: - items: - - const: apple,t6020-i2c + - enum: + - apple,t6020-i2c + - apple,t8122-i2c - const: apple,t8103-i2c - items: - enum: diff --git a/Documentation/devicetree/bindings/sound/gpio-audio-amp.yaml b/Documentation/devicetree/bindings/sound/gpio-audio-amp.yaml new file mode 100644 index 00000000000000..3690f3d1628c90 --- /dev/null +++ b/Documentation/devicetree/bindings/sound/gpio-audio-amp.yaml @@ -0,0 +1,270 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/sound/gpio-audio-amp.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Audio amplifier driven by GPIOs + +maintainers: + - Herve Codina + +description: | + Audio GPIO amplifiers are driven by GPIO in order to control the gain value + of the amplifier, its mute function and/or its bypass function. + + Those amplifiers are based on discrete components (analog switches, op-amps + and more) where some of them, mostly analog switches, are controlled by GPIOs + to adjust the gain value of the whole amplifier and/or to control + the mute and/or bypass function. + + For instance, the following piece of hardware is a GPIO amplifier + + +5VA + ^ + |\ | + | \ + Vin >---------------------------|+ \ + | +-------+-----> Vout + .--\/\/\/--+------------|- / | + | | | / | + v | |/ | | + GND o v | + \ GND | + gpio >-----------> \ | + o o | + | | | + | '--\/\/\/--. | + | +--\/\/\/--' + '---------------' + +properties: + compatible: + oneOf: + - const: gpio-audio-amp-mono + description: + A single channel amplifier. All features apply to this sole channel. + + - const: gpio-audio-amp-stereo + description: + A dual channel amplifier (left and right). All features apply to both + channels producing the same effect on both channels at the same time. + + vdd-supply: + description: Main power supply of the amplifier + + vddio-supply: + description: Power supply related to the control path + + vdda1-supply: + description: Analog power supply + + vdda2-supply: + description: Additional analog power supply + + mute-gpios: + description: GPIO to control the mute function + maxItems: 1 + + bypass-gpios: + description: GPIO to control the bypass function + maxItems: 1 + + gain-gpios: + description: | + GPIOs to control the amplifier gain + + The gain value is computed from GPIOs value from 0 to 2^N-1 with N the + number of GPIO described. The first GPIO described is the lsb of the gain + value. + + For instance assuming 2 gpios + gain-gpios = <&gpio1 GPIO_ACTIVE_HIGH> <&gpio2 GPIO_ACTIVE_HIGH>; + The gain value will be the following: + + gpio1 | gpio2 | gain + ------+-------+----- + 0 | 0 | 0b00 -> 0 + 1 | 0 | 0b01 -> 1 + 0 | 1 | 0b10 -> 2 + 1 | 1 | 0b11 -> 3 + ------+-------+----- + + Note: The gain value, bits set to 1 or 0, indicate the state active (bit + set) or the state inactive (bit unset) of the related GPIO. The + physical voltage corresponding to this active/inactive state is + given by the GPIO_ACTIVE_HIGH and GPIO_ACTIVE_LOW flags. + + minItems: 1 + maxItems: 16 + + gain-ranges: + $ref: /schemas/types.yaml#/definitions/int32-matrix + description: | + A list of one or more ranges of possible values. Each range is defined by + the first and last point in the range. Each point is defined by the pair + (GPIOs value, Gain in 0.01 dB unit). + + Ranges can be contiguous or holes can be present between ranges if some + gpios value should not be used. Also in a range the first point and the + last point can be identical. In that case, the range contains only one + item, the given point. + + items: + items: + - description: GPIOs value of the first point in the range + - description: Gain in 0.01 dB unit of the first point in the range + - description: GPIOs value of the last point in the range + - description: Gain in 0.01 dB unit of the last point in the range + description: | + A range defines a linear function (linear in dB) from the first point + to the last point, both included. The number of items in the range is + N = abs(first_point.gpio_value - last_point.gpio_value) + 1 + + It allows to define the gain range from the first_point.gain to + the last_point.gain, both points included. + + Gain (0.01 dB unit) + ^ + | last + +- - - - - - - - - - + point + | + . + | + . + | + . + +- - - - + . + | first . . + | point . . + | . . + +--------+-----------+---> gpios + value + + Note: Even if first_point.gpio_value is lower than last_point.gpio_value + and first_point.gain is lower than last_point.gain in the above + graphic, all combination of values are supported leading to an + increasing or a decreasing linear segment. + + minItems: 1 + maxItems: 65536 + + gain-labels: + $ref: /schemas/types.yaml#/definitions/string-array + minItems: 2 + maxItems: 65536 + description: | + List of the gain labels attached to the combination of GPIOs controlling + the gain. The first label is related to the gain value 0, the second label + is related to the gain value 1 and so on. + + With 2 GPIOs controlling the gain, GPIOs value can be 0, 1, 2 and 3. + Assuming that gain value set the hardware according to the following + table: + + GPIOs | Hardware + value | amplification + ------+-------------- + 0 | Low + 1 | Middle + 2 | High + 3 | Max + ------+-------------- + + The description using gain labels can be: + gain-labels = "Low", "Middle", "High", "Max"; + +dependencies: + gain-ranges: [ gain-gpios ] + gain-labels: [ gain-gpios ] + +required: + - compatible + - vdd-supply + +anyOf: + - required: + - gain-gpios + - required: + - mute-gpios + - required: + - bypass-gpios + +allOf: + - $ref: component-common.yaml# + - if: + required: + - gain-ranges + then: + properties: + gain-labels: false + - if: + required: + - gain-labels + then: + properties: + gain-ranges: false + +unevaluatedProperties: false + +examples: + - | + #include + + /* Gain controlled by gpios */ + amplifier-0 { + compatible = "gpio-audio-amp-mono"; + vdd-supply = <®ulator>; + gain-gpios = <&gpio 0 GPIO_ACTIVE_HIGH>, <&gpio 1 GPIO_ACTIVE_HIGH>; + }; + + /* Gain controlled by gpio using a simple range on a stereo amplifier */ + amplifier-1 { + compatible = "gpio-audio-amp-stereo"; + vdd-supply = <®ulator>; + gain-gpios = <&gpio 0 GPIO_ACTIVE_HIGH>, <&gpio 1 GPIO_ACTIVE_HIGH>; + gain-ranges = <0 (-300) 3 600>; + }; + + /* Gain controlled by gpio with labels */ + amplifier-3 { + compatible = "gpio-audio-amp-mono"; + vdd-supply = <®ulator>; + gain-gpios = <&gpio 0 GPIO_ACTIVE_HIGH>; + gain-labels = "Low", "High"; + }; + + /* A mutable stereo amplifier without any gain control */ + amplifier-4 { + compatible = "gpio-audio-amp-stereo"; + vdd-supply = <®ulator>; + mute-gpios = <&gpio 0 GPIO_ACTIVE_HIGH>; + }; + + /* + * Several supplies, gain controlled using more complex ranges, mute and + * bypass. + * + * Assuming 3 gpios for controlling the gain with the following table + * gpios value Gain + * 0b000 Do not use (gpios value not allowed) + * 0b001 - 3dB + * 0b010 + 3dB + * 0b011 + 10dB + * 0b100 Do not use (gpios value not allowed) + * 0b101 + 6dB + * 0b110 + 7dB + * 0b111 + 8dB + */ + amplifier-5 { + compatible = "gpio-audio-amp-mono"; + vdd-supply = <®ulator>; + vddio-supply = <®ulator1>; + vdda1-supply = <®ulator2>; + gain-gpios = <&gpio 0 GPIO_ACTIVE_HIGH>, + <&gpio 1 GPIO_ACTIVE_HIGH>, + <&gpio 2 GPIO_ACTIVE_HIGH>; + gain-ranges = <1 (-300) 2 300>, + <3 1000 3 1000>, + <5 600 7 800>; + mute-gpios = <&gpio 3 GPIO_ACTIVE_HIGH>; + bypass-gpios = <&gpio 4 GPIO_ACTIVE_HIGH>; + }; +... diff --git a/Documentation/devicetree/bindings/sound/mediatek,mt8173-rt5650-rt5514.yaml b/Documentation/devicetree/bindings/sound/mediatek,mt8173-rt5650-rt5514.yaml index ed698c9ff42b0e..becc7a11f8dc14 100644 --- a/Documentation/devicetree/bindings/sound/mediatek,mt8173-rt5650-rt5514.yaml +++ b/Documentation/devicetree/bindings/sound/mediatek,mt8173-rt5650-rt5514.yaml @@ -18,7 +18,9 @@ properties: description: Phandles of rt5650 and rt5514 codecs items: - description: phandle of rt5650 codec + maxItems: 1 - description: phandle of rt5514 codec + maxItems: 1 mediatek,platform: $ref: /schemas/types.yaml#/definitions/phandle diff --git a/Documentation/hwmon/sy7636a-hwmon.rst b/Documentation/hwmon/sy7636a-hwmon.rst index 0143ce0e5db765..03d866aba6e81c 100644 --- a/Documentation/hwmon/sy7636a-hwmon.rst +++ b/Documentation/hwmon/sy7636a-hwmon.rst @@ -22,5 +22,5 @@ The following sensors are supported sysfs-Interface --------------- -temp0_input +temp1_input - Temperature of external NTC (milli-degree C) diff --git a/Documentation/hwmon/yogafan.rst b/Documentation/hwmon/yogafan.rst index c553a381f772ee..68761947a1a838 100644 --- a/Documentation/hwmon/yogafan.rst +++ b/Documentation/hwmon/yogafan.rst @@ -135,4 +135,4 @@ References 4. **Lenovo IdeaPad Laptop Driver:** Reference for DMI-based hardware feature gating in Lenovo laptops. - https://github.com/torvalds/linux/blob/master/drivers/platform/x86/ideapad-laptop.c + https://github.com/torvalds/linux/blob/master/drivers/platform/x86/lenovo/ideapad-laptop.c diff --git a/Documentation/netlink/genetlink-c.yaml b/Documentation/netlink/genetlink-c.yaml index 57f59fe23e3f8c..4ea31e8fc4d19d 100644 --- a/Documentation/netlink/genetlink-c.yaml +++ b/Documentation/netlink/genetlink-c.yaml @@ -69,6 +69,15 @@ properties: header: description: For C-compatible languages, header which already defines this value. type: string + scope: + description: | + Visibility of this definition. "uapi" (default) renders into + the uAPI header, "kernel" renders into the kernel-side + generated header, "user" renders into the user-side + generated header. When combined with `header:`, the + definition is not rendered, and the named header is + included only by code matching the scope. + enum: [ uapi, kernel, user ] type: enum: [ const, enum, flags ] doc: diff --git a/Documentation/netlink/genetlink-legacy.yaml b/Documentation/netlink/genetlink-legacy.yaml index 66fb8653a34423..f9c44747729ae3 100644 --- a/Documentation/netlink/genetlink-legacy.yaml +++ b/Documentation/netlink/genetlink-legacy.yaml @@ -83,6 +83,15 @@ properties: header: description: For C-compatible languages, header which already defines this value. type: string + scope: + description: | + Visibility of this definition. "uapi" (default) renders into + the uAPI header, "kernel" renders into the kernel-side + generated header, "user" renders into the user-side + generated header. When combined with `header:`, the + definition is not rendered, and the named header is + included only by code matching the scope. + enum: [ uapi, kernel, user ] type: enum: [ const, enum, flags, struct ] # Trim doc: diff --git a/Documentation/netlink/genetlink.yaml b/Documentation/netlink/genetlink.yaml index a1194d5d93fc3c..d3f3f3399ddf85 100644 --- a/Documentation/netlink/genetlink.yaml +++ b/Documentation/netlink/genetlink.yaml @@ -55,6 +55,15 @@ properties: header: description: For C-compatible languages, header which already defines this value. type: string + scope: + description: | + Visibility of this definition. "uapi" (default) renders into + the uAPI header, "kernel" renders into the kernel-side + generated header, "user" renders into the user-side + generated header. When combined with `header:`, the + definition is not rendered, and the named header is + included only by code matching the scope. + enum: [ uapi, kernel, user ] type: enum: [ const, enum, flags ] doc: diff --git a/Documentation/netlink/netlink-raw.yaml b/Documentation/netlink/netlink-raw.yaml index dd98dda55bd0fd..4c436b59a34be5 100644 --- a/Documentation/netlink/netlink-raw.yaml +++ b/Documentation/netlink/netlink-raw.yaml @@ -87,6 +87,15 @@ properties: header: description: For C-compatible languages, header which already defines this value. type: string + scope: + description: | + Visibility of this definition. "uapi" (default) renders into + the uAPI header, "kernel" renders into the kernel-side + generated header, "user" renders into the user-side + generated header. When combined with `header:`, the + definition is not rendered, and the named header is + included only by code matching the scope. + enum: [ uapi, kernel, user ] type: enum: [ const, enum, flags, struct ] # Trim doc: diff --git a/Documentation/netlink/specs/net_shaper.yaml b/Documentation/netlink/specs/net_shaper.yaml index 3f2ad772b64b15..de01f922040a5c 100644 --- a/Documentation/netlink/specs/net_shaper.yaml +++ b/Documentation/netlink/specs/net_shaper.yaml @@ -33,6 +33,11 @@ doc: | @cap-get operation. definitions: + - + type: const + name: max-handle-id + value: 0x3fffffe + scope: kernel - type: enum name: scope @@ -140,6 +145,8 @@ attribute-sets: - name: id type: u32 + checks: + max: max-handle-id doc: | Numeric identifier of a shaper. The id semantic depends on the scope. For @queue scope it's the queue id and for @node diff --git a/Documentation/process/index.rst b/Documentation/process/index.rst index dbd6ea16aca702..aa7c959a52b875 100644 --- a/Documentation/process/index.rst +++ b/Documentation/process/index.rst @@ -86,6 +86,7 @@ regressions and security problems. debugging/index handling-regressions security-bugs + threat-model cve embargoed-hardware-issues diff --git a/Documentation/process/security-bugs.rst b/Documentation/process/security-bugs.rst index 27b028e8586104..3c51ddde31dd91 100644 --- a/Documentation/process/security-bugs.rst +++ b/Documentation/process/security-bugs.rst @@ -66,6 +66,42 @@ In addition, the following information are highly desirable: the issue appear. It is useful to share them, as they can be helpful to keep end users protected during the time it takes them to apply the fix. +What qualifies as a security bug +-------------------------------- + +It is important that most bugs are handled publicly so as to involve the widest +possible audience and find the best solution. By nature, bugs that are handled +in closed discussions between a small set of participants are less likely to +produce the best possible fix (e.g., risk of missing valid use cases, limited +testing abilities). + +It turns out that the majority of the bugs reported via the security team are +just regular bugs that have been improperly qualified as security bugs due to +a lack of awareness of the Linux kernel's threat model, as described in +Documentation/process/threat-model.rst, and ought to have been sent through +the normal channels described in Documentation/admin-guide/reporting-issues.rst +instead. + +The security list exists for urgent bugs that grant an attacker a capability +they are not supposed to have on a correctly configured production system, and +can be easily exploited, representing an imminent threat to many users. Before +reporting, consider whether the issue actually crosses a trust boundary on such +a system. + +**If you resorted to AI assistance to identify a bug, you must treat it as +public**. While you may have valid reasons to believe it is not, the security +team's experience shows that bugs discovered this way systematically surface +simultaneously across multiple researchers, often on the same day. In this +case, do not publicly share a reproducer, as this could cause unintended harm; +just mention that one is available and maintainers might ask for it privately +if they need it. + +If you are unsure whether an issue qualifies, err on the side of reporting +privately: the security team would rather triage a borderline report than miss +a real vulnerability. Reporting ordinary bugs to the security list, however, +does not make them move faster and consumes triage capacity that other reports +need. + Identifying contacts -------------------- @@ -74,7 +110,7 @@ affected subsystem's maintainers and Cc: the Linux kernel security team. Do not send it to a public list at this stage, unless you have good reasons to consider the issue as being public or trivial to discover (e.g. result of a widely available automated vulnerability scanning tool that can be repeated by -anyone). +anyone, or use of AI-based tools). If you're sending a report for issues affecting multiple parts in the kernel, even if they're fairly similar issues, please send individual messages (think @@ -131,6 +167,64 @@ the Linux kernel security team only. Your message will be triaged, and you will receive instructions about whom to contact, if needed. Your message may equally be forwarded as-is to the relevant maintainers. +Responsible use of AI to find bugs +---------------------------------- + +A significant fraction of bug reports submitted to the security team are +actually the result of code reviews assisted by AI tools. While this can be an +efficient means to find bugs in rarely explored areas, it causes an overload on +maintainers, who are sometimes forced to ignore such reports due to their poor +quality or accuracy. As such, reporters must be particularly cautious about a +number of points which tend to make these reports needlessly difficult to +handle: + + * **Length**: AI-generated reports tend to be excessively long, containing + multiple sections and excessive detail. This makes it difficult to spot + important information such as affected files, versions, and impact. Please + ensure that a clear summary of the problem and all critical details are + presented first. Do not require triage engineers to scan multiple pages of + text. Configure your tools to produce concise, human-style reports. + + * **Formatting**: Most AI-generated reports are littered with Markdown tags. + These decorations complicate the search for important information and do + not survive the quoting processes involved in forwarding or replying. + Please **always convert your report to plain text** without any formatting + decorations before sending it. + + * **Impact Evaluation**: Many AI-generated reports lack an understanding + of the kernel's threat model (see Documentation/process/threat-model.rst) + and go to great lengths inventing theoretical consequences. This adds + noise and complicates triage. Please stick to verifiable facts (e.g., + "this bug permits any user to gain CAP_NET_ADMIN") without enumerating + speculative implications. Have your tool read this documentation as + part of the evaluation process. + + * **Reproducer**: AI-based tools are often capable of generating reproducers. + Please always ensure your tool provides one and **test it thoroughly**. If + the reproducer does not work, or if the tool cannot produce one, the + validity of the report should be seriously questioned. Note that since the + report will be posted to a public list, the reproducer should only be + shared upon maintainers' request. + + * **Propose a Fix**: Many AI tools are actually better at writing code than + evaluating it. Please ask your tool to propose a fix and **test it** before + reporting the problem. If the fix cannot be tested because it relies on + rare hardware or almost extinct network protocols, the issue is likely not + a security bug. In any case, if a fix is proposed, it must adhere to + Documentation/process/submitting-patches.rst and include a 'Fixes:' tag + designating the commit that introduced the bug. + +Failure to consider these points exposes your report to the risk of being +ignored. + +Use common sense when evaluating the report. If the affected file has not been +touched for more than one year and is maintained by a single individual, it is +likely that usage has declined and exposed users are virtually non-existent +(e.g., drivers for very old hardware, obsolete filesystems). In such cases, +there is no need to consume a maintainer's time with an unimportant report. If +the issue is clearly trivial and publicly discoverable, you should report it +directly to the public mailing lists. + Sending the report ------------------ @@ -148,7 +242,15 @@ run additional tests. Reports where the reporter does not respond promptly or cannot effectively discuss their findings may be abandoned if the communication does not quickly improve. -The report must be sent to maintainers, with the security team in ``Cc:``. +The report must be sent to maintainers. If there are two or fewer +recipients in your message, you must also always Cc: the Linux kernel +security team who will ensure the message is delivered to the proper +people, and will be able to assist small maintainer teams with processes +they may not be familiar with. For larger teams, Cc: the Linux kernel +security team for your first few reports or when seeking specific help, +such as when resending a message which got no response within a week. +Once you have become comfortable with the process for a few reports, it is +no longer necessary to Cc: the security list when sending to large teams. The Linux kernel security team can be contacted by email at . This is a private list of security officers who will help verify the bug report and assist developers working on a fix. diff --git a/Documentation/process/threat-model.rst b/Documentation/process/threat-model.rst new file mode 100644 index 00000000000000..f177b8d3c1cafd --- /dev/null +++ b/Documentation/process/threat-model.rst @@ -0,0 +1,235 @@ +The Linux Kernel threat model +============================= + +There are a lot of assumptions regarding what the kernel does and does not +protect against. These assumptions tend to cause confusion for bug reports +(:doc:`security-related ones ` vs :doc:`non-security ones +<../admin-guide/reporting-issues>`), and can complicate security enforcement +when the responsibilities for some boundaries is not clear between the kernel, +distros, administrators and users. + +This document tries to clarify the responsibilities of the kernel in this +domain. + +The kernel's responsibilities +----------------------------- + +The kernel abstracts access to local hardware resources and to remote systems +in a way that allows multiple local users to get a fair share of the available +resources granted to them, and, when the underlying hardware permits, to assign +a level of confidentiality to their communications and to the data they are +processing or storing. + +The kernel assumes that the underlying hardware behaves according to its +specifications. This includes the integrity of the CPU's instruction set, the +transparency of the branch prediction unit and the cache units, the consistency +of the Memory Management Unit (MMU), the isolation of DMA-capable peripherals +(e.g., via IOMMU), state transitions in controllers, ranges of values read from +registers, the respect of documented hardware limitations, etc. + +When hardware fails to maintain its specified isolation (e.g., CPU bugs, +side-channels, hardware response to unexpected inputs), the kernel will usually +attempt to implement reasonable mitigations. These are best-effort measures +intended to reduce the attack surface or elevate the cost of an attack within +the limits of the hardware's facilities; they do not constitute a +kernel-provided safety guarantee. + +Users always perform their activities under the authority of an administrator +who is able to grant or deny various types of permissions that may affect how +users benefit from available resources, or the level of confidentiality of +their activities. Administrators may also delegate all or part of their own +permissions to some users, particularly via capabilities but not only. All this +is performed via configuration (sysctl, file-system permissions etc). + +The Linux Kernel applies a certain collection of default settings that match +its threat model. Distros have their own threat model and will come with their +own configuration presets, that the administrator may have to adjust to better +suit their expectations (relax or restrict). + +By default, the Linux Kernel guarantees the following protections when running +on common processors featuring privilege levels and memory management units: + +* **User-based isolation**: an unprivileged user may restrict access to their + own data from other unprivileged users running on the same system. This + includes: + + * stored data, via file system permissions + * in-memory data (pages are not accessible by default to other users) + * process activity (ptrace is not permitted to other users) + * inter-process communication (other users may not observe data exchanged via + UNIX domain sockets or other IPC mechanisms). + * network communications within the same or with other systems + +* **Capability-based protection**: + + * users not having elevated capabilities (including but not limited to + CAP_SYS_ADMIN) may not alter the + kernel's configuration, memory nor state, change other users' view of the + file system layout, grant any user capabilities they do not have, nor + affect the system's availability (shutdown, reboot, panic, hang, or making + the system unresponsive via unbounded resource exhaustion). + * users not having the ``CAP_NET_ADMIN`` capability may not alter the network + configuration, intercept nor spoof network communications from other users + nor systems. + * users not having ``CAP_SYS_PTRACE`` may not observe other users' processes + activities. + +When ``CONFIG_USER_NS`` is set, the kernel also permits unprivileged users to +create their own user namespace in which they have all capabilities, but with a +number of restrictions (they may not perform actions that have impacts on the +initial user namespace, such as changing time, loading modules or mounting +block devices). Please refer to ``user_namespaces(7)`` for more details, the +possibilities of user namespaces are not covered in this document. + +The kernel also offers a lot of troubleshooting and debugging facilities, which +can constitute attack vectors when placed in wrong hands. While some of them +are designed to be accessible to regular local users with a low risk (e.g. +kernel logs via ``/proc/kmsg``), some would expose enough information to +represent a risk in most places and the decision to expose them is under the +administrator's responsibility (perf events, traces), and others are not +designed to be accessed by non-privileged users (e.g. debugfs). Access to these +facilities by a user who has been explicitly granted permission by an +administrator does not constitute a security breach. + +Bugs that permit to violate the principles above constitute security breaches. +However, bugs that permit one violation only once another one was already +achieved are only weaknesses. The kernel applies a number of self-protection +measures whose purpose is to avoid crossing a security boundary when certain +classes of bugs are found, but a failure of these extra protections do not +constitute a vulnerability alone. + +What does not constitute a security bug +--------------------------------------- + +In the Linux kernel's threat model, the following classes of problems are +**NOT** considered as Linux Kernel security bugs. However, when it is believed +that the kernel could do better, they should be reported, so that they can be +reviewed and fixed where reasonably possible, but they will be handled as any +regular bug: + +* **Configuration**: + + * outdated kernels and particularly end-of-life branches are out of the scope + of the kernel's threat model: administrators are responsible for keeping + their system up to date. For a bug to qualify as a security bug, it must be + demonstrated that it affects actively maintained versions. + + * build-level: changes to the kernel configuration that are explicitly + documented as lowering the security level (e.g. ``CONFIG_NOMMU``), or + targeted at developers only. + + * OS-level: changes to command line parameters, sysctls, filesystem + permissions, user capabilities, exposure of privileged interfaces, that + explicitly increase exposure by either offering non-default access to + unprivileged users, or reduce the kernel's ability to enforce some + protections or mitigations. Example: write access to procfs or debugfs. + + * issues triggered only when using features intended for development or + debugging (e.g., LOCKDEP, KASAN, FAULT_INJECTION): these features are known + to introduce overhead and potential instability and are not intended for + production use. + + * issues affecting drivers exposed under CONFIG_STAGING, as well as features + marked EXPERIMENTAL in the configuration. + + * loading of explicitly insecure/broken/staging modules, and generally any + using any subsystem marked as experimental or not intended for production + use. + + * running out-of-tree modules or unofficial kernel forks; these should be + reported to the relevant vendor. + +* **Excess of initial privileges**: + + * actions performed by a user already possessing the privileges required to + perform that action or modify that state (e.g. ``CAP_SYS_ADMIN``, + ``CAP_NET_ADMIN``, ``CAP_SYS_RAWIO``, ``CAP_SYS_MODULE`` with no further + boundary being crossed). + + * actions performed in user namespace that do not bypass the restrictions + imposed to the initial user (e.g. ptrace usage, signal delivery, resource + usage, access to FS/device/sysctl/memory, network binding, system/network + configuration etc). + + * anything performed by the root user in the initial namespace (e.g. kernel + oops when writing to a privileged device). + +* **Out of production use**: + + This covers theoretical/probabilistic attacks that rely on laboratory + conditions with zero system noise, or those requiring an unrealistic number + of attempts (e.g., billions of trials) that would be detected by standard + system monitoring long before success, such as: + + * prediction of random numbers that only works in a totally silent + environment (such as IP ID, TCP ports or sequence numbers that can only be + guessed in a lab). + + * activity observation and information leaks based on probabilistic + approaches that are prone to measurement noise and not realistically + reproducible on a production system. + + * issues that can only be triggered by heavy attacks (e.g. brute force) whose + impact on the system makes it unlikely or impossible to remain undetected + before they succeed (e.g. consuming all memory before succeeding). + + * problems seen only under development simulators, emulators, or combinations + that do not exist on real systems at the time of reporting (issues + involving tens of millions of threads, tens of thousands of CPUs, + unrealistic CPU frequencies, RAM sizes or disk capacities, network speeds. + + * issues whose reproduction requires hardware modification or emulation, + including fake USB devices that pretend to be another one. + + * as well as issues that can be triggered at a cost that is orders of + magnitude higher than the expected benefits (e.g. fully functional keyboard + emulator only to retrieve 7 uninitialized bytes in a structure, or + brute-force method involving millions of connection attempts to guess a + port number). + +* **Hardening failures**: + + * ability to bypass some of the kernel's hardening measures with no + demonstrable exploit path (e.g. ASLR bypass, events timing or probing with + no demonstrable consequence). These are just weaknesses, not + vulnerabilities. + + * missing argument checks and failure to report certain errors with no + immediate consequence. + +* **Random information leaks**: + + This concerns information leaks of small data parts that happen to be there + and that cannot be chosen by the attacker, or face access restrictions: + + * structure padding reported by syscalls or other interfaces. + + * identifiers, partial data, non-terminated strings reported in error + messages. + + * Leaks of kernel memory addresses/pointers do not constitute an immediately + exploitable vector and are not security bugs, though they must be reported + and fixed. + +* **Crafted file system images**: + + * bugs triggered by mounting a corrupted or maliciously crafted file system + image are generally not security bugs, as the kernel assumes the underlying + storage media is under the administrator's control, unless the filesystem + driver is specifically documented as being hardened against untrusted media. + + * issues that are resolved, mitigated, or detected by running a filesystem + consistency check (fsck) on the image prior to mounting. + +* **Physical access**: + + Issues that require physical access to the machine, hardware modification, or + the use of specialized hardware (e.g., logic analyzers, DMA-attack tools over + PCI-E/Thunderbolt) are out of scope unless the system is explicitly + configured with technologies meant to defend against such attacks + (e.g. IOMMU). + +* **Functional and performance regressions**: + + Any issue that can be mitigated by setting proper permissions and limits + doesn't qualify as a security bug. diff --git a/Documentation/sound/alsa-configuration.rst b/Documentation/sound/alsa-configuration.rst index f75f087639415d..394cf87f340b4e 100644 --- a/Documentation/sound/alsa-configuration.rst +++ b/Documentation/sound/alsa-configuration.rst @@ -1097,7 +1097,7 @@ output (with ``--no-upload`` option) to kernel bugzilla or alsa-devel ML (see the section `Links and Addresses`_). ``power_save`` and ``power_save_controller`` options are for power-saving -mode. See powersave.rst for details. +mode. See Documentation/sound/designs/powersave.rst for details. Note 2: If you get click noises on output, try the module option ``position_fix=1`` or ``2``. ``position_fix=1`` will use the SD_LPIB @@ -1168,7 +1168,7 @@ line_outs_monitor enable_monitor Enable Analog Out on Channel 63/64 by default. -See hdspm.rst for details. +See Documentation/sound/cards/hdspm.rst for details. Module snd-ice1712 ------------------ diff --git a/Documentation/sound/codecs/cs35l56.rst b/Documentation/sound/codecs/cs35l56.rst index d5363b08f5152f..b3f8c1c2385185 100644 --- a/Documentation/sound/codecs/cs35l56.rst +++ b/Documentation/sound/codecs/cs35l56.rst @@ -40,7 +40,7 @@ There are two drivers in the kernel *For systems using SoundWire*: sound/soc/codecs/cs35l56.c and associated files -*For systems using HDA*: sound/pci/hda/cs35l56_hda.c +*For systems using HDA*: sound/hda/codecs/side-codecs/cs35l56_hda.c Firmware ======== diff --git a/Documentation/sound/soc/index.rst b/Documentation/sound/soc/index.rst index 8bed8f8f48daee..22878d73f7a54a 100644 --- a/Documentation/sound/soc/index.rst +++ b/Documentation/sound/soc/index.rst @@ -2,7 +2,7 @@ ALSA SoC Layer ============== -The documentation is spilt into the following sections:- +The documentation is split into the following sections:- .. toctree:: :maxdepth: 2 diff --git a/Documentation/sound/soc/platform.rst b/Documentation/sound/soc/platform.rst index bd21d0a4dd9b0b..53caba6ce4cea5 100644 --- a/Documentation/sound/soc/platform.rst +++ b/Documentation/sound/soc/platform.rst @@ -75,4 +75,4 @@ Each SoC DSP driver usually supplies the following features :- 3. DMA IO to/from DSP buffers (if applicable) 4. Definition of DSP front end (FE) PCM devices. -Please see DPCM.txt for a description of item 4. +Please see dpcm.rst for a description of item 4. diff --git a/Documentation/sound/soc/usb.rst b/Documentation/sound/soc/usb.rst index 94c12f9d9dd129..9cb860547f22d6 100644 --- a/Documentation/sound/soc/usb.rst +++ b/Documentation/sound/soc/usb.rst @@ -103,7 +103,7 @@ Returns 0 on success, and -EOPNOTSUPP on failure. - ``usbdev``: the usb device that was discovered - ``sdev``: capabilities of the device -**snd_soc_usb_connect()** notifies the ASoC USB DCPM BE DAI link of a USB +**snd_soc_usb_connect()** notifies the ASoC USB DPCM BE DAI link of a USB audio device detection. This can be utilized in the BE DAI driver to keep track of available USB audio devices. This is intended to be called by the USB offload driver residing in USB SND. @@ -118,7 +118,7 @@ Returns 0 on success, negative error code on failure. - ``usbdev``: the usb device that was removed - ``sdev``: capabilities to free -**snd_soc_usb_disconnect()** notifies the ASoC USB DCPM BE DAI link of a USB +**snd_soc_usb_disconnect()** notifies the ASoC USB DPCM BE DAI link of a USB audio device removal. This is intended to be called by the USB offload driver that resides in USB SND. diff --git a/Documentation/userspace-api/rseq.rst b/Documentation/userspace-api/rseq.rst index 3cd27a3c7c7e5a..8549a6c61531cc 100644 --- a/Documentation/userspace-api/rseq.rst +++ b/Documentation/userspace-api/rseq.rst @@ -24,6 +24,97 @@ Quick access to CPU number, node ID Allows to implement per CPU data efficiently. Documentation is in code and selftests. :( +Optimized RSEQ V2 +----------------- + +On architectures which utilize the generic entry code and generic TIF bits +the kernel supports runtime optimizations for RSEQ, which also enable +enhanced features like scheduler time slice extensions. + +To enable them a task has to register the RSEQ region with at least the +length advertised by getauxval(AT_RSEQ_FEATURE_SIZE). + +If existing binaries register with RSEQ_ORIG_SIZE (32 bytes), the kernel +keeps the legacy low performance mode enabled to fulfil the expectations +of existing users regarding the original RSEQ implementation behaviour. + +The following table documents the ABI and behavioral guarantees of the +legacy and the optimized V2 mode. + +.. list-table:: RSEQ modes + :header-rows: 1 + + * - Nr + - What + + - Legacy + - Optimized V2 + + * - 1 + - The cpu_id_start, cpu_id, node_id and mm_cid fields (User mode read + only) + .. Legacy + - Updated by the kernel unconditionally after each context switch and + before signal delivery + .. Optimized V2 + - Updated by the kernel if and only if they change, i.e. if the task + is migrated or mm_cid changes + + * - 2 + - The rseq_cs critical section field + .. Legacy + - Evaluated and handled unconditionally after each context switch and + before signal delivery + .. Optimized V2 + - Evaluated and handled conditionally only when user space was + interrupted and was scheduled out or before delivering a signal in + the interrupted context. + + * - 3 + - Read only fields + .. Legacy + - No strict enforcement except in debug mode + .. Optimized V2 + - Strict enforcement + + * - 4 + - membarrier(...RSEQ) + .. Legacy + - All running threads of the process are interrupted and the ID fields + are rewritten and eventually active critical sections are aborted + before they return to user space. All threads which are scheduled + out whether voluntary or not are covered by #1/#2 above. + .. Optimized V2 + - All running threads of the process are interrupted and eventually + active critical sections are aborted before these threads return to + user space. The ID fields are only updated if changed as a + consequence of the interrupt. All threads which are scheduled out + whether voluntary or not are covered by #1/#2 above. + + * - 5 + - Time slice extensions + .. Legacy + - Not supported + .. Optimized V2 + - Supported + +The legacy mode is obviously less performant as it does unconditional +updates and critical section checks even if not strictly required by the +ABI contract. That can't be changed anymore as some users depend on that +observed behavior, which in turn enables them to violate the ABI and +overwrite the cpu_id_start field for their own purposes. This is obviously +discouraged as it renders RSEQ incompatible with the intended usage and +breaks the expectation of other libraries in the same application. + +The ABI compliant optimized v2 mode, which respects the read only fields, +does not require unconditional updates and therefore is way more +performant. The kernel validates the read only fields for compliance. If +user space modifies them, the process is killed. Compliant usage allows +multiple libraries in the same application to benefit from the RSEQ +functionality without disturbing each other. The ABI compliant optimized v2 +mode also enables extended RSEQ features like time slice extensions. + + Scheduler time slice extensions ------------------------------- @@ -37,7 +128,8 @@ The prerequisites for this functionality are: * Enabled at boot time (default is enabled) - * A rseq userspace pointer has been registered for the thread + * A rseq userspace pointer has been registered for the thread in + optimized V2 mode The thread has to enable the functionality via prctl(2):: diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst index b2395dd4769de6..bd04a908a8dbdd 100644 --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst @@ -656,8 +656,8 @@ References See [white-paper]_, [api-spec]_, [amd-apm]_, [kvm-forum]_, and [snp-fw-abi]_ for more info. -.. [white-paper] https://developer.amd.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf -.. [api-spec] https://support.amd.com/TechDocs/55766_SEV-KM_API_Specification.pdf -.. [amd-apm] https://support.amd.com/TechDocs/24593.pdf (section 15.34) +.. [white-paper] https://docs.amd.com/v/u/en-US/memory-encryption-white-paper +.. [api-spec] https://docs.amd.com/v/u/en-US/55766_PUB_3.24_SEV_API +.. [amd-apm] https://docs.amd.com/v/u/en-US/24593_3.44_APM_Vol2 (section 15.34) .. [kvm-forum] https://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf -.. [snp-fw-abi] https://www.amd.com/system/files/TechDocs/56860.pdf +.. [snp-fw-abi] https://www.amd.com/content/dam/amd/en/documents/developer/56860.pdf diff --git a/MAINTAINERS b/MAINTAINERS index a03907010da46b..42cc9d530714e9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -68,6 +68,12 @@ Maintainers List first. When adding to this list, please keep the entries in alphabetical order. +3C509 NETWORK DRIVER +M: "Maciej W. Rozycki" +L: netdev@vger.kernel.org +S: Maintained +F: drivers/net/ethernet/3com/3c509.c + 3C59X NETWORK DRIVER M: Steffen Klassert L: netdev@vger.kernel.org @@ -2015,7 +2021,7 @@ F: Documentation/hwmon/aquacomputer_d5next.rst F: drivers/hwmon/aquacomputer_d5next.c AQUANTIA ETHERNET DRIVER (atlantic) -M: Igor Russkikh +M: Sukhdeep Singh L: netdev@vger.kernel.org S: Maintained W: https://www.marvell.com/ @@ -2024,7 +2030,7 @@ F: Documentation/networking/device_drivers/ethernet/aquantia/atlantic.rst F: drivers/net/ethernet/aquantia/atlantic/ AQUANTIA ETHERNET DRIVER PTP SUBSYSTEM -M: Egor Pomozov +M: Sukhdeep Singh L: netdev@vger.kernel.org S: Maintained W: http://www.aquantia.com @@ -4181,8 +4187,8 @@ F: include/uapi/linux/sonet.h F: net/atm/ ATMEL MACB ETHERNET DRIVER -M: Nicolas Ferre -M: Claudiu Beznea +M: Théo Lebrun +R: Conor Dooley S: Maintained F: drivers/net/ethernet/cadence/ @@ -4299,18 +4305,16 @@ F: Documentation/devicetree/bindings/leds/backlight/awinic,aw99706.yaml F: drivers/video/backlight/aw99706.c AXENTIA ARM DEVICES -M: Peter Rosin L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) -S: Maintained +S: Orphan F: arch/arm/boot/dts/microchip/at91-linea.dtsi F: arch/arm/boot/dts/microchip/at91-natte.dtsi F: arch/arm/boot/dts/microchip/at91-nattis-2-natte-2.dts F: arch/arm/boot/dts/microchip/at91-tse850-3.dts AXENTIA ASOC DRIVERS -M: Peter Rosin L: linux-sound@vger.kernel.org -S: Maintained +S: Orphan F: Documentation/devicetree/bindings/sound/axentia,* F: sound/soc/atmel/tse850-pcm5142.c @@ -6358,6 +6362,7 @@ F: include/uapi/linux/comedi.h COMMON CLK FRAMEWORK M: Michael Turquette M: Stephen Boyd +R: Brian Masney L: linux-clk@vger.kernel.org S: Maintained Q: http://patchwork.kernel.org/project/linux-clk/list/ @@ -7077,6 +7082,12 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/debugobjec F: include/linux/debugobjects.h F: lib/debugobjects.c +DEC LANCE NETWORK DRIVER +M: "Maciej W. Rozycki" +L: netdev@vger.kernel.org +S: Maintained +F: drivers/net/ethernet/amd/declance.c + DECSTATION PLATFORM SUPPORT M: "Maciej W. Rozycki" L: linux-mips@vger.kernel.org @@ -8193,10 +8204,9 @@ F: include/uapi/drm/nouveau_drm.h CORE DRIVER FOR NVIDIA GPUS [RUST] M: Danilo Krummrich M: Alexandre Courbot -L: nouveau@lists.freedesktop.org +L: nova-gpu@lists.linux.dev S: Supported W: https://rust-for-linux.com/nova-gpu-driver -Q: https://patchwork.freedesktop.org/project/nouveau/ B: https://gitlab.freedesktop.org/drm/nova/-/issues C: irc://irc.oftc.net/nouveau T: git https://gitlab.freedesktop.org/drm/rust/kernel.git drm-rust-next @@ -8205,10 +8215,9 @@ F: drivers/gpu/nova-core/ DRM DRIVER FOR NVIDIA GPUS [RUST] M: Danilo Krummrich -L: nouveau@lists.freedesktop.org +L: nova-gpu@lists.linux.dev S: Supported W: https://rust-for-linux.com/nova-gpu-driver -Q: https://patchwork.freedesktop.org/project/nouveau/ B: https://gitlab.freedesktop.org/drm/nova/-/issues C: irc://irc.oftc.net/nouveau T: git https://gitlab.freedesktop.org/drm/rust/kernel.git drm-rust-next @@ -12046,7 +12055,7 @@ F: Documentation/i2c/busses/i2c-nvidia-gpu.rst F: drivers/i2c/busses/i2c-nvidia-gpu.c I2C MUXES -M: Peter Rosin +M: Peter Rosin L: linux-i2c@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/i2c/i2c-arb* @@ -12447,7 +12456,7 @@ F: drivers/iio/industrialio-backend.c F: include/linux/iio/backend.h IIO DIGITAL POTENTIOMETER DAC -M: Peter Rosin +M: Peter Rosin L: linux-iio@vger.kernel.org S: Maintained F: Documentation/ABI/testing/sysfs-bus-iio-dac-dpot-dac @@ -12455,7 +12464,7 @@ F: Documentation/devicetree/bindings/iio/dac/dpot-dac.yaml F: drivers/iio/dac/dpot-dac.c IIO ENVELOPE DETECTOR -M: Peter Rosin +M: Peter Rosin L: linux-iio@vger.kernel.org S: Maintained F: Documentation/ABI/testing/sysfs-bus-iio-adc-envelope-detector @@ -12471,7 +12480,7 @@ F: include/linux/iio/iio-gts-helper.h F: drivers/iio/test/iio-test-gts.c IIO MULTIPLEXER -M: Peter Rosin +M: Peter Rosin L: linux-iio@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/iio/multiplexer/io-channel-mux.yaml @@ -12502,7 +12511,7 @@ F: include/linux/iio/ F: tools/iio/ IIO UNIT CONVERTER -M: Peter Rosin +M: Peter Rosin L: linux-iio@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/iio/afe/current-sense-amplifier.yaml @@ -14052,6 +14061,7 @@ KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64) M: Marc Zyngier M: Oliver Upton R: Joey Gouly +R: Steffen Eiden R: Suzuki K Poulose R: Zenghui Yu L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) @@ -15718,7 +15728,7 @@ F: Documentation/devicetree/bindings/media/i2c/maxim,max96717.yaml F: drivers/media/i2c/max96717.c MAX9860 MONO AUDIO VOICE CODEC DRIVER -M: Peter Rosin +M: Peter Rosin L: linux-sound@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/sound/max9860.txt @@ -15933,7 +15943,7 @@ F: Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml F: drivers/net/can/spi/mcp251xfd/ MCP4018 AND MCP4531 MICROCHIP DIGITAL POTENTIOMETER DRIVERS -M: Peter Rosin +M: Peter Rosin L: linux-iio@vger.kernel.org S: Maintained F: Documentation/ABI/testing/sysfs-bus-iio-potentiometer-mcp4531 @@ -18238,7 +18248,7 @@ F: include/linux/mmc/ F: include/uapi/linux/mmc/ MULTIPLEXER SUBSYSTEM -M: Peter Rosin +M: Peter Rosin S: Odd Fixes F: Documentation/ABI/testing/sysfs-class-mux* F: Documentation/devicetree/bindings/mux/ @@ -19347,7 +19357,7 @@ F: include/dt-bindings/display/tda998x.h K: "nxp,tda998x" NXP TFA9879 DRIVER -M: Peter Rosin +M: Peter Rosin L: linux-sound@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/sound/trivial-codec.yaml @@ -19445,7 +19455,6 @@ F: include/misc/ocxl* F: include/uapi/misc/ocxl.h OMAP AUDIO SUPPORT -M: Peter Ujfalusi M: Jarkko Nikula L: linux-sound@vger.kernel.org L: linux-omap@vger.kernel.org @@ -20348,13 +20357,14 @@ F: Documentation/devicetree/bindings/pci/marvell,armada8k-pcie.yaml F: drivers/pci/controller/dwc/pcie-armada8k.c PCI DRIVER FOR CADENCE PCIE IP +R: Aksh Garg L: linux-pci@vger.kernel.org S: Orphan F: Documentation/devicetree/bindings/pci/cdns,* -F: drivers/pci/controller/cadence/*cadence* +F: drivers/pci/controller/cadence/ PCI DRIVER FOR CIX Sky1 -M: Hans Zhang +M: Hans Zhang <18255117159@163.com> L: linux-pci@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/pci/cix,sky1-pcie-*.yaml @@ -20466,7 +20476,7 @@ F: drivers/pci/controller/plda/pcie-plda-host.c F: drivers/pci/controller/plda/pcie-plda.h PCI DRIVER FOR RENESAS R-CAR -M: Marek Vasut +M: Marek Vasut M: Yoshihiro Shimoda L: linux-pci@vger.kernel.org L: linux-renesas-soc@vger.kernel.org @@ -22940,7 +22950,7 @@ N: riscv K: riscv RISC-V IOMMU -M: Tomasz Jeznach +M: Tomasz Jeznach L: iommu@lists.linux.dev L: linux-riscv@lists.infradead.org S: Maintained @@ -24650,6 +24660,7 @@ S: Maintained F: fs/smb/client/smbdirect.* F: fs/smb/smbdirect/ F: fs/smb/server/transport_rdma.* +F: include/linux/smbdirect.h SMC91x ETHERNET DRIVER M: Nicolas Pitre @@ -25020,6 +25031,13 @@ F: include/sound/dmaengine_pcm.h F: sound/core/pcm_dmaengine.c F: sound/soc/soc-generic-dmaengine-pcm.c +SOUND - SOC LAYER / GPIO AUDIO AMPLIFIER +M: Herve Codina +L: linux-sound@vger.kernel.org +S: Maintained +F: Documentation/devicetree/bindings/sound/gpio-audio-amp.yaml +F: sound/soc/codecs/simple-amplifier.c + SOUND - SOC LAYER / DYNAMIC AUDIO POWER MANAGEMENT (ASoC) M: Liam Girdwood M: Mark Brown @@ -26346,7 +26364,7 @@ F: arch/xtensa/ F: drivers/irqchip/irq-xtensa-* TEXAS INSTRUMENTS ASoC DRIVERS -M: Peter Ujfalusi +M: Sen Wang L: linux-sound@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/sound/davinci-mcasp-audio.yaml @@ -26852,12 +26870,6 @@ S: Maintained F: Documentation/devicetree/bindings/iio/adc/ti,tsc2046.yaml F: drivers/iio/adc/ti-tsc2046.c -TI TWL4030 SERIES SOC CODEC DRIVER -M: Peter Ujfalusi -L: linux-sound@vger.kernel.org -S: Maintained -F: sound/soc/codecs/twl4030* - TI VPE/CAL DRIVERS M: Yemike Abhilash Chandra L: linux-media@vger.kernel.org diff --git a/Makefile b/Makefile index 9f88dcaae38271..9f59598d3a085a 100644 --- a/Makefile +++ b/Makefile @@ -2,7 +2,7 @@ VERSION = 7 PATCHLEVEL = 1 SUBLEVEL = 0 -EXTRAVERSION = -rc2 +EXTRAVERSION = -rc4 NAME = Baby Opossum Posse # *DOCUMENTATION* @@ -486,6 +486,8 @@ export rust_common_flags := --edition=2021 \ -Wclippy::as_ptr_cast_mut \ -Wclippy::as_underscore \ -Wclippy::cast_lossless \ + -Aclippy::collapsible_if \ + -Aclippy::collapsible_match \ -Wclippy::ignored_unit_patterns \ -Aclippy::incompatible_msrv \ -Wclippy::mut_mut \ diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index 091544e6af442e..dc2957662ff204 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -23,6 +23,7 @@ static inline u64 tcr_el2_ps_to_tcr_el1_ips(u64 tcr_el2) static inline u64 translate_tcr_el2_to_tcr_el1(u64 tcr) { return TCR_EPD1_MASK | /* disable TTBR1_EL1 */ + ((tcr & TCR_EL2_DS) ? TCR_DS : 0) | ((tcr & TCR_EL2_TBI) ? TCR_TBI0 : 0) | tcr_el2_ps_to_tcr_el1_ips(tcr) | (tcr & TCR_EL2_TG0_MASK) | diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 736561480f3650..7aa08d59d49448 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -844,7 +844,7 @@ #define INIT_SCTLR_EL2_MMU_ON \ (SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_SA | SCTLR_ELx_I | \ SCTLR_ELx_IESB | SCTLR_ELx_WXN | ENDIAN_SET_EL2 | \ - SCTLR_ELx_ITFSB | SCTLR_EL2_RES1) + SCTLR_ELx_ITFSB | SCTLR_ELx_EIS | SCTLR_ELx_EOS | SCTLR_EL2_RES1) #define INIT_SCTLR_EL2_MMU_OFF \ (SCTLR_EL2_RES1 | ENDIAN_SET_EL2) diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index cb54335465f667..c7a23f7c221220 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -62,6 +62,13 @@ static void noinstr arm64_exit_to_kernel_mode(struct pt_regs *regs, irqentry_exit_to_kernel_mode_after_preempt(regs, state); } +static __always_inline void arm64_syscall_enter_from_user_mode(struct pt_regs *regs) +{ + enter_from_user_mode(regs); + mte_disable_tco_entry(current); + sme_enter_from_user_mode(); +} + /* * Handle IRQ/context state management when entering from user mode. * Before this function is called it is not safe to call regular kernel code, @@ -70,20 +77,30 @@ static void noinstr arm64_exit_to_kernel_mode(struct pt_regs *regs, static __always_inline void arm64_enter_from_user_mode(struct pt_regs *regs) { enter_from_user_mode(regs); + rseq_note_user_irq_entry(); mte_disable_tco_entry(current); sme_enter_from_user_mode(); } +static __always_inline void arm64_syscall_exit_to_user_mode(struct pt_regs *regs) +{ + local_irq_disable(); + syscall_exit_to_user_mode_prepare(regs); + local_daif_mask(); + sme_exit_to_user_mode(); + mte_check_tfsr_exit(); + exit_to_user_mode(); +} + /* * Handle IRQ/context state management when exiting to user mode. * After this function returns it is not safe to call regular kernel code, * instrumentable code, or any code which may trigger an exception. */ - static __always_inline void arm64_exit_to_user_mode(struct pt_regs *regs) { local_irq_disable(); - exit_to_user_mode_prepare_legacy(regs); + irqentry_exit_to_user_mode_prepare(regs); local_daif_mask(); sme_exit_to_user_mode(); mte_check_tfsr_exit(); @@ -92,7 +109,7 @@ static __always_inline void arm64_exit_to_user_mode(struct pt_regs *regs) asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs) { - arm64_exit_to_user_mode(regs); + arm64_syscall_exit_to_user_mode(regs); } /* @@ -716,12 +733,12 @@ static void noinstr el0_brk64(struct pt_regs *regs, unsigned long esr) static void noinstr el0_svc(struct pt_regs *regs) { - arm64_enter_from_user_mode(regs); + arm64_syscall_enter_from_user_mode(regs); cortex_a76_erratum_1463225_svc_handler(); fpsimd_syscall_enter(); local_daif_restore(DAIF_PROCCTX); do_el0_svc(regs); - arm64_exit_to_user_mode(regs); + arm64_syscall_exit_to_user_mode(regs); fpsimd_syscall_exit(); } @@ -868,11 +885,11 @@ static void noinstr el0_cp15(struct pt_regs *regs, unsigned long esr) static void noinstr el0_svc_compat(struct pt_regs *regs) { - arm64_enter_from_user_mode(regs); + arm64_syscall_enter_from_user_mode(regs); cortex_a76_erratum_1463225_svc_handler(); local_daif_restore(DAIF_PROCCTX); do_el0_svc_compat(regs); - arm64_exit_to_user_mode(regs); + arm64_syscall_exit_to_user_mode(regs); } static void noinstr el0_bkpt32(struct pt_regs *regs, unsigned long esr) diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c index ba5eab23fd9008..4d08598e2891d3 100644 --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -983,8 +983,8 @@ static int sve_set_common(struct task_struct *target, } /* Always zero V regs, FPSR, and FPCR */ - memset(¤t->thread.uw.fpsimd_state, 0, - sizeof(current->thread.uw.fpsimd_state)); + memset(&target->thread.uw.fpsimd_state, 0, + sizeof(target->thread.uw.fpsimd_state)); /* Registers: FPSIMD-only case */ diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 8bb2c7422cc8b0..34c9950884d5e6 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -4,6 +4,7 @@ * Author: Christoffer Dall */ +#include #include #include #include @@ -2638,6 +2639,22 @@ static int init_pkvm_host_sve_state(void) return 0; } +static int pkvm_check_sme_dvmsync_fw_call(void) +{ + struct arm_smccc_res res; + + if (!cpus_have_final_cap(ARM64_WORKAROUND_4193714)) + return 0; + + arm_smccc_1_1_smc(ARM_SMCCC_CPU_WORKAROUND_4193714, &res); + if (res.a0) { + kvm_err("pKVM requires firmware support for C1-Pro erratum 4193714\n"); + return -ENODEV; + } + + return 0; +} + /* * Finalizes the initialization of hyp mode, once everything else is initialized * and the initialziation process cannot fail. @@ -2838,6 +2855,10 @@ static int __init init_hyp_mode(void) if (err) goto out_err; + err = pkvm_check_sme_dvmsync_fw_call(); + if (err) + goto out_err; + err = kvm_hyp_init_protection(hyp_va_bits); if (err) { kvm_err("Failed to init hyp memory protection\n"); diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h index 98b2976837b116..bf0eb5e434274a 100644 --- a/arch/arm64/kvm/hyp/include/hyp/switch.h +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h @@ -245,7 +245,7 @@ static inline void __activate_traps_ich_hfgxtr(struct kvm_vcpu *vcpu) __activate_fgt(hctxt, vcpu, ICH_HFGITR_EL2); } -#define __deactivate_fgt(htcxt, vcpu, reg) \ +#define __deactivate_fgt(hctxt, vcpu, reg) \ do { \ write_sysreg_s(ctxt_sys_reg(hctxt, reg), \ SYS_ ## reg); \ diff --git a/arch/arm64/kvm/hyp/nvhe/clock.c b/arch/arm64/kvm/hyp/nvhe/clock.c index 32fc4313fe432a..a7fc61976fd0dc 100644 --- a/arch/arm64/kvm/hyp/nvhe/clock.c +++ b/arch/arm64/kvm/hyp/nvhe/clock.c @@ -35,6 +35,9 @@ void trace_clock_update(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc) struct clock_data *clock = &trace_clock_data; u64 bank = clock->cur ^ 1; + if (!mult || shift >= 64) + return; + clock->data[bank].mult = mult; clock->data[bank].shift = shift; clock->data[bank].epoch_ns = epoch_ns; diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 28a471d1927cd5..25f04629014e61 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -5,6 +5,7 @@ */ #include + #include #include #include @@ -14,6 +15,7 @@ #include +#include #include #include #include @@ -29,6 +31,19 @@ static struct hyp_pool host_s2_pool; static DEFINE_PER_CPU(struct pkvm_hyp_vm *, __current_vm); #define current_vm (*this_cpu_ptr(&__current_vm)) +static void pkvm_sme_dvmsync_fw_call(void) +{ + if (alternative_has_cap_unlikely(ARM64_WORKAROUND_4193714)) { + struct arm_smccc_res res; + + /* + * Ignore the return value. Probing for the workaround + * availability took place in init_hyp_mode(). + */ + hyp_smccc_1_1_smc(ARM_SMCCC_CPU_WORKAROUND_4193714, &res); + } +} + static void guest_lock_component(struct pkvm_hyp_vm *vm) { hyp_spin_lock(&vm->lock); @@ -574,8 +589,14 @@ static int host_stage2_set_owner_metadata_locked(phys_addr_t addr, u64 size, ret = host_stage2_try(kvm_pgtable_stage2_annotate, &host_mmu.pgt, addr, size, &host_s2_pool, KVM_HOST_INVALID_PTE_TYPE_DONATION, annotation); - if (!ret) + if (!ret) { + /* + * After stage2 maintenance has happened, but before the page + * owner has changed. + */ + pkvm_sme_dvmsync_fw_call(); __host_update_page_state(addr, size, PKVM_NOPAGE); + } return ret; } @@ -1369,6 +1390,22 @@ int __pkvm_host_reclaim_page_guest(u64 gfn, struct pkvm_hyp_vm *vm) return ret && ret != -EHWPOISON ? ret : 0; } +/* + * share/donate install at most one stage-2 leaf (PAGE_SIZE, or one + * KVM_PGTABLE_LAST_LEVEL - 1 block for share). kvm_mmu_cache_min_pages() + * bounds the worst-case allocation: exact for the PAGE_SIZE leaf, + * conservative by one for the block. + */ +static int __guest_check_pgtable_memcache(struct pkvm_hyp_vcpu *vcpu) +{ + struct pkvm_hyp_vm *vm = pkvm_hyp_vcpu_to_hyp_vm(vcpu); + + if (vcpu->vcpu.arch.pkvm_memcache.nr_pages < kvm_mmu_cache_min_pages(vm->pgt.mmu)) + return -ENOMEM; + + return 0; +} + int __pkvm_host_donate_guest(u64 pfn, u64 gfn, struct pkvm_hyp_vcpu *vcpu) { struct pkvm_hyp_vm *vm = pkvm_hyp_vcpu_to_hyp_vm(vcpu); @@ -1388,6 +1425,10 @@ int __pkvm_host_donate_guest(u64 pfn, u64 gfn, struct pkvm_hyp_vcpu *vcpu) if (ret) goto unlock; + ret = __guest_check_pgtable_memcache(vcpu); + if (ret) + goto unlock; + meta = host_stage2_encode_gfn_meta(vm, gfn); WARN_ON(host_stage2_set_owner_metadata_locked(phys, PAGE_SIZE, PKVM_ID_GUEST, meta)); @@ -1453,6 +1494,10 @@ int __pkvm_host_share_guest(u64 pfn, u64 gfn, u64 nr_pages, struct pkvm_hyp_vcpu } } + ret = __guest_check_pgtable_memcache(vcpu); + if (ret) + goto unlock; + for_each_hyp_page(page, phys, size) { set_host_state(page, PKVM_PAGE_SHARED_OWNED); page->host_share_guest_count++; diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c index e7496eb8562897..eb1c10120f9f58 100644 --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c @@ -752,16 +752,30 @@ static struct pkvm_hyp_vcpu selftest_vcpu = { struct pkvm_hyp_vcpu *init_selftest_vm(void *virt) { struct hyp_page *p = hyp_virt_to_page(virt); + unsigned long min_pages, seeded = 0; int i; selftest_vm.kvm.arch.mmu.vtcr = host_mmu.arch.mmu.vtcr; WARN_ON(kvm_guest_prepare_stage2(&selftest_vm, virt)); + /* + * Mirror pkvm_refill_memcache() for the share/donate pre-checks; + * the selftest invokes those functions directly and would + * otherwise see an empty memcache. + */ + min_pages = kvm_mmu_cache_min_pages(&selftest_vm.kvm.arch.mmu); + for (i = 0; i < pkvm_selftest_pages(); i++) { if (p[i].refcount) continue; p[i].refcount = 1; - hyp_put_page(&selftest_vm.pool, hyp_page_to_virt(&p[i])); + if (seeded < min_pages) { + push_hyp_memcache(&selftest_vcpu.vcpu.arch.pkvm_memcache, + hyp_page_to_virt(&p[i]), hyp_virt_to_phys); + seeded++; + } else { + hyp_put_page(&selftest_vm.pool, hyp_page_to_virt(&p[i])); + } } selftest_vm.kvm.arch.pkvm.handle = __pkvm_reserve_vm(); diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index 9db3f11a4754d6..1e8995add14fa3 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -663,7 +663,8 @@ static void __noreturn __hyp_call_panic(u64 spsr, u64 elr, u64 par) host_ctxt = host_data_ptr(host_ctxt); vcpu = host_ctxt->__hyp_running_vcpu; - __deactivate_traps(vcpu); + if (vcpu) + __deactivate_traps(vcpu); sysreg_restore_host_state_vhe(host_ctxt); panic("HYP panic:\nPS:%08llx PC:%016llx ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n", diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index d089c107d9b711..4da9281312eb8f 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1576,21 +1576,24 @@ struct kvm_s2_fault_desc { static int gmem_abort(const struct kvm_s2_fault_desc *s2fd) { bool write_fault, exec_fault; + bool perm_fault = kvm_vcpu_trap_is_permission_fault(s2fd->vcpu); enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_SHARED; enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R; struct kvm_pgtable *pgt = s2fd->vcpu->arch.hw_mmu->pgt; unsigned long mmu_seq; struct page *page; struct kvm *kvm = s2fd->vcpu->kvm; - void *memcache; + void *memcache = NULL; kvm_pfn_t pfn; gfn_t gfn; int ret; - memcache = get_mmu_memcache(s2fd->vcpu); - ret = topup_mmu_memcache(s2fd->vcpu, memcache); - if (ret) - return ret; + if (!perm_fault) { + memcache = get_mmu_memcache(s2fd->vcpu); + ret = topup_mmu_memcache(s2fd->vcpu, memcache); + if (ret) + return ret; + } if (s2fd->nested) gfn = kvm_s2_trans_output(s2fd->nested) >> PAGE_SHIFT; @@ -1631,9 +1634,19 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd) goto out_unlock; } - ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, s2fd->fault_ipa, PAGE_SIZE, - __pfn_to_phys(pfn), prot, - memcache, flags); + if (perm_fault) { + /* + * Drop the SW bits in favour of those stored in the + * PTE, which will be preserved. + */ + prot &= ~KVM_NV_GUEST_MAP_SZ; + ret = KVM_PGT_FN(kvm_pgtable_stage2_relax_perms)(pgt, s2fd->fault_ipa, + prot, flags); + } else { + ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, s2fd->fault_ipa, PAGE_SIZE, + __pfn_to_phys(pfn), prot, + memcache, flags); + } out_unlock: kvm_release_faultin_page(kvm, page, !!ret, prot & KVM_PGTABLE_PROT_W); diff --git a/arch/loongarch/Kbuild b/arch/loongarch/Kbuild index beb8499dd8ed84..1c7a0dbe5e72f2 100644 --- a/arch/loongarch/Kbuild +++ b/arch/loongarch/Kbuild @@ -3,7 +3,7 @@ obj-y += mm/ obj-y += net/ obj-y += vdso/ -obj-$(CONFIG_KVM) += kvm/ +obj-$(subst m,y,$(CONFIG_KVM)) += kvm/ # for cleaning subdir- += boot diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig index 3b042dbb2c4123..606597da46b8df 100644 --- a/arch/loongarch/Kconfig +++ b/arch/loongarch/Kconfig @@ -220,6 +220,7 @@ menu "Kernel type and options" choice prompt "Kernel type" + default 64BIT # Keep existing behavior config 32BIT bool "32-bit kernel" diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile index 47516aeea9d29f..54fcfa1eac1f89 100644 --- a/arch/loongarch/Makefile +++ b/arch/loongarch/Makefile @@ -55,9 +55,11 @@ endif ifdef CONFIG_32BIT tool-archpref = $(32bit-tool-archpref) UTS_MACHINE := loongarch32 +cflags-y += $(call cc-option,-m32) else tool-archpref = $(64bit-tool-archpref) UTS_MACHINE := loongarch64 +cflags-y += $(call cc-option,-m64) endif ifneq ($(SUBARCH),$(ARCH)) diff --git a/arch/loongarch/include/asm/asm-prototypes.h b/arch/loongarch/include/asm/asm-prototypes.h index 704066b4f7368b..de0c17f3f49c2c 100644 --- a/arch/loongarch/include/asm/asm-prototypes.h +++ b/arch/loongarch/include/asm/asm-prototypes.h @@ -20,3 +20,23 @@ asmlinkage void noinstr __no_stack_protector ret_from_kernel_thread(struct task_ struct pt_regs *regs, int (*fn)(void *), void *fn_arg); + +struct kvm_run; +struct kvm_vcpu; +struct loongarch_fpu; + +void kvm_exc_entry(void); +int kvm_enter_guest(struct kvm_run *run, struct kvm_vcpu *vcpu); + +void kvm_save_fpu(struct loongarch_fpu *fpu); +void kvm_restore_fpu(struct loongarch_fpu *fpu); + +#ifdef CONFIG_CPU_HAS_LSX +void kvm_save_lsx(struct loongarch_fpu *fpu); +void kvm_restore_lsx(struct loongarch_fpu *fpu); +#endif + +#ifdef CONFIG_CPU_HAS_LASX +void kvm_save_lasx(struct loongarch_fpu *fpu); +void kvm_restore_lasx(struct loongarch_fpu *fpu); +#endif diff --git a/arch/loongarch/include/asm/kvm_host.h b/arch/loongarch/include/asm/kvm_host.h index 130cedbb6b39b3..776bc487a70527 100644 --- a/arch/loongarch/include/asm/kvm_host.h +++ b/arch/loongarch/include/asm/kvm_host.h @@ -87,7 +87,6 @@ struct kvm_context { struct kvm_world_switch { int (*exc_entry)(void); int (*enter_guest)(struct kvm_run *run, struct kvm_vcpu *vcpu); - unsigned long page_order; }; #define MAX_PGTABLE_LEVELS 4 @@ -359,8 +358,6 @@ void kvm_exc_entry(void); int kvm_enter_guest(struct kvm_run *run, struct kvm_vcpu *vcpu); extern unsigned long vpid_mask; -extern const unsigned long kvm_exception_size; -extern const unsigned long kvm_enter_guest_size; extern struct kvm_world_switch *kvm_loongarch_ops; #define SW_GCSR (1 << 0) diff --git a/arch/loongarch/include/asm/linkage.h b/arch/loongarch/include/asm/linkage.h index a1bd6a3ee03a19..ae937d1708b247 100644 --- a/arch/loongarch/include/asm/linkage.h +++ b/arch/loongarch/include/asm/linkage.h @@ -69,7 +69,7 @@ 9, 10, 11, 12, 13, 14, 15, 16, \ 17, 18, 19, 20, 21, 22, 23, 24, \ 25, 26, 27, 28, 29, 30, 31; \ - .cfi_offset \num, SC_REGS + \num * SZREG; \ + .cfi_offset \num, SC_REGS + \num * 8; \ .endr; \ \ nop; \ diff --git a/arch/loongarch/include/asm/vdso/gettimeofday.h b/arch/loongarch/include/asm/vdso/gettimeofday.h index bae76767c693a9..18ba403e1ed944 100644 --- a/arch/loongarch/include/asm/vdso/gettimeofday.h +++ b/arch/loongarch/include/asm/vdso/gettimeofday.h @@ -85,12 +85,6 @@ static __always_inline u64 __arch_get_hw_counter(s32 clock_mode, return count; } -static inline bool loongarch_vdso_hres_capable(void) -{ - return true; -} -#define __arch_vdso_hres_capable loongarch_vdso_hres_capable - #endif /* CONFIG_GENERIC_GETTIMEOFDAY */ #endif /* !__ASSEMBLER__ */ diff --git a/arch/loongarch/kvm/Makefile b/arch/loongarch/kvm/Makefile index ae469edec99c16..a4d044da3aa7c2 100644 --- a/arch/loongarch/kvm/Makefile +++ b/arch/loongarch/kvm/Makefile @@ -7,11 +7,12 @@ include $(srctree)/virt/kvm/Makefile.kvm obj-$(CONFIG_KVM) += kvm.o +obj-y += switch.o + kvm-y += exit.o kvm-y += interrupt.o kvm-y += main.o kvm-y += mmu.o -kvm-y += switch.o kvm-y += timer.o kvm-y += tlb.o kvm-y += vcpu.o diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c index da0ad89f2eb746..3b95cd0f989b08 100644 --- a/arch/loongarch/kvm/exit.c +++ b/arch/loongarch/kvm/exit.c @@ -390,6 +390,7 @@ int kvm_emu_mmio_read(struct kvm_vcpu *vcpu, larch_inst inst) run->mmio.len = 8; break; default: + ret = EMULATE_FAIL; break; } break; diff --git a/arch/loongarch/kvm/interrupt.c b/arch/loongarch/kvm/interrupt.c index 32930959f7c256..a18c60dffbbaea 100644 --- a/arch/loongarch/kvm/interrupt.c +++ b/arch/loongarch/kvm/interrupt.c @@ -28,23 +28,29 @@ static unsigned int priority_to_irq[EXCCODE_INT_NUM] = { static int kvm_irq_deliver(struct kvm_vcpu *vcpu, unsigned int priority) { unsigned int irq = 0; + unsigned long old, new; clear_bit(priority, &vcpu->arch.irq_pending); if (priority < EXCCODE_INT_NUM) irq = priority_to_irq[priority]; - if (kvm_guest_has_msgint(&vcpu->arch) && (priority == INT_AVEC)) { - dmsintc_inject_irq(vcpu); - set_gcsr_estat(irq); - return 1; - } - switch (priority) { + case INT_AVEC: + if (!kvm_guest_has_msgint(&vcpu->arch)) + break; + dmsintc_inject_irq(vcpu); + fallthrough; case INT_TI: case INT_IPI: case INT_SWI0: case INT_SWI1: + old = kvm_read_hw_gcsr(LOONGARCH_CSR_TVAL); set_gcsr_estat(irq); + new = kvm_read_hw_gcsr(LOONGARCH_CSR_TVAL); + + /* Inject TI if TVAL inverted */ + if (new > old) + set_gcsr_estat(CPU_TIMER); break; case INT_HWI0 ... INT_HWI7: @@ -61,22 +67,28 @@ static int kvm_irq_deliver(struct kvm_vcpu *vcpu, unsigned int priority) static int kvm_irq_clear(struct kvm_vcpu *vcpu, unsigned int priority) { unsigned int irq = 0; + unsigned long old, new; clear_bit(priority, &vcpu->arch.irq_clear); if (priority < EXCCODE_INT_NUM) irq = priority_to_irq[priority]; - if (kvm_guest_has_msgint(&vcpu->arch) && (priority == INT_AVEC)) { - clear_gcsr_estat(irq); - return 1; - } - switch (priority) { + case INT_AVEC: + if (!kvm_guest_has_msgint(&vcpu->arch)) + break; + fallthrough; case INT_TI: case INT_IPI: case INT_SWI0: case INT_SWI1: + old = kvm_read_hw_gcsr(LOONGARCH_CSR_TVAL); clear_gcsr_estat(irq); + new = kvm_read_hw_gcsr(LOONGARCH_CSR_TVAL); + + /* Inject TI if TVAL inverted */ + if (new > old) + set_gcsr_estat(CPU_TIMER); break; case INT_HWI0 ... INT_HWI7: diff --git a/arch/loongarch/kvm/main.c b/arch/loongarch/kvm/main.c index 76ebff2faeddc9..f105a86143f5ba 100644 --- a/arch/loongarch/kvm/main.c +++ b/arch/loongarch/kvm/main.c @@ -348,8 +348,7 @@ void kvm_arch_disable_virtualization_cpu(void) static int kvm_loongarch_env_init(void) { - int cpu, order, ret; - void *addr; + int cpu, ret; struct kvm_context *context; vmcs = alloc_percpu(struct kvm_context); @@ -365,30 +364,8 @@ static int kvm_loongarch_env_init(void) return -ENOMEM; } - /* - * PGD register is shared between root kernel and kvm hypervisor. - * So world switch entry should be in DMW area rather than TLB area - * to avoid page fault reenter. - * - * In future if hardware pagetable walking is supported, we won't - * need to copy world switch code to DMW area. - */ - order = get_order(kvm_exception_size + kvm_enter_guest_size); - addr = (void *)__get_free_pages(GFP_KERNEL, order); - if (!addr) { - free_percpu(vmcs); - vmcs = NULL; - kfree(kvm_loongarch_ops); - kvm_loongarch_ops = NULL; - return -ENOMEM; - } - - memcpy(addr, kvm_exc_entry, kvm_exception_size); - memcpy(addr + kvm_exception_size, kvm_enter_guest, kvm_enter_guest_size); - flush_icache_range((unsigned long)addr, (unsigned long)addr + kvm_exception_size + kvm_enter_guest_size); - kvm_loongarch_ops->exc_entry = addr; - kvm_loongarch_ops->enter_guest = addr + kvm_exception_size; - kvm_loongarch_ops->page_order = order; + kvm_loongarch_ops->exc_entry = (void *)kvm_exc_entry; + kvm_loongarch_ops->enter_guest = (void *)kvm_enter_guest; vpid_mask = read_csr_gstat(); vpid_mask = (vpid_mask & CSR_GSTAT_GIDBIT) >> CSR_GSTAT_GIDBIT_SHIFT; @@ -428,16 +405,10 @@ static int kvm_loongarch_env_init(void) static void kvm_loongarch_env_exit(void) { - unsigned long addr; - if (vmcs) free_percpu(vmcs); if (kvm_loongarch_ops) { - if (kvm_loongarch_ops->exc_entry) { - addr = (unsigned long)kvm_loongarch_ops->exc_entry; - free_pages(addr, kvm_loongarch_ops->page_order); - } kfree(kvm_loongarch_ops); } diff --git a/arch/loongarch/kvm/mmu.c b/arch/loongarch/kvm/mmu.c index a7fa458e33605e..e104897aa53285 100644 --- a/arch/loongarch/kvm/mmu.c +++ b/arch/loongarch/kvm/mmu.c @@ -95,7 +95,7 @@ static int kvm_flush_pte(kvm_pte_t *pte, phys_addr_t addr, kvm_ptw_ctx *ctx) else kvm->stat.pages--; - *pte = ctx->invalid_entry; + kvm_set_pte(pte, ctx->invalid_entry); return 1; } diff --git a/arch/loongarch/kvm/switch.S b/arch/loongarch/kvm/switch.S index f1768b7a619497..936e4ae3e40859 100644 --- a/arch/loongarch/kvm/switch.S +++ b/arch/loongarch/kvm/switch.S @@ -4,9 +4,11 @@ */ #include +#include #include #include #include +#include #include #include @@ -100,11 +102,16 @@ * - is still in guest mode, such as pgd table/vmid registers etc, * - will fix with hw page walk enabled in future * load kvm_vcpu from reserved CSR KVM_VCPU_KS, and save a2 to KVM_TEMP_KS + * + * PGD register is shared between root kernel and kvm hypervisor. + * So world switch entry should be in DMW area rather than TLB area + * to avoid page fault re-enter. */ .text + .p2align PAGE_SHIFT .cfi_sections .debug_frame SYM_CODE_START(kvm_exc_entry) - UNWIND_HINT_UNDEFINED + UNWIND_HINT_END_OF_STACK csrwr a2, KVM_TEMP_KS csrrd a2, KVM_VCPU_KS addi.d a2, a2, KVM_VCPU_ARCH @@ -190,8 +197,8 @@ ret_to_host: kvm_restore_host_gpr a2 jr ra -SYM_INNER_LABEL(kvm_exc_entry_end, SYM_L_LOCAL) SYM_CODE_END(kvm_exc_entry) +EXPORT_SYMBOL_FOR_KVM(kvm_exc_entry) /* * int kvm_enter_guest(struct kvm_run *run, struct kvm_vcpu *vcpu) @@ -215,8 +222,8 @@ SYM_FUNC_START(kvm_enter_guest) /* Save kvm_vcpu to kscratch */ csrwr a1, KVM_VCPU_KS kvm_switch_to_guest -SYM_INNER_LABEL(kvm_enter_guest_end, SYM_L_LOCAL) SYM_FUNC_END(kvm_enter_guest) +EXPORT_SYMBOL_FOR_KVM(kvm_enter_guest) SYM_FUNC_START(kvm_save_fpu) fpu_save_csr a0 t1 @@ -224,6 +231,7 @@ SYM_FUNC_START(kvm_save_fpu) fpu_save_cc a0 t1 t2 jr ra SYM_FUNC_END(kvm_save_fpu) +EXPORT_SYMBOL_FOR_KVM(kvm_save_fpu) SYM_FUNC_START(kvm_restore_fpu) fpu_restore_double a0 t1 @@ -231,6 +239,7 @@ SYM_FUNC_START(kvm_restore_fpu) fpu_restore_cc a0 t1 t2 jr ra SYM_FUNC_END(kvm_restore_fpu) +EXPORT_SYMBOL_FOR_KVM(kvm_restore_fpu) #ifdef CONFIG_CPU_HAS_LSX SYM_FUNC_START(kvm_save_lsx) @@ -239,6 +248,7 @@ SYM_FUNC_START(kvm_save_lsx) lsx_save_data a0 t1 jr ra SYM_FUNC_END(kvm_save_lsx) +EXPORT_SYMBOL_FOR_KVM(kvm_save_lsx) SYM_FUNC_START(kvm_restore_lsx) lsx_restore_data a0 t1 @@ -246,6 +256,7 @@ SYM_FUNC_START(kvm_restore_lsx) fpu_restore_csr a0 t1 t2 jr ra SYM_FUNC_END(kvm_restore_lsx) +EXPORT_SYMBOL_FOR_KVM(kvm_restore_lsx) #endif #ifdef CONFIG_CPU_HAS_LASX @@ -255,6 +266,7 @@ SYM_FUNC_START(kvm_save_lasx) lasx_save_data a0 t1 jr ra SYM_FUNC_END(kvm_save_lasx) +EXPORT_SYMBOL_FOR_KVM(kvm_save_lasx) SYM_FUNC_START(kvm_restore_lasx) lasx_restore_data a0 t1 @@ -262,10 +274,8 @@ SYM_FUNC_START(kvm_restore_lasx) fpu_restore_csr a0 t1 t2 jr ra SYM_FUNC_END(kvm_restore_lasx) +EXPORT_SYMBOL_FOR_KVM(kvm_restore_lasx) #endif - .section ".rodata" -SYM_DATA(kvm_exception_size, .quad kvm_exc_entry_end - kvm_exc_entry) -SYM_DATA(kvm_enter_guest_size, .quad kvm_enter_guest_end - kvm_enter_guest) #ifdef CONFIG_CPU_HAS_LBT STACK_FRAME_NON_STANDARD kvm_restore_fpu diff --git a/arch/loongarch/kvm/timer.c b/arch/loongarch/kvm/timer.c index 29c2aaba63c33b..8356fce0043f60 100644 --- a/arch/loongarch/kvm/timer.c +++ b/arch/loongarch/kvm/timer.c @@ -96,15 +96,21 @@ void kvm_restore_timer(struct kvm_vcpu *vcpu) * and set CSR TVAL with -1 */ write_gcsr_timertick(0); - __delay(2); /* Wait cycles until timer interrupt injected */ /* * Writing CSR_TINTCLR_TI to LOONGARCH_CSR_TINTCLR will clear * timer interrupt, and CSR TVAL keeps unchanged with -1, it * avoids spurious timer interrupt */ - if (!(estat & CPU_TIMER)) + if (!(estat & CPU_TIMER)) { + __delay(2); /* Wait cycles until timer interrupt injected */ + + /* Write TVAL with max value if no TI shot */ + estat = kvm_read_hw_gcsr(LOONGARCH_CSR_ESTAT); + if (!(estat & CPU_TIMER)) + write_gcsr_timertick(CSR_TCFG_VAL); gcsr_write(CSR_TINTCLR_TI, LOONGARCH_CSR_TINTCLR); + } return; } diff --git a/arch/loongarch/kvm/vm.c b/arch/loongarch/kvm/vm.c index 8cc5ee1c53efbe..1317c718f896af 100644 --- a/arch/loongarch/kvm/vm.c +++ b/arch/loongarch/kvm/vm.c @@ -125,7 +125,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) r = 1; break; case KVM_CAP_NR_VCPUS: - r = num_online_cpus(); + r = min_t(unsigned int, num_online_cpus(), KVM_MAX_VCPUS); break; case KVM_CAP_MAX_VCPUS: r = KVM_MAX_VCPUS; diff --git a/arch/loongarch/pci/acpi.c b/arch/loongarch/pci/acpi.c index 0dde3ddcd54436..b02698a338eefe 100644 --- a/arch/loongarch/pci/acpi.c +++ b/arch/loongarch/pci/acpi.c @@ -61,11 +61,16 @@ static void acpi_release_root_info(struct acpi_pci_root_info *ci) static int acpi_prepare_root_resources(struct acpi_pci_root_info *ci) { int status; + unsigned long long pci_h = 0; struct resource_entry *entry, *tmp; struct acpi_device *device = ci->bridge; status = acpi_pci_probe_root_resources(ci); if (status > 0) { + acpi_evaluate_integer(device->handle, "PCIH", NULL, &pci_h); + if (pci_h) + return status; + resource_list_for_each_entry_safe(entry, tmp, &ci->resources) { if (entry->res->flags & IORESOURCE_MEM) { entry->offset = ci->root->mcfg_addr & GENMASK_ULL(63, 40); diff --git a/arch/loongarch/pci/pci.c b/arch/loongarch/pci/pci.c index d233ea2218fe0a..f33c7ea1443d94 100644 --- a/arch/loongarch/pci/pci.c +++ b/arch/loongarch/pci/pci.c @@ -132,6 +132,9 @@ static void loongson_gpu_fixup_dma_hang(struct pci_dev *pdev, bool on) crtc_reg = regbase; crtc_offset = 0x400; break; + default: + iounmap(regbase); + return; } for (i = 0; i < CRTC_NUM_MAX; i++, crtc_reg += crtc_offset) { diff --git a/arch/loongarch/vdso/Makefile b/arch/loongarch/vdso/Makefile index 42aa96249828c9..9c9181bb40715b 100644 --- a/arch/loongarch/vdso/Makefile +++ b/arch/loongarch/vdso/Makefile @@ -12,6 +12,8 @@ obj-vdso-$(CONFIG_GENERIC_GETTIMEOFDAY) += vgettimeofday.o ccflags-vdso := \ $(filter -I%,$(KBUILD_CFLAGS)) \ $(filter -E%,$(KBUILD_CFLAGS)) \ + $(filter -m32,$(KBUILD_CFLAGS)) \ + $(filter -m64,$(KBUILD_CFLAGS)) \ $(filter -march=%,$(KBUILD_CFLAGS)) \ $(filter -m%-float,$(KBUILD_CFLAGS)) \ $(CLANG_FLAGS) \ diff --git a/arch/parisc/Makefile b/arch/parisc/Makefile index edab2a94835255..4391783521bd9a 100644 --- a/arch/parisc/Makefile +++ b/arch/parisc/Makefile @@ -174,15 +174,21 @@ ifeq ($(KBUILD_EXTMOD),) # this hack. prepare: vdso_prepare vdso_prepare: prepare0 - $(if $(CONFIG_64BIT),$(Q)$(MAKE) \ - $(build)=arch/parisc/kernel/vdso64 include/generated/vdso64-offsets.h) - $(if $(CONFIG_PA11)$(CONFIG_COMPAT),$(Q)$(MAKE) \ +ifdef CONFIG_64BIT + $(Q)$(MAKE) $(build)=arch/parisc/kernel/vdso64 include/generated/vdso64-offsets.h + $(if $(CONFIG_COMPAT),$(Q)$(MAKE) \ $(build)=arch/parisc/kernel/vdso32 include/generated/vdso32-offsets.h) +else + $(Q)$(MAKE) $(build)=arch/parisc/kernel/vdso32 include/generated/vdso32-offsets.h +endif endif -vdso-install-$(CONFIG_PA11) += arch/parisc/kernel/vdso32/vdso32.so +ifdef CONFIG_64BIT +vdso-install-y += arch/parisc/kernel/vdso64/vdso64.so vdso-install-$(CONFIG_COMPAT) += arch/parisc/kernel/vdso32/vdso32.so -vdso-install-$(CONFIG_64BIT) += arch/parisc/kernel/vdso64/vdso64.so +else +vdso-install-y += arch/parisc/kernel/vdso32/vdso32.so +endif install: KBUILD_IMAGE := vmlinux zinstall: KBUILD_IMAGE := vmlinuz diff --git a/arch/parisc/include/asm/vdso.h b/arch/parisc/include/asm/vdso.h index 5501560f5ffe0f..e5cca3c9c8e7aa 100644 --- a/arch/parisc/include/asm/vdso.h +++ b/arch/parisc/include/asm/vdso.h @@ -6,13 +6,14 @@ #ifdef CONFIG_64BIT #include +#define VDSO64_SYMBOL(tsk, name) ((tsk)->mm->context.vdso_base + (vdso64_offset_##name)) #endif #if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT) #include -#endif - -#define VDSO64_SYMBOL(tsk, name) ((tsk)->mm->context.vdso_base + (vdso64_offset_##name)) #define VDSO32_SYMBOL(tsk, name) ((tsk)->mm->context.vdso_base + (vdso32_offset_##name)) +#else +#define VDSO32_SYMBOL(tsk, name) 0UL +#endif #endif /* __ASSEMBLER__ */ diff --git a/arch/parisc/kernel/Makefile b/arch/parisc/kernel/Makefile index 2f3441769ac543..49f937c2abbeb9 100644 --- a/arch/parisc/kernel/Makefile +++ b/arch/parisc/kernel/Makefile @@ -46,6 +46,9 @@ obj-$(CONFIG_KEXEC_FILE) += kexec_file.o # vdso obj-y += vdso.o -obj-$(CONFIG_64BIT) += vdso64/ -obj-$(CONFIG_PA11) += vdso32/ +ifdef CONFIG_64BIT +obj-y += vdso64/ obj-$(CONFIG_COMPAT) += vdso32/ +else +obj-y += vdso32/ +endif diff --git a/arch/parisc/kernel/drivers.c b/arch/parisc/kernel/drivers.c index bc47bbe3026e72..b52ad704ec8a8c 100644 --- a/arch/parisc/kernel/drivers.c +++ b/arch/parisc/kernel/drivers.c @@ -41,9 +41,7 @@ const struct dma_map_ops *hppa_dma_ops __ro_after_init; EXPORT_SYMBOL(hppa_dma_ops); -static struct device root = { - .init_name = "parisc", -}; +static struct device *root; static inline int check_dev(struct device *dev) { @@ -89,7 +87,7 @@ static int for_each_padev(int (*fn)(struct device *, void *), void * data) .obj = data, .fn = fn, }; - return device_for_each_child(&root, &recurse_data, descend_children); + return device_for_each_child(root, &recurse_data, descend_children); } /** @@ -290,7 +288,7 @@ const struct parisc_device * find_pa_parent_type(const struct parisc_device *padev, int type) { const struct device *dev = &padev->dev; - while (dev != &root) { + while (dev != root) { struct parisc_device *candidate = to_parisc_device(dev); if (candidate->id.hw_type == type) return candidate; @@ -319,7 +317,7 @@ static void get_node_path(struct device *dev, struct hardware_path *path) dev = dev->parent; } - while (dev != &root) { + while (dev != root) { if (dev_is_pci(dev)) { unsigned int devfn = to_pci_dev(dev)->devfn; path->bc[i--] = PCI_SLOT(devfn) | (PCI_FUNC(devfn)<< 5); @@ -482,7 +480,7 @@ static struct parisc_device * __init alloc_tree_node( static struct parisc_device *create_parisc_device(struct hardware_path *modpath) { int i; - struct device *parent = &root; + struct device *parent = root; for (i = 0; i < 6; i++) { if (modpath->bc[i] == -1) continue; @@ -755,7 +753,7 @@ parse_tree_node(struct device *parent, int index, struct hardware_path *modpath) struct device *hwpath_to_device(struct hardware_path *modpath) { int i; - struct device *parent = &root; + struct device *parent = root; for (i = 0; i < 6; i++) { if (modpath->bc[i] == -1) continue; @@ -880,7 +878,7 @@ void __init walk_central_bus(void) { walk_native_bus(CENTRAL_BUS_ADDR, CENTRAL_BUS_ADDR + (MAX_NATIVE_DEVICES * NATIVE_DEVICE_OFFSET), - &root); + root); } static __init void print_parisc_device(struct parisc_device *dev) @@ -907,9 +905,10 @@ void __init init_parisc_bus(void) { if (bus_register(&parisc_bus_type)) panic("Could not register PA-RISC bus type\n"); - if (device_register(&root)) + + root = root_device_register("parisc"); + if (IS_ERR(root)) panic("Could not register PA-RISC root device\n"); - get_device(&root); } static __init void qemu_header(void) diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug index f15e5920080ba5..e8718bc13eeb1b 100644 --- a/arch/powerpc/Kconfig.debug +++ b/arch/powerpc/Kconfig.debug @@ -83,11 +83,10 @@ config MSI_BITMAP_SELFTEST depends on DEBUG_KERNEL config GUEST_STATE_BUFFER_TEST - def_tristate n + def_tristate KUNIT_ALL_TESTS prompt "Enable Guest State Buffer unit tests" depends on KUNIT depends on KVM_BOOK3S_HV_POSSIBLE - default KUNIT_ALL_TESTS help The Guest State Buffer is a data format specified in the PAPR. It is by hcalls to communicate the state of L2 guests between diff --git a/arch/powerpc/configs/amigaone_defconfig b/arch/powerpc/configs/amigaone_defconfig index 69ef3dc31c4b65..7a515390646b8c 100644 --- a/arch/powerpc/configs/amigaone_defconfig +++ b/arch/powerpc/configs/amigaone_defconfig @@ -76,7 +76,6 @@ CONFIG_SERIAL_8250_CONSOLE=y # CONFIG_HW_RANDOM is not set # CONFIG_HWMON is not set CONFIG_FB=y -CONFIG_FIRMWARE_EDID=y CONFIG_FB_TILEBLITTING=y CONFIG_FB_RADEON=y CONFIG_FB_3DFX=y diff --git a/arch/powerpc/configs/chrp32_defconfig b/arch/powerpc/configs/chrp32_defconfig index b799c95480ae6d..66eae5b7e16c6f 100644 --- a/arch/powerpc/configs/chrp32_defconfig +++ b/arch/powerpc/configs/chrp32_defconfig @@ -76,7 +76,6 @@ CONFIG_SERIAL_8250_CONSOLE=y CONFIG_NVRAM=y # CONFIG_HWMON is not set CONFIG_FB=y -CONFIG_FIRMWARE_EDID=y CONFIG_FB_OF=y CONFIG_FB_MATROX=y CONFIG_FB_MATROX_MILLENIUM=y diff --git a/arch/powerpc/configs/g5_defconfig b/arch/powerpc/configs/g5_defconfig index 04bbb37f5978aa..5ca1676e6058d2 100644 --- a/arch/powerpc/configs/g5_defconfig +++ b/arch/powerpc/configs/g5_defconfig @@ -85,6 +85,8 @@ CONFIG_PMAC_SMU=y CONFIG_MAC_EMUMOUSEBTN=y CONFIG_WINDFARM=y CONFIG_WINDFARM_PM81=y +CONFIG_WINDFARM_PM72=y +CONFIG_WINDFARM_RM31=y CONFIG_WINDFARM_PM91=y CONFIG_WINDFARM_PM112=y CONFIG_WINDFARM_PM121=y @@ -121,7 +123,6 @@ CONFIG_I2C_CHARDEV=y CONFIG_AGP=m CONFIG_AGP_UNINORTH=m CONFIG_FB=y -CONFIG_FIRMWARE_EDID=y CONFIG_FB_TILEBLITTING=y CONFIG_FB_OF=y CONFIG_FB_NVIDIA=y diff --git a/arch/powerpc/configs/pasemi_defconfig b/arch/powerpc/configs/pasemi_defconfig index 8bbf51b38480ec..89bcbeb05067af 100644 --- a/arch/powerpc/configs/pasemi_defconfig +++ b/arch/powerpc/configs/pasemi_defconfig @@ -98,7 +98,6 @@ CONFIG_SENSORS_LM85=y CONFIG_SENSORS_LM90=y CONFIG_DRM=y CONFIG_DRM_RADEON=y -CONFIG_FIRMWARE_EDID=y CONFIG_FB_TILEBLITTING=y CONFIG_FB_VGA16=y CONFIG_FB_NVIDIA=y diff --git a/arch/powerpc/configs/powernv_defconfig b/arch/powerpc/configs/powernv_defconfig index cc980242023758..5d32c2767a6558 100644 --- a/arch/powerpc/configs/powernv_defconfig +++ b/arch/powerpc/configs/powernv_defconfig @@ -196,7 +196,6 @@ CONFIG_I2C_CHARDEV=y # CONFIG_PTP_1588_CLOCK is not set CONFIG_DRM=y CONFIG_DRM_AST=y -CONFIG_FIRMWARE_EDID=y CONFIG_FB_OF=y CONFIG_FB_MATROX=m CONFIG_FB_MATROX_MILLENIUM=y diff --git a/arch/powerpc/configs/ppc64_defconfig b/arch/powerpc/configs/ppc64_defconfig index 3bf518e3a573a3..6316ca4df25d39 100644 --- a/arch/powerpc/configs/ppc64_defconfig +++ b/arch/powerpc/configs/ppc64_defconfig @@ -249,7 +249,6 @@ CONFIG_I2C_CHARDEV=y CONFIG_I2C_AMD8111=y CONFIG_I2C_PASEMI=y CONFIG_FB=y -CONFIG_FIRMWARE_EDID=y CONFIG_FB_OF=y CONFIG_FB_MATROX=y CONFIG_FB_MATROX_MILLENIUM=y diff --git a/arch/powerpc/configs/ppc64e_defconfig b/arch/powerpc/configs/ppc64e_defconfig index 0fd49f67331ff3..20cc17dce94d7b 100644 --- a/arch/powerpc/configs/ppc64e_defconfig +++ b/arch/powerpc/configs/ppc64e_defconfig @@ -118,7 +118,6 @@ CONFIG_SERIAL_8250_CONSOLE=y CONFIG_I2C_CHARDEV=y CONFIG_I2C_AMD8111=y CONFIG_FB=y -CONFIG_FIRMWARE_EDID=y CONFIG_FB_OF=y CONFIG_FB_MATROX=y CONFIG_FB_MATROX_MILLENIUM=y diff --git a/arch/powerpc/configs/skiroot_defconfig b/arch/powerpc/configs/skiroot_defconfig index ff1bed4b6d2cb1..005536ee75bb94 100644 --- a/arch/powerpc/configs/skiroot_defconfig +++ b/arch/powerpc/configs/skiroot_defconfig @@ -214,7 +214,6 @@ CONFIG_SENSORS_IBMPOWERNV=m CONFIG_DRM=m CONFIG_DRM_AST=m CONFIG_FB=y -CONFIG_FIRMWARE_EDID=y # CONFIG_VGA_CONSOLE is not set CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_LOGO=y diff --git a/arch/powerpc/include/asm/pmac_low_i2c.h b/arch/powerpc/include/asm/pmac_low_i2c.h index 21bd7297c87f65..fead8fae08ab95 100644 --- a/arch/powerpc/include/asm/pmac_low_i2c.h +++ b/arch/powerpc/include/asm/pmac_low_i2c.h @@ -79,10 +79,6 @@ extern int pmac_i2c_match_adapter(struct device_node *dev, struct i2c_adapter *adapter); -/* (legacy) Locking functions exposed to i2c-keywest */ -extern int pmac_low_i2c_lock(struct device_node *np); -extern int pmac_low_i2c_unlock(struct device_node *np); - /* Access functions for platform code */ extern int pmac_i2c_open(struct pmac_i2c_bus *bus, int polled); extern void pmac_i2c_close(struct pmac_i2c_bus *bus); diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 4bbeb8644d3da4..b4472288e0d434 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -458,6 +458,10 @@ DEFINE_PER_CPU(u8, irq_work_pending); #endif /* 32 vs 64 bit */ +/* + * Must be called with preemption disabled since it updates + * per-CPU irq_work state and programs the local CPU decrementer. + */ void arch_irq_work_raise(void) { /* @@ -471,10 +475,8 @@ void arch_irq_work_raise(void) * which could get tangled up if we're messing with the same state * here. */ - preempt_disable(); set_irq_work_pending_flag(); set_dec(1); - preempt_enable(); } static void set_dec_or_work(u64 val) diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile index 8834dfe9d72796..368759f8170841 100644 --- a/arch/powerpc/kernel/vdso/Makefile +++ b/arch/powerpc/kernel/vdso/Makefile @@ -62,6 +62,12 @@ CC32FLAGSREMOVE += -fno-stack-clash-protection # 32-bit one. clang validates the values passed to these arguments during # parsing, even when -fno-stack-protector is passed afterwards. CC32FLAGSREMOVE += -mstack-protector-guard% +# ftrace is disabled for the vdso but arch/powerpc/Makefile adds this define to +# KBUILD_CPPFLAGS, which enables use of the 'patchable_function_entry' +# attribute in the 'inline' define via 'notrace'. This attribute is not +# supported for the powerpcle target, resulting in many instances of +# -Wunknown-attributes. +CC32FLAGSREMOVE += -DCC_USING_PATCHABLE_FUNCTION_ENTRY endif LD32FLAGS := -Wl,-soname=linux-vdso32.so.1 AS32FLAGS := -D__VDSO32__ diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile index 470eb0453e17f3..ec7a0eed75dc59 100644 --- a/arch/powerpc/kexec/Makefile +++ b/arch/powerpc/kexec/Makefile @@ -16,4 +16,4 @@ GCOV_PROFILE_core_$(BITS).o := n KCOV_INSTRUMENT_core_$(BITS).o := n UBSAN_SANITIZE_core_$(BITS).o := n KASAN_SANITIZE_core.o := n -KASAN_SANITIZE_core_$(BITS) := n +KASAN_SANITIZE_core_$(BITS).o := n diff --git a/arch/powerpc/lib/vmx-helper.c b/arch/powerpc/lib/vmx-helper.c index 554b248002b4fd..57e897b60db86c 100644 --- a/arch/powerpc/lib/vmx-helper.c +++ b/arch/powerpc/lib/vmx-helper.c @@ -52,7 +52,14 @@ int exit_vmx_usercopy(void) } EXPORT_SYMBOL(exit_vmx_usercopy); -int enter_vmx_ops(void) +/* + * Can be called from kexec copy_page() path with MMU off. The kexec + * code sets preempt_count to HARDIRQ_OFFSET so we return early here. + * Since in_interrupt() is always inline, __no_sanitize_address on this + * function is sufficient to avoid KASAN shadow memory accesses in real + * mode. + */ +int __no_sanitize_address enter_vmx_ops(void) { if (in_interrupt()) return 0; diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index 8b0081441f85d1..2e6adf5b95c404 100644 --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -2242,6 +2242,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val, const u64 last_period = event->hw.last_period; s64 prev, delta, left; int record = 0; + int mark_event = regs->dsisr & MMCRA_SAMPLE_ENABLE; if (event->hw.state & PERF_HES_STOPPED) { write_pmc(event->hw.idx, 0); @@ -2304,9 +2305,9 @@ static void record_and_restart(struct perf_event *event, unsigned long val, * In ISA v3.0 and before values "0" and "7" are considered reserved. * In ISA v3.1, value "7" has been used to indicate "larx/stcx". * Drop the sample if "type" has reserved values for this field with a - * ISA version check. + * ISA version check for marked events. */ - if (event->attr.sample_type & PERF_SAMPLE_DATA_SRC && + if (mark_event && event->attr.sample_type & PERF_SAMPLE_DATA_SRC && ppmu->get_mem_data_src) { val = (regs->dar & SIER_TYPE_MASK) >> SIER_TYPE_SHIFT; if (val == 0 || (val == 7 && !cpu_has_feature(CPU_FTR_ARCH_31))) { diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c index 5cac2cf3bd1e57..10c82cf8f5b393 100644 --- a/arch/powerpc/perf/hv-gpci.c +++ b/arch/powerpc/perf/hv-gpci.c @@ -210,7 +210,7 @@ static ssize_t processor_bus_topology_show(struct device *dev, struct device_att 0, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; @@ -244,12 +244,14 @@ static ssize_t processor_bus_topology_show(struct device *dev, struct device_att starting_index, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; } +out_success: + put_cpu_var(hv_gpci_reqb); return n; out: @@ -278,7 +280,7 @@ static ssize_t processor_config_show(struct device *dev, struct device_attribute 0, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; @@ -312,12 +314,14 @@ static ssize_t processor_config_show(struct device *dev, struct device_attribute starting_index, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; } +out_success: + put_cpu_var(hv_gpci_reqb); return n; out: @@ -346,7 +350,7 @@ static ssize_t affinity_domain_via_virtual_processor_show(struct device *dev, 0, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; @@ -382,12 +386,14 @@ static ssize_t affinity_domain_via_virtual_processor_show(struct device *dev, starting_index, secondary_index, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; } +out_success: + put_cpu_var(hv_gpci_reqb); return n; out: @@ -416,7 +422,7 @@ static ssize_t affinity_domain_via_domain_show(struct device *dev, struct device 0, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; @@ -448,12 +454,14 @@ static ssize_t affinity_domain_via_domain_show(struct device *dev, struct device starting_index, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; } +out_success: + put_cpu_var(hv_gpci_reqb); return n; out: diff --git a/arch/powerpc/platforms/44x/warp.c b/arch/powerpc/platforms/44x/warp.c index a5001d32f978d7..6f674f86dc853c 100644 --- a/arch/powerpc/platforms/44x/warp.c +++ b/arch/powerpc/platforms/44x/warp.c @@ -293,6 +293,8 @@ static int pika_dtm_thread(void __iomem *fpga) schedule_timeout(HZ); } + put_device(&client->dev); + return 0; } diff --git a/arch/powerpc/platforms/82xx/km82xx.c b/arch/powerpc/platforms/82xx/km82xx.c index 99f0f0f4187672..4ad223525e893c 100644 --- a/arch/powerpc/platforms/82xx/km82xx.c +++ b/arch/powerpc/platforms/82xx/km82xx.c @@ -27,8 +27,8 @@ static void __init km82xx_pic_init(void) { - struct device_node *np __free(device_node); - np = of_find_compatible_node(NULL, NULL, "fsl,pq2-pic"); + struct device_node *np __free(device_node) = of_find_compatible_node(NULL, + NULL, "fsl,pq2-pic"); if (!np) { pr_err("PIC init: can not find cpm-pic node\n"); diff --git a/arch/powerpc/platforms/8xx/cpm1.c b/arch/powerpc/platforms/8xx/cpm1.c index 7433be7d66ee96..f00734f0590cf7 100644 --- a/arch/powerpc/platforms/8xx/cpm1.c +++ b/arch/powerpc/platforms/8xx/cpm1.c @@ -477,7 +477,7 @@ int cpm1_gpiochip_add16(struct device *dev) struct device_node *np = dev->of_node; struct cpm1_gpio16_chip *cpm1_gc; struct gpio_chip *gc; - u16 mask; + u32 mask; cpm1_gc = devm_kzalloc(dev, sizeof(*cpm1_gc), GFP_KERNEL); if (!cpm1_gc) @@ -485,7 +485,7 @@ int cpm1_gpiochip_add16(struct device *dev) spin_lock_init(&cpm1_gc->lock); - if (!of_property_read_u16(np, "fsl,cpm1-gpio-irq-mask", &mask)) { + if (!of_property_read_u32(np, "fsl,cpm1-gpio-irq-mask", &mask)) { int i, j; for (i = 0, j = 0; i < 16; i++) diff --git a/arch/powerpc/platforms/pasemi/pci.c b/arch/powerpc/platforms/pasemi/pci.c index 60f990a336c470..2df9552746529c 100644 --- a/arch/powerpc/platforms/pasemi/pci.c +++ b/arch/powerpc/platforms/pasemi/pci.c @@ -272,13 +272,12 @@ void __init pas_pci_init(void) { struct device_node *root = of_find_node_by_path("/"); struct device_node *np; - int res; pci_set_flags(PCI_SCAN_ALL_PCIE_DEVS); np = of_find_compatible_node(root, NULL, "pasemi,rootbus"); if (np) { - res = pas_add_bridge(np); + pas_add_bridge(np); of_node_put(np); } of_node_put(root); diff --git a/arch/powerpc/platforms/powermac/low_i2c.c b/arch/powerpc/platforms/powermac/low_i2c.c index 73b7f4e8c04756..da72a30ab8657e 100644 --- a/arch/powerpc/platforms/powermac/low_i2c.c +++ b/arch/powerpc/platforms/powermac/low_i2c.c @@ -1058,40 +1058,6 @@ int pmac_i2c_match_adapter(struct device_node *dev, struct i2c_adapter *adapter) } EXPORT_SYMBOL_GPL(pmac_i2c_match_adapter); -int pmac_low_i2c_lock(struct device_node *np) -{ - struct pmac_i2c_bus *bus, *found = NULL; - - list_for_each_entry(bus, &pmac_i2c_busses, link) { - if (np == bus->controller) { - found = bus; - break; - } - } - if (!found) - return -ENODEV; - return pmac_i2c_open(bus, 0); -} -EXPORT_SYMBOL_GPL(pmac_low_i2c_lock); - -int pmac_low_i2c_unlock(struct device_node *np) -{ - struct pmac_i2c_bus *bus, *found = NULL; - - list_for_each_entry(bus, &pmac_i2c_busses, link) { - if (np == bus->controller) { - found = bus; - break; - } - } - if (!found) - return -ENODEV; - pmac_i2c_close(bus); - return 0; -} -EXPORT_SYMBOL_GPL(pmac_low_i2c_unlock); - - int pmac_i2c_open(struct pmac_i2c_bus *bus, int polled) { int rc; diff --git a/arch/powerpc/platforms/ps3/device-init.c b/arch/powerpc/platforms/ps3/device-init.c index 12c473768c397d..9109c218a060a2 100644 --- a/arch/powerpc/platforms/ps3/device-init.c +++ b/arch/powerpc/platforms/ps3/device-init.c @@ -950,8 +950,6 @@ static int __init ps3_start_probe_thread(enum ps3_bus_type bus_type) static int __init ps3_register_devices(void) { - int result; - if (!firmware_has_feature(FW_FEATURE_PS3_LV1)) return -ENODEV; @@ -959,7 +957,7 @@ static int __init ps3_register_devices(void) /* ps3_repository_dump_bus_info(); */ - result = ps3_start_probe_thread(PS3_BUS_TYPE_STORAGE); + ps3_start_probe_thread(PS3_BUS_TYPE_STORAGE); ps3_register_vuart_devices(); diff --git a/arch/powerpc/platforms/pseries/htmdump.c b/arch/powerpc/platforms/pseries/htmdump.c index 742ec52c9d4df9..489a80e8708218 100644 --- a/arch/powerpc/platforms/pseries/htmdump.c +++ b/arch/powerpc/platforms/pseries/htmdump.c @@ -16,6 +16,7 @@ static void *htm_buf; static void *htm_status_buf; static void *htm_info_buf; static void *htm_caps_buf; +static void *htm_mem_buf; static u32 nodeindex; static u32 nodalchipindex; static u32 coreindexonchip; @@ -86,7 +87,7 @@ static ssize_t htm_return_check(long rc) static ssize_t htmdump_read(struct file *filp, char __user *ubuf, size_t count, loff_t *ppos) { - void *htm_buf = filp->private_data; + void *htm_buf_data = filp->private_data; unsigned long page, read_size, available; loff_t offset; long rc, ret; @@ -100,7 +101,7 @@ static ssize_t htmdump_read(struct file *filp, char __user *ubuf, * - last three values are address, size and offset */ rc = htm_hcall_wrapper(htmflags, nodeindex, nodalchipindex, coreindexonchip, - htmtype, H_HTM_OP_DUMP_DATA, virt_to_phys(htm_buf), + htmtype, H_HTM_OP_DUMP_DATA, virt_to_phys(htm_buf_data), PAGE_SIZE, page); ret = htm_return_check(rc); @@ -112,7 +113,61 @@ static ssize_t htmdump_read(struct file *filp, char __user *ubuf, available = PAGE_SIZE; read_size = min(count, available); *ppos += read_size; - return simple_read_from_buffer(ubuf, count, &offset, htm_buf, available); + return simple_read_from_buffer(ubuf, count, &offset, htm_buf_data, available); +} + +static ssize_t htmsystem_mem_read(struct file *filp, char __user *ubuf, + size_t count, loff_t *ppos) +{ + void *htm_mem_data = filp->private_data; + long rc, ret; + u64 *num_entries; + u64 to_copy = 0; + loff_t offset = 0; + u64 mem_offset = 0; + + /* + * Invoke H_HTM call with: + * - operation as htm status (H_HTM_OP_STATUS) + * - last three values as addr, size and offset. "offset" + * is value from output buffer header that points to next + * entry to dump. 0 is the first entry to dump. next entry + * is read from the output bufferbyte offset 0x8. + * + * When first time hcall is invoked, mem_offset should be + * zero because zero is the first entry. + * In the next hcall, offset of next entry to read from is + * picked from output buffer header itself. So don't fill + * mem_offset for first read. + * + * If there is no further data to read in next iteration, + * offset value from output buffer header will point to -1. + */ + if (*ppos) { + mem_offset = *(u64 *)(htm_mem_data + 0x8); + if (mem_offset == -1) + return 0; + } + rc = htm_hcall_wrapper(htmflags, nodeindex, nodalchipindex, coreindexonchip, + htmtype, H_HTM_OP_DUMP_SYSMEM_CONF, virt_to_phys(htm_mem_data), + PAGE_SIZE, be64_to_cpu(mem_offset)); + ret = htm_return_check(rc); + if (ret <= 0) { + pr_debug("H_HTM hcall returned for op: H_HTM_OP_DUMP_SYSMEM_CONF with hcall returning %ld\n", ret); + return ret; + } + + /* + * HTM system mem buffer, start of buffer + 0x10 gives the + * number of HTM entries in the buffer. + * So total count to copy is: + * 32 bytes (for first 5 fields) + (number of HTM entries * entry size) + */ + num_entries = htm_mem_data + 0x10; + to_copy = 32 + (be64_to_cpu(*num_entries) * 32); + + *ppos += to_copy; + return simple_read_from_buffer(ubuf, count, &offset, htm_mem_data, to_copy); } static const struct file_operations htmdump_fops = { @@ -121,6 +176,12 @@ static const struct file_operations htmdump_fops = { .open = simple_open, }; +static const struct file_operations htmsystem_mem_fops = { + .llseek = NULL, + .read = htmsystem_mem_read, + .open = simple_open, +}; + static int htmconfigure_set(void *data, u64 val) { long rc, ret; @@ -226,20 +287,31 @@ static int htmstart_get(void *data, u64 *val) static ssize_t htmstatus_read(struct file *filp, char __user *ubuf, size_t count, loff_t *ppos) { - void *htm_status_buf = filp->private_data; + void *htm_status_data = filp->private_data; long rc, ret; u64 *num_entries; u64 to_copy; int htmstatus_flag; + loff_t offset = 0; + u64 status_offset = 0; /* * Invoke H_HTM call with: * - operation as htm status (H_HTM_OP_STATUS) - * - last three values as addr, size and offset + * - last three values as addr, size and offset. + * "offset" is value from output buffer header + * that points to next entry to dump. 0 is the first + * entry to dump. next entry is read from the output + * bufferbyte offset 0x8. */ + if (*ppos) { + status_offset = *(u64 *)(htm_status_data + 0x8); + if (status_offset == -1) + return 0; + } rc = htm_hcall_wrapper(htmflags, nodeindex, nodalchipindex, coreindexonchip, - htmtype, H_HTM_OP_STATUS, virt_to_phys(htm_status_buf), - PAGE_SIZE, 0); + htmtype, H_HTM_OP_STATUS, virt_to_phys(htm_status_data), + PAGE_SIZE, be64_to_cpu(status_offset)); ret = htm_return_check(rc); if (ret <= 0) { @@ -255,13 +327,15 @@ static ssize_t htmstatus_read(struct file *filp, char __user *ubuf, * So total count to copy is: * 32 bytes (for first 7 fields) + (number of HTM entries * entry size) */ - num_entries = htm_status_buf + 0x10; + num_entries = htm_status_data + 0x10; if (htmtype == 0x2) htmstatus_flag = 0x8; else htmstatus_flag = 0x6; to_copy = 32 + (be64_to_cpu(*num_entries) * htmstatus_flag); - return simple_read_from_buffer(ubuf, count, ppos, htm_status_buf, to_copy); + *ppos += to_copy; + + return simple_read_from_buffer(ubuf, count, &offset, htm_status_data, to_copy); } static const struct file_operations htmstatus_fops = { @@ -273,19 +347,30 @@ static const struct file_operations htmstatus_fops = { static ssize_t htminfo_read(struct file *filp, char __user *ubuf, size_t count, loff_t *ppos) { - void *htm_info_buf = filp->private_data; + void *htm_info_data = filp->private_data; long rc, ret; u64 *num_entries; u64 to_copy; + loff_t offset = 0; + u64 info_offset = 0; /* * Invoke H_HTM call with: * - operation as htm status (H_HTM_OP_STATUS) * - last three values as addr, size and offset + * "offset" is value from output buffer header + * that points to next entry to dump. 0 is the first + * entry to dump. next entry is read from the output + * bufferbyte offset 0x8. */ + if (*ppos) { + info_offset = *(u64 *)(htm_info_data + 0x8); + if (info_offset == -1) + return 0; + } rc = htm_hcall_wrapper(htmflags, nodeindex, nodalchipindex, coreindexonchip, - htmtype, H_HTM_OP_DUMP_SYSPROC_CONF, virt_to_phys(htm_info_buf), - PAGE_SIZE, 0); + htmtype, H_HTM_OP_DUMP_SYSPROC_CONF, virt_to_phys(htm_info_data), + PAGE_SIZE, be64_to_cpu(info_offset)); ret = htm_return_check(rc); if (ret <= 0) { @@ -301,15 +386,17 @@ static ssize_t htminfo_read(struct file *filp, char __user *ubuf, * So total count to copy is: * 32 bytes (for first 5 fields) + (number of HTM entries * entry size) */ - num_entries = htm_info_buf + 0x10; + num_entries = htm_info_data + 0x10; to_copy = 32 + (be64_to_cpu(*num_entries) * 16); - return simple_read_from_buffer(ubuf, count, ppos, htm_info_buf, to_copy); + + *ppos += to_copy; + return simple_read_from_buffer(ubuf, count, &offset, htm_info_data, to_copy); } static ssize_t htmcaps_read(struct file *filp, char __user *ubuf, size_t count, loff_t *ppos) { - void *htm_caps_buf = filp->private_data; + void *htm_caps_data = filp->private_data; long rc, ret; /* @@ -319,7 +406,7 @@ static ssize_t htmcaps_read(struct file *filp, char __user *ubuf, * and zero */ rc = htm_hcall_wrapper(htmflags, nodeindex, nodalchipindex, coreindexonchip, - htmtype, H_HTM_OP_CAPABILITIES, virt_to_phys(htm_caps_buf), + htmtype, H_HTM_OP_CAPABILITIES, virt_to_phys(htm_caps_data), 0x80, 0); ret = htm_return_check(rc); @@ -328,7 +415,7 @@ static ssize_t htmcaps_read(struct file *filp, char __user *ubuf, return ret; } - return simple_read_from_buffer(ubuf, count, ppos, htm_caps_buf, 0x80); + return simple_read_from_buffer(ubuf, count, ppos, htm_caps_data, 0x80); } static const struct file_operations htminfo_fops = { @@ -457,9 +544,17 @@ static int htmdump_init_debugfs(void) return -ENOMEM; } + /* Memory to present HTM system memory configuration */ + htm_mem_buf = kmalloc(PAGE_SIZE, GFP_KERNEL); + if (!htm_mem_buf) { + pr_err("Failed to allocate htm mem buf\n"); + return -ENOMEM; + } + debugfs_create_file("htmstatus", 0400, htmdump_debugfs_dir, htm_status_buf, &htmstatus_fops); debugfs_create_file("htminfo", 0400, htmdump_debugfs_dir, htm_info_buf, &htminfo_fops); debugfs_create_file("htmcaps", 0400, htmdump_debugfs_dir, htm_caps_buf, &htmcaps_fops); + debugfs_create_file("htmsystem_mem", 0400, htmdump_debugfs_dir, htm_mem_buf, &htmsystem_mem_fops); return 0; } @@ -482,6 +577,10 @@ static void __exit htmdump_exit(void) { debugfs_remove_recursive(htmdump_debugfs_dir); kfree(htm_buf); + kfree(htm_status_buf); + kfree(htm_info_buf); + kfree(htm_caps_buf); + kfree(htm_mem_buf); } module_init(htmdump_init); diff --git a/arch/powerpc/platforms/pseries/papr-hvpipe.c b/arch/powerpc/platforms/pseries/papr-hvpipe.c index 14ae480d060a4d..0c40bdde45e27f 100644 --- a/arch/powerpc/platforms/pseries/papr-hvpipe.c +++ b/arch/powerpc/platforms/pseries/papr-hvpipe.c @@ -190,33 +190,34 @@ static int hvpipe_rtas_recv_msg(char __user *buf, int size) return -ENOMEM; } - ret = rtas_ibm_receive_hvpipe_msg(work_area, &srcID, - &bytes_written); - if (!ret) { - /* - * Recv HVPIPE RTAS is successful. - * When releasing FD or no one is waiting on the - * specific source, issue recv HVPIPE RTAS call - * so that pipe is not blocked - this func is called - * with NULL buf. - */ - if (buf) { - if (size < bytes_written) { - pr_err("Received the payload size = %d, but the buffer size = %d\n", - bytes_written, size); - bytes_written = size; - } - ret = copy_to_user(buf, - rtas_work_area_raw_buf(work_area), - bytes_written); - if (!ret) - ret = bytes_written; - } - } else { - pr_err("ibm,receive-hvpipe-msg failed with %d\n", - ret); + /* + * Recv HVPIPE RTAS is successful. + * When releasing FD or no one is waiting on the + * specific source, issue recv HVPIPE RTAS call + * so that pipe is not blocked - this func is called + * with NULL buf. + */ + ret = rtas_ibm_receive_hvpipe_msg(work_area, &srcID, &bytes_written); + if (ret) { + pr_err("ibm,receive-hvpipe-msg failed with %d\n", ret); + goto out; } + if (!buf) + goto out; + + if (size < bytes_written) { + pr_err("Received the payload size = %d, but the buffer size = %d\n", + bytes_written, size); + bytes_written = size; + } + + if (copy_to_user(buf, rtas_work_area_raw_buf(work_area), bytes_written)) + ret = -EFAULT; + else + ret = bytes_written; + +out: rtas_work_area_free(work_area); return ret; } @@ -327,8 +328,8 @@ static ssize_t papr_hvpipe_handle_read(struct file *file, { struct hvpipe_source_info *src_info = file->private_data; - struct papr_hvpipe_hdr hdr; - long ret; + struct papr_hvpipe_hdr hdr = {}; + ssize_t ret = 0; /* * Return -ENXIO during migration @@ -376,7 +377,7 @@ static ssize_t papr_hvpipe_handle_read(struct file *file, ret = copy_to_user(buf, &hdr, HVPIPE_HDR_LEN); if (ret) - return ret; + return -EFAULT; /* * Message event has payload, so get the payload with @@ -385,19 +386,23 @@ static ssize_t papr_hvpipe_handle_read(struct file *file, if (hdr.flags & HVPIPE_MSG_AVAILABLE) { ret = hvpipe_rtas_recv_msg(buf + HVPIPE_HDR_LEN, size - HVPIPE_HDR_LEN); - if (ret > 0) { + /* + * Always clear MSG_AVAILABLE once the RTAS call has drained + * the message, regardless of whether copy_to_user succeeded. + */ + if (ret >= 0 || ret == -EFAULT) src_info->hvpipe_status &= ~HVPIPE_MSG_AVAILABLE; - ret += HVPIPE_HDR_LEN; - } } else if (hdr.flags & HVPIPE_LOST_CONNECTION) { /* * Hypervisor is closing the pipe for the specific * source. So notify user space. */ src_info->hvpipe_status &= ~HVPIPE_LOST_CONNECTION; - ret = HVPIPE_HDR_LEN; } + if (ret >= 0) + ret += HVPIPE_HDR_LEN; + return ret; } @@ -444,16 +449,18 @@ static int papr_hvpipe_handle_release(struct inode *inode, struct file *file) { struct hvpipe_source_info *src_info; + unsigned long flags; /* * Hold the lock, remove source from src_list, reset the * hvpipe status and release the lock to prevent any race * with message event IRQ. */ - spin_lock(&hvpipe_src_list_lock); + spin_lock_irqsave(&hvpipe_src_list_lock, flags); src_info = file->private_data; list_del(&src_info->list); file->private_data = NULL; + spin_unlock_irqrestore(&hvpipe_src_list_lock, flags); /* * If the pipe for this specific source has any pending * payload, issue recv HVPIPE RTAS so that pipe will not @@ -461,10 +468,8 @@ static int papr_hvpipe_handle_release(struct inode *inode, */ if (src_info->hvpipe_status & HVPIPE_MSG_AVAILABLE) { src_info->hvpipe_status = 0; - spin_unlock(&hvpipe_src_list_lock); hvpipe_rtas_recv_msg(NULL, 0); - } else - spin_unlock(&hvpipe_src_list_lock); + } kfree(src_info); return 0; @@ -479,50 +484,53 @@ static const struct file_operations papr_hvpipe_handle_ops = { static int papr_hvpipe_dev_create_handle(u32 srcID) { - struct hvpipe_source_info *src_info __free(kfree) = NULL; - - spin_lock(&hvpipe_src_list_lock); - /* - * Do not allow more than one process communicates with - * each source. - */ - src_info = hvpipe_find_source(srcID); - if (src_info) { - spin_unlock(&hvpipe_src_list_lock); - pr_err("pid(%d) is already using the source(%d)\n", - src_info->tsk->pid, srcID); - return -EALREADY; - } - spin_unlock(&hvpipe_src_list_lock); + struct hvpipe_source_info *src_info; + int fd; + unsigned long flags; src_info = kzalloc_obj(*src_info, GFP_KERNEL_ACCOUNT); if (!src_info) return -ENOMEM; src_info->srcID = srcID; - src_info->tsk = current; init_waitqueue_head(&src_info->recv_wqh); - FD_PREPARE(fdf, O_RDONLY | O_CLOEXEC, - anon_inode_getfile("[papr-hvpipe]", &papr_hvpipe_handle_ops, - (void *)src_info, O_RDWR)); - if (fdf.err) - return fdf.err; - - retain_and_null_ptr(src_info); - spin_lock(&hvpipe_src_list_lock); /* - * If two processes are executing ioctl() for the same - * source ID concurrently, prevent the second process to - * acquire FD. + * Do not allow more than one process communicates with + * each source. */ + spin_lock_irqsave(&hvpipe_src_list_lock, flags); if (hvpipe_find_source(srcID)) { - spin_unlock(&hvpipe_src_list_lock); + spin_unlock_irqrestore(&hvpipe_src_list_lock, flags); + pr_err("pid(%s:%d) could not get the source(%d)\n", + current->comm, task_pid_nr(current), srcID); + kfree(src_info); return -EALREADY; } list_add(&src_info->list, &hvpipe_src_list); - spin_unlock(&hvpipe_src_list_lock); - return fd_publish(fdf); + spin_unlock_irqrestore(&hvpipe_src_list_lock, flags); + + fd = FD_ADD(O_RDONLY | O_CLOEXEC, + anon_inode_getfile("[papr-hvpipe]", &papr_hvpipe_handle_ops, + (void *)src_info, O_RDWR)); + if (fd < 0) { + spin_lock_irqsave(&hvpipe_src_list_lock, flags); + list_del(&src_info->list); + spin_unlock_irqrestore(&hvpipe_src_list_lock, flags); + /* + * if we fail to add FD, that means no userspace program is + * polling. In that case if there is a msg pending because the + * interrupt was fired after the src_info was added to the + * global list, then let's consume it here, to unblock the + * hvpipe + */ + if (src_info->hvpipe_status & HVPIPE_MSG_AVAILABLE) + hvpipe_rtas_recv_msg(NULL, 0); + kfree(src_info); + return fd; + } + + return fd; } /* @@ -685,20 +693,19 @@ static int __init enable_hvpipe_IRQ(void) struct device_node *np; hvpipe_check_exception_token = rtas_function_token(RTAS_FN_CHECK_EXCEPTION); - if (hvpipe_check_exception_token == RTAS_UNKNOWN_SERVICE) + if (hvpipe_check_exception_token == RTAS_UNKNOWN_SERVICE) return -ENODEV; /* hvpipe events */ np = of_find_node_by_path("/event-sources/ibm,hvpipe-msg-events"); - if (np != NULL) { - request_event_sources_irqs(np, hvpipe_event_interrupt, - "HPIPE_EVENT"); - of_node_put(np); - } else { - pr_err("Can not enable hvpipe event IRQ\n"); + if (!np) { + pr_err("No device node found, could not enable hvpipe event IRQ\n"); return -ENODEV; } + request_event_sources_irqs(np, hvpipe_event_interrupt, "HPIPE_EVENT"); + of_node_put(np); + return 0; } @@ -775,23 +782,29 @@ static int __init papr_hvpipe_init(void) } ret = enable_hvpipe_IRQ(); - if (!ret) { - ret = set_hvpipe_sys_param(1); - if (!ret) - ret = misc_register(&papr_hvpipe_dev); - } + if (ret) + goto out_wq; - if (!ret) { - pr_info("hvpipe feature is enabled\n"); - hvpipe_feature = true; - return 0; - } + ret = misc_register(&papr_hvpipe_dev); + if (ret) + goto out_wq; - pr_err("hvpipe feature is not enabled %d\n", ret); + ret = set_hvpipe_sys_param(1); + if (ret) + goto out_misc; + + pr_info("hvpipe feature is enabled\n"); + hvpipe_feature = true; + return 0; + +out_misc: + misc_deregister(&papr_hvpipe_dev); +out_wq: destroy_workqueue(papr_hvpipe_wq); out: kfree(papr_hvpipe_work); papr_hvpipe_work = NULL; + pr_err("hvpipe feature is not enabled %d\n", ret); return ret; } machine_device_initcall(pseries, papr_hvpipe_init); diff --git a/arch/powerpc/platforms/pseries/papr-hvpipe.h b/arch/powerpc/platforms/pseries/papr-hvpipe.h index c343f4230865c0..4bdf7bb2fc4dbe 100644 --- a/arch/powerpc/platforms/pseries/papr-hvpipe.h +++ b/arch/powerpc/platforms/pseries/papr-hvpipe.h @@ -21,7 +21,6 @@ struct hvpipe_source_info { u32 srcID; u32 hvpipe_status; wait_queue_head_t recv_wqh; /* wake up poll() waitq */ - struct task_struct *tsk; }; /* diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index d235396c451413..c5754942cf85a4 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -937,6 +937,28 @@ config RISCV_VECTOR_MISALIGNED help Enable detecting support for vector misaligned loads and stores. +config RISCV_SBI_FWFT_DELEGATE_MISALIGNED + bool "Request firmware delegation of unaligned access exceptions" + depends on RISCV_SBI + depends on NONPORTABLE + help + Use SBI FWFT to request delegation of load address misaligned and + store address misaligned exceptions, if possible, and prefer Linux + kernel emulation of these accesses to firmware emulation. + + Unfortunately, Linux's emulation is still incomplete. Namely, it + currently does not handle vector instructions and KVM guest accesses. + On platforms where these accesses would have been handled by firmware, + enabling this causes unexpected kernel oopses, userspaces crashes and + KVM guest crashes. If you are sure that these are not a problem for + your platform, you can say Y here, which may improve performance. + + Saying N here will not worsen emulation support for unaligned accesses + even in the case where the firmware also has incomplete support. It + simply keeps the firmware's emulation enabled. + + If you don't know what to do here, say N. + choice prompt "Unaligned Accesses Support" default RISCV_PROBE_UNALIGNED_ACCESS diff --git a/arch/riscv/errata/mips/errata.c b/arch/riscv/errata/mips/errata.c index e984a8152208c3..2c3dc2259e93e9 100644 --- a/arch/riscv/errata/mips/errata.c +++ b/arch/riscv/errata/mips/errata.c @@ -57,7 +57,7 @@ void mips_errata_patch_func(struct alt_entry *begin, struct alt_entry *end, } tmp = (1U << alt->patch_id); - if (cpu_req_errata && tmp) { + if (cpu_req_errata & tmp) { mutex_lock(&text_mutex); patch_text_nosync(ALT_OLD_PTR(alt), ALT_ALT_PTR(alt), alt->alt_len); diff --git a/arch/riscv/kernel/compat_signal.c b/arch/riscv/kernel/compat_signal.c index 6ec4e34255a9ab..cf3eb33a11e464 100644 --- a/arch/riscv/kernel/compat_signal.c +++ b/arch/riscv/kernel/compat_signal.c @@ -107,6 +107,8 @@ static long compat_restore_sigcontext(struct pt_regs *regs, /* sc_regs is structured the same as the start of pt_regs */ err = __copy_from_user(&cregs, &sc->sc_regs, sizeof(sc->sc_regs)); + if (unlikely(err)) + return err; cregs_to_regs(&cregs, regs); diff --git a/arch/riscv/kernel/copy-unaligned.S b/arch/riscv/kernel/copy-unaligned.S index 2b3d9398c113fb..90f3549621f725 100644 --- a/arch/riscv/kernel/copy-unaligned.S +++ b/arch/riscv/kernel/copy-unaligned.S @@ -1,6 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* Copyright (C) 2023 Rivos Inc. */ +#include #include #include @@ -9,7 +10,7 @@ /* void __riscv_copy_words_unaligned(void *, const void *, size_t) */ /* Performs a memcpy without aligning buffers, using word loads and stores. */ /* Note: The size is truncated to a multiple of 8 * SZREG */ -SYM_FUNC_START(__riscv_copy_words_unaligned) +SYM_TYPED_FUNC_START(__riscv_copy_words_unaligned) andi a4, a2, ~((8*SZREG)-1) beqz a4, 2f add a3, a1, a4 @@ -41,7 +42,7 @@ SYM_FUNC_END(__riscv_copy_words_unaligned) /* void __riscv_copy_bytes_unaligned(void *, const void *, size_t) */ /* Performs a memcpy without aligning buffers, using only byte accesses. */ /* Note: The size is truncated to a multiple of 8 */ -SYM_FUNC_START(__riscv_copy_bytes_unaligned) +SYM_TYPED_FUNC_START(__riscv_copy_bytes_unaligned) andi a4, a2, ~(8-1) beqz a4, 2f add a3, a1, a4 diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 1734f9a4c2fd70..f46aa5602d74d3 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -896,10 +896,8 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap) * CPU cores with the ratified spec will contain non-zero * marchid. */ - if (acpi_disabled && boot_vendorid == THEAD_VENDOR_ID && boot_archid == 0x0) { - this_hwcap &= ~isa2hwcap[RISCV_ISA_EXT_v]; + if (acpi_disabled && boot_vendorid == THEAD_VENDOR_ID && boot_archid == 0x0) clear_bit(RISCV_ISA_EXT_v, source_isa); - } riscv_resolve_isa(source_isa, isainfo->isa, &this_hwcap, isa2hwcap); @@ -1104,16 +1102,16 @@ early_param("riscv_isa_fallback", riscv_isa_fallback_setup); void __init riscv_fill_hwcap(void) { char print_str[NUM_ALPHA_EXTS + 1]; - unsigned long isa2hwcap[26] = {0}; + unsigned long isa2hwcap[RISCV_ISA_EXT_BASE] = {0}; int i, j; - isa2hwcap['i' - 'a'] = COMPAT_HWCAP_ISA_I; - isa2hwcap['m' - 'a'] = COMPAT_HWCAP_ISA_M; - isa2hwcap['a' - 'a'] = COMPAT_HWCAP_ISA_A; - isa2hwcap['f' - 'a'] = COMPAT_HWCAP_ISA_F; - isa2hwcap['d' - 'a'] = COMPAT_HWCAP_ISA_D; - isa2hwcap['c' - 'a'] = COMPAT_HWCAP_ISA_C; - isa2hwcap['v' - 'a'] = COMPAT_HWCAP_ISA_V; + isa2hwcap[RISCV_ISA_EXT_i] = COMPAT_HWCAP_ISA_I; + isa2hwcap[RISCV_ISA_EXT_m] = COMPAT_HWCAP_ISA_M; + isa2hwcap[RISCV_ISA_EXT_a] = COMPAT_HWCAP_ISA_A; + isa2hwcap[RISCV_ISA_EXT_f] = COMPAT_HWCAP_ISA_F; + isa2hwcap[RISCV_ISA_EXT_d] = COMPAT_HWCAP_ISA_D; + isa2hwcap[RISCV_ISA_EXT_c] = COMPAT_HWCAP_ISA_C; + isa2hwcap[RISCV_ISA_EXT_v] = COMPAT_HWCAP_ISA_V; if (!acpi_disabled) { riscv_fill_hwcap_from_isa_string(isa2hwcap); diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c index 93de2e7a30747d..793bcee4618282 100644 --- a/arch/riscv/kernel/ptrace.c +++ b/arch/riscv/kernel/ptrace.c @@ -577,8 +577,8 @@ static int compat_riscv_gpr_set(struct task_struct *target, struct compat_user_regs_struct cregs; ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &cregs, 0, -1); - - cregs_to_regs(&cregs, task_pt_regs(target)); + if (!ret) + cregs_to_regs(&cregs, task_pt_regs(target)); return ret; } diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c index 2a27d3ff4ac66d..81b7682e6c6dbc 100644 --- a/arch/riscv/kernel/traps_misaligned.c +++ b/arch/riscv/kernel/traps_misaligned.c @@ -584,7 +584,7 @@ static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) static bool misaligned_traps_delegated; -#ifdef CONFIG_RISCV_SBI +#if defined(CONFIG_RISCV_SBI_FWFT_DELEGATE_MISALIGNED) static int cpu_online_sbi_unaligned_setup(unsigned int cpu) { diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c index 6eaa0d94fdfe50..cbfb4e495e9f9f 100644 --- a/arch/riscv/kernel/usercfi.c +++ b/arch/riscv/kernel/usercfi.c @@ -109,15 +109,16 @@ void set_indir_lp_lock(struct task_struct *task, bool lock) task->thread_info.user_cfi_state.ufcfi_locked = lock; } /* - * If size is 0, then to be compatible with regular stack we want it to be as big as - * regular stack. Else PAGE_ALIGN it and return back + * The shadow stack only stores the return address and not any variables + * this should be more than sufficient for most applications. + * Else PAGE_ALIGN it and return back */ static unsigned long calc_shstk_size(unsigned long size) { if (size) return PAGE_ALIGN(size); - return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G)); + return PAGE_ALIGN(min(rlimit(RLIMIT_STACK) / 2, SZ_2G)); } /* diff --git a/arch/riscv/kernel/vec-copy-unaligned.S b/arch/riscv/kernel/vec-copy-unaligned.S index 7ce4de6f6e694a..361039f7b9441c 100644 --- a/arch/riscv/kernel/vec-copy-unaligned.S +++ b/arch/riscv/kernel/vec-copy-unaligned.S @@ -2,6 +2,7 @@ /* Copyright (C) 2024 Rivos Inc. */ #include +#include #include #include @@ -16,7 +17,7 @@ /* void __riscv_copy_vec_words_unaligned(void *, const void *, size_t) */ /* Performs a memcpy without aligning buffers, using word loads and stores. */ /* Note: The size is truncated to a multiple of WORD_EEW */ -SYM_FUNC_START(__riscv_copy_vec_words_unaligned) +SYM_TYPED_FUNC_START(__riscv_copy_vec_words_unaligned) andi a4, a2, ~(WORD_EEW-1) beqz a4, 2f add a3, a1, a4 @@ -38,7 +39,7 @@ SYM_FUNC_END(__riscv_copy_vec_words_unaligned) /* void __riscv_copy_vec_bytes_unaligned(void *, const void *, size_t) */ /* Performs a memcpy without aligning buffers, using only byte accesses. */ /* Note: The size is truncated to a multiple of 8 */ -SYM_FUNC_START(__riscv_copy_vec_bytes_unaligned) +SYM_TYPED_FUNC_START(__riscv_copy_vec_bytes_unaligned) andi a4, a2, ~(8-1) beqz a4, 2f add a3, a1, a4 diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index decd7df40fa421..fa8d2f6f554b57 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -792,6 +792,27 @@ static void __init set_mmap_rnd_bits_max(void) mmap_rnd_bits_max = MMAP_VA_BITS - PAGE_SHIFT - 3; } +static bool __init is_vaddr_valid(unsigned long va) +{ + unsigned long up = 0; + + switch (satp_mode) { + case SATP_MODE_39: + up = 1UL << 38; + break; + case SATP_MODE_48: + up = 1UL << 47; + break; + case SATP_MODE_57: + up = 1UL << 56; + break; + default: + return false; + } + + return (va < up) || (va >= (ULONG_MAX - up + 1)); +} + /* * There is a simple way to determine if 4-level is supported by the * underlying hardware: establish 1:1 mapping in 4-level page table mode @@ -833,6 +854,9 @@ static __init void set_satp_mode(uintptr_t dtb_pa) set_satp_mode_pmd + PMD_SIZE, PMD_SIZE, PAGE_KERNEL_EXEC); retry: + if (!is_vaddr_valid(set_satp_mode_pmd)) + goto out; + create_pgd_mapping(early_pg_dir, set_satp_mode_pmd, pgtable_l5_enabled ? @@ -855,6 +879,7 @@ static __init void set_satp_mode(uintptr_t dtb_pa) disable_pgtable_l4(); } +out: memset(early_pg_dir, 0, PAGE_SIZE); memset(early_p4d, 0, PAGE_SIZE); memset(early_pud, 0, PAGE_SIZE); diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c index 07f59c3b9a7b83..3bcdbbbb6891b2 100644 --- a/arch/s390/kvm/interrupt.c +++ b/arch/s390/kvm/interrupt.c @@ -3310,8 +3310,7 @@ static void aen_host_forward(unsigned long si) struct zpci_gaite *gaite; struct kvm *kvm; - gaite = (struct zpci_gaite *)aift->gait + - (si * sizeof(struct zpci_gaite)); + gaite = aift->gait + si; if (gaite->count == 0) return; if (gaite->aisb != 0) diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c index 86d93e8dddae3e..5b075c38998e31 100644 --- a/arch/s390/kvm/pci.c +++ b/arch/s390/kvm/pci.c @@ -166,7 +166,7 @@ static int kvm_zpci_set_airq(struct zpci_dev *zdev) fib.fmt0.noi = airq_iv_end(zdev->aibv); fib.fmt0.aibv = virt_to_phys(zdev->aibv->vector); fib.fmt0.aibvo = 0; - fib.fmt0.aisb = virt_to_phys(aift->sbv->vector + (zdev->aisb / 64) * 8); + fib.fmt0.aisb = virt_to_phys(aift->sbv->vector) + (zdev->aisb / 64) * 8; fib.fmt0.aisbo = zdev->aisb & 63; fib.gd = zdev->gisa; @@ -290,8 +290,7 @@ static int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib, phys_to_virt(fib->fmt0.aibv)); spin_lock_irq(&aift->gait_lock); - gaite = (struct zpci_gaite *)aift->gait + (zdev->aisb * - sizeof(struct zpci_gaite)); + gaite = aift->gait + zdev->aisb; /* If assist not requested, host will get all alerts */ if (assist) @@ -309,7 +308,7 @@ static int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib, /* Update guest FIB for re-issue */ fib->fmt0.aisbo = zdev->aisb & 63; - fib->fmt0.aisb = virt_to_phys(aift->sbv->vector + (zdev->aisb / 64) * 8); + fib->fmt0.aisb = virt_to_phys(aift->sbv->vector) + (zdev->aisb / 64) * 8; fib->fmt0.isc = gisc; /* Save some guest fib values in the host for later use */ @@ -357,8 +356,7 @@ static int kvm_s390_pci_aif_disable(struct zpci_dev *zdev, bool force) if (zdev->kzdev->fib.fmt0.aibv == 0) goto out; spin_lock_irq(&aift->gait_lock); - gaite = (struct zpci_gaite *)aift->gait + (zdev->aisb * - sizeof(struct zpci_gaite)); + gaite = aift->gait + zdev->aisb; isc = gaite->gisc; gaite->count--; if (gaite->count == 0) { diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 810ab21ffd9913..4b9e105309c6a9 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -1294,13 +1294,16 @@ int x86_perf_rdpmc_index(struct perf_event *event) return event->hw.event_base_rdpmc; } -static inline int match_prev_assignment(struct hw_perf_event *hwc, +static inline int match_prev_assignment(struct perf_event *event, struct cpu_hw_events *cpuc, int i) { + struct hw_perf_event *hwc = &event->hw; + return hwc->idx == cpuc->assign[i] && - hwc->last_cpu == smp_processor_id() && - hwc->last_tag == cpuc->tags[i]; + hwc->last_cpu == smp_processor_id() && + hwc->last_tag == cpuc->tags[i] && + !is_acr_event_group(event); } static void x86_pmu_start(struct perf_event *event, int flags); @@ -1346,7 +1349,7 @@ static void x86_pmu_enable(struct pmu *pmu) * - no other event has used the counter since */ if (hwc->idx == -1 || - match_prev_assignment(hwc, cpuc, i)) + match_prev_assignment(event, cpuc, i)) continue; /* @@ -1367,7 +1370,7 @@ static void x86_pmu_enable(struct pmu *pmu) event = cpuc->event_list[i]; hwc = &event->hw; - if (!match_prev_assignment(hwc, cpuc, i)) + if (!match_prev_assignment(event, cpuc, i)) x86_assign_hw_event(event, cpuc, i); else if (i < n_running) continue; diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index d9488ade0f8ec4..dd1e3aa75ee9b7 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3118,11 +3118,11 @@ static void intel_pmu_enable_fixed(struct perf_event *event) intel_set_masks(event, idx); /* - * Enable IRQ generation (0x8), if not PEBS, - * and enable ring-3 counting (0x2) and ring-0 counting (0x1) - * if requested: + * Enable IRQ generation (0x8), if not PEBS or self-reloaded + * ACR event, and enable ring-3 counting (0x2) and ring-0 + * counting (0x1) if requested: */ - if (!event->attr.precise_ip) + if (!event->attr.precise_ip && !is_acr_self_reload_event(event)) bits |= INTEL_FIXED_0_ENABLE_PMI; if (hwc->config & ARCH_PERFMON_EVENTSEL_USR) bits |= INTEL_FIXED_0_USER; @@ -3306,6 +3306,15 @@ static void intel_pmu_enable_event(struct perf_event *event) intel_set_masks(event, idx); static_call_cond(intel_pmu_enable_acr_event)(event); static_call_cond(intel_pmu_enable_event_ext)(event); + /* + * For self-reloaded ACR event, don't enable PMI since + * HW won't set overflow bit in GLOBAL_STATUS. Otherwise, + * the PMI would be recognized as a suspicious NMI. + */ + if (is_acr_self_reload_event(event)) + hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT; + else if (!event->attr.precise_ip) + hwc->config |= ARCH_PERFMON_EVENTSEL_INT; __x86_pmu_enable_event(hwc, enable_mask); break; case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1: @@ -3332,23 +3341,41 @@ static void intel_pmu_enable_event(struct perf_event *event) static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc) { struct perf_event *event, *leader; - int i, j, idx; + int i, j, k, bit, idx; + /* + * FIXME: ACR mask parsing relies on cpuc->event_list[] (active events only). + * Disabling an ACR event causes bit-shifting errors in the acr_mask of + * remaining group members. As ACR sampling requires all events to be active, + * this limitation is acceptable for now. Revisit if independent event toggling + * is required. + */ for (i = 0; i < cpuc->n_events; i++) { leader = cpuc->event_list[i]; if (!is_acr_event_group(leader)) continue; - /* The ACR events must be contiguous. */ + /* Find the last event of the ACR group. */ for (j = i; j < cpuc->n_events; j++) { event = cpuc->event_list[j]; if (event->group_leader != leader->group_leader) break; - for_each_set_bit(idx, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) { - if (i + idx >= cpuc->n_events || - !is_acr_event_group(cpuc->event_list[i + idx])) - return; - __set_bit(cpuc->assign[i + idx], (unsigned long *)&event->hw.config1); + } + + /* + * Translate the user-space ACR mask (attr.config2) into the physical + * counter bitmask (hw.config1) for each ACR event in the group. + * NOTE: ACR event contiguity is guaranteed by intel_pmu_hw_config(). + */ + for (k = i; k < j; k++) { + event = cpuc->event_list[k]; + event->hw.config1 = 0; + for_each_set_bit(bit, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) { + idx = i + bit; + /* Event index of ACR group must locate in [i, j). */ + if (idx >= j || !is_acr_event_group(cpuc->event_list[idx])) + continue; + __set_bit(cpuc->assign[idx], (unsigned long *)&event->hw.config1); } } i = j - 1; @@ -7504,6 +7531,7 @@ static __always_inline void intel_pmu_init_pnc(struct pmu *pmu) hybrid(pmu, event_constraints) = intel_pnc_event_constraints; hybrid(pmu, pebs_constraints) = intel_pnc_pebs_event_constraints; hybrid(pmu, extra_regs) = intel_pnc_extra_regs; + static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr); } static __always_inline void intel_pmu_init_skt(struct pmu *pmu) diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index fad87d3c8b2caa..524668dcf4cc10 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -137,6 +137,16 @@ static inline bool is_acr_event_group(struct perf_event *event) return check_leader_group(event->group_leader, PERF_X86_EVENT_ACR); } +static inline bool is_acr_self_reload_event(struct perf_event *event) +{ + struct hw_perf_event *hwc = &event->hw; + + if (hwc->idx < 0) + return false; + + return test_bit(hwc->idx, (unsigned long *)&hwc->config1); +} + struct amd_nb { int nb_id; /* NorthBridge id */ int refcnt; /* reference count */ diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h index dc8fe1361c18eb..be58b7f5c80634 100644 --- a/arch/x86/include/asm/efi.h +++ b/arch/x86/include/asm/efi.h @@ -137,7 +137,8 @@ extern void __init efi_dump_pagetable(void); extern void __init efi_apply_memmap_quirks(void); extern int __init efi_reuse_config(u64 tables, int nr_tables); extern void efi_delete_dummy_variable(void); -extern void efi_crash_gracefully_on_page_fault(unsigned long phys_addr); +extern void efi_crash_gracefully_on_page_fault(unsigned long phys_addr, + const struct pt_regs *regs); extern void efi_unmap_boot_services(void); void arch_efi_call_virt_setup(void); diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index a14a0f43e04ae8..86554de9a3f522 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -803,9 +803,10 @@ #define MSR_AMD64_LBR_SELECT 0xc000010e /* Zen4 */ -#define MSR_ZEN4_BP_CFG 0xc001102e +#define MSR_ZEN4_BP_CFG 0xc001102e #define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4 #define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5 +#define MSR_ZEN2_BP_CFG_BUG_FIX_BIT 33 /* Fam 19h MSRs */ #define MSR_F19H_UMC_PERF_CTL 0xc0010800 diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c index d7c8ef1e354d30..be4c5e9e5ff6f9 100644 --- a/arch/x86/kernel/acpi/cppc.c +++ b/arch/x86/kernel/acpi/cppc.c @@ -88,19 +88,19 @@ static void amd_set_max_freq_ratio(void) rc = cppc_get_perf_caps(0, &perf_caps); if (rc) { - pr_warn("Could not retrieve perf counters (%d)\n", rc); + pr_debug("Could not retrieve perf counters (%d)\n", rc); return; } rc = amd_get_boost_ratio_numerator(0, &numerator); if (rc) { - pr_warn("Could not retrieve highest performance (%d)\n", rc); + pr_debug("Could not retrieve highest performance (%d)\n", rc); return; } nominal_perf = perf_caps.nominal_perf; if (!nominal_perf) { - pr_warn("Could not retrieve nominal performance\n"); + pr_debug("Could not retrieve nominal performance\n"); return; } diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 2d9ae6ab1701c0..2f8e8ff2d000a7 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -989,6 +989,9 @@ static void init_amd_zen2(struct cpuinfo_x86 *c) /* Correct misconfigured CPUID on some clients. */ clear_cpu_cap(c, X86_FEATURE_INVLPGB); + + if (!cpu_has(c, X86_FEATURE_HYPERVISOR)) + msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN2_BP_CFG_BUG_FIX_BIT); } static void init_amd_zen3(struct cpuinfo_x86 *c) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 8dd424ac5de8a8..f3a793e3a6c8fa 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -90,7 +90,6 @@ struct mca_config mca_cfg __read_mostly = { }; static DEFINE_PER_CPU(struct mce_hw_err, hw_errs_seen); -static unsigned long mce_need_notify; /* * MCA banks polled by the period polling timer for corrected events. @@ -152,8 +151,10 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm); void mce_log(struct mce_hw_err *err) { - if (mce_gen_pool_add(err)) + if (mce_gen_pool_add(err)) { + pr_info(HW_ERR "Machine check events logged\n"); irq_work_queue(&mce_irq_work); + } } EXPORT_SYMBOL_GPL(mce_log); @@ -585,28 +586,6 @@ bool mce_is_correctable(struct mce *m) } EXPORT_SYMBOL_GPL(mce_is_correctable); -/* - * Notify the user(s) about new machine check events. - * Can be called from interrupt context, but not from machine check/NMI - * context. - */ -static bool mce_notify_irq(void) -{ - /* Not more than two messages every minute */ - static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2); - - if (test_and_clear_bit(0, &mce_need_notify)) { - mce_work_trigger(); - - if (__ratelimit(&ratelimit)) - pr_info(HW_ERR "Machine check events logged\n"); - - return true; - } - - return false; -} - static int mce_early_notifier(struct notifier_block *nb, unsigned long val, void *data) { @@ -618,9 +597,7 @@ static int mce_early_notifier(struct notifier_block *nb, unsigned long val, /* Emit the trace record: */ trace_mce_record(err); - set_bit(0, &mce_need_notify); - - mce_notify_irq(); + mce_work_trigger(); return NOTIFY_DONE; } @@ -1804,7 +1781,7 @@ static void mce_timer_fn(struct timer_list *t) * Alert userspace if needed. If we logged an MCE, reduce the polling * interval, otherwise increase the polling interval. */ - if (mce_notify_irq()) + if (!mce_gen_pool_empty()) iv = max(iv / 2, (unsigned long) HZ/100); else iv = min(iv * 2, round_jiffies_relative(check_interval * HZ)); diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index 2a999275893369..eb72537bc0b195 100644 --- a/arch/x86/kernel/e820.c +++ b/arch/x86/kernel/e820.c @@ -450,6 +450,10 @@ __init static int append_e820_table(struct boot_e820_entry *entries, u32 nr_entr { struct boot_e820_entry *entry = entries; + /* If there aren't any entries, we'll want to fall back to another source: */ + if (!nr_entries) + return -ENOENT; + while (nr_entries) { u64 start = entry->addr; u64 size = entry->size; @@ -458,7 +462,7 @@ __init static int append_e820_table(struct boot_e820_entry *entries, u32 nr_entr /* Ignore the remaining entries on 64-bit overflow: */ if (start > end && likely(size)) - return -1; + return -EINVAL; e820__range_add(start, size, type); diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S index 4ffba68dc57b29..eaeb77464c066e 100644 --- a/arch/x86/kernel/relocate_kernel_64.S +++ b/arch/x86/kernel/relocate_kernel_64.S @@ -136,6 +136,14 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped) * %r13 original CR4 when relocate_kernel() was invoked */ + /* + * Set return address to 0 if not preserving context. The purgatory + * shipped in kexec-tools will unconditionally look for the return + * address on the stack and set a kexec_jump_back_entry= command + * line option if it's non-zero. There's no other way that it can + * tell a preserve-context (kjump) kexec from a normal one. + */ + pushq $0 /* store the start address on the stack */ pushq %rdx diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index c8c6cc0406d6dd..8013dccb31102a 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -4481,7 +4481,7 @@ static const struct opcode opcode_map_0f_38[256] = { X16(N), X16(N), /* 0x20 - 0x2f */ X8(N), - X2(N), GP(SrcReg | DstMem | ModRM | Mov | Aligned, &pfx_0f_e7_0f_38_2a), N, N, N, N, N, + X2(N), GP(SrcMem | DstReg | ModRM | Mov | Aligned, &pfx_0f_e7_0f_38_2a), N, N, N, N, N, /* 0x30 - 0x7f */ X16(N), X16(N), X16(N), X16(N), X16(N), /* 0x80 - 0xef */ diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index 9b140bbdc1d83b..4438ecac9a89bb 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -2040,7 +2040,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc) * flush). Translate the address here so the memory can be uniformly * read with kvm_read_guest(). */ - if (!hc->fast && is_guest_mode(vcpu)) { + if (!hc->fast && mmu_is_nested(vcpu)) { hc->ingpa = translate_nested_gpa(vcpu, hc->ingpa, 0, NULL); if (unlikely(hc->ingpa == INVALID_GPA)) return HV_STATUS_INVALID_HYPERCALL_INPUT; diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index e3ec4d8607c19f..4078e624ca6675 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -667,13 +667,15 @@ bool __kvm_apic_update_irr(unsigned long *pir, void *regs, int *max_irr) u32 *__pir = (void *)pir_vals; u32 i, vec; u32 irr_val, prev_irr_val; - int max_updated_irr; + int max_new_irr; - max_updated_irr = -1; - *max_irr = -1; - - if (!pi_harvest_pir(pir, pir_vals)) + if (!pi_harvest_pir(pir, pir_vals)) { + *max_irr = apic_find_highest_vector(regs + APIC_IRR); return false; + } + + max_new_irr = -1; + *max_irr = -1; for (i = vec = 0; i <= 7; i++, vec += 32) { u32 *p_irr = (u32 *)(regs + APIC_IRR + i * 0x10); @@ -688,25 +690,25 @@ bool __kvm_apic_update_irr(unsigned long *pir, void *regs, int *max_irr) !try_cmpxchg(p_irr, &prev_irr_val, irr_val)); if (prev_irr_val != irr_val) - max_updated_irr = __fls(irr_val ^ prev_irr_val) + vec; + max_new_irr = __fls(irr_val ^ prev_irr_val) + vec; } if (irr_val) *max_irr = __fls(irr_val) + vec; } - return ((max_updated_irr != -1) && - (max_updated_irr == *max_irr)); + return max_new_irr != -1 && max_new_irr == *max_irr; } EXPORT_SYMBOL_FOR_KVM_INTERNAL(__kvm_apic_update_irr); bool kvm_apic_update_irr(struct kvm_vcpu *vcpu, unsigned long *pir, int *max_irr) { struct kvm_lapic *apic = vcpu->arch.apic; - bool irr_updated = __kvm_apic_update_irr(pir, apic->regs, max_irr); + bool max_irr_is_from_pir; - if (unlikely(!apic->apicv_active && irr_updated)) + max_irr_is_from_pir = __kvm_apic_update_irr(pir, apic->regs, max_irr); + if (unlikely(!apic->apicv_active && max_irr_is_from_pir)) apic->irr_pending = true; - return irr_updated; + return max_irr_is_from_pir; } EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_apic_update_irr); diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 24fbc9ea502a30..f0144ae8d891d3 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -182,6 +182,8 @@ static struct kmem_cache *pte_list_desc_cache; struct kmem_cache *mmu_page_header_cache; static void mmu_spte_set(u64 *sptep, u64 spte); +static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp, + u64 *spte, struct list_head *invalid_list); struct kvm_mmu_role_regs { const unsigned long cr0; @@ -1287,19 +1289,6 @@ static void drop_spte(struct kvm *kvm, u64 *sptep) rmap_remove(kvm, sptep); } -static void drop_large_spte(struct kvm *kvm, u64 *sptep, bool flush) -{ - struct kvm_mmu_page *sp; - - sp = sptep_to_sp(sptep); - WARN_ON_ONCE(sp->role.level == PG_LEVEL_4K); - - drop_spte(kvm, sptep); - - if (flush) - kvm_flush_remote_tlbs_sptep(kvm, sptep); -} - /* * Write-protect on the specified @sptep, @pt_protect indicates whether * spte write-protection is caused by protecting shadow page table. @@ -2466,7 +2455,8 @@ static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, { union kvm_mmu_page_role role; - if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep)) + if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep) && + spte_to_child_sp(*sptep) && spte_to_child_sp(*sptep)->gfn == gfn) return ERR_PTR(-EEXIST); role = kvm_mmu_child_role(sptep, direct, access); @@ -2536,6 +2526,23 @@ static void shadow_walk_next(struct kvm_shadow_walk_iterator *iterator) __shadow_walk_next(iterator, *iterator->sptep); } +/* + * Note: while normally KVM uses a "bool flush" return value to let + * the caller batch flushes, __link_shadow_page() flushes immediately + * before populating the parent PTE with the new shadow page. The + * typical callers, direct_map() and FNAME(fetch)(), are not going + * to zap more than one huge SPTE anyway. + * + * The only exception, where @flush can be false, is when a huge SPTE + * is replaced with a shadow page SPTE with a fully populated page table, + * which can happen from shadow_mmu_split_huge_page(). In this case, + * no memory is unmapped across the change to the page tables and no + * immediate flush is needed for correctness. + * + * Even in that case, calls to kvm_mmu_commit_zap_page() are not + * batched. Doing so would require adding an invalid_list argument + * all the way down to __walk_slot_rmaps(). + */ static void __link_shadow_page(struct kvm *kvm, struct kvm_mmu_memory_cache *cache, u64 *sptep, struct kvm_mmu_page *sp, bool flush) @@ -2544,13 +2551,18 @@ static void __link_shadow_page(struct kvm *kvm, BUILD_BUG_ON(VMX_EPT_WRITABLE_MASK != PT_WRITABLE_MASK); - /* - * If an SPTE is present already, it must be a leaf and therefore - * a large one. Drop it, and flush the TLB if needed, before - * installing sp. - */ - if (is_shadow_present_pte(*sptep)) - drop_large_spte(kvm, sptep, flush); + if (is_shadow_present_pte(*sptep)) { + struct kvm_mmu_page *parent_sp; + LIST_HEAD(invalid_list); + + parent_sp = sptep_to_sp(sptep); + WARN_ON_ONCE(parent_sp->role.level == PG_LEVEL_4K); + + if (mmu_page_zap_pte(kvm, parent_sp, sptep, &invalid_list)) + kvm_mmu_commit_zap_page(kvm, &invalid_list); + else if (flush) + kvm_flush_remote_tlbs_sptep(kvm, sptep); + } spte = make_nonleaf_spte(sp->spt, sp_ad_disabled(sp)); diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 961804df5f451c..b340dc9991adb4 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -160,6 +160,16 @@ void nested_vmcb02_recalc_intercepts(struct vcpu_svm *svm) if (!intercept_smi) vmcb_clr_intercept(&vmcb02->control, INTERCEPT_SMI); + /* + * Intercept PAUSE if and only if L1 wants to. KVM intercepts PAUSE so + * that a vCPU that may be spinning waiting for a lock can be scheduled + * out in favor of the vCPU that holds said lock. KVM doesn't support + * yielding across L2 vCPUs, as KVM has limited visilibity into which + * L2 vCPUs are in the same L2 VM, i.e. may be contending for locks. + */ + if (!vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_PAUSE)) + vmcb_clr_intercept(&vmcb02->control, INTERCEPT_PAUSE); + if (nested_vmcb_needs_vls_intercept(svm)) { /* * If the virtual VMLOAD/VMSAVE is not enabled for the L2, @@ -819,7 +829,6 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) struct vmcb *vmcb02 = svm->nested.vmcb02.ptr; struct vmcb *vmcb01 = svm->vmcb01.ptr; struct kvm_vcpu *vcpu = &svm->vcpu; - u32 pause_count12, pause_thresh12; nested_svm_transition_tlb_flush(vcpu); @@ -947,31 +956,13 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) vmcb02->control.misc_ctl2 |= SVM_MISC2_ENABLE_V_VMLOAD_VMSAVE; if (guest_cpu_cap_has(vcpu, X86_FEATURE_PAUSEFILTER)) - pause_count12 = vmcb12_ctrl->pause_filter_count; + vmcb02->control.pause_filter_count = vmcb12_ctrl->pause_filter_count; else - pause_count12 = 0; + vmcb02->control.pause_filter_count = 0; if (guest_cpu_cap_has(vcpu, X86_FEATURE_PFTHRESHOLD)) - pause_thresh12 = vmcb12_ctrl->pause_filter_thresh; + vmcb02->control.pause_filter_thresh = vmcb12_ctrl->pause_filter_thresh; else - pause_thresh12 = 0; - if (kvm_pause_in_guest(svm->vcpu.kvm)) { - /* use guest values since host doesn't intercept PAUSE */ - vmcb02->control.pause_filter_count = pause_count12; - vmcb02->control.pause_filter_thresh = pause_thresh12; - - } else { - /* start from host values otherwise */ - vmcb02->control.pause_filter_count = vmcb01->control.pause_filter_count; - vmcb02->control.pause_filter_thresh = vmcb01->control.pause_filter_thresh; - - /* ... but ensure filtering is disabled if so requested. */ - if (vmcb12_is_intercept(vmcb12_ctrl, INTERCEPT_PAUSE)) { - if (!pause_count12) - vmcb02->control.pause_filter_count = 0; - if (!pause_thresh12) - vmcb02->control.pause_filter_thresh = 0; - } - } + vmcb02->control.pause_filter_thresh = 0; /* * Take ALLOW_LARGER_RAP from vmcb12 even though it should be safe to @@ -1298,12 +1289,6 @@ void nested_svm_vmexit(struct vcpu_svm *svm) /* in case we halted in L2 */ kvm_set_mp_state(vcpu, KVM_MP_STATE_RUNNABLE); - if (!kvm_pause_in_guest(vcpu->kvm)) { - vmcb01->control.pause_filter_count = vmcb02->control.pause_filter_count; - vmcb_mark_dirty(vmcb01, VMCB_INTERCEPTS); - - } - /* * Invalidate last_bus_lock_rip unless KVM is still waiting for the * guest to make forward progress before re-enabling bus lock detection. diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index e7fdd7a9c280d7..e02a38da5296e3 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -913,7 +913,15 @@ static void grow_ple_window(struct kvm_vcpu *vcpu) struct vmcb_control_area *control = &svm->vmcb->control; int old = control->pause_filter_count; - if (kvm_pause_in_guest(vcpu->kvm)) + /* Adjusting pause_filter_count makes no sense if PLE is disabled. */ + WARN_ON_ONCE(kvm_pause_in_guest(vcpu->kvm)); + + /* + * While running L2, KVM should intercept PAUSE if and only if L1 wants + * to intercept PAUSE, and L1's intercept should take priority, i.e. + * KVM should never handle a PAUSE intercept from L2. + */ + if (WARN_ON_ONCE(is_guest_mode(vcpu))) return; control->pause_filter_count = __grow_ple_window(old, @@ -934,7 +942,10 @@ static void shrink_ple_window(struct kvm_vcpu *vcpu) struct vmcb_control_area *control = &svm->vmcb->control; int old = control->pause_filter_count; - if (kvm_pause_in_guest(vcpu->kvm)) + /* Adjusting pause_filter_count makes no sense if PLE is disabled. */ + WARN_ON_ONCE(kvm_pause_in_guest(vcpu->kvm)); + + if (is_guest_mode(vcpu)) return; control->pause_filter_count = diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index e7fdbe9efc904c..0db25bba17f6e8 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -154,7 +154,7 @@ TRACE_EVENT(kvm_xen_hypercall, __entry->a2 = a2; __entry->a3 = a3; __entry->a4 = a4; - __entry->a4 = a5; + __entry->a5 = a5; ), TP_printk("cpl %d nr 0x%lx a0 0x%lx a1 0x%lx a2 0x%lx a3 0x%lx a4 0x%lx a5 %lx", diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index 56cacc06225ecc..31568274d8bb02 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -14,6 +14,7 @@ extern bool __read_mostly flexpriority_enabled; extern bool __read_mostly enable_ept; extern bool __read_mostly enable_unrestricted_guest; extern bool __read_mostly enable_ept_ad_bits; +extern bool __read_mostly enable_cet; extern bool __read_mostly enable_pml; extern int __read_mostly pt_mode; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index a29896a9ef1456..49feecb286b23c 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -108,6 +108,9 @@ module_param_named(unrestricted_guest, bool __read_mostly enable_ept_ad_bits = 1; module_param_named(eptad, enable_ept_ad_bits, bool, 0444); +bool __read_mostly enable_cet = 1; +module_param_named(cet, enable_cet, bool, 0444); + static bool __read_mostly emulate_invalid_guest_state = true; module_param(emulate_invalid_guest_state, bool, 0444); @@ -4476,7 +4479,7 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vmx) * SSP is reloaded from IA32_PL3_SSP. Check SDM Vol.2A/B Chapter * 3 and 4 for details. */ - if (cpu_has_load_cet_ctrl()) { + if (enable_cet) { vmcs_writel(HOST_S_CET, kvm_host.s_cet); vmcs_writel(HOST_SSP, 0); vmcs_writel(HOST_INTR_SSP_TABLE, 0); @@ -4532,6 +4535,10 @@ static u32 vmx_get_initial_vmentry_ctrl(void) if (vmx_pt_mode_is_system()) vmentry_ctrl &= ~(VM_ENTRY_PT_CONCEAL_PIP | VM_ENTRY_LOAD_IA32_RTIT_CTL); + + if (!enable_cet) + vmentry_ctrl &= ~VM_ENTRY_LOAD_CET_STATE; + /* * IA32e mode, and loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically. */ @@ -4546,6 +4553,9 @@ static u32 vmx_get_initial_vmexit_ctrl(void) { u32 vmexit_ctrl = vmcs_config.vmexit_ctrl; + if (!enable_cet) + vmexit_ctrl &= ~VM_EXIT_LOAD_CET_STATE; + /* * Not used by KVM and never set in vmcs01 or vmcs02, but emulated for * nested virtualization and thus allowed to be set in vmcs12. @@ -7029,8 +7039,8 @@ static void vmx_set_rvi(int vector) int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) { struct vcpu_vt *vt = to_vt(vcpu); + bool max_irr_is_from_pir; int max_irr; - bool got_posted_interrupt; if (KVM_BUG_ON(!enable_apicv, vcpu->kvm)) return -EIO; @@ -7042,17 +7052,22 @@ int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) * But on x86 this is just a compiler barrier anyway. */ smp_mb__after_atomic(); - got_posted_interrupt = - kvm_apic_update_irr(vcpu, vt->pi_desc.pir, &max_irr); + max_irr_is_from_pir = kvm_apic_update_irr(vcpu, vt->pi_desc.pir, + &max_irr); } else { max_irr = kvm_lapic_find_highest_irr(vcpu); - got_posted_interrupt = false; + max_irr_is_from_pir = false; } /* - * Newly recognized interrupts are injected via either virtual interrupt - * delivery (RVI) or KVM_REQ_EVENT. Virtual interrupt delivery is - * disabled in two cases: + * If APICv is enabled and L2 is not active, then update the Requesting + * Virtual Interrupt (RVI) portion of vmcs01.GUEST_INTR_STATUS with the + * highest priority IRR to deliver the IRQ via Virtual Interrupt + * Delivery. Note, this is required even if the highest priority IRQ + * was already pending in the IRR, as RVI isn't updated in lockstep with + * the IRR (unlike apic->irr_pending). + * + * For the cases where Virtual Interrupt Delivery can't be used: * * 1) If L2 is running and the vCPU has a new pending interrupt. If L1 * wants to exit on interrupts, KVM_REQ_EVENT is needed to synthesize a @@ -7063,10 +7078,29 @@ int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) * 2) If APICv is disabled for this vCPU, assigned devices may still * attempt to post interrupts. The posted interrupt vector will cause * a VM-Exit and the subsequent entry will call sync_pir_to_irr. + * + * In both cases, set KVM_REQ_EVENT if and only if the highest priority + * pending IRQ came from the PIR, as setting KVM_REQ_EVENT if any IRQ + * is pending may put the vCPU into an infinite loop, e.g. if the IRQ + * is blocked, then it will stay pending until an IRQ window is opened. + * + * Note! It's possible that one or more IRQs were moved from the PIR + * to the IRR _without_ max_irr_is_from_pir being true! I.e. if there + * was a higher priority IRQ already pending in the IRR. Not setting + * KVM_REQ_EVENT in this case is intentional and safe. If APICv is + * inactive, or L2 is running with exit-on-interrupt off (in vmcs12), + * i.e. without nested virtual interrupt delivery, then there's no need + * to request an IRQ window as the lower priority IRQ only needs to be + * delivered when the higher priority IRQ is dismissed from the ISR, + * i.e. on the next EOI, and EOIs are always intercepted if APICv is + * disabled or if L2 is running without nested VID. If L2 is running + * exit-on-interrupt on (in vmcs12), then the higher priority IRQ will + * trigger a nested VM-Exit, at which point KVM will re-evaluate L1's + * pending IRQs. */ if (!is_guest_mode(vcpu) && kvm_vcpu_apicv_active(vcpu)) vmx_set_rvi(max_irr); - else if (got_posted_interrupt) + else if (max_irr_is_from_pir) kvm_make_request(KVM_REQ_EVENT, vcpu); return max_irr; @@ -8131,7 +8165,7 @@ static __init void vmx_set_cpu_caps(void) * VMX_BASIC[bit56] == 0, inject #CP at VMX entry with error code * fails, so disable CET in this case too. */ - if (!cpu_has_load_cet_ctrl() || !enable_unrestricted_guest || + if (!enable_cet || !enable_unrestricted_guest || !cpu_has_vmx_basic_no_hw_errcode_cc()) { kvm_cpu_cap_clear(X86_FEATURE_SHSTK); kvm_cpu_cap_clear(X86_FEATURE_IBT); @@ -8606,6 +8640,9 @@ __init int vmx_hardware_setup(void) !cpu_has_vmx_invept_global()) enable_ept = 0; + if (!cpu_has_load_cet_ctrl()) + enable_cet = 0; + /* NX support is required for shadow paging. */ if (!enable_ept && !boot_cpu_has(X86_FEATURE_NX)) { pr_err_ratelimited("NX (Execute Disable) not supported\n"); diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index f0e77e08448207..63de8e8684f237 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -686,7 +686,7 @@ page_fault_oops(struct pt_regs *regs, unsigned long error_code, * avoid hanging the system. */ if (IS_ENABLED(CONFIG_EFI)) - efi_crash_gracefully_on_page_fault(address); + efi_crash_gracefully_on_page_fault(address, regs); /* Only not-present faults should be handled by KFENCE. */ if (!(error_code & X86_PF_PROT) && diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c index df24ffc6105d64..90a065fcb1fab2 100644 --- a/arch/x86/platform/efi/quirks.c +++ b/arch/x86/platform/efi/quirks.c @@ -761,7 +761,8 @@ int efi_capsule_setup_info(struct capsule_info *cap_info, void *kbuff, * @return: Returns, if the page fault is not handled. This function * will never return if the page fault is handled successfully. */ -void efi_crash_gracefully_on_page_fault(unsigned long phys_addr) +void efi_crash_gracefully_on_page_fault(unsigned long phys_addr, + const struct pt_regs *regs) { if (!IS_ENABLED(CONFIG_X86_64)) return; @@ -770,7 +771,7 @@ void efi_crash_gracefully_on_page_fault(unsigned long phys_addr) * If we get an interrupt/NMI while processing an EFI runtime service * then this is a regular OOPS, not an EFI failure. */ - if (in_interrupt()) + if (!in_task()) return; /* @@ -810,6 +811,14 @@ void efi_crash_gracefully_on_page_fault(unsigned long phys_addr) return; } + /* + * The API does not permit entering a kernel mode FPU section with + * interrupts enabled and leaving it with interrupts disabled. So + * re-enable interrupts now if they were enabled when the page fault + * occurred. + */ + local_irq_restore(regs->flags); + /* * Before calling EFI Runtime Service, the kernel has switched the * calling process to efi_mm. Hence, switch back to task_mm. diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index c80d0058efd18f..3eee5f84f8a70d 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -2145,7 +2145,10 @@ static void xen_set_fixmap(unsigned idx, phys_addr_t phys, pgprot_t prot) static void xen_enter_lazy_mmu(void) { - enter_lazy(XEN_LAZY_MMU); + preempt_disable(); + if (xen_get_lazy_mode() != XEN_LAZY_MMU) + enter_lazy(XEN_LAZY_MMU); + preempt_enable(); } static void xen_flush_lazy_mmu(void) @@ -2182,7 +2185,8 @@ static void xen_leave_lazy_mmu(void) { preempt_disable(); xen_mc_flush(); - leave_lazy(XEN_LAZY_MMU); + if (xen_get_lazy_mode() != XEN_LAZY_NONE) + leave_lazy(XEN_LAZY_MMU); preempt_enable(); } diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c index ac8021c3a997e0..41251d4cf953f2 100644 --- a/arch/x86/xen/setup.c +++ b/arch/x86/xen/setup.c @@ -655,7 +655,7 @@ static void __init xen_e820_swap_entry_with_ram(struct e820_entry *swap_entry) /* Fill new entry (keep size and page offset). */ entry->type = swap_entry->type; entry->addr = entry_end - swap_size + - swap_addr - swap_entry->addr; + swap_entry->addr - swap_addr; entry->size = swap_entry->size; /* Convert old entry to RAM, align to pages. */ @@ -695,17 +695,22 @@ static void __init xen_e820_resolve_conflicts(phys_addr_t start, return; end = start + size; - entry = xen_e820_table.entries; + mapcnt = 0; - for (mapcnt = 0; mapcnt < xen_e820_table.nr_entries; mapcnt++) { + while (mapcnt < xen_e820_table.nr_entries) { + entry = xen_e820_table.entries + mapcnt; if (entry->addr >= end) return; if (entry->addr + entry->size > start && - entry->type == E820_TYPE_NVS) + entry->type == E820_TYPE_NVS) { xen_e820_swap_entry_with_ram(entry); + /* E820 map has been changed, restart loop! */ + mapcnt = 0; + continue; + } - entry++; + mapcnt++; } } diff --git a/block/bio-integrity.c b/block/bio-integrity.c index e54c6e06e1cbba..e796de1a749e33 100644 --- a/block/bio-integrity.c +++ b/block/bio-integrity.c @@ -308,7 +308,6 @@ static int bio_integrity_copy_user(struct bio *bio, struct bio_vec *bvec, } bip->bip_flags |= BIP_COPY_USER; - bip->bip_vcnt = nr_vecs; return 0; free_bip: bio_integrity_free(bio); @@ -403,6 +402,24 @@ int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter) if (unlikely(ret < 0)) goto free_bvec; + /* + * Handle partial pinning. This can happen when pin_user_pages_fast() + * returns fewer pages than requested. + */ + if (user_backed_iter(iter) && unlikely(ret != bytes)) { + if (ret > 0) { + int npinned = DIV_ROUND_UP(offset + ret, PAGE_SIZE); + int i; + + for (i = 0; i < npinned; i++) + unpin_user_page(pages[i]); + } + if (pages != stack_pages) + kvfree(pages); + ret = -EFAULT; + goto free_bvec; + } + nr_bvecs = bvec_from_pages(bvec, pages, nr_vecs, bytes, offset, &is_p2p); if (pages != stack_pages) diff --git a/block/bio.c b/block/bio.c index b8972dba68a090..5f10900b3f42a1 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1279,11 +1279,12 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter, return bio_iov_iter_align_down(bio, iter, len_align_mask); } -static struct folio *folio_alloc_greedy(gfp_t gfp, size_t *size) +static struct folio *folio_alloc_greedy(gfp_t gfp, size_t *size, + size_t minsize) { struct folio *folio; - while (*size > PAGE_SIZE) { + while (*size > minsize) { folio = folio_alloc(gfp | __GFP_NORETRY, get_order(*size)); if (folio) return folio; @@ -1307,7 +1308,7 @@ static void bio_free_folios(struct bio *bio) } static int bio_iov_iter_bounce_write(struct bio *bio, struct iov_iter *iter, - size_t maxlen) + size_t maxlen, size_t minsize) { size_t total_len = min(maxlen, iov_iter_count(iter)); @@ -1322,13 +1323,13 @@ static int bio_iov_iter_bounce_write(struct bio *bio, struct iov_iter *iter, size_t this_len = min(total_len, SZ_1M); struct folio *folio; - if (this_len > PAGE_SIZE * 2) + if (this_len > minsize * 2) this_len = rounddown_pow_of_two(this_len); if (bio->bi_iter.bi_size > BIO_MAX_SIZE - this_len) break; - folio = folio_alloc_greedy(GFP_KERNEL, &this_len); + folio = folio_alloc_greedy(GFP_KERNEL, &this_len, minsize); if (!folio) break; bio_add_folio_nofail(bio, folio, this_len, 0); @@ -1344,16 +1345,16 @@ static int bio_iov_iter_bounce_write(struct bio *bio, struct iov_iter *iter, if (!bio->bi_iter.bi_size) return -ENOMEM; - return 0; + return bio_iov_iter_align_down(bio, iter, minsize - 1); } static int bio_iov_iter_bounce_read(struct bio *bio, struct iov_iter *iter, - size_t maxlen) + size_t maxlen, size_t minsize) { size_t len = min3(iov_iter_count(iter), maxlen, SZ_1M); struct folio *folio; - folio = folio_alloc_greedy(GFP_KERNEL, &len); + folio = folio_alloc_greedy(GFP_KERNEL, &len, minsize); if (!folio) return -ENOMEM; @@ -1382,7 +1383,7 @@ static int bio_iov_iter_bounce_read(struct bio *bio, struct iov_iter *iter, bvec_set_folio(&bio->bi_io_vec[0], folio, bio->bi_iter.bi_size, 0); if (iov_iter_extract_will_pin(iter)) bio_set_flag(bio, BIO_PAGE_PINNED); - return 0; + return bio_iov_iter_align_down(bio, iter, minsize - 1); } /** @@ -1390,6 +1391,7 @@ static int bio_iov_iter_bounce_read(struct bio *bio, struct iov_iter *iter, * @bio: bio to send * @iter: iter to read from / write into * @maxlen: maximum size to bounce + * @minsize: minimum folio allocation size * * Helper for direct I/O implementations that need to bounce buffer because * we need to checksum the data or perform other operations that require @@ -1397,11 +1399,12 @@ static int bio_iov_iter_bounce_read(struct bio *bio, struct iov_iter *iter, * copies the data into it. Needs to be paired with bio_iov_iter_unbounce() * called on completion. */ -int bio_iov_iter_bounce(struct bio *bio, struct iov_iter *iter, size_t maxlen) +int bio_iov_iter_bounce(struct bio *bio, struct iov_iter *iter, size_t maxlen, + size_t minsize) { if (op_is_write(bio_op(bio))) - return bio_iov_iter_bounce_write(bio, iter, maxlen); - return bio_iov_iter_bounce_read(bio, iter, maxlen); + return bio_iov_iter_bounce_write(bio, iter, maxlen, minsize); + return bio_iov_iter_bounce_read(bio, iter, maxlen, minsize); } static void bvec_unpin(struct bio_vec *bv, bool mark_dirty) diff --git a/block/blk-mq.c b/block/blk-mq.c index 4c5c16cce4f8f3..d0c37daf568f24 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3307,6 +3307,25 @@ blk_status_t blk_insert_cloned_request(struct request *rq) return BLK_STS_IOERR; } + /* + * Integrity segment counting depends on the same queue limits + * (virt_boundary_mask, seg_boundary_mask, max_segment_size) that + * vary across stacked queues, so recompute against the bottom + * queue just like nr_phys_segments above. + */ + if (blk_integrity_rq(rq) && rq->bio) { + unsigned short max_int_segs = queue_max_integrity_segments(q); + + rq->nr_integrity_segments = + blk_rq_count_integrity_sg(rq->q, rq->bio); + if (rq->nr_integrity_segments > max_int_segs) { + printk(KERN_ERR "%s: over max integrity segments limit. (%u > %u)\n", + __func__, rq->nr_integrity_segments, + max_int_segs); + return BLK_STS_IOERR; + } + } + if (q->disk && should_fail_request(q->disk->part0, blk_rq_bytes(rq))) return BLK_STS_IOERR; diff --git a/block/blk-zoned.c b/block/blk-zoned.c index 30cad2bb9291b6..42ef830054dc8b 100644 --- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -623,6 +623,28 @@ static void disk_mark_zone_wplug_dead(struct blk_zone_wplug *zwplug) } } +static inline bool disk_check_zone_wplug_dead(struct blk_zone_wplug *zwplug) +{ + if (!(zwplug->flags & BLK_ZONE_WPLUG_DEAD)) + return false; + + /* + * If a new write is received right after a zone reset completes and + * while the disk_zone_wplugs_worker() thread has not yet released the + * reference on the zone write plug after processing the last write to + * the zone, then the new write BIO will see the zone write plug marked + * as dead. This case is however a false positive and a perfectly valid + * pattern. In such case, restore the zone write plug to a live one. + */ + if (!zwplug->wp_offset && bio_list_empty(&zwplug->bio_list)) { + zwplug->flags &= ~BLK_ZONE_WPLUG_DEAD; + refcount_inc(&zwplug->ref); + return false; + } + + return true; +} + static bool disk_zone_wplug_submit_bio(struct gendisk *disk, struct blk_zone_wplug *zwplug); @@ -1444,12 +1466,12 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs) spin_lock_irqsave(&zwplug->lock, flags); /* - * If we got a zone write plug marked as dead, then the user is issuing - * writes to a full zone, or without synchronizing with zone reset or - * zone finish operations. In such case, fail the BIO to signal this - * invalid usage. + * Check if we got a zone write plug marked as dead. If yes, then the + * user is likely issuing writes to a full zone, or without + * synchronizing with zone reset or zone finish operations. In such + * case, fail the BIO to signal this invalid usage. */ - if (zwplug->flags & BLK_ZONE_WPLUG_DEAD) { + if (disk_check_zone_wplug_dead(zwplug)) { spin_unlock_irqrestore(&zwplug->lock, flags); disk_put_zone_wplug(zwplug); bio_io_error(bio); diff --git a/block/ioctl.c b/block/ioctl.c index fc3be0549aa754..ab2c9ed7994683 100644 --- a/block/ioctl.c +++ b/block/ioctl.c @@ -857,6 +857,8 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg) #endif struct blk_iou_cmd { + u64 start; + u64 len; int res; bool nowait; }; @@ -946,23 +948,27 @@ int blkdev_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags) { struct block_device *bdev = I_BDEV(cmd->file->f_mapping->host); struct blk_iou_cmd *bic = io_uring_cmd_to_pdu(cmd, struct blk_iou_cmd); - const struct io_uring_sqe *sqe = cmd->sqe; u32 cmd_op = cmd->cmd_op; - uint64_t start, len; - if (unlikely(sqe->ioprio || sqe->__pad1 || sqe->len || - sqe->rw_flags || sqe->file_index)) - return -EINVAL; + /* Read what we need from the SQE on the first issue */ + if (!(issue_flags & IORING_URING_CMD_REISSUE)) { + const struct io_uring_sqe *sqe = cmd->sqe; + + if (unlikely(sqe->ioprio || sqe->__pad1 || sqe->len || + sqe->rw_flags || sqe->file_index)) + return -EINVAL; + + bic->start = READ_ONCE(sqe->addr); + bic->len = READ_ONCE(sqe->addr3); + } bic->res = 0; bic->nowait = issue_flags & IO_URING_F_NONBLOCK; - start = READ_ONCE(sqe->addr); - len = READ_ONCE(sqe->addr3); - switch (cmd_op) { case BLOCK_URING_CMD_DISCARD: - return blkdev_cmd_discard(cmd, bdev, start, len, bic->nowait); + return blkdev_cmd_discard(cmd, bdev, bic->start, bic->len, + bic->nowait); } return -EINVAL; } diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c index 2801378e3e1927..3b7b008bccfec7 100644 --- a/drivers/accel/ivpu/ivpu_drv.c +++ b/drivers/accel/ivpu/ivpu_drv.c @@ -537,6 +537,26 @@ static const struct file_operations ivpu_fops = { #endif }; +static int ivpu_gem_prime_handle_to_fd(struct drm_device *dev, struct drm_file *file_priv, + u32 handle, u32 flags, int *prime_fd) +{ + struct drm_gem_object *obj; + + obj = drm_gem_object_lookup(file_priv, handle); + if (!obj) + return -ENOENT; + + if (drm_gem_is_imported(obj)) { + /* Do not allow re-exporting */ + drm_gem_object_put(obj); + return -EOPNOTSUPP; + } + + drm_gem_object_put(obj); + + return drm_gem_prime_handle_to_fd(dev, file_priv, handle, flags, prime_fd); +} + static const struct drm_driver driver = { .driver_features = DRIVER_GEM | DRIVER_COMPUTE_ACCEL, @@ -545,6 +565,7 @@ static const struct drm_driver driver = { .gem_create_object = ivpu_gem_create_object, .gem_prime_import = ivpu_gem_prime_import, + .prime_handle_to_fd = ivpu_gem_prime_handle_to_fd, .ioctls = ivpu_drm_ioctls, .num_ioctls = ARRAY_SIZE(ivpu_drm_ioctls), diff --git a/drivers/accel/qaic/qaic_data.c b/drivers/accel/qaic/qaic_data.c index 95300c2f7d8af0..1e4c579d272562 100644 --- a/drivers/accel/qaic/qaic_data.c +++ b/drivers/accel/qaic/qaic_data.c @@ -606,8 +606,11 @@ static const struct vm_operations_struct drm_vm_ops = { static int qaic_gem_object_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma) { struct qaic_bo *bo = to_qaic_bo(obj); + unsigned long remap_start; unsigned long offset = 0; + unsigned long remap_end; struct scatterlist *sg; + unsigned long length; int ret = 0; if (drm_gem_is_imported(obj)) @@ -615,11 +618,27 @@ static int qaic_gem_object_mmap(struct drm_gem_object *obj, struct vm_area_struc for (sg = bo->sgt->sgl; sg; sg = sg_next(sg)) { if (sg_page(sg)) { + /* if sg is too large for the VMA, so truncate it to fit */ + if (check_add_overflow(vma->vm_start, offset, &remap_start)) + return -EINVAL; + if (check_add_overflow(remap_start, sg->length, &remap_end)) + return -EINVAL; + + if (remap_end > vma->vm_end) { + if (check_sub_overflow(vma->vm_end, remap_start, &length)) + return -EINVAL; + } else { + length = sg->length; + } + + if (length == 0) + goto out; + ret = remap_pfn_range(vma, vma->vm_start + offset, page_to_pfn(sg_page(sg)), - sg->length, vma->vm_page_prot); + length, vma->vm_page_prot); if (ret) goto out; - offset += sg->length; + offset += length; } } diff --git a/drivers/accel/qaic/qaic_ras.c b/drivers/accel/qaic/qaic_ras.c index cc0b75461e1ac3..6791af366cba73 100644 --- a/drivers/accel/qaic/qaic_ras.c +++ b/drivers/accel/qaic/qaic_ras.c @@ -497,11 +497,11 @@ static void decode_ras_msg(struct qaic_device *qdev, struct ras_data *msg) qdev->ce_count++; break; case UE: - if (qdev->ce_count != UINT_MAX) + if (qdev->ue_count != UINT_MAX) qdev->ue_count++; break; case UE_NF: - if (qdev->ce_count != UINT_MAX) + if (qdev->ue_nf_count != UINT_MAX) qdev->ue_nf_count++; break; default: diff --git a/drivers/accel/rocket/rocket_gem.c b/drivers/accel/rocket/rocket_gem.c index b6a385d2edfc30..c8084719208a2a 100644 --- a/drivers/accel/rocket/rocket_gem.c +++ b/drivers/accel/rocket/rocket_gem.c @@ -145,6 +145,8 @@ int rocket_ioctl_prep_bo(struct drm_device *dev, void *data, struct drm_file *fi ret = dma_resv_wait_timeout(gem_obj->resv, DMA_RESV_USAGE_WRITE, true, timeout); if (!ret) ret = timeout ? -ETIMEDOUT : -EBUSY; + else if (ret > 0) + ret = 0; shmem_obj = &to_rocket_bo(gem_obj)->base; diff --git a/drivers/acpi/ac.c b/drivers/acpi/ac.c index e9e970fd8f33e6..27f31744f29e23 100644 --- a/drivers/acpi/ac.c +++ b/drivers/acpi/ac.c @@ -192,11 +192,15 @@ static const struct dmi_system_id ac_dmi_table[] __initconst = { static int acpi_ac_probe(struct platform_device *pdev) { - struct acpi_device *adev = ACPI_COMPANION(&pdev->dev); struct power_supply_config psy_cfg = {}; + struct acpi_device *adev; struct acpi_ac *ac; int result; + adev = ACPI_COMPANION(&pdev->dev); + if (!adev) + return -ENODEV; + ac = kzalloc_obj(struct acpi_ac); if (!ac) return -ENOMEM; diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c index 0a8e02bc8c8b6a..ec94b09bb74716 100644 --- a/drivers/acpi/acpi_pad.c +++ b/drivers/acpi/acpi_pad.c @@ -423,7 +423,11 @@ static void acpi_pad_notify(acpi_handle handle, u32 event, void *data) static int acpi_pad_probe(struct platform_device *pdev) { - struct acpi_device *adev = ACPI_COMPANION(&pdev->dev); + struct acpi_device *adev; + + adev = ACPI_COMPANION(&pdev->dev); + if (!adev) + return -ENODEV; return acpi_dev_install_notify_handler(adev, ACPI_DEVICE_NOTIFY, acpi_pad_notify, adev); diff --git a/drivers/acpi/acpi_tad.c b/drivers/acpi/acpi_tad.c index cac07e997028a3..386fc1abcbdcb5 100644 --- a/drivers/acpi/acpi_tad.c +++ b/drivers/acpi/acpi_tad.c @@ -815,12 +815,16 @@ static void acpi_tad_remove(void *data) static int acpi_tad_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; - acpi_handle handle = ACPI_HANDLE(dev); struct acpi_tad_driver_data *dd; + acpi_handle handle; acpi_status status; unsigned long long caps; int ret; + handle = ACPI_HANDLE(dev); + if (!handle) + return -ENODEV; + /* * Initialization failure messages are mostly about firmware issues, so * print them at the "info" level. diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c index b4c25474f42f4d..d4ae21a7100796 100644 --- a/drivers/acpi/battery.c +++ b/drivers/acpi/battery.c @@ -1214,10 +1214,14 @@ static void sysfs_battery_cleanup(struct acpi_battery *battery) static int acpi_battery_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); struct acpi_battery *battery; + struct acpi_device *device; int result; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + if (device->dep_unmet) return -EPROBE_DEFER; diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c index dc064a388c23ea..b47301ee4c8a82 100644 --- a/drivers/acpi/button.c +++ b/drivers/acpi/button.c @@ -531,15 +531,20 @@ static int acpi_lid_input_open(struct input_dev *input) static int acpi_button_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); acpi_notify_handler handler; + struct acpi_device *device; struct acpi_button *button; struct input_dev *input; - const char *hid = acpi_device_hid(device); acpi_status status; char *name, *class; + const char *hid; int error = 0; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + + hid = acpi_device_hid(device); if (!strcmp(hid, ACPI_BUTTON_HID_LID) && lid_init_state == ACPI_BUTTON_LID_INIT_DISABLED) return -ENODEV; diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c index 45204538ed874e..64ad4cfa6208bd 100644 --- a/drivers/acpi/ec.c +++ b/drivers/acpi/ec.c @@ -1676,10 +1676,14 @@ static int acpi_ec_setup(struct acpi_ec *ec, struct acpi_device *device, bool ca static int acpi_ec_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); + struct acpi_device *device; struct acpi_ec *ec; int ret; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + if (boot_ec && (boot_ec->handle == device->handle || !strcmp(acpi_device_hid(device), ACPI_ECDT_HID))) { /* Fast path: this device corresponds to the boot EC. */ diff --git a/drivers/acpi/hed.c b/drivers/acpi/hed.c index 4d5e12ed6f3c25..060e8d670f5d37 100644 --- a/drivers/acpi/hed.c +++ b/drivers/acpi/hed.c @@ -50,9 +50,13 @@ static void acpi_hed_notify(acpi_handle handle, u32 event, void *data) static int acpi_hed_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); + struct acpi_device *device; int err; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + /* Only one hardware error device */ if (hed_handle) return -EINVAL; diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index d13264fb9e026b..9304ac996d41ad 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -3341,12 +3341,16 @@ static int acpi_nfit_probe(struct platform_device *pdev) struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER, NULL }; struct acpi_nfit_desc *acpi_desc; struct device *dev = &pdev->dev; - struct acpi_device *adev = ACPI_COMPANION(dev); struct acpi_table_header *tbl; + struct acpi_device *adev; acpi_status status = AE_OK; acpi_size sz; int rc = 0; + adev = ACPI_COMPANION(&pdev->dev); + if (!adev) + return -ENODEV; + rc = acpi_dev_install_notify_handler(adev, ACPI_DEVICE_NOTIFY, acpi_nfit_notify, dev); if (rc) diff --git a/drivers/acpi/pfr_telemetry.c b/drivers/acpi/pfr_telemetry.c index 32bdf8cbe8f237..2387376832a1b6 100644 --- a/drivers/acpi/pfr_telemetry.c +++ b/drivers/acpi/pfr_telemetry.c @@ -360,10 +360,14 @@ static void pfrt_log_put_idx(void *data) static int acpi_pfrt_log_probe(struct platform_device *pdev) { - acpi_handle handle = ACPI_HANDLE(&pdev->dev); struct pfrt_log_device *pfrt_log_dev; + acpi_handle handle; int ret; + handle = ACPI_HANDLE(&pdev->dev); + if (!handle) + return -ENODEV; + if (!acpi_has_method(handle, "_DSM")) { dev_dbg(&pdev->dev, "Missing _DSM\n"); return -ENODEV; diff --git a/drivers/acpi/pfr_update.c b/drivers/acpi/pfr_update.c index 11b1c282800525..6283105bb0e8b2 100644 --- a/drivers/acpi/pfr_update.c +++ b/drivers/acpi/pfr_update.c @@ -538,10 +538,14 @@ static void pfru_put_idx(void *data) static int acpi_pfru_probe(struct platform_device *pdev) { - acpi_handle handle = ACPI_HANDLE(&pdev->dev); struct pfru_device *pfru_dev; + acpi_handle handle; int ret; + handle = ACPI_HANDLE(&pdev->dev); + if (!handle) + return -ENODEV; + if (!acpi_has_method(handle, "_DSM")) { dev_dbg(&pdev->dev, "Missing _DSM\n"); return -ENODEV; diff --git a/drivers/acpi/sbs.c b/drivers/acpi/sbs.c index 440f1d69aca86c..86b7c797585263 100644 --- a/drivers/acpi/sbs.c +++ b/drivers/acpi/sbs.c @@ -629,11 +629,15 @@ static void acpi_sbs_callback(void *context) static int acpi_sbs_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); + struct acpi_device *device; struct acpi_sbs *sbs; int result = 0; int id; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + sbs = kzalloc_obj(struct acpi_sbs); if (!sbs) { result = -ENOMEM; diff --git a/drivers/acpi/sbshc.c b/drivers/acpi/sbshc.c index f413270415b687..c0ffa267f96cd5 100644 --- a/drivers/acpi/sbshc.c +++ b/drivers/acpi/sbshc.c @@ -237,11 +237,15 @@ static int smbus_alarm(void *context) static int acpi_smbus_hc_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); + struct acpi_device *device; int status; unsigned long long val; struct acpi_smb_hc *hc; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + status = acpi_evaluate_integer(device->handle, "_EC", NULL, &val); if (ACPI_FAILURE(status)) { pr_err("error obtaining _EC.\n"); diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index b8b487d89d25b1..dfc7daa809b50f 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -789,7 +789,7 @@ static int acpi_thermal_probe(struct platform_device *pdev) int i; if (!device) - return -EINVAL; + return -ENODEV; tz = kzalloc_obj(struct acpi_thermal); if (!tz) diff --git a/drivers/acpi/tiny-power-button.c b/drivers/acpi/tiny-power-button.c index 531e65b01bcbe1..92516ef84b0216 100644 --- a/drivers/acpi/tiny-power-button.c +++ b/drivers/acpi/tiny-power-button.c @@ -38,9 +38,13 @@ static u32 acpi_tiny_power_button_event(void *not_used) static int acpi_tiny_power_button_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); + struct acpi_device *device; acpi_status status; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + if (device->device_type == ACPI_BUS_TYPE_POWER_BUTTON) { status = acpi_install_fixed_event_handler(ACPI_EVENT_POWER_BUTTON, acpi_tiny_power_button_event, diff --git a/drivers/android/binder/range_alloc/array.rs b/drivers/android/binder/range_alloc/array.rs index ada1d1b4302e53..081d19b09d4bb4 100644 --- a/drivers/android/binder/range_alloc/array.rs +++ b/drivers/android/binder/range_alloc/array.rs @@ -204,7 +204,6 @@ impl ArrayRangeAllocator { // caller will mark them as unused, which means that they can be freed if the system comes // under memory pressure. let mut freed_range = FreedRange::interior_pages(offset, size); - #[expect(clippy::collapsible_if)] // reads better like this if offset % PAGE_SIZE != 0 { if i == 0 || self.ranges[i - 1].endpoint() <= (offset & PAGE_MASK) { freed_range.start_page_idx -= 1; diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index 8e5f3738c20330..6c041eaebdb911 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -900,12 +900,29 @@ static int ublk_validate_params(const struct ublk_device *ub) if (p->logical_bs_shift > PAGE_SHIFT || p->logical_bs_shift < 9) return -EINVAL; + /* + * 256M is a reasonable upper bound for physical block size, + * io_min and io_opt; it aligns with the maximum physical + * block size possible in NVMe. + */ + if (p->physical_bs_shift > ilog2(SZ_256M)) + return -EINVAL; + + if (p->io_min_shift > ilog2(SZ_256M)) + return -EINVAL; + + if (p->io_opt_shift > ilog2(SZ_256M)) + return -EINVAL; + if (p->logical_bs_shift > p->physical_bs_shift) return -EINVAL; if (p->max_sectors > (ub->dev_info.max_io_buf_bytes >> 9)) return -EINVAL; + if (p->max_sectors < PAGE_SECTORS) + return -EINVAL; + if (ublk_dev_is_zoned(ub) && !p->chunk_sectors) return -EINVAL; } else @@ -2397,8 +2414,14 @@ static void ublk_reset_ch_dev(struct ublk_device *ub) { int i; - for (i = 0; i < ub->dev_info.nr_hw_queues; i++) - ublk_queue_reinit(ub, ublk_get_queue(ub, i)); + for (i = 0; i < ub->dev_info.nr_hw_queues; i++) { + struct ublk_queue *ubq = ublk_get_queue(ub, i); + + /* Sync with ublk_cancel_cmd() */ + spin_lock(&ubq->cancel_lock); + ublk_queue_reinit(ub, ubq); + spin_unlock(&ubq->cancel_lock); + } /* set to NULL, otherwise new tasks cannot mmap io_cmd_buf */ ub->mm = NULL; @@ -2739,6 +2762,7 @@ static void ublk_cancel_cmd(struct ublk_queue *ubq, unsigned tag, { struct ublk_io *io = &ubq->ios[tag]; struct ublk_device *ub = ubq->dev; + struct io_uring_cmd *cmd = NULL; struct request *req; bool done; @@ -2761,12 +2785,15 @@ static void ublk_cancel_cmd(struct ublk_queue *ubq, unsigned tag, spin_lock(&ubq->cancel_lock); done = !!(io->flags & UBLK_IO_FLAG_CANCELED); - if (!done) + if (!done) { io->flags |= UBLK_IO_FLAG_CANCELED; + cmd = io->cmd; + io->cmd = NULL; + } spin_unlock(&ubq->cancel_lock); - if (!done) - io_uring_cmd_done(io->cmd, UBLK_IO_RES_ABORT, issue_flags); + if (!done && cmd) + io_uring_cmd_done(cmd, UBLK_IO_RES_ABORT, issue_flags); } /* @@ -3496,8 +3523,10 @@ static void ublk_ch_uring_cmd_cb(struct io_tw_req tw_req, io_tw_token_t tw) { unsigned int issue_flags = IO_URING_CMD_TASK_WORK_ISSUE_FLAGS; struct io_uring_cmd *cmd = io_uring_cmd_from_tw(tw_req); - int ret = ublk_ch_uring_cmd_local(cmd, issue_flags); + int ret = -ECANCELED; + if (!tw.cancel) + ret = ublk_ch_uring_cmd_local(cmd, issue_flags); if (ret != -EIOCBQUEUED) io_uring_cmd_done(cmd, ret, issue_flags); } @@ -4990,13 +5019,15 @@ static int ublk_ctrl_set_params(struct ublk_device *ub, */ ret = -EACCES; } else if (copy_from_user(&ub->params, argp, ph.len)) { + /* zero out partial copy so no stale params survive */ + memset(&ub->params, 0, sizeof(ub->params)); ret = -EFAULT; } else { /* clear all we don't support yet */ ub->params.types &= UBLK_PARAM_TYPE_ALL; ret = ublk_validate_params(ub); if (ret) - ub->params.types = 0; + memset(&ub->params, 0, sizeof(ub->params)); } mutex_unlock(&ub->mutex); diff --git a/drivers/bluetooth/btintel_pcie.c b/drivers/bluetooth/btintel_pcie.c index 2f59c0d6f9ec4c..a3643e67b33fd1 100644 --- a/drivers/bluetooth/btintel_pcie.c +++ b/drivers/bluetooth/btintel_pcie.c @@ -289,6 +289,9 @@ static inline void btintel_pcie_dump_debug_registers(struct hci_dev *hdev) skb_put_data(skb, buf, strlen(buf)); data->boot_stage_cache = reg; + if (reg & BTINTEL_PCIE_CSR_BOOT_STAGE_DEVICE_WARNING) + bt_dev_warn(hdev, "Controller device warning (boot_stage: 0x%8.8x)", reg); + reg = btintel_pcie_rd_reg32(data, BTINTEL_PCIE_CSR_IPC_STATUS_REG); snprintf(buf, sizeof(buf), "ipc status: 0x%8.8x", reg); skb_put_data(skb, buf, strlen(buf)); @@ -880,8 +883,11 @@ static inline bool btintel_pcie_in_lockdown(struct btintel_pcie_data *data) static inline bool btintel_pcie_in_error(struct btintel_pcie_data *data) { - return (data->boot_stage_cache & BTINTEL_PCIE_CSR_BOOT_STAGE_DEVICE_ERR) || - (data->boot_stage_cache & BTINTEL_PCIE_CSR_BOOT_STAGE_ABORT_HANDLER); + if (data->boot_stage_cache & BTINTEL_PCIE_CSR_BOOT_STAGE_DEVICE_WARNING) + bt_dev_warn(data->hdev, "Controller device warning (boot_stage: 0x%8.8x)", + data->boot_stage_cache); + + return data->boot_stage_cache & BTINTEL_PCIE_CSR_BOOT_STAGE_ABORT_HANDLER; } static void btintel_pcie_msix_gp1_handler(struct btintel_pcie_data *data) @@ -914,7 +920,8 @@ static void btintel_pcie_msix_gp0_handler(struct btintel_pcie_data *data) data->img_resp_cache = reg; if (btintel_pcie_in_error(data)) { - bt_dev_err(data->hdev, "Controller in error state"); + bt_dev_err(data->hdev, "Controller in error state (boot_stage: 0x%8.8x)", + data->boot_stage_cache); btintel_pcie_dump_debug_registers(data->hdev); return; } diff --git a/drivers/bluetooth/btintel_pcie.h b/drivers/bluetooth/btintel_pcie.h index 3c7bb708362ded..f922abd1e7d881 100644 --- a/drivers/bluetooth/btintel_pcie.h +++ b/drivers/bluetooth/btintel_pcie.h @@ -48,7 +48,7 @@ #define BTINTEL_PCIE_CSR_BOOT_STAGE_OPFW (BIT(2)) #define BTINTEL_PCIE_CSR_BOOT_STAGE_ROM_LOCKDOWN (BIT(10)) #define BTINTEL_PCIE_CSR_BOOT_STAGE_IML_LOCKDOWN (BIT(11)) -#define BTINTEL_PCIE_CSR_BOOT_STAGE_DEVICE_ERR (BIT(12)) +#define BTINTEL_PCIE_CSR_BOOT_STAGE_DEVICE_WARNING (BIT(12)) #define BTINTEL_PCIE_CSR_BOOT_STAGE_ABORT_HANDLER (BIT(13)) #define BTINTEL_PCIE_CSR_BOOT_STAGE_DEVICE_HALTED (BIT(14)) #define BTINTEL_PCIE_CSR_BOOT_STAGE_MAC_ACCESS_ON (BIT(16)) diff --git a/drivers/bluetooth/btmtk.c b/drivers/bluetooth/btmtk.c index 6fb6ca2748086e..f70c1b0f899035 100644 --- a/drivers/bluetooth/btmtk.c +++ b/drivers/bluetooth/btmtk.c @@ -695,8 +695,13 @@ static int btmtk_usb_hci_wmt_sync(struct hci_dev *hdev, if (data->evt_skb == NULL) goto err_free_wc; - /* Parse and handle the return WMT event */ - wmt_evt = (struct btmtk_hci_wmt_evt *)data->evt_skb->data; + wmt_evt = skb_pull_data(data->evt_skb, sizeof(*wmt_evt)); + if (!wmt_evt) { + bt_dev_err(hdev, "WMT event too short (%u bytes)", + data->evt_skb->len); + err = -EINVAL; + goto err_free_skb; + } if (wmt_evt->whdr.op != hdr->op) { bt_dev_err(hdev, "Wrong op received %d expected %d", wmt_evt->whdr.op, hdr->op); @@ -712,6 +717,12 @@ static int btmtk_usb_hci_wmt_sync(struct hci_dev *hdev, status = BTMTK_WMT_PATCH_DONE; break; case BTMTK_WMT_FUNC_CTRL: + if (!skb_pull_data(data->evt_skb, + sizeof(wmt_evt_funcc->status))) { + err = -EINVAL; + goto err_free_skb; + } + wmt_evt_funcc = (struct btmtk_hci_wmt_evt_funcc *)wmt_evt; if (be16_to_cpu(wmt_evt_funcc->status) == 0x404) status = BTMTK_WMT_ON_DONE; diff --git a/drivers/bluetooth/hci_ath.c b/drivers/bluetooth/hci_ath.c index fa679ad0acdfa1..8201fa7f61e848 100644 --- a/drivers/bluetooth/hci_ath.c +++ b/drivers/bluetooth/hci_ath.c @@ -191,6 +191,9 @@ static int ath_recv(struct hci_uart *hu, const void *data, int count) { struct ath_struct *ath = hu->priv; + if (!ath) + return -ENODEV; + ath->rx_skb = h4_recv_buf(hu, ath->rx_skb, data, count, ath_recv_pkts, ARRAY_SIZE(ath_recv_pkts)); if (IS_ERR(ath->rx_skb)) { diff --git a/drivers/bluetooth/hci_bcsp.c b/drivers/bluetooth/hci_bcsp.c index b386f91d8b46d5..db56eead27cebd 100644 --- a/drivers/bluetooth/hci_bcsp.c +++ b/drivers/bluetooth/hci_bcsp.c @@ -585,6 +585,9 @@ static int bcsp_recv(struct hci_uart *hu, const void *data, int count) if (!test_bit(HCI_UART_REGISTERED, &hu->flags)) return -EUNATCH; + if (!bcsp) + return -ENODEV; + BT_DBG("hu %p count %d rx_state %d rx_count %ld", hu, count, bcsp->rx_state, bcsp->rx_count); diff --git a/drivers/bluetooth/hci_h4.c b/drivers/bluetooth/hci_h4.c index a889a66a326f7a..7673727074985c 100644 --- a/drivers/bluetooth/hci_h4.c +++ b/drivers/bluetooth/hci_h4.c @@ -109,6 +109,9 @@ static int h4_recv(struct hci_uart *hu, const void *data, int count) { struct h4_struct *h4 = hu->priv; + if (!h4) + return -ENODEV; + h4->rx_skb = h4_recv_buf(hu, h4->rx_skb, data, count, h4_recv_pkts, ARRAY_SIZE(h4_recv_pkts)); if (IS_ERR(h4->rx_skb)) { diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c index cfdf75dc284751..d3538371821254 100644 --- a/drivers/bluetooth/hci_h5.c +++ b/drivers/bluetooth/hci_h5.c @@ -587,6 +587,9 @@ static int h5_recv(struct hci_uart *hu, const void *data, int count) struct h5 *h5 = hu->priv; const unsigned char *ptr = data; + if (!h5) + return -ENODEV; + BT_DBG("%s pending %zu count %d", hu->hdev->name, h5->rx_pending, count); diff --git a/drivers/bluetooth/virtio_bt.c b/drivers/bluetooth/virtio_bt.c index 76d61af8a275e9..140ab55c9fc5a9 100644 --- a/drivers/bluetooth/virtio_bt.c +++ b/drivers/bluetooth/virtio_bt.c @@ -12,6 +12,7 @@ #include #define VERSION "0.1" +#define VIRTBT_RX_BUF_SIZE 1000 enum { VIRTBT_VQ_TX, @@ -33,11 +34,11 @@ static int virtbt_add_inbuf(struct virtio_bluetooth *vbt) struct sk_buff *skb; int err; - skb = alloc_skb(1000, GFP_KERNEL); + skb = alloc_skb(VIRTBT_RX_BUF_SIZE, GFP_KERNEL); if (!skb) return -ENOMEM; - sg_init_one(sg, skb->data, 1000); + sg_init_one(sg, skb->data, VIRTBT_RX_BUF_SIZE); err = virtqueue_add_inbuf(vq, sg, 1, skb, GFP_KERNEL); if (err < 0) { @@ -197,6 +198,7 @@ static int virtbt_shutdown_generic(struct hci_dev *hdev) static void virtbt_rx_handle(struct virtio_bluetooth *vbt, struct sk_buff *skb) { + size_t min_hdr; __u8 pkt_type; pkt_type = *((__u8 *) skb->data); @@ -204,16 +206,32 @@ static void virtbt_rx_handle(struct virtio_bluetooth *vbt, struct sk_buff *skb) switch (pkt_type) { case HCI_EVENT_PKT: + min_hdr = sizeof(struct hci_event_hdr); + break; case HCI_ACLDATA_PKT: + min_hdr = sizeof(struct hci_acl_hdr); + break; case HCI_SCODATA_PKT: + min_hdr = sizeof(struct hci_sco_hdr); + break; case HCI_ISODATA_PKT: - hci_skb_pkt_type(skb) = pkt_type; - hci_recv_frame(vbt->hdev, skb); + min_hdr = sizeof(struct hci_iso_hdr); break; default: kfree_skb(skb); - break; + return; + } + + if (skb->len < min_hdr) { + bt_dev_err_ratelimited(vbt->hdev, + "rx pkt_type 0x%02x payload %u < hdr %zu\n", + pkt_type, skb->len, min_hdr); + kfree_skb(skb); + return; } + + hci_skb_pkt_type(skb) = pkt_type; + hci_recv_frame(vbt->hdev, skb); } static void virtbt_rx_work(struct work_struct *work) @@ -227,8 +245,15 @@ static void virtbt_rx_work(struct work_struct *work) if (!skb) return; - skb_put(skb, len); - virtbt_rx_handle(vbt, skb); + if (!len || len > VIRTBT_RX_BUF_SIZE) { + bt_dev_err_ratelimited(vbt->hdev, + "rx reply len %u outside [1, %u]\n", + len, VIRTBT_RX_BUF_SIZE); + kfree_skb(skb); + } else { + skb_put(skb, len); + virtbt_rx_handle(vbt, skb); + } if (virtbt_add_inbuf(vbt) < 0) return; diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c index 4a9e9de4d684f9..9a9d12be9bf743 100644 --- a/drivers/char/ipmi/ipmi_si_intf.c +++ b/drivers/char/ipmi/ipmi_si_intf.c @@ -168,6 +168,10 @@ struct smi_info { OEM2_DATA_AVAIL) unsigned char msg_flags; + /* When requesting events and messages, don't do it forever. */ + unsigned int num_requests_in_a_row; + bool last_was_flag_fetch; + /* Does the BMC have an event buffer? */ bool has_event_buffer; @@ -410,7 +414,10 @@ static void start_getting_msg_queue(struct smi_info *smi_info) start_new_msg(smi_info, smi_info->curr_msg->data, smi_info->curr_msg->data_size); - smi_info->si_state = SI_GETTING_MESSAGES; + if (smi_info->si_state != SI_GETTING_MESSAGES) { + smi_info->num_requests_in_a_row = 0; + smi_info->si_state = SI_GETTING_MESSAGES; + } } static void start_getting_events(struct smi_info *smi_info) @@ -421,7 +428,10 @@ static void start_getting_events(struct smi_info *smi_info) start_new_msg(smi_info, smi_info->curr_msg->data, smi_info->curr_msg->data_size); - smi_info->si_state = SI_GETTING_EVENTS; + if (smi_info->si_state != SI_GETTING_EVENTS) { + smi_info->num_requests_in_a_row = 0; + smi_info->si_state = SI_GETTING_EVENTS; + } } /* @@ -487,15 +497,19 @@ static void handle_flags(struct smi_info *smi_info) } else if (smi_info->msg_flags & RECEIVE_MSG_AVAIL) { /* Messages available. */ smi_info->curr_msg = alloc_msg_handle_irq(smi_info); - if (!smi_info->curr_msg) + if (!smi_info->curr_msg) { + smi_info->si_state = SI_NORMAL; return; + } start_getting_msg_queue(smi_info); } else if (smi_info->msg_flags & EVENT_MSG_BUFFER_FULL) { /* Events available. */ smi_info->curr_msg = alloc_msg_handle_irq(smi_info); - if (!smi_info->curr_msg) + if (!smi_info->curr_msg) { + smi_info->si_state = SI_NORMAL; return; + } start_getting_events(smi_info); } else if (smi_info->msg_flags & OEM_DATA_AVAIL && @@ -595,6 +609,7 @@ static void handle_transaction_done(struct smi_info *smi_info) smi_info->si_state = SI_NORMAL; } else { smi_info->msg_flags = msg[3]; + smi_info->last_was_flag_fetch = true; handle_flags(smi_info); } break; @@ -630,7 +645,13 @@ static void handle_transaction_done(struct smi_info *smi_info) */ msg = smi_info->curr_msg; smi_info->curr_msg = NULL; - if (msg->rsp[2] != 0) { + /* + * It appears some BMCs, with no event data, return no + * data in the message and not a 0x80 error as the + * spec says they should. Shut down processing if + * the data is not the right length. + */ + if (msg->rsp[2] != 0 || msg->rsp_size != 19) { /* Error getting event, probably done. */ msg->done(msg); @@ -640,6 +661,11 @@ static void handle_transaction_done(struct smi_info *smi_info) } else { smi_inc_stat(smi_info, events); + smi_info->num_requests_in_a_row++; + if (smi_info->num_requests_in_a_row > 10) + /* Stop if we do this too many times. */ + smi_info->msg_flags &= ~EVENT_MSG_BUFFER_FULL; + /* * Do this before we deliver the message * because delivering the message releases the @@ -678,6 +704,11 @@ static void handle_transaction_done(struct smi_info *smi_info) } else { smi_inc_stat(smi_info, incoming_messages); + smi_info->num_requests_in_a_row++; + if (smi_info->num_requests_in_a_row > 10) + /* Stop if we do this too many times. */ + smi_info->msg_flags &= ~RECEIVE_MSG_AVAIL; + /* * Do this before we deliver the message * because delivering the message releases the @@ -819,6 +850,26 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info, goto out; } + /* + * If we are currently idle, or if the last thing that was + * done was a flag fetch and there is a message pending, try + * to start the next message. + * + * We do the waiting message check to avoid a stuck flag + * completely wedging the driver. Let a message through + * in between flag operations if that happens. + */ + if (si_sm_result == SI_SM_IDLE || + (si_sm_result == SI_SM_ATTN && smi_info->waiting_msg && + smi_info->last_was_flag_fetch)) { + smi_info->last_was_flag_fetch = false; + smi_inc_stat(smi_info, idles); + + si_sm_result = start_next_msg(smi_info); + if (si_sm_result != SI_SM_IDLE) + goto restart; + } + /* * We prefer handling attn over new messages. But don't do * this if there is not yet an upper layer to handle anything. @@ -846,15 +897,6 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info, } } - /* If we are currently idle, try to start the next message. */ - if (si_sm_result == SI_SM_IDLE) { - smi_inc_stat(smi_info, idles); - - si_sm_result = start_next_msg(smi_info); - if (si_sm_result != SI_SM_IDLE) - goto restart; - } - if ((si_sm_result == SI_SM_IDLE) && (atomic_read(&smi_info->req_events))) { /* diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c index b49500a1bd3637..f419b46bf00207 100644 --- a/drivers/char/ipmi/ipmi_ssif.c +++ b/drivers/char/ipmi/ipmi_ssif.c @@ -225,6 +225,9 @@ struct ssif_info { bool has_event_buffer; bool supports_alert; + /* When requesting events and messages, don't do it forever. */ + unsigned int num_requests_in_a_row; + /* * Used to tell what we should do with alerts. If we are * waiting on a response, read the data immediately. @@ -413,7 +416,10 @@ static void start_event_fetch(struct ssif_info *ssif_info, unsigned long *flags) } ssif_info->curr_msg = msg; - ssif_info->ssif_state = SSIF_GETTING_EVENTS; + if (ssif_info->ssif_state != SSIF_GETTING_EVENTS) { + ssif_info->num_requests_in_a_row = 0; + ssif_info->ssif_state = SSIF_GETTING_EVENTS; + } ipmi_ssif_unlock_cond(ssif_info, flags); msg->data[0] = (IPMI_NETFN_APP_REQUEST << 2); @@ -436,7 +442,10 @@ static void start_recv_msg_fetch(struct ssif_info *ssif_info, } ssif_info->curr_msg = msg; - ssif_info->ssif_state = SSIF_GETTING_MESSAGES; + if (ssif_info->ssif_state != SSIF_GETTING_MESSAGES) { + ssif_info->num_requests_in_a_row = 0; + ssif_info->ssif_state = SSIF_GETTING_MESSAGES; + } ipmi_ssif_unlock_cond(ssif_info, flags); msg->data[0] = (IPMI_NETFN_APP_REQUEST << 2); @@ -843,6 +852,11 @@ static void msg_done_handler(struct ssif_info *ssif_info, int result, ssif_info->msg_flags &= ~EVENT_MSG_BUFFER_FULL; handle_flags(ssif_info, flags); } else { + ssif_info->num_requests_in_a_row++; + if (ssif_info->num_requests_in_a_row > 10) + /* Stop if we do this too many times. */ + ssif_info->msg_flags &= ~EVENT_MSG_BUFFER_FULL; + handle_flags(ssif_info, flags); ssif_inc_stat(ssif_info, events); deliver_recv_msg(ssif_info, msg); @@ -876,6 +890,11 @@ static void msg_done_handler(struct ssif_info *ssif_info, int result, ssif_info->msg_flags &= ~RECEIVE_MSG_AVAIL; handle_flags(ssif_info, flags); } else { + ssif_info->num_requests_in_a_row++; + if (ssif_info->num_requests_in_a_row > 10) + /* Stop if we do this too many times. */ + ssif_info->msg_flags &= ~RECEIVE_MSG_AVAIL; + ssif_inc_stat(ssif_info, incoming_messages); handle_flags(ssif_info, flags); deliver_recv_msg(ssif_info, msg); @@ -1886,6 +1905,7 @@ static int ssif_probe(struct i2c_client *client) "kssif%4.4x", thread_num); if (IS_ERR(ssif_info->thread)) { rv = PTR_ERR(ssif_info->thread); + ssif_info->thread = NULL; dev_notice(&ssif_info->client->dev, "Could not start kernel thread: error %d\n", rv); diff --git a/drivers/clk/clk-eyeq.c b/drivers/clk/clk-eyeq.c index c1dccedf8d5b1f..d9303c2c7aa55e 100644 --- a/drivers/clk/clk-eyeq.c +++ b/drivers/clk/clk-eyeq.c @@ -110,6 +110,7 @@ struct eqc_match_data { const char *reset_auxdev_name; const char *pinctrl_auxdev_name; + const char *eth_phy_auxdev_name; unsigned int early_clk_count; }; @@ -321,38 +322,18 @@ static void eqc_probe_init_fixed_factors(struct device *dev, } } -static void eqc_auxdev_release(struct device *dev) -{ - struct auxiliary_device *adev = to_auxiliary_dev(dev); - - kfree(adev); -} - -static int eqc_auxdev_create(struct device *dev, void __iomem *base, - const char *name, u32 id) +static void eqc_auxdev_create_optional(struct device *dev, void __iomem *base, + const char *name) { struct auxiliary_device *adev; - int ret; - - adev = kzalloc_obj(*adev); - if (!adev) - return -ENOMEM; - - adev->name = name; - adev->dev.parent = dev; - adev->dev.platform_data = (void __force *)base; - adev->dev.release = eqc_auxdev_release; - adev->id = id; - ret = auxiliary_device_init(adev); - if (ret) - return ret; - - ret = auxiliary_device_add(adev); - if (ret) - auxiliary_device_uninit(adev); - - return ret; + if (name) { + adev = devm_auxiliary_device_create(dev, name, + (void __force *)base); + if (!adev) + dev_warn(dev, "failed creating auxiliary device %s.%s\n", + KBUILD_MODNAME, name); + } } static int eqc_probe(struct platform_device *pdev) @@ -364,7 +345,6 @@ static int eqc_probe(struct platform_device *pdev) unsigned int i, clk_count; struct resource *res; void __iomem *base; - int ret; data = device_get_match_data(dev); if (!data) @@ -378,21 +358,10 @@ static int eqc_probe(struct platform_device *pdev) if (!base) return -ENOMEM; - /* Init optional reset auxiliary device. */ - if (data->reset_auxdev_name) { - ret = eqc_auxdev_create(dev, base, data->reset_auxdev_name, 0); - if (ret) - dev_warn(dev, "failed creating auxiliary device %s.%s: %d\n", - KBUILD_MODNAME, data->reset_auxdev_name, ret); - } - - /* Init optional pinctrl auxiliary device. */ - if (data->pinctrl_auxdev_name) { - ret = eqc_auxdev_create(dev, base, data->pinctrl_auxdev_name, 0); - if (ret) - dev_warn(dev, "failed creating auxiliary device %s.%s: %d\n", - KBUILD_MODNAME, data->pinctrl_auxdev_name, ret); - } + /* Init optional auxiliary devices. */ + eqc_auxdev_create_optional(dev, base, data->reset_auxdev_name); + eqc_auxdev_create_optional(dev, base, data->pinctrl_auxdev_name); + eqc_auxdev_create_optional(dev, base, data->eth_phy_auxdev_name); if (data->pll_count + data->div_count + data->fixed_factor_count == 0) return 0; /* Zero clocks, we are done. */ @@ -553,6 +522,7 @@ static const struct eqc_match_data eqc_eyeq5_match_data = { .reset_auxdev_name = "reset", .pinctrl_auxdev_name = "pinctrl", + .eth_phy_auxdev_name = "phy", .early_clk_count = ARRAY_SIZE(eqc_eyeq5_early_plls) + ARRAY_SIZE(eqc_eyeq5_early_fixed_factors), diff --git a/drivers/clk/clk-rk808.c b/drivers/clk/clk-rk808.c index f7412b137e5ef4..5a75b5c9155519 100644 --- a/drivers/clk/clk-rk808.c +++ b/drivers/clk/clk-rk808.c @@ -153,7 +153,7 @@ static int rk808_clkout_probe(struct platform_device *pdev) struct rk808_clkout *rk808_clkout; int ret; - dev->of_node = pdev->dev.parent->of_node; + device_set_of_node_from_dev(dev, dev->parent); rk808_clkout = devm_kzalloc(dev, sizeof(*rk808_clkout), GFP_KERNEL); diff --git a/drivers/clk/spacemit/ccu-k3.c b/drivers/clk/spacemit/ccu-k3.c index e98afd59f05c64..bb8b75bdbdb30d 100644 --- a/drivers/clk/spacemit/ccu-k3.c +++ b/drivers/clk/spacemit/ccu-k3.c @@ -846,7 +846,7 @@ static const struct clk_parent_data top_parents[] = { CCU_PARENT_HW(pll6_d3), }; CCU_MUX_DIV_GATE_FC_DEFINE(top_dclk, top_parents, APMU_TOP_DCLK_CTRL, 5, 3, - BIT(8), 2, 3, BIT(1), 0); + BIT(8), 2, 3, BIT(1), CLK_IS_CRITICAL); static const struct clk_parent_data ucie_parents[] = { CCU_PARENT_HW(pll1_d8_307p2), diff --git a/drivers/edac/versalnet_edac.c b/drivers/edac/versalnet_edac.c index ec13155824141d..97ec05d68bbbce 100644 --- a/drivers/edac/versalnet_edac.c +++ b/drivers/edac/versalnet_edac.c @@ -777,9 +777,9 @@ static int init_one_mc(struct mc_priv *priv, struct platform_device *pdev, int i u32 num_chans, rank, dwidth, config; struct edac_mc_layer layers[2]; struct mem_ctl_info *mci; + char name[MC_NAME_LEN]; struct device *dev; enum dev_type dt; - char *name; int rc; config = priv->adec[CONF + i * ADEC_NUM]; @@ -813,13 +813,9 @@ static int init_one_mc(struct mc_priv *priv, struct platform_device *pdev, int i layers[1].is_virt_csrow = false; rc = -ENOMEM; - name = kzalloc(MC_NAME_LEN, GFP_KERNEL); - if (!name) - return rc; - dev = kzalloc(sizeof(*dev), GFP_KERNEL); if (!dev) - goto err_name_free; + return rc; mci = edac_mc_alloc(i, ARRAY_SIZE(layers), layers, sizeof(struct mc_priv)); if (!mci) { @@ -858,8 +854,6 @@ static int init_one_mc(struct mc_priv *priv, struct platform_device *pdev, int i edac_mc_free(mci); err_dev_free: kfree(dev); -err_name_free: - kfree(name); return rc; } diff --git a/drivers/firmware/efi/efi-pstore.c b/drivers/firmware/efi/efi-pstore.c index a253b61449459e..a5db3534f0a632 100644 --- a/drivers/firmware/efi/efi-pstore.c +++ b/drivers/firmware/efi/efi-pstore.c @@ -60,8 +60,10 @@ static int efi_pstore_open(struct pstore_info *psi) return err; psi->data = kzalloc(record_size, GFP_KERNEL); - if (!psi->data) + if (!psi->data) { + efivar_unlock(); return -ENOMEM; + } return 0; } diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile index 983a438e35f3d3..cfedb3025c2638 100644 --- a/drivers/firmware/efi/libstub/Makefile +++ b/drivers/firmware/efi/libstub/Makefile @@ -66,7 +66,7 @@ KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__ lib-y := efi-stub-helper.o gop.o secureboot.o tpm.o \ file.o mem.o random.o randomalloc.o pci.o \ skip_spaces.o lib-cmdline.o lib-ctype.o \ - alignedmem.o relocate.o printk.o vsprintf.o + alignedmem.o printk.o vsprintf.o # include the stub's libfdt dependencies from lib/ when needed libfdt-deps := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c \ diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h index 979a21818cc1c6..fd91fc15ec810b 100644 --- a/drivers/firmware/efi/libstub/efistub.h +++ b/drivers/firmware/efi/libstub/efistub.h @@ -1104,13 +1104,6 @@ efi_status_t efi_allocate_pages_aligned(unsigned long size, unsigned long *addr, efi_status_t efi_low_alloc_above(unsigned long size, unsigned long align, unsigned long *addr, unsigned long min); -efi_status_t efi_relocate_kernel(unsigned long *image_addr, - unsigned long image_size, - unsigned long alloc_size, - unsigned long preferred_addr, - unsigned long alignment, - unsigned long min_addr); - efi_status_t efi_parse_options(char const *cmdline); void efi_parse_option_graphics(char *option); diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c index 736b6aae323d35..c87ac70251076c 100644 --- a/drivers/firmware/efi/libstub/loongarch-stub.c +++ b/drivers/firmware/efi/libstub/loongarch-stub.c @@ -14,6 +14,86 @@ extern int kernel_asize; extern int kernel_fsize; extern int kernel_entry; +/** + * efi_relocate_kernel() - copy memory area + * @image_addr: pointer to address of memory area to copy + * @image_size: size of memory area to copy + * @alloc_size: minimum size of memory to allocate, must be greater or + * equal to image_size + * @preferred_addr: preferred target address + * @alignment: minimum alignment of the allocated memory area. It + * should be a power of two. + * @min_addr: minimum target address + * + * Copy a memory area to a newly allocated memory area aligned according + * to @alignment but at least EFI_ALLOC_ALIGN. If the preferred address + * is not available, the allocated address will not be below @min_addr. + * On exit, @image_addr is updated to the target copy address that was used. + * + * This function is used to copy the Linux kernel verbatim. It does not apply + * any relocation changes. + * + * Return: status code + */ +static +efi_status_t efi_relocate_kernel(unsigned long *image_addr, + unsigned long image_size, + unsigned long alloc_size, + unsigned long preferred_addr, + unsigned long alignment, + unsigned long min_addr) +{ + unsigned long cur_image_addr; + unsigned long new_addr = 0; + efi_status_t status; + unsigned long nr_pages; + efi_physical_addr_t efi_addr = preferred_addr; + + if (!image_addr || !image_size || !alloc_size) + return EFI_INVALID_PARAMETER; + if (alloc_size < image_size) + return EFI_INVALID_PARAMETER; + + cur_image_addr = *image_addr; + + /* + * The EFI firmware loader could have placed the kernel image + * anywhere in memory, but the kernel has restrictions on the + * max physical address it can run at. Some architectures + * also have a preferred address, so first try to relocate + * to the preferred address. If that fails, allocate as low + * as possible while respecting the required alignment. + */ + nr_pages = round_up(alloc_size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE; + status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS, + EFI_LOADER_DATA, nr_pages, &efi_addr); + new_addr = efi_addr; + /* + * If preferred address allocation failed allocate as low as + * possible. + */ + if (status != EFI_SUCCESS) { + status = efi_low_alloc_above(alloc_size, alignment, &new_addr, + min_addr); + } + if (status != EFI_SUCCESS) { + efi_err("Failed to allocate usable memory for kernel.\n"); + return status; + } + + /* + * We know source/dest won't overlap since both memory ranges + * have been allocated by UEFI, so we can safely use memcpy. + */ + memcpy((void *)new_addr, (void *)cur_image_addr, image_size); + efi_cache_sync_image(new_addr, image_size); + + /* Return the new address of the relocated image. */ + *image_addr = new_addr; + + return status; +} + efi_status_t handle_kernel_image(unsigned long *image_addr, unsigned long *image_size, unsigned long *reserve_addr, diff --git a/drivers/firmware/efi/libstub/loongarch.c b/drivers/firmware/efi/libstub/loongarch.c index 9825f5218137f2..f7938d5c196aad 100644 --- a/drivers/firmware/efi/libstub/loongarch.c +++ b/drivers/firmware/efi/libstub/loongarch.c @@ -18,6 +18,11 @@ efi_status_t check_platform_features(void) return EFI_SUCCESS; } +void efi_cache_sync_image(unsigned long image_base, unsigned long alloc_size) +{ + asm volatile ("ibar 0" ::: "memory"); +} + struct exit_boot_struct { efi_memory_desc_t *runtime_map; int runtime_entry_count; diff --git a/drivers/firmware/efi/libstub/mem.c b/drivers/firmware/efi/libstub/mem.c index 9c82259eea8166..59f3f83de50c2d 100644 --- a/drivers/firmware/efi/libstub/mem.c +++ b/drivers/firmware/efi/libstub/mem.c @@ -124,3 +124,85 @@ void efi_free(unsigned long size, unsigned long addr) nr_pages = round_up(size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE; efi_bs_call(free_pages, addr, nr_pages); } + +/** + * efi_low_alloc_above() - allocate pages at or above given address + * @size: size of the memory area to allocate + * @align: minimum alignment of the allocated memory area. It should + * a power of two. + * @addr: on exit the address of the allocated memory + * @min: minimum address to used for the memory allocation + * + * Allocate at the lowest possible address that is not below @min as + * EFI_LOADER_DATA. The allocated pages are aligned according to @align but at + * least EFI_ALLOC_ALIGN. The first allocated page will not below the address + * given by @min. + * + * Return: status code + */ +efi_status_t efi_low_alloc_above(unsigned long size, unsigned long align, + unsigned long *addr, unsigned long min) +{ + struct efi_boot_memmap *map __free(efi_pool) = NULL; + efi_status_t status; + unsigned long nr_pages; + int i; + + status = efi_get_memory_map(&map, false); + if (status != EFI_SUCCESS) + return status; + + /* + * Enforce minimum alignment that EFI or Linux requires when + * requesting a specific address. We are doing page-based (or + * larger) allocations, and both the address and size must meet + * alignment constraints. + */ + if (align < EFI_ALLOC_ALIGN) + align = EFI_ALLOC_ALIGN; + + size = round_up(size, EFI_ALLOC_ALIGN); + nr_pages = size / EFI_PAGE_SIZE; + for (i = 0; i < map->map_size / map->desc_size; i++) { + efi_memory_desc_t *desc; + unsigned long m = (unsigned long)map->map; + u64 start, end; + + desc = efi_memdesc_ptr(m, map->desc_size, i); + + if (desc->type != EFI_CONVENTIONAL_MEMORY) + continue; + + if (desc->attribute & EFI_MEMORY_HOT_PLUGGABLE) + continue; + + if (efi_soft_reserve_enabled() && + (desc->attribute & EFI_MEMORY_SP)) + continue; + + if (desc->num_pages < nr_pages) + continue; + + start = desc->phys_addr; + end = start + desc->num_pages * EFI_PAGE_SIZE; + + if (start < min) + start = min; + + start = round_up(start, align); + if ((start + size) > end) + continue; + + status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS, + EFI_LOADER_DATA, nr_pages, &start); + if (status == EFI_SUCCESS) { + *addr = start; + break; + } + } + + if (i == map->map_size / map->desc_size) + return EFI_NOT_FOUND; + + return EFI_SUCCESS; +} diff --git a/drivers/firmware/efi/libstub/relocate.c b/drivers/firmware/efi/libstub/relocate.c deleted file mode 100644 index d4264bfb6dc174..00000000000000 --- a/drivers/firmware/efi/libstub/relocate.c +++ /dev/null @@ -1,166 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 - -#include -#include - -#include "efistub.h" - -/** - * efi_low_alloc_above() - allocate pages at or above given address - * @size: size of the memory area to allocate - * @align: minimum alignment of the allocated memory area. It should - * a power of two. - * @addr: on exit the address of the allocated memory - * @min: minimum address to used for the memory allocation - * - * Allocate at the lowest possible address that is not below @min as - * EFI_LOADER_DATA. The allocated pages are aligned according to @align but at - * least EFI_ALLOC_ALIGN. The first allocated page will not below the address - * given by @min. - * - * Return: status code - */ -efi_status_t efi_low_alloc_above(unsigned long size, unsigned long align, - unsigned long *addr, unsigned long min) -{ - struct efi_boot_memmap *map __free(efi_pool) = NULL; - efi_status_t status; - unsigned long nr_pages; - int i; - - status = efi_get_memory_map(&map, false); - if (status != EFI_SUCCESS) - return status; - - /* - * Enforce minimum alignment that EFI or Linux requires when - * requesting a specific address. We are doing page-based (or - * larger) allocations, and both the address and size must meet - * alignment constraints. - */ - if (align < EFI_ALLOC_ALIGN) - align = EFI_ALLOC_ALIGN; - - size = round_up(size, EFI_ALLOC_ALIGN); - nr_pages = size / EFI_PAGE_SIZE; - for (i = 0; i < map->map_size / map->desc_size; i++) { - efi_memory_desc_t *desc; - unsigned long m = (unsigned long)map->map; - u64 start, end; - - desc = efi_memdesc_ptr(m, map->desc_size, i); - - if (desc->type != EFI_CONVENTIONAL_MEMORY) - continue; - - if (desc->attribute & EFI_MEMORY_HOT_PLUGGABLE) - continue; - - if (efi_soft_reserve_enabled() && - (desc->attribute & EFI_MEMORY_SP)) - continue; - - if (desc->num_pages < nr_pages) - continue; - - start = desc->phys_addr; - end = start + desc->num_pages * EFI_PAGE_SIZE; - - if (start < min) - start = min; - - start = round_up(start, align); - if ((start + size) > end) - continue; - - status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS, - EFI_LOADER_DATA, nr_pages, &start); - if (status == EFI_SUCCESS) { - *addr = start; - break; - } - } - - if (i == map->map_size / map->desc_size) - return EFI_NOT_FOUND; - - return EFI_SUCCESS; -} - -/** - * efi_relocate_kernel() - copy memory area - * @image_addr: pointer to address of memory area to copy - * @image_size: size of memory area to copy - * @alloc_size: minimum size of memory to allocate, must be greater or - * equal to image_size - * @preferred_addr: preferred target address - * @alignment: minimum alignment of the allocated memory area. It - * should be a power of two. - * @min_addr: minimum target address - * - * Copy a memory area to a newly allocated memory area aligned according - * to @alignment but at least EFI_ALLOC_ALIGN. If the preferred address - * is not available, the allocated address will not be below @min_addr. - * On exit, @image_addr is updated to the target copy address that was used. - * - * This function is used to copy the Linux kernel verbatim. It does not apply - * any relocation changes. - * - * Return: status code - */ -efi_status_t efi_relocate_kernel(unsigned long *image_addr, - unsigned long image_size, - unsigned long alloc_size, - unsigned long preferred_addr, - unsigned long alignment, - unsigned long min_addr) -{ - unsigned long cur_image_addr; - unsigned long new_addr = 0; - efi_status_t status; - unsigned long nr_pages; - efi_physical_addr_t efi_addr = preferred_addr; - - if (!image_addr || !image_size || !alloc_size) - return EFI_INVALID_PARAMETER; - if (alloc_size < image_size) - return EFI_INVALID_PARAMETER; - - cur_image_addr = *image_addr; - - /* - * The EFI firmware loader could have placed the kernel image - * anywhere in memory, but the kernel has restrictions on the - * max physical address it can run at. Some architectures - * also have a preferred address, so first try to relocate - * to the preferred address. If that fails, allocate as low - * as possible while respecting the required alignment. - */ - nr_pages = round_up(alloc_size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE; - status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS, - EFI_LOADER_DATA, nr_pages, &efi_addr); - new_addr = efi_addr; - /* - * If preferred address allocation failed allocate as low as - * possible. - */ - if (status != EFI_SUCCESS) { - status = efi_low_alloc_above(alloc_size, alignment, &new_addr, - min_addr); - } - if (status != EFI_SUCCESS) { - efi_err("Failed to allocate usable memory for kernel.\n"); - return status; - } - - /* - * We know source/dest won't overlap since both memory ranges - * have been allocated by UEFI, so we can safely use memcpy. - */ - memcpy((void *)new_addr, (void *)cur_image_addr, image_size); - - /* Return the new address of the relocated image. */ - *image_addr = new_addr; - - return status; -} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 46aae3fad4bf6c..60debd543e44ee 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -3149,11 +3149,7 @@ static int __init amdgpu_init(void) r = amdgpu_sync_init(); if (r) - goto error_sync; - - r = amdgpu_userq_fence_slab_init(); - if (r) - goto error_fence; + return r; amdgpu_register_atpx_handler(); amdgpu_acpi_detect(); @@ -3161,7 +3157,7 @@ static int __init amdgpu_init(void) /* Ignore KFD init failures when CONFIG_HSA_AMD is not set. */ r = amdgpu_amdkfd_init(); if (r && r != -ENOENT) - goto error_fence; + goto error_fini_sync; if (amdgpu_pp_feature_mask & PP_OVERDRIVE_MASK) { add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK); @@ -3172,10 +3168,8 @@ static int __init amdgpu_init(void) /* let modprobe override vga console setting */ return pci_register_driver(&amdgpu_kms_pci_driver); -error_fence: +error_fini_sync: amdgpu_sync_fini(); - -error_sync: return r; } @@ -3186,7 +3180,6 @@ static void __exit amdgpu_exit(void) amdgpu_unregister_atpx_handler(); amdgpu_acpi_release(); amdgpu_sync_fini(); - amdgpu_userq_fence_slab_fini(); mmu_notifier_synchronize(); amdgpu_xcp_drv_release(); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c index bc772ca3dab726..b6f849d51c2e77 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c @@ -262,12 +262,19 @@ void amdgpu_gart_table_ram_free(struct amdgpu_device *adev) */ int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev) { + int r; + if (adev->gart.bo != NULL) return 0; - return amdgpu_bo_create_kernel(adev, adev->gart.table_size, PAGE_SIZE, - AMDGPU_GEM_DOMAIN_VRAM, &adev->gart.bo, - NULL, (void *)&adev->gart.ptr); + r = amdgpu_bo_create_kernel(adev, adev->gart.table_size, PAGE_SIZE, + AMDGPU_GEM_DOMAIN_VRAM, &adev->gart.bo, + NULL, (void *)&adev->gart.ptr); + if (r) + return r; + + memset_io(adev->gart.ptr, adev->gart.gart_pte_flags, adev->gart.table_size); + return 0; } /** diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c index 66e8a2f7afcf6f..d6bee5c30073dc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c @@ -552,8 +552,9 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, char __user *buf, size_t size, loff_t *pos) { struct amdgpu_ring *ring = file_inode(f)->i_private; - uint32_t value, result, early[3]; + u32 value, result, early[3] = { 0 }; uint64_t p; + u32 avail_dw, start_dw, read_dw; loff_t i; int r; @@ -565,10 +566,10 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, char __user *buf, result = 0; - if (*pos < 12) { - if (ring->funcs->type == AMDGPU_RING_TYPE_CPER) - mutex_lock(&ring->adev->cper.ring_lock); + if (ring->funcs->type == AMDGPU_RING_TYPE_CPER) + mutex_lock(&ring->adev->cper.ring_lock); + if (*pos < 12) { early[0] = amdgpu_ring_get_rptr(ring) & ring->buf_mask; early[1] = amdgpu_ring_get_wptr(ring) & ring->buf_mask; early[2] = ring->wptr & ring->buf_mask; @@ -600,13 +601,24 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, char __user *buf, *pos += 4; } } else { + early[0] = amdgpu_ring_get_rptr(ring) & ring->buf_mask; + early[1] = amdgpu_ring_get_wptr(ring) & ring->buf_mask; + p = early[0]; if (early[0] <= early[1]) - size = (early[1] - early[0]); + avail_dw = early[1] - early[0]; else - size = ring->ring_size - (early[0] - early[1]); + avail_dw = ring->buf_mask + 1 - (early[0] - early[1]); - while (size) { + start_dw = (*pos > 12) ? ((*pos - 12) >> 2) : 0; + if (start_dw >= avail_dw) + goto out; + + p = (p + start_dw) & ring->ptr_mask; + avail_dw -= start_dw; + read_dw = min_t(u32, avail_dw, size >> 2); + + while (read_dw) { if (p == early[1]) goto out; @@ -619,9 +631,10 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, char __user *buf, buf += 4; result += 4; - size--; + read_dw--; p++; p &= ring->ptr_mask; + *pos += 4; } } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c index de140a8ed1354a..70d74f04d2ddcb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c @@ -106,9 +106,6 @@ amdgpu_userq_detect_and_reset_queues(struct amdgpu_userq_mgr *uq_mgr) int r = 0; int i; - /* Warning if current process mutex is not held */ - WARN_ON(!mutex_is_locked(&uq_mgr->userq_mutex)); - if (unlikely(adev->debug_disable_gpu_ring_reset)) { dev_err(adev->dev, "userq reset disabled by debug mask\n"); return 0; @@ -127,9 +124,11 @@ amdgpu_userq_detect_and_reset_queues(struct amdgpu_userq_mgr *uq_mgr) */ for (i = 0; i < num_queue_types; i++) { int ring_type = queue_types[i]; - const struct amdgpu_userq_funcs *funcs = adev->userq_funcs[ring_type]; + const struct amdgpu_userq_funcs *funcs = + adev->userq_funcs[ring_type]; - if (!amdgpu_userq_is_reset_type_supported(adev, ring_type, AMDGPU_RESET_TYPE_PER_QUEUE)) + if (!amdgpu_userq_is_reset_type_supported(adev, ring_type, + AMDGPU_RESET_TYPE_PER_QUEUE)) continue; if (atomic_read(&uq_mgr->userq_count[ring_type]) > 0 && @@ -150,38 +149,22 @@ amdgpu_userq_detect_and_reset_queues(struct amdgpu_userq_mgr *uq_mgr) static void amdgpu_userq_hang_detect_work(struct work_struct *work) { - struct amdgpu_usermode_queue *queue = container_of(work, - struct amdgpu_usermode_queue, - hang_detect_work.work); - struct dma_fence *fence; - struct amdgpu_userq_mgr *uq_mgr; - - if (!queue->userq_mgr) - return; - - uq_mgr = queue->userq_mgr; - fence = READ_ONCE(queue->hang_detect_fence); - /* Fence already signaled – no action needed */ - if (!fence || dma_fence_is_signaled(fence)) - return; + struct amdgpu_usermode_queue *queue = + container_of(work, struct amdgpu_usermode_queue, + hang_detect_work.work); - mutex_lock(&uq_mgr->userq_mutex); - amdgpu_userq_detect_and_reset_queues(uq_mgr); - mutex_unlock(&uq_mgr->userq_mutex); + amdgpu_userq_detect_and_reset_queues(queue->userq_mgr); } /* * Start hang detection for a user queue fence. A delayed work will be scheduled - * to check if the fence is still pending after the timeout period. -*/ + * to reset the queues when the fence doesn't signal in time. + */ void amdgpu_userq_start_hang_detect_work(struct amdgpu_usermode_queue *queue) { struct amdgpu_device *adev; unsigned long timeout_ms; - if (!queue || !queue->userq_mgr || !queue->userq_mgr->adev) - return; - adev = queue->userq_mgr->adev; /* Determine timeout based on queue type */ switch (queue->queue_type) { @@ -199,8 +182,6 @@ void amdgpu_userq_start_hang_detect_work(struct amdgpu_usermode_queue *queue) break; } - /* Store the fence to monitor and schedule hang detection */ - WRITE_ONCE(queue->hang_detect_fence, queue->last_fence); schedule_delayed_work(&queue->hang_detect_work, msecs_to_jiffies(timeout_ms)); } @@ -210,18 +191,24 @@ void amdgpu_userq_process_fence_irq(struct amdgpu_device *adev, u32 doorbell) struct xarray *xa = &adev->userq_doorbell_xa; struct amdgpu_usermode_queue *queue; unsigned long flags; + int r; xa_lock_irqsave(xa, flags); queue = xa_load(xa, doorbell); - if (queue) - amdgpu_userq_fence_driver_process(queue->fence_drv); - xa_unlock_irqrestore(xa, flags); -} + if (queue) { + r = amdgpu_userq_fence_driver_process(queue->fence_drv); + /* + * We are in interrupt context here, this *can't* wait for + * reset work to finish. + */ + if (r >= 0) + cancel_delayed_work(&queue->hang_detect_work); -static void amdgpu_userq_init_hang_detect_work(struct amdgpu_usermode_queue *queue) -{ - INIT_DELAYED_WORK(&queue->hang_detect_work, amdgpu_userq_hang_detect_work); - queue->hang_detect_fence = NULL; + /* Restart the timer when there are still fences pending */ + if (r == 1) + amdgpu_userq_start_hang_detect_work(queue); + } + xa_unlock_irqrestore(xa, flags); } static int amdgpu_userq_buffer_va_list_add(struct amdgpu_usermode_queue *queue, @@ -345,23 +332,18 @@ static int amdgpu_userq_preempt_helper(struct amdgpu_usermode_queue *queue) struct amdgpu_device *adev = uq_mgr->adev; const struct amdgpu_userq_funcs *userq_funcs = adev->userq_funcs[queue->queue_type]; - bool found_hung_queue = false; - int r = 0; + int r; if (queue->state == AMDGPU_USERQ_STATE_MAPPED) { r = userq_funcs->preempt(queue); if (r) { queue->state = AMDGPU_USERQ_STATE_HUNG; - found_hung_queue = true; + return r; } else { queue->state = AMDGPU_USERQ_STATE_PREEMPTED; } } - - if (found_hung_queue) - amdgpu_userq_detect_and_reset_queues(uq_mgr); - - return r; + return 0; } static int amdgpu_userq_restore_helper(struct amdgpu_usermode_queue *queue) @@ -390,24 +372,21 @@ static int amdgpu_userq_unmap_helper(struct amdgpu_usermode_queue *queue) struct amdgpu_device *adev = uq_mgr->adev; const struct amdgpu_userq_funcs *userq_funcs = adev->userq_funcs[queue->queue_type]; - bool found_hung_queue = false; - int r = 0; + int r; if ((queue->state == AMDGPU_USERQ_STATE_MAPPED) || - (queue->state == AMDGPU_USERQ_STATE_PREEMPTED)) { + (queue->state == AMDGPU_USERQ_STATE_PREEMPTED)) { + r = userq_funcs->unmap(queue); if (r) { queue->state = AMDGPU_USERQ_STATE_HUNG; - found_hung_queue = true; + return r; } else { queue->state = AMDGPU_USERQ_STATE_UNMAPPED; } } - if (found_hung_queue) - amdgpu_userq_detect_and_reset_queues(uq_mgr); - - return r; + return 0; } static int amdgpu_userq_map_helper(struct amdgpu_usermode_queue *queue) @@ -416,19 +395,19 @@ static int amdgpu_userq_map_helper(struct amdgpu_usermode_queue *queue) struct amdgpu_device *adev = uq_mgr->adev; const struct amdgpu_userq_funcs *userq_funcs = adev->userq_funcs[queue->queue_type]; - int r = 0; + int r; if (queue->state == AMDGPU_USERQ_STATE_UNMAPPED) { r = userq_funcs->map(queue); if (r) { queue->state = AMDGPU_USERQ_STATE_HUNG; - amdgpu_userq_detect_and_reset_queues(uq_mgr); + return r; } else { queue->state = AMDGPU_USERQ_STATE_MAPPED; } } - return r; + return 0; } static void amdgpu_userq_wait_for_last_fence(struct amdgpu_usermode_queue *queue) @@ -648,13 +627,11 @@ amdgpu_userq_destroy(struct amdgpu_userq_mgr *uq_mgr, struct amdgpu_usermode_que amdgpu_bo_unreserve(vm->root.bo); mutex_lock(&uq_mgr->userq_mutex); - queue->hang_detect_fence = NULL; amdgpu_userq_wait_for_last_fence(queue); #if defined(CONFIG_DEBUG_FS) debugfs_remove_recursive(queue->debugfs_queue); #endif - amdgpu_userq_detect_and_reset_queues(uq_mgr); r = amdgpu_userq_unmap_helper(queue); atomic_dec(&uq_mgr->userq_count[queue->queue_type]); amdgpu_userq_cleanup(queue); @@ -800,6 +777,7 @@ amdgpu_userq_create(struct drm_file *filp, union drm_amdgpu_userq *args) } queue->doorbell_index = index; + mutex_init(&queue->fence_drv_lock); xa_init_flags(&queue->fence_drv_xa, XA_FLAGS_ALLOC); r = amdgpu_userq_fence_driver_alloc(adev, &queue->fence_drv); if (r) { @@ -855,7 +833,8 @@ amdgpu_userq_create(struct drm_file *filp, union drm_amdgpu_userq *args) up_read(&adev->reset_domain->sem); amdgpu_debugfs_userq_init(filp, queue, qid); - amdgpu_userq_init_hang_detect_work(queue); + INIT_DELAYED_WORK(&queue->hang_detect_work, + amdgpu_userq_hang_detect_work); args->out.queue_id = qid; atomic_inc(&uq_mgr->userq_count[queue->queue_type]); @@ -873,6 +852,7 @@ amdgpu_userq_create(struct drm_file *filp, union drm_amdgpu_userq *args) amdgpu_bo_reserve(fpriv->vm.root.bo, true); amdgpu_userq_buffer_vas_list_cleanup(adev, queue); amdgpu_bo_unreserve(fpriv->vm.root.bo); + mutex_destroy(&queue->fence_drv_lock); free_queue: kfree(queue); err_pm_runtime: @@ -1262,7 +1242,6 @@ amdgpu_userq_evict_all(struct amdgpu_userq_mgr *uq_mgr) unsigned long queue_id; int ret = 0, r; - amdgpu_userq_detect_and_reset_queues(uq_mgr); /* Try to unmap all the queues in this process ctx */ xa_for_each(&uq_mgr->userq_xa, queue_id, queue) { r = amdgpu_userq_preempt_helper(queue); @@ -1270,9 +1249,11 @@ amdgpu_userq_evict_all(struct amdgpu_userq_mgr *uq_mgr) ret = r; } - if (ret) + if (ret) { drm_file_err(uq_mgr->file, "Couldn't unmap all the queues, eviction failed ret=%d\n", ret); + amdgpu_userq_detect_and_reset_queues(uq_mgr); + } return ret; } @@ -1372,7 +1353,6 @@ int amdgpu_userq_suspend(struct amdgpu_device *adev) uqm = queue->userq_mgr; cancel_delayed_work_sync(&uqm->resume_work); guard(mutex)(&uqm->userq_mutex); - amdgpu_userq_detect_and_reset_queues(uqm); if (adev->in_s0ix) r = amdgpu_userq_preempt_helper(queue); else @@ -1431,7 +1411,6 @@ int amdgpu_userq_stop_sched_for_enforce_isolation(struct amdgpu_device *adev, if (((queue->queue_type == AMDGPU_HW_IP_GFX) || (queue->queue_type == AMDGPU_HW_IP_COMPUTE)) && (queue->xcp_id == idx)) { - amdgpu_userq_detect_and_reset_queues(uqm); r = amdgpu_userq_preempt_helper(queue); if (r) ret = r; @@ -1504,23 +1483,21 @@ void amdgpu_userq_pre_reset(struct amdgpu_device *adev) { const struct amdgpu_userq_funcs *userq_funcs; struct amdgpu_usermode_queue *queue; - struct amdgpu_userq_mgr *uqm; unsigned long queue_id; + /* TODO: We probably need a new lock for the queue state */ xa_for_each(&adev->userq_doorbell_xa, queue_id, queue) { - uqm = queue->userq_mgr; - cancel_delayed_work_sync(&uqm->resume_work); - if (queue->state == AMDGPU_USERQ_STATE_MAPPED) { - amdgpu_userq_wait_for_last_fence(queue); - userq_funcs = adev->userq_funcs[queue->queue_type]; - userq_funcs->unmap(queue); - /* just mark all queues as hung at this point. - * if unmap succeeds, we could map again - * in amdgpu_userq_post_reset() if vram is not lost - */ - queue->state = AMDGPU_USERQ_STATE_HUNG; - amdgpu_userq_fence_driver_force_completion(queue); - } + if (queue->state != AMDGPU_USERQ_STATE_MAPPED) + continue; + + userq_funcs = adev->userq_funcs[queue->queue_type]; + userq_funcs->unmap(queue); + /* just mark all queues as hung at this point. + * if unmap succeeds, we could map again + * in amdgpu_userq_post_reset() if vram is not lost + */ + queue->state = AMDGPU_USERQ_STATE_HUNG; + amdgpu_userq_fence_driver_force_completion(queue); } } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h index 8b8f345b60b6b1..85f460e7c31b24 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h @@ -66,6 +66,18 @@ struct amdgpu_usermode_queue { struct amdgpu_userq_obj db_obj; struct amdgpu_userq_obj fw_obj; struct amdgpu_userq_obj wptr_obj; + + /** + * @fence_drv_lock: Protecting @fence_drv_xa. + */ + struct mutex fence_drv_lock; + + /** + * @fence_drv_xa: + * + * References to the external fence drivers returned by wait_ioctl. + * Dropped on the next signaled dma_fence or queue destruction. + */ struct xarray fence_drv_xa; struct amdgpu_userq_fence_driver *fence_drv; struct dma_fence *last_fence; @@ -73,7 +85,6 @@ struct amdgpu_usermode_queue { int priority; struct dentry *debugfs_queue; struct delayed_work hang_detect_work; - struct dma_fence *hang_detect_fence; struct kref refcount; struct list_head userq_va_list; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c index da39ac862f37cf..53a8944bab05ac 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c @@ -32,29 +32,9 @@ #include "amdgpu.h" #include "amdgpu_userq_fence.h" -static const struct dma_fence_ops amdgpu_userq_fence_ops; -static struct kmem_cache *amdgpu_userq_fence_slab; - #define AMDGPU_USERQ_MAX_HANDLES (1U << 16) -int amdgpu_userq_fence_slab_init(void) -{ - amdgpu_userq_fence_slab = kmem_cache_create("amdgpu_userq_fence", - sizeof(struct amdgpu_userq_fence), - 0, - SLAB_HWCACHE_ALIGN, - NULL); - if (!amdgpu_userq_fence_slab) - return -ENOMEM; - - return 0; -} - -void amdgpu_userq_fence_slab_fini(void) -{ - rcu_barrier(); - kmem_cache_destroy(amdgpu_userq_fence_slab); -} +static const struct dma_fence_ops amdgpu_userq_fence_ops; static inline struct amdgpu_userq_fence *to_amdgpu_userq_fence(struct dma_fence *f) { @@ -141,6 +121,7 @@ amdgpu_userq_fence_driver_free(struct amdgpu_usermode_queue *userq) userq->last_fence = NULL; amdgpu_userq_walk_and_drop_fence_drv(&userq->fence_drv_xa); xa_destroy(&userq->fence_drv_xa); + mutex_destroy(&userq->fence_drv_lock); /* Drop the queue's ownership reference to fence_drv explicitly */ amdgpu_userq_fence_driver_put(userq->fence_drv); } @@ -154,7 +135,14 @@ amdgpu_userq_fence_put_fence_drv_array(struct amdgpu_userq_fence *userq_fence) userq_fence->fence_drv_array_count = 0; } -void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_drv) +/* + * Returns: + * -ENOENT when no fences were processes + * 1 when more fences are pending + * 0 when no fences are pending any more + */ +int +amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_drv) { struct amdgpu_userq_fence *userq_fence, *tmp; LIST_HEAD(to_be_signaled); @@ -162,9 +150,6 @@ void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_d unsigned long flags; u64 rptr; - if (!fence_drv) - return; - spin_lock_irqsave(&fence_drv->fence_list_lock, flags); rptr = amdgpu_userq_fence_read(fence_drv); @@ -177,6 +162,9 @@ void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_d &userq_fence->link); spin_unlock_irqrestore(&fence_drv->fence_list_lock, flags); + if (list_empty(&to_be_signaled)) + return -ENOENT; + list_for_each_entry_safe(userq_fence, tmp, &to_be_signaled, link) { fence = &userq_fence->base; list_del_init(&userq_fence->link); @@ -188,6 +176,8 @@ void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_d dma_fence_put(fence); } + /* That doesn't need to be accurate so no locking */ + return list_empty(&fence_drv->fences) ? 0 : 1; } void amdgpu_userq_fence_driver_destroy(struct kref *ref) @@ -229,80 +219,84 @@ void amdgpu_userq_fence_driver_put(struct amdgpu_userq_fence_driver *fence_drv) kref_put(&fence_drv->refcount, amdgpu_userq_fence_driver_destroy); } -static int amdgpu_userq_fence_alloc(struct amdgpu_userq_fence **userq_fence) +static int amdgpu_userq_fence_alloc(struct amdgpu_usermode_queue *userq, + struct amdgpu_userq_fence **pfence) { - *userq_fence = kmem_cache_alloc(amdgpu_userq_fence_slab, GFP_ATOMIC); - return *userq_fence ? 0 : -ENOMEM; + struct amdgpu_userq_fence_driver *fence_drv = userq->fence_drv; + struct amdgpu_userq_fence *userq_fence; + void *entry; + + userq_fence = kmalloc(sizeof(*userq_fence), GFP_KERNEL); + if (!userq_fence) + return -ENOMEM; + + /* + * Get the next unused entry, since we fill from the start this can be + * used as size to allocate the array. + */ + mutex_lock(&userq->fence_drv_lock); + XA_STATE(xas, &userq->fence_drv_xa, 0); + + rcu_read_lock(); + do { + entry = xas_find_marked(&xas, ULONG_MAX, XA_FREE_MARK); + } while (xas_retry(&xas, entry)); + rcu_read_unlock(); + + userq_fence->fence_drv_array = kvmalloc_array(xas.xa_index, + sizeof(fence_drv), + GFP_KERNEL); + if (!userq_fence->fence_drv_array) { + mutex_unlock(&userq->fence_drv_lock); + kfree(userq_fence); + return -ENOMEM; + } + + userq_fence->fence_drv_array_count = xas.xa_index; + xa_extract(&userq->fence_drv_xa, (void **)userq_fence->fence_drv_array, + 0, ULONG_MAX, xas.xa_index, XA_PRESENT); + xa_destroy(&userq->fence_drv_xa); + + mutex_unlock(&userq->fence_drv_lock); + + amdgpu_userq_fence_driver_get(fence_drv); + userq_fence->fence_drv = fence_drv; + + *pfence = userq_fence; + return 0; } -static int amdgpu_userq_fence_create(struct amdgpu_usermode_queue *userq, - struct amdgpu_userq_fence *userq_fence, - u64 seq, struct dma_fence **f) +static void amdgpu_userq_fence_init(struct amdgpu_usermode_queue *userq, + struct amdgpu_userq_fence *fence, + u64 seq) { - struct amdgpu_userq_fence_driver *fence_drv; - struct dma_fence *fence; + struct amdgpu_userq_fence_driver *fence_drv = userq->fence_drv; unsigned long flags; bool signaled = false; - fence_drv = userq->fence_drv; - if (!fence_drv) - return -EINVAL; - - spin_lock_init(&userq_fence->lock); - INIT_LIST_HEAD(&userq_fence->link); - fence = &userq_fence->base; - userq_fence->fence_drv = fence_drv; - - dma_fence_init64(fence, &amdgpu_userq_fence_ops, &userq_fence->lock, + spin_lock_init(&fence->lock); + dma_fence_init64(&fence->base, &amdgpu_userq_fence_ops, &fence->lock, fence_drv->context, seq); - amdgpu_userq_fence_driver_get(fence_drv); - dma_fence_get(fence); - - if (!xa_empty(&userq->fence_drv_xa)) { - struct amdgpu_userq_fence_driver *stored_fence_drv; - unsigned long index, count = 0; - int i = 0; - - xa_lock(&userq->fence_drv_xa); - xa_for_each(&userq->fence_drv_xa, index, stored_fence_drv) - count++; - - userq_fence->fence_drv_array = - kvmalloc_objs(struct amdgpu_userq_fence_driver *, count, - GFP_ATOMIC); - - if (userq_fence->fence_drv_array) { - xa_for_each(&userq->fence_drv_xa, index, stored_fence_drv) { - userq_fence->fence_drv_array[i] = stored_fence_drv; - __xa_erase(&userq->fence_drv_xa, index); - i++; - } - } - - userq_fence->fence_drv_array_count = i; - xa_unlock(&userq->fence_drv_xa); - } else { - userq_fence->fence_drv_array = NULL; - userq_fence->fence_drv_array_count = 0; - } + /* Make sure the fence is visible to the hang detect worker */ + dma_fence_put(userq->last_fence); + userq->last_fence = dma_fence_get(&fence->base); - /* Check if hardware has already processed the job */ + /* Check if hardware has already processed the fence */ spin_lock_irqsave(&fence_drv->fence_list_lock, flags); - if (!dma_fence_is_signaled(fence)) { - list_add_tail(&userq_fence->link, &fence_drv->fences); + if (!dma_fence_is_signaled(&fence->base)) { + dma_fence_get(&fence->base); + list_add_tail(&fence->link, &fence_drv->fences); } else { + INIT_LIST_HEAD(&fence->link); signaled = true; - dma_fence_put(fence); } spin_unlock_irqrestore(&fence_drv->fence_list_lock, flags); if (signaled) - amdgpu_userq_fence_put_fence_drv_array(userq_fence); - - *f = fence; - - return 0; + amdgpu_userq_fence_put_fence_drv_array(fence); + else + amdgpu_userq_start_hang_detect_work(userq); } static const char *amdgpu_userq_fence_get_driver_name(struct dma_fence *f) @@ -342,7 +336,7 @@ static void amdgpu_userq_fence_free(struct rcu_head *rcu) amdgpu_userq_fence_driver_put(fence_drv); kvfree(userq_fence->fence_drv_array); - kmem_cache_free(amdgpu_userq_fence_slab, userq_fence); + kfree(userq_fence); } static void amdgpu_userq_fence_release(struct dma_fence *f) @@ -423,11 +417,6 @@ static int amdgpu_userq_fence_read_wptr(struct amdgpu_device *adev, return r; } -static void amdgpu_userq_fence_cleanup(struct dma_fence *fence) -{ - dma_fence_put(fence); -} - static void amdgpu_userq_fence_driver_set_error(struct amdgpu_userq_fence *fence, int error) @@ -471,13 +460,14 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data, const unsigned int num_read_bo_handles = args->num_bo_read_handles; struct amdgpu_fpriv *fpriv = filp->driver_priv; struct amdgpu_userq_mgr *userq_mgr = &fpriv->userq_mgr; + struct drm_gem_object **gobj_write, **gobj_read; u32 *syncobj_handles, num_syncobj_handles; - struct amdgpu_userq_fence *userq_fence; - struct amdgpu_usermode_queue *queue = NULL; - struct drm_syncobj **syncobj = NULL; - struct dma_fence *fence; + struct amdgpu_usermode_queue *queue; + struct amdgpu_userq_fence *fence; + struct drm_syncobj **syncobj; struct drm_exec exec; + void __user *ptr; int r, i, entry; u64 wptr; @@ -489,13 +479,14 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data, return -EINVAL; num_syncobj_handles = args->num_syncobj_handles; - syncobj_handles = memdup_array_user(u64_to_user_ptr(args->syncobj_handles), - num_syncobj_handles, sizeof(u32)); + ptr = u64_to_user_ptr(args->syncobj_handles); + syncobj_handles = memdup_array_user(ptr, num_syncobj_handles, + sizeof(u32)); if (IS_ERR(syncobj_handles)) return PTR_ERR(syncobj_handles); - /* Array of pointers to the looked up syncobjs */ - syncobj = kmalloc_array(num_syncobj_handles, sizeof(*syncobj), GFP_KERNEL); + syncobj = kmalloc_array(num_syncobj_handles, sizeof(*syncobj), + GFP_KERNEL); if (!syncobj) { r = -ENOMEM; goto free_syncobj_handles; @@ -509,21 +500,17 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data, } } - r = drm_gem_objects_lookup(filp, - u64_to_user_ptr(args->bo_read_handles), - num_read_bo_handles, - &gobj_read); + ptr = u64_to_user_ptr(args->bo_read_handles); + r = drm_gem_objects_lookup(filp, ptr, num_read_bo_handles, &gobj_read); if (r) goto free_syncobj; - r = drm_gem_objects_lookup(filp, - u64_to_user_ptr(args->bo_write_handles), - num_write_bo_handles, + ptr = u64_to_user_ptr(args->bo_write_handles); + r = drm_gem_objects_lookup(filp, ptr, num_write_bo_handles, &gobj_write); if (r) goto put_gobj_read; - /* Retrieve the user queue */ queue = amdgpu_userq_get(userq_mgr, args->queue_id); if (!queue) { r = -ENOENT; @@ -532,73 +519,61 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data, r = amdgpu_userq_fence_read_wptr(adev, queue, &wptr); if (r) - goto put_gobj_write; + goto put_queue; - r = amdgpu_userq_fence_alloc(&userq_fence); + r = amdgpu_userq_fence_alloc(queue, &fence); if (r) - goto put_gobj_write; + goto put_queue; /* We are here means UQ is active, make sure the eviction fence is valid */ amdgpu_userq_ensure_ev_fence(&fpriv->userq_mgr, &fpriv->evf_mgr); - /* Create a new fence */ - r = amdgpu_userq_fence_create(queue, userq_fence, wptr, &fence); - if (r) { - mutex_unlock(&userq_mgr->userq_mutex); - kmem_cache_free(amdgpu_userq_fence_slab, userq_fence); - goto put_gobj_write; - } + /* Create the new fence */ + amdgpu_userq_fence_init(queue, fence, wptr); - dma_fence_put(queue->last_fence); - queue->last_fence = dma_fence_get(fence); - amdgpu_userq_start_hang_detect_work(queue); mutex_unlock(&userq_mgr->userq_mutex); + /* + * This needs to come after the fence is created since + * amdgpu_userq_ensure_ev_fence() can't be called while holding the resv + * locks. + */ drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, (num_read_bo_handles + num_write_bo_handles)); - /* Lock all BOs with retry handling */ drm_exec_until_all_locked(&exec) { - r = drm_exec_prepare_array(&exec, gobj_read, num_read_bo_handles, 1); + r = drm_exec_prepare_array(&exec, gobj_read, + num_read_bo_handles, 1); drm_exec_retry_on_contention(&exec); - if (r) { - amdgpu_userq_fence_cleanup(fence); + if (r) goto exec_fini; - } - r = drm_exec_prepare_array(&exec, gobj_write, num_write_bo_handles, 1); + r = drm_exec_prepare_array(&exec, gobj_write, + num_write_bo_handles, 1); drm_exec_retry_on_contention(&exec); - if (r) { - amdgpu_userq_fence_cleanup(fence); + if (r) goto exec_fini; - } } - for (i = 0; i < num_read_bo_handles; i++) { - if (!gobj_read || !gobj_read[i]->resv) - continue; - - dma_resv_add_fence(gobj_read[i]->resv, fence, + /* And publish the new fence in the BOs and syncobj */ + for (i = 0; i < num_read_bo_handles; i++) + dma_resv_add_fence(gobj_read[i]->resv, &fence->base, DMA_RESV_USAGE_READ); - } - - for (i = 0; i < num_write_bo_handles; i++) { - if (!gobj_write || !gobj_write[i]->resv) - continue; - dma_resv_add_fence(gobj_write[i]->resv, fence, + for (i = 0; i < num_write_bo_handles; i++) + dma_resv_add_fence(gobj_write[i]->resv, &fence->base, DMA_RESV_USAGE_WRITE); - } - /* Add the created fence to syncobj/BO's */ for (i = 0; i < num_syncobj_handles; i++) - drm_syncobj_replace_fence(syncobj[i], fence); + drm_syncobj_replace_fence(syncobj[i], &fence->base); +exec_fini: /* drop the reference acquired in fence creation function */ - dma_fence_put(fence); + dma_fence_put(&fence->base); -exec_fini: drm_exec_fini(&exec); +put_queue: + amdgpu_userq_put(queue); put_gobj_write: for (i = 0; i < num_write_bo_handles; i++) drm_gem_object_put(gobj_write[i]); @@ -609,15 +584,11 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data, kvfree(gobj_read); free_syncobj: while (entry-- > 0) - if (syncobj[entry]) - drm_syncobj_put(syncobj[entry]); + drm_syncobj_put(syncobj[entry]); kfree(syncobj); free_syncobj_handles: kfree(syncobj_handles); - if (queue) - amdgpu_userq_put(queue); - return r; } @@ -892,8 +863,10 @@ amdgpu_userq_wait_return_fence_info(struct drm_file *filp, * Otherwise, we would gather those references until we don't * have any more space left and crash. */ + mutex_lock(&waitq->fence_drv_lock); r = xa_alloc(&waitq->fence_drv_xa, &index, fence_drv, xa_limit_32b, GFP_KERNEL); + mutex_unlock(&waitq->fence_drv_lock); if (r) goto put_waitq; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.h index d56246ad8c260b..0bd51616cef1bb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.h @@ -58,15 +58,12 @@ struct amdgpu_userq_fence_driver { char timeline_name[TASK_COMM_LEN]; }; -int amdgpu_userq_fence_slab_init(void); -void amdgpu_userq_fence_slab_fini(void); - void amdgpu_userq_fence_driver_get(struct amdgpu_userq_fence_driver *fence_drv); void amdgpu_userq_fence_driver_put(struct amdgpu_userq_fence_driver *fence_drv); int amdgpu_userq_fence_driver_alloc(struct amdgpu_device *adev, struct amdgpu_userq_fence_driver **fence_drv_req); void amdgpu_userq_fence_driver_free(struct amdgpu_usermode_queue *userq); -void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_drv); +int amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_drv); void amdgpu_userq_fence_driver_force_completion(struct amdgpu_usermode_queue *userq); void amdgpu_userq_fence_driver_destroy(struct kref *ref); int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c index 0e0b1e5b88fce8..c35372e21261d2 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c @@ -602,6 +602,13 @@ static int gfx_v12_0_init_microcode(struct amdgpu_device *adev) "amdgpu/%s_pfp.bin", ucode_prefix); if (err) goto out; + + adev->gfx.rs64_enable = amdgpu_ucode_hdr_version( + (union amdgpu_firmware_header *) + adev->gfx.pfp_fw->data, 2, 0); + if (adev->gfx.rs64_enable) + dev_dbg(adev->dev, "CP RS64 enable\n"); + amdgpu_gfx_cp_init_microcode(adev, AMDGPU_UCODE_ID_CP_RS64_PFP); amdgpu_gfx_cp_init_microcode(adev, AMDGPU_UCODE_ID_CP_RS64_PFP_P0_STACK); diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index 95be105671ece8..86c7c2a429b713 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -5660,9 +5660,6 @@ static void gfx_v9_0_ring_emit_fence_kiq(struct amdgpu_ring *ring, u64 addr, { struct amdgpu_device *adev = ring->adev; - /* we only allocate 32bit for each seq wb address */ - BUG_ON(flags & AMDGPU_FENCE_FLAG_64BIT); - /* write fence seq to the "addr" */ amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3)); amdgpu_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) | diff --git a/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c b/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c index 2fc39a6938f6db..5b4121ddc78c6f 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c @@ -30,34 +30,6 @@ #define AMDGPU_USERQ_PROC_CTX_SZ PAGE_SIZE #define AMDGPU_USERQ_GANG_CTX_SZ PAGE_SIZE -static int -mes_userq_map_gtt_bo_to_gart(struct amdgpu_bo *bo) -{ - int ret; - - ret = amdgpu_bo_reserve(bo, true); - if (ret) { - DRM_ERROR("Failed to reserve bo. ret %d\n", ret); - goto err_reserve_bo_failed; - } - - ret = amdgpu_ttm_alloc_gart(&bo->tbo); - if (ret) { - DRM_ERROR("Failed to bind bo to GART. ret %d\n", ret); - goto err_map_bo_gart_failed; - } - - amdgpu_bo_unreserve(bo); - bo = amdgpu_bo_ref(bo); - - return 0; - -err_map_bo_gart_failed: - amdgpu_bo_unreserve(bo); -err_reserve_bo_failed: - return ret; -} - static int mes_userq_create_wptr_mapping(struct amdgpu_device *adev, struct amdgpu_userq_mgr *uq_mgr, @@ -65,55 +37,62 @@ mes_userq_create_wptr_mapping(struct amdgpu_device *adev, uint64_t wptr) { struct amdgpu_bo_va_mapping *wptr_mapping; - struct amdgpu_vm *wptr_vm; struct amdgpu_userq_obj *wptr_obj = &queue->wptr_obj; + struct amdgpu_bo *obj; + struct amdgpu_vm *vm = queue->vm; + struct drm_exec exec; int ret; - wptr_vm = queue->vm; - ret = amdgpu_bo_reserve(wptr_vm->root.bo, false); - if (ret) - return ret; - wptr &= AMDGPU_GMC_HOLE_MASK; - wptr_mapping = amdgpu_vm_bo_lookup_mapping(wptr_vm, wptr >> PAGE_SHIFT); - amdgpu_bo_unreserve(wptr_vm->root.bo); - if (!wptr_mapping) { - DRM_ERROR("Failed to lookup wptr bo\n"); - return -EINVAL; - } - wptr_obj->obj = wptr_mapping->bo_va->base.bo; - if (wptr_obj->obj->tbo.base.size > PAGE_SIZE) { - DRM_ERROR("Requested GART mapping for wptr bo larger than one page\n"); - return -EINVAL; - } + drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES, 2); + drm_exec_until_all_locked(&exec) { + ret = amdgpu_vm_lock_pd(vm, &exec, 1); + drm_exec_retry_on_contention(&exec); + if (unlikely(ret)) + goto fail_lock; + + wptr_mapping = amdgpu_vm_bo_lookup_mapping(vm, wptr >> PAGE_SHIFT); + if (!wptr_mapping) { + ret = -EINVAL; + goto fail_lock; + } - ret = mes_userq_map_gtt_bo_to_gart(wptr_obj->obj); - if (ret) { - DRM_ERROR("Failed to map wptr bo to GART\n"); - return ret; + obj = wptr_mapping->bo_va->base.bo; + ret = drm_exec_lock_obj(&exec, &obj->tbo.base); + drm_exec_retry_on_contention(&exec); + if (unlikely(ret)) + goto fail_lock; } - ret = amdgpu_bo_reserve(wptr_obj->obj, true); - if (ret) { - DRM_ERROR("Failed to reserve wptr bo\n"); - return ret; + wptr_obj->obj = amdgpu_bo_ref(wptr_mapping->bo_va->base.bo); + if (wptr_obj->obj->tbo.base.size > PAGE_SIZE) { + ret = -EINVAL; + goto fail_map; } /* TODO use eviction fence instead of pinning. */ ret = amdgpu_bo_pin(wptr_obj->obj, AMDGPU_GEM_DOMAIN_GTT); if (ret) { - drm_file_err(uq_mgr->file, "[Usermode queues] Failed to pin wptr bo\n"); - goto unresv_bo; + DRM_ERROR("Failed to pin wptr bo. ret %d\n", ret); + goto fail_map; + } + + ret = amdgpu_ttm_alloc_gart(&wptr_obj->obj->tbo); + if (ret) { + DRM_ERROR("Failed to bind bo to GART. ret %d\n", ret); + goto fail_map; } queue->wptr_obj.gpu_addr = amdgpu_bo_gpu_offset(wptr_obj->obj); - amdgpu_bo_unreserve(wptr_obj->obj); + drm_exec_fini(&exec); return 0; -unresv_bo: - amdgpu_bo_unreserve(wptr_obj->obj); +fail_map: + amdgpu_bo_unref(&wptr_obj->obj); +fail_lock: + drm_exec_fini(&exec); return ret; } diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c index 44f0f23e114843..e64f2f6df9a9e2 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -889,7 +889,7 @@ static void sdma_v4_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 se /* write the fence */ amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_FENCE)); /* zero in first two bits */ - BUG_ON(addr & 0x3); + WARN_ON(addr & 0x3); amdgpu_ring_write(ring, lower_32_bits(addr)); amdgpu_ring_write(ring, upper_32_bits(addr)); amdgpu_ring_write(ring, lower_32_bits(seq)); @@ -899,7 +899,7 @@ static void sdma_v4_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 se addr += 4; amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_FENCE)); /* zero in first two bits */ - BUG_ON(addr & 0x3); + WARN_ON(addr & 0x3); amdgpu_ring_write(ring, lower_32_bits(addr)); amdgpu_ring_write(ring, upper_32_bits(addr)); amdgpu_ring_write(ring, upper_32_bits(seq)); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index f829d65a79b43e..f95bf6d9553463 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1360,7 +1360,7 @@ static int kfd_ioctl_map_memory_to_gpu(struct file *filep, peer_pdd = kfd_process_device_data_by_id(p, devices_arr[i]); if (WARN_ON_ONCE(!peer_pdd)) continue; - kfd_flush_tlb(peer_pdd, TLB_FLUSH_LEGACY); + kfd_flush_tlb(peer_pdd); } kfree(devices_arr); @@ -1455,7 +1455,7 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file *filep, if (WARN_ON_ONCE(!peer_pdd)) continue; if (flush_tlb) - kfd_flush_tlb(peer_pdd, TLB_FLUSH_HEAVYWEIGHT); + kfd_flush_tlb(peer_pdd); /* Remove dma mapping after tlb flush to avoid IO_PAGE_FAULT */ err = amdgpu_amdkfd_gpuvm_dmaunmap_mem(mem, peer_pdd->drm_priv); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 8ff97bf7d95a0b..b7f8f7ff819834 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -1737,37 +1737,6 @@ bool kgd2kfd_vmfault_fast_path(struct amdgpu_device *adev, struct amdgpu_iv_entr return false; } -/* check if there is kfd process still uses adev */ -static bool kgd2kfd_check_device_idle(struct amdgpu_device *adev) -{ - struct kfd_process *p; - struct hlist_node *p_temp; - unsigned int temp; - struct kfd_node *dev; - - mutex_lock(&kfd_processes_mutex); - - if (hash_empty(kfd_processes_table)) { - mutex_unlock(&kfd_processes_mutex); - return true; - } - - /* check if there is device still use adev */ - hash_for_each_safe(kfd_processes_table, temp, p_temp, p, kfd_processes) { - for (int i = 0; i < p->n_pdds; i++) { - dev = p->pdds[i]->dev; - if (dev->adev == adev) { - mutex_unlock(&kfd_processes_mutex); - return false; - } - } - } - - mutex_unlock(&kfd_processes_mutex); - - return true; -} - /** kgd2kfd_teardown_processes - gracefully tear down existing * kfd processes that use adev * @@ -1800,7 +1769,7 @@ void kgd2kfd_teardown_processes(struct amdgpu_device *adev) mutex_unlock(&kfd_processes_mutex); /* wait all kfd processes use adev terminate */ - while (!kgd2kfd_check_device_idle(adev)) + while (!!atomic_read(&adev->kfd.dev->kfd_processes_count)) cond_resched(); } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index ab3b2e7be9bd04..9185ebe4c079a8 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -572,7 +572,7 @@ static int allocate_vmid(struct device_queue_manager *dqm, qpd->vmid, qpd->page_table_base); /* invalidate the VM context after pasid and vmid mapping is set up */ - kfd_flush_tlb(qpd_to_pdd(qpd), TLB_FLUSH_LEGACY); + kfd_flush_tlb(qpd_to_pdd(qpd)); if (dqm->dev->kfd2kgd->set_scratch_backing_va) dqm->dev->kfd2kgd->set_scratch_backing_va(dqm->dev->adev, @@ -610,7 +610,7 @@ static void deallocate_vmid(struct device_queue_manager *dqm, if (flush_texture_cache_nocpsch(q->device, qpd)) dev_err(dev, "Failed to flush TC\n"); - kfd_flush_tlb(qpd_to_pdd(qpd), TLB_FLUSH_LEGACY); + kfd_flush_tlb(qpd_to_pdd(qpd)); /* Release the vmid mapping */ set_pasid_vmid_mapping(dqm, 0, qpd->vmid); @@ -1284,7 +1284,7 @@ static int restore_process_queues_nocpsch(struct device_queue_manager *dqm, dqm->dev->adev, qpd->vmid, qpd->page_table_base); - kfd_flush_tlb(pdd, TLB_FLUSH_LEGACY); + kfd_flush_tlb(pdd); } /* Take a safe reference to the mm_struct, which may otherwise diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 163d665a6074cc..7b5b1220691987 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -1554,13 +1554,13 @@ void kfd_signal_reset_event(struct kfd_node *dev); void kfd_signal_poison_consumed_event(struct kfd_node *dev, u32 pasid); void kfd_signal_process_terminate_event(struct kfd_process *p); -static inline void kfd_flush_tlb(struct kfd_process_device *pdd, - enum TLB_FLUSH_TYPE type) +static inline void kfd_flush_tlb(struct kfd_process_device *pdd) { struct amdgpu_device *adev = pdd->dev->adev; struct amdgpu_vm *vm = drm_priv_to_vm(pdd->drm_priv); - amdgpu_vm_flush_compute_tlb(adev, vm, type, pdd->dev->xcc_mask); + amdgpu_vm_flush_compute_tlb(adev, vm, TLB_FLUSH_HEAVYWEIGHT, + pdd->dev->xcc_mask); } static inline bool kfd_flush_tlb_after_unmap(struct kfd_dev *dev) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 38085a0a0f58ee..35ec67d9739bdd 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -1424,7 +1424,7 @@ svm_range_unmap_from_gpus(struct svm_range *prange, unsigned long start, if (r) break; } - kfd_flush_tlb(pdd, TLB_FLUSH_HEAVYWEIGHT); + kfd_flush_tlb(pdd); } return r; @@ -1571,7 +1571,7 @@ svm_range_map_to_gpus(struct svm_range *prange, unsigned long offset, } } - kfd_flush_tlb(pdd, TLB_FLUSH_LEGACY); + kfd_flush_tlb(pdd); } return r; diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c index 82f81b58698660..3751f7a94a059e 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c @@ -92,9 +92,14 @@ #include "dml/dcn32/dcn32_fpu.h" #include "dc_state_priv.h" +#include "dc_fpu.h" #include "dml2_0/dml2_wrapper.h" +#if !defined(DC_RUN_WITH_PREEMPTION_ENABLED) +#define DC_RUN_WITH_PREEMPTION_ENABLED(code) code +#endif + #define DC_LOGGER_INIT(logger) enum dcn32_clk_src_array_id { @@ -1684,7 +1689,8 @@ static void dcn32_enable_phantom_plane(struct dc *dc, if (curr_pipe->top_pipe && curr_pipe->top_pipe->plane_state == curr_pipe->plane_state) phantom_plane = prev_phantom_plane; else - phantom_plane = dc_state_create_phantom_plane(dc, context, curr_pipe->plane_state); + DC_RUN_WITH_PREEMPTION_ENABLED(phantom_plane = + dc_state_create_phantom_plane(dc, context, curr_pipe->plane_state)); if (!phantom_plane) continue; diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c index 731355bdb9bc30..3650e7beeb6712 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c @@ -1333,12 +1333,13 @@ static int ci_populate_all_memory_levels(struct pp_hwmgr *hwmgr) dev_id = adev->pdev->device; - if ((dpm_table->mclk_table.count >= 2) - && ((dev_id == 0x67B0) || (dev_id == 0x67B1))) { - smu_data->smc_state_table.MemoryLevel[1].MinVddci = - smu_data->smc_state_table.MemoryLevel[0].MinVddci; - smu_data->smc_state_table.MemoryLevel[1].MinMvdd = - smu_data->smc_state_table.MemoryLevel[0].MinMvdd; + if ((dpm_table->mclk_table.count >= 2) && + ((dev_id == 0x67B0) || (dev_id == 0x67B1)) && + (adev->pdev->revision == 0)) { + smu_data->smc_state_table.MemoryLevel[1].MinVddc = + smu_data->smc_state_table.MemoryLevel[0].MinVddc; + smu_data->smc_state_table.MemoryLevel[1].MinVddcPhases = + smu_data->smc_state_table.MemoryLevel[0].MinVddcPhases; } smu_data->smc_state_table.MemoryLevel[0].ActivityLevel = 0x1F; CONVERT_FROM_HOST_TO_SMC_US(smu_data->smc_state_table.MemoryLevel[0].ActivityLevel); diff --git a/drivers/gpu/drm/bridge/imx/imx8qxp-pxl2dpi.c b/drivers/gpu/drm/bridge/imx/imx8qxp-pxl2dpi.c index 441fd32dc91c7a..d64e328bf542f3 100644 --- a/drivers/gpu/drm/bridge/imx/imx8qxp-pxl2dpi.c +++ b/drivers/gpu/drm/bridge/imx/imx8qxp-pxl2dpi.c @@ -222,52 +222,58 @@ static const struct drm_bridge_funcs imx8qxp_pxl2dpi_bridge_funcs = { imx8qxp_pxl2dpi_bridge_atomic_get_output_bus_fmts, }; -static struct device_node * +static int imx8qxp_pxl2dpi_get_available_ep_from_port(struct imx8qxp_pxl2dpi *p2d, - u32 port_id) + u32 port_id, + struct device_node **ep) { - struct device_node *port, *ep; + struct device_node *port; + int ret = 0; int ep_cnt; + *ep = NULL; + port = of_graph_get_port_by_id(p2d->dev->of_node, port_id); if (!port) { DRM_DEV_ERROR(p2d->dev, "failed to get port@%u\n", port_id); - return ERR_PTR(-ENODEV); + return -ENODEV; } ep_cnt = of_get_available_child_count(port); if (ep_cnt == 0) { DRM_DEV_ERROR(p2d->dev, "no available endpoints of port@%u\n", port_id); - ep = ERR_PTR(-ENODEV); + ret = -ENODEV; goto out; } else if (ep_cnt > 1) { DRM_DEV_ERROR(p2d->dev, "invalid available endpoints of port@%u\n", port_id); - ep = ERR_PTR(-EINVAL); + ret = -EINVAL; goto out; } - ep = of_get_next_available_child(port, NULL); - if (!ep) { + *ep = of_get_next_available_child(port, NULL); + if (!*ep) { DRM_DEV_ERROR(p2d->dev, "failed to get available endpoint of port@%u\n", port_id); - ep = ERR_PTR(-ENODEV); + ret = -ENODEV; goto out; } out: of_node_put(port); - return ep; + return ret; } static int imx8qxp_pxl2dpi_find_next_bridge(struct imx8qxp_pxl2dpi *p2d) { - struct device_node *ep __free(device_node) = - imx8qxp_pxl2dpi_get_available_ep_from_port(p2d, 1); - if (IS_ERR(ep)) - return PTR_ERR(ep); + struct device_node *ep __free(device_node) = NULL; + int ret; + + ret = imx8qxp_pxl2dpi_get_available_ep_from_port(p2d, 1, &ep); + if (ret) + return ret; struct device_node *remote __free(device_node) = of_graph_get_remote_port_parent(ep); if (!remote || !of_device_is_available(remote)) { @@ -291,9 +297,9 @@ static int imx8qxp_pxl2dpi_set_pixel_link_sel(struct imx8qxp_pxl2dpi *p2d) struct of_endpoint endpoint; int ret; - ep = imx8qxp_pxl2dpi_get_available_ep_from_port(p2d, 0); - if (IS_ERR(ep)) - return PTR_ERR(ep); + ret = imx8qxp_pxl2dpi_get_available_ep_from_port(p2d, 0, &ep); + if (ret) + return ret; ret = of_graph_parse_endpoint(ep, &endpoint); if (ret) { diff --git a/drivers/gpu/drm/bridge/tda998x_drv.c b/drivers/gpu/drm/bridge/tda998x_drv.c index d9b388165de151..6c427bc75896bc 100644 --- a/drivers/gpu/drm/bridge/tda998x_drv.c +++ b/drivers/gpu/drm/bridge/tda998x_drv.c @@ -1293,7 +1293,7 @@ static const struct drm_edid *tda998x_edid_read(struct tda998x_priv *priv, * can't handle signals gracefully. */ if (tda998x_edid_delay_wait(priv)) - return 0; + return NULL; if (priv->rev == TDA19988) reg_clear(priv, REG_TX4, TX4_PD_RAM); @@ -1762,7 +1762,7 @@ static const struct drm_bridge_funcs tda998x_bridge_funcs = { static int tda998x_get_audio_ports(struct tda998x_priv *priv, struct device_node *np) { - const u32 *port_data; + const __be32 *port_data; u32 size; int i; diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index a80a335f414801..1541fc8a9ac2d3 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -490,7 +490,7 @@ static void drm_fb_helper_memory_range_to_clip(struct fb_info *info, off_t off, * the number of horizontal pixels that need an update. */ off_t bit_off = (off % line_length) * 8; - off_t bit_end = (end % line_length) * 8; + off_t bit_end = bit_off + len * 8; x1 = bit_off / info->var.bits_per_pixel; x2 = DIV_ROUND_UP(bit_end, info->var.bits_per_pixel); diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index d6424267260bdb..8afab57fc0554e 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -1019,7 +1019,7 @@ int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { struct drm_gem_change_handle *args = data; - struct drm_gem_object *obj; + struct drm_gem_object *obj, *idrobj; int handle, ret; if (!drm_core_check_feature(dev, DRIVER_GEM)) @@ -1042,12 +1042,30 @@ int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, mutex_lock(&file_priv->prime.lock); spin_lock(&file_priv->table_lock); + + /* When create_tail allocs an obj idr, it needs to first alloc as NULL, + * then later replace with the correct object. This is not necessary + * here, because the only operations that could race are drm_prime + * bookkeeping, and we hold the prime lock. + */ ret = idr_alloc(&file_priv->object_idr, obj, handle, handle + 1, GFP_NOWAIT); - spin_unlock(&file_priv->table_lock); - if (ret < 0) - goto out_unlock; + if (ret < 0) { + spin_unlock(&file_priv->table_lock); + goto out_unlock; + } + + idrobj = idr_replace(&file_priv->object_idr, NULL, handle); + if (idrobj != obj) { + idr_replace(&file_priv->object_idr, idrobj, handle); + idr_remove(&file_priv->object_idr, args->new_handle); + spin_unlock(&file_priv->table_lock); + ret = -ENOENT; + goto out_unlock; + } + + spin_unlock(&file_priv->table_lock); if (obj->dma_buf) { ret = drm_prime_add_buf_handle(&file_priv->prime, obj->dma_buf, @@ -1066,7 +1084,9 @@ int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, spin_lock(&file_priv->table_lock); idr_remove(&file_priv->object_idr, args->handle); + idrobj = idr_replace(&file_priv->object_idr, obj, handle); spin_unlock(&file_priv->table_lock); + WARN_ON(idrobj != NULL); out_unlock: mutex_unlock(&file_priv->prime.lock); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index df4232d7e135d1..3cc50d697c8917 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -116,16 +116,18 @@ int etnaviv_sched_push_job(struct etnaviv_gem_submit *submit) */ mutex_lock(&gpu->sched_lock); + ret = xa_alloc_cyclic(&gpu->user_fences, &submit->out_fence_id, + NULL, xa_limit_32b, &gpu->next_user_fence, + GFP_KERNEL); + if (ret < 0) + goto out_unlock; + drm_sched_job_arm(&submit->sched_job); submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished); - ret = xa_alloc_cyclic(&gpu->user_fences, &submit->out_fence_id, - submit->out_fence, xa_limit_32b, - &gpu->next_user_fence, GFP_KERNEL); - if (ret < 0) { - drm_sched_job_cleanup(&submit->sched_job); - goto out_unlock; - } + + xa_store(&gpu->user_fences, submit->out_fence_id, + submit->out_fence, GFP_KERNEL); /* the scheduler holds on to the job now */ kref_get(&submit->refcount); diff --git a/drivers/gpu/drm/exynos/exynos_drm_mic.c b/drivers/gpu/drm/exynos/exynos_drm_mic.c index 29a8366513fa70..e68c954ec3e614 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_mic.c +++ b/drivers/gpu/drm/exynos/exynos_drm_mic.c @@ -423,7 +423,9 @@ static int exynos_mic_probe(struct platform_device *pdev) mic->bridge.of_node = dev->of_node; - drm_bridge_add(&mic->bridge); + ret = devm_drm_bridge_add(dev, &mic->bridge); + if (ret) + goto err; pm_runtime_enable(dev); @@ -443,12 +445,8 @@ static int exynos_mic_probe(struct platform_device *pdev) static void exynos_mic_remove(struct platform_device *pdev) { - struct exynos_mic *mic = platform_get_drvdata(pdev); - component_del(&pdev->dev, &exynos_mic_component_ops); pm_runtime_disable(&pdev->dev); - - drm_bridge_remove(&mic->bridge); } static const struct of_device_id exynos_mic_of_match[] = { diff --git a/drivers/gpu/drm/gma500/oaktrail_hdmi.c b/drivers/gpu/drm/gma500/oaktrail_hdmi.c index 58d7e191fd56f8..403d21cbb3a230 100644 --- a/drivers/gpu/drm/gma500/oaktrail_hdmi.c +++ b/drivers/gpu/drm/gma500/oaktrail_hdmi.c @@ -580,6 +580,7 @@ static int oaktrail_hdmi_get_modes(struct drm_connector *connector) } else { edid = (struct edid *)raw_edid; /* FIXME ? edid = drm_get_edid(connector, i2c_adap); */ + i2c_put_adapter(i2c_adap); } if (edid) { diff --git a/drivers/gpu/drm/gma500/oaktrail_lvds.c b/drivers/gpu/drm/gma500/oaktrail_lvds.c index 884d324f004408..e194d0cce0671a 100644 --- a/drivers/gpu/drm/gma500/oaktrail_lvds.c +++ b/drivers/gpu/drm/gma500/oaktrail_lvds.c @@ -293,7 +293,7 @@ void oaktrail_lvds_init(struct drm_device *dev, { struct gma_encoder *gma_encoder; struct gma_connector *gma_connector; - struct gma_i2c_chan *ddc_bus; + struct gma_i2c_chan *ddc_bus = NULL; struct drm_connector *connector; struct drm_encoder *encoder; struct drm_psb_private *dev_priv = to_drm_psb_private(dev); @@ -367,6 +367,8 @@ void oaktrail_lvds_init(struct drm_device *dev, if (edid == NULL && dev_priv->lpc_gpio_base) { ddc_bus = oaktrail_lvds_i2c_init(dev); if (!IS_ERR(ddc_bus)) { + if (i2c_adap) + i2c_put_adapter(i2c_adap); i2c_adap = &ddc_bus->base; edid = drm_get_edid(connector, i2c_adap); } @@ -421,7 +423,10 @@ void oaktrail_lvds_init(struct drm_device *dev, err_unlock: mutex_unlock(&dev->mode_config.mutex); - gma_i2c_destroy(to_gma_i2c_chan(connector->ddc)); + if (!IS_ERR_OR_NULL(ddc_bus)) + gma_i2c_destroy(ddc_bus); + else if (i2c_adap) + i2c_put_adapter(i2c_adap); drm_encoder_cleanup(encoder); err_connector_cleanup: drm_connector_cleanup(connector); diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c index 4955bd8b11d7ad..50d34b4fdf82a9 100644 --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -3119,8 +3119,13 @@ static void intel_dp_compute_vsc_colorimetry(const struct intel_crtc_state *crtc drm_WARN_ON(display->drm, vsc->bpc == 6 && vsc->pixelformat != DP_PIXELFORMAT_RGB); - /* all YCbCr are always limited range */ - vsc->dynamic_range = DP_DYNAMIC_RANGE_CTA; + /* All YCbCr formats are always limited range. */ + if (vsc->pixelformat == DP_PIXELFORMAT_RGB) + vsc->dynamic_range = crtc_state->limited_color_range ? + DP_DYNAMIC_RANGE_CTA : DP_DYNAMIC_RANGE_VESA; + else + vsc->dynamic_range = DP_DYNAMIC_RANGE_CTA; + vsc->content_type = DP_CONTENT_TYPE_NOT_DEFINED; } diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index 984d0056c01c25..adff482a6c9cd9 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -132,7 +132,8 @@ void __i915_request_reset(struct i915_request *rq, bool guilty) rcu_read_lock(); /* protect the GEM context */ if (guilty) { i915_request_set_error_once(rq, -EIO); - __i915_request_skip(rq); + if (!i915_request_signaled(rq)) + __i915_request_skip(rq); banned = mark_guilty(rq); } else { i915_request_set_error_once(rq, -EAGAIN); diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c index 385a634c3ed004..d9be7a5a239c16 100644 --- a/drivers/gpu/drm/i915/i915_driver.c +++ b/drivers/gpu/drm/i915/i915_driver.c @@ -750,9 +750,8 @@ static bool has_auxccs(struct drm_device *drm) { struct drm_i915_private *i915 = to_i915(drm); - return IS_GRAPHICS_VER(i915, 9, 12) || - IS_ALDERLAKE_P(i915) || - IS_METEORLAKE(i915); + return IS_GRAPHICS_VER(i915, 9, 12) && + !HAS_FLAT_CCS(i915); } static bool has_fenced_regions(struct drm_device *drm) diff --git a/drivers/gpu/drm/loongson/lsdc_drv.c b/drivers/gpu/drm/loongson/lsdc_drv.c index 1ece1ea42f7816..34405073c4d48e 100644 --- a/drivers/gpu/drm/loongson/lsdc_drv.c +++ b/drivers/gpu/drm/loongson/lsdc_drv.c @@ -293,7 +293,7 @@ static int lsdc_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) vga_client_register(pdev, lsdc_vga_set_decode); - drm_kms_helper_poll_init(ddev); + drmm_kms_helper_poll_init(ddev); if (loongson_vblank) { ret = drm_vblank_init(ddev, descp->num_of_crtc); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c index 72848ed80df73f..b101e14f841e03 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c @@ -2513,6 +2513,7 @@ static const struct nvkm_device_chip nv170_chipset = { .name = "GA100", .bar = { 0x00000001, tu102_bar_new }, + .bios = { 0x00000001, nvkm_bios_new }, .devinit = { 0x00000001, ga100_devinit_new }, .fault = { 0x00000001, tu102_fault_new }, .fb = { 0x00000001, ga100_fb_new }, @@ -2529,7 +2530,6 @@ nv170_chipset = { .vfn = { 0x00000001, ga100_vfn_new }, .ce = { 0x000003ff, ga100_ce_new }, .fifo = { 0x00000001, ga100_fifo_new }, - .sec2 = { 0x00000001, tu102_sec2_new }, }; static const struct nvkm_device_chip @@ -3341,7 +3341,6 @@ nvkm_device_ctor(const struct nvkm_device_func *func, case 0x166: device->chip = &nv166_chipset; break; case 0x167: device->chip = &nv167_chipset; break; case 0x168: device->chip = &nv168_chipset; break; - case 0x170: device->chip = &nv170_chipset; break; case 0x172: device->chip = &nv172_chipset; break; case 0x173: device->chip = &nv173_chipset; break; case 0x174: device->chip = &nv174_chipset; break; @@ -3361,6 +3360,14 @@ nvkm_device_ctor(const struct nvkm_device_func *func, case 0x1b6: device->chip = &nv1b6_chipset; break; case 0x1b7: device->chip = &nv1b7_chipset; break; default: + if (nvkm_boolopt(device->cfgopt, "NvEnableUnsupportedChipsets", false)) { + switch (device->chipset) { + case 0x170: device->chip = &nv170_chipset; break; + default: + break; + } + } + if (!device->chip) { nvdev_error(device, "unknown chipset (%08x)\n", boot0); ret = -ENODEV; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c index fdd820eeef815e..27a13aeccd3cb3 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c @@ -41,11 +41,15 @@ ga100_gsp_flcn = { static const struct nvkm_gsp_func ga100_gsp = { .flcn = &ga100_gsp_flcn, + .fwsec = &tu102_gsp_fwsec, .sig_section = ".fwsignature_ga100", .booter.ctor = tu102_gsp_booter_ctor, + .fwsec_sb.ctor = tu102_gsp_fwsec_sb_ctor, + .fwsec_sb.dtor = tu102_gsp_fwsec_sb_dtor, + .dtor = r535_gsp_dtor, .oneinit = tu102_gsp_oneinit, .init = tu102_gsp_init, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c index dd82c76b8b9a5b..19cb269e7a2659 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c @@ -318,13 +318,8 @@ tu102_gsp_oneinit(struct nvkm_gsp *gsp) if (ret) return ret; - /* - * Calculate FB layout. FRTS is a memory region created by the FWSEC-FRTS firmware. - * FWSEC comes from VBIOS. So on systems with no VBIOS (e.g. GA100), the FRTS does - * not exist. Therefore, use the existence of VBIOS to determine whether to reserve - * an FRTS region. - */ - gsp->fb.wpr2.frts.size = device->bios ? 0x100000 : 0; + /* Calculate FB layout. */ + gsp->fb.wpr2.frts.size = 0x100000; gsp->fb.wpr2.frts.addr = ALIGN_DOWN(gsp->fb.bios.addr, 0x20000) - gsp->fb.wpr2.frts.size; gsp->fb.wpr2.boot.size = gsp->boot.fw.size; @@ -348,12 +343,9 @@ tu102_gsp_oneinit(struct nvkm_gsp *gsp) if (ret) return ret; - /* Only boot FWSEC-FRTS if it actually exists */ - if (gsp->fb.wpr2.frts.size) { - ret = nvkm_gsp_fwsec_frts(gsp); - if (WARN_ON(ret)) - return ret; - } + ret = nvkm_gsp_fwsec_frts(gsp); + if (WARN_ON(ret)) + return ret; /* Reset GSP into RISC-V mode. */ ret = gsp->func->reset(gsp); diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig index d6863b28ddc559..d592f4f4b939a9 100644 --- a/drivers/gpu/drm/panel/Kconfig +++ b/drivers/gpu/drm/panel/Kconfig @@ -208,6 +208,7 @@ config DRM_PANEL_HIMAX_HX83121A depends on OF depends on DRM_MIPI_DSI depends on BACKLIGHT_CLASS_DEVICE + select DRM_DISPLAY_DSC_HELPER select DRM_KMS_HELPER help Say Y here if you want to enable support for Himax HX83121A-based diff --git a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c index d5fe105bdbdde5..658ce64c71eb2b 100644 --- a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c +++ b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c @@ -1324,6 +1324,8 @@ static int boe_panel_disable(struct drm_panel *panel) mipi_dsi_dcs_set_display_off_multi(&ctx); mipi_dsi_dcs_enter_sleep_mode_multi(&ctx); + boe->dsi->mode_flags |= MIPI_DSI_MODE_LPM; + mipi_dsi_msleep(&ctx, 150); return ctx.accum_err; diff --git a/drivers/gpu/drm/panel/panel-feiyang-fy07024di26a30d.c b/drivers/gpu/drm/panel/panel-feiyang-fy07024di26a30d.c index 4f8d6d8c07e4d7..dbdb7e3cb7b625 100644 --- a/drivers/gpu/drm/panel/panel-feiyang-fy07024di26a30d.c +++ b/drivers/gpu/drm/panel/panel-feiyang-fy07024di26a30d.c @@ -98,9 +98,7 @@ static int feiyang_enable(struct drm_panel *panel) /* T12 (video & logic signal rise + backlight rise) T12 >= 200ms */ msleep(200); - mipi_dsi_dcs_set_display_on(ctx->dsi); - - return 0; + return mipi_dsi_dcs_set_display_on(ctx->dsi); } static int feiyang_disable(struct drm_panel *panel) diff --git a/drivers/gpu/drm/panel/panel-himax-hx83102.c b/drivers/gpu/drm/panel/panel-himax-hx83102.c index 8b2a68ee851e39..a5e5c9ea7a73f5 100644 --- a/drivers/gpu/drm/panel/panel-himax-hx83102.c +++ b/drivers/gpu/drm/panel/panel-himax-hx83102.c @@ -937,6 +937,8 @@ static int hx83102_disable(struct drm_panel *panel) mipi_dsi_dcs_set_display_off_multi(&dsi_ctx); mipi_dsi_dcs_enter_sleep_mode_multi(&dsi_ctx); + dsi->mode_flags |= MIPI_DSI_MODE_LPM; + mipi_dsi_msleep(&dsi_ctx, 150); return dsi_ctx.accum_err; diff --git a/drivers/gpu/drm/panel/panel-himax-hx83121a.c b/drivers/gpu/drm/panel/panel-himax-hx83121a.c index ebe643ba41844a..bed79aa06f46a6 100644 --- a/drivers/gpu/drm/panel/panel-himax-hx83121a.c +++ b/drivers/gpu/drm/panel/panel-himax-hx83121a.c @@ -596,8 +596,8 @@ static int himax_probe(struct mipi_dsi_device *dsi) ctx = devm_drm_panel_alloc(dev, struct himax, panel, &himax_panel_funcs, DRM_MODE_CONNECTOR_DSI); - if (!ctx) - return -ENOMEM; + if (IS_ERR(ctx)) + return PTR_ERR(ctx); ret = devm_regulator_bulk_get_const(&dsi->dev, ARRAY_SIZE(himax_supplies), diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index 711f5101aa04ce..074c0995ddc26c 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -390,6 +390,8 @@ panfrost_ioctl_wait_bo(struct drm_device *dev, void *data, true, timeout); if (!ret) ret = timeout ? -ETIMEDOUT : -EBUSY; + else if (ret > 0) + ret = 0; drm_gem_object_put(gem_obj); diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c index 2bbb1168a3ffa5..1e6a2392d7c6c3 100644 --- a/drivers/gpu/drm/qxl/qxl_drv.c +++ b/drivers/gpu/drm/qxl/qxl_drv.c @@ -118,12 +118,13 @@ qxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) /* Complete initialization. */ ret = drm_dev_register(&qdev->ddev, ent->driver_data); if (ret) - goto modeset_cleanup; + goto poll_fini; drm_client_setup(&qdev->ddev, NULL); return 0; -modeset_cleanup: +poll_fini: + drm_kms_helper_poll_fini(&qdev->ddev); qxl_modeset_fini(qdev); unload: qxl_device_fini(qdev); @@ -154,6 +155,7 @@ qxl_pci_remove(struct pci_dev *pdev) { struct drm_device *dev = pci_get_drvdata(pdev); + drm_kms_helper_poll_fini(dev); drm_dev_unregister(dev); drm_atomic_helper_shutdown(dev); if (pci_is_vga(pdev) && pdev->revision < 5) diff --git a/drivers/gpu/drm/radeon/ci_dpm.c b/drivers/gpu/drm/radeon/ci_dpm.c index 22321eb95b7d5d..703848fac18933 100644 --- a/drivers/gpu/drm/radeon/ci_dpm.c +++ b/drivers/gpu/drm/radeon/ci_dpm.c @@ -2461,7 +2461,8 @@ static void ci_register_patching_mc_arb(struct radeon_device *rdev, if (patch && ((rdev->pdev->device == 0x67B0) || - (rdev->pdev->device == 0x67B1))) { + (rdev->pdev->device == 0x67B1)) && + (rdev->pdev->revision == 0)) { if ((memory_clock > 100000) && (memory_clock <= 125000)) { tmp2 = (((0x31 * engine_clock) / 125000) - 1) & 0xff; *dram_timimg2 &= ~0x00ff0000; @@ -3304,7 +3305,8 @@ static int ci_populate_all_memory_levels(struct radeon_device *rdev) pi->smc_state_table.MemoryLevel[0].EnabledForActivity = 1; if ((dpm_table->mclk_table.count >= 2) && - ((rdev->pdev->device == 0x67B0) || (rdev->pdev->device == 0x67B1))) { + ((rdev->pdev->device == 0x67B0) || (rdev->pdev->device == 0x67B1)) && + (rdev->pdev->revision == 0)) { pi->smc_state_table.MemoryLevel[1].MinVddc = pi->smc_state_table.MemoryLevel[0].MinVddc; pi->smc_state_table.MemoryLevel[1].MinVddcPhases = @@ -4493,7 +4495,8 @@ static int ci_register_patching_mc_seq(struct radeon_device *rdev, if (patch && ((rdev->pdev->device == 0x67B0) || - (rdev->pdev->device == 0x67B1))) { + (rdev->pdev->device == 0x67B1)) && + (rdev->pdev->revision == 0)) { for (i = 0; i < table->last; i++) { if (table->last >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE) return -EINVAL; diff --git a/drivers/gpu/drm/sti/sti_hda.c b/drivers/gpu/drm/sti/sti_hda.c index b7397827889c94..360a88ca8f0c5a 100644 --- a/drivers/gpu/drm/sti/sti_hda.c +++ b/drivers/gpu/drm/sti/sti_hda.c @@ -741,6 +741,7 @@ static int sti_hda_probe(struct platform_device *pdev) struct device *dev = &pdev->dev; struct sti_hda *hda; struct resource *res; + int ret; DRM_INFO("%s\n", __func__); @@ -779,7 +780,9 @@ static int sti_hda_probe(struct platform_device *pdev) return PTR_ERR(hda->clk_hddac); } - drm_bridge_add(&hda->bridge); + ret = devm_drm_bridge_add(dev, &hda->bridge); + if (ret) + return ret; platform_set_drvdata(pdev, hda); @@ -788,10 +791,7 @@ static int sti_hda_probe(struct platform_device *pdev) static void sti_hda_remove(struct platform_device *pdev) { - struct sti_hda *hda = platform_get_drvdata(pdev); - component_del(&pdev->dev, &sti_hda_ops); - drm_bridge_remove(&hda->bridge); } static const struct of_device_id hda_of_match[] = { diff --git a/drivers/gpu/drm/tiny/bochs.c b/drivers/gpu/drm/tiny/bochs.c index 222e4ae1abbd14..5d8dc5efec776e 100644 --- a/drivers/gpu/drm/tiny/bochs.c +++ b/drivers/gpu/drm/tiny/bochs.c @@ -761,25 +761,21 @@ static int bochs_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent ret = pcim_enable_device(pdev); if (ret) - goto err_free_dev; + return ret; pci_set_drvdata(pdev, dev); ret = bochs_load(bochs); if (ret) - goto err_free_dev; + return ret; ret = drm_dev_register(dev, 0); if (ret) - goto err_free_dev; + return ret; drm_client_setup(dev, NULL); return ret; - -err_free_dev: - drm_dev_put(dev); - return ret; } static void bochs_pci_remove(struct pci_dev *pdev) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index d85f0a37ac35f3..bcd76f6bb7f02d 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -739,7 +739,7 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo, may_evict = (force_space && place->mem_type != TTM_PL_SYSTEM); ret = ttm_resource_alloc(bo, place, res, force_space ? &limit_pool : NULL); if (ret) { - if (ret != -ENOSPC && ret != -EAGAIN) { + if (ret != -ENOSPC) { dmem_cgroup_pool_state_put(limit_pool); return ret; } @@ -1177,17 +1177,13 @@ ttm_bo_swapout_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo) bdev->funcs->swap_notify(bo); if (ttm_tt_is_populated(tt)) { - spin_lock(&bdev->lru_lock); - ttm_resource_del_bulk_move(bo->resource, bo); - spin_unlock(&bdev->lru_lock); - ret = ttm_tt_swapout(bdev, tt, swapout_walk->gfp_flags); - - spin_lock(&bdev->lru_lock); - if (ret) - ttm_resource_add_bulk_move(bo->resource, bo); - ttm_resource_move_to_lru_tail(bo->resource); - spin_unlock(&bdev->lru_lock); + if (!ret) { + spin_lock(&bdev->lru_lock); + ttm_resource_del_bulk_move_unevictable(bo->resource, bo); + ttm_resource_move_to_lru_tail(bo->resource); + spin_unlock(&bdev->lru_lock); + } } out: diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index f83b7d5ec6c6d0..3e3c201a022267 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -1112,19 +1112,14 @@ long ttm_bo_shrink(struct ttm_operation_ctx *ctx, struct ttm_buffer_object *bo, if (lret < 0) return lret; - if (bo->bulk_move) { - spin_lock(&bdev->lru_lock); - ttm_resource_del_bulk_move(bo->resource, bo); - spin_unlock(&bdev->lru_lock); - } - lret = ttm_tt_backup(bdev, bo->ttm, (struct ttm_backup_flags) {.purge = flags.purge, .writeback = flags.writeback}); - if (lret <= 0 && bo->bulk_move) { + if (lret > 0) { spin_lock(&bdev->lru_lock); - ttm_resource_add_bulk_move(bo->resource, bo); + ttm_resource_del_bulk_move_unevictable(bo->resource, bo); + ttm_resource_move_to_lru_tail(bo->resource); spin_unlock(&bdev->lru_lock); } diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c index 26a3689e5fd90a..278bbe7a11add3 100644 --- a/drivers/gpu/drm/ttm/ttm_pool.c +++ b/drivers/gpu/drm/ttm/ttm_pool.c @@ -206,6 +206,14 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, return NULL; } +static void __free_pages_gpu_account(struct page *p, unsigned int order, + bool reclaim) +{ + mod_lruvec_page_state(p, reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE, + -(1 << order)); + __free_pages(p, order); +} + /* Reset the caching and pages of size 1 << order */ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching, unsigned int order, struct page *p, bool reclaim) @@ -223,9 +231,7 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching, #endif if (!pool || !ttm_pool_uses_dma_alloc(pool)) { - mod_lruvec_page_state(p, reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE, - -(1 << order)); - __free_pages(p, order); + __free_pages_gpu_account(p, order, reclaim); return; } @@ -606,7 +612,7 @@ static int ttm_pool_restore_commit(struct ttm_pool_tt_restore *restore, */ ttm_pool_split_for_swap(restore->pool, p); copy_highpage(restore->alloced_page + i, p); - __free_pages(p, 0); + __free_pages_gpu_account(p, 0, false); } restore->restored_pages++; @@ -1068,7 +1074,7 @@ long ttm_pool_backup(struct ttm_pool *pool, struct ttm_tt *tt, if (flags->purge) { shrunken += num_pages; page->private = 0; - __free_pages(page, order); + __free_pages_gpu_account(page, order, false); memset(tt->pages + i, 0, num_pages * sizeof(*tt->pages)); } @@ -1109,7 +1115,7 @@ long ttm_pool_backup(struct ttm_pool *pool, struct ttm_tt *tt, } handle = shandle; tt->pages[i] = ttm_backup_handle_to_page_ptr(handle); - put_page(page); + __free_pages_gpu_account(page, 0, false); shrunken++; } diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c index 9f36631d48b617..154d6739256f86 100644 --- a/drivers/gpu/drm/ttm/ttm_resource.c +++ b/drivers/gpu/drm/ttm/ttm_resource.c @@ -292,6 +292,19 @@ void ttm_resource_del_bulk_move(struct ttm_resource *res, ttm_lru_bulk_move_del(bo->bulk_move, res); } +/* + * Remove a resource from its bulk_move, bypassing the unevictable check. + * Use only when the resource is known to still be tracked in the range despite + * the BO having just become unevictable; asserts that this is the case. + */ +void ttm_resource_del_bulk_move_unevictable(struct ttm_resource *res, + struct ttm_buffer_object *bo) +{ + WARN_ON_ONCE(!ttm_resource_unevictable(res, bo)); + if (bo->bulk_move) + ttm_lru_bulk_move_del(bo->bulk_move, res); +} + /* Move a resource to the LRU or bulk tail */ void ttm_resource_move_to_lru_tail(struct ttm_resource *res) { @@ -385,8 +398,11 @@ int ttm_resource_alloc(struct ttm_buffer_object *bo, if (man->cg) { ret = dmem_cgroup_try_charge(man->cg, bo->base.size, &pool, ret_limit_pool); - if (ret) + if (ret) { + if (ret == -EAGAIN) + ret = -ENOSPC; return ret; + } } ret = man->func->alloc(man, bo, place, res_ptr); diff --git a/drivers/gpu/drm/xe/display/xe_hdcp_gsc.c b/drivers/gpu/drm/xe/display/xe_hdcp_gsc.c index 29c72aa4b0d2d7..33494b86205d2e 100644 --- a/drivers/gpu/drm/xe/display/xe_hdcp_gsc.c +++ b/drivers/gpu/drm/xe/display/xe_hdcp_gsc.c @@ -37,9 +37,17 @@ static bool intel_hdcp_gsc_check_status(struct drm_device *drm) struct xe_device *xe = to_xe_device(drm); struct xe_tile *tile = xe_device_get_root_tile(xe); struct xe_gt *gt = tile->media_gt; - struct xe_gsc *gsc = >->uc.gsc; + struct xe_gsc *gsc; + + if (!gt) { + drm_dbg_kms(&xe->drm, + "not checking GSC status for HDCP2.x: media GT not present or disabled\n"); + return false; + } + + gsc = >->uc.gsc; - if (!gsc || !xe_uc_fw_is_available(&gsc->fw)) { + if (!xe_uc_fw_is_available(&gsc->fw)) { drm_dbg_kms(&xe->drm, "GSC Components not ready for HDCP2.x\n"); return false; diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index 4075edf974216c..6b518858538f57 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -897,10 +897,10 @@ void xe_bo_set_purgeable_state(struct xe_bo *bo, new_state == XE_MADV_PURGEABLE_PURGED); /* Once purged, always purged - cannot transition out */ - xe_assert(xe, !(bo->madv_purgeable == XE_MADV_PURGEABLE_PURGED && + xe_assert(xe, !(bo->purgeable.state == XE_MADV_PURGEABLE_PURGED && new_state != XE_MADV_PURGEABLE_PURGED)); - bo->madv_purgeable = new_state; + bo->purgeable.state = new_state; xe_bo_set_purgeable_shrinker(bo, new_state); } @@ -2368,7 +2368,7 @@ struct xe_bo *xe_bo_init_locked(struct xe_device *xe, struct xe_bo *bo, INIT_LIST_HEAD(&bo->vram_userfault_link); /* Initialize purge advisory state */ - bo->madv_purgeable = XE_MADV_PURGEABLE_WILLNEED; + bo->purgeable.state = XE_MADV_PURGEABLE_WILLNEED; drm_gem_private_object_init(&xe->drm, &bo->ttm.base, size); diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h index 68dea7d25a6b7c..6340317f7d2e6a 100644 --- a/drivers/gpu/drm/xe/xe_bo.h +++ b/drivers/gpu/drm/xe/xe_bo.h @@ -251,7 +251,7 @@ static inline bool xe_bo_is_protected(const struct xe_bo *bo) static inline bool xe_bo_is_purged(struct xe_bo *bo) { xe_bo_assert_held(bo); - return bo->madv_purgeable == XE_MADV_PURGEABLE_PURGED; + return bo->purgeable.state == XE_MADV_PURGEABLE_PURGED; } /** @@ -268,11 +268,95 @@ static inline bool xe_bo_is_purged(struct xe_bo *bo) static inline bool xe_bo_madv_is_dontneed(struct xe_bo *bo) { xe_bo_assert_held(bo); - return bo->madv_purgeable == XE_MADV_PURGEABLE_DONTNEED; + return bo->purgeable.state == XE_MADV_PURGEABLE_DONTNEED; } void xe_bo_set_purgeable_state(struct xe_bo *bo, enum xe_madv_purgeable_state new_state); +/** + * xe_bo_willneed_get_locked() - Acquire a WILLNEED holder on a BO + * @bo: Buffer object + * + * Increments willneed_count and, on a 0->1 transition, promotes the BO + * from DONTNEED to WILLNEED. PURGED is terminal and is never modified. + * + * Caller must hold the BO's dma-resv lock. + */ +static inline void xe_bo_willneed_get_locked(struct xe_bo *bo) +{ + xe_bo_assert_held(bo); + + /* Imported BOs are owned externally; do not track purgeability. */ + if (drm_gem_is_imported(&bo->ttm.base)) + return; + + if (bo->purgeable.willneed_count++ == 0 && xe_bo_madv_is_dontneed(bo)) + xe_bo_set_purgeable_state(bo, XE_MADV_PURGEABLE_WILLNEED); +} + +/** + * xe_bo_willneed_put_locked() - Release a WILLNEED holder on a BO + * @bo: Buffer object + * + * Decrements willneed_count and, on a 1->0 transition, marks the BO + * DONTNEED only if it still has VMAs (implying all active VMAs are + * DONTNEED). If the last VMA is being removed, preserve the current BO + * state to match the previous VMA-walk semantics. + * + * PURGED is terminal and the BO state is never modified. + * + * Caller must hold the BO's dma-resv lock. + */ +static inline void xe_bo_willneed_put_locked(struct xe_bo *bo) +{ + xe_bo_assert_held(bo); + + if (drm_gem_is_imported(&bo->ttm.base)) + return; + + xe_assert(xe_bo_device(bo), bo->purgeable.willneed_count > 0); + if (--bo->purgeable.willneed_count == 0 && bo->purgeable.vma_count > 0 && + !xe_bo_is_purged(bo)) + xe_bo_set_purgeable_state(bo, XE_MADV_PURGEABLE_DONTNEED); +} + +/** + * xe_bo_vma_count_inc_locked() - Account a new VMA on a BO + * @bo: Buffer object + * + * Increments vma_count. + * + * Caller must hold the BO's dma-resv lock. + */ +static inline void xe_bo_vma_count_inc_locked(struct xe_bo *bo) +{ + xe_bo_assert_held(bo); + + if (drm_gem_is_imported(&bo->ttm.base)) + return; + + bo->purgeable.vma_count++; +} + +/** + * xe_bo_vma_count_dec_locked() - Account a VMA removal on a BO + * @bo: Buffer object + * + * Decrements vma_count. + * + * Caller must hold the BO's dma-resv lock. + */ +static inline void xe_bo_vma_count_dec_locked(struct xe_bo *bo) +{ + xe_bo_assert_held(bo); + + if (drm_gem_is_imported(&bo->ttm.base)) + return; + + xe_assert(xe_bo_device(bo), bo->purgeable.vma_count > 0); + bo->purgeable.vma_count--; +} + static inline void xe_bo_unpin_map_no_vm(struct xe_bo *bo) { if (likely(bo)) { diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h index 9d19940b8fc035..077e35b4cdce75 100644 --- a/drivers/gpu/drm/xe/xe_bo_types.h +++ b/drivers/gpu/drm/xe/xe_bo_types.h @@ -111,10 +111,32 @@ struct xe_bo { u64 min_align; /** - * @madv_purgeable: user space advise on BO purgeability, protected - * by BO's dma-resv lock. + * @purgeable: Purgeability state and accounting. + * + * All fields are protected by the BO's dma-resv lock. */ - u32 madv_purgeable; + struct { + /** + * @purgeable.state: BO purgeability state + * (WILLNEED/DONTNEED/PURGED). + */ + u32 state; + + /** + * @purgeable.vma_count: Number of VMAs currently mapping this BO. + */ + u32 vma_count; + + /** + * @purgeable.willneed_count: Number of active WILLNEED holders. + * + * Counts WILLNEED VMAs plus active dma-buf exports for + * non-imported BOs. The BO flips to DONTNEED on a 1->0 + * transition only when VMAs still exist; if the last VMA is + * removed, the previous BO state is preserved. + */ + u32 willneed_count; + } purgeable; }; #endif diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c index b9828da1589723..8a920e58245cd7 100644 --- a/drivers/gpu/drm/xe/xe_dma_buf.c +++ b/drivers/gpu/drm/xe/xe_dma_buf.c @@ -193,6 +193,18 @@ static int xe_dma_buf_begin_cpu_access(struct dma_buf *dma_buf, return 0; } +static void xe_dma_buf_release(struct dma_buf *dmabuf) +{ + struct drm_gem_object *obj = dmabuf->priv; + struct xe_bo *bo = gem_to_xe_bo(obj); + + xe_bo_lock(bo, false); + xe_bo_willneed_put_locked(bo); + xe_bo_unlock(bo); + + drm_gem_dmabuf_release(dmabuf); +} + static const struct dma_buf_ops xe_dmabuf_ops = { .attach = xe_dma_buf_attach, .detach = xe_dma_buf_detach, @@ -200,7 +212,7 @@ static const struct dma_buf_ops xe_dmabuf_ops = { .unpin = xe_dma_buf_unpin, .map_dma_buf = xe_dma_buf_map, .unmap_dma_buf = xe_dma_buf_unmap, - .release = drm_gem_dmabuf_release, + .release = xe_dma_buf_release, .begin_cpu_access = xe_dma_buf_begin_cpu_access, .mmap = drm_gem_dmabuf_mmap, .vmap = drm_gem_dmabuf_vmap, @@ -241,33 +253,33 @@ struct dma_buf *xe_gem_prime_export(struct drm_gem_object *obj, int flags) ret = -EINVAL; goto out_unlock; } + + xe_bo_willneed_get_locked(bo); xe_bo_unlock(bo); ret = ttm_bo_setup_export(&bo->ttm, &ctx); if (ret) - return ERR_PTR(ret); + goto out_put; buf = drm_gem_prime_export(obj, flags); - if (!IS_ERR(buf)) - buf->ops = &xe_dmabuf_ops; + if (IS_ERR(buf)) { + ret = PTR_ERR(buf); + goto out_put; + } + buf->ops = &xe_dmabuf_ops; return buf; +out_put: + xe_bo_lock(bo, false); + xe_bo_willneed_put_locked(bo); out_unlock: xe_bo_unlock(bo); return ERR_PTR(ret); } -/* - * Takes ownership of @storage: on success it is transferred to the returned - * drm_gem_object; on failure it is freed before returning the error. - * This matches the contract of xe_bo_init_locked() which frees @storage on - * its error paths, so callers need not (and must not) free @storage after - * this call. - */ static struct drm_gem_object * -xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage, - struct dma_buf *dma_buf) +xe_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf) { struct dma_resv *resv = dma_buf->resv; struct xe_device *xe = to_xe_device(dev); @@ -278,10 +290,8 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage, int ret = 0; dummy_obj = drm_gpuvm_resv_object_alloc(&xe->drm); - if (!dummy_obj) { - xe_bo_free(storage); + if (!dummy_obj) return ERR_PTR(-ENOMEM); - } dummy_obj->resv = resv; xe_validation_guard(&ctx, &xe->val, &exec, (struct xe_val_flags) {}, ret) { @@ -290,8 +300,7 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage, if (ret) break; - /* xe_bo_init_locked() frees storage on error */ - bo = xe_bo_init_locked(xe, storage, NULL, resv, NULL, dma_buf->size, + bo = xe_bo_init_locked(xe, NULL, NULL, resv, NULL, dma_buf->size, 0, /* Will require 1way or 2way for vm_bind */ ttm_bo_type_sg, XE_BO_FLAG_SYSTEM, &exec); drm_exec_retry_on_contention(&exec); @@ -342,7 +351,6 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev, const struct dma_buf_attach_ops *attach_ops; struct dma_buf_attachment *attach; struct drm_gem_object *obj; - struct xe_bo *bo; if (dma_buf->ops == &xe_dmabuf_ops) { obj = dma_buf->priv; @@ -358,13 +366,15 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev, } /* - * Don't publish the bo until we have a valid attachment, and a - * valid attachment needs the bo address. So pre-create a bo before - * creating the attachment and publish. + * This needs to happen before the attach, since it will create a new + * attachment for this, and add it to the list of attachments, at which + * point it is globally visible, and at any point the export side can + * call into on invalidate_mappings callback, which require a working + * object. */ - bo = xe_bo_alloc(); - if (IS_ERR(bo)) - return ERR_CAST(bo); + obj = xe_dma_buf_create_obj(dev, dma_buf); + if (IS_ERR(obj)) + return obj; attach_ops = &xe_dma_buf_attach_ops; #if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST) @@ -372,29 +382,15 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev, attach_ops = test->attach_ops; #endif - attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, &bo->ttm.base); + attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, obj); if (IS_ERR(attach)) { - obj = ERR_CAST(attach); - goto out_err; + xe_bo_put(gem_to_xe_bo(obj)); + return ERR_CAST(attach); } - /* - * xe_dma_buf_init_obj() takes ownership of bo on both success - * and failure, so we must not touch bo after this call. - */ - obj = xe_dma_buf_init_obj(dev, bo, dma_buf); - if (IS_ERR(obj)) { - dma_buf_detach(dma_buf, attach); - return obj; - } get_dma_buf(dma_buf); obj->import_attach = attach; return obj; - -out_err: - xe_bo_free(bo); - - return obj; } #if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST) diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c index 87a164efcc33c8..01fe03b9efe85c 100644 --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c @@ -385,10 +385,10 @@ static int pf_migration_mmio_save(struct xe_gt *gt, unsigned int vfid, void *buf if (xe_gt_is_media_type(gt)) for (n = 0; n < MED_VF_SW_FLAG_COUNT; n++) - regs[n] = xe_mmio_read32(>->mmio, MED_VF_SW_FLAG(n)); + regs[n] = xe_mmio_read32(&mmio, MED_VF_SW_FLAG(n)); else for (n = 0; n < VF_SW_FLAG_COUNT; n++) - regs[n] = xe_mmio_read32(>->mmio, VF_SW_FLAG(n)); + regs[n] = xe_mmio_read32(&mmio, VF_SW_FLAG(n)); return 0; } @@ -407,10 +407,10 @@ static int pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid, if (xe_gt_is_media_type(gt)) for (n = 0; n < MED_VF_SW_FLAG_COUNT; n++) - xe_mmio_write32(>->mmio, MED_VF_SW_FLAG(n), regs[n]); + xe_mmio_write32(&mmio, MED_VF_SW_FLAG(n), regs[n]); else for (n = 0; n < VF_SW_FLAG_COUNT; n++) - xe_mmio_write32(>->mmio, VF_SW_FLAG(n), regs[n]); + xe_mmio_write32(&mmio, VF_SW_FLAG(n), regs[n]); return 0; } diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h index 8b55cf25a75fa1..fffb5d631b69b5 100644 --- a/drivers/gpu/drm/xe/xe_gt_types.h +++ b/drivers/gpu/drm/xe/xe_gt_types.h @@ -144,6 +144,13 @@ struct xe_gt { u8 id; /** @info.has_indirect_ring_state: GT has indirect ring state support */ u8 has_indirect_ring_state:1; + /** + * @info.has_xe2_blt_instructions: GT supports Xe2-style MEM_SET + * and MEM_COPY blitter functionality. Note that despite the + * name, some Xe1 platforms may also support this "Xe2-style" + * feature. + */ + u8 has_xe2_blt_instructions:1; /** * @info.num_geometry_xecore_fuse_regs: Number of 32b-bit fuse * registers the geometry XeCore mask spans. diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ads.c index 81b5f01b1f65cc..2b835d48b5652f 100644 --- a/drivers/gpu/drm/xe/xe_guc_ads.c +++ b/drivers/gpu/drm/xe/xe_guc_ads.c @@ -512,12 +512,9 @@ static void guc_golden_lrc_init(struct xe_guc_ads *ads) * that starts after the execlists LRC registers. This is * required to allow the GuC to restore just the engine state * when a watchdog reset occurs. - * We calculate the engine state size by removing the size of - * what comes before it in the context image (which is identical - * on all engines). */ ads_blob_write(ads, ads.eng_state_size[guc_class], - real_size - xe_lrc_skip_size(xe)); + xe_lrc_engine_state_size(gt, class)); ads_blob_write(ads, ads.golden_context_lrca[guc_class], addr_ggtt); diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c index c725cde4508d20..4af9f0d7c6f3ba 100644 --- a/drivers/gpu/drm/xe/xe_lrc.c +++ b/drivers/gpu/drm/xe/xe_lrc.c @@ -746,9 +746,16 @@ size_t xe_lrc_reg_size(struct xe_device *xe) return 80 * sizeof(u32); } -size_t xe_lrc_skip_size(struct xe_device *xe) +/** + * xe_lrc_engine_state_size() - Get size of the engine state within LRC + * @gt: the &xe_gt struct instance + * @class: Hardware engine class + * + * Returns: Size of the engine state + */ +size_t xe_lrc_engine_state_size(struct xe_gt *gt, enum xe_engine_class class) { - return LRC_PPHWSP_SIZE + xe_lrc_reg_size(xe); + return xe_gt_lrc_hang_replay_size(gt, class) - xe_lrc_reg_size(gt_to_xe(gt)); } static inline u32 __xe_lrc_seqno_offset(struct xe_lrc *lrc) diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h index e7c975f9e2d975..5440663183f6d7 100644 --- a/drivers/gpu/drm/xe/xe_lrc.h +++ b/drivers/gpu/drm/xe/xe_lrc.h @@ -130,7 +130,7 @@ u32 xe_lrc_parallel_ggtt_addr(struct xe_lrc *lrc); struct iosys_map xe_lrc_parallel_map(struct xe_lrc *lrc); size_t xe_lrc_reg_size(struct xe_device *xe); -size_t xe_lrc_skip_size(struct xe_device *xe); +size_t xe_lrc_engine_state_size(struct xe_gt *gt, enum xe_engine_class class); void xe_lrc_dump_default(struct drm_printer *p, struct xe_gt *gt, diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c index 5fdc89ed5256b3..a22413f892a096 100644 --- a/drivers/gpu/drm/xe/xe_migrate.c +++ b/drivers/gpu/drm/xe/xe_migrate.c @@ -1524,23 +1524,9 @@ static void emit_clear_main_copy(struct xe_gt *gt, struct xe_bb *bb, bb->len += len; } -static bool has_service_copy_support(struct xe_gt *gt) -{ - /* - * What we care about is whether the architecture was designed with - * service copy functionality (specifically the new MEM_SET / MEM_COPY - * instructions) so check the architectural engine list rather than the - * actual list since these instructions are usable on BCS0 even if - * all of the actual service copy engines (BCS1-BCS8) have been fused - * off. - */ - return gt->info.engine_mask & GENMASK(XE_HW_ENGINE_BCS8, - XE_HW_ENGINE_BCS1); -} - static u32 emit_clear_cmd_len(struct xe_gt *gt) { - if (has_service_copy_support(gt)) + if (gt->info.has_xe2_blt_instructions) return PVC_MEM_SET_CMD_LEN_DW; else return XY_FAST_COLOR_BLT_DW; @@ -1549,7 +1535,7 @@ static u32 emit_clear_cmd_len(struct xe_gt *gt) static void emit_clear(struct xe_gt *gt, struct xe_bb *bb, u64 src_ofs, u32 size, u32 pitch, bool is_vram) { - if (has_service_copy_support(gt)) + if (gt->info.has_xe2_blt_instructions) emit_clear_link_copy(gt, bb, src_ofs, size, pitch); else emit_clear_main_copy(gt, bb, src_ofs, size, pitch, diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c index 9f98d033416490..c2ecd27ec770a2 100644 --- a/drivers/gpu/drm/xe/xe_pci.c +++ b/drivers/gpu/drm/xe/xe_pci.c @@ -851,6 +851,15 @@ static struct xe_gt *alloc_primary_gt(struct xe_tile *tile, gt->info.num_geometry_xecore_fuse_regs = graphics_desc->num_geometry_xecore_fuse_regs; gt->info.num_compute_xecore_fuse_regs = graphics_desc->num_compute_xecore_fuse_regs; + /* + * Even if the service copy engines wind up being fused off, their + * presence in the IP descriptor indicates that the platform supports + * Xe2-style MEM_SET and MEM_COPY functionality. + */ + if (graphics_desc->hw_engine_mask & GENMASK(XE_HW_ENGINE_BCS8, + XE_HW_ENGINE_BCS1)) + gt->info.has_xe2_blt_instructions = true; + /* * Before media version 13, the media IP was part of the primary GT * so we need to add the media engines to the primary GT's engine list. diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c index 6c4b16409cc9af..150a241110fbde 100644 --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c @@ -149,10 +149,11 @@ pf_migration_consume(struct xe_device *xe, unsigned int vfid) for_each_gt(gt, xe, gt_id) { data = xe_gt_sriov_pf_migration_save_consume(gt, vfid); - if (data && PTR_ERR(data) != EAGAIN) + if (!data) + continue; + if (!IS_ERR(data) || PTR_ERR(data) != -EAGAIN) return data; - if (PTR_ERR(data) == -EAGAIN) - more_data = true; + more_data = true; } if (!more_data) diff --git a/drivers/gpu/drm/xe/xe_tile_types.h b/drivers/gpu/drm/xe/xe_tile_types.h index 33932fd547d71a..0048100ccb723d 100644 --- a/drivers/gpu/drm/xe/xe_tile_types.h +++ b/drivers/gpu/drm/xe/xe_tile_types.h @@ -106,8 +106,6 @@ struct xe_tile { struct xe_lmtt lmtt; } pf; struct { - /** @sriov.vf.ggtt_balloon: GGTT regions excluded from use. */ - struct xe_ggtt_node *ggtt_balloon[2]; /** @sriov.vf.self_config: VF configuration data */ struct xe_tile_sriov_vf_selfconfig self_config; } vf; diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index a717a2b8dea3c9..ab6cc1f0a78949 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -1120,6 +1120,25 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm, xe_bo_assert_held(bo); + /* + * Reject only WILLNEED mappings on DONTNEED/PURGED BOs. This + * gates new vm_bind ioctls (user supplies WILLNEED) while + * still allowing partial-unbind / remap splits whose new VMAs + * inherit the parent's DONTNEED attr. It must also run before + * xe_bo_willneed_get_locked() below so a 0->1 holder bump + * cannot silently promote DONTNEED back to WILLNEED. + */ + if (vma->attr.purgeable_state == XE_MADV_PURGEABLE_WILLNEED) { + if (xe_bo_madv_is_dontneed(bo)) { + xe_vma_free(vma); + return ERR_PTR(-EBUSY); + } + if (xe_bo_is_purged(bo)) { + xe_vma_free(vma); + return ERR_PTR(-EINVAL); + } + } + vm_bo = drm_gpuvm_bo_obtain_locked(vma->gpuva.vm, &bo->ttm.base); if (IS_ERR(vm_bo)) { xe_vma_free(vma); @@ -1131,6 +1150,10 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm, vma->gpuva.gem.offset = bo_offset_or_userptr; drm_gpuva_link(&vma->gpuva, vm_bo); drm_gpuvm_bo_put(vm_bo); + + xe_bo_vma_count_inc_locked(bo); + if (vma->attr.purgeable_state == XE_MADV_PURGEABLE_WILLNEED) + xe_bo_willneed_get_locked(bo); } else /* userptr or null */ { if (!is_null && !is_cpu_addr_mirror) { struct xe_userptr_vma *uvma = to_userptr_vma(vma); @@ -1208,7 +1231,10 @@ static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence) xe_bo_assert_held(bo); drm_gpuva_unlink(&vma->gpuva); - xe_bo_recompute_purgeable_state(bo); + + xe_bo_vma_count_dec_locked(bo); + if (vma->attr.purgeable_state == XE_MADV_PURGEABLE_WILLNEED) + xe_bo_willneed_put_locked(bo); } xe_vm_assert_held(vm); @@ -3016,7 +3042,7 @@ static void vm_bind_ioctl_ops_unwind(struct xe_vm *vm, * @res_evict: Allow evicting resources during validation * @validate: Perform BO validation * @request_decompress: Request BO decompression - * @check_purged: Reject operation if BO is purged + * @check_purged: Reject operation if BO is DONTNEED or PURGED */ struct xe_vma_lock_and_validate_flags { u32 res_evict : 1; @@ -3030,6 +3056,7 @@ static int vma_lock_and_validate(struct drm_exec *exec, struct xe_vma *vma, { struct xe_bo *bo = xe_vma_bo(vma); struct xe_vm *vm = xe_vma_vm(vma); + bool validate_bo = flags.validate; int err = 0; if (bo) { @@ -3044,7 +3071,11 @@ static int vma_lock_and_validate(struct drm_exec *exec, struct xe_vma *vma, err = -EINVAL; /* BO already purged */ } - if (!err && flags.validate) + /* Don't validate the BO for DONTNEED/PURGED remap remnants. */ + if (vma->attr.purgeable_state != XE_MADV_PURGEABLE_WILLNEED) + validate_bo = false; + + if (!err && validate_bo) err = xe_bo_validate(bo, vm, xe_vm_allow_vm_eviction(vm) && flags.res_evict, exec); @@ -3152,7 +3183,7 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm, op->map.immediate, .request_decompress = op->map.request_decompress, - .check_purged = true, + .check_purged = false, }); break; case DRM_GPUVA_OP_REMAP: @@ -3174,7 +3205,7 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm, .res_evict = res_evict, .validate = true, .request_decompress = false, - .check_purged = true, + .check_purged = false, }); if (!err && op->remap.next) err = vma_lock_and_validate(exec, op->remap.next, @@ -3182,7 +3213,7 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm, .res_evict = res_evict, .validate = true, .request_decompress = false, - .check_purged = true, + .check_purged = false, }); break; case DRM_GPUVA_OP_UNMAP: @@ -3211,9 +3242,11 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm, } /* - * Prefetch attempts to migrate BO's backing store without - * repopulating it first. Purged BOs have no backing store - * to migrate, so reject the operation. + * PREFETCH is the only op that still gates on BO purge state. + * MAP/REMAP handle this inside xe_vma_create() so partial + * unbind on a DONTNEED BO still works. PREFETCH skips + * xe_vma_create() and would migrate a BO with no backing + * store, so reject DONTNEED/PURGED here. */ err = vma_lock_and_validate(exec, gpuva_to_vma(op->base.prefetch.va), diff --git a/drivers/gpu/drm/xe/xe_vm_madvise.c b/drivers/gpu/drm/xe/xe_vm_madvise.c index c78906dea82bec..c4fb290041956d 100644 --- a/drivers/gpu/drm/xe/xe_vm_madvise.c +++ b/drivers/gpu/drm/xe/xe_vm_madvise.c @@ -185,147 +185,6 @@ static void madvise_pat_index(struct xe_device *xe, struct xe_vm *vm, } } -/** - * xe_bo_is_dmabuf_shared() - Check if BO is shared via dma-buf - * @bo: Buffer object - * - * Prevent marking imported or exported dma-bufs as purgeable. - * For imported BOs, Xe doesn't own the backing store and cannot - * safely reclaim pages (exporter or other devices may still be - * using them). For exported BOs, external devices may have active - * mappings we cannot track. - * - * Return: true if BO is imported or exported, false otherwise - */ -static bool xe_bo_is_dmabuf_shared(struct xe_bo *bo) -{ - struct drm_gem_object *obj = &bo->ttm.base; - - /* Imported: exporter owns backing store */ - if (drm_gem_is_imported(obj)) - return true; - - /* Exported: external devices may be accessing */ - if (obj->dma_buf) - return true; - - return false; -} - -/** - * enum xe_bo_vmas_purge_state - VMA purgeable state aggregation - * - * Distinguishes whether a BO's VMAs are all DONTNEED, have at least - * one WILLNEED, or have no VMAs at all. - * - * Enum values align with XE_MADV_PURGEABLE_* states for consistency. - */ -enum xe_bo_vmas_purge_state { - /** @XE_BO_VMAS_STATE_WILLNEED: At least one VMA is WILLNEED */ - XE_BO_VMAS_STATE_WILLNEED = 0, - /** @XE_BO_VMAS_STATE_DONTNEED: All VMAs are DONTNEED */ - XE_BO_VMAS_STATE_DONTNEED = 1, - /** @XE_BO_VMAS_STATE_NO_VMAS: BO has no VMAs */ - XE_BO_VMAS_STATE_NO_VMAS = 2, -}; - -/* - * xe_bo_recompute_purgeable_state() casts between xe_bo_vmas_purge_state and - * xe_madv_purgeable_state. Enforce that WILLNEED=0 and DONTNEED=1 match across - * both enums so the single-line cast is always valid. - */ -static_assert(XE_BO_VMAS_STATE_WILLNEED == (int)XE_MADV_PURGEABLE_WILLNEED, - "VMA purge state WILLNEED must equal madv purgeable WILLNEED"); -static_assert(XE_BO_VMAS_STATE_DONTNEED == (int)XE_MADV_PURGEABLE_DONTNEED, - "VMA purge state DONTNEED must equal madv purgeable DONTNEED"); - -/** - * xe_bo_all_vmas_dontneed() - Determine BO VMA purgeable state - * @bo: Buffer object - * - * Check all VMAs across all VMs to determine aggregate purgeable state. - * Shared BOs require unanimous DONTNEED state from all mappings. - * - * Caller must hold BO dma-resv lock. - * - * Return: XE_BO_VMAS_STATE_DONTNEED if all VMAs are DONTNEED, - * XE_BO_VMAS_STATE_WILLNEED if at least one VMA is not DONTNEED, - * XE_BO_VMAS_STATE_NO_VMAS if BO has no VMAs - */ -static enum xe_bo_vmas_purge_state xe_bo_all_vmas_dontneed(struct xe_bo *bo) -{ - struct drm_gpuvm_bo *vm_bo; - struct drm_gpuva *gpuva; - struct drm_gem_object *obj = &bo->ttm.base; - bool has_vmas = false; - - xe_bo_assert_held(bo); - - /* Shared dma-bufs cannot be purgeable */ - if (xe_bo_is_dmabuf_shared(bo)) - return XE_BO_VMAS_STATE_WILLNEED; - - drm_gem_for_each_gpuvm_bo(vm_bo, obj) { - drm_gpuvm_bo_for_each_va(gpuva, vm_bo) { - struct xe_vma *vma = gpuva_to_vma(gpuva); - - has_vmas = true; - - /* Any non-DONTNEED VMA prevents purging */ - if (vma->attr.purgeable_state != XE_MADV_PURGEABLE_DONTNEED) - return XE_BO_VMAS_STATE_WILLNEED; - } - } - - /* - * No VMAs => preserve existing BO purgeable state. - * Avoids incorrectly flipping DONTNEED -> WILLNEED when last VMA unmapped. - */ - if (!has_vmas) - return XE_BO_VMAS_STATE_NO_VMAS; - - return XE_BO_VMAS_STATE_DONTNEED; -} - -/** - * xe_bo_recompute_purgeable_state() - Recompute BO purgeable state from VMAs - * @bo: Buffer object - * - * Walk all VMAs to determine if BO should be purgeable or not. - * Shared BOs require unanimous DONTNEED state from all mappings. - * If the BO has no VMAs the existing state is preserved. - * - * Locking: Caller must hold BO dma-resv lock. When iterating GPUVM lists, - * VM lock must also be held (write) to prevent concurrent VMA modifications. - * This is satisfied at both call sites: - * - xe_vma_destroy(): holds vm->lock write - * - madvise_purgeable(): holds vm->lock write (from madvise ioctl path) - * - * Return: nothing - */ -void xe_bo_recompute_purgeable_state(struct xe_bo *bo) -{ - enum xe_bo_vmas_purge_state vma_state; - - if (!bo) - return; - - xe_bo_assert_held(bo); - - /* - * Once purged, always purged. Cannot transition back to WILLNEED. - * This matches i915 semantics where purged BOs are permanently invalid. - */ - if (bo->madv_purgeable == XE_MADV_PURGEABLE_PURGED) - return; - - vma_state = xe_bo_all_vmas_dontneed(bo); - - if (vma_state != (enum xe_bo_vmas_purge_state)bo->madv_purgeable && - vma_state != XE_BO_VMAS_STATE_NO_VMAS) - xe_bo_set_purgeable_state(bo, (enum xe_madv_purgeable_state)vma_state); -} - /** * madvise_purgeable - Handle purgeable buffer object advice * @xe: XE device @@ -359,12 +218,6 @@ static void madvise_purgeable(struct xe_device *xe, struct xe_vm *vm, /* BO must be locked before modifying madv state */ xe_bo_assert_held(bo); - /* Skip shared dma-bufs - no PTEs to zap */ - if (xe_bo_is_dmabuf_shared(bo)) { - vmas[i]->skip_invalidation = true; - continue; - } - /* * Once purged, always purged. Cannot transition back to WILLNEED. * This matches i915 semantics where purged BOs are permanently invalid. @@ -377,13 +230,14 @@ static void madvise_purgeable(struct xe_device *xe, struct xe_vm *vm, switch (op->purge_state_val.val) { case DRM_XE_VMA_PURGEABLE_STATE_WILLNEED: - vmas[i]->attr.purgeable_state = XE_MADV_PURGEABLE_WILLNEED; vmas[i]->skip_invalidation = true; - - xe_bo_recompute_purgeable_state(bo); + /* Only act on a real DONTNEED -> WILLNEED transition. */ + if (vmas[i]->attr.purgeable_state == XE_MADV_PURGEABLE_DONTNEED) { + vmas[i]->attr.purgeable_state = XE_MADV_PURGEABLE_WILLNEED; + xe_bo_willneed_get_locked(bo); + } break; case DRM_XE_VMA_PURGEABLE_STATE_DONTNEED: - vmas[i]->attr.purgeable_state = XE_MADV_PURGEABLE_DONTNEED; /* * Don't zap PTEs at DONTNEED time -- pages are still * alive. The zap happens in xe_bo_move_notify() right @@ -391,7 +245,11 @@ static void madvise_purgeable(struct xe_device *xe, struct xe_vm *vm, */ vmas[i]->skip_invalidation = true; - xe_bo_recompute_purgeable_state(bo); + /* Only act on a real WILLNEED -> DONTNEED transition. */ + if (vmas[i]->attr.purgeable_state == XE_MADV_PURGEABLE_WILLNEED) { + vmas[i]->attr.purgeable_state = XE_MADV_PURGEABLE_DONTNEED; + xe_bo_willneed_put_locked(bo); + } break; default: /* Should never hit - values validated in madvise_args_are_sane() */ diff --git a/drivers/gpu/drm/xe/xe_vm_madvise.h b/drivers/gpu/drm/xe/xe_vm_madvise.h index 39acd2689ca0d0..a3078f634c7e95 100644 --- a/drivers/gpu/drm/xe/xe_vm_madvise.h +++ b/drivers/gpu/drm/xe/xe_vm_madvise.h @@ -13,6 +13,4 @@ struct xe_bo; int xe_vm_madvise_ioctl(struct drm_device *dev, void *data, struct drm_file *file); -void xe_bo_recompute_purgeable_state(struct xe_bo *bo); - #endif diff --git a/drivers/hid/bpf/hid_bpf_dispatch.c b/drivers/hid/bpf/hid_bpf_dispatch.c index 50c7b45c59e3fb..d0130658091b02 100644 --- a/drivers/hid/bpf/hid_bpf_dispatch.c +++ b/drivers/hid/bpf/hid_bpf_dispatch.c @@ -24,7 +24,8 @@ EXPORT_SYMBOL(hid_ops); u8 * dispatch_hid_bpf_device_event(struct hid_device *hdev, enum hid_report_type type, u8 *data, - u32 *size, int interrupt, u64 source, bool from_bpf) + size_t *buf_size, u32 *size, int interrupt, u64 source, + bool from_bpf) { struct hid_bpf_ctx_kern ctx_kern = { .ctx = { @@ -74,6 +75,7 @@ dispatch_hid_bpf_device_event(struct hid_device *hdev, enum hid_report_type type *size = ret; } + *buf_size = ctx_kern.ctx.allocated_size; return ctx_kern.data; } EXPORT_SYMBOL_GPL(dispatch_hid_bpf_device_event); @@ -505,7 +507,7 @@ __hid_bpf_input_report(struct hid_bpf_ctx *ctx, enum hid_report_type type, u8 *b if (ret) return ret; - return hid_ops->hid_input_report(ctx->hid, type, buf, size, 0, (u64)(long)ctx, true, + return hid_ops->hid_input_report(ctx->hid, type, buf, size, size, 0, (u64)(long)ctx, true, lock_already_taken); } diff --git a/drivers/hid/hid-appletb-kbd.c b/drivers/hid/hid-appletb-kbd.c index 0fdc0968b9ef26..462010a758993e 100644 --- a/drivers/hid/hid-appletb-kbd.c +++ b/drivers/hid/hid-appletb-kbd.c @@ -17,7 +17,7 @@ #include #include #include -#include +#include #include #include "hid-ids.h" @@ -62,7 +62,8 @@ struct appletb_kbd { struct input_handle kbd_handle; struct input_handle tpd_handle; struct backlight_device *backlight_dev; - struct timer_list inactivity_timer; + struct delayed_work inactivity_work; + struct work_struct restore_brightness_work; bool has_dimmed; bool has_turned_off; u8 saved_mode; @@ -164,16 +165,18 @@ static int appletb_tb_key_to_slot(unsigned int code) } } -static void appletb_inactivity_timer(struct timer_list *t) +static void appletb_inactivity_work(struct work_struct *work) { - struct appletb_kbd *kbd = timer_container_of(kbd, t, inactivity_timer); + struct appletb_kbd *kbd = container_of(to_delayed_work(work), + struct appletb_kbd, + inactivity_work); if (kbd->backlight_dev && appletb_tb_autodim) { if (!kbd->has_dimmed) { backlight_device_set_brightness(kbd->backlight_dev, 1); kbd->has_dimmed = true; - mod_timer(&kbd->inactivity_timer, - jiffies + secs_to_jiffies(appletb_tb_idle_timeout)); + mod_delayed_work(system_wq, &kbd->inactivity_work, + secs_to_jiffies(appletb_tb_idle_timeout)); } else if (!kbd->has_turned_off) { backlight_device_set_brightness(kbd->backlight_dev, 0); kbd->has_turned_off = true; @@ -181,16 +184,25 @@ static void appletb_inactivity_timer(struct timer_list *t) } } +static void appletb_restore_brightness_work(struct work_struct *work) +{ + struct appletb_kbd *kbd = container_of(work, struct appletb_kbd, + restore_brightness_work); + + if (kbd->backlight_dev) + backlight_device_set_brightness(kbd->backlight_dev, 2); +} + static void reset_inactivity_timer(struct appletb_kbd *kbd) { if (kbd->backlight_dev && appletb_tb_autodim) { if (kbd->has_dimmed || kbd->has_turned_off) { - backlight_device_set_brightness(kbd->backlight_dev, 2); kbd->has_dimmed = false; kbd->has_turned_off = false; + schedule_work(&kbd->restore_brightness_work); } - mod_timer(&kbd->inactivity_timer, - jiffies + secs_to_jiffies(appletb_tb_dim_timeout)); + mod_delayed_work(system_wq, &kbd->inactivity_work, + secs_to_jiffies(appletb_tb_dim_timeout)); } } @@ -408,9 +420,11 @@ static int appletb_kbd_probe(struct hid_device *hdev, const struct hid_device_id dev_err_probe(dev, -ENODEV, "Failed to get backlight device\n"); } else { backlight_device_set_brightness(kbd->backlight_dev, 2); - timer_setup(&kbd->inactivity_timer, appletb_inactivity_timer, 0); - mod_timer(&kbd->inactivity_timer, - jiffies + secs_to_jiffies(appletb_tb_dim_timeout)); + INIT_DELAYED_WORK(&kbd->inactivity_work, appletb_inactivity_work); + INIT_WORK(&kbd->restore_brightness_work, + appletb_restore_brightness_work); + mod_delayed_work(system_wq, &kbd->inactivity_work, + secs_to_jiffies(appletb_tb_dim_timeout)); } kbd->inp_handler.event = appletb_kbd_inp_event; @@ -440,13 +454,14 @@ static int appletb_kbd_probe(struct hid_device *hdev, const struct hid_device_id unregister_handler: input_unregister_handler(&kbd->inp_handler); close_hw: - if (kbd->backlight_dev) { - put_device(&kbd->backlight_dev->dev); - timer_delete_sync(&kbd->inactivity_timer); - } hid_hw_close(hdev); stop_hw: hid_hw_stop(hdev); + if (kbd->backlight_dev) { + cancel_delayed_work_sync(&kbd->inactivity_work); + cancel_work_sync(&kbd->restore_brightness_work); + put_device(&kbd->backlight_dev->dev); + } return ret; } @@ -457,13 +472,14 @@ static void appletb_kbd_remove(struct hid_device *hdev) appletb_kbd_set_mode(kbd, APPLETB_KBD_MODE_OFF); input_unregister_handler(&kbd->inp_handler); + hid_hw_close(hdev); + hid_hw_stop(hdev); + if (kbd->backlight_dev) { + cancel_delayed_work_sync(&kbd->inactivity_work); + cancel_work_sync(&kbd->restore_brightness_work); put_device(&kbd->backlight_dev->dev); - timer_delete_sync(&kbd->inactivity_timer); } - - hid_hw_close(hdev); - hid_hw_stop(hdev); } static int appletb_kbd_suspend(struct hid_device *hdev, pm_message_t msg) diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c index 61afec5915ecfa..b3596851c7191a 100644 --- a/drivers/hid/hid-core.c +++ b/drivers/hid/hid-core.c @@ -2033,24 +2033,32 @@ int __hid_request(struct hid_device *hid, struct hid_report *report, } EXPORT_SYMBOL_GPL(__hid_request); -int hid_report_raw_event(struct hid_device *hid, enum hid_report_type type, u8 *data, u32 size, - int interrupt) +int hid_report_raw_event(struct hid_device *hid, enum hid_report_type type, u8 *data, + size_t bufsize, u32 size, int interrupt) { struct hid_report_enum *report_enum = hid->report_enum + type; struct hid_report *report; struct hid_driver *hdrv; int max_buffer_size = HID_MAX_BUFFER_SIZE; u32 rsize, csize = size; + size_t bsize = bufsize; u8 *cdata = data; int ret = 0; report = hid_get_report(report_enum, data); if (!report) - goto out; + return 0; + + if (unlikely(bsize < csize)) { + hid_warn_ratelimited(hid, "Event data for report %d is incorrect (%d vs %ld)\n", + report->id, csize, bsize); + return -EINVAL; + } if (report_enum->numbered) { cdata++; csize--; + bsize--; } rsize = hid_compute_report_size(report); @@ -2063,11 +2071,16 @@ int hid_report_raw_event(struct hid_device *hid, enum hid_report_type type, u8 * else if (rsize > max_buffer_size) rsize = max_buffer_size; + if (bsize < rsize) { + hid_warn_ratelimited(hid, "Event data for report %d was too short (%d vs %ld)\n", + report->id, rsize, bsize); + return -EINVAL; + } + if (csize < rsize) { - hid_warn_ratelimited(hid, "Event data for report %d was too short (%d vs %d)\n", - report->id, rsize, csize); - ret = -EINVAL; - goto out; + dbg_hid("report %d is too short, (%d < %d)\n", report->id, + csize, rsize); + memset(cdata + csize, 0, rsize - csize); } if ((hid->claimed & HID_CLAIMED_HIDDEV) && hid->hiddev_report_event) @@ -2075,7 +2088,7 @@ int hid_report_raw_event(struct hid_device *hid, enum hid_report_type type, u8 * if (hid->claimed & HID_CLAIMED_HIDRAW) { ret = hidraw_report_event(hid, data, size); if (ret) - goto out; + return ret; } if (hid->claimed != HID_CLAIMED_HIDRAW && report->maxfield) { @@ -2087,15 +2100,15 @@ int hid_report_raw_event(struct hid_device *hid, enum hid_report_type type, u8 * if (hid->claimed & HID_CLAIMED_INPUT) hidinput_report_event(hid, report); -out: + return ret; } EXPORT_SYMBOL_GPL(hid_report_raw_event); static int __hid_input_report(struct hid_device *hid, enum hid_report_type type, - u8 *data, u32 size, int interrupt, u64 source, bool from_bpf, - bool lock_already_taken) + u8 *data, size_t bufsize, u32 size, int interrupt, u64 source, + bool from_bpf, bool lock_already_taken) { struct hid_report_enum *report_enum; struct hid_driver *hdrv; @@ -2120,7 +2133,8 @@ static int __hid_input_report(struct hid_device *hid, enum hid_report_type type, report_enum = hid->report_enum + type; hdrv = hid->driver; - data = dispatch_hid_bpf_device_event(hid, type, data, &size, interrupt, source, from_bpf); + data = dispatch_hid_bpf_device_event(hid, type, data, &bufsize, &size, interrupt, + source, from_bpf); if (IS_ERR(data)) { ret = PTR_ERR(data); goto unlock; @@ -2149,7 +2163,7 @@ static int __hid_input_report(struct hid_device *hid, enum hid_report_type type, goto unlock; } - ret = hid_report_raw_event(hid, type, data, size, interrupt); + ret = hid_report_raw_event(hid, type, data, bufsize, size, interrupt); unlock: if (!lock_already_taken) @@ -2167,16 +2181,41 @@ static int __hid_input_report(struct hid_device *hid, enum hid_report_type type, * @interrupt: distinguish between interrupt and control transfers * * This is data entry for lower layers. + * Legacy, please use hid_safe_input_report() instead. */ int hid_input_report(struct hid_device *hid, enum hid_report_type type, u8 *data, u32 size, int interrupt) { - return __hid_input_report(hid, type, data, size, interrupt, 0, + return __hid_input_report(hid, type, data, size, size, interrupt, 0, false, /* from_bpf */ false /* lock_already_taken */); } EXPORT_SYMBOL_GPL(hid_input_report); +/** + * hid_safe_input_report - report data from lower layer (usb, bt...) + * + * @hid: hid device + * @type: HID report type (HID_*_REPORT) + * @data: report contents + * @bufsize: allocated size of the data buffer + * @size: useful size of data parameter + * @interrupt: distinguish between interrupt and control transfers + * + * This is data entry for lower layers. + * Please use this function instead of the non safe version because we provide + * here the size of the buffer, allowing hid-core to make smarter decisions + * regarding the incoming buffer. + */ +int hid_safe_input_report(struct hid_device *hid, enum hid_report_type type, u8 *data, + size_t bufsize, u32 size, int interrupt) +{ + return __hid_input_report(hid, type, data, bufsize, size, interrupt, 0, + false, /* from_bpf */ + false /* lock_already_taken */); +} +EXPORT_SYMBOL_GPL(hid_safe_input_report); + bool hid_match_one_id(const struct hid_device *hdev, const struct hid_device_id *id) { diff --git a/drivers/hid/hid-elan.c b/drivers/hid/hid-elan.c index 76d93fc48f6a28..0190ad567ce4d4 100644 --- a/drivers/hid/hid-elan.c +++ b/drivers/hid/hid-elan.c @@ -513,6 +513,7 @@ static const struct hid_device_id elan_devices[] = { { HID_USB_DEVICE(USB_VENDOR_ID_ELAN, USB_DEVICE_ID_HP_X2_10_COVER), .driver_data = ELAN_HAS_LED }, { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, USB_DEVICE_ID_TOSHIBA_CLICK_L9W) }, + { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, USB_DEVICE_ID_SB974D) }, { } }; MODULE_DEVICE_TABLE(hid, elan_devices); diff --git a/drivers/hid/hid-ft260.c b/drivers/hid/hid-ft260.c index 333341e80b0ec6..70e2eedb465af1 100644 --- a/drivers/hid/hid-ft260.c +++ b/drivers/hid/hid-ft260.c @@ -1068,10 +1068,22 @@ static int ft260_raw_event(struct hid_device *hdev, struct hid_report *report, struct ft260_device *dev = hid_get_drvdata(hdev); struct ft260_i2c_input_report *xfer = (void *)data; + if (size < offsetof(struct ft260_i2c_input_report, data)) { + hid_err(hdev, "short report %d\n", size); + return -1; + } + if (xfer->report >= FT260_I2C_REPORT_MIN && xfer->report <= FT260_I2C_REPORT_MAX) { - ft260_dbg("i2c resp: rep %#02x len %d\n", xfer->report, - xfer->length); + ft260_dbg("i2c resp: rep %#02x len %d size %d\n", + xfer->report, xfer->length, size); + + if (xfer->length > size - + offsetof(struct ft260_i2c_input_report, data)) { + hid_err(hdev, "report %#02x: length %d exceeds HID report size\n", + xfer->report, xfer->length); + return -1; + } if ((dev->read_buf == NULL) || (xfer->length > dev->read_len - dev->read_idx)) { diff --git a/drivers/hid/hid-gfrm.c b/drivers/hid/hid-gfrm.c index 699186ff2349e9..d2a56bf92b416e 100644 --- a/drivers/hid/hid-gfrm.c +++ b/drivers/hid/hid-gfrm.c @@ -66,7 +66,7 @@ static int gfrm_raw_event(struct hid_device *hdev, struct hid_report *report, switch (data[1]) { case GFRM100_SEARCH_KEY_DOWN: ret = hid_report_raw_event(hdev, HID_INPUT_REPORT, search_key_dn, - sizeof(search_key_dn), 1); + sizeof(search_key_dn), sizeof(search_key_dn), 1); break; case GFRM100_SEARCH_KEY_AUDIO_DATA: @@ -74,7 +74,7 @@ static int gfrm_raw_event(struct hid_device *hdev, struct hid_report *report, case GFRM100_SEARCH_KEY_UP: ret = hid_report_raw_event(hdev, HID_INPUT_REPORT, search_key_up, - sizeof(search_key_up), 1); + sizeof(search_key_up), sizeof(search_key_up), 1); break; default: diff --git a/drivers/hid/hid-google-hammer.c b/drivers/hid/hid-google-hammer.c index 1af477e58480b7..c99c3c0d442e1b 100644 --- a/drivers/hid/hid-google-hammer.c +++ b/drivers/hid/hid-google-hammer.c @@ -496,7 +496,7 @@ static int hammer_probe(struct hid_device *hdev, if (error) return error; - error = devm_add_action(&hdev->dev, hammer_stop, hdev); + error = devm_add_action_or_reset(&hdev->dev, hammer_stop, hdev); if (error) return error; diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 0cf63742315bf8..4657d96fb0836f 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -277,6 +277,9 @@ #define USB_VENDOR_ID_BIGBEN 0x146b #define USB_DEVICE_ID_BIGBEN_PS3OFMINIPAD 0x0902 +#define I2C_VENDOR_ID_BLTP 0x36b6 +#define I2C_PRODUCT_ID_BLTP7853 0xc001 + #define USB_VENDOR_ID_BTC 0x046e #define USB_DEVICE_ID_BTC_EMPREX_REMOTE 0x5578 #define USB_DEVICE_ID_BTC_EMPREX_REMOTE_2 0x5577 @@ -455,6 +458,7 @@ #define USB_DEVICE_ID_EDIFIER_QR30 0xa101 /* EDIFIER Hal0 2.0 SE */ #define USB_VENDOR_ID_ELAN 0x04f3 +#define USB_DEVICE_ID_SB974D 0x0400 #define USB_DEVICE_ID_TOSHIBA_CLICK_L9W 0x0401 #define USB_DEVICE_ID_HP_X2 0x074d #define USB_DEVICE_ID_HP_X2_10_COVER 0x0755 diff --git a/drivers/hid/hid-lenovo-go-s.c b/drivers/hid/hid-lenovo-go-s.c index 01c7bdd4fbe044..ff1782a751915f 100644 --- a/drivers/hid/hid-lenovo-go-s.c +++ b/drivers/hid/hid-lenovo-go-s.c @@ -1369,6 +1369,14 @@ static void cfg_setup(struct work_struct *work) "Failed to retrieve IMU Manufacturer: %i\n", ret); return; } + + ret = mcu_property_out(drvdata.hdev, GET_GAMEPAD_CFG, FEATURE_OS_MODE, + NULL, 0); + if (ret) { + dev_err(&drvdata.hdev->dev, + "Failed to retrieve OS Mode: %i\n", ret); + return; + } } static int hid_gos_cfg_probe(struct hid_device *hdev, @@ -1427,6 +1435,27 @@ static void hid_gos_cfg_remove(struct hid_device *hdev) hid_set_drvdata(hdev, NULL); } +static int hid_gos_cfg_reset_resume(struct hid_device *hdev) +{ + u8 os_mode = drvdata.os_mode; + int ret; + + ret = mcu_property_out(drvdata.hdev, SET_GAMEPAD_CFG, + FEATURE_OS_MODE, &os_mode, 1); + if (ret < 0) + return ret; + + ret = mcu_property_out(drvdata.hdev, GET_GAMEPAD_CFG, + FEATURE_OS_MODE, NULL, 0); + if (ret < 0) + return ret; + + if (drvdata.os_mode != os_mode) + return -ENODEV; + + return 0; +} + static int hid_gos_probe(struct hid_device *hdev, const struct hid_device_id *id) { @@ -1481,6 +1510,20 @@ static void hid_gos_remove(struct hid_device *hdev) } } +static int hid_gos_reset_resume(struct hid_device *hdev) +{ + int ep = get_endpoint_address(hdev); + + switch (ep) { + case GO_S_CFG_INTF_IN: + return hid_gos_cfg_reset_resume(hdev); + default: + break; + } + + return 0; +} + static const struct hid_device_id hid_gos_devices[] = { { HID_USB_DEVICE(USB_VENDOR_ID_QHE, USB_DEVICE_ID_LENOVO_LEGION_GO_S_XINPUT) }, @@ -1496,6 +1539,7 @@ static struct hid_driver hid_lenovo_go_s = { .probe = hid_gos_probe, .remove = hid_gos_remove, .raw_event = hid_gos_raw_event, + .reset_resume = hid_gos_reset_resume, }; module_hid_driver(hid_lenovo_go_s); diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c index b1330d23bd2d03..ccbf28869a968d 100644 --- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -3673,7 +3673,7 @@ static int hidpp10_consumer_keys_raw_event(struct hidpp_device *hidpp, memcpy(&consumer_report[1], &data[3], 4); /* We are called from atomic context */ hid_report_raw_event(hidpp->hid_dev, HID_INPUT_REPORT, - consumer_report, 5, 1); + consumer_report, sizeof(consumer_report), 5, 1); return 1; } @@ -4685,6 +4685,44 @@ static const struct hid_device_id hidpp_devices[] = { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb391) }, { /* MX Master 4 mouse over Bluetooth */ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb042) }, + { /* Logitech Signature K650 over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb36f) }, + { /* Logitech Signature K650 B2B over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb370) }, + { /* Logitech Pebble Keys 2 K380S over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb377) }, + { /* Logitech Casa Pop-Up Desk over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb371) }, + { /* Logitech Casa Pop-Up Desk B2B over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb374) }, + { /* Logitech Wave Keys over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb383) }, + { /* Logitech Wave Keys B2B over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb384) }, + { /* Logitech Signature Slim K950 over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb386) }, + { /* Logitech Signature Slim K950 B2B over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb388) }, + { /* Logitech MX Keys S over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb378) }, + { /* Logitech MX Keys S B2B over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb380) }, + { /* Logitech Keys-To-Go 2 over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb38c) }, + { /* Logitech Pop Icon Keys over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb38f) }, + { /* Logitech MX Keys Mini over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb369) }, + { /* Logitech MX Keys Mini B2B over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb36e) }, + { /* Logitech Signature Slim Solar+ K980 B2B over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb394) }, + { /* Logitech Bluetooth Keyboard K250/K251 over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb397) }, + { /* Logitech Signature Comfort K880 over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb39c) }, + { /* Logitech Signature Comfort K880 B2B over Bluetooth */ + HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb39d) }, {} }; diff --git a/drivers/hid/hid-magicmouse.c b/drivers/hid/hid-magicmouse.c index e70bd3dc07ab72..802a3479e24b92 100644 --- a/drivers/hid/hid-magicmouse.c +++ b/drivers/hid/hid-magicmouse.c @@ -390,6 +390,10 @@ static int magicmouse_raw_event(struct hid_device *hdev, struct input_dev *input = msc->input; int x = 0, y = 0, ii, clicks = 0, npoints; + /* Protect against zero sized recursive calls from DOUBLE_REPORT_ID */ + if (size < 1) + return 0; + switch (data[0]) { case TRACKPAD_REPORT_ID: case TRACKPAD2_BT_REPORT_ID: @@ -490,6 +494,18 @@ static int magicmouse_raw_event(struct hid_device *hdev, /* Sometimes the trackpad sends two touch reports in one * packet. */ + + /* Ensure that we have at least 2 elements (report type and size) */ + if (size < 2) + return 0; + + if (size < data[1] + 2) { + hid_warn(hdev, + "received report length (%d) was smaller than specified (%d)", + size, data[1] + 2); + return 0; + } + magicmouse_raw_event(hdev, report, data + 2, data[1]); magicmouse_raw_event(hdev, report, data + 2 + data[1], size - 2 - data[1]); diff --git a/drivers/hid/hid-mcp2221.c b/drivers/hid/hid-mcp2221.c index be80970ab48e27..e4ddd8e9293b68 100644 --- a/drivers/hid/hid-mcp2221.c +++ b/drivers/hid/hid-mcp2221.c @@ -128,6 +128,7 @@ struct mcp2221 { u8 *rxbuf; u8 txbuf[64]; int rxbuf_idx; + int rxbuf_size; int status; u8 cur_i2c_clk_div; struct gpio_chip *gc; @@ -330,12 +331,14 @@ static int mcp_i2c_smbus_read(struct mcp2221 *mcp, mcp->txbuf[3] = (u8)(msg->addr << 1); total_len = msg->len; mcp->rxbuf = msg->buf; + mcp->rxbuf_size = msg->len; } else { mcp->txbuf[1] = smbus_len; mcp->txbuf[2] = 0; mcp->txbuf[3] = (u8)(smbus_addr << 1); total_len = smbus_len; mcp->rxbuf = smbus_buf; + mcp->rxbuf_size = smbus_len; } ret = mcp_send_data_req_status(mcp, mcp->txbuf, 4); @@ -919,6 +922,10 @@ static int mcp2221_raw_event(struct hid_device *hdev, mcp->status = -EINVAL; break; } + if (mcp->rxbuf_idx + data[3] > mcp->rxbuf_size) { + mcp->status = -EINVAL; + break; + } buf = mcp->rxbuf; memcpy(&buf[mcp->rxbuf_idx], &data[4], data[3]); mcp->rxbuf_idx = mcp->rxbuf_idx + data[3]; diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c index e82a3c4e5b44ef..eeab0b6e32ccce 100644 --- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -533,7 +533,7 @@ static void mt_get_feature(struct hid_device *hdev, struct hid_report *report) } ret = hid_report_raw_event(hdev, HID_FEATURE_REPORT, buf, - size, 0); + size, size, 0); if (ret) dev_warn(&hdev->dev, "failed to report feature\n"); } diff --git a/drivers/hid/hid-playstation.c b/drivers/hid/hid-playstation.c index c43caac20b61b1..e485373316756a 100644 --- a/drivers/hid/hid-playstation.c +++ b/drivers/hid/hid-playstation.c @@ -2384,7 +2384,8 @@ static int dualshock4_parse_report(struct ps_device *ps_dev, struct hid_report * } ds4_report = &usb->common; - num_touch_reports = usb->num_touch_reports; + num_touch_reports = min_t(u8, usb->num_touch_reports, + ARRAY_SIZE(usb->touch_reports)); touch_reports = usb->touch_reports; } else if (hdev->bus == BUS_BLUETOOTH && report->id == DS4_INPUT_REPORT_BT && size == DS4_INPUT_REPORT_BT_SIZE) { @@ -2404,7 +2405,8 @@ static int dualshock4_parse_report(struct ps_device *ps_dev, struct hid_report * } ds4_report = &bt->common; - num_touch_reports = bt->num_touch_reports; + num_touch_reports = min_t(u8, bt->num_touch_reports, + ARRAY_SIZE(bt->touch_reports)); touch_reports = bt->touch_reports; } else if (hdev->bus == BUS_BLUETOOTH && report->id == DS4_INPUT_REPORT_BT_MINIMAL && diff --git a/drivers/hid/hid-primax.c b/drivers/hid/hid-primax.c index e44d79dff8de63..8db054280afbcd 100644 --- a/drivers/hid/hid-primax.c +++ b/drivers/hid/hid-primax.c @@ -44,7 +44,7 @@ static int px_raw_event(struct hid_device *hid, struct hid_report *report, data[0] |= (1 << (data[idx] - 0xE0)); data[idx] = 0; } - hid_report_raw_event(hid, HID_INPUT_REPORT, data, size, 0); + hid_report_raw_event(hid, HID_INPUT_REPORT, data, size, size, 0); return 1; default: /* unknown report */ diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index 9e88c9d6c6dc0b..512049963978a7 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -235,7 +235,7 @@ static const struct hid_device_id hid_quirks[] = { * used as a driver. See hid_scan_report(). */ static const struct hid_device_id hid_have_special_driver[] = { -#if IS_ENABLED(CONFIG_APPLEDISPLAY) +#if IS_ENABLED(CONFIG_USB_APPLEDISPLAY) { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x9218) }, { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x9219) }, { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x921c) }, diff --git a/drivers/hid/hid-sjoy.c b/drivers/hid/hid-sjoy.c index bab93d71b7608f..963c451132045b 100644 --- a/drivers/hid/hid-sjoy.c +++ b/drivers/hid/hid-sjoy.c @@ -91,17 +91,17 @@ static int sjoyff_init(struct hid_device *hid) set_bit(FF_RUMBLE, dev->ffbit); - error = input_ff_create_memless(dev, sjoyff, hid_sjoyff_play); - if (error) { - kfree(sjoyff); - return error; - } - sjoyff->report = report; sjoyff->report->field[0]->value[0] = 0x01; sjoyff->report->field[0]->value[1] = 0x00; sjoyff->report->field[0]->value[2] = 0x00; hid_hw_request(hid, sjoyff->report, HID_REQ_SET_REPORT); + + error = input_ff_create_memless(dev, sjoyff, hid_sjoyff_play); + if (error) { + kfree(sjoyff); + return error; + } } hid_info(hid, "Force feedback for SmartJoy PLUS PS2/USB adapter\n"); diff --git a/drivers/hid/hid-sony.c b/drivers/hid/hid-sony.c index b5e724676c1dea..315343415e8f1f 100644 --- a/drivers/hid/hid-sony.c +++ b/drivers/hid/hid-sony.c @@ -1169,10 +1169,9 @@ static int sony_raw_event(struct hid_device *hdev, struct hid_report *report, sixaxis_parse_report(sc, rd, size); } else if ((sc->quirks & MOTION_CONTROLLER_BT) && rd[0] == 0x01 && size == 49) { sixaxis_parse_report(sc, rd, size); - } else if ((sc->quirks & NAVIGATION_CONTROLLER) && rd[0] == 0x01 && - size == 49) { + } else if ((sc->quirks & NAVIGATION_CONTROLLER) && rd[0] == 0x01 && size == 49) { sixaxis_parse_report(sc, rd, size); - } else if ((sc->quirks & NSG_MRXU_REMOTE) && rd[0] == 0x02) { + } else if ((sc->quirks & NSG_MRXU_REMOTE) && rd[0] == 0x02 && size >= 12) { nsg_mrxu_parse_report(sc, rd, size); return 1; } else if ((sc->quirks & RB4_GUITAR_PS4_USB) && rd[0] == 0x01 && size == 64) { @@ -1189,7 +1188,7 @@ static int sony_raw_event(struct hid_device *hdev, struct hid_report *report, /* Rock Band 3 PS3 Pro instruments set rd[24] to 0xE0 when they're * sending full reports, and 0x02 when only sending navigation. */ - if ((sc->quirks & RB3_PRO_INSTRUMENT) && rd[24] == 0x02) { + if ((sc->quirks & RB3_PRO_INSTRUMENT) && size >= 25 && rd[24] == 0x02) { /* Only attempt to enable full report every 8 seconds */ if (time_after(jiffies, sc->rb3_pro_poke_jiffies)) { sc->rb3_pro_poke_jiffies = jiffies + secs_to_jiffies(8); @@ -1640,9 +1639,6 @@ static int sony_leds_init(struct sony_sc *sc) u8 max_brightness[MAX_LEDS] = { [0 ... (MAX_LEDS - 1)] = 1 }; u8 use_hw_blink[MAX_LEDS] = { 0 }; - if (WARN_ON(!(sc->quirks & SONY_LED_SUPPORT))) - return -EINVAL; - if (sc->quirks & BUZZ_CONTROLLER) { sc->led_count = 4; use_color_names = 0; @@ -2456,11 +2452,10 @@ static void sony_remove(struct hid_device *hdev) static int sony_suspend(struct hid_device *hdev, pm_message_t message) { #ifdef CONFIG_SONY_FF + struct sony_sc *sc = hid_get_drvdata(hdev); /* On suspend stop any running force-feedback events */ - if (SONY_FF_SUPPORT) { - struct sony_sc *sc = hid_get_drvdata(hdev); - + if (sc->quirks & SONY_FF_SUPPORT) { sc->left = sc->right = 0; sony_send_output_report(sc); } diff --git a/drivers/hid/hid-uclogic-core.c b/drivers/hid/hid-uclogic-core.c index bd7f93e96e4e48..b73f09d26688ab 100644 --- a/drivers/hid/hid-uclogic-core.c +++ b/drivers/hid/hid-uclogic-core.c @@ -184,7 +184,9 @@ static int uclogic_input_configured(struct hid_device *hdev, suffix = "System Control"; break; } - } else { + } + + if (suffix) { hi->input->name = devm_kasprintf(&hdev->dev, GFP_KERNEL, "%s %s", hdev->name, suffix); if (!hi->input->name) diff --git a/drivers/hid/hid-vivaldi-common.c b/drivers/hid/hid-vivaldi-common.c index bf734055d4b69d..b12bb5cc091aa3 100644 --- a/drivers/hid/hid-vivaldi-common.c +++ b/drivers/hid/hid-vivaldi-common.c @@ -85,7 +85,7 @@ void vivaldi_feature_mapping(struct hid_device *hdev, } ret = hid_report_raw_event(hdev, HID_FEATURE_REPORT, report_data, - report_len, 0); + report_len, report_len, 0); if (ret) { dev_warn(&hdev->dev, "failed to report feature %d\n", field->report->id); diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c index 5a183af3d5c6a6..3adb16366e9394 100644 --- a/drivers/hid/i2c-hid/i2c-hid-core.c +++ b/drivers/hid/i2c-hid/i2c-hid-core.c @@ -149,6 +149,8 @@ static const struct i2c_hid_quirks { I2C_HID_QUIRK_BOGUS_IRQ }, { I2C_VENDOR_ID_GOODIX, I2C_DEVICE_ID_GOODIX_0D42, I2C_HID_QUIRK_DELAY_WAKEUP_AFTER_RESUME }, + { I2C_VENDOR_ID_BLTP, I2C_PRODUCT_ID_BLTP7853, + I2C_HID_QUIRK_NO_IRQ_AFTER_RESET }, { 0, 0 } }; @@ -574,9 +576,10 @@ static void i2c_hid_get_input(struct i2c_hid *ihid) if (ihid->hid->group != HID_GROUP_RMI) pm_wakeup_event(&ihid->client->dev, 0); - hid_input_report(ihid->hid, HID_INPUT_REPORT, - ihid->inbuf + sizeof(__le16), - ret_size - sizeof(__le16), 1); + hid_safe_input_report(ihid->hid, HID_INPUT_REPORT, + ihid->inbuf + sizeof(__le16), + ihid->bufsize - sizeof(__le16), + ret_size - sizeof(__le16), 1); } return; diff --git a/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c b/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c index 16f780bc879b12..cb19057f1191ba 100644 --- a/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c +++ b/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c @@ -94,7 +94,7 @@ static int quickspi_get_device_descriptor(struct quickspi_device *qsdev) dev_err_once(qsdev->dev, "Read DEVICE_DESCRIPTOR failed, ret = %d\n", ret); dev_err_once(qsdev->dev, "DEVICE_DESCRIPTOR expected len = %u, actual read = %u\n", input_len, read_len); - return ret; + return ret ?: -EINVAL; } input_rep_type = ((struct input_report_body_header *)read_buf)->input_report_type; @@ -318,7 +318,7 @@ int reset_tic(struct quickspi_device *qsdev) dev_err_once(qsdev->dev, "Read RESET_RESPONSE body failed, ret = %d\n", ret); dev_err_once(qsdev->dev, "RESET_RESPONSE body expected len = %u, actual = %u\n", read_len, actual_read_len); - return ret; + return ret ?: -EINVAL; } input_rep_type = FIELD_GET(HIDSPI_IN_REP_BDY_HDR_REP_TYPE, reset_response); diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c index fbbfc0f60829be..5af93b9b1fb560 100644 --- a/drivers/hid/usbhid/hid-core.c +++ b/drivers/hid/usbhid/hid-core.c @@ -283,9 +283,9 @@ static void hid_irq_in(struct urb *urb) break; usbhid_mark_busy(usbhid); if (!test_bit(HID_RESUME_RUNNING, &usbhid->iofl)) { - hid_input_report(urb->context, HID_INPUT_REPORT, - urb->transfer_buffer, - urb->actual_length, 1); + hid_safe_input_report(urb->context, HID_INPUT_REPORT, + urb->transfer_buffer, urb->transfer_buffer_length, + urb->actual_length, 1); /* * autosuspend refused while keys are pressed * because most keyboards don't wake up when @@ -482,9 +482,10 @@ static void hid_ctrl(struct urb *urb) switch (status) { case 0: /* success */ if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN) - hid_input_report(urb->context, + hid_safe_input_report(urb->context, usbhid->ctrl[usbhid->ctrltail].report->type, - urb->transfer_buffer, urb->actual_length, 0); + urb->transfer_buffer, urb->transfer_buffer_length, + urb->actual_length, 0); break; case -ESHUTDOWN: /* unplug */ unplug = 1; diff --git a/drivers/hid/usbhid/hid-pidff.c b/drivers/hid/usbhid/hid-pidff.c index aee8a44433059e..c45f182d044800 100644 --- a/drivers/hid/usbhid/hid-pidff.c +++ b/drivers/hid/usbhid/hid-pidff.c @@ -11,6 +11,7 @@ #include "hid-pidff.h" #include #include +#include #include #include #include @@ -326,8 +327,10 @@ static s32 pidff_clamp(s32 i, struct hid_field *field) */ static int pidff_rescale(int i, int max, struct hid_field *field) { - return i * (field->logical_maximum - field->logical_minimum) / max + - field->logical_minimum; + /* 64 bits needed for big values during rescale */ + s64 result = field->logical_maximum - field->logical_minimum; + + return div_s64(result * i, max) + field->logical_minimum; } /* diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c index 0d1c6d90fe21c5..a32320b351e3ee 100644 --- a/drivers/hid/wacom_sys.c +++ b/drivers/hid/wacom_sys.c @@ -90,7 +90,7 @@ static void wacom_wac_queue_flush(struct hid_device *hdev, kfree(buf); continue; } - err = hid_report_raw_event(hdev, HID_INPUT_REPORT, buf, size, false); + err = hid_report_raw_event(hdev, HID_INPUT_REPORT, buf, size, size, false); if (err) { hid_warn(hdev, "%s: unable to flush event due to error %d\n", __func__, err); @@ -334,7 +334,7 @@ static void wacom_feature_mapping(struct hid_device *hdev, data, n, WAC_CMD_RETRIES); if (ret == n && features->type == HID_GENERIC) { ret = hid_report_raw_event(hdev, - HID_FEATURE_REPORT, data, n, 0); + HID_FEATURE_REPORT, data, n, n, 0); } else if (ret == 2 && features->type != HID_GENERIC) { features->touch_max = data[1]; } else { @@ -395,7 +395,7 @@ static void wacom_feature_mapping(struct hid_device *hdev, data, n, WAC_CMD_RETRIES); if (ret == n) { ret = hid_report_raw_event(hdev, HID_FEATURE_REPORT, - data, n, 0); + data, n, n, 0); } else { hid_warn(hdev, "%s: could not retrieve sensor offsets\n", __func__); diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile index 4788996aa1374d..982ee2c6f9deb6 100644 --- a/drivers/hwmon/Makefile +++ b/drivers/hwmon/Makefile @@ -201,7 +201,6 @@ obj-$(CONFIG_SENSORS_PWM_FAN) += pwm-fan.o obj-$(CONFIG_SENSORS_QNAP_MCU_HWMON) += qnap-mcu-hwmon.o obj-$(CONFIG_SENSORS_RASPBERRYPI_HWMON) += raspberrypi-hwmon.o obj-$(CONFIG_SENSORS_SBTSI) += sbtsi_temp.o -obj-$(CONFIG_SENSORS_SBRMI) += sbrmi.o obj-$(CONFIG_SENSORS_SCH56XX_COMMON)+= sch56xx-common.o obj-$(CONFIG_SENSORS_SCH5627) += sch5627.o obj-$(CONFIG_SENSORS_SCH5636) += sch5636.o diff --git a/drivers/hwmon/acpi_power_meter.c b/drivers/hwmon/acpi_power_meter.c index be7f702dcde9c0..0c9b9f4180fb77 100644 --- a/drivers/hwmon/acpi_power_meter.c +++ b/drivers/hwmon/acpi_power_meter.c @@ -884,10 +884,14 @@ static void acpi_power_meter_notify(acpi_handle handle, u32 event, void *data) static int acpi_power_meter_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); struct acpi_power_meter_resource *resource; + struct acpi_device *device; int res; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + resource = kzalloc_obj(*resource); if (!resource) return -ENOMEM; diff --git a/drivers/hwmon/ads7871.c b/drivers/hwmon/ads7871.c index 9bfdf9e6bcd77d..9ee3ce01f130bb 100644 --- a/drivers/hwmon/ads7871.c +++ b/drivers/hwmon/ads7871.c @@ -77,9 +77,13 @@ static int ads7871_read_reg8(struct spi_device *spi, int reg) static int ads7871_read_reg16(struct spi_device *spi, int reg) { int ret; + reg = reg | INST_READ_BM | INST_16BIT_BM; ret = spi_w8r16(spi, reg); - return ret; + if (ret < 0) + return ret; + + return le16_to_cpu((__force __le16)ret); } static int ads7871_write_reg8(struct spi_device *spi, int reg, u8 val) diff --git a/drivers/hwmon/asus_atk0110.c b/drivers/hwmon/asus_atk0110.c index 5688ff5f7c28d8..109318b0434d99 100644 --- a/drivers/hwmon/asus_atk0110.c +++ b/drivers/hwmon/asus_atk0110.c @@ -1273,15 +1273,20 @@ static int atk_probe(struct platform_device *pdev) struct acpi_buffer buf; union acpi_object *obj; struct atk_data *data; + acpi_handle handle; dev_dbg(&pdev->dev, "adding...\n"); + handle = ACPI_HANDLE(&pdev->dev); + if (!handle) + return -ENODEV; + data = devm_kzalloc(&pdev->dev, sizeof(*data), GFP_KERNEL); if (!data) return -ENOMEM; data->dev = &pdev->dev; - data->atk_handle = ACPI_HANDLE(&pdev->dev); + data->atk_handle = handle; INIT_LIST_HEAD(&data->sensor_list); data->disable_ec = false; diff --git a/drivers/hwmon/corsair-psu.c b/drivers/hwmon/corsair-psu.c index dddbd2463f8da7..76f3e1da68d09e 100644 --- a/drivers/hwmon/corsair-psu.c +++ b/drivers/hwmon/corsair-psu.c @@ -796,13 +796,13 @@ static int corsairpsu_probe(struct hid_device *hdev, const struct hid_device_id ret = corsairpsu_init(priv); if (ret < 0) { dev_err(&hdev->dev, "unable to initialize device (%d)\n", ret); - goto fail_and_stop; + goto fail_and_close; } ret = corsairpsu_fwinfo(priv); if (ret < 0) { dev_err(&hdev->dev, "unable to query firmware (%d)\n", ret); - goto fail_and_stop; + goto fail_and_close; } corsairpsu_get_criticals(priv); diff --git a/drivers/hwmon/lm63.c b/drivers/hwmon/lm63.c index 035176a98ce9c6..30500b4d222129 100644 --- a/drivers/hwmon/lm63.c +++ b/drivers/hwmon/lm63.c @@ -333,7 +333,13 @@ static ssize_t show_fan(struct device *dev, struct device_attribute *devattr, { struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct lm63_data *data = lm63_update_device(dev); - return sprintf(buf, "%d\n", FAN_FROM_REG(data->fan[attr->index])); + int fan; + + mutex_lock(&data->update_lock); + fan = FAN_FROM_REG(data->fan[attr->index]); + mutex_unlock(&data->update_lock); + + return sprintf(buf, "%d\n", fan); } static ssize_t set_fan(struct device *dev, struct device_attribute *dummy, @@ -366,12 +372,14 @@ static ssize_t show_pwm1(struct device *dev, struct device_attribute *devattr, int nr = attr->index; int pwm; + mutex_lock(&data->update_lock); if (data->pwm_highres) pwm = data->pwm1[nr]; else pwm = data->pwm1[nr] >= 2 * data->pwm1_freq ? 255 : (data->pwm1[nr] * 255 + data->pwm1_freq) / (2 * data->pwm1_freq); + mutex_unlock(&data->update_lock); return sprintf(buf, "%d\n", pwm); } @@ -529,6 +537,7 @@ static ssize_t show_temp11(struct device *dev, struct device_attribute *devattr, int nr = attr->index; int temp; + mutex_lock(&data->update_lock); if (!nr) { /* * Use unsigned temperature unless its value is zero. @@ -544,7 +553,10 @@ static ssize_t show_temp11(struct device *dev, struct device_attribute *devattr, else temp = TEMP11_FROM_REG(data->temp11[nr]); } - return sprintf(buf, "%d\n", temp + data->temp2_offset); + temp += data->temp2_offset; + mutex_unlock(&data->update_lock); + + return sprintf(buf, "%d\n", temp); } static ssize_t set_temp11(struct device *dev, struct device_attribute *devattr, @@ -592,9 +604,14 @@ static ssize_t temp2_crit_hyst_show(struct device *dev, struct device_attribute *dummy, char *buf) { struct lm63_data *data = lm63_update_device(dev); - return sprintf(buf, "%d\n", temp8_from_reg(data, 2) - + data->temp2_offset - - TEMP8_FROM_REG(data->temp2_crit_hyst)); + int temp; + + mutex_lock(&data->update_lock); + temp = temp8_from_reg(data, 2) + data->temp2_offset + - TEMP8_FROM_REG(data->temp2_crit_hyst); + mutex_unlock(&data->update_lock); + + return sprintf(buf, "%d\n", temp); } static ssize_t show_lut_temp_hyst(struct device *dev, @@ -602,10 +619,14 @@ static ssize_t show_lut_temp_hyst(struct device *dev, { struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr); struct lm63_data *data = lm63_update_device(dev); + int temp; - return sprintf(buf, "%d\n", lut_temp_from_reg(data, attr->index) - + data->temp2_offset - - TEMP8_FROM_REG(data->lut_temp_hyst)); + mutex_lock(&data->update_lock); + temp = lut_temp_from_reg(data, attr->index) + data->temp2_offset + - TEMP8_FROM_REG(data->lut_temp_hyst); + mutex_unlock(&data->update_lock); + + return sprintf(buf, "%d\n", temp); } /* @@ -616,7 +637,7 @@ static ssize_t temp2_crit_hyst_store(struct device *dev, struct device_attribute *dummy, const char *buf, size_t count) { - struct lm63_data *data = dev_get_drvdata(dev); + struct lm63_data *data = lm63_update_device(dev); struct i2c_client *client = data->client; long val; int err; diff --git a/drivers/hwmon/lm75.c b/drivers/hwmon/lm75.c index f1a1e5b888f64e..c283443e363b42 100644 --- a/drivers/hwmon/lm75.c +++ b/drivers/hwmon/lm75.c @@ -137,7 +137,7 @@ static const struct lm75_params device_params[] = { }, [as6200] = { .config_reg_16bits = true, - .set_mask = 0x94C0, /* 8 sample/s, 4 CF, positive polarity */ + .set_mask = 0xC010, /* 8 sample/s, 4 CF */ .default_resolution = 12, .default_sample_time = 125, .num_sample_times = 4, @@ -286,8 +286,8 @@ static const struct lm75_params device_params[] = { }, [tmp112] = { .config_reg_16bits = true, - .set_mask = 0x60C0, /* 12-bit mode, 8 samples / second */ - .clr_mask = 1 << 15, /* no one-shot mode*/ + .set_mask = 0xC060, /* 12-bit mode, 8 samples / second */ + .clr_mask = 1 << 7, /* no one-shot mode*/ .default_resolution = 12, .default_sample_time = 125, .num_sample_times = 4, @@ -353,7 +353,7 @@ static inline int lm75_write_config(struct lm75_data *data, u16 set_mask, u16 clr_mask) { return regmap_update_bits(data->regmap, LM75_REG_CONF, - clr_mask | LM75_SHUTDOWN, set_mask); + clr_mask | set_mask | LM75_SHUTDOWN, set_mask); } static irqreturn_t lm75_alarm_handler(int irq, void *private) @@ -416,7 +416,7 @@ static int lm75_read(struct device *dev, enum hwmon_sensor_types type, switch (data->kind) { case as6200: case tmp112: - *val = (regval >> 13) & 0x1; + *val = !!(regval & BIT(13)) == !!(regval & BIT(2)); break; default: return -EINVAL; diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c index 3c10a5066b53de..1eeb608e59039d 100644 --- a/drivers/hwmon/lm90.c +++ b/drivers/hwmon/lm90.c @@ -736,6 +736,7 @@ struct lm90_data { struct hwmon_chip_info chip; struct delayed_work alert_work; struct work_struct report_work; + bool shutdown; /* true if shutting down */ bool valid; /* true if register values are valid */ bool alarms_valid; /* true if status register values are valid */ unsigned long last_updated; /* in jiffies */ @@ -1154,6 +1155,9 @@ static void lm90_report_alarms(struct work_struct *work) static int lm90_update_alarms_locked(struct lm90_data *data, bool force) { + if (data->shutdown) + return 0; + if (force || !data->alarms_valid || time_after(jiffies, data->alarms_updated + msecs_to_jiffies(data->update_interval))) { struct i2c_client *client = data->client; @@ -2584,15 +2588,23 @@ static void lm90_restore_conf(void *_data) struct lm90_data *data = _data; struct i2c_client *client = data->client; - cancel_delayed_work_sync(&data->alert_work); - cancel_work_sync(&data->report_work); - /* Restore initial configuration */ if (data->flags & LM90_HAVE_CONVRATE) lm90_write_convrate(data, data->convrate_orig); lm90_write_reg(client, LM90_REG_CONFIG1, data->config_orig); } +static void lm90_stop_work(void *_data) +{ + struct lm90_data *data = _data; + + hwmon_lock(data->hwmon_dev); + data->shutdown = true; + hwmon_unlock(data->hwmon_dev); + cancel_delayed_work_sync(&data->alert_work); + cancel_work_sync(&data->report_work); +} + static int lm90_init_client(struct i2c_client *client, struct lm90_data *data) { struct device_node *np = client->dev.of_node; @@ -2902,6 +2914,10 @@ static int lm90_probe(struct i2c_client *client) data->hwmon_dev = hwmon_dev; + err = devm_add_action_or_reset(&client->dev, lm90_stop_work, data); + if (err) + return err; + if (client->irq) { dev_dbg(dev, "IRQ: %d\n", client->irq); err = devm_request_threaded_irq(dev, client->irq, @@ -2930,7 +2946,8 @@ static void lm90_alert(struct i2c_client *client, enum i2c_alert_protocol type, */ struct lm90_data *data = i2c_get_clientdata(client); - if ((data->flags & LM90_HAVE_BROKEN_ALERT) && + hwmon_lock(data->hwmon_dev); + if (!data->shutdown && (data->flags & LM90_HAVE_BROKEN_ALERT) && (data->current_alarms & data->alert_alarms)) { if (!(data->config & 0x80)) { dev_dbg(&client->dev, "Disabling ALERT#\n"); @@ -2939,6 +2956,7 @@ static void lm90_alert(struct i2c_client *client, enum i2c_alert_protocol type, schedule_delayed_work(&data->alert_work, max_t(int, HZ, msecs_to_jiffies(data->update_interval))); } + hwmon_unlock(data->hwmon_dev); } else { dev_dbg(&client->dev, "Everything OK\n"); } diff --git a/drivers/hwmon/ltc2992.c b/drivers/hwmon/ltc2992.c index 1fcd320d616197..2617c4538af91d 100644 --- a/drivers/hwmon/ltc2992.c +++ b/drivers/hwmon/ltc2992.c @@ -431,10 +431,16 @@ static int ltc2992_get_voltage(struct ltc2992_state *st, u32 reg, u32 scale, lon static int ltc2992_set_voltage(struct ltc2992_state *st, u32 reg, u32 scale, long val) { - val = DIV_ROUND_CLOSEST(val * 1000, scale); - val = val << 4; + u32 reg_val; + long vmax; + + vmax = DIV_ROUND_CLOSEST_ULL(0xFFFULL * scale, 1000); + val = max(val, 0L); + val = min(val, vmax); + reg_val = min(DIV_ROUND_CLOSEST_ULL((u64)val * 1000, scale), + 0xFFFULL) << 4; - return ltc2992_write_reg(st, reg, 2, val); + return ltc2992_write_reg(st, reg, 2, reg_val); } static int ltc2992_read_gpio_alarm(struct ltc2992_state *st, int nr_gpio, u32 attr, long *val) @@ -559,9 +565,15 @@ static int ltc2992_get_current(struct ltc2992_state *st, u32 reg, u32 channel, l static int ltc2992_set_current(struct ltc2992_state *st, u32 reg, u32 channel, long val) { u32 reg_val; + long cmax; - reg_val = DIV_ROUND_CLOSEST(val * st->r_sense_uohm[channel], LTC2992_IADC_NANOV_LSB); - reg_val = reg_val << 4; + cmax = DIV_ROUND_CLOSEST_ULL(0xFFFULL * LTC2992_IADC_NANOV_LSB, + st->r_sense_uohm[channel]); + val = max(val, 0L); + val = min(val, cmax); + reg_val = min(DIV_ROUND_CLOSEST_ULL((u64)val * st->r_sense_uohm[channel], + LTC2992_IADC_NANOV_LSB), + 0xFFFULL) << 4; return ltc2992_write_reg(st, reg, 2, reg_val); } @@ -625,8 +637,10 @@ static int ltc2992_get_power(struct ltc2992_state *st, u32 reg, u32 channel, lon if (reg_val < 0) return reg_val; - *val = mul_u64_u32_div(reg_val, LTC2992_VADC_UV_LSB * LTC2992_IADC_NANOV_LSB, - st->r_sense_uohm[channel] * 1000); + *val = mul_u64_u32_div(reg_val, + LTC2992_VADC_UV_LSB / 1000 * + LTC2992_IADC_NANOV_LSB, + st->r_sense_uohm[channel]); return 0; } @@ -634,9 +648,18 @@ static int ltc2992_get_power(struct ltc2992_state *st, u32 reg, u32 channel, lon static int ltc2992_set_power(struct ltc2992_state *st, u32 reg, u32 channel, long val) { u32 reg_val; - - reg_val = mul_u64_u32_div(val, st->r_sense_uohm[channel] * 1000, - LTC2992_VADC_UV_LSB * LTC2992_IADC_NANOV_LSB); + u64 pmax, uval; + + uval = max(val, 0L); + pmax = mul_u64_u32_div(0xFFFFFFULL, + LTC2992_VADC_UV_LSB / 1000 * + LTC2992_IADC_NANOV_LSB, + st->r_sense_uohm[channel]); + uval = min(uval, pmax); + reg_val = min(mul_u64_u32_div(uval, st->r_sense_uohm[channel], + LTC2992_VADC_UV_LSB / 1000 * + LTC2992_IADC_NANOV_LSB), + 0xFFFFFFULL); return ltc2992_write_reg(st, reg, 3, reg_val); } diff --git a/drivers/i2c/busses/i2c-stm32f7.c b/drivers/i2c/busses/i2c-stm32f7.c index 70cb5822bf17bf..53d9df70ebe4de 100644 --- a/drivers/i2c/busses/i2c-stm32f7.c +++ b/drivers/i2c/busses/i2c-stm32f7.c @@ -895,8 +895,6 @@ static void stm32f7_i2c_xfer_msg(struct stm32f7_i2c_dev *i2c_dev, f7_msg->result = 0; f7_msg->stop = (i2c_dev->msg_id >= i2c_dev->msg_num - 1); - reinit_completion(&i2c_dev->complete); - cr1 = readl_relaxed(base + STM32F7_I2C_CR1); cr2 = readl_relaxed(base + STM32F7_I2C_CR2); @@ -1728,6 +1726,8 @@ static int stm32f7_i2c_xfer_core(struct i2c_adapter *i2c_adap, if (ret) goto pm_free; + reinit_completion(&i2c_dev->complete); + stm32f7_i2c_xfer_msg(i2c_dev, msgs); if (!i2c_dev->atomic) @@ -2253,7 +2253,7 @@ static int stm32f7_i2c_probe(struct platform_device *pdev) snprintf(adap->name, sizeof(adap->name), "STM32F7 I2C(%pa)", &res->start); adap->owner = THIS_MODULE; - adap->timeout = 2 * HZ; + adap->timeout = 8 * HZ; adap->retries = 3; adap->algo = &stm32f7_i2c_algo; adap->dev.parent = &pdev->dev; diff --git a/drivers/i2c/i2c-core-acpi.c b/drivers/i2c/i2c-core-acpi.c index 2cbd31f77667aa..28c0e4884a7f2e 100644 --- a/drivers/i2c/i2c-core-acpi.c +++ b/drivers/i2c/i2c-core-acpi.c @@ -371,6 +371,7 @@ static const struct acpi_device_id i2c_acpi_force_100khz_device_ids[] = { * a 400KHz frequency. The root cause of the issue is not known. */ { "DLL0945", 0 }, + { "ELAN0678", 0 }, { "ELAN06FA", 0 }, {} }; diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c index 9c46147e3506d1..a2132d70fb360d 100644 --- a/drivers/i2c/i2c-core-base.c +++ b/drivers/i2c/i2c-core-base.c @@ -445,8 +445,7 @@ static int i2c_init_recovery(struct i2c_adapter *adap) bri->set_scl = set_scl_gpio_value; if (bri->sda_gpiod) { bri->get_sda = get_sda_gpio_value; - /* FIXME: add proper flag instead of '0' once available */ - if (gpiod_get_direction(bri->sda_gpiod) == 0) + if (gpiod_get_direction(bri->sda_gpiod) == GPIO_LINE_DIRECTION_OUT) bri->set_sda = set_sda_gpio_value; } } else if (bri->recover_bus == i2c_generic_scl_recovery) { diff --git a/drivers/i2c/i2c-core-smbus.c b/drivers/i2c/i2c-core-smbus.c index 71eb1ef56f0c34..ad6acb5ebadc32 100644 --- a/drivers/i2c/i2c-core-smbus.c +++ b/drivers/i2c/i2c-core-smbus.c @@ -566,6 +566,18 @@ s32 __i2c_smbus_xfer(struct i2c_adapter *adapter, u16 addr, if (res) return res; + /* Reject invalid caller-supplied block lengths before any + * tracepoint or native smbus_xfer callback runs. + */ + if (data && + (protocol == I2C_SMBUS_I2C_BLOCK_DATA || + protocol == I2C_SMBUS_BLOCK_PROC_CALL || + (protocol == I2C_SMBUS_BLOCK_DATA && + read_write == I2C_SMBUS_WRITE)) && + (data->block[0] == 0 || + data->block[0] > I2C_SMBUS_BLOCK_MAX)) + return -EINVAL; + /* If enabled, the following two tracepoints are conditional on * read_write and protocol. */ diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c index 7bbe0263411eb7..ccaac5e29f906b 100644 --- a/drivers/i2c/i2c-dev.c +++ b/drivers/i2c/i2c-dev.c @@ -487,12 +487,13 @@ static long i2cdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg) client->adapter->retries = arg; break; case I2C_TIMEOUT: - if (arg > INT_MAX) + /* + * For historical reasons, user-space sets the timeout value in + * units of 10 ms. + */ + if (arg > INT_MAX / 10) return -EINVAL; - /* For historical reasons, user-space sets the timeout - * value in units of 10 ms. - */ client->adapter->timeout = msecs_to_jiffies(arg * 10); break; default: diff --git a/drivers/i2c/i2c-slave-testunit.c b/drivers/i2c/i2c-slave-testunit.c index 6de4307050dde6..871c58461ebcc1 100644 --- a/drivers/i2c/i2c-slave-testunit.c +++ b/drivers/i2c/i2c-slave-testunit.c @@ -15,7 +15,7 @@ #include #include #include -#include /* FIXME: is system_long_wq the best choice? */ +#include #define TU_VERSION_MAX_LENGTH 128 @@ -124,7 +124,7 @@ static int i2c_slave_testunit_slave_cb(struct i2c_client *client, case I2C_SLAVE_STOP: if (tu->reg_idx == TU_NUM_REGS) { set_bit(TU_FLAG_IN_PROCESS, &tu->flags); - queue_delayed_work(system_long_wq, &tu->worker, + queue_delayed_work(system_dfl_long_wq, &tu->worker, msecs_to_jiffies(10 * tu->regs[TU_REG_DELAY])); } diff --git a/drivers/i2c/i2c-stub.c b/drivers/i2c/i2c-stub.c index fbb0db41b10e16..04314e3ed24c95 100644 --- a/drivers/i2c/i2c-stub.c +++ b/drivers/i2c/i2c-stub.c @@ -214,6 +214,11 @@ static s32 stub_xfer(struct i2c_adapter *adap, u16 addr, unsigned short flags, * We ignore banks here, because banked chips don't use I2C * block transfers */ + if (data->block[0] == 0 || + data->block[0] > I2C_SMBUS_BLOCK_MAX) { + ret = -EINVAL; + break; + } if (data->block[0] > 256 - command) /* Avoid overrun */ data->block[0] = 256 - command; len = data->block[0]; diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index a40a765f030728..27992c38ad9020 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -149,7 +149,7 @@ static int ib_nl_ip_send_msg(struct rdma_dev_addr *dev_addr, attrtype = RDMA_NLA_F_MANDATORY | LS_NLA_TYPE_IPV6; } - len = nla_total_size(sizeof(size)); + len = nla_total_size(size); len += NLMSG_ALIGN(sizeof(*header)); skb = nlmsg_new(len, GFP_KERNEL); diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c index a768436ba46805..91a62d2ade4dd0 100644 --- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -778,6 +778,7 @@ static int ib_uverbs_rereg_mr(struct uverbs_attr_bundle *attrs) struct ib_pd *orig_pd; struct ib_pd *new_pd; struct ib_mr *new_mr; + u32 lkey, rkey; ret = uverbs_request(attrs, &cmd, sizeof(cmd)); if (ret) @@ -846,6 +847,8 @@ static int ib_uverbs_rereg_mr(struct uverbs_attr_bundle *attrs) new_mr->uobject = uobj; atomic_inc(&new_pd->usecnt); new_uobj->object = new_mr; + lkey = new_mr->lkey; + rkey = new_mr->rkey; rdma_restrack_new(&new_mr->res, RDMA_RESTRACK_MR); rdma_restrack_set_name(&new_mr->res, NULL); @@ -871,11 +874,13 @@ static int ib_uverbs_rereg_mr(struct uverbs_attr_bundle *attrs) mr->iova = cmd.hca_va; mr->length = cmd.length; } + lkey = mr->lkey; + rkey = mr->rkey; } memset(&resp, 0, sizeof(resp)); - resp.lkey = mr->lkey; - resp.rkey = mr->rkey; + resp.lkey = lkey; + resp.rkey = rkey; ret = uverbs_response(attrs, &resp, sizeof(resp)); diff --git a/drivers/infiniband/hw/hfi1/pio.c b/drivers/infiniband/hw/hfi1/pio.c index 51afaac88c7259..9121d83bf88afe 100644 --- a/drivers/infiniband/hw/hfi1/pio.c +++ b/drivers/infiniband/hw/hfi1/pio.c @@ -1942,13 +1942,16 @@ int pio_map_init(struct hfi1_devdata *dd, u8 port, u8 num_vls, u8 *vl_scontexts) void free_pio_map(struct hfi1_devdata *dd) { + struct pio_vl_map *map; + /* Free PIO map if allocated */ if (rcu_access_pointer(dd->pio_map)) { spin_lock_irq(&dd->pio_map_lock); - pio_map_free(rcu_access_pointer(dd->pio_map)); + map = rcu_access_pointer(dd->pio_map); RCU_INIT_POINTER(dd->pio_map, NULL); spin_unlock_irq(&dd->pio_map_lock); synchronize_rcu(); + pio_map_free(map); } kfree(dd->kernel_send_context); dd->kernel_send_context = NULL; diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c index e5f442938177eb..cfd9dd0f7e8176 100644 --- a/drivers/infiniband/hw/hfi1/sdma.c +++ b/drivers/infiniband/hw/hfi1/sdma.c @@ -1255,6 +1255,7 @@ void sdma_clean(struct hfi1_devdata *dd, size_t num_engines) { size_t i; struct sdma_engine *sde; + struct sdma_vl_map *map; if (dd->sdma_pad_dma) { dma_free_coherent(&dd->pcidev->dev, SDMA_PAD, @@ -1291,10 +1292,11 @@ void sdma_clean(struct hfi1_devdata *dd, size_t num_engines) } if (rcu_access_pointer(dd->sdma_map)) { spin_lock_irq(&dd->sde_map_lock); - sdma_map_free(rcu_access_pointer(dd->sdma_map)); + map = rcu_access_pointer(dd->sdma_map); RCU_INIT_POINTER(dd->sdma_map, NULL); spin_unlock_irq(&dd->sde_map_lock); synchronize_rcu(); + sdma_map_free(map); } kfree(dd->per_sdma); dd->per_sdma = NULL; diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c index a27ea85bb06323..bf04ee84a94392 100644 --- a/drivers/infiniband/hw/hns/hns_roce_qp.c +++ b/drivers/infiniband/hw/hns/hns_roce_qp.c @@ -47,8 +47,8 @@ static struct hns_roce_qp *hns_roce_qp_lookup(struct hns_roce_dev *hr_dev, xa_lock_irqsave(&hr_dev->qp_table_xa, flags); qp = __hns_roce_qp_lookup(hr_dev, qpn); - if (qp) - refcount_inc(&qp->refcount); + if (qp && !refcount_inc_not_zero(&qp->refcount)) + qp = NULL; xa_unlock_irqrestore(&hr_dev->qp_table_xa, flags); if (!qp) @@ -1171,6 +1171,7 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev, struct hns_roce_ib_create_qp_resp resp = {}; struct ib_device *ibdev = &hr_dev->ib_dev; struct hns_roce_ib_create_qp ucmd = {}; + unsigned long flags; int ret; mutex_init(&hr_qp->mutex); @@ -1251,13 +1252,19 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev, hr_qp->ibqp.qp_num = hr_qp->qpn; hr_qp->event = hns_roce_ib_qp_event; - refcount_set(&hr_qp->refcount, 1); init_completion(&hr_qp->free); + refcount_set_release(&hr_qp->refcount, 1); return 0; err_flow_ctrl: + spin_lock_irqsave(&hr_dev->qp_list_lock, flags); + hns_roce_lock_cqs(init_attr->send_cq ? to_hr_cq(init_attr->send_cq) : NULL, + init_attr->recv_cq ? to_hr_cq(init_attr->recv_cq) : NULL); hns_roce_qp_remove(hr_dev, hr_qp); + hns_roce_unlock_cqs(init_attr->send_cq ? to_hr_cq(init_attr->send_cq) : NULL, + init_attr->recv_cq ? to_hr_cq(init_attr->recv_cq) : NULL); + spin_unlock_irqrestore(&hr_dev->qp_list_lock, flags); err_store: free_qpc(hr_dev, hr_qp); err_qpc: diff --git a/drivers/infiniband/hw/hns/hns_roce_srq.c b/drivers/infiniband/hw/hns/hns_roce_srq.c index cb848e8e6bbd76..8b94cbdfa54dfa 100644 --- a/drivers/infiniband/hw/hns/hns_roce_srq.c +++ b/drivers/infiniband/hw/hns/hns_roce_srq.c @@ -16,8 +16,8 @@ void hns_roce_srq_event(struct hns_roce_dev *hr_dev, u32 srqn, int event_type) xa_lock(&srq_table->xa); srq = xa_load(&srq_table->xa, srqn & (hr_dev->caps.num_srqs - 1)); - if (srq) - refcount_inc(&srq->refcount); + if (srq && !refcount_inc_not_zero(&srq->refcount)) + srq = NULL; xa_unlock(&srq_table->xa); if (!srq) { @@ -470,6 +470,10 @@ int hns_roce_create_srq(struct ib_srq *ib_srq, if (ret) goto err_srqn; + srq->event = hns_roce_ib_srq_event; + init_completion(&srq->free); + refcount_set_release(&srq->refcount, 1); + if (udata) { resp.cap_flags = srq->cap_flags; resp.srqn = srq->srqn; @@ -480,10 +484,6 @@ int hns_roce_create_srq(struct ib_srq *ib_srq, } } - srq->event = hns_roce_ib_srq_event; - refcount_set(&srq->refcount, 1); - init_completion(&srq->free); - return 0; err_srqc: diff --git a/drivers/infiniband/hw/ionic/ionic_ibdev.c b/drivers/infiniband/hw/ionic/ionic_ibdev.c index 0382a64839d26a..73a616ae350236 100644 --- a/drivers/infiniband/hw/ionic/ionic_ibdev.c +++ b/drivers/infiniband/hw/ionic/ionic_ibdev.c @@ -185,7 +185,7 @@ static ssize_t hca_type_show(struct device *device, struct ionic_ibdev *dev = rdma_device_to_drv_device(device, struct ionic_ibdev, ibdev); - return sysfs_emit(buf, "%s.64\n", dev->ibdev.node_desc); + return sysfs_emit(buf, "%.64s\n", dev->ibdev.node_desc); } static DEVICE_ATTR_RO(hca_type); diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c index f4cbe21763bf11..2d682428ef202a 100644 --- a/drivers/infiniband/hw/mana/cq.c +++ b/drivers/infiniband/hw/mana/cq.c @@ -137,8 +137,9 @@ int mana_ib_install_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq) if (cq->queue.id >= gc->max_num_cqs) return -EINVAL; - /* Create CQ table entry */ - WARN_ON(gc->cq_table[cq->queue.id]); + /* Create CQ table entry, sharing a CQ between WQs is not supported */ + if (gc->cq_table[cq->queue.id]) + return -EINVAL; if (cq->queue.kmem) gdma_cq = cq->queue.kmem; else diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c index 645581359cee0b..0fbcf449c134b5 100644 --- a/drivers/infiniband/hw/mana/qp.c +++ b/drivers/infiniband/hw/mana/qp.c @@ -21,6 +21,9 @@ static int mana_ib_cfg_vport_steering(struct mana_ib_dev *dev, gc = mdev_to_gc(dev); + if (rx_hash_key_len > sizeof(req->hashkey)) + return -EINVAL; + req_buf_size = struct_size(req, indir_tab, MANA_INDIRECT_TABLE_DEF_SIZE); req = kzalloc(req_buf_size, GFP_KERNEL); if (!req) @@ -173,11 +176,8 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd, ret = mana_create_wq_obj(mpc, mpc->port_handle, GDMA_RQ, &wq_spec, &cq_spec, &wq->rx_object); - if (ret) { - /* Do cleanup starting with index i-1 */ - i--; + if (ret) goto fail; - } /* The GDMA regions are now owned by the WQ object */ wq->queue.gdma_region = GDMA_INVALID_DMA_REGION; @@ -197,8 +197,10 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd, /* Create CQ table entry */ ret = mana_ib_install_cq_cb(mdev, cq); - if (ret) + if (ret) { + mana_destroy_wq_obj(mpc, GDMA_RQ, wq->rx_object); goto fail; + } } resp.num_entries = i; @@ -215,13 +217,15 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd, ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata create rss-qp, %d\n", ret); - goto fail; + goto err_disable_vport_rx; } kfree(mana_ind_table); return 0; +err_disable_vport_rx: + mana_disable_vport_rx(mpc); fail: while (i-- > 0) { ibwq = ind_tbl->ind_tbl[i]; diff --git a/drivers/infiniband/hw/mlx4/srq.c b/drivers/infiniband/hw/mlx4/srq.c index 5b23e5f8b84aca..767840736d583b 100644 --- a/drivers/infiniband/hw/mlx4/srq.c +++ b/drivers/infiniband/hw/mlx4/srq.c @@ -194,13 +194,15 @@ int mlx4_ib_create_srq(struct ib_srq *ib_srq, if (udata) if (ib_copy_to_udata(udata, &srq->msrq.srqn, sizeof (__u32))) { err = -EFAULT; - goto err_wrid; + goto err_srq; } init_attr->attr.max_wr = srq->msrq.max - 1; return 0; +err_srq: + mlx4_srq_free(dev->dev, &srq->msrq); err_wrid: if (udata) mlx4_ib_db_unmap_user(ucontext, &srq->db); diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 109661c2ac12b0..61078281953d6c 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -3310,7 +3310,7 @@ int mlx5_ib_dev_res_cq_init(struct mlx5_ib_dev *dev) * devr->c0 is set once, never changed until device unload. * Avoid taking the mutex if initialization is already done. */ - if (devr->c0) + if (smp_load_acquire(&devr->c0)) return 0; mutex_lock(&devr->cq_lock); @@ -3336,7 +3336,7 @@ int mlx5_ib_dev_res_cq_init(struct mlx5_ib_dev *dev) } devr->p0 = pd; - devr->c0 = cq; + smp_store_release(&devr->c0, cq); unlock: mutex_unlock(&devr->cq_lock); @@ -3354,7 +3354,7 @@ int mlx5_ib_dev_res_srq_init(struct mlx5_ib_dev *dev) * devr->s1 is set once, never changed until device unload. * Avoid taking the mutex if initialization is already done. */ - if (devr->s1) + if (smp_load_acquire(&devr->s1)) return 0; mutex_lock(&devr->srq_lock); @@ -3392,10 +3392,11 @@ int mlx5_ib_dev_res_srq_init(struct mlx5_ib_dev *dev) "Couldn't create SRQ 1 for res init, err=%pe\n", s1); ib_destroy_srq(s0); + goto unlock; } devr->s0 = s0; - devr->s1 = s1; + smp_store_release(&devr->s1, s1); unlock: mutex_unlock(&devr->srq_lock); diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index 8f50e7342a7694..8fd05532c09cc7 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -1603,6 +1603,11 @@ static int create_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp, } if (qp->rq.wqe_cnt) { + if (!rq->base.ubuffer.umem) { + err = -EINVAL; + goto err_destroy_sq; + } + rq->base.container_mibqp = qp; if (qp->flags & IB_QP_CREATE_CVLAN_STRIPPING) @@ -4692,7 +4697,7 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, struct mlx5_ib_dev *dev = to_mdev(ibqp->device); struct mlx5_ib_modify_qp_resp resp = {}; struct mlx5_ib_qp *qp = to_mqp(ibqp); - struct mlx5_ib_modify_qp ucmd; + struct mlx5_ib_modify_qp ucmd = {}; enum ib_qp_type qp_type; enum ib_qp_state cur_state, new_state; int err = -EINVAL; diff --git a/drivers/infiniband/hw/mlx5/qpc.c b/drivers/infiniband/hw/mlx5/qpc.c index 146d03ae40bd9f..a7a4f9420271a2 100644 --- a/drivers/infiniband/hw/mlx5/qpc.c +++ b/drivers/infiniband/hw/mlx5/qpc.c @@ -314,7 +314,14 @@ int mlx5_core_destroy_dct(struct mlx5_ib_dev *dev, xa_cmpxchg_irq(&table->dct_xa, dct->mqp.qpn, XA_ZERO_ENTRY, dct, 0); return err; } - xa_erase_irq(&table->dct_xa, dct->mqp.qpn); + + /* + * A race can occur where a concurrent create gets the same dctn + * (after hardware released it) and overwrites XA_ZERO_ENTRY with + * its new DCT before we reach here. In that case, we must not erase + * the entry as it now belongs to the new DCT. + */ + xa_cmpxchg_irq(&table->dct_xa, dct->mqp.qpn, XA_ZERO_ENTRY, NULL, 0); return 0; } diff --git a/drivers/infiniband/hw/mlx5/srq_cmd.c b/drivers/infiniband/hw/mlx5/srq_cmd.c index 8b338539659933..c1a088120915c5 100644 --- a/drivers/infiniband/hw/mlx5/srq_cmd.c +++ b/drivers/infiniband/hw/mlx5/srq_cmd.c @@ -683,7 +683,14 @@ int mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq) xa_cmpxchg_irq(&table->array, srq->srqn, XA_ZERO_ENTRY, srq, 0); return err; } - xa_erase_irq(&table->array, srq->srqn); + + /* + * A race can occur where a concurrent create gets the same srqn + * (after hardware released it) and overwrites XA_ZERO_ENTRY with + * its new SRQ before we reach here. In that case, we must not erase + * the entry as it now belongs to the new SRQ. + */ + xa_cmpxchg_irq(&table->array, srq->srqn, XA_ZERO_ENTRY, NULL, 0); mlx5_core_res_put(&srq->common); wait_for_completion(&srq->common.free); diff --git a/drivers/infiniband/hw/mlx5/umr.c b/drivers/infiniband/hw/mlx5/umr.c index 29488fba21a034..f2139474be3751 100644 --- a/drivers/infiniband/hw/mlx5/umr.c +++ b/drivers/infiniband/hw/mlx5/umr.c @@ -147,7 +147,7 @@ int mlx5r_umr_resource_init(struct mlx5_ib_dev *dev) * UMR qp is set once, never changed until device unload. * Avoid taking the mutex if initialization is already done. */ - if (dev->umrc.qp) + if (smp_load_acquire(&dev->umrc.qp)) return 0; mutex_lock(&dev->umrc.init_lock); @@ -185,7 +185,7 @@ int mlx5r_umr_resource_init(struct mlx5_ib_dev *dev) sema_init(&dev->umrc.sem, MAX_UMR_WR); mutex_init(&dev->umrc.lock); dev->umrc.state = MLX5_UMR_STATE_ACTIVE; - dev->umrc.qp = qp; + smp_store_release(&dev->umrc.qp, qp); mutex_unlock(&dev->umrc.init_lock); return 0; diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c index c17e2a54dbcaf9..a88cc5d84af828 100644 --- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c +++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c @@ -215,7 +215,7 @@ static void ocrdma_del_mmap(struct ocrdma_ucontext *uctx, u64 phy_addr, mutex_lock(&uctx->mm_list_lock); list_for_each_entry_safe(mm, tmp, &uctx->mm_head, entry) { - if (len != mm->key.len && phy_addr != mm->key.phy_addr) + if (len != mm->key.len || phy_addr != mm->key.phy_addr) continue; list_del(&mm->entry); @@ -233,7 +233,7 @@ static bool ocrdma_search_mmap(struct ocrdma_ucontext *uctx, u64 phy_addr, mutex_lock(&uctx->mm_list_lock); list_for_each_entry(mm, &uctx->mm_head, entry) { - if (len != mm->key.len && phy_addr != mm->key.phy_addr) + if (len != mm->key.len || phy_addr != mm->key.phy_addr) continue; found = true; @@ -620,9 +620,9 @@ static int ocrdma_copy_pd_uresp(struct ocrdma_dev *dev, struct ocrdma_pd *pd, ucopy_err: if (pd->dpp_enabled) - ocrdma_del_mmap(pd->uctx, dpp_page_addr, PAGE_SIZE); + ocrdma_del_mmap(uctx, dpp_page_addr, PAGE_SIZE); dpp_map_err: - ocrdma_del_mmap(pd->uctx, db_page_addr, db_page_size); + ocrdma_del_mmap(uctx, db_page_addr, db_page_size); return status; } diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c index bcd43dc30e21c6..c7c2b41060e526 100644 --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c @@ -322,7 +322,7 @@ int pvrdma_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata) uresp.qp_tab_size = vdev->dsr->caps.max_qp; ret = ib_copy_to_udata(udata, &uresp, sizeof(uresp)); if (ret) { - pvrdma_uar_free(vdev, &context->uar); + /* pvrdma_dealloc_ucontext() also frees the UAR */ pvrdma_dealloc_ucontext(&context->ibucontext); return -EFAULT; } diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c index f79214738c2b86..2d5e701ff961af 100644 --- a/drivers/infiniband/sw/rxe/rxe_recv.c +++ b/drivers/infiniband/sw/rxe/rxe_recv.c @@ -330,6 +330,17 @@ void rxe_rcv(struct sk_buff *skb) pkt->qp = NULL; pkt->mask |= rxe_opcode[pkt->opcode].mask; + /* + * Unknown opcodes have a zero-initialized rxe_opcode[] entry, so + * both mask and length are 0. Reject them before any length math: + * rxe_icrc_hdr() would otherwise compute length - RXE_BTH_BYTES + * and pass the underflowed value to rxe_crc32(), producing an + * out-of-bounds read. + */ + if (unlikely(!rxe_opcode[pkt->opcode].mask || + !rxe_opcode[pkt->opcode].length)) + goto drop; + if (unlikely(pkt->paylen < header_size(pkt) + bth_pad(pkt) + RXE_ICRC_SIZE)) goto drop; diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c index 9faf8c09aa8e42..9cb2f6fbf2dd6a 100644 --- a/drivers/infiniband/sw/rxe/rxe_resp.c +++ b/drivers/infiniband/sw/rxe/rxe_resp.c @@ -540,7 +540,19 @@ static enum resp_states check_rkey(struct rxe_qp *qp, } skip_check_range: - if (pkt->mask & (RXE_WRITE_MASK | RXE_ATOMIC_WRITE_MASK)) { + if (pkt->mask & RXE_ATOMIC_WRITE_MASK) { + /* IBA oA19-28: ATOMIC_WRITE payload is exactly 8 bytes. + * Reject any other length before the responder reads + * sizeof(u64) bytes from payload_addr(pkt); a shorter + * payload would read past the logical end of the packet + * into skb->head tailroom. + */ + if (resid != sizeof(u64) || pktlen != sizeof(u64) || + bth_pad(pkt)) { + state = RESPST_ERR_LENGTH; + goto err; + } + } else if (pkt->mask & RXE_WRITE_MASK) { if (resid > mtu) { if (pktlen != mtu || bth_pad(pkt)) { state = RESPST_ERR_LENGTH; diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index 1342e764a5486d..834d8fabfba387 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -11,6 +11,9 @@ #include "amd_iommu_types.h" +extern int amd_iommu_evtlog_size; +extern int amd_iommu_pprlog_size; + irqreturn_t amd_iommu_int_thread(int irq, void *data); irqreturn_t amd_iommu_int_thread_evtlog(int irq, void *data); irqreturn_t amd_iommu_int_thread_pprlog(int irq, void *data); diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index c685d3771436a2..f9f71808789303 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include @@ -141,7 +142,6 @@ #define MMIO_STATUS_GALOG_INT_MASK BIT(10) /* event logging constants */ -#define EVENT_ENTRY_SIZE 0x10 #define EVENT_TYPE_SHIFT 28 #define EVENT_TYPE_MASK 0xf #define EVENT_TYPE_ILL_DEV 0x1 @@ -259,15 +259,20 @@ #define MMIO_CMD_BUFFER_TAIL(x) FIELD_GET(MMIO_CMD_TAIL_MASK, (x)) /* constants for event buffer handling */ -#define EVT_BUFFER_SIZE 8192 /* 512 entries */ -#define EVT_LEN_MASK (0x9ULL << 56) +#define EVTLOG_ENTRY_SIZE 0x10 +#define EVTLOG_SIZE_SHIFT 56 +#define EVTLOG_SIZE_DEF SZ_8K /* 512 entries */ +#define EVTLOG_LEN_MASK_DEF (0x9ULL << EVTLOG_SIZE_SHIFT) +#define EVTLOG_SIZE_MAX SZ_512K /* 32K entries */ +#define EVTLOG_LEN_MASK_MAX (0xFULL << EVTLOG_SIZE_SHIFT) /* Constants for PPR Log handling */ -#define PPR_LOG_ENTRIES 512 -#define PPR_LOG_SIZE_SHIFT 56 -#define PPR_LOG_SIZE_512 (0x9ULL << PPR_LOG_SIZE_SHIFT) -#define PPR_ENTRY_SIZE 16 -#define PPR_LOG_SIZE (PPR_ENTRY_SIZE * PPR_LOG_ENTRIES) +#define PPRLOG_ENTRY_SIZE 0x10 +#define PPRLOG_SIZE_SHIFT 56 +#define PPRLOG_SIZE_DEF SZ_8K /* 512 entries */ +#define PPRLOG_LEN_MASK_DEF (0x9ULL << PPRLOG_SIZE_SHIFT) +#define PPRLOG_SIZE_MAX SZ_512K /* 32K entries */ +#define PPRLOG_LEN_MASK_MAX (0xFULL << PPRLOG_SIZE_SHIFT) /* PAGE_SERVICE_REQUEST PPR Log Buffer Entry flags */ #define PPR_FLAG_EXEC 0x002 /* Execute permission requested */ diff --git a/drivers/iommu/amd/debugfs.c b/drivers/iommu/amd/debugfs.c index 4e66473d7ceafe..4c53b63613148e 100644 --- a/drivers/iommu/amd/debugfs.c +++ b/drivers/iommu/amd/debugfs.c @@ -31,11 +31,12 @@ static ssize_t iommu_mmio_write(struct file *filp, const char __user *ubuf, if (cnt > OFS_IN_SZ) return -EINVAL; - ret = kstrtou32_from_user(ubuf, cnt, 0, &dbg_mmio_offset); + ret = kstrtos32_from_user(ubuf, cnt, 0, &dbg_mmio_offset); if (ret) return ret; - if (dbg_mmio_offset > iommu->mmio_phys_end - sizeof(u64)) + if (dbg_mmio_offset < 0 || dbg_mmio_offset > + iommu->mmio_phys_end - sizeof(u64)) return -EINVAL; iommu->dbg_mmio_offset = dbg_mmio_offset; @@ -71,12 +72,12 @@ static ssize_t iommu_capability_write(struct file *filp, const char __user *ubuf if (cnt > OFS_IN_SZ) return -EINVAL; - ret = kstrtou32_from_user(ubuf, cnt, 0, &dbg_cap_offset); + ret = kstrtos32_from_user(ubuf, cnt, 0, &dbg_cap_offset); if (ret) return ret; /* Capability register at offset 0x14 is the last IOMMU capability register. */ - if (dbg_cap_offset > 0x14) + if (dbg_cap_offset < 0 || dbg_cap_offset > 0x14) return -EINVAL; iommu->dbg_cap_offset = dbg_cap_offset; diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 56ad020df49497..3bdb380d23e9a9 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -132,6 +132,9 @@ struct ivhd_entry { u8 uid; } __attribute__((packed)); +int amd_iommu_evtlog_size = EVTLOG_SIZE_DEF; +int amd_iommu_pprlog_size = PPRLOG_SIZE_DEF; + /* * An AMD IOMMU memory definition structure. It defines things like exclusion * ranges for devices and regions that should be unity mapped. @@ -865,35 +868,47 @@ void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu, gfp_t gfp, } /* allocates the memory where the IOMMU will log its events to */ -static int __init alloc_event_buffer(struct amd_iommu *iommu) +static int __init alloc_event_buffer(void) { - iommu->evt_buf = iommu_alloc_4k_pages(iommu, GFP_KERNEL, - EVT_BUFFER_SIZE); + struct amd_iommu *iommu; - return iommu->evt_buf ? 0 : -ENOMEM; + for_each_iommu(iommu) { + iommu->evt_buf = iommu_alloc_4k_pages(iommu, GFP_KERNEL, + amd_iommu_evtlog_size); + if (!iommu->evt_buf) + return -ENOMEM; + } + + return 0; } -static void iommu_enable_event_buffer(struct amd_iommu *iommu) +static void iommu_enable_event_buffer(void) { + struct amd_iommu *iommu; u64 entry; - BUG_ON(iommu->evt_buf == NULL); + for_each_iommu(iommu) { + BUG_ON(iommu->evt_buf == NULL); - if (!is_kdump_kernel()) { - /* - * Event buffer is re-used for kdump kernel and setting - * of MMIO register is not required. - */ - entry = iommu_virt_to_phys(iommu->evt_buf) | EVT_LEN_MASK; - memcpy_toio(iommu->mmio_base + MMIO_EVT_BUF_OFFSET, - &entry, sizeof(entry)); - } + if (!is_kdump_kernel()) { + /* + * Event buffer is re-used for kdump kernel and setting + * of MMIO register is not required. + */ + entry = iommu_virt_to_phys(iommu->evt_buf); + entry |= (amd_iommu_evtlog_size == EVTLOG_SIZE_DEF) ? + EVTLOG_LEN_MASK_DEF : EVTLOG_LEN_MASK_MAX; - /* set head and tail to zero manually */ - writel(0x00, iommu->mmio_base + MMIO_EVT_HEAD_OFFSET); - writel(0x00, iommu->mmio_base + MMIO_EVT_TAIL_OFFSET); + memcpy_toio(iommu->mmio_base + MMIO_EVT_BUF_OFFSET, + &entry, sizeof(entry)); + } - iommu_feature_enable(iommu, CONTROL_EVT_LOG_EN); + /* set head and tail to zero manually */ + writel(0x00, iommu->mmio_base + MMIO_EVT_HEAD_OFFSET); + writel(0x00, iommu->mmio_base + MMIO_EVT_TAIL_OFFSET); + + iommu_feature_enable(iommu, CONTROL_EVT_LOG_EN); + } } /* @@ -984,15 +999,20 @@ static int __init alloc_cwwb_sem(struct amd_iommu *iommu) return 0; } -static int __init remap_event_buffer(struct amd_iommu *iommu) +static int __init remap_event_buffer(void) { + struct amd_iommu *iommu; u64 paddr; pr_info_once("Re-using event buffer from the previous kernel\n"); - paddr = readq(iommu->mmio_base + MMIO_EVT_BUF_OFFSET) & PM_ADDR_MASK; - iommu->evt_buf = iommu_memremap(paddr, EVT_BUFFER_SIZE); + for_each_iommu(iommu) { + paddr = readq(iommu->mmio_base + MMIO_EVT_BUF_OFFSET) & PM_ADDR_MASK; + iommu->evt_buf = iommu_memremap(paddr, amd_iommu_evtlog_size); + if (!iommu->evt_buf) + return -ENOMEM; + } - return iommu->evt_buf ? 0 : -ENOMEM; + return 0; } static int __init remap_command_buffer(struct amd_iommu *iommu) @@ -1044,10 +1064,6 @@ static int __init alloc_iommu_buffers(struct amd_iommu *iommu) ret = remap_command_buffer(iommu); if (ret) return ret; - - ret = remap_event_buffer(iommu); - if (ret) - return ret; } else { ret = alloc_cwwb_sem(iommu); if (ret) @@ -1056,10 +1072,6 @@ static int __init alloc_iommu_buffers(struct amd_iommu *iommu) ret = alloc_command_buffer(iommu); if (ret) return ret; - - ret = alloc_event_buffer(iommu); - if (ret) - return ret; } return 0; @@ -2893,7 +2905,6 @@ static void early_enable_iommu(struct amd_iommu *iommu) iommu_init_flags(iommu); iommu_set_device_table(iommu); iommu_enable_command_buffer(iommu); - iommu_enable_event_buffer(iommu); iommu_set_exclusion_range(iommu); iommu_enable_gt(iommu); iommu_enable_ga(iommu); @@ -2957,7 +2968,6 @@ static void early_enable_iommus(void) iommu_disable_event_buffer(iommu); iommu_disable_irtcachedis(iommu); iommu_enable_command_buffer(iommu); - iommu_enable_event_buffer(iommu); iommu_enable_ga(iommu); iommu_enable_xt(iommu); iommu_enable_irtcachedis(iommu); @@ -3070,6 +3080,7 @@ static void amd_iommu_resume(void *data) for_each_iommu(iommu) early_enable_iommu(iommu); + iommu_enable_event_buffer(); amd_iommu_enable_interrupts(); } @@ -3399,6 +3410,33 @@ static __init void iommu_snp_enable(void) #endif } +static void amd_iommu_apply_erratum_snp(void) +{ +#ifdef CONFIG_KVM_AMD_SEV + if (!amd_iommu_snp_en) + return; + + /* Errata fix for Family 0x19 */ + if (boot_cpu_data.x86 != 0x19) + return; + + /* Set event log buffer size to max */ + amd_iommu_evtlog_size = EVTLOG_SIZE_MAX; + pr_info("Applying erratum: Increase Event log size to 0x%x\n", + amd_iommu_evtlog_size); + + /* + * Set PPR log buffer size to max. + * (Family 0x19, model < 0x10 doesn't support PPR when SNP is enabled). + */ + if (boot_cpu_data.x86_model >= 0x10) { + amd_iommu_pprlog_size = PPRLOG_SIZE_MAX; + pr_info("Applying erratum: Increase PPR log size to 0x%x\n", + amd_iommu_pprlog_size); + } +#endif +} + /**************************************************************************** * * AMD IOMMU Initialization State Machine @@ -3435,6 +3473,21 @@ static int __init state_next(void) case IOMMU_ENABLED: register_syscore(&amd_iommu_syscore); iommu_snp_enable(); + + amd_iommu_apply_erratum_snp(); + + /* Allocate/enable event log buffer */ + if (is_kdump_kernel()) + ret = remap_event_buffer(); + else + ret = alloc_event_buffer(); + + if (ret) { + init_state = IOMMU_INIT_ERROR; + break; + } + iommu_enable_event_buffer(); + ret = amd_iommu_init_pci(); init_state = ret ? IOMMU_INIT_ERROR : IOMMU_PCI_INIT; break; @@ -4037,11 +4090,11 @@ int amd_iommu_snp_disable(void) return 0; for_each_iommu(iommu) { - ret = iommu_make_shared(iommu->evt_buf, EVT_BUFFER_SIZE); + ret = iommu_make_shared(iommu->evt_buf, amd_iommu_evtlog_size); if (ret) return ret; - ret = iommu_make_shared(iommu->ppr_log, PPR_LOG_SIZE); + ret = iommu_make_shared(iommu->ppr_log, amd_iommu_pprlog_size); if (ret) return ret; diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 01171361f9bc22..57dc8fabc7d9b9 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -351,8 +351,12 @@ static struct amd_iommu *__rlookup_amd_iommu(u16 seg, u16 devid) struct amd_iommu_pci_seg *pci_seg; for_each_pci_segment(pci_seg) { - if (pci_seg->id == seg) - return pci_seg->rlookup_table[devid]; + if (pci_seg->id != seg) + continue; + /* IVRS may not describe every device on the bus */ + if (devid > pci_seg->last_bdf) + return NULL; + return pci_seg->rlookup_table[devid]; } return NULL; } @@ -1010,7 +1014,7 @@ static void iommu_poll_events(struct amd_iommu *iommu) iommu_print_event(iommu, iommu->evt_buf + head); /* Update head pointer of hardware ring-buffer */ - head = (head + EVENT_ENTRY_SIZE) % EVT_BUFFER_SIZE; + head = (head + EVTLOG_ENTRY_SIZE) % amd_iommu_evtlog_size; writel(head, iommu->mmio_base + MMIO_EVT_HEAD_OFFSET); } @@ -2149,7 +2153,8 @@ static void set_dte_passthrough(struct iommu_dev_data *dev_data, new->data[0] |= DTE_FLAG_TV | DTE_FLAG_IR | DTE_FLAG_IW; new->data[1] |= FIELD_PREP(DTE_DOMID_MASK, domain->id) | - (dev_data->ats_enabled) ? DTE_FLAG_IOTLB : 0; + (dev_data->ats_enabled ? DTE_FLAG_IOTLB : 0); + } static void set_dte_entry(struct amd_iommu *iommu, diff --git a/drivers/iommu/amd/ppr.c b/drivers/iommu/amd/ppr.c index e6767c057d01fa..1f8d2823bea42c 100644 --- a/drivers/iommu/amd/ppr.c +++ b/drivers/iommu/amd/ppr.c @@ -20,7 +20,7 @@ int __init amd_iommu_alloc_ppr_log(struct amd_iommu *iommu) { iommu->ppr_log = iommu_alloc_4k_pages(iommu, GFP_KERNEL | __GFP_ZERO, - PPR_LOG_SIZE); + amd_iommu_pprlog_size); return iommu->ppr_log ? 0 : -ENOMEM; } @@ -33,7 +33,9 @@ void amd_iommu_enable_ppr_log(struct amd_iommu *iommu) iommu_feature_enable(iommu, CONTROL_PPR_EN); - entry = iommu_virt_to_phys(iommu->ppr_log) | PPR_LOG_SIZE_512; + entry = iommu_virt_to_phys(iommu->ppr_log); + entry |= (amd_iommu_pprlog_size == PPRLOG_SIZE_DEF) ? + PPRLOG_LEN_MASK_DEF : PPRLOG_LEN_MASK_MAX; memcpy_toio(iommu->mmio_base + MMIO_PPR_LOG_OFFSET, &entry, sizeof(entry)); @@ -201,7 +203,7 @@ void amd_iommu_poll_ppr_log(struct amd_iommu *iommu) raw[0] = raw[1] = 0UL; /* Update head pointer of hardware ring-buffer */ - head = (head + PPR_ENTRY_SIZE) % PPR_LOG_SIZE; + head = (head + PPRLOG_ENTRY_SIZE) % amd_iommu_pprlog_size; writel(head, iommu->mmio_base + MMIO_PPR_HEAD_OFFSET); /* Handle PPR entry */ diff --git a/drivers/iommu/generic_pt/iommu_pt.h b/drivers/iommu/generic_pt/iommu_pt.h index 19b6daf88f2ab1..dc91fb4e2f61cb 100644 --- a/drivers/iommu/generic_pt/iommu_pt.h +++ b/drivers/iommu/generic_pt/iommu_pt.h @@ -534,10 +534,12 @@ static int __map_range_leaf(struct pt_range *range, void *arg, struct pt_state pts = pt_init(range, level, table); struct pt_iommu_map_args *map = arg; unsigned int leaf_pgsize_lg2 = map->leaf_pgsize_lg2; + unsigned int leaves_avail; unsigned int start_index; pt_oaddr_t oa = map->oa; - unsigned int num_leaves; + pt_vaddr_t num_leaves; unsigned int orig_end; + unsigned int step_lg2; pt_vaddr_t last_va; unsigned int step; bool need_contig; @@ -546,21 +548,25 @@ static int __map_range_leaf(struct pt_range *range, void *arg, PT_WARN_ON(map->leaf_level != level); PT_WARN_ON(!pt_can_have_leaf(&pts)); - step = log2_to_int_t(unsigned int, - leaf_pgsize_lg2 - pt_table_item_lg2sz(&pts)); - need_contig = leaf_pgsize_lg2 != pt_table_item_lg2sz(&pts); + step_lg2 = leaf_pgsize_lg2 - pt_table_item_lg2sz(&pts); + step = log2_to_int_t(unsigned int, step_lg2); + need_contig = step_lg2 != 0; _pt_iter_first(&pts); start_index = pts.index; orig_end = pts.end_index; - if (pts.index + map->num_leaves < pts.end_index) { + leaves_avail = + log2_div_t(unsigned int, pts.end_index - pts.index, step_lg2); + if (map->num_leaves <= leaves_avail) { /* Need to stop in the middle of the table to change sizes */ - pts.end_index = pts.index + map->num_leaves; + pts.end_index = pts.index + log2_mul(map->num_leaves, step_lg2); num_leaves = 0; } else { - num_leaves = map->num_leaves - (pts.end_index - pts.index); + num_leaves = map->num_leaves - leaves_avail; } + PT_WARN_ON( + log2_mod_t(unsigned int, pts.end_index - pts.index, step_lg2)); do { pts.type = pt_load_entry_raw(&pts); if (pts.type != PT_ENTRY_EMPTY || need_contig) { @@ -920,8 +926,8 @@ static int NS(map_range)(struct pt_iommu *iommu_table, dma_addr_t iova, return ret; /* Calculate target page size and level for the leaves */ - if (pt_has_system_page_size(common) && len == PAGE_SIZE) { - PT_WARN_ON(!(pgsize_bitmap & PAGE_SIZE)); + if (pt_has_system_page_size(common) && len == PAGE_SIZE && + likely(pgsize_bitmap & PAGE_SIZE)) { if (log2_mod(iova | paddr, PAGE_SHIFT)) return -ENXIO; map.leaf_pgsize_lg2 = PAGE_SHIFT; diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index c3d18cd77d2f1a..4d0e65bc131d77 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -3530,8 +3530,8 @@ void domain_remove_dev_pasid(struct iommu_domain *domain, if (!domain) return; - /* Identity domain has no meta data for pasid. */ - if (domain->type == IOMMU_DOMAIN_IDENTITY) + /* Identity domain and blocked domain have no meta data for pasid. */ + if (domain->type == IOMMU_DOMAIN_IDENTITY || domain->type == IOMMU_DOMAIN_BLOCKED) return; dmar_domain = to_dmar_domain(domain); @@ -3545,12 +3545,13 @@ void domain_remove_dev_pasid(struct iommu_domain *domain, } spin_unlock_irqrestore(&dmar_domain->lock, flags); + if (WARN_ON_ONCE(!dev_pasid)) + return; + cache_tag_unassign_domain(dmar_domain, dev, pasid); domain_detach_iommu(dmar_domain, iommu); - if (!WARN_ON_ONCE(!dev_pasid)) { - intel_iommu_debugfs_remove_dev_pasid(dev_pasid); - kfree(dev_pasid); - } + intel_iommu_debugfs_remove_dev_pasid(dev_pasid); + kfree(dev_pasid); } static int blocking_domain_set_dev_pasid(struct iommu_domain *domain, @@ -3937,6 +3938,9 @@ static void quirk_iommu_igfx(struct pci_dev *dev) disable_igfx_iommu = 1; } +/* Q35 integrated gfx dmar support is totally busted. */ +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x29b2, quirk_iommu_igfx); + /* G4x/GM45 integrated gfx dmar support is totally busted. */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2a40, quirk_iommu_igfx); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e00, quirk_iommu_igfx); diff --git a/drivers/iommu/iommu-pages.h b/drivers/iommu/iommu-pages.h index ae9da4f571f614..e9e605b5fa3afc 100644 --- a/drivers/iommu/iommu-pages.h +++ b/drivers/iommu/iommu-pages.h @@ -137,7 +137,7 @@ static inline void iommu_pages_flush_incoherent(struct device *dma_dev, void *virt, size_t offset, size_t len) { - dma_sync_single_for_device(dma_dev, (uintptr_t)virt + offset, len, + dma_sync_single_for_device(dma_dev, virt_to_phys(virt) + offset, len, DMA_TO_DEVICE); } void iommu_pages_stop_incoherent_list(struct iommu_pages_list *list, diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 61c12ba782066a..d1a9e713d3a05f 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -62,14 +62,14 @@ struct iommu_group { int id; struct iommu_domain *default_domain; struct iommu_domain *blocking_domain; - /* - * During a group device reset, @resetting_domain points to the physical - * domain, while @domain points to the attached domain before the reset. - */ - struct iommu_domain *resetting_domain; struct iommu_domain *domain; struct list_head entry; unsigned int owner_cnt; + /* + * Number of devices in the group undergoing or awaiting recovery. + * If non-zero, concurrent domain attachments are rejected. + */ + unsigned int recovery_cnt; void *owner; }; @@ -77,12 +77,33 @@ struct group_device { struct list_head list; struct device *dev; char *name; + /* + * Device is blocked for a pending recovery while its group->domain is + * retained. This can happen when: + * - Device is undergoing a reset + */ + bool blocked; + unsigned int reset_depth; }; /* Iterate over each struct group_device in a struct iommu_group */ #define for_each_group_device(group, pos) \ list_for_each_entry(pos, &(group)->devices, list) +static struct group_device *__dev_to_gdev(struct device *dev) +{ + struct iommu_group *group = dev->iommu_group; + struct group_device *gdev; + + lockdep_assert_held(&group->mutex); + + for_each_group_device(group, gdev) { + if (gdev->dev == dev) + return gdev; + } + return NULL; +} + struct iommu_group_attribute { struct attribute attr; ssize_t (*show)(struct iommu_group *group, char *buf); @@ -2196,6 +2217,8 @@ EXPORT_SYMBOL_GPL(iommu_attach_device); int iommu_deferred_attach(struct device *dev, struct iommu_domain *domain) { + struct group_device *gdev; + /* * This is called on the dma mapping fast path so avoid locking. This is * racy, but we have an expectation that the driver will setup its DMAs @@ -2206,14 +2229,18 @@ int iommu_deferred_attach(struct device *dev, struct iommu_domain *domain) guard(mutex)(&dev->iommu_group->mutex); + gdev = __dev_to_gdev(dev); + if (WARN_ON(!gdev)) + return -ENODEV; + /* - * This is a concurrent attach during a device reset. Reject it until + * This is a concurrent attach during device recovery. Reject it until * pci_dev_reset_iommu_done() attaches the device to group->domain. * * Note that this might fail the iommu_dma_map(). But there's nothing * more we can do here. */ - if (dev->iommu_group->resetting_domain) + if (gdev->blocked) return -EBUSY; return __iommu_attach_device(domain, dev, NULL); } @@ -2270,19 +2297,24 @@ EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev); struct iommu_domain *iommu_driver_get_domain_for_dev(struct device *dev) { struct iommu_group *group = dev->iommu_group; + struct group_device *gdev; lockdep_assert_held(&group->mutex); + gdev = __dev_to_gdev(dev); + if (WARN_ON(!gdev)) + return NULL; + /* * Driver handles the low-level __iommu_attach_device(), including the * one invoked by pci_dev_reset_iommu_done() re-attaching the device to * the cached group->domain. In this case, the driver must get the old - * domain from group->resetting_domain rather than group->domain. This + * domain from group->blocking_domain rather than group->domain. This * prevents it from re-attaching the device from group->domain (old) to * group->domain (new). */ - if (group->resetting_domain) - return group->resetting_domain; + if (gdev->blocked) + return group->blocking_domain; return group->domain; } @@ -2441,10 +2473,11 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group, return -EINVAL; /* - * This is a concurrent attach during a device reset. Reject it until - * pci_dev_reset_iommu_done() attaches the device to group->domain. + * This is a concurrent attach during device recovery. Reject it until + * pci_dev_reset_iommu_done() attaches the device to group->domain, if + * IOMMU_SET_DOMAIN_MUST_SUCCEED is not set. */ - if (group->resetting_domain) + if (group->recovery_cnt && !(flags & IOMMU_SET_DOMAIN_MUST_SUCCEED)) return -EBUSY; /* @@ -2455,6 +2488,13 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group, */ result = 0; for_each_group_device(group, gdev) { + /* + * Device under recovery is attached to group->blocking_domain. + * Don't change that. pci_dev_reset_iommu_done() will re-attach + * its domain to the updated group->domain, after the recovery. + */ + if (gdev->blocked) + continue; ret = __iommu_device_set_domain(group, gdev->dev, new_domain, group->domain, flags); if (ret) { @@ -2575,27 +2615,16 @@ static size_t iommu_pgsize(struct iommu_domain *domain, unsigned long iova, static int __iommu_map_domain_pgtbl(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, - size_t size, int prot, gfp_t gfp) + size_t size, int prot, gfp_t gfp, + size_t *mapped) { const struct iommu_domain_ops *ops = domain->ops; - unsigned long orig_iova = iova; unsigned int min_pagesz; - size_t orig_size = size; int ret = 0; - might_sleep_if(gfpflags_allow_blocking(gfp)); - - if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING))) - return -EINVAL; - - if (WARN_ON(!ops->map_pages || domain->pgsize_bitmap == 0UL)) + if (WARN_ON(!ops->map_pages)) return -ENODEV; - /* Discourage passing strange GFP flags */ - if (WARN_ON_ONCE(gfp & (__GFP_COMP | __GFP_DMA | __GFP_DMA32 | - __GFP_HIGHMEM))) - return -EINVAL; - /* find out the minimum page size supported */ min_pagesz = 1 << __ffs(domain->pgsize_bitmap); @@ -2613,31 +2642,25 @@ static int __iommu_map_domain_pgtbl(struct iommu_domain *domain, pr_debug("map: iova 0x%lx pa %pa size 0x%zx\n", iova, &paddr, size); while (size) { - size_t pgsize, count, mapped = 0; + size_t pgsize, count, op_mapped = 0; pgsize = iommu_pgsize(domain, iova, paddr, size, &count); pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx count %zu\n", iova, &paddr, pgsize, count); ret = ops->map_pages(domain, iova, paddr, pgsize, count, prot, - gfp, &mapped); + gfp, &op_mapped); /* * Some pages may have been mapped, even if an error occurred, * so we should account for those so they can be unmapped. */ - size -= mapped; - + *mapped += op_mapped; if (ret) - break; - - iova += mapped; - paddr += mapped; - } + return ret; - /* unroll mapping in case something went wrong */ - if (ret) { - iommu_unmap(domain, orig_iova, orig_size - size); - return ret; + size -= op_mapped; + iova += op_mapped; + paddr += op_mapped; } return 0; } @@ -2655,25 +2678,31 @@ int iommu_map_nosync(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t size, int prot, gfp_t gfp) { struct pt_iommu *pt = iommupt_from_domain(domain); + size_t mapped = 0; int ret; - if (pt) { - size_t mapped = 0; + might_sleep_if(gfpflags_allow_blocking(gfp)); + /* Discourage passing strange GFP flags or illegal domains */ + if (WARN_ON_ONCE(!(domain->type & __IOMMU_DOMAIN_PAGING) || + !domain->pgsize_bitmap || + (gfp & (__GFP_COMP | __GFP_DMA | __GFP_DMA32 | + __GFP_HIGHMEM)))) + return -EINVAL; + + if (pt) ret = pt->ops->map_range(pt, iova, paddr, size, prot, gfp, &mapped); - if (ret) { - iommu_unmap(domain, iova, mapped); - return ret; - } - return 0; - } - ret = __iommu_map_domain_pgtbl(domain, iova, paddr, size, prot, gfp); - if (!ret) - return ret; + else + ret = __iommu_map_domain_pgtbl(domain, iova, paddr, size, prot, + gfp, &mapped); - trace_map(iova, paddr, size); - iommu_debug_map(domain, paddr, size); + trace_map(iova, paddr, mapped); + iommu_debug_map(domain, paddr, mapped); + if (ret) { + iommu_unmap(domain, iova, mapped); + return ret; + } return 0; } @@ -2702,10 +2731,7 @@ __iommu_unmap_domain_pgtbl(struct iommu_domain *domain, unsigned long iova, size_t unmapped_page, unmapped = 0; unsigned int min_pagesz; - if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING))) - return 0; - - if (WARN_ON(!ops->unmap_pages || domain->pgsize_bitmap == 0UL)) + if (WARN_ON(!ops->unmap_pages)) return 0; /* find out the minimum page size supported */ @@ -2724,8 +2750,6 @@ __iommu_unmap_domain_pgtbl(struct iommu_domain *domain, unsigned long iova, pr_debug("unmap this: iova 0x%lx size 0x%zx\n", iova, size); - iommu_debug_unmap_begin(domain, iova, size); - /* * Keep iterating until we either unmap 'size' bytes (or more) * or we hit an area that isn't mapped. @@ -2761,6 +2785,12 @@ static size_t __iommu_unmap(struct iommu_domain *domain, unsigned long iova, struct pt_iommu *pt = iommupt_from_domain(domain); size_t unmapped; + if (WARN_ON_ONCE(!(domain->type & __IOMMU_DOMAIN_PAGING) || + !domain->pgsize_bitmap)) + return 0; + + iommu_debug_unmap_begin(domain, iova, size); + if (pt) unmapped = pt->ops->unmap_range(pt, iova, size, iotlb_gather); else @@ -3570,7 +3600,12 @@ static void __iommu_remove_group_pasid(struct iommu_group *group, struct group_device *device; for_each_group_device(group, device) { - if (device->dev->iommu->max_pasids > 0) + /* + * A group-level detach cannot fail, even if there is a blocked + * device. In fact, blocked devices must be already detached for + * a pending device recovery. + */ + if (!device->blocked && device->dev->iommu->max_pasids > 0) iommu_remove_dev_pasid(device->dev, pasid, domain); } } @@ -3615,10 +3650,10 @@ int iommu_attach_device_pasid(struct iommu_domain *domain, mutex_lock(&group->mutex); /* - * This is a concurrent attach during a device reset. Reject it until + * This is a concurrent attach during device recovery. Reject it until * pci_dev_reset_iommu_done() attaches the device to group->domain. */ - if (group->resetting_domain) { + if (group->recovery_cnt) { ret = -EBUSY; goto out_unlock; } @@ -3708,10 +3743,10 @@ int iommu_replace_device_pasid(struct iommu_domain *domain, mutex_lock(&group->mutex); /* - * This is a concurrent attach during a device reset. Reject it until + * This is a concurrent attach during device recovery. Reject it until * pci_dev_reset_iommu_done() attaches the device to group->domain. */ - if (group->resetting_domain) { + if (group->recovery_cnt) { ret = -EBUSY; goto out_unlock; } @@ -3982,12 +4017,12 @@ EXPORT_SYMBOL_NS_GPL(iommu_replace_group_handle, "IOMMUFD_INTERNAL"); * routine wants to block any IOMMU activity: translation and ATS invalidation. * * This function attaches the device's RID/PASID(s) the group->blocking_domain, - * setting the group->resetting_domain. This allows the IOMMU driver pausing any + * incrementing the group->recovery_cnt, to allow the IOMMU driver pausing any * IOMMU activity while leaving the group->domain pointer intact. Later when the * reset is finished, pci_dev_reset_iommu_done() can restore everything. * * Caller must use pci_dev_reset_iommu_prepare() with pci_dev_reset_iommu_done() - * before/after the core-level reset routine, to unset the resetting_domain. + * before/after the core-level reset routine, to decrement the recovery_cnt. * * Return: 0 on success or negative error code if the preparation failed. * @@ -4000,6 +4035,7 @@ EXPORT_SYMBOL_NS_GPL(iommu_replace_group_handle, "IOMMUFD_INTERNAL"); int pci_dev_reset_iommu_prepare(struct pci_dev *pdev) { struct iommu_group *group = pdev->dev.iommu_group; + struct group_device *gdev; unsigned long pasid; void *entry; int ret; @@ -4009,45 +4045,99 @@ int pci_dev_reset_iommu_prepare(struct pci_dev *pdev) guard(mutex)(&group->mutex); - /* Re-entry is not allowed */ - if (WARN_ON(group->resetting_domain)) - return -EBUSY; + gdev = __dev_to_gdev(&pdev->dev); + if (WARN_ON(!gdev)) + return -ENODEV; + + if (gdev->reset_depth++) + return 0; ret = __iommu_group_alloc_blocking_domain(group); - if (ret) + if (ret) { + gdev->reset_depth--; return ret; + } /* Stage RID domain at blocking_domain while retaining group->domain */ if (group->domain != group->blocking_domain) { ret = __iommu_attach_device(group->blocking_domain, &pdev->dev, group->domain); - if (ret) + if (ret) { + gdev->reset_depth--; return ret; + } } + /* + * Update gdev->blocked upon the domain change, as it is used to return + * the correct domain in iommu_driver_get_domain_for_dev() that might be + * called in a set_dev_pasid callback function. + */ + gdev->blocked = true; + /* * Stage PASID domains at blocking_domain while retaining pasid_array. * * The pasid_array is mostly fenced by group->mutex, except one reader * in iommu_attach_handle_get(), so it's safe to read without xa_lock. */ - xa_for_each_start(&group->pasid_array, pasid, entry, 1) - iommu_remove_dev_pasid(&pdev->dev, pasid, - pasid_array_entry_to_domain(entry)); + if (pdev->dev.iommu->max_pasids > 0) { + xa_for_each_start(&group->pasid_array, pasid, entry, 1) { + struct iommu_domain *pasid_dom = + pasid_array_entry_to_domain(entry); + + iommu_remove_dev_pasid(&pdev->dev, pasid, pasid_dom); + } + } - group->resetting_domain = group->blocking_domain; + group->recovery_cnt++; return ret; } EXPORT_SYMBOL_GPL(pci_dev_reset_iommu_prepare); +static int __group_device_cmp_dma_alias(struct pci_dev *dev, u16 alias, + void *data) +{ + return alias == *(u16 *)data; +} + +static int group_device_cmp_dma_alias(struct pci_dev *dev, u16 alias, + void *data) +{ + return pci_for_each_dma_alias(data, __group_device_cmp_dma_alias, + &alias); +} + +static bool group_device_dma_alias_is_blocked(struct iommu_group *group, + struct group_device *gdev) +{ + struct group_device *sibling; + + lockdep_assert_held(&group->mutex); + + if (!dev_is_pci(gdev->dev)) + return false; + + for_each_group_device(group, sibling) { + if (sibling == gdev || !sibling->blocked || + !dev_is_pci(sibling->dev)) + continue; + if (pci_for_each_dma_alias(to_pci_dev(gdev->dev), + group_device_cmp_dma_alias, + to_pci_dev(sibling->dev))) + return true; + } + return false; +} + /** * pci_dev_reset_iommu_done() - Restore IOMMU after a PCI device reset is done * @pdev: PCI device that has finished a reset routine * * After a PCIe device finishes a reset routine, it wants to restore its IOMMU - * IOMMU activity, including new translation as well as cache invalidation, by - * re-attaching all RID/PASID of the device's back to the domains retained in - * the core-level structure. + * activity, including new translation and cache invalidation, by re-attaching + * all RID/PASID of the device back to the domains retained in the core-level + * structure. * * Caller must pair it with a successful pci_dev_reset_iommu_prepare(). * @@ -4057,6 +4147,7 @@ EXPORT_SYMBOL_GPL(pci_dev_reset_iommu_prepare); void pci_dev_reset_iommu_done(struct pci_dev *pdev) { struct iommu_group *group = pdev->dev.iommu_group; + struct group_device *gdev; unsigned long pasid; void *entry; @@ -4065,32 +4156,70 @@ void pci_dev_reset_iommu_done(struct pci_dev *pdev) guard(mutex)(&group->mutex); - /* pci_dev_reset_iommu_prepare() was bypassed for the device */ - if (!group->resetting_domain) + gdev = __dev_to_gdev(&pdev->dev); + if (WARN_ON(!gdev)) + return; + + /* Unbalanced done() calls would underflow the counter */ + if (WARN_ON(gdev->reset_depth == 0)) + return; + if (--gdev->reset_depth) return; - /* pci_dev_reset_iommu_prepare() was not successfully called */ if (WARN_ON(!group->blocking_domain)) return; - /* Re-attach RID domain back to group->domain */ - if (group->domain != group->blocking_domain) { + if (group_device_dma_alias_is_blocked(group, gdev)) { + /* + * FIXME: DMA aliased devices share the same RID, which would be + * convoluted to handle, as "gdev->blocked" is not sufficient: + * - "blocked" state is effectively shared across these devices + * - if the core skipped the blocking on the second device, the + * IOMMU driver's attachment state would diverge from the HW + * state + * For now, just warn and see whether real ATS use cases hit it. + */ + pci_warn(pdev, + "DMA-aliased sibling may be prematurely unblocked\n"); + } + + /* + * Re-attach RID domain back to group->domain + * + * Leave the device parked in the blocking_domain if group->domain isn't + * initialized yet + */ + if (group->domain && group->domain != group->blocking_domain) { WARN_ON(__iommu_attach_device(group->domain, &pdev->dev, group->blocking_domain)); } + /* + * Update gdev->blocked upon the domain change, as it is used to return + * the correct domain in iommu_driver_get_domain_for_dev() that might be + * called in a set_dev_pasid callback function. + */ + gdev->blocked = false; + /* * Re-attach PASID domains back to the domains retained in pasid_array. * * The pasid_array is mostly fenced by group->mutex, except one reader * in iommu_attach_handle_get(), so it's safe to read without xa_lock. */ - xa_for_each_start(&group->pasid_array, pasid, entry, 1) - WARN_ON(__iommu_set_group_pasid( - pasid_array_entry_to_domain(entry), group, pasid, - group->blocking_domain)); + if (pdev->dev.iommu->max_pasids > 0) { + xa_for_each_start(&group->pasid_array, pasid, entry, 1) { + struct iommu_domain *pasid_dom = + pasid_array_entry_to_domain(entry); + + WARN_ON(pasid_dom->ops->set_dev_pasid( + pasid_dom, &pdev->dev, pasid, + group->blocking_domain)); + } + } - group->resetting_domain = NULL; + if (!WARN_ON(group->recovery_cnt == 0)) + group->recovery_cnt--; } EXPORT_SYMBOL_GPL(pci_dev_reset_iommu_done); diff --git a/drivers/irqchip/irq-ath79-cpu.c b/drivers/irqchip/irq-ath79-cpu.c index 923e4bba377676..9b7273a7f8ced9 100644 --- a/drivers/irqchip/irq-ath79-cpu.c +++ b/drivers/irqchip/irq-ath79-cpu.c @@ -85,10 +85,3 @@ static int __init ar79_cpu_intc_of_init( } IRQCHIP_DECLARE(ar79_cpu_intc, "qca,ar7100-cpu-intc", ar79_cpu_intc_of_init); - -void __init ath79_cpu_irq_init(unsigned irq_wb_chan2, unsigned irq_wb_chan3) -{ - irq_wb_chan[2] = irq_wb_chan2; - irq_wb_chan[3] = irq_wb_chan3; - mips_cpu_irq_init(); -} diff --git a/drivers/irqchip/irq-gic-v5-its.c b/drivers/irqchip/irq-gic-v5-its.c index 36a8d1368f0e44..28e39b065de0ee 100644 --- a/drivers/irqchip/irq-gic-v5-its.c +++ b/drivers/irqchip/irq-gic-v5-its.c @@ -929,14 +929,15 @@ static void gicv5_its_free_eventid(struct gicv5_its_dev *its_dev, u32 event_id_b static int gicv5_its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq, unsigned int nr_irqs, void *arg) { - u32 device_id, event_id_base, lpi; struct gicv5_its_dev *its_dev; + u32 device_id, event_id_base; msi_alloc_info_t *info = arg; irq_hw_number_t hwirq; struct irq_data *irqd; int ret, i; its_dev = info->scratchpad[0].ptr; + device_id = its_dev->device_id; ret = gicv5_its_alloc_eventid(its_dev, info, nr_irqs, &event_id_base); if (ret) @@ -946,22 +947,11 @@ static int gicv5_its_irq_domain_alloc(struct irq_domain *domain, unsigned int vi if (ret) goto out_eventid; - device_id = its_dev->device_id; + ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, NULL); + if (ret) + goto out_eventid; for (i = 0; i < nr_irqs; i++) { - ret = gicv5_alloc_lpi(); - if (ret < 0) { - pr_debug("Failed to find free LPI!\n"); - goto out_free_irqs; - } - lpi = ret; - - ret = irq_domain_alloc_irqs_parent(domain, virq + i, 1, &lpi); - if (ret) { - gicv5_free_lpi(lpi); - goto out_free_irqs; - } - /* * Store eventid and deviceid into the hwirq for later use. * @@ -980,13 +970,6 @@ static int gicv5_its_irq_domain_alloc(struct irq_domain *domain, unsigned int vi return 0; -out_free_irqs: - while (--i >= 0) { - irqd = irq_domain_get_irq_data(domain, virq + i); - gicv5_free_lpi(irqd->parent_data->hwirq); - irq_domain_reset_irq_data(irqd); - irq_domain_free_irqs_parent(domain, virq + i, 1); - } out_eventid: gicv5_its_free_eventid(its_dev, event_id_base, nr_irqs); return ret; @@ -1009,15 +992,14 @@ static void gicv5_its_irq_domain_free(struct irq_domain *domain, unsigned int vi bitmap_release_region(its_dev->event_map, event_id_base, get_count_order(nr_irqs)); - /* Hierarchically free irq data */ for (i = 0; i < nr_irqs; i++) { d = irq_domain_get_irq_data(domain, virq + i); - - gicv5_free_lpi(d->parent_data->hwirq); irq_domain_reset_irq_data(d); - irq_domain_free_irqs_parent(domain, virq + i, 1); } + /* Hierarchically free irq data */ + irq_domain_free_irqs_parent(domain, virq, nr_irqs); + gicv5_its_syncr(its, its_dev); gicv5_irs_syncr(); } diff --git a/drivers/irqchip/irq-gic-v5.c b/drivers/irqchip/irq-gic-v5.c index 6b0903be8ebfd3..c1af07083ceff3 100644 --- a/drivers/irqchip/irq-gic-v5.c +++ b/drivers/irqchip/irq-gic-v5.c @@ -59,16 +59,6 @@ static void release_lpi(u32 lpi) ida_free(&lpi_ida, lpi); } -int gicv5_alloc_lpi(void) -{ - return alloc_lpi(); -} - -void gicv5_free_lpi(u32 lpi) -{ - release_lpi(lpi); -} - static void gicv5_ppi_priority_init(void) { write_sysreg_s(REPEAT_BYTE(GICV5_IRQ_PRI_MI), SYS_ICC_PPI_PRIORITYR0_EL1); @@ -806,38 +796,64 @@ static void gicv5_lpi_config_reset(struct irq_data *d) gicv5_lpi_irq_write_pending_state(d, false); } +static void gicv5_irq_lpi_domain_free(struct irq_domain *domain, unsigned int virq, + unsigned int nr_irqs) +{ + struct irq_data *d; + + for (unsigned int i = 0; i < nr_irqs; i++, virq++) { + d = irq_domain_get_irq_data(domain, virq); + + release_lpi(d->hwirq); + + irq_set_handler(virq, NULL); + irq_domain_reset_irq_data(d); + } +} + static int gicv5_irq_lpi_domain_alloc(struct irq_domain *domain, unsigned int virq, unsigned int nr_irqs, void *arg) { irq_hw_number_t hwirq; struct irq_data *irqd; - u32 *lpi = arg; + unsigned int i; int ret; - if (WARN_ON_ONCE(nr_irqs != 1)) - return -EINVAL; + for (i = 0; i < nr_irqs; i++) { + ret = alloc_lpi(); + if (ret < 0) + goto out_free_lpis; + hwirq = ret; + + ret = gicv5_irs_iste_alloc(hwirq); + if (ret < 0) { + /* Undo partial state first, then clean up the rest */ + release_lpi(hwirq); + goto out_free_lpis; + } - hwirq = *lpi; + irqd = irq_domain_get_irq_data(domain, virq + i); - irqd = irq_domain_get_irq_data(domain, virq); + irq_domain_set_info(domain, virq + i, hwirq, &gicv5_lpi_irq_chip, + NULL, handle_fasteoi_irq, NULL, NULL); + irqd_set_single_target(irqd); - irq_domain_set_info(domain, virq, hwirq, &gicv5_lpi_irq_chip, NULL, - handle_fasteoi_irq, NULL, NULL); - irqd_set_single_target(irqd); + gicv5_hwirq_init(hwirq, GICV5_IRQ_PRI_MI, GICV5_HWIRQ_TYPE_LPI); + gicv5_lpi_config_reset(irqd); + } - ret = gicv5_irs_iste_alloc(hwirq); - if (ret < 0) - return ret; + return 0; - gicv5_hwirq_init(hwirq, GICV5_IRQ_PRI_MI, GICV5_HWIRQ_TYPE_LPI); - gicv5_lpi_config_reset(irqd); +out_free_lpis: + if (i) + gicv5_irq_lpi_domain_free(domain, virq, i); - return 0; + return ret; } static const struct irq_domain_ops gicv5_irq_lpi_domain_ops = { .alloc = gicv5_irq_lpi_domain_alloc, - .free = gicv5_irq_domain_free, + .free = gicv5_irq_lpi_domain_free, }; void __init gicv5_init_lpi_domain(void) @@ -858,30 +874,21 @@ static int gicv5_irq_ipi_domain_alloc(struct irq_domain *domain, unsigned int vi unsigned int nr_irqs, void *arg) { struct irq_data *irqd; - int ret, i; - u32 lpi; - - for (i = 0; i < nr_irqs; i++) { - ret = gicv5_alloc_lpi(); - if (ret < 0) - return ret; - - lpi = ret; + int ret; - ret = irq_domain_alloc_irqs_parent(domain, virq + i, 1, &lpi); - if (ret) { - gicv5_free_lpi(lpi); - return ret; - } + ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg); + if (ret) + return ret; - irqd = irq_domain_get_irq_data(domain, virq + i); + for (unsigned int i = 0; i < nr_irqs; i++, virq++) { + irqd = irq_domain_get_irq_data(domain, virq); - irq_domain_set_hwirq_and_chip(domain, virq + i, i, - &gicv5_ipi_irq_chip, NULL); + irq_domain_set_hwirq_and_chip(domain, virq, i, + &gicv5_ipi_irq_chip, NULL); irqd_set_single_target(irqd); - irq_set_handler(virq + i, handle_percpu_irq); + irq_set_handler(virq, handle_percpu_irq); } return 0; @@ -899,12 +906,11 @@ static void gicv5_irq_ipi_domain_free(struct irq_domain *domain, unsigned int vi if (!d) return; - gicv5_free_lpi(d->parent_data->hwirq); - irq_set_handler(virq + i, NULL); irq_domain_reset_irq_data(d); - irq_domain_free_irqs_parent(domain, virq + i, 1); } + + irq_domain_free_irqs_parent(domain, virq, nr_irqs); } static const struct irq_domain_ops gicv5_irq_ipi_domain_ops = { diff --git a/drivers/irqchip/irq-meson-gpio.c b/drivers/irqchip/irq-meson-gpio.c index f722e9c57e2e40..74a376ef452e21 100644 --- a/drivers/irqchip/irq-meson-gpio.c +++ b/drivers/irqchip/irq-meson-gpio.c @@ -415,8 +415,7 @@ static int meson_s4_gpio_irq_set_type(struct meson_gpio_irq_controller *ctl, if (type & (IRQ_TYPE_EDGE_RISING | IRQ_TYPE_EDGE_FALLING)) val |= BIT(ctl->params->edge_single_offset + idx); - meson_gpio_irq_update_bits(ctl, params->edge_pol_reg, - BIT(idx) | BIT(12 + idx), val); + meson_gpio_irq_update_bits(ctl, REG_EDGE_POL, BIT(idx) | BIT(12 + idx), val); return 0; }; diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c index ba903fa689bd52..a7a1852b548c48 100644 --- a/drivers/irqchip/irq-riscv-imsic-early.c +++ b/drivers/irqchip/irq-riscv-imsic-early.c @@ -158,6 +158,8 @@ static int imsic_dying_cpu(unsigned int cpu) /* Cleanup IPIs */ imsic_ipi_dying_cpu(); + imsic_local_sync_all(false); + /* Mark per-CPU IMSIC state as offline */ imsic_state_offline(); diff --git a/drivers/media/platform/qcom/camss/camss-csid-gen3.c b/drivers/media/platform/qcom/camss/camss-csid-gen3.c index 664245cf6eb0ca..bd059243790ede 100644 --- a/drivers/media/platform/qcom/camss/camss-csid-gen3.c +++ b/drivers/media/platform/qcom/camss/camss-csid-gen3.c @@ -48,9 +48,9 @@ #define IS_CSID_690(csid) ((csid->camss->res->version == CAMSS_8775P) \ || (csid->camss->res->version == CAMSS_8300)) #define CSID_BUF_DONE_IRQ_STATUS 0x8C -#define BUF_DONE_IRQ_STATUS_RDI_OFFSET (csid_is_lite(csid) ?\ - 1 : (IS_CSID_690(csid) ?\ - 13 : 14)) +#define BUF_DONE_IRQ_STATUS_RDI_OFFSET (csid_is_lite(csid) ? \ + ((IS_CSID_690(csid) ? 0 : 1)) : \ + ((IS_CSID_690(csid) ? 13 : 14))) #define CSID_BUF_DONE_IRQ_MASK 0x90 #define CSID_BUF_DONE_IRQ_CLEAR 0x94 #define CSID_BUF_DONE_IRQ_SET 0x98 diff --git a/drivers/media/platform/qcom/camss/camss-csiphy.c b/drivers/media/platform/qcom/camss/camss-csiphy.c index 62623393f41449..78a1b568dbae6a 100644 --- a/drivers/media/platform/qcom/camss/camss-csiphy.c +++ b/drivers/media/platform/qcom/camss/camss-csiphy.c @@ -558,12 +558,16 @@ static int csiphy_init_formats(struct v4l2_subdev *sd, return csiphy_set_format(sd, fh ? fh->state : NULL, &format); } -static bool csiphy_match_clock_name(const char *clock_name, const char *format, - int index) +static bool __printf(2, 3) +csiphy_match_clock_name(const char *clock_name, const char *format, ...) { char name[16]; /* csiphyXXX_timer\0 */ + va_list args; + + va_start(args, format); + vsnprintf(name, sizeof(name), format, args); + va_end(args); - snprintf(name, sizeof(name), format, index); return !strcmp(clock_name, name); } diff --git a/drivers/media/platform/qcom/camss/camss.c b/drivers/media/platform/qcom/camss/camss.c index 00b87fd9afbd89..9335636d7c4dfc 100644 --- a/drivers/media/platform/qcom/camss/camss.c +++ b/drivers/media/platform/qcom/camss/camss.c @@ -3598,12 +3598,10 @@ static const struct camss_subdev_resources csid_res_8775p[] = { /* CSID2 (lite) */ { .regulators = {}, - .clock = { "cpas_vfe_lite", "vfe_lite_ahb", - "vfe_lite_csid", "vfe_lite_cphy_rx", - "vfe_lite"}, + .clock = { "vfe_lite_csid", "vfe_lite_cphy_rx" }, .clock_rate = { - { 0, 0, 400000000, 400000000, 0}, - { 0, 0, 400000000, 480000000, 0} + { 400000000, 480000000 }, + { 400000000, 480000000 } }, .reg = { "csid_lite0" }, .interrupt = { "csid_lite0" }, @@ -3617,12 +3615,10 @@ static const struct camss_subdev_resources csid_res_8775p[] = { /* CSID3 (lite) */ { .regulators = {}, - .clock = { "cpas_vfe_lite", "vfe_lite_ahb", - "vfe_lite_csid", "vfe_lite_cphy_rx", - "vfe_lite"}, + .clock = { "vfe_lite_csid", "vfe_lite_cphy_rx" }, .clock_rate = { - { 0, 0, 400000000, 400000000, 0}, - { 0, 0, 400000000, 480000000, 0} + { 400000000, 480000000 }, + { 400000000, 480000000 } }, .reg = { "csid_lite1" }, .interrupt = { "csid_lite1" }, @@ -3636,12 +3632,10 @@ static const struct camss_subdev_resources csid_res_8775p[] = { /* CSID4 (lite) */ { .regulators = {}, - .clock = { "cpas_vfe_lite", "vfe_lite_ahb", - "vfe_lite_csid", "vfe_lite_cphy_rx", - "vfe_lite"}, + .clock = { "vfe_lite_csid", "vfe_lite_cphy_rx" }, .clock_rate = { - { 0, 0, 400000000, 400000000, 0}, - { 0, 0, 400000000, 480000000, 0} + { 400000000, 480000000 }, + { 400000000, 480000000 } }, .reg = { "csid_lite2" }, .interrupt = { "csid_lite2" }, @@ -3655,12 +3649,10 @@ static const struct camss_subdev_resources csid_res_8775p[] = { /* CSID5 (lite) */ { .regulators = {}, - .clock = { "cpas_vfe_lite", "vfe_lite_ahb", - "vfe_lite_csid", "vfe_lite_cphy_rx", - "vfe_lite"}, + .clock = { "vfe_lite_csid", "vfe_lite_cphy_rx" }, .clock_rate = { - { 0, 0, 400000000, 400000000, 0}, - { 0, 0, 400000000, 480000000, 0} + { 400000000, 480000000 }, + { 400000000, 480000000 } }, .reg = { "csid_lite3" }, .interrupt = { "csid_lite3" }, @@ -3674,12 +3666,10 @@ static const struct camss_subdev_resources csid_res_8775p[] = { /* CSID6 (lite) */ { .regulators = {}, - .clock = { "cpas_vfe_lite", "vfe_lite_ahb", - "vfe_lite_csid", "vfe_lite_cphy_rx", - "vfe_lite"}, + .clock = { "vfe_lite_csid", "vfe_lite_cphy_rx" }, .clock_rate = { - { 0, 0, 400000000, 400000000, 0}, - { 0, 0, 400000000, 480000000, 0} + { 400000000, 480000000 }, + { 400000000, 480000000 } }, .reg = { "csid_lite4" }, .interrupt = { "csid_lite4" }, @@ -3752,15 +3742,17 @@ static const struct camss_subdev_resources vfe_res_8775p[] = { /* VFE2 (lite) */ { .regulators = {}, - .clock = { "cpas_vfe_lite", "vfe_lite_ahb", + .clock = { "cpas_ahb", "cpas_vfe_lite", "vfe_lite_ahb", "vfe_lite_csid", "vfe_lite_cphy_rx", - "vfe_lite"}, + "vfe_lite", "camnoc_axi"}, .clock_rate = { - { 0, 0, 0, 0 }, + { 0 }, + { 0 }, { 300000000, 400000000, 400000000, 400000000 }, { 400000000, 400000000, 400000000, 400000000 }, { 400000000, 400000000, 400000000, 400000000 }, { 480000000, 600000000, 600000000, 600000000 }, + { 400000000 }, }, .reg = { "vfe_lite0" }, .interrupt = { "vfe_lite0" }, @@ -3775,15 +3767,17 @@ static const struct camss_subdev_resources vfe_res_8775p[] = { /* VFE3 (lite) */ { .regulators = {}, - .clock = { "cpas_vfe_lite", "vfe_lite_ahb", + .clock = { "cpas_ahb", "cpas_vfe_lite", "vfe_lite_ahb", "vfe_lite_csid", "vfe_lite_cphy_rx", - "vfe_lite"}, + "vfe_lite", "camnoc_axi"}, .clock_rate = { - { 0, 0, 0, 0 }, + { 0 }, + { 0 }, { 300000000, 400000000, 400000000, 400000000 }, { 400000000, 400000000, 400000000, 400000000 }, { 400000000, 400000000, 400000000, 400000000 }, { 480000000, 600000000, 600000000, 600000000 }, + { 400000000 }, }, .reg = { "vfe_lite1" }, .interrupt = { "vfe_lite1" }, @@ -3798,15 +3792,17 @@ static const struct camss_subdev_resources vfe_res_8775p[] = { /* VFE4 (lite) */ { .regulators = {}, - .clock = { "cpas_vfe_lite", "vfe_lite_ahb", + .clock = { "cpas_ahb", "cpas_vfe_lite", "vfe_lite_ahb", "vfe_lite_csid", "vfe_lite_cphy_rx", - "vfe_lite"}, + "vfe_lite", "camnoc_axi"}, .clock_rate = { - { 0, 0, 0, 0 }, + { 0 }, + { 0 }, { 300000000, 400000000, 400000000, 400000000 }, { 400000000, 400000000, 400000000, 400000000 }, { 400000000, 400000000, 400000000, 400000000 }, { 480000000, 600000000, 600000000, 600000000 }, + { 400000000 }, }, .reg = { "vfe_lite2" }, .interrupt = { "vfe_lite2" }, @@ -3821,15 +3817,17 @@ static const struct camss_subdev_resources vfe_res_8775p[] = { /* VFE5 (lite) */ { .regulators = {}, - .clock = { "cpas_vfe_lite", "vfe_lite_ahb", + .clock = { "cpas_ahb", "cpas_vfe_lite", "vfe_lite_ahb", "vfe_lite_csid", "vfe_lite_cphy_rx", - "vfe_lite"}, + "vfe_lite", "camnoc_axi"}, .clock_rate = { - { 0, 0, 0, 0 }, + { 0 }, + { 0 }, { 300000000, 400000000, 400000000, 400000000 }, { 400000000, 400000000, 400000000, 400000000 }, { 400000000, 400000000, 400000000, 400000000 }, { 480000000, 600000000, 600000000, 600000000 }, + { 400000000 }, }, .reg = { "vfe_lite3" }, .interrupt = { "vfe_lite3" }, @@ -3844,15 +3842,17 @@ static const struct camss_subdev_resources vfe_res_8775p[] = { /* VFE6 (lite) */ { .regulators = {}, - .clock = { "cpas_vfe_lite", "vfe_lite_ahb", + .clock = { "cpas_ahb", "cpas_vfe_lite", "vfe_lite_ahb", "vfe_lite_csid", "vfe_lite_cphy_rx", - "vfe_lite"}, + "vfe_lite", "camnoc_axi"}, .clock_rate = { - { 0, 0, 0, 0 }, + { 0 }, + { 0 }, { 300000000, 400000000, 400000000, 400000000 }, { 400000000, 400000000, 400000000, 400000000 }, { 400000000, 400000000, 400000000, 400000000 }, { 480000000, 600000000, 600000000, 600000000 }, + { 400000000 }, }, .reg = { "vfe_lite4" }, .interrupt = { "vfe_lite4" }, diff --git a/drivers/media/platform/qcom/iris/Kconfig b/drivers/media/platform/qcom/iris/Kconfig index 3c803a05305a80..5498f48362d15c 100644 --- a/drivers/media/platform/qcom/iris/Kconfig +++ b/drivers/media/platform/qcom/iris/Kconfig @@ -3,7 +3,7 @@ config VIDEO_QCOM_IRIS depends on VIDEO_DEV depends on ARCH_QCOM || COMPILE_TEST select V4L2_MEM2MEM_DEV - select QCOM_MDT_LOADER if ARCH_QCOM + select QCOM_MDT_LOADER select QCOM_SCM select VIDEOBUF2_DMA_CONTIG help diff --git a/drivers/media/platform/qcom/iris/iris_buffer.c b/drivers/media/platform/qcom/iris/iris_buffer.c index 9151f43bc6b9c2..1d53c7414b754b 100644 --- a/drivers/media/platform/qcom/iris/iris_buffer.c +++ b/drivers/media/platform/qcom/iris/iris_buffer.c @@ -582,10 +582,12 @@ static int iris_release_internal_buffers(struct iris_inst *inst, continue; if (!(buffer->attr & BUF_ATTR_QUEUED)) continue; + buffer->attr |= BUF_ATTR_PENDING_RELEASE; ret = hfi_ops->session_release_buf(inst, buffer); - if (ret) + if (ret) { + buffer->attr &= ~BUF_ATTR_PENDING_RELEASE; return ret; - buffer->attr |= BUF_ATTR_PENDING_RELEASE; + } } return 0; diff --git a/drivers/media/platform/qcom/iris/iris_core.c b/drivers/media/platform/qcom/iris/iris_core.c index 8406c48d635b6e..dbaac01eb15a0e 100644 --- a/drivers/media/platform/qcom/iris/iris_core.c +++ b/drivers/media/platform/qcom/iris/iris_core.c @@ -75,6 +75,10 @@ int iris_core_init(struct iris_core *core) if (ret) goto error_unload_fw; + ret = iris_vpu_switch_to_hwmode(core); + if (ret) + goto error_unload_fw; + ret = iris_hfi_core_init(core); if (ret) goto error_unload_fw; diff --git a/drivers/media/platform/qcom/iris/iris_hfi_common.c b/drivers/media/platform/qcom/iris/iris_hfi_common.c index 92112eb16c1104..621c66593d88d4 100644 --- a/drivers/media/platform/qcom/iris/iris_hfi_common.c +++ b/drivers/media/platform/qcom/iris/iris_hfi_common.c @@ -159,6 +159,10 @@ int iris_hfi_pm_resume(struct iris_core *core) if (ret) goto err_suspend_hw; + ret = iris_vpu_switch_to_hwmode(core); + if (ret) + goto err_suspend_hw; + ret = ops->sys_interframe_powercollapse(core); if (ret) goto err_suspend_hw; diff --git a/drivers/media/platform/qcom/iris/iris_hfi_queue.c b/drivers/media/platform/qcom/iris/iris_hfi_queue.c index b3ed06297953b9..bf6db23b53e210 100644 --- a/drivers/media/platform/qcom/iris/iris_hfi_queue.c +++ b/drivers/media/platform/qcom/iris/iris_hfi_queue.c @@ -263,7 +263,7 @@ int iris_hfi_queues_init(struct iris_core *core) GFP_KERNEL, DMA_ATTR_WRITE_COMBINE); if (!core->sfr_vaddr) { dev_err(core->dev, "sfr alloc and map failed\n"); - dma_free_attrs(core->dev, sizeof(*q_tbl_hdr), core->iface_q_table_vaddr, + dma_free_attrs(core->dev, queue_size, core->iface_q_table_vaddr, core->iface_q_table_daddr, DMA_ATTR_WRITE_COMBINE); return -ENOMEM; } diff --git a/drivers/media/platform/qcom/iris/iris_vdec.c b/drivers/media/platform/qcom/iris/iris_vdec.c index 719217399a304e..99d544e2af4f98 100644 --- a/drivers/media/platform/qcom/iris/iris_vdec.c +++ b/drivers/media/platform/qcom/iris/iris_vdec.c @@ -61,12 +61,6 @@ int iris_vdec_inst_init(struct iris_inst *inst) return iris_ctrls_init(inst); } -void iris_vdec_inst_deinit(struct iris_inst *inst) -{ - kfree(inst->fmt_dst); - kfree(inst->fmt_src); -} - static const struct iris_fmt iris_vdec_formats_cap[] = { [IRIS_FMT_NV12] = { .pixfmt = V4L2_PIX_FMT_NV12, diff --git a/drivers/media/platform/qcom/iris/iris_vdec.h b/drivers/media/platform/qcom/iris/iris_vdec.h index ec1ce55d1375fd..5123d2a340e15f 100644 --- a/drivers/media/platform/qcom/iris/iris_vdec.h +++ b/drivers/media/platform/qcom/iris/iris_vdec.h @@ -9,7 +9,6 @@ struct iris_inst; int iris_vdec_inst_init(struct iris_inst *inst); -void iris_vdec_inst_deinit(struct iris_inst *inst); int iris_vdec_enum_fmt(struct iris_inst *inst, struct v4l2_fmtdesc *f); int iris_vdec_try_fmt(struct iris_inst *inst, struct v4l2_format *f); int iris_vdec_s_fmt(struct iris_inst *inst, struct v4l2_format *f); diff --git a/drivers/media/platform/qcom/iris/iris_venc.c b/drivers/media/platform/qcom/iris/iris_venc.c index aa27b22704eb9e..4d886769d958b9 100644 --- a/drivers/media/platform/qcom/iris/iris_venc.c +++ b/drivers/media/platform/qcom/iris/iris_venc.c @@ -79,12 +79,6 @@ int iris_venc_inst_init(struct iris_inst *inst) return iris_ctrls_init(inst); } -void iris_venc_inst_deinit(struct iris_inst *inst) -{ - kfree(inst->fmt_dst); - kfree(inst->fmt_src); -} - static const struct iris_fmt iris_venc_formats_cap[] = { [IRIS_FMT_H264] = { .pixfmt = V4L2_PIX_FMT_H264, diff --git a/drivers/media/platform/qcom/iris/iris_venc.h b/drivers/media/platform/qcom/iris/iris_venc.h index c4db7433da5375..00c1716b2747c7 100644 --- a/drivers/media/platform/qcom/iris/iris_venc.h +++ b/drivers/media/platform/qcom/iris/iris_venc.h @@ -9,7 +9,6 @@ struct iris_inst; int iris_venc_inst_init(struct iris_inst *inst); -void iris_venc_inst_deinit(struct iris_inst *inst); int iris_venc_enum_fmt(struct iris_inst *inst, struct v4l2_fmtdesc *f); int iris_venc_try_fmt(struct iris_inst *inst, struct v4l2_format *f); int iris_venc_s_fmt(struct iris_inst *inst, struct v4l2_format *f); diff --git a/drivers/media/platform/qcom/iris/iris_vidc.c b/drivers/media/platform/qcom/iris/iris_vidc.c index bd38d84c9cc79d..5eb1786b07371d 100644 --- a/drivers/media/platform/qcom/iris/iris_vidc.c +++ b/drivers/media/platform/qcom/iris/iris_vidc.c @@ -289,10 +289,6 @@ int iris_close(struct file *filp) v4l2_m2m_ctx_release(inst->m2m_ctx); v4l2_m2m_release(inst->m2m_dev); mutex_lock(&inst->lock); - if (inst->domain == DECODER) - iris_vdec_inst_deinit(inst); - else if (inst->domain == ENCODER) - iris_venc_inst_deinit(inst); iris_session_close(inst); iris_inst_change_state(inst, IRIS_INST_DEINIT); iris_v4l2_fh_deinit(inst, filp); @@ -304,6 +300,8 @@ int iris_close(struct file *filp) mutex_unlock(&inst->lock); mutex_destroy(&inst->ctx_q_lock); mutex_destroy(&inst->lock); + kfree(inst->fmt_src); + kfree(inst->fmt_dst); kfree(inst); return 0; diff --git a/drivers/media/platform/qcom/iris/iris_vpu2.c b/drivers/media/platform/qcom/iris/iris_vpu2.c index 9c103a2e4e4eaf..01ef40f3895743 100644 --- a/drivers/media/platform/qcom/iris/iris_vpu2.c +++ b/drivers/media/platform/qcom/iris/iris_vpu2.c @@ -44,4 +44,5 @@ const struct vpu_ops iris_vpu2_ops = { .power_off_controller = iris_vpu_power_off_controller, .power_on_controller = iris_vpu_power_on_controller, .calc_freq = iris_vpu2_calc_freq, + .set_hwmode = iris_vpu_set_hwmode, }; diff --git a/drivers/media/platform/qcom/iris/iris_vpu3x.c b/drivers/media/platform/qcom/iris/iris_vpu3x.c index fe4423b951b1e9..3dad47be78b58f 100644 --- a/drivers/media/platform/qcom/iris/iris_vpu3x.c +++ b/drivers/media/platform/qcom/iris/iris_vpu3x.c @@ -234,14 +234,8 @@ static int iris_vpu35_power_on_hw(struct iris_core *core) if (ret) goto err_disable_hw_free_clk; - ret = dev_pm_genpd_set_hwmode(core->pmdomain_tbl->pd_devs[IRIS_HW_POWER_DOMAIN], true); - if (ret) - goto err_disable_hw_clk; - return 0; -err_disable_hw_clk: - iris_disable_unprepare_clock(core, IRIS_HW_CLK); err_disable_hw_free_clk: iris_disable_unprepare_clock(core, IRIS_HW_FREERUN_CLK); err_disable_axi_clk: @@ -266,6 +260,7 @@ const struct vpu_ops iris_vpu3_ops = { .power_off_controller = iris_vpu_power_off_controller, .power_on_controller = iris_vpu_power_on_controller, .calc_freq = iris_vpu3x_vpu4x_calculate_frequency, + .set_hwmode = iris_vpu_set_hwmode, }; const struct vpu_ops iris_vpu33_ops = { @@ -274,6 +269,7 @@ const struct vpu_ops iris_vpu33_ops = { .power_off_controller = iris_vpu33_power_off_controller, .power_on_controller = iris_vpu_power_on_controller, .calc_freq = iris_vpu3x_vpu4x_calculate_frequency, + .set_hwmode = iris_vpu_set_hwmode, }; const struct vpu_ops iris_vpu35_ops = { @@ -283,4 +279,5 @@ const struct vpu_ops iris_vpu35_ops = { .power_on_controller = iris_vpu35_vpu4x_power_on_controller, .program_bootup_registers = iris_vpu35_vpu4x_program_bootup_registers, .calc_freq = iris_vpu3x_vpu4x_calculate_frequency, + .set_hwmode = iris_vpu_set_hwmode, }; diff --git a/drivers/media/platform/qcom/iris/iris_vpu4x.c b/drivers/media/platform/qcom/iris/iris_vpu4x.c index a8db02ce5c5ec5..02e100a4045fce 100644 --- a/drivers/media/platform/qcom/iris/iris_vpu4x.c +++ b/drivers/media/platform/qcom/iris/iris_vpu4x.c @@ -252,21 +252,10 @@ static int iris_vpu4x_power_on_hardware(struct iris_core *core) ret = iris_vpu4x_power_on_apv(core); if (ret) goto disable_hw_clocks; - - iris_vpu4x_ahb_sync_reset_apv(core); } - iris_vpu4x_ahb_sync_reset_hardware(core); - - ret = iris_vpu4x_genpd_set_hwmode(core, true, efuse_value); - if (ret) - goto disable_apv_power_domain; - return 0; -disable_apv_power_domain: - if (!(efuse_value & DISABLE_VIDEO_APV_BIT)) - iris_vpu4x_power_off_apv(core); disable_hw_clocks: iris_vpu4x_disable_hardware_clocks(core, efuse_value); disable_vpp1_power_domain: @@ -359,6 +348,18 @@ static void iris_vpu4x_power_off_hardware(struct iris_core *core) iris_disable_power_domains(core, core->pmdomain_tbl->pd_devs[IRIS_HW_POWER_DOMAIN]); } +static int iris_vpu4x_set_hwmode(struct iris_core *core) +{ + u32 efuse_value = readl(core->reg_base + WRAPPER_EFUSE_MONITOR); + + if (!(efuse_value & DISABLE_VIDEO_APV_BIT)) + iris_vpu4x_ahb_sync_reset_apv(core); + + iris_vpu4x_ahb_sync_reset_hardware(core); + + return iris_vpu4x_genpd_set_hwmode(core, true, efuse_value); +} + const struct vpu_ops iris_vpu4x_ops = { .power_off_hw = iris_vpu4x_power_off_hardware, .power_on_hw = iris_vpu4x_power_on_hardware, @@ -366,4 +367,5 @@ const struct vpu_ops iris_vpu4x_ops = { .power_on_controller = iris_vpu35_vpu4x_power_on_controller, .program_bootup_registers = iris_vpu35_vpu4x_program_bootup_registers, .calc_freq = iris_vpu3x_vpu4x_calculate_frequency, + .set_hwmode = iris_vpu4x_set_hwmode, }; diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h index 12640eb5ed8c45..8c0d6b7b5de85f 100644 --- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h +++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h @@ -67,7 +67,7 @@ struct iris_inst; #define SIZE_DOLBY_RPU_METADATA (41 * 1024) #define H264_CABAC_HDR_RATIO_HD_TOT 1 #define H264_CABAC_RES_RATIO_HD_TOT 3 -#define H265D_MAX_SLICE 1200 +#define H265D_MAX_SLICE 3600 #define SIZE_H265D_HW_PIC_T SIZE_H264D_HW_PIC_T #define H265_CABAC_HDR_RATIO_HD_TOT 2 #define H265_CABAC_RES_RATIO_HD_TOT 2 diff --git a/drivers/media/platform/qcom/iris/iris_vpu_common.c b/drivers/media/platform/qcom/iris/iris_vpu_common.c index 548e5f1727fdb7..69e6126dc4d95e 100644 --- a/drivers/media/platform/qcom/iris/iris_vpu_common.c +++ b/drivers/media/platform/qcom/iris/iris_vpu_common.c @@ -292,14 +292,8 @@ int iris_vpu_power_on_hw(struct iris_core *core) if (ret && ret != -ENOENT) goto err_disable_hw_clock; - ret = dev_pm_genpd_set_hwmode(core->pmdomain_tbl->pd_devs[IRIS_HW_POWER_DOMAIN], true); - if (ret) - goto err_disable_hw_ahb_clock; - return 0; -err_disable_hw_ahb_clock: - iris_disable_unprepare_clock(core, IRIS_HW_AHB_CLK); err_disable_hw_clock: iris_disable_unprepare_clock(core, IRIS_HW_CLK); err_disable_power: @@ -308,6 +302,16 @@ int iris_vpu_power_on_hw(struct iris_core *core) return ret; } +int iris_vpu_set_hwmode(struct iris_core *core) +{ + return dev_pm_genpd_set_hwmode(core->pmdomain_tbl->pd_devs[IRIS_HW_POWER_DOMAIN], true); +} + +int iris_vpu_switch_to_hwmode(struct iris_core *core) +{ + return core->iris_platform_data->vpu_ops->set_hwmode(core); +} + int iris_vpu35_vpu4x_power_off_controller(struct iris_core *core) { u32 clk_rst_tbl_size = core->iris_platform_data->clk_rst_tbl_size; diff --git a/drivers/media/platform/qcom/iris/iris_vpu_common.h b/drivers/media/platform/qcom/iris/iris_vpu_common.h index f6dffc613b8223..dee3b1349c5e86 100644 --- a/drivers/media/platform/qcom/iris/iris_vpu_common.h +++ b/drivers/media/platform/qcom/iris/iris_vpu_common.h @@ -21,6 +21,7 @@ struct vpu_ops { int (*power_on_controller)(struct iris_core *core); void (*program_bootup_registers)(struct iris_core *core); u64 (*calc_freq)(struct iris_inst *inst, size_t data_size); + int (*set_hwmode)(struct iris_core *core); }; int iris_vpu_boot_firmware(struct iris_core *core); @@ -30,6 +31,8 @@ int iris_vpu_watchdog(struct iris_core *core, u32 intr_status); int iris_vpu_prepare_pc(struct iris_core *core); int iris_vpu_power_on_controller(struct iris_core *core); int iris_vpu_power_on_hw(struct iris_core *core); +int iris_vpu_set_hwmode(struct iris_core *core); +int iris_vpu_switch_to_hwmode(struct iris_core *core); int iris_vpu_power_on(struct iris_core *core); int iris_vpu_power_off_controller(struct iris_core *core); void iris_vpu_power_off_hw(struct iris_core *core); diff --git a/drivers/media/platform/qcom/venus/Kconfig b/drivers/media/platform/qcom/venus/Kconfig index ffb731ecd48c90..63ee8c78dc6d75 100644 --- a/drivers/media/platform/qcom/venus/Kconfig +++ b/drivers/media/platform/qcom/venus/Kconfig @@ -4,7 +4,7 @@ config VIDEO_QCOM_VENUS depends on VIDEO_DEV && QCOM_SMEM depends on (ARCH_QCOM && ARM64 && IOMMU_API) || COMPILE_TEST select OF_DYNAMIC if ARCH_QCOM - select QCOM_MDT_LOADER if ARCH_QCOM + select QCOM_MDT_LOADER select QCOM_SCM select VIDEOBUF2_DMA_CONTIG select V4L2_MEM2MEM_DEV diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c index b9423389c2ef0b..44d670904ad893 100644 --- a/drivers/net/dsa/mt7530.c +++ b/drivers/net/dsa/mt7530.c @@ -25,6 +25,9 @@ #include "mt7530.h" +#define MT7530_STATS_POLL_INTERVAL (1 * HZ) +#define MT7530_STATS_RATE_LIMIT (HZ / 10) + static struct mt753x_pcs *pcs_to_mt753x_pcs(struct phylink_pcs *pcs) { return container_of(pcs, struct mt753x_pcs, pcs); @@ -906,10 +909,9 @@ static void mt7530_get_rmon_stats(struct dsa_switch *ds, int port, *ranges = mt7530_rmon_ranges; } -static void mt7530_get_stats64(struct dsa_switch *ds, int port, - struct rtnl_link_stats64 *storage) +static void mt7530_read_port_stats64(struct mt7530_priv *priv, int port, + struct rtnl_link_stats64 *storage) { - struct mt7530_priv *priv = ds->priv; uint64_t data; /* MIB counter doesn't provide a FramesTransmittedOK but instead @@ -951,6 +953,54 @@ static void mt7530_get_stats64(struct dsa_switch *ds, int port, &storage->rx_crc_errors); } +static void mt7530_stats_refresh(struct mt7530_priv *priv) +{ + struct rtnl_link_stats64 stats = {}; + struct dsa_port *dp; + int port; + + dsa_switch_for_each_user_port(dp, priv->ds) { + port = dp->index; + + mt7530_read_port_stats64(priv, port, &stats); + + spin_lock_bh(&priv->stats_lock); + priv->ports[port].stats = stats; + priv->stats_last = jiffies; + spin_unlock_bh(&priv->stats_lock); + } +} + +static void mt7530_stats_poll(struct work_struct *work) +{ + struct mt7530_priv *priv = container_of(work, struct mt7530_priv, + stats_work.work); + + mt7530_stats_refresh(priv); + schedule_delayed_work(&priv->stats_work, + MT7530_STATS_POLL_INTERVAL); +} + +static void mt7530_get_stats64(struct dsa_switch *ds, int port, + struct rtnl_link_stats64 *storage) +{ + struct mt7530_priv *priv = ds->priv; + bool refresh; + + if (priv->bus) { + spin_lock_bh(&priv->stats_lock); + *storage = priv->ports[port].stats; + refresh = time_after(jiffies, priv->stats_last + + MT7530_STATS_RATE_LIMIT); + spin_unlock_bh(&priv->stats_lock); + if (refresh) + mod_delayed_work(system_percpu_wq, + &priv->stats_work, 0); + } else { + mt7530_read_port_stats64(priv, port, storage); + } +} + static void mt7530_get_eth_ctrl_stats(struct dsa_switch *ds, int port, struct ethtool_eth_ctrl_stats *ctrl_stats) { @@ -3137,9 +3187,24 @@ mt753x_setup(struct dsa_switch *ds) if (ret && priv->irq_domain) mt7530_free_mdio_irq(priv); + if (!ret && priv->bus) { + mt7530_stats_refresh(priv); + schedule_delayed_work(&priv->stats_work, + MT7530_STATS_POLL_INTERVAL); + } + return ret; } +static void +mt753x_teardown(struct dsa_switch *ds) +{ + struct mt7530_priv *priv = ds->priv; + + if (priv->bus) + cancel_delayed_work_sync(&priv->stats_work); +} + static int mt753x_set_mac_eee(struct dsa_switch *ds, int port, struct ethtool_keee *e) { @@ -3257,6 +3322,7 @@ static int mt7988_setup(struct dsa_switch *ds) static const struct dsa_switch_ops mt7530_switch_ops = { .get_tag_protocol = mtk_get_tag_protocol, .setup = mt753x_setup, + .teardown = mt753x_teardown, .preferred_default_local_cpu_port = mt753x_preferred_default_local_cpu_port, .get_strings = mt7530_get_strings, .get_ethtool_stats = mt7530_get_ethtool_stats, @@ -3395,6 +3461,9 @@ mt7530_probe_common(struct mt7530_priv *priv) priv->ds->ops = &mt7530_switch_ops; priv->ds->phylink_mac_ops = &mt753x_phylink_mac_ops; mutex_init(&priv->reg_mutex); + spin_lock_init(&priv->stats_lock); + INIT_DELAYED_WORK(&priv->stats_work, mt7530_stats_poll); + dev_set_drvdata(dev, priv); return 0; diff --git a/drivers/net/dsa/mt7530.h b/drivers/net/dsa/mt7530.h index 3e0090bed298de..dd33b0df3419e3 100644 --- a/drivers/net/dsa/mt7530.h +++ b/drivers/net/dsa/mt7530.h @@ -796,6 +796,7 @@ struct mt7530_fdb { * @pvid: The VLAN specified is to be considered a PVID at ingress. Any * untagged frames will be assigned to the related VLAN. * @sgmii_pcs: Pointer to PCS instance for SerDes ports + * @stats: Cached port statistics for MDIO-connected switches */ struct mt7530_port { bool enable; @@ -803,6 +804,7 @@ struct mt7530_port { u32 pm; u16 pvid; struct phylink_pcs *sgmii_pcs; + struct rtnl_link_stats64 stats; }; /* Port 5 mode definitions of the MT7530 switch */ @@ -875,6 +877,9 @@ struct mt753x_info { * @create_sgmii: Pointer to function creating SGMII PCS instance(s) * @active_cpu_ports: Holding the active CPU ports * @mdiodev: The pointer to the MDIO device structure + * @stats_lock: Protects cached per-port stats from concurrent access + * @stats_work: Delayed work for polling MIB counters on MDIO switches + * @stats_last: Jiffies timestamp of last MIB counter poll */ struct mt7530_priv { struct device *dev; @@ -900,6 +905,9 @@ struct mt7530_priv { int (*create_sgmii)(struct mt7530_priv *priv); u8 active_cpu_ports; struct mdio_device *mdiodev; + spinlock_t stats_lock; /* protects cached stats counters */ + struct delayed_work stats_work; + unsigned long stats_last; }; struct mt7530_hw_vlan_entry { diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c index f8b3d53bccadbd..d0c0c0ec8a80af 100644 --- a/drivers/net/ethernet/airoha/airoha_eth.c +++ b/drivers/net/ethernet/airoha/airoha_eth.c @@ -2120,14 +2120,12 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb, return NETDEV_TX_OK; error_unmap: - while (!list_empty(&tx_list)) { - e = list_first_entry(&tx_list, struct airoha_queue_entry, - list); + list_for_each_entry(e, &tx_list, list) { dma_unmap_single(dev->dev.parent, e->dma_addr, e->dma_len, DMA_TO_DEVICE); e->dma_addr = 0; - list_move_tail(&e->list, &q->tx_list); } + list_splice(&tx_list, &q->tx_list); spin_unlock_bh(&q->lock); error: diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c index e67b592e569763..8c86789d867a5f 100644 --- a/drivers/net/ethernet/amazon/ena/ena_com.c +++ b/drivers/net/ethernet/amazon/ena/ena_com.c @@ -1782,20 +1782,23 @@ void ena_com_phc_destroy(struct ena_com_dev *ena_dev) int ena_com_phc_get_timestamp(struct ena_com_dev *ena_dev, u64 *timestamp) { - volatile struct ena_admin_phc_resp *resp = ena_dev->phc.virt_addr; const ktime_t zero_system_time = ktime_set(0, 0); struct ena_com_phc_info *phc = &ena_dev->phc; + volatile struct ena_admin_phc_resp *resp; ktime_t expire_time; ktime_t block_time; unsigned long flags = 0; int ret = 0; + spin_lock_irqsave(&phc->lock, flags); + if (!phc->active) { + spin_unlock_irqrestore(&phc->lock, flags); netdev_err(ena_dev->net_device, "PHC feature is not active in the device\n"); return -EOPNOTSUPP; } - spin_lock_irqsave(&phc->lock, flags); + resp = ena_dev->phc.virt_addr; /* Check if PHC is in blocked state */ if (unlikely(ktime_compare(phc->system_time, zero_system_time))) { diff --git a/drivers/net/ethernet/amazon/ena/ena_phc.c b/drivers/net/ethernet/amazon/ena/ena_phc.c index 7867e893fd15f9..c2a3ff1ef645c5 100644 --- a/drivers/net/ethernet/amazon/ena/ena_phc.c +++ b/drivers/net/ethernet/amazon/ena/ena_phc.c @@ -46,9 +46,12 @@ static int ena_phc_gettimex64(struct ptp_clock_info *clock_info, spin_unlock_irqrestore(&phc_info->lock, flags); + if (rc) + return rc; + *ts = ns_to_timespec64(timestamp_nsec); - return rc; + return 0; } static int ena_phc_settime64(struct ptp_clock_info *clock_info, diff --git a/drivers/net/ethernet/amd/xgbe/xgbe.h b/drivers/net/ethernet/amd/xgbe/xgbe.h index 60b7e53206d1e1..3d3b09010d48fb 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe.h +++ b/drivers/net/ethernet/amd/xgbe/xgbe.h @@ -135,11 +135,11 @@ */ #define XGBE_TSTAMP_SSINC 20 #define XGBE_TSTAMP_SNSINC 0 -#define XGBE_PTP_ACT_CLK_FREQ 500000000 +#define XGBE_PTP_ACT_CLK_FREQ (NSEC_PER_SEC / XGBE_TSTAMP_SSINC) #define XGBE_V2_TSTAMP_SSINC 0xA #define XGBE_V2_TSTAMP_SNSINC 0 -#define XGBE_V2_PTP_ACT_CLK_FREQ 1000000000 +#define XGBE_V2_PTP_ACT_CLK_FREQ (NSEC_PER_SEC / XGBE_V2_TSTAMP_SSINC) /* Define maximum supported values */ #define XGBE_MAX_PPS_OUT 4 diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c index b854b6b42d77b9..2926e1e599419d 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c @@ -910,7 +910,9 @@ static int xgene_mdiobus_register(struct xgene_enet_pdata *pdata, return -ENXIO; } - return of_mdiobus_register(mdio, mdio_np); + ret = of_mdiobus_register(mdio, mdio_np); + of_node_put(mdio_np); + return ret; } /* Mask out all PHYs from auto probing. */ diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c index e9e38af680c34a..39e1b606a75a9d 100644 --- a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c +++ b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c @@ -371,7 +371,7 @@ static void aq_pci_shutdown(struct pci_dev *pdev) pci_disable_device(pdev); if (system_state == SYSTEM_POWER_OFF) { - pci_wake_from_d3(pdev, false); + pci_wake_from_d3(pdev, self->aq_hw->aq_nic_cfg->wol); pci_set_power_state(pdev, PCI_D3hot); } } diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 8c55874f44ca6d..008c34cff7b46c 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -3825,7 +3825,10 @@ static int bnxt_alloc_tpa_info(struct bnxt *bp) if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) { if (!bp->max_tpa_v2) return 0; - bp->max_tpa = max_t(u16, bp->max_tpa_v2, MAX_TPA_P5); + bp->max_tpa = min_t(u16, bp->max_tpa_v2, MAX_TPA_P5); + /* Older P5 FW sets max_tpa_v2 low by mistake except NPAR */ + if (bp->max_tpa <= 32 && BNXT_CHIP_P5(bp) && !BNXT_NPAR(bp)) + bp->max_tpa = MAX_TPA_P5; } for (i = 0; i < bp->rx_nr_rings; i++) { @@ -17360,9 +17363,14 @@ static pci_ers_result_t bnxt_io_slot_reset(struct pci_dev *pdev) netdev_info(bp->dev, "PCI Slot Reset\n"); - if (!(bp->flags & BNXT_FLAG_CHIP_P5_PLUS) && - test_bit(BNXT_STATE_PCI_CHANNEL_IO_FROZEN, &bp->state)) - msleep(900); + if (test_bit(BNXT_STATE_PCI_CHANNEL_IO_FROZEN, &bp->state)) { + /* After DPC, the chip should return CRS when the vendor ID + * config register is read until it is ready. On all chips, + * this is not happening reliably so add a 5-second delay as a + * workaround. + */ + msleep(5000); + } netdev_lock(netdev); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c index 53f336db4fcc06..5d41dc1bc78209 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c @@ -419,31 +419,13 @@ void bnxt_ptp_reapply_pps(struct bnxt *bp) } } -static int bnxt_get_target_cycles(struct bnxt_ptp_cfg *ptp, u64 target_ns, - u64 *cycles_delta) -{ - u64 cycles_now; - u64 nsec_now, nsec_delta; - int rc; - - rc = bnxt_refclk_read(ptp->bp, NULL, &cycles_now); - if (rc) - return rc; - - nsec_now = bnxt_timecounter_cyc2time(ptp, cycles_now); - - nsec_delta = target_ns - nsec_now; - *cycles_delta = div64_u64(nsec_delta << ptp->cc.shift, ptp->cc.mult); - return 0; -} - static int bnxt_ptp_perout_cfg(struct bnxt_ptp_cfg *ptp, struct ptp_clock_request *rq) { struct hwrm_func_ptp_cfg_input *req; struct bnxt *bp = ptp->bp; struct timespec64 ts; - u64 target_ns, delta; + u64 target_ns; u16 enables; int rc; @@ -451,10 +433,6 @@ static int bnxt_ptp_perout_cfg(struct bnxt_ptp_cfg *ptp, ts.tv_nsec = rq->perout.start.nsec; target_ns = timespec64_to_ns(&ts); - rc = bnxt_get_target_cycles(ptp, target_ns, &delta); - if (rc) - return rc; - rc = hwrm_req_init(bp, req, HWRM_FUNC_PTP_CFG); if (rc) return rc; @@ -468,7 +446,10 @@ static int bnxt_ptp_perout_cfg(struct bnxt_ptp_cfg *ptp, req->ptp_freq_adj_dll_phase = 0; req->ptp_freq_adj_ext_period = cpu_to_le32(NSEC_PER_SEC); req->ptp_freq_adj_ext_up = 0; - req->ptp_freq_adj_ext_phase_lower = cpu_to_le32(delta); + req->ptp_freq_adj_ext_phase_lower = + cpu_to_le32(lower_32_bits(target_ns)); + req->ptp_freq_adj_ext_phase_upper = + cpu_to_le32(upper_32_bits(target_ns)); return hwrm_req_send(bp, req); } diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c index 052bf69cfa4cdd..5c751933da6a9d 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c @@ -175,8 +175,14 @@ int bnxt_register_dev(struct bnxt_en_dev *edev, ulp->handle = handle; rcu_assign_pointer(ulp->ulp_ops, ulp_ops); - if (test_bit(BNXT_STATE_OPEN, &bp->state)) - bnxt_hwrm_vnic_cfg(bp, &bp->vnic_info[BNXT_VNIC_DEFAULT]); + if (test_bit(BNXT_STATE_OPEN, &bp->state)) { + rc = bnxt_hwrm_vnic_cfg(bp, &bp->vnic_info[BNXT_VNIC_DEFAULT]); + if (rc) { + netdev_err(dev, "Failed to configure dual VNIC mode\n"); + RCU_INIT_POINTER(ulp->ulp_ops, NULL); + goto exit; + } + } edev->ulp_tbl->msix_requested = bnxt_get_ulp_msix_num(bp); diff --git a/drivers/net/ethernet/cirrus/cs89x0.c b/drivers/net/ethernet/cirrus/cs89x0.c index fa5857923db4c2..b4bfd6c174e78e 100644 --- a/drivers/net/ethernet/cirrus/cs89x0.c +++ b/drivers/net/ethernet/cirrus/cs89x0.c @@ -1271,7 +1271,6 @@ static const struct net_device_ops net_ops = { static void __init reset_chip(struct net_device *dev) { -#if !defined(CONFIG_MACH_MX31ADS) struct net_local *lp = netdev_priv(dev); unsigned long reset_start_time; @@ -1298,7 +1297,6 @@ static void __init reset_chip(struct net_device *dev) while ((readreg(dev, PP_SelfST) & INIT_DONE) == 0 && time_before(jiffies, reset_start_time + 2)) ; -#endif /* !CONFIG_MACH_MX31ADS */ } /* This is the real probe routine. diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c index 4824232f489072..4c762229ce420f 100644 --- a/drivers/net/ethernet/cortina/gemini.c +++ b/drivers/net/ethernet/cortina/gemini.c @@ -122,6 +122,9 @@ struct gemini_ethernet_port { struct napi_struct napi; struct hrtimer rx_coalesce_timer; unsigned int rx_coalesce_nsecs; + struct sk_buff *rx_skb; + unsigned int rx_frag_nr; + unsigned int freeq_refill; struct gmac_txq txq[TX_QUEUE_NUM]; unsigned int txq_order; @@ -1442,10 +1445,11 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) unsigned short m = (1 << port->rxq_order) - 1; struct gemini_ethernet *geth = port->geth; void __iomem *ptr_reg = port->rxq_rwptr; + unsigned int frag_nr = port->rx_frag_nr; + struct sk_buff *skb = port->rx_skb; unsigned int frame_len, frag_len; struct gmac_rxdesc *rx = NULL; struct gmac_queue_page *gpage; - static struct sk_buff *skb; union gmac_rxdesc_0 word0; union gmac_rxdesc_1 word1; union gmac_rxdesc_3 word3; @@ -1455,7 +1459,6 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) unsigned short r, w; union dma_rwptr rw; dma_addr_t mapping; - int frag_nr = 0; spin_lock_irqsave(&geth->irq_lock, flags); rw.bits32 = readl(ptr_reg); @@ -1491,6 +1494,12 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) gpage = gmac_get_queue_page(geth, port, mapping + PAGE_SIZE); if (!gpage) { dev_err(geth->dev, "could not find mapping\n"); + port->stats.rx_dropped++; + if (skb) { + napi_free_frags(&port->napi); + skb = NULL; + frag_nr = 0; + } continue; } page = gpage->page; @@ -1499,6 +1508,8 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) if (skb) { napi_free_frags(&port->napi); port->stats.rx_dropped++; + skb = NULL; + frag_nr = 0; } skb = gmac_skb_if_good_frame(port, word0, frame_len); @@ -1533,6 +1544,7 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) if (word3.bits32 & EOF_BIT) { napi_gro_frags(&port->napi); skb = NULL; + frag_nr = 0; --budget; } continue; @@ -1541,6 +1553,7 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) if (skb) { napi_free_frags(&port->napi); skb = NULL; + frag_nr = 0; } if (mapping) @@ -1549,6 +1562,8 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) port->stats.rx_dropped++; } + port->rx_skb = skb; + port->rx_frag_nr = frag_nr; writew(r, ptr_reg); return budget; } @@ -1876,6 +1891,8 @@ static int gmac_stop(struct net_device *netdev) gmac_disable_tx_rx(netdev); gmac_stop_dma(port); napi_disable(&port->napi); + port->rx_skb = NULL; + port->rx_frag_nr = 0; gmac_enable_irq(netdev, 0); gmac_cleanup_rxq(netdev); diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h index e663bb5e614eff..e691144e875671 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.h +++ b/drivers/net/ethernet/freescale/enetc/enetc.h @@ -330,6 +330,7 @@ struct enetc_si { struct workqueue_struct *workqueue; struct work_struct rx_mode_task; struct dentry *debugfs_root; + struct enetc_msg_swbd msg; /* Only valid for VSI */ }; #define ENETC_SI_ALIGN 32 diff --git a/drivers/net/ethernet/freescale/enetc/enetc_vf.c b/drivers/net/ethernet/freescale/enetc/enetc_vf.c index 6c4b374bcb0ef1..df8e95cc47d0fe 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_vf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_vf.c @@ -17,11 +17,36 @@ static void enetc_msg_vsi_write_msg(struct enetc_hw *hw, enetc_wr(hw, ENETC_VSIMSGSNDAR0, val); } +static void enetc_msg_dma_free(struct device *dev, struct enetc_msg_swbd *msg) +{ + if (msg->vaddr) { + dma_free_coherent(dev, msg->size, msg->vaddr, msg->dma); + msg->vaddr = NULL; + } +} + static int enetc_msg_vsi_send(struct enetc_si *si, struct enetc_msg_swbd *msg) { + struct device *dev = &si->pdev->dev; int timeout = 100; u32 vsimsgsr; + /* The VSI mailbox may be busy if last message was not yet processed + * by PSI. So need to check the mailbox status before sending. + */ + vsimsgsr = enetc_rd(&si->hw, ENETC_VSIMSGSR); + if (vsimsgsr & ENETC_VSIMSGSR_MB) { + /* It is safe to free the DMA buffer here, the caller does + * not access the DMA buffer if enetc_msg_vsi_send() fails. + */ + enetc_msg_dma_free(dev, msg); + dev_err(dev, "VSI mailbox is busy\n"); + return -EIO; + } + + /* Free the DMA buffer of the last message */ + enetc_msg_dma_free(dev, &si->msg); + si->msg = *msg; enetc_msg_vsi_write_msg(&si->hw, msg); do { @@ -32,12 +57,15 @@ static int enetc_msg_vsi_send(struct enetc_si *si, struct enetc_msg_swbd *msg) usleep_range(1000, 2000); } while (--timeout); - if (!timeout) + if (!timeout) { + dev_err(dev, "VSI mailbox timeout\n"); + return -ETIMEDOUT; + } /* check for message delivery error */ if (vsimsgsr & ENETC_VSIMSGSR_MS) { - dev_err(&si->pdev->dev, "VSI command execute error: %d\n", + dev_err(dev, "VSI command execute error: %d\n", ENETC_SIMSGSR_GET_MC(vsimsgsr)); return -EIO; } @@ -50,7 +78,6 @@ static int enetc_msg_vsi_set_primary_mac_addr(struct enetc_ndev_priv *priv, { struct enetc_msg_cmd_set_primary_mac *cmd; struct enetc_msg_swbd msg; - int err; msg.size = ALIGN(sizeof(struct enetc_msg_cmd_set_primary_mac), 64); msg.vaddr = dma_alloc_coherent(priv->dev, msg.size, &msg.dma, @@ -67,11 +94,7 @@ static int enetc_msg_vsi_set_primary_mac_addr(struct enetc_ndev_priv *priv, memcpy(&cmd->mac, saddr, sizeof(struct sockaddr)); /* send the command and wait */ - err = enetc_msg_vsi_send(priv->si, &msg); - - dma_free_coherent(priv->dev, msg.size, msg.vaddr, msg.dma); - - return err; + return enetc_msg_vsi_send(priv->si, &msg); } static int enetc_vf_set_mac_addr(struct net_device *ndev, void *addr) @@ -259,6 +282,7 @@ static void enetc_vf_remove(struct pci_dev *pdev) { struct enetc_si *si = pci_get_drvdata(pdev); struct enetc_ndev_priv *priv; + struct enetc_msg_swbd msg; priv = netdev_priv(si->ndev); unregister_netdev(si->ndev); @@ -270,7 +294,9 @@ static void enetc_vf_remove(struct pci_dev *pdev) free_netdev(si->ndev); + msg = si->msg; enetc_pci_remove(pdev); + enetc_msg_dma_free(&pdev->dev, &msg); } static const struct pci_device_id enetc_vf_id_table[] = { diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h index dcb50c2e1aa277..83e780919ac97f 100644 --- a/drivers/net/ethernet/intel/i40e/i40e.h +++ b/drivers/net/ethernet/intel/i40e/i40e.h @@ -1318,6 +1318,7 @@ void i40e_ptp_restore_hw_time(struct i40e_pf *pf); void i40e_ptp_init(struct i40e_pf *pf); void i40e_ptp_stop(struct i40e_pf *pf); int i40e_ptp_alloc_pins(struct i40e_pf *pf); +void i40e_ptp_free_pins(struct i40e_pf *pf); int i40e_update_adq_vsi_queues(struct i40e_vsi *vsi, int vsi_offset); int i40e_is_vsi_uplink_mode_veb(struct i40e_vsi *vsi); int i40e_get_partition_bw_setting(struct i40e_pf *pf); diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 028bd500603a54..6d4f9218dc6847 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -16108,9 +16108,11 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent) /* Unwind what we've done if something failed in the setup */ err_vsis: set_bit(__I40E_DOWN, pf->state); + i40e_ptp_stop(pf); i40e_clear_interrupt_scheme(pf); kfree(pf->vsi); err_switch_setup: + i40e_ptp_free_pins(pf); i40e_reset_interrupt_capability(pf); timer_shutdown_sync(&pf->service_timer); err_mac_addr: diff --git a/drivers/net/ethernet/intel/i40e/i40e_ptp.c b/drivers/net/ethernet/intel/i40e/i40e_ptp.c index 404a716db8da71..7d07c389bb2312 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_ptp.c +++ b/drivers/net/ethernet/intel/i40e/i40e_ptp.c @@ -940,12 +940,13 @@ int i40e_ptp_hwtstamp_get(struct net_device *netdev, * * Release memory allocated for PTP pins. **/ -static void i40e_ptp_free_pins(struct i40e_pf *pf) +void i40e_ptp_free_pins(struct i40e_pf *pf) { if (i40e_is_ptp_pin_dev(&pf->hw)) { kfree(pf->ptp_pins); kfree(pf->ptp_caps.pin_config); pf->ptp_pins = NULL; + pf->ptp_caps.pin_config = NULL; } } diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c index 16aa255351523d..0bc6dd37568792 100644 --- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c @@ -537,14 +537,14 @@ void ice_dcb_rebuild(struct ice_pf *pf) struct ice_dcbx_cfg *err_cfg; int ret; + mutex_lock(&pf->tc_mutex); + ret = ice_query_port_ets(pf->hw.port_info, &buf, sizeof(buf), NULL); if (ret) { dev_err(dev, "Query Port ETS failed\n"); goto dcb_error; } - mutex_lock(&pf->tc_mutex); - if (!pf->hw.port_info->qos_cfg.is_sw_lldp) ice_cfg_etsrec_defaults(pf->hw.port_info); diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.c b/drivers/net/ethernet/intel/ice/ice_dpll.c index 27b460926baced..892bc7c2e28b46 100644 --- a/drivers/net/ethernet/intel/ice/ice_dpll.c +++ b/drivers/net/ethernet/intel/ice/ice_dpll.c @@ -2523,6 +2523,8 @@ ice_dpll_rclk_state_on_pin_set(const struct dpll_pin *pin, void *pin_priv, if (hw_idx < 0) goto unlock; hw_idx -= pf->dplls.base_rclk_idx; + if (hw_idx >= ICE_DPLL_RCLK_NUM_MAX) + goto unlock; if ((enable && p->state[hw_idx] == DPLL_PIN_STATE_CONNECTED) || (!enable && p->state[hw_idx] == DPLL_PIN_STATE_DISCONNECTED)) { @@ -2586,6 +2588,9 @@ ice_dpll_rclk_state_on_pin_get(const struct dpll_pin *pin, void *pin_priv, hw_idx = ice_dpll_pin_get_parent_idx(p, parent_pin); if (hw_idx < 0) goto unlock; + hw_idx -= pf->dplls.base_rclk_idx; + if (hw_idx >= ICE_DPLL_RCLK_NUM_MAX) + goto unlock; ret = ice_dpll_pin_state_update(pf, p, ICE_DPLL_PIN_TYPE_RCLK_INPUT, extack); diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.h b/drivers/net/ethernet/intel/ice/ice_dpll.h index ae42cdea0ee145..8678575359b929 100644 --- a/drivers/net/ethernet/intel/ice/ice_dpll.h +++ b/drivers/net/ethernet/intel/ice/ice_dpll.h @@ -8,6 +8,22 @@ #define ICE_DPLL_RCLK_NUM_MAX 4 +#define ICE_CGU_R10 0x28 +#define ICE_CGU_R10_SYNCE_CLKO_SEL GENMASK(8, 5) +#define ICE_CGU_R10_SYNCE_CLKODIV_M1 GENMASK(13, 9) +#define ICE_CGU_R10_SYNCE_CLKODIV_LOAD BIT(14) +#define ICE_CGU_R10_SYNCE_DCK_RST BIT(15) +#define ICE_CGU_R10_SYNCE_ETHCLKO_SEL GENMASK(18, 16) +#define ICE_CGU_R10_SYNCE_ETHDIV_M1 GENMASK(23, 19) +#define ICE_CGU_R10_SYNCE_ETHDIV_LOAD BIT(24) +#define ICE_CGU_R10_SYNCE_DCK2_RST BIT(25) +#define ICE_CGU_R10_SYNCE_S_REF_CLK GENMASK(31, 27) + +#define ICE_CGU_R11 0x2C +#define ICE_CGU_R11_SYNCE_S_BYP_CLK GENMASK(6, 1) + +#define ICE_CGU_BYPASS_MUX_OFFSET_E825C 3 + /** * enum ice_dpll_pin_sw - enumerate ice software pin indices: * @ICE_DPLL_PIN_SW_1_IDX: index of first SW pin @@ -157,19 +173,3 @@ static inline void ice_dpll_deinit(struct ice_pf *pf) { } #endif #endif - -#define ICE_CGU_R10 0x28 -#define ICE_CGU_R10_SYNCE_CLKO_SEL GENMASK(8, 5) -#define ICE_CGU_R10_SYNCE_CLKODIV_M1 GENMASK(13, 9) -#define ICE_CGU_R10_SYNCE_CLKODIV_LOAD BIT(14) -#define ICE_CGU_R10_SYNCE_DCK_RST BIT(15) -#define ICE_CGU_R10_SYNCE_ETHCLKO_SEL GENMASK(18, 16) -#define ICE_CGU_R10_SYNCE_ETHDIV_M1 GENMASK(23, 19) -#define ICE_CGU_R10_SYNCE_ETHDIV_LOAD BIT(24) -#define ICE_CGU_R10_SYNCE_DCK2_RST BIT(25) -#define ICE_CGU_R10_SYNCE_S_REF_CLK GENMASK(31, 27) - -#define ICE_CGU_R11 0x2C -#define ICE_CGU_R11_SYNCE_S_BYP_CLK GENMASK(6, 1) - -#define ICE_CGU_BYPASS_MUX_OFFSET_E825C 3 diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 1d1947a7fe1191..c52c465280f742 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -8046,7 +8046,7 @@ int ice_set_rss_hfunc(struct ice_vsi *vsi, u8 hfunc) ctx->info.q_opt_rss |= FIELD_PREP(ICE_AQ_VSI_Q_OPT_RSS_HASH_M, hfunc); ctx->info.q_opt_tc = vsi->info.q_opt_tc; - ctx->info.q_opt_flags = vsi->info.q_opt_rss; + ctx->info.q_opt_flags = vsi->info.q_opt_flags; err = ice_update_vsi(hw, vsi->idx, ctx, NULL); if (err) { diff --git a/drivers/net/ethernet/intel/idpf/idpf_idc.c b/drivers/net/ethernet/intel/idpf/idpf_idc.c index 7e4f4ac9265377..b7d6b08fc89e89 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_idc.c +++ b/drivers/net/ethernet/intel/idpf/idpf_idc.c @@ -90,7 +90,10 @@ static int idpf_plug_vport_aux_dev(struct iidc_rdma_core_dev_info *cdev_info, return 0; err_aux_dev_add: + ida_free(&idpf_idc_ida, adev->id); + vdev_info->adev = NULL; auxiliary_device_uninit(adev); + return ret; err_aux_dev_init: ida_free(&idpf_idc_ida, adev->id); err_ida_alloc: @@ -228,7 +231,10 @@ static int idpf_plug_core_aux_dev(struct iidc_rdma_core_dev_info *cdev_info) return 0; err_aux_dev_add: + ida_free(&idpf_idc_ida, adev->id); + cdev_info->adev = NULL; auxiliary_device_uninit(adev); + return ret; err_aux_dev_init: ida_free(&idpf_idc_ida, adev->id); err_ida_alloc: diff --git a/drivers/net/ethernet/intel/idpf/idpf_ptp.c b/drivers/net/ethernet/intel/idpf/idpf_ptp.c index eec91c4f0a75a0..4a51d2727547d9 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_ptp.c +++ b/drivers/net/ethernet/intel/idpf/idpf_ptp.c @@ -952,6 +952,8 @@ int idpf_ptp_init(struct idpf_adapter *adapter) goto free_ptp; } + spin_lock_init(&adapter->ptp->read_dev_clk_lock); + err = idpf_ptp_create_clock(adapter); if (err) goto free_ptp; @@ -977,8 +979,6 @@ int idpf_ptp_init(struct idpf_adapter *adapter) goto remove_clock; } - spin_lock_init(&adapter->ptp->read_dev_clk_lock); - pci_dbg(adapter->pdev, "PTP init successful\n"); return 0; diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c index 3debf2fae1a48e..6f13296303cb54 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c @@ -249,34 +249,21 @@ DEFINE_SHOW_ATTRIBUTE(npc_defrag); int npc_cn20k_debugfs_init(struct rvu *rvu) { struct npc_priv_t *npc_priv = npc_priv_get(); - struct dentry *npc_dentry; - npc_dentry = debugfs_create_file("mcam_layout", 0444, rvu->rvu_dbg.npc, - npc_priv, &npc_mcam_layout_fops); + debugfs_create_file("mcam_layout", 0444, rvu->rvu_dbg.npc, + npc_priv, &npc_mcam_layout_fops); - if (!npc_dentry) - return -EFAULT; + debugfs_create_file("mcam_default", 0444, rvu->rvu_dbg.npc, + rvu, &npc_mcam_default_fops); - npc_dentry = debugfs_create_file("mcam_default", 0444, rvu->rvu_dbg.npc, - rvu, &npc_mcam_default_fops); + debugfs_create_file("vidx2idx", 0444, rvu->rvu_dbg.npc, + npc_priv, &npc_vidx2idx_map_fops); - if (!npc_dentry) - return -EFAULT; + debugfs_create_file("idx2vidx", 0444, rvu->rvu_dbg.npc, + npc_priv, &npc_idx2vidx_map_fops); - npc_dentry = debugfs_create_file("vidx2idx", 0444, rvu->rvu_dbg.npc, - npc_priv, &npc_vidx2idx_map_fops); - if (!npc_dentry) - return -EFAULT; - - npc_dentry = debugfs_create_file("idx2vidx", 0444, rvu->rvu_dbg.npc, - npc_priv, &npc_idx2vidx_map_fops); - if (!npc_dentry) - return -EFAULT; - - npc_dentry = debugfs_create_file("defrag", 0444, rvu->rvu_dbg.npc, - npc_priv, &npc_defrag_fops); - if (!npc_dentry) - return -EFAULT; + debugfs_create_file("defrag", 0444, rvu->rvu_dbg.npc, + npc_priv, &npc_defrag_fops); return 0; } diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c index 7291fdb89b03f6..6b3f453fd5004e 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c @@ -798,7 +798,7 @@ void npc_cn20k_load_mkex_profile(struct rvu *rvu, int blkaddr, iounmap(mkex_prfl_addr); } -void +int npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr, int index, bool enable) { @@ -808,7 +808,12 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr, u64 cfg, hw_prio; u8 kw_type; - npc_mcam_idx_2_key_type(rvu, index, &kw_type); + if (index < 0 || index >= mcam->total_entries) + return -EINVAL; + + if (npc_mcam_idx_2_key_type(rvu, index, &kw_type)) + return -EINVAL; + if (kw_type == NPC_MCAM_KEY_X2) { cfg = rvu_read64(rvu, blkaddr, NPC_AF_CN20K_MCAMEX_BANKX_CFG_EXT(mcam_idx, @@ -819,7 +824,7 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr, rvu_write64(rvu, blkaddr, NPC_AF_CN20K_MCAMEX_BANKX_CFG_EXT(mcam_idx, bank), cfg); - return; + return 0; } /* For NPC_CN20K_MCAM_KEY_X4 keys, both the banks @@ -836,10 +841,12 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr, NPC_AF_CN20K_MCAMEX_BANKX_CFG_EXT(mcam_idx, bank), cfg); } + + return 0; } -void -npc_cn20k_clear_mcam_entry(struct rvu *rvu, int blkaddr, int bank, int index) +static void +npc_clear_x2_entry(struct rvu *rvu, int blkaddr, int bank, int index) { rvu_write64(rvu, blkaddr, NPC_AF_CN20K_MCAMEX_BANKX_CAMX_INTF_EXT(index, bank, 1), @@ -873,6 +880,33 @@ npc_cn20k_clear_mcam_entry(struct rvu *rvu, int blkaddr, int bank, int index) NPC_AF_CN20K_MCAMEX_BANKX_STAT_EXT(index, bank), 0); } +int +npc_cn20k_clear_mcam_entry(struct rvu *rvu, int blkaddr, int mcam_idx) +{ + struct npc_mcam *mcam = &rvu->hw->mcam; + int bank = npc_get_bank(mcam, mcam_idx); + u8 kw_type; + int index; + + if (npc_mcam_idx_2_key_type(rvu, mcam_idx, &kw_type)) + return -EINVAL; + + index = mcam_idx & (mcam->banksize - 1); + + if (kw_type == NPC_MCAM_KEY_X2) { + npc_clear_x2_entry(rvu, blkaddr, bank, index); + return 0; + } + + /* For NPC_MCAM_KEY_X4 keys, both the banks + * need to be programmed with the same value. + */ + for (bank = 0; bank < mcam->banks_per_entry; bank++) + npc_clear_x2_entry(rvu, blkaddr, bank, index); + + return 0; +} + static void npc_cn20k_get_keyword(struct cn20k_mcam_entry *entry, int idx, u64 *cam0, u64 *cam1) { @@ -1014,48 +1048,27 @@ static void npc_cn20k_config_kw_x4(struct rvu *rvu, struct npc_mcam *mcam, kw, req_kw_type); } -static void -npc_cn20k_set_mcam_bank_cfg(struct rvu *rvu, int blkaddr, int mcam_idx, - int bank, u8 kw_type, bool enable, u8 hw_prio) -{ - struct npc_mcam *mcam = &rvu->hw->mcam; - u64 bank_cfg; - - bank_cfg = (u64)hw_prio << 24; - if (enable) - bank_cfg |= 0x1; - - if (kw_type == NPC_MCAM_KEY_X2) { - rvu_write64(rvu, blkaddr, - NPC_AF_CN20K_MCAMEX_BANKX_CFG_EXT(mcam_idx, bank), - bank_cfg); - return; - } - - /* For NPC_MCAM_KEY_X4 keys, both the banks - * need to be programmed with the same value. - */ - for (bank = 0; bank < mcam->banks_per_entry; bank++) { - rvu_write64(rvu, blkaddr, - NPC_AF_CN20K_MCAMEX_BANKX_CFG_EXT(mcam_idx, bank), - bank_cfg); - } -} - -void npc_cn20k_config_mcam_entry(struct rvu *rvu, int blkaddr, int index, - u8 intf, struct cn20k_mcam_entry *entry, - bool enable, u8 hw_prio, u8 req_kw_type) +int npc_cn20k_config_mcam_entry(struct rvu *rvu, int blkaddr, int index, + u8 intf, struct cn20k_mcam_entry *entry, + bool enable, u8 hw_prio, u8 req_kw_type) { struct npc_mcam *mcam = &rvu->hw->mcam; int mcam_idx = index % mcam->banksize; int bank = index / mcam->banksize; + u64 bank_cfg = (u64)hw_prio << 24; int kw = 0; u8 kw_type; + if (index < 0 || index >= mcam->total_entries) + return -EINVAL; + + if (npc_mcam_idx_2_key_type(rvu, index, &kw_type)) + return -EINVAL; + /* Disable before mcam entry update */ - npc_cn20k_enable_mcam_entry(rvu, blkaddr, index, false); + if (npc_cn20k_enable_mcam_entry(rvu, blkaddr, index, false)) + return -EINVAL; - npc_mcam_idx_2_key_type(rvu, index, &kw_type); /* CAM1 takes the comparison value and * CAM0 specifies match for a bit in key being '0' or '1' or 'dontcare'. * CAM1 = 0 & CAM0 = 1 => match if key = 0 @@ -1064,7 +1077,7 @@ void npc_cn20k_config_mcam_entry(struct rvu *rvu, int blkaddr, int index, */ if (kw_type == NPC_MCAM_KEY_X2) { /* Clear mcam entry to avoid writes being suppressed by NPC */ - npc_cn20k_clear_mcam_entry(rvu, blkaddr, bank, mcam_idx); + npc_clear_x2_entry(rvu, blkaddr, bank, mcam_idx); npc_cn20k_config_kw_x2(rvu, mcam, blkaddr, mcam_idx, intf, entry, bank, kw_type, kw, req_kw_type); @@ -1085,44 +1098,55 @@ void npc_cn20k_config_mcam_entry(struct rvu *rvu, int blkaddr, int index, NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(mcam_idx, bank, 1), entry->vtag_action); - goto set_cfg; - } - - /* Clear mcam entry to avoid writes being suppressed by NPC */ - npc_cn20k_clear_mcam_entry(rvu, blkaddr, 0, mcam_idx); - npc_cn20k_clear_mcam_entry(rvu, blkaddr, 1, mcam_idx); - npc_cn20k_config_kw_x4(rvu, mcam, blkaddr, - mcam_idx, intf, entry, - kw_type, req_kw_type); - for (bank = 0; bank < mcam->banks_per_entry; bank++) { - /* Set 'action' */ + /* Set HW priority */ rvu_write64(rvu, blkaddr, - NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(mcam_idx, - bank, 0), - entry->action); + NPC_AF_CN20K_MCAMEX_BANKX_CFG_EXT(mcam_idx, bank), + bank_cfg); - /* Set TAG 'action' */ - rvu_write64(rvu, blkaddr, - NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(mcam_idx, - bank, 1), - entry->vtag_action); + } else { + /* Clear mcam entry to avoid writes being suppressed by NPC */ + npc_clear_x2_entry(rvu, blkaddr, 0, mcam_idx); + npc_clear_x2_entry(rvu, blkaddr, 1, mcam_idx); - /* Set 'action2' for inline receive */ - rvu_write64(rvu, blkaddr, - NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(mcam_idx, - bank, 2), - entry->action2); + npc_cn20k_config_kw_x4(rvu, mcam, blkaddr, + mcam_idx, intf, entry, + kw_type, req_kw_type); + for (bank = 0; bank < mcam->banks_per_entry; bank++) { + /* Set 'action' */ + rvu_write64(rvu, blkaddr, + NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(mcam_idx, + bank, 0), + entry->action); + + /* Set TAG 'action' */ + rvu_write64(rvu, blkaddr, + NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(mcam_idx, + bank, 1), + entry->vtag_action); + + /* Set 'action2' for inline receive */ + rvu_write64(rvu, blkaddr, + NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(mcam_idx, + bank, 2), + entry->action2); + + /* Set HW priority */ + rvu_write64(rvu, blkaddr, + NPC_AF_CN20K_MCAMEX_BANKX_CFG_EXT(mcam_idx, bank), + bank_cfg); + } } -set_cfg: /* TODO: */ /* PF installing VF rule */ - npc_cn20k_set_mcam_bank_cfg(rvu, blkaddr, mcam_idx, bank, - kw_type, enable, hw_prio); + if (npc_cn20k_enable_mcam_entry(rvu, blkaddr, index, enable)) + return -EINVAL; + + return 0; } -void npc_cn20k_copy_mcam_entry(struct rvu *rvu, int blkaddr, u16 src, u16 dest) +int npc_cn20k_copy_mcam_entry(struct rvu *rvu, int blkaddr, u16 src, u16 dest) { struct npc_mcam *mcam = &rvu->hw->mcam; u64 cfg, sreg, dreg, soff, doff; @@ -1130,12 +1154,20 @@ void npc_cn20k_copy_mcam_entry(struct rvu *rvu, int blkaddr, u16 src, u16 dest) int bank, i, sb, db; int dbank, sbank; + if (src >= mcam->total_entries || dest >= mcam->total_entries) + return -EINVAL; + dbank = npc_get_bank(mcam, dest); sbank = npc_get_bank(mcam, src); - npc_mcam_idx_2_key_type(rvu, src, &src_kwtype); - npc_mcam_idx_2_key_type(rvu, dest, &dest_kwtype); + + if (npc_mcam_idx_2_key_type(rvu, src, &src_kwtype)) + return -EINVAL; + + if (npc_mcam_idx_2_key_type(rvu, dest, &dest_kwtype)) + return -EINVAL; + if (src_kwtype != dest_kwtype) - return; + return -EINVAL; src &= (mcam->banksize - 1); dest &= (mcam->banksize - 1); @@ -1170,6 +1202,8 @@ void npc_cn20k_copy_mcam_entry(struct rvu *rvu, int blkaddr, u16 src, u16 dest) if (src_kwtype == NPC_MCAM_KEY_X2) break; } + + return 0; } static void npc_cn20k_fill_entryword(struct cn20k_mcam_entry *entry, int idx, @@ -1179,20 +1213,36 @@ static void npc_cn20k_fill_entryword(struct cn20k_mcam_entry *entry, int idx, entry->kw_mask[idx] = cam1 ^ cam0; } -void npc_cn20k_read_mcam_entry(struct rvu *rvu, int blkaddr, u16 index, - struct cn20k_mcam_entry *entry, - u8 *intf, u8 *ena, u8 *hw_prio) +int npc_cn20k_read_mcam_entry(struct rvu *rvu, int blkaddr, u16 index, + struct cn20k_mcam_entry *entry, + u8 *intf, u8 *ena, u8 *hw_prio) { struct npc_mcam *mcam = &rvu->hw->mcam; u64 cam0, cam1, bank_cfg, cfg; int kw = 0, bank; u8 kw_type; - npc_mcam_idx_2_key_type(rvu, index, &kw_type); + if (index >= mcam->total_entries) + return -EINVAL; + + if (npc_mcam_idx_2_key_type(rvu, index, &kw_type)) + return -EINVAL; bank = npc_get_bank(mcam, index); index &= (mcam->banksize - 1); + cfg = rvu_read64(rvu, blkaddr, + NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(index, bank, 0)); + entry->action = cfg; + + cfg = rvu_read64(rvu, blkaddr, + NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(index, bank, 1)); + entry->vtag_action = cfg; + + cfg = rvu_read64(rvu, blkaddr, + NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(index, bank, 2)); + entry->action2 = cfg; + cfg = rvu_read64(rvu, blkaddr, NPC_AF_CN20K_MCAMEX_BANKX_CAMX_INTF_EXT(index, bank, 1)) & 3; @@ -1242,7 +1292,7 @@ void npc_cn20k_read_mcam_entry(struct rvu *rvu, int blkaddr, u16 index, bank, 0)); npc_cn20k_fill_entryword(entry, kw + 3, cam0, cam1); - goto read_action; + return 0; } for (bank = 0; bank < mcam->banks_per_entry; bank++, kw = kw + 4) { @@ -1287,17 +1337,7 @@ void npc_cn20k_read_mcam_entry(struct rvu *rvu, int blkaddr, u16 index, npc_cn20k_fill_entryword(entry, kw + 3, cam0, cam1); } -read_action: - /* 'action' is set to same value for both bank '0' and '1'. - * Hence, reading bank '0' should be enough. - */ - cfg = rvu_read64(rvu, blkaddr, - NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(index, 0, 0)); - entry->action = cfg; - - cfg = rvu_read64(rvu, blkaddr, - NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(index, 0, 1)); - entry->vtag_action = cfg; + return 0; } int rvu_mbox_handler_npc_cn20k_mcam_write_entry(struct rvu *rvu, @@ -1335,11 +1375,10 @@ int rvu_mbox_handler_npc_cn20k_mcam_write_entry(struct rvu *rvu, if (is_pffunc_af(req->hdr.pcifunc)) nix_intf = req->intf; - npc_cn20k_config_mcam_entry(rvu, blkaddr, req->entry, nix_intf, - &req->entry_data, req->enable_entry, - req->hw_prio, req->req_kw_type); + rc = npc_cn20k_config_mcam_entry(rvu, blkaddr, req->entry, nix_intf, + &req->entry_data, req->enable_entry, + req->hw_prio, req->req_kw_type); - rc = 0; exit: mutex_unlock(&mcam->lock); return rc; @@ -1361,11 +1400,13 @@ int rvu_mbox_handler_npc_cn20k_mcam_read_entry(struct rvu *rvu, mutex_lock(&mcam->lock); rc = npc_mcam_verify_entry(mcam, pcifunc, req->entry); - if (!rc) - npc_cn20k_read_mcam_entry(rvu, blkaddr, req->entry, - &rsp->entry_data, &rsp->intf, - &rsp->enable, &rsp->hw_prio); + if (rc) + goto fail; + rc = npc_cn20k_read_mcam_entry(rvu, blkaddr, req->entry, + &rsp->entry_data, &rsp->intf, + &rsp->enable, &rsp->hw_prio); +fail: mutex_unlock(&mcam->lock); return rc; } @@ -1375,11 +1416,13 @@ int rvu_mbox_handler_npc_cn20k_mcam_alloc_and_write_entry(struct rvu *rvu, struct npc_mcam_alloc_and_write_entry_rsp *rsp) { struct rvu_pfvf *pfvf = rvu_get_pfvf(rvu, req->hdr.pcifunc); + struct npc_mcam_free_entry_req free_req = { 0 }; struct npc_mcam_alloc_entry_req entry_req; struct npc_mcam_alloc_entry_rsp entry_rsp; struct npc_mcam *mcam = &rvu->hw->mcam; u16 entry = NPC_MCAM_ENTRY_INVALID; - int blkaddr, rc; + struct msg_rsp free_rsp; + int blkaddr, rc, err; u8 nix_intf; blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0); @@ -1415,12 +1458,23 @@ int rvu_mbox_handler_npc_cn20k_mcam_alloc_and_write_entry(struct rvu *rvu, else nix_intf = pfvf->nix_rx_intf; - npc_cn20k_config_mcam_entry(rvu, blkaddr, entry, nix_intf, - &req->entry_data, req->enable_entry, - req->hw_prio, req->req_kw_type); + rc = npc_cn20k_config_mcam_entry(rvu, blkaddr, entry, nix_intf, + &req->entry_data, req->enable_entry, + req->hw_prio, req->req_kw_type); mutex_unlock(&mcam->lock); + if (rc) { + free_req.hdr.pcifunc = req->hdr.pcifunc; + free_req.entry = entry_rsp.entry; + err = rvu_mbox_handler_npc_mcam_free_entry(rvu, &free_req, &free_rsp); + if (err) + dev_err(rvu->dev, + "%s: Error to free mcam idx %u\n", + __func__, entry_rsp.entry); + return rc; + } + rsp->entry = entry_rsp.entry; return 0; } @@ -1480,9 +1534,9 @@ int rvu_mbox_handler_npc_cn20k_read_base_steer_rule(struct rvu *rvu, read_entry: /* Read the mcam entry */ - npc_cn20k_read_mcam_entry(rvu, blkaddr, index, - &rsp->entry, &intf, - &enable, &hw_prio); + rc = npc_cn20k_read_mcam_entry(rvu, blkaddr, index, + &rsp->entry, &intf, + &enable, &hw_prio); mutex_unlock(&mcam->lock); out: return rc; @@ -2305,6 +2359,7 @@ static int __npc_subbank_alloc(struct rvu *rvu, struct npc_subbank *sb, __npc_subbank_mark_free(rvu, sb); err1: kfree(save); + *alloc_cnt = 0; return rc; } @@ -3482,7 +3537,7 @@ static int npc_defrag_alloc_free_slots(struct rvu *rvu, { int alloc_cnt1, alloc_cnt2; struct npc_subbank *sb; - int rc, sb_off, i; + int rc, sb_off, i, err; bool deleted; sb = &npc_priv.sb[f->idx]; @@ -3496,6 +3551,7 @@ static int npc_defrag_alloc_free_slots(struct rvu *rvu, NPC_MCAM_LOWER_PRIO, false, cnt, save, cnt, true, &alloc_cnt1); + if (alloc_cnt1 < cnt) { rc = __npc_subbank_alloc(rvu, sb, NPC_MCAM_KEY_X2, sb->b1b, @@ -3511,15 +3567,17 @@ static int npc_defrag_alloc_free_slots(struct rvu *rvu, dev_err(rvu->dev, "%s: Failed to alloc cnt=%u alloc_cnt1=%u alloc_cnt2=%u\n", __func__, cnt, alloc_cnt1, alloc_cnt2); + rc = -ENOSPC; goto fail_free_alloc; } + return 0; fail_free_alloc: for (i = 0; i < alloc_cnt1 + alloc_cnt2; i++) { - rc = npc_mcam_idx_2_subbank_idx(rvu, save[i], - &sb, &sb_off); - if (rc) { + err = npc_mcam_idx_2_subbank_idx(rvu, save[i], + &sb, &sb_off); + if (err) { dev_err(rvu->dev, "%s: Error to find subbank for mcam idx=%u\n", __func__, save[i]); @@ -3565,9 +3623,10 @@ int npc_defrag_move_vdx_to_free(struct rvu *rvu, struct npc_defrag_node *v, int cnt, u16 *save) { + u16 new_midx, old_midx, vidx, target_pf; struct npc_mcam *mcam = &rvu->hw->mcam; + struct rvu_npc_mcam_rule *rule, *tmp; int i, vidx_cnt, rc, sb_off; - u16 new_midx, old_midx, vidx; struct npc_subbank *sb; bool deleted; u16 pcifunc; @@ -3607,9 +3666,30 @@ int npc_defrag_move_vdx_to_free(struct rvu *rvu, NPC_AF_CN20K_MCAMEX_BANKX_STAT_EXT(midx, bank)); - npc_cn20k_enable_mcam_entry(rvu, blkaddr, old_midx, false); - npc_cn20k_copy_mcam_entry(rvu, blkaddr, old_midx, new_midx); - npc_cn20k_enable_mcam_entry(rvu, blkaddr, new_midx, true); + /* If bug happened during copy/enable mcam, then there is a bug in allocation + * algorithm itself. There is no point in rewinding and returning, as it + * will face further issue. Return error after printing error + */ + if (npc_cn20k_enable_mcam_entry(rvu, blkaddr, old_midx, false)) { + dev_err(rvu->dev, + "%s: Error happened while disabling old_mid=%u\n", + __func__, old_midx); + return -EFAULT; + } + + if (npc_cn20k_copy_mcam_entry(rvu, blkaddr, old_midx, new_midx)) { + dev_err(rvu->dev, + "%s: Error happened while copying old_midx=%u new_midx=%u\n", + __func__, old_midx, new_midx); + return -EFAULT; + } + + if (npc_cn20k_enable_mcam_entry(rvu, blkaddr, new_midx, true)) { + dev_err(rvu->dev, + "%s: Error happened while enabling new_mid=%u\n", + __func__, new_midx); + return -EFAULT; + } midx = new_midx % mcam->banksize; bank = new_midx / mcam->banksize; @@ -3665,8 +3745,21 @@ int npc_defrag_move_vdx_to_free(struct rvu *rvu, mcam->entry2pfvf_map[new_midx] = pcifunc; /* Counter is not preserved */ mcam->entry2cntr_map[new_midx] = new_midx; + target_pf = mcam->entry2target_pffunc[old_midx]; + mcam->entry2target_pffunc[new_midx] = target_pf; + mcam->entry2target_pffunc[old_midx] = NPC_MCAM_INVALID_MAP; + npc_mcam_set_bit(mcam, new_midx); + /* Note: list order is not functionally required for mcam_rules */ + list_for_each_entry_safe(rule, tmp, &mcam->mcam_rules, list) { + if (rule->entry != old_midx) + continue; + + rule->entry = new_midx; + break; + } + /* Mark as invalid */ v->vidx[vidx_cnt - i - 1] = -1; save[cnt - i - 1] = -1; @@ -3935,6 +4028,13 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast, void *val; int i, j; + for (i = 0; i < ARRAY_SIZE(ptr); i++) { + if (!ptr[i]) + continue; + + *ptr[i] = USHRT_MAX; + } + if (!npc_priv.init_done) return 0; @@ -3950,7 +4050,6 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast, npc_dft_rule_name[NPC_DFT_RULE_PROMISC_ID], pcifunc); - *ptr[0] = USHRT_MAX; return -ESRCH; } @@ -3970,7 +4069,6 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast, npc_dft_rule_name[NPC_DFT_RULE_UCAST_ID], pcifunc); - *ptr[3] = USHRT_MAX; return -ESRCH; } @@ -3990,7 +4088,6 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast, __func__, npc_dft_rule_name[i], pcifunc); - *ptr[j] = USHRT_MAX; continue; } @@ -4085,7 +4182,7 @@ int rvu_mbox_handler_npc_get_dft_rl_idxs(struct rvu *rvu, struct msg_req *req, return 0; } -static bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc) +bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc) { return is_pf_cgxmapped(rvu, rvu_get_pf(rvu->pdev, pcifunc)) || is_lbk_vf(rvu, pcifunc); @@ -4093,11 +4190,11 @@ static bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc) void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc) { - struct npc_mcam_free_entry_req free_req = { 0 }; + struct npc_mcam *mcam = &rvu->hw->mcam; + u16 ptr[4] = {[0 ... 3] = USHRT_MAX}; + struct rvu_npc_mcam_rule *rule, *tmp; unsigned long index; - struct msg_rsp rsp; - u16 ptr[4]; - int rc, i; + int blkaddr, rc, i; void *map; if (!npc_priv.init_done) @@ -4155,14 +4252,43 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc) } free_rules: + blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0); + if (blkaddr < 0) + return; + for (int i = 0; i < 4; i++) { + if (ptr[i] == USHRT_MAX) + continue; - free_req.hdr.pcifunc = pcifunc; - free_req.all = 1; - rc = rvu_mbox_handler_npc_mcam_free_entry(rvu, &free_req, &rsp); - if (rc) - dev_err(rvu->dev, - "%s: Error deleting default entries (pcifunc=%#x\n", - __func__, pcifunc); + mutex_lock(&mcam->lock); + npc_mcam_clear_bit(mcam, ptr[i]); + mcam->entry2pfvf_map[ptr[i]] = NPC_MCAM_INVALID_MAP; + npc_cn20k_enable_mcam_entry(rvu, blkaddr, ptr[i], false); + mcam->entry2target_pffunc[ptr[i]] = 0x0; + mutex_unlock(&mcam->lock); + + rc = npc_cn20k_idx_free(rvu, &ptr[i], 1); + if (rc) { + /* Non recoverable error. Let us WARN and return. Keep system alive to + * enable debugging + */ + WARN(1, "%s Error deleting default entries (pcifunc=%#x) mcam_idx=%u\n", + __func__, pcifunc, ptr[i]); + return; + } + } + + mutex_lock(&mcam->lock); + list_for_each_entry_safe(rule, tmp, &mcam->mcam_rules, list) { + for (int i = 0; i < 4; i++) { + if (ptr[i] != rule->entry) + continue; + + list_del(&rule->list); + kfree(rule); + break; + } + } + mutex_unlock(&mcam->lock); } int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc) diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h index 815d0b257a7e11..3d5eb952cc0736 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h @@ -320,21 +320,21 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc); int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast, u16 *mcast, u16 *promisc, u16 *ucast); -void npc_cn20k_config_mcam_entry(struct rvu *rvu, int blkaddr, int index, - u8 intf, struct cn20k_mcam_entry *entry, - bool enable, u8 hw_prio, u8 req_kw_type); -void npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr, - int index, bool enable); -void npc_cn20k_copy_mcam_entry(struct rvu *rvu, int blkaddr, - u16 src, u16 dest); -void npc_cn20k_read_mcam_entry(struct rvu *rvu, int blkaddr, u16 index, - struct cn20k_mcam_entry *entry, u8 *intf, - u8 *ena, u8 *hw_prio); -void npc_cn20k_clear_mcam_entry(struct rvu *rvu, int blkaddr, - int bank, int index); +int npc_cn20k_config_mcam_entry(struct rvu *rvu, int blkaddr, int index, + u8 intf, struct cn20k_mcam_entry *entry, + bool enable, u8 hw_prio, u8 req_kw_type); +int npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr, + int index, bool enable); +int npc_cn20k_copy_mcam_entry(struct rvu *rvu, int blkaddr, + u16 src, u16 dest); +int npc_cn20k_read_mcam_entry(struct rvu *rvu, int blkaddr, u16 index, + struct cn20k_mcam_entry *entry, u8 *intf, + u8 *ena, u8 *hw_prio); +int npc_cn20k_clear_mcam_entry(struct rvu *rvu, int blkaddr, int index); int npc_mcam_idx_2_key_type(struct rvu *rvu, u16 mcam_idx, u8 *key_type); u16 npc_cn20k_vidx2idx(u16 index); u16 npc_cn20k_idx2vidx(u16 idx); int npc_cn20k_defrag(struct rvu *rvu); +bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc); #endif /* NPC_CN20K_H */ diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c index ef5b081162ebfb..f977734ae712c6 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c @@ -3577,6 +3577,9 @@ static int nix_update_mce_rule(struct rvu *rvu, u16 pcifunc, mcam_index = npc_get_nixlf_mcam_index(mcam, pcifunc & ~RVU_PFVF_FUNC_MASK, nixlf, type); + if (mcam_index < 0) + return -EINVAL; + err = nix_update_mce_list(rvu, pcifunc, mce_list, mce_idx, mcam_index, add); return err; diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c index c2ca5ed1d028a4..3c814d157ab988 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c @@ -163,14 +163,35 @@ int npc_get_nixlf_mcam_index(struct npc_mcam *mcam, if (rc) return -EFAULT; + if (is_lbk_vf(rvu, pcifunc)) { + if (promisc == USHRT_MAX) + return -EINVAL; + return promisc; + } + + if (is_cgx_vf(rvu, pcifunc)) { + if (ucast == USHRT_MAX) + return -EINVAL; + + return ucast; + } + switch (type) { case NIXLF_BCAST_ENTRY: + if (bcast == USHRT_MAX) + return -EINVAL; return bcast; case NIXLF_ALLMULTI_ENTRY: + if (mcast == USHRT_MAX) + return -EINVAL; return mcast; case NIXLF_PROMISC_ENTRY: + if (promisc == USHRT_MAX) + return -EINVAL; return promisc; case NIXLF_UCAST_ENTRY: + if (ucast == USHRT_MAX) + return -EINVAL; return ucast; default: return -EINVAL; @@ -238,10 +259,10 @@ void npc_enable_mcam_entry(struct rvu *rvu, struct npc_mcam *mcam, int actbank = bank; if (is_cn20k(rvu->pdev)) { - if (index < 0 || index >= mcam->banksize * mcam->banks) - return; - - return npc_cn20k_enable_mcam_entry(rvu, blkaddr, index, enable); + if (npc_cn20k_enable_mcam_entry(rvu, blkaddr, index, enable)) + dev_err(rvu->dev, "Error to %s mcam %u entry\n", + enable ? "enable" : "disable", index); + return; } index &= (mcam->banksize - 1); @@ -258,6 +279,13 @@ static void npc_clear_mcam_entry(struct rvu *rvu, struct npc_mcam *mcam, int bank = npc_get_bank(mcam, index); int actbank = bank; + if (is_cn20k(rvu->pdev)) { + if (npc_cn20k_clear_mcam_entry(rvu, blkaddr, index)) + dev_err(rvu->dev, "%s Failed to clear mcam %u\n", + __func__, index); + return; + } + index &= (mcam->banksize - 1); for (; bank < (actbank + mcam->banks_per_entry); bank++) { rvu_write64(rvu, blkaddr, @@ -424,6 +452,15 @@ static u64 npc_get_default_entry_action(struct rvu *rvu, struct npc_mcam *mcam, index = npc_get_nixlf_mcam_index(mcam, pf_func, nixlf, NIXLF_UCAST_ENTRY); + + if (index < 0) { + dev_err(rvu->dev, + "%s: failed to get ucast entry pcifunc:0x%x\n", + __func__, pf_func); + /* Action 0 is drop */ + return 0; + } + bank = npc_get_bank(mcam, index); index &= (mcam->banksize - 1); @@ -589,8 +626,8 @@ void npc_read_mcam_entry(struct rvu *rvu, struct npc_mcam *mcam, NPC_AF_MCAMEX_BANKX_CFG(src, sbank)) & 1; } -static void npc_copy_mcam_entry(struct rvu *rvu, struct npc_mcam *mcam, - int blkaddr, u16 src, u16 dest) +static int npc_copy_mcam_entry(struct rvu *rvu, struct npc_mcam *mcam, + int blkaddr, u16 src, u16 dest) { int dbank = npc_get_bank(mcam, dest); int sbank = npc_get_bank(mcam, src); @@ -630,6 +667,7 @@ static void npc_copy_mcam_entry(struct rvu *rvu, struct npc_mcam *mcam, NPC_AF_MCAMEX_BANKX_CFG(src, sbank)); rvu_write64(rvu, blkaddr, NPC_AF_MCAMEX_BANKX_CFG(dest, dbank), cfg); + return 0; } u64 npc_get_mcam_action(struct rvu *rvu, struct npc_mcam *mcam, @@ -689,6 +727,12 @@ void rvu_npc_install_ucast_entry(struct rvu *rvu, u16 pcifunc, index = npc_get_nixlf_mcam_index(mcam, pcifunc, nixlf, NIXLF_UCAST_ENTRY); + if (index < 0) { + dev_err(rvu->dev, + "%s: Error to get ucast entry for pcifunc=%#x\n", + __func__, pcifunc); + return; + } /* Don't change the action if entry is already enabled * Otherwise RSS action may get overwritten. @@ -744,16 +788,38 @@ void rvu_npc_install_promisc_entry(struct rvu *rvu, u16 pcifunc, index = npc_get_nixlf_mcam_index(mcam, pcifunc, nixlf, NIXLF_PROMISC_ENTRY); + /* In cn20k, default indexes are installed only for CGX mapped + * and lbk interfaces + */ if (is_cgx_vf(rvu, pcifunc)) index = npc_get_nixlf_mcam_index(mcam, pcifunc & ~RVU_PFVF_FUNC_MASK, nixlf, NIXLF_PROMISC_ENTRY); + if (index < 0) { + dev_err(rvu->dev, + "%s: Error to get promisc entry for pcifunc=%#x\n", + __func__, pcifunc); + return; + } + /* If the corresponding PF's ucast action is RSS, * use the same action for promisc also + * Please note that for lbk(s) "index" and "ucast_idx" + * will be same. */ - ucast_idx = npc_get_nixlf_mcam_index(mcam, pcifunc, - nixlf, NIXLF_UCAST_ENTRY); + if (is_lbk_vf(rvu, pcifunc)) + ucast_idx = index; + else + ucast_idx = npc_get_nixlf_mcam_index(mcam, pcifunc, + nixlf, NIXLF_UCAST_ENTRY); + if (ucast_idx < 0) { + dev_err(rvu->dev, + "%s: Error to get ucast/promisc entry for pcifunc=%#x\n", + __func__, pcifunc); + return; + } + if (is_mcam_entry_enabled(rvu, mcam, blkaddr, ucast_idx)) *(u64 *)&action = npc_get_mcam_action(rvu, mcam, blkaddr, ucast_idx); @@ -827,6 +893,14 @@ void rvu_npc_enable_promisc_entry(struct rvu *rvu, u16 pcifunc, index = npc_get_nixlf_mcam_index(mcam, pcifunc, nixlf, NIXLF_PROMISC_ENTRY); + + if (index < 0) { + dev_err(rvu->dev, + "%s: Error to get promisc entry for pcifunc=%#x\n", + __func__, pcifunc); + return; + } + npc_enable_mcam_entry(rvu, mcam, blkaddr, index, enable); } @@ -867,6 +941,12 @@ void rvu_npc_install_bcast_match_entry(struct rvu *rvu, u16 pcifunc, index = npc_get_nixlf_mcam_index(mcam, pcifunc, nixlf, NIXLF_BCAST_ENTRY); + if (index < 0) { + dev_err(rvu->dev, + "%s: Error to get bcast entry for pcifunc=%#x\n", + __func__, pcifunc); + return; + } if (!hw->cap.nix_rx_multicast) { /* Early silicon doesn't support pkt replication, @@ -931,12 +1011,25 @@ void rvu_npc_install_allmulti_entry(struct rvu *rvu, u16 pcifunc, int nixlf, index = npc_get_nixlf_mcam_index(mcam, pcifunc, nixlf, NIXLF_ALLMULTI_ENTRY); + if (index < 0) { + dev_err(rvu->dev, + "%s: Error to get mcast entry for pcifunc=%#x\n", + __func__, pcifunc); + return; + } /* If the corresponding PF's ucast action is RSS, * use the same action for multicast entry also */ ucast_idx = npc_get_nixlf_mcam_index(mcam, pcifunc, nixlf, NIXLF_UCAST_ENTRY); + if (ucast_idx < 0) { + dev_err(rvu->dev, + "%s: Error to get ucast entry for pcifunc=%#x\n", + __func__, pcifunc); + return; + } + if (is_mcam_entry_enabled(rvu, mcam, blkaddr, ucast_idx)) *(u64 *)&action = npc_get_mcam_action(rvu, mcam, blkaddr, ucast_idx); @@ -1001,6 +1094,13 @@ void rvu_npc_enable_allmulti_entry(struct rvu *rvu, u16 pcifunc, int nixlf, index = npc_get_nixlf_mcam_index(mcam, pcifunc, nixlf, NIXLF_ALLMULTI_ENTRY); + if (index < 0) { + dev_err(rvu->dev, + "%s: Error to get mcast entry for pcifunc=%#x\n", + __func__, pcifunc); + return; + } + npc_enable_mcam_entry(rvu, mcam, blkaddr, index, enable); } @@ -1113,8 +1213,12 @@ void rvu_npc_update_flowkey_alg_idx(struct rvu *rvu, u16 pcifunc, int nixlf, index = mcam_index; } - if (index >= mcam->total_entries) + if (index < 0 || index >= mcam->total_entries) { + dev_err(rvu->dev, + "%s: Invalid mcam index, pcifunc=%#x\n", + __func__, pcifunc); return; + } bank = npc_get_bank(mcam, index); index &= (mcam->banksize - 1); @@ -1158,16 +1262,18 @@ void rvu_npc_update_flowkey_alg_idx(struct rvu *rvu, u16 pcifunc, int nixlf, /* If PF's promiscuous entry is enabled, * Set RSS action for that entry as well */ - npc_update_rx_action_with_alg_idx(rvu, action, pfvf, index, - blkaddr, alg_idx); + if (index >= 0) + npc_update_rx_action_with_alg_idx(rvu, action, pfvf, index, + blkaddr, alg_idx); index = npc_get_nixlf_mcam_index(mcam, pcifunc, nixlf, NIXLF_ALLMULTI_ENTRY); /* If PF's allmulti entry is enabled, * Set RSS action for that entry as well */ - npc_update_rx_action_with_alg_idx(rvu, action, pfvf, index, - blkaddr, alg_idx); + if (index >= 0) + npc_update_rx_action_with_alg_idx(rvu, action, pfvf, index, + blkaddr, alg_idx); } } @@ -1180,12 +1286,22 @@ void npc_enadis_default_mce_entry(struct rvu *rvu, u16 pcifunc, int index, blkaddr, mce_idx; struct rvu_pfvf *pfvf; + /* multicast pkt replication is not enabled for AF's VFs & SDP links */ + if (is_lbk_vf(rvu, pcifunc) || is_sdp_pfvf(rvu, pcifunc)) + return; + blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0); if (blkaddr < 0) return; index = npc_get_nixlf_mcam_index(mcam, pcifunc & ~RVU_PFVF_FUNC_MASK, nixlf, type); + if (index < 0) { + dev_err(rvu->dev, + "%s: Error to get entry for pcifunc=%#x, type=%u\n", + __func__, pcifunc, type); + return; + } /* disable MCAM entry when packet replication is not supported by hw */ if (!hw->cap.nix_rx_multicast && !is_vf(pcifunc)) { @@ -1214,6 +1330,10 @@ static void npc_enadis_default_entries(struct rvu *rvu, u16 pcifunc, struct npc_mcam *mcam = &rvu->hw->mcam; int index, blkaddr; + /* only CGX or LBK interfaces have default entries */ + if (is_cn20k(rvu->pdev) && !npc_is_cgx_or_lbk(rvu, pcifunc)) + return; + blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0); if (blkaddr < 0) return; @@ -1223,6 +1343,12 @@ static void npc_enadis_default_entries(struct rvu *rvu, u16 pcifunc, pfvf->nix_rx_intf)) { index = npc_get_nixlf_mcam_index(mcam, pcifunc, nixlf, NIXLF_UCAST_ENTRY); + if (index < 0) { + dev_err(rvu->dev, + "%s: Error to get ucast entry for pcifunc=%#x\n", + __func__, pcifunc); + return; + } npc_enable_mcam_entry(rvu, mcam, blkaddr, index, enable); } @@ -2504,33 +2630,58 @@ void npc_mcam_clear_bit(struct npc_mcam *mcam, u16 index) static void npc_mcam_free_all_entries(struct rvu *rvu, struct npc_mcam *mcam, int blkaddr, u16 pcifunc) { + u16 dft_idxs[NPC_DFT_RULE_MAX_ID] = {[0 ... NPC_DFT_RULE_MAX_ID - 1] = USHRT_MAX}; + bool cn20k_dft_rl; u16 index, cntr; int rc; + npc_cn20k_dft_rules_idx_get(rvu, pcifunc, + &dft_idxs[NPC_DFT_RULE_BCAST_ID], + &dft_idxs[NPC_DFT_RULE_MCAST_ID], + &dft_idxs[NPC_DFT_RULE_PROMISC_ID], + &dft_idxs[NPC_DFT_RULE_UCAST_ID]); + /* Scan all MCAM entries and free the ones mapped to 'pcifunc' */ for (index = 0; index < mcam->bmap_entries; index++) { - if (mcam->entry2pfvf_map[index] == pcifunc) { + if (mcam->entry2pfvf_map[index] != pcifunc) + continue; + + cn20k_dft_rl = false; + + if (is_cn20k(rvu->pdev)) { + if (dft_idxs[NPC_DFT_RULE_BCAST_ID] == index || + dft_idxs[NPC_DFT_RULE_MCAST_ID] == index || + dft_idxs[NPC_DFT_RULE_PROMISC_ID] == index || + dft_idxs[NPC_DFT_RULE_UCAST_ID] == index) { + cn20k_dft_rl = true; + } + } + + /* Disable the entry */ + npc_enable_mcam_entry(rvu, mcam, blkaddr, index, false); + + if (!cn20k_dft_rl) { mcam->entry2pfvf_map[index] = NPC_MCAM_INVALID_MAP; /* Free the entry in bitmap */ npc_mcam_clear_bit(mcam, index); - /* Disable the entry */ - npc_enable_mcam_entry(rvu, mcam, blkaddr, index, false); - - /* Update entry2counter mapping */ - cntr = mcam->entry2cntr_map[index]; - if (cntr != NPC_MCAM_INVALID_MAP) - npc_unmap_mcam_entry_and_cntr(rvu, mcam, - blkaddr, index, - cntr); mcam->entry2target_pffunc[index] = 0x0; - if (is_cn20k(rvu->pdev)) { - rc = npc_cn20k_idx_free(rvu, &index, 1); - if (rc) - dev_err(rvu->dev, - "Failed to free mcam idx=%u pcifunc=%#x\n", - index, pcifunc); - } } + + /* Update entry2counter mapping */ + cntr = mcam->entry2cntr_map[index]; + if (cntr != NPC_MCAM_INVALID_MAP) + npc_unmap_mcam_entry_and_cntr(rvu, mcam, + blkaddr, index, + cntr); + + if (!is_cn20k(rvu->pdev) || cn20k_dft_rl) + continue; + + rc = npc_cn20k_idx_free(rvu, &index, 1); + if (rc) + dev_err(rvu->dev, + "Failed to free mcam idx=%u pcifunc=%#x\n", + index, pcifunc); } } @@ -3266,7 +3417,10 @@ int rvu_mbox_handler_npc_mcam_shift_entry(struct rvu *rvu, npc_enable_mcam_entry(rvu, mcam, blkaddr, new_entry, false); /* Copy rule from old entry to new entry */ - npc_copy_mcam_entry(rvu, mcam, blkaddr, old_entry, new_entry); + if (npc_copy_mcam_entry(rvu, mcam, blkaddr, old_entry, new_entry)) { + rc = NPC_MCAM_INVALID_REQ; + break; + } /* Copy counter mapping, if any */ cntr = mcam->entry2cntr_map[old_entry]; @@ -3284,7 +3438,8 @@ int rvu_mbox_handler_npc_mcam_shift_entry(struct rvu *rvu, /* If shift has failed then report the failed index */ if (index != req->shift_count) { - rc = NPC_MCAM_PERM_DENIED; + if (!rc) + rc = NPC_MCAM_PERM_DENIED; rsp->failed_entry_idx = index; } @@ -3851,6 +4006,12 @@ int rvu_mbox_handler_npc_read_base_steer_rule(struct rvu *rvu, /* Read the default ucast entry if there is no pkt steering rule */ index = npc_get_nixlf_mcam_index(mcam, pcifunc, nixlf, NIXLF_UCAST_ENTRY); + if (index < 0) { + mutex_unlock(&mcam->lock); + rc = NIX_AF_ERR_AF_LF_INVALID; + goto out; + } + read_entry: /* Read the mcam entry */ npc_read_mcam_entry(rvu, mcam, blkaddr, index, &rsp->entry, &intf, @@ -3924,6 +4085,12 @@ void rvu_npc_clear_ucast_entry(struct rvu *rvu, int pcifunc, int nixlf) ucast_idx = npc_get_nixlf_mcam_index(mcam, pcifunc, nixlf, NIXLF_UCAST_ENTRY); + if (ucast_idx < 0) { + dev_err(rvu->dev, + "%s: Error to get ucast entry for pcifunc=%#x\n", + __func__, pcifunc); + return; + } npc_enable_mcam_entry(rvu, mcam, blkaddr, ucast_idx, false); diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c index b45798d9fdabd3..6ae9cdcb608b0c 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c @@ -1444,7 +1444,7 @@ static int npc_install_flow(struct rvu *rvu, int blkaddr, u16 target, struct msg_rsp write_rsp; struct mcam_entry *entry; bool new = false; - u16 entry_index; + int entry_index; int err; installed_features = req->features; @@ -1477,6 +1477,14 @@ static int npc_install_flow(struct rvu *rvu, int blkaddr, u16 target, if (req->default_rule) { entry_index = npc_get_nixlf_mcam_index(mcam, target, nixlf, NIXLF_UCAST_ENTRY); + + if (entry_index < 0) { + dev_err(rvu->dev, + "%s: Error to get ucast entry for target=%#x\n", + __func__, target); + return -EINVAL; + } + enable = is_mcam_entry_enabled(rvu, mcam, blkaddr, entry_index); } @@ -1980,13 +1988,15 @@ static int npc_update_dmac_value(struct rvu *rvu, int npcblkaddr, ether_addr_copy(rule->packet.dmac, pfvf->mac_addr); - if (is_cn20k(rvu->pdev)) - npc_cn20k_read_mcam_entry(rvu, npcblkaddr, rule->entry, - cn20k_entry, &intf, - &enable, &hw_prio); - else + if (is_cn20k(rvu->pdev)) { + if (npc_cn20k_read_mcam_entry(rvu, npcblkaddr, rule->entry, + cn20k_entry, &intf, + &enable, &hw_prio)) + return -EINVAL; + } else { npc_read_mcam_entry(rvu, mcam, npcblkaddr, rule->entry, entry, &intf, &enable); + } npc_update_entry(rvu, NPC_DMAC, &mdata, ether_addr_to_u64(pfvf->mac_addr), 0, @@ -2038,8 +2048,12 @@ void npc_mcam_enable_flows(struct rvu *rvu, u16 target) continue; } - if (rule->vfvlan_cfg) - npc_update_dmac_value(rvu, blkaddr, rule, pfvf); + if (rule->vfvlan_cfg) { + if (npc_update_dmac_value(rvu, blkaddr, rule, pfvf)) + dev_err(rvu->dev, + "Update dmac failed for %u, target=%#x\n", + rule->entry, target); + } if (rule->rx_action.op == NIX_RX_ACTION_DEFAULT) { if (!def_ucast_rule) diff --git a/drivers/net/ethernet/mellanox/mlx4/srq.c b/drivers/net/ethernet/mellanox/mlx4/srq.c index dd890f5d7b725c..8711689120f302 100644 --- a/drivers/net/ethernet/mellanox/mlx4/srq.c +++ b/drivers/net/ethernet/mellanox/mlx4/srq.c @@ -44,13 +44,14 @@ void mlx4_srq_event(struct mlx4_dev *dev, u32 srqn, int event_type) { struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table; struct mlx4_srq *srq; + unsigned long flags; - rcu_read_lock(); + spin_lock_irqsave(&srq_table->lock, flags); srq = radix_tree_lookup(&srq_table->tree, srqn & (dev->caps.num_srqs - 1)); - rcu_read_unlock(); - if (srq) - refcount_inc(&srq->refcount); - else { + if (!srq || !refcount_inc_not_zero(&srq->refcount)) + srq = NULL; + spin_unlock_irqrestore(&srq_table->lock, flags); + if (!srq) { mlx4_warn(dev, "Async event for bogus SRQ %08x\n", srqn); return; } @@ -203,8 +204,8 @@ int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, u32 cqn, u16 xrcd, if (err) goto err_radix; - refcount_set(&srq->refcount, 1); init_completion(&srq->free); + refcount_set_release(&srq->refcount, 1); return 0; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c index 6a50b6dec0fa82..d9adb993e64d95 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/psp.c @@ -1070,29 +1070,37 @@ static struct psp_dev_ops mlx5_psp_ops = { void mlx5e_psp_unregister(struct mlx5e_priv *priv) { - if (!priv->psp || !priv->psp->psp) + struct mlx5e_psp *psp = priv->psp; + + if (!psp || !psp->psp) return; - psp_dev_unregister(priv->psp->psp); + psp_dev_unregister(psp->psp); + psp->psp = NULL; } void mlx5e_psp_register(struct mlx5e_priv *priv) { + struct mlx5e_psp *psp = priv->psp; + struct psp_dev *psd; + /* FW Caps missing */ if (!priv->psp) return; - priv->psp->caps.assoc_drv_spc = sizeof(u32); - priv->psp->caps.versions = 1 << PSP_VERSION_HDR0_AES_GCM_128; + psp->caps.assoc_drv_spc = sizeof(u32); + psp->caps.versions = 1 << PSP_VERSION_HDR0_AES_GCM_128; if (MLX5_CAP_PSP(priv->mdev, psp_crypto_esp_aes_gcm_256_encrypt) && MLX5_CAP_PSP(priv->mdev, psp_crypto_esp_aes_gcm_256_decrypt)) - priv->psp->caps.versions |= 1 << PSP_VERSION_HDR0_AES_GCM_256; + psp->caps.versions |= 1 << PSP_VERSION_HDR0_AES_GCM_256; - priv->psp->psp = psp_dev_create(priv->netdev, &mlx5_psp_ops, - &priv->psp->caps, NULL); - if (IS_ERR(priv->psp->psp)) + psd = psp_dev_create(priv->netdev, &mlx5_psp_ops, &psp->caps, NULL); + if (IS_ERR(psd)) { mlx5_core_err(priv->mdev, "PSP failed to register due to %pe\n", - priv->psp->psp); + psd); + return; + } + psp->psp = psd; } int mlx5e_psp_init(struct mlx5e_priv *priv) @@ -1131,22 +1139,18 @@ int mlx5e_psp_init(struct mlx5e_priv *priv) if (!psp) return -ENOMEM; - priv->psp = psp; fs = mlx5e_accel_psp_fs_init(priv); if (IS_ERR(fs)) { err = PTR_ERR(fs); - goto out_err; + kfree(psp); + return err; } psp->fs = fs; + priv->psp = psp; mlx5_core_dbg(priv->mdev, "PSP attached to netdevice\n"); return 0; - -out_err: - priv->psp = NULL; - kfree(psp); - return err; } void mlx5e_psp_cleanup(struct mlx5e_priv *priv) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 5a46870c4b749e..8f2b3abe009214 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -6023,7 +6023,6 @@ static int mlx5e_nic_init(struct mlx5_core_dev *mdev, if (take_rtnl) rtnl_lock(); - mlx5e_psp_register(priv); /* update XDP supported features */ mlx5e_set_xdp_feature(priv); @@ -6036,7 +6035,6 @@ static int mlx5e_nic_init(struct mlx5_core_dev *mdev, static void mlx5e_nic_cleanup(struct mlx5e_priv *priv) { mlx5e_health_destroy_reporters(priv); - mlx5e_psp_unregister(priv); mlx5e_ktls_cleanup(priv); mlx5e_psp_cleanup(priv); mlx5e_fs_cleanup(priv->fs); @@ -6160,6 +6158,7 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv) mlx5e_fs_init_l2_addr(priv->fs, netdev); mlx5e_ipsec_init(priv); + mlx5e_psp_register(priv); err = mlx5e_macsec_init(priv); if (err) @@ -6230,6 +6229,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv) mlx5_lag_remove_netdev(mdev, priv->netdev); mlx5_vxlan_reset_to_default(mdev->vxlan); mlx5e_macsec_cleanup(priv); + mlx5e_psp_unregister(priv); mlx5e_ipsec_cleanup(priv); } @@ -6774,9 +6774,11 @@ static int mlx5e_resume(struct auxiliary_device *adev) return err; actual_adev = mlx5_sd_get_adev(mdev, adev, edev->idx); - if (actual_adev) - return _mlx5e_resume(actual_adev); - return 0; + if (actual_adev) { + err = _mlx5e_resume(actual_adev); + mlx5_sd_put_adev(actual_adev, adev); + } + return err; } static int _mlx5e_suspend(struct auxiliary_device *adev, bool pre_netdev_reg) @@ -6815,6 +6817,8 @@ static int mlx5e_suspend(struct auxiliary_device *adev, pm_message_t state) err = _mlx5e_suspend(actual_adev, false); mlx5_sd_cleanup(mdev); + if (actual_adev) + mlx5_sd_put_adev(actual_adev, adev); return err; } @@ -6912,9 +6916,19 @@ static int mlx5e_probe(struct auxiliary_device *adev, return err; actual_adev = mlx5_sd_get_adev(mdev, adev, edev->idx); - if (actual_adev) - return _mlx5e_probe(actual_adev); + if (actual_adev) { + err = _mlx5e_probe(actual_adev); + if (err) + goto sd_cleanup; + mlx5_sd_put_adev(actual_adev, adev); + } return 0; + +sd_cleanup: + mlx5_sd_cleanup(mdev); + if (actual_adev) + mlx5_sd_put_adev(actual_adev, adev); + return err; } static void _mlx5e_remove(struct auxiliary_device *adev) @@ -6966,6 +6980,8 @@ static void mlx5e_remove(struct auxiliary_device *adev) _mlx5e_remove(actual_adev); mlx5_sd_cleanup(mdev); + if (actual_adev) + mlx5_sd_put_adev(actual_adev, adev); } static const struct auxiliary_device_id mlx5e_id_table[] = { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c index 762c783156b411..6e199161b0083a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c @@ -18,6 +18,7 @@ struct mlx5_sd { u8 host_buses; struct mlx5_devcom_comp_dev *devcom; struct dentry *dfs; + u8 state; bool primary; union { struct { /* primary */ @@ -31,6 +32,11 @@ struct mlx5_sd { }; }; +enum mlx5_sd_state { + MLX5_SD_STATE_DOWN = 0, + MLX5_SD_STATE_UP, +}; + static int mlx5_sd_get_host_buses(struct mlx5_core_dev *dev) { struct mlx5_sd *sd = mlx5_get_sd(dev); @@ -270,9 +276,6 @@ static void sd_unregister(struct mlx5_core_dev *dev) { struct mlx5_sd *sd = mlx5_get_sd(dev); - mlx5_devcom_comp_lock(sd->devcom); - mlx5_devcom_comp_set_ready(sd->devcom, false); - mlx5_devcom_comp_unlock(sd->devcom); mlx5_devcom_unregister_component(sd->devcom); } @@ -426,6 +429,7 @@ int mlx5_sd_init(struct mlx5_core_dev *dev) struct mlx5_core_dev *primary, *pos, *to; struct mlx5_sd *sd = mlx5_get_sd(dev); u8 alias_key[ACCESS_KEY_LEN]; + struct mlx5_sd *primary_sd; int err, i; err = sd_init(dev); @@ -440,10 +444,17 @@ int mlx5_sd_init(struct mlx5_core_dev *dev) if (err) goto err_sd_cleanup; + mlx5_devcom_comp_lock(sd->devcom); if (!mlx5_devcom_comp_is_ready(sd->devcom)) - return 0; + goto out; primary = mlx5_sd_get_primary(dev); + if (!primary) + goto out; + + primary_sd = mlx5_get_sd(primary); + if (primary_sd->state != MLX5_SD_STATE_DOWN) + goto out; for (i = 0; i < ACCESS_KEY_LEN; i++) alias_key[i] = get_random_u8(); @@ -452,9 +463,13 @@ int mlx5_sd_init(struct mlx5_core_dev *dev) if (err) goto err_sd_unregister; - sd->dfs = debugfs_create_dir("multi-pf", mlx5_debugfs_get_dev_root(primary)); - debugfs_create_x32("group_id", 0400, sd->dfs, &sd->group_id); - debugfs_create_file("primary", 0400, sd->dfs, primary, &dev_fops); + primary_sd->dfs = + debugfs_create_dir("multi-pf", + mlx5_debugfs_get_dev_root(primary)); + debugfs_create_x32("group_id", 0400, primary_sd->dfs, + &primary_sd->group_id); + debugfs_create_file("primary", 0400, primary_sd->dfs, primary, + &dev_fops); mlx5_sd_for_each_secondary(i, primary, pos) { char name[32]; @@ -464,7 +479,8 @@ int mlx5_sd_init(struct mlx5_core_dev *dev) goto err_unset_secondaries; snprintf(name, sizeof(name), "secondary_%d", i - 1); - debugfs_create_file(name, 0400, sd->dfs, pos, &dev_fops); + debugfs_create_file(name, 0400, primary_sd->dfs, pos, + &dev_fops); } @@ -472,6 +488,9 @@ int mlx5_sd_init(struct mlx5_core_dev *dev) sd->group_id, mlx5_devcom_comp_get_size(sd->devcom)); sd_print_group(primary); + primary_sd->state = MLX5_SD_STATE_UP; +out: + mlx5_devcom_comp_unlock(sd->devcom); return 0; err_unset_secondaries: @@ -479,8 +498,18 @@ int mlx5_sd_init(struct mlx5_core_dev *dev) mlx5_sd_for_each_secondary_to(i, primary, to, pos) sd_cmd_unset_secondary(pos); sd_cmd_unset_primary(primary); - debugfs_remove_recursive(sd->dfs); + debugfs_remove_recursive(primary_sd->dfs); + primary_sd->dfs = NULL; err_sd_unregister: + mlx5_sd_for_each_secondary(i, primary, pos) { + struct mlx5_sd *peer_sd = mlx5_get_sd(pos); + + primary_sd->secondaries[i - 1] = NULL; + peer_sd->primary_dev = NULL; + } + primary_sd->primary = false; + mlx5_devcom_comp_set_ready(sd->devcom, false); + mlx5_devcom_comp_unlock(sd->devcom); sd_unregister(dev); err_sd_cleanup: sd_cleanup(dev); @@ -491,42 +520,97 @@ void mlx5_sd_cleanup(struct mlx5_core_dev *dev) { struct mlx5_sd *sd = mlx5_get_sd(dev); struct mlx5_core_dev *primary, *pos; + struct mlx5_sd *primary_sd; int i; if (!sd) return; + mlx5_devcom_comp_lock(sd->devcom); if (!mlx5_devcom_comp_is_ready(sd->devcom)) - goto out; + goto out_unlock; primary = mlx5_sd_get_primary(dev); + if (!primary) + goto out_ready_false; + + primary_sd = mlx5_get_sd(primary); + if (primary_sd->state != MLX5_SD_STATE_UP) + goto out_clear_peers; + mlx5_sd_for_each_secondary(i, primary, pos) sd_cmd_unset_secondary(pos); sd_cmd_unset_primary(primary); - debugfs_remove_recursive(sd->dfs); + debugfs_remove_recursive(primary_sd->dfs); + primary_sd->dfs = NULL; sd_info(primary, "group id %#x, uncombined\n", sd->group_id); -out: + primary_sd->state = MLX5_SD_STATE_DOWN; +out_clear_peers: + mlx5_sd_for_each_secondary(i, primary, pos) { + struct mlx5_sd *peer_sd = mlx5_get_sd(pos); + + primary_sd->secondaries[i - 1] = NULL; + peer_sd->primary_dev = NULL; + } + primary_sd->primary = false; +out_ready_false: + mlx5_devcom_comp_set_ready(sd->devcom, false); +out_unlock: + mlx5_devcom_comp_unlock(sd->devcom); sd_unregister(dev); sd_cleanup(dev); } +/* Lock order: + * primary: actual_adev_lock -> SD devcom comp lock + * secondary: SD devcom comp lock -> (drop) -> actual_adev_lock + * The two locks are never held together, so no ABBA. + */ struct auxiliary_device *mlx5_sd_get_adev(struct mlx5_core_dev *dev, struct auxiliary_device *adev, int idx) { struct mlx5_sd *sd = mlx5_get_sd(dev); struct mlx5_core_dev *primary; + struct mlx5_adev *primary_adev; if (!sd) return adev; - if (!mlx5_devcom_comp_is_ready(sd->devcom)) + mlx5_devcom_comp_lock(sd->devcom); + if (!mlx5_devcom_comp_is_ready(sd->devcom)) { + mlx5_devcom_comp_unlock(sd->devcom); return NULL; + } primary = mlx5_sd_get_primary(dev); - if (dev == primary) + if (!primary || dev == primary) { + mlx5_devcom_comp_unlock(sd->devcom); return adev; + } + + primary_adev = primary->priv.adev[idx]; + get_device(&primary_adev->adev.dev); + mlx5_devcom_comp_unlock(sd->devcom); + + device_lock(&primary_adev->adev.dev); + /* Primary may have completed remove between dropping devcom and + * acquiring device_lock; recheck. + */ + if (!mlx5_devcom_comp_is_ready(sd->devcom)) { + device_unlock(&primary_adev->adev.dev); + put_device(&primary_adev->adev.dev); + return NULL; + } + return &primary_adev->adev; +} - return &primary->priv.adev[idx]->adev; +void mlx5_sd_put_adev(struct auxiliary_device *actual_adev, + struct auxiliary_device *adev) +{ + if (actual_adev != adev) { + device_unlock(&actual_adev->dev); + put_device(&actual_adev->dev); + } } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.h index 137efaf9aabcc0..9bfd5b9756b507 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.h @@ -15,6 +15,8 @@ struct mlx5_core_dev *mlx5_sd_ch_ix_get_dev(struct mlx5_core_dev *primary, int c struct auxiliary_device *mlx5_sd_get_adev(struct mlx5_core_dev *dev, struct auxiliary_device *adev, int idx); +void mlx5_sd_put_adev(struct auxiliary_device *actual_adev, + struct auxiliary_device *adev); int mlx5_sd_init(struct mlx5_core_dev *dev); void mlx5_sd_cleanup(struct mlx5_core_dev *dev); diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c index c406a3b56b37fd..4dea2bb58d2fca 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c @@ -826,7 +826,8 @@ struct net_device *fbnic_netdev_alloc(struct fbnic_dev *fbd) netif_tx_stop_all_queues(netdev); if (fbnic_phylink_create(netdev)) { - fbnic_netdev_free(fbd); + free_netdev(netdev); + fbd->netdev = NULL; return NULL; } diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c index 47752d3fde0b10..1179a6e127c527 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c @@ -749,11 +749,10 @@ static void lan966x_cleanup_ports(struct lan966x *lan966x) for (p = 0; p < lan966x->num_phys_ports; p++) { port = lan966x->ports[p]; - if (!port) + if (!port || !port->dev) continue; - if (port->dev) - unregister_netdev(port->dev); + unregister_netdev(port->dev); lan966x_xdp_port_deinit(port); if (lan966x->fdma && lan966x->fdma_ndev == port->dev) @@ -873,6 +872,9 @@ static int lan966x_probe_port(struct lan966x *lan966x, u32 p, err = register_netdev(dev); if (err) { dev_err(lan966x->dev, "register_netdev failed\n"); + phylink_destroy(phylink); + port->phylink = NULL; + port->dev = NULL; return err; } diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_main.h b/drivers/net/ethernet/microchip/sparx5/sparx5_main.h index 6a745bb71b5c53..eb57b86fbe22fa 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_main.h +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_main.h @@ -31,11 +31,11 @@ enum spx5_target_chiptype { SPX5_TARGET_CT_7552 = 0x7552, /* SparX-5-128 Enterprise */ SPX5_TARGET_CT_7556 = 0x7556, /* SparX-5-160 Enterprise */ SPX5_TARGET_CT_7558 = 0x7558, /* SparX-5-200 Enterprise */ - SPX5_TARGET_CT_7546TSN = 0x47546, /* SparX-5-64i Industrial */ - SPX5_TARGET_CT_7549TSN = 0x47549, /* SparX-5-90i Industrial */ - SPX5_TARGET_CT_7552TSN = 0x47552, /* SparX-5-128i Industrial */ - SPX5_TARGET_CT_7556TSN = 0x47556, /* SparX-5-160i Industrial */ - SPX5_TARGET_CT_7558TSN = 0x47558, /* SparX-5-200i Industrial */ + SPX5_TARGET_CT_7546TSN = 0x0546, /* SparX-5-64i Industrial */ + SPX5_TARGET_CT_7549TSN = 0x0549, /* SparX-5-90i Industrial */ + SPX5_TARGET_CT_7552TSN = 0x0552, /* SparX-5-128i Industrial */ + SPX5_TARGET_CT_7556TSN = 0x0556, /* SparX-5-160i Industrial */ + SPX5_TARGET_CT_7558TSN = 0x0558, /* SparX-5-200i Industrial */ SPX5_TARGET_CT_LAN9694 = 0x9694, /* lan969x-40 */ SPX5_TARGET_CT_LAN9691VAO = 0x9691, /* lan969x-40-VAO */ SPX5_TARGET_CT_LAN9694TSN = 0x9695, /* lan969x-40-TSN */ diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_port.c b/drivers/net/ethernet/microchip/sparx5/sparx5_port.c index 04bc8fffaf961e..62c49893de3c37 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_port.c +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_port.c @@ -1128,7 +1128,8 @@ int sparx5_port_init(struct sparx5 *sparx5, DEV2G5_PCS1G_SD_CFG(port->portno)); if (conf->portmode == PHY_INTERFACE_MODE_QSGMII || - conf->portmode == PHY_INTERFACE_MODE_SGMII) { + conf->portmode == PHY_INTERFACE_MODE_SGMII || + conf->portmode == PHY_INTERFACE_MODE_1000BASEX) { err = sparx5_serdes_set(sparx5, port, conf); if (err) return err; diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c index 098fbda0d128a5..d8e816882f02c2 100644 --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c @@ -43,8 +43,9 @@ static u64 mana_gd_r64(struct gdma_context *g, u64 offset) static int mana_gd_init_pf_regs(struct pci_dev *pdev) { struct gdma_context *gc = pci_get_drvdata(pdev); - void __iomem *sriov_base_va; + u64 remaining_barsize; u64 sriov_base_off; + u64 sriov_shm_off; gc->db_page_size = mana_gd_r32(gc, GDMA_PF_REG_DB_PAGE_SIZE) & 0xFFFF; @@ -73,10 +74,28 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev) gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off; sriov_base_off = mana_gd_r64(gc, GDMA_SRIOV_REG_CFG_BASE_OFF); + if (sriov_base_off >= gc->bar0_size || + gc->bar0_size - sriov_base_off < + GDMA_PF_REG_SHM_OFF + sizeof(u64) || + !IS_ALIGNED(sriov_base_off, sizeof(u64))) { + dev_err(gc->dev, + "SRIOV base offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n", + sriov_base_off, (u64)gc->bar0_size); + return -EPROTO; + } - sriov_base_va = gc->bar0_va + sriov_base_off; - gc->shm_base = sriov_base_va + - mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF); + remaining_barsize = gc->bar0_size - sriov_base_off; + sriov_shm_off = mana_gd_r64(gc, sriov_base_off + GDMA_PF_REG_SHM_OFF); + if (sriov_shm_off >= remaining_barsize || + remaining_barsize - sriov_shm_off < SMC_APERTURE_SIZE || + !IS_ALIGNED(sriov_shm_off, sizeof(u32))) { + dev_err(gc->dev, + "SRIOV SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n", + sriov_shm_off, (u64)gc->bar0_size); + return -EPROTO; + } + + gc->shm_base = gc->bar0_va + sriov_base_off + sriov_shm_off; return 0; } @@ -84,6 +103,7 @@ static int mana_gd_init_pf_regs(struct pci_dev *pdev) static int mana_gd_init_vf_regs(struct pci_dev *pdev) { struct gdma_context *gc = pci_get_drvdata(pdev); + u64 shm_off; gc->db_page_size = mana_gd_r32(gc, GDMA_REG_DB_PAGE_SIZE) & 0xFFFF; @@ -111,7 +131,17 @@ static int mana_gd_init_vf_regs(struct pci_dev *pdev) gc->db_page_base = gc->bar0_va + gc->db_page_off; gc->phys_db_page_base = gc->bar0_pa + gc->db_page_off; - gc->shm_base = gc->bar0_va + mana_gd_r64(gc, GDMA_REG_SHM_OFFSET); + shm_off = mana_gd_r64(gc, GDMA_REG_SHM_OFFSET); + if (shm_off >= gc->bar0_size || + gc->bar0_size - shm_off < SMC_APERTURE_SIZE || + !IS_ALIGNED(shm_off, sizeof(u32))) { + dev_err(gc->dev, + "SHM offset 0x%llx out of range or unaligned (BAR0 size 0x%llx)\n", + shm_off, (u64)gc->bar0_size); + return -EPROTO; + } + + gc->shm_base = gc->bar0_va + shm_off; return 0; } diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index a654b3699c4c51..9afc786b297a8d 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -2520,9 +2520,12 @@ static void mana_destroy_rxq(struct mana_port_context *apc, napi_disable_locked(napi); netif_napi_del_locked(napi); } - xdp_rxq_info_unreg(&rxq->xdp_rxq); - mana_destroy_wq_obj(apc, GDMA_RQ, rxq->rxobj); + if (xdp_rxq_info_is_reg(&rxq->xdp_rxq)) + xdp_rxq_info_unreg(&rxq->xdp_rxq); + + if (rxq->rxobj != INVALID_MANA_HANDLE) + mana_destroy_wq_obj(apc, GDMA_RQ, rxq->rxobj); mana_deinit_cq(apc, &rxq->rx_cq); @@ -2796,9 +2799,6 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc, mana_destroy_rxq(apc, rxq, false); - if (cq) - mana_deinit_cq(apc, cq); - return NULL; } diff --git a/drivers/net/ethernet/microsoft/mana/shm_channel.c b/drivers/net/ethernet/microsoft/mana/shm_channel.c index 0f1679ebad96ba..d21b5db06e5092 100644 --- a/drivers/net/ethernet/microsoft/mana/shm_channel.c +++ b/drivers/net/ethernet/microsoft/mana/shm_channel.c @@ -61,11 +61,6 @@ union smc_proto_hdr { }; }; /* HW DATA */ -#define SMC_APERTURE_BITS 256 -#define SMC_BASIC_UNIT (sizeof(u32)) -#define SMC_APERTURE_DWORDS (SMC_APERTURE_BITS / (SMC_BASIC_UNIT * 8)) -#define SMC_LAST_DWORD (SMC_APERTURE_DWORDS - 1) - static int mana_smc_poll_register(void __iomem *base, bool reset) { void __iomem *ptr = base + SMC_LAST_DWORD * SMC_BASIC_UNIT; diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c index 42c6dcfb1f0fa5..dd75c47758e1a3 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_dev.c +++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c @@ -5103,6 +5103,13 @@ static int qed_init_wfq_param(struct qed_hwfn *p_hwfn, return -EINVAL; } + /* All vports are already or become configured, nothing to distribute */ + if (non_requested_count == 0) { + p_hwfn->qm_info.wfq_data[vport_id].min_speed = req_rate; + p_hwfn->qm_info.wfq_data[vport_id].configured = true; + return 0; + } + total_left_rate = min_pf_rate - total_req_min_rate; left_rate_per_vp = total_left_rate / non_requested_count; diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c index 1dbfadb2a88145..5f88733094d0fc 100644 --- a/drivers/net/ethernet/renesas/ravb_main.c +++ b/drivers/net/ethernet/renesas/ravb_main.c @@ -1108,9 +1108,12 @@ static int ravb_stop_dma(struct net_device *ndev) /* Request for transmission suspension */ ravb_modify(ndev, CCC, CCC_DTSR, CCC_DTSR); - error = ravb_wait(ndev, CSR, CSR_DTS, CSR_DTS); - if (error) - netdev_err(ndev, "failed to stop AXI BUS\n"); + /* Access to URAM will not be suspended if WoL is enabled. */ + if (!priv->wol_enabled) { + error = ravb_wait(ndev, CSR, CSR_DTS, CSR_DTS); + if (error) + netdev_err(ndev, "failed to stop AXI BUS\n"); + } /* Stop AVB-DMAC process */ return ravb_set_opmode(ndev, CCC_OPC_CONFIG); diff --git a/drivers/net/ethernet/renesas/rtsn.c b/drivers/net/ethernet/renesas/rtsn.c index 03a2669f051879..ee8381b60b8d49 100644 --- a/drivers/net/ethernet/renesas/rtsn.c +++ b/drivers/net/ethernet/renesas/rtsn.c @@ -797,11 +797,11 @@ static int rtsn_mdio_alloc(struct rtsn_private *priv) /* Enter config mode before registering the MDIO bus */ ret = rtsn_reset(priv); if (ret) - goto out_free_bus; + goto out_put_node; ret = rtsn_change_mode(priv, OCR_OPC_CONFIG); if (ret) - goto out_free_bus; + goto out_put_node; rtsn_modify(priv, MPIC, MPIC_PSMCS_MASK | MPIC_PSMHT_MASK, MPIC_PSMCS_DEFAULT | MPIC_PSMHT_DEFAULT); @@ -824,6 +824,8 @@ static int rtsn_mdio_alloc(struct rtsn_private *priv) return 0; +out_put_node: + of_node_put(mdio_node); out_free_bus: mdiobus_free(mii); return ret; diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-nuvoton.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-nuvoton.c index e2240b68ad98d6..2ab6ecac64224b 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-nuvoton.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-nuvoton.c @@ -100,6 +100,8 @@ static int nvt_gmac_probe(struct platform_device *pdev) if (!priv) return dev_err_probe(dev, -ENOMEM, "Failed to allocate private data\n"); + priv->dev = dev; + priv->regmap = syscon_regmap_lookup_by_phandle_args(dev->of_node, "nuvoton,sys", 1, &priv->macid); if (IS_ERR(priv->regmap)) diff --git a/drivers/net/ethernet/ti/icssm/icssm_prueth.c b/drivers/net/ethernet/ti/icssm/icssm_prueth.c index 53bbd929090423..b7e94244355a3b 100644 --- a/drivers/net/ethernet/ti/icssm/icssm_prueth.c +++ b/drivers/net/ethernet/ti/icssm/icssm_prueth.c @@ -1825,6 +1825,7 @@ static int icssm_prueth_probe(struct platform_device *pdev) dev_err(dev, "%pOF error reading port_id %d\n", eth_node, ret); of_node_put(eth_node); + of_node_put(eth_ports_node); return ret; } diff --git a/drivers/net/ethernet/wangxun/libwx/wx_hw.c b/drivers/net/ethernet/wangxun/libwx/wx_hw.c index d3772d01e00bc8..2451f6b20b1115 100644 --- a/drivers/net/ethernet/wangxun/libwx/wx_hw.c +++ b/drivers/net/ethernet/wangxun/libwx/wx_hw.c @@ -2480,8 +2480,11 @@ int wx_sw_init(struct wx *wx) wx->oem_svid = pdev->subsystem_vendor; wx->oem_ssid = pdev->subsystem_device; wx->bus.device = PCI_SLOT(pdev->devfn); - wx->bus.func = FIELD_GET(WX_CFG_PORT_ST_LANID, - rd32(wx, WX_CFG_PORT_ST)); + if (pdev->is_virtfn) + wx->bus.func = PCI_FUNC(pdev->devfn); + else + wx->bus.func = FIELD_GET(WX_CFG_PORT_ST_LANID, + rd32(wx, WX_CFG_PORT_ST)); if (wx->oem_svid == PCI_VENDOR_ID_WANGXUN || pdev->is_virtfn) { diff --git a/drivers/net/ethernet/wangxun/libwx/wx_vf_common.c b/drivers/net/ethernet/wangxun/libwx/wx_vf_common.c index 29cdbed2e5ecd3..94ff8f5f0b4c8f 100644 --- a/drivers/net/ethernet/wangxun/libwx/wx_vf_common.c +++ b/drivers/net/ethernet/wangxun/libwx/wx_vf_common.c @@ -99,8 +99,8 @@ int wx_request_msix_irqs_vf(struct wx *wx) } } - err = request_threaded_irq(wx->msix_entry->vector, wx_msix_misc_vf, - NULL, IRQF_ONESHOT, netdev->name, wx); + err = request_irq(wx->msix_entry->vector, wx_msix_misc_vf, + 0, netdev->name, wx); if (err) { wx_err(wx, "request_irq for msix_other failed: %d\n", err); goto free_queue_irqs; diff --git a/drivers/net/fddi/defza.c b/drivers/net/fddi/defza.c index 064fa484f797af..9bfecc87d6b252 100644 --- a/drivers/net/fddi/defza.c +++ b/drivers/net/fddi/defza.c @@ -984,7 +984,7 @@ static irqreturn_t fza_interrupt(int irq, void *dev_id) case FZA_STATE_UNINITIALIZED: netif_carrier_off(dev); - timer_delete_sync(&fp->reset_timer); + timer_delete_sync_try(&fp->reset_timer); fp->ring_cmd_index = 0; fp->ring_uns_index = 0; fp->ring_rmc_tx_index = 0; @@ -1018,7 +1018,9 @@ static irqreturn_t fza_interrupt(int irq, void *dev_id) fp->queue_active = 0; netif_stop_queue(dev); pr_debug("%s: queue stopped\n", fp->name); - timer_delete_sync(&fp->reset_timer); + + spin_lock(&fp->lock); + timer_delete(&fp->reset_timer); pr_warn("%s: halted, reason: %x\n", fp->name, FZA_STATUS_GET_HALT(status)); fza_regs_dump(fp); @@ -1027,6 +1029,8 @@ static irqreturn_t fza_interrupt(int irq, void *dev_id) fp->timer_state = 0; fp->reset_timer.expires = jiffies + 45 * HZ; add_timer(&fp->reset_timer); + spin_unlock(&fp->lock); + break; default: @@ -1046,7 +1050,9 @@ static irqreturn_t fza_interrupt(int irq, void *dev_id) static void fza_reset_timer(struct timer_list *t) { struct fza_private *fp = timer_container_of(fp, t, reset_timer); + unsigned long flags; + spin_lock_irqsave(&fp->lock, flags); if (!fp->timer_state) { pr_err("%s: RESET timed out!\n", fp->name); pr_info("%s: trying harder...\n", fp->name); @@ -1069,6 +1075,7 @@ static void fza_reset_timer(struct timer_list *t) fp->reset_timer.expires = jiffies + 45 * HZ; } add_timer(&fp->reset_timer); + spin_unlock_irqrestore(&fp->lock, flags); } static int fza_set_mac_address(struct net_device *dev, void *addr) diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 6147ee8b1d78b1..f904f4d16b45fd 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -26,6 +26,8 @@ #include +static struct workqueue_struct *macsec_wq; + /* SecTAG length = macsec_eth_header without the optional SCI */ #define MACSEC_TAG_LEN 6 @@ -174,9 +176,10 @@ static void macsec_rxsc_put(struct macsec_rx_sc *sc) call_rcu(&sc->rcu_head, free_rx_sc_rcu); } -static void free_rxsa(struct rcu_head *head) +static void free_rxsa_work(struct work_struct *work) { - struct macsec_rx_sa *sa = container_of(head, struct macsec_rx_sa, rcu); + struct macsec_rx_sa *sa = + container_of(to_rcu_work(work), struct macsec_rx_sa, destroy_work); crypto_free_aead(sa->key.tfm); free_percpu(sa->stats); @@ -186,7 +189,7 @@ static void free_rxsa(struct rcu_head *head) static void macsec_rxsa_put(struct macsec_rx_sa *sa) { if (refcount_dec_and_test(&sa->refcnt)) - call_rcu(&sa->rcu, free_rxsa); + queue_rcu_work(macsec_wq, &sa->destroy_work); } static struct macsec_tx_sa *macsec_txsa_get(struct macsec_tx_sa __rcu *ptr) @@ -202,9 +205,10 @@ static struct macsec_tx_sa *macsec_txsa_get(struct macsec_tx_sa __rcu *ptr) return sa; } -static void free_txsa(struct rcu_head *head) +static void free_txsa_work(struct work_struct *work) { - struct macsec_tx_sa *sa = container_of(head, struct macsec_tx_sa, rcu); + struct macsec_tx_sa *sa = + container_of(to_rcu_work(work), struct macsec_tx_sa, destroy_work); crypto_free_aead(sa->key.tfm); free_percpu(sa->stats); @@ -214,7 +218,7 @@ static void free_txsa(struct rcu_head *head) static void macsec_txsa_put(struct macsec_tx_sa *sa) { if (refcount_dec_and_test(&sa->refcnt)) - call_rcu(&sa->rcu, free_txsa); + queue_rcu_work(macsec_wq, &sa->destroy_work); } static struct macsec_cb *macsec_skb_cb(struct sk_buff *skb) @@ -1407,6 +1411,7 @@ static int init_rx_sa(struct macsec_rx_sa *rx_sa, char *sak, int key_len, rx_sa->next_pn = 1; refcount_set(&rx_sa->refcnt, 1); spin_lock_init(&rx_sa->lock); + INIT_RCU_WORK(&rx_sa->destroy_work, free_rxsa_work); return 0; } @@ -1506,6 +1511,7 @@ static int init_tx_sa(struct macsec_tx_sa *tx_sa, char *sak, int key_len, tx_sa->active = false; refcount_set(&tx_sa->refcnt, 1); spin_lock_init(&tx_sa->lock); + INIT_RCU_WORK(&tx_sa->destroy_work, free_txsa_work); return 0; } @@ -4505,25 +4511,35 @@ static int __init macsec_init(void) { int err; + macsec_wq = alloc_workqueue("macsec", WQ_UNBOUND, 0); + if (!macsec_wq) + return -ENOMEM; + pr_info("MACsec IEEE 802.1AE\n"); err = register_netdevice_notifier(&macsec_notifier); if (err) - return err; + goto err_destroy_wq; err = rtnl_link_register(&macsec_link_ops); if (err) - goto notifier; + goto err_notifier; err = genl_register_family(&macsec_fam); if (err) - goto rtnl; + goto err_rtnl; return 0; -rtnl: +err_rtnl: rtnl_link_unregister(&macsec_link_ops); -notifier: +err_notifier: unregister_netdevice_notifier(&macsec_notifier); +err_destroy_wq: + /* Precautionary, mirrors macsec_exit() to stay safe if work + * ever becomes queueable before this point in the future. + */ + rcu_barrier(); + destroy_workqueue(macsec_wq); return err; } @@ -4533,6 +4549,7 @@ static void __exit macsec_exit(void) rtnl_link_unregister(&macsec_link_ops); unregister_netdevice_notifier(&macsec_notifier); rcu_barrier(); + destroy_workqueue(macsec_wq); } module_init(macsec_init); diff --git a/drivers/net/net_failover.c b/drivers/net/net_failover.c index d0361aaf25efb5..3f7d31033bae8b 100644 --- a/drivers/net/net_failover.c +++ b/drivers/net/net_failover.c @@ -502,7 +502,7 @@ static int net_failover_slave_register(struct net_device *slave_dev, /* Align MTU of slave with failover dev */ orig_mtu = slave_dev->mtu; - err = dev_set_mtu(slave_dev, failover_dev->mtu); + err = netif_set_mtu(slave_dev, failover_dev->mtu); if (err) { netdev_err(failover_dev, "unable to change mtu of %s to %u register failed\n", slave_dev->name, failover_dev->mtu); @@ -512,11 +512,11 @@ static int net_failover_slave_register(struct net_device *slave_dev, dev_hold(slave_dev); if (netif_running(failover_dev)) { - err = dev_open(slave_dev, NULL); + err = netif_open(slave_dev, NULL); if (err && (err != -EBUSY)) { netdev_err(failover_dev, "Opening slave %s failed err:%d\n", slave_dev->name, err); - goto err_dev_open; + goto err_netif_open; } } @@ -562,10 +562,10 @@ static int net_failover_slave_register(struct net_device *slave_dev, err_vlan_add: dev_uc_unsync(slave_dev, failover_dev); dev_mc_unsync(slave_dev, failover_dev); - dev_close(slave_dev); -err_dev_open: + netif_close(slave_dev); +err_netif_open: dev_put(slave_dev); - dev_set_mtu(slave_dev, orig_mtu); + netif_set_mtu(slave_dev, orig_mtu); done: return err; } diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c index a05af192caf325..a750768912b5a1 100644 --- a/drivers/net/netdevsim/netdev.c +++ b/drivers/net/netdevsim/netdev.c @@ -1182,7 +1182,8 @@ void nsim_destroy(struct netdevsim *ns) unregister_netdevice_notifier_dev_net(ns->netdev, &ns->nb, &ns->nn); - nsim_psp_uninit(ns); + if (nsim_dev_port_is_pf(ns->nsim_dev_port)) + nsim_psp_uninit(ns); rtnl_lock(); peer = rtnl_dereference(ns->peer); diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h index 7e129dddbbe7f9..d909c4160ea1fe 100644 --- a/drivers/net/netdevsim/netdevsim.h +++ b/drivers/net/netdevsim/netdevsim.h @@ -120,7 +120,9 @@ struct netdevsim { u64_stats_t tx_packets; u64_stats_t tx_bytes; struct u64_stats_sync syncp; - struct psp_dev *dev; + struct psp_dev __rcu *dev; + struct dentry *rereg; + struct mutex rereg_lock; u32 spi; u32 assoc_cnt; } psp; diff --git a/drivers/net/netdevsim/psp.c b/drivers/net/netdevsim/psp.c index 0b4d717253b080..6936ecb8173e20 100644 --- a/drivers/net/netdevsim/psp.c +++ b/drivers/net/netdevsim/psp.c @@ -19,6 +19,7 @@ nsim_do_psp(struct sk_buff *skb, struct netdevsim *ns, struct netdevsim *peer_ns, struct skb_ext **psp_ext) { enum skb_drop_reason rc = 0; + struct psp_dev *peer_psd; struct psp_assoc *pas; struct net *net; void **ptr; @@ -48,7 +49,8 @@ nsim_do_psp(struct sk_buff *skb, struct netdevsim *ns, } /* Now pretend we just received this frame */ - if (peer_ns->psp.dev->config.versions & (1 << pas->version)) { + peer_psd = rcu_dereference(peer_ns->psp.dev); + if (peer_psd && peer_psd->config.versions & (1 << pas->version)) { bool strip_icv = false; u8 generation; @@ -61,8 +63,7 @@ nsim_do_psp(struct sk_buff *skb, struct netdevsim *ns, skb_ext_reset(skb); skb->mac_len = ETH_HLEN; - if (psp_dev_rcv(skb, peer_ns->psp.dev->id, generation, - strip_icv)) { + if (psp_dev_rcv(skb, peer_psd->id, generation, strip_icv)) { rc = SKB_DROP_REASON_PSP_OUTPUT; goto out_unlock; } @@ -209,26 +210,50 @@ static struct psp_dev_caps nsim_psp_caps = { .assoc_drv_spc = sizeof(void *), }; -void nsim_psp_uninit(struct netdevsim *ns) +static void __nsim_psp_uninit(struct netdevsim *ns, bool teardown) { - if (!IS_ERR(ns->psp.dev)) - psp_dev_unregister(ns->psp.dev); + struct psp_dev *psd; + + psd = rcu_dereference_protected(ns->psp.dev, + teardown || + lockdep_is_held(&ns->psp.rereg_lock)); + if (psd) { + rcu_assign_pointer(ns->psp.dev, NULL); + synchronize_rcu(); + psp_dev_unregister(psd); + } WARN_ON(ns->psp.assoc_cnt); } +void nsim_psp_uninit(struct netdevsim *ns) +{ + debugfs_remove(ns->psp.rereg); + mutex_destroy(&ns->psp.rereg_lock); + __nsim_psp_uninit(ns, true); +} + static ssize_t nsim_psp_rereg_write(struct file *file, const char __user *data, size_t count, loff_t *ppos) { struct netdevsim *ns = file->private_data; - int err; + struct psp_dev *psd; + ssize_t ret; + + mutex_lock(&ns->psp.rereg_lock); + __nsim_psp_uninit(ns, false); - nsim_psp_uninit(ns); + psd = psp_dev_create(ns->netdev, &nsim_psp_ops, &nsim_psp_caps, ns); + if (IS_ERR(psd)) { + ret = PTR_ERR(psd); + goto out; + } - ns->psp.dev = psp_dev_create(ns->netdev, &nsim_psp_ops, - &nsim_psp_caps, ns); - err = PTR_ERR_OR_ZERO(ns->psp.dev); - return err ?: count; + rcu_assign_pointer(ns->psp.dev, psd); + ret = count; +out: + mutex_unlock(&ns->psp.rereg_lock); + return ret; } static const struct file_operations nsim_psp_rereg_fops = { @@ -241,14 +266,16 @@ static const struct file_operations nsim_psp_rereg_fops = { int nsim_psp_init(struct netdevsim *ns) { struct dentry *ddir = ns->nsim_dev_port->ddir; - int err; + struct psp_dev *psd; + + psd = psp_dev_create(ns->netdev, &nsim_psp_ops, &nsim_psp_caps, ns); + if (IS_ERR(psd)) + return PTR_ERR(psd); - ns->psp.dev = psp_dev_create(ns->netdev, &nsim_psp_ops, - &nsim_psp_caps, ns); - err = PTR_ERR_OR_ZERO(ns->psp.dev); - if (err) - return err; + rcu_assign_pointer(ns->psp.dev, psd); - debugfs_create_file("psp_rereg", 0200, ddir, ns, &nsim_psp_rereg_fops); + mutex_init(&ns->psp.rereg_lock); + ns->psp.rereg = debugfs_create_file("psp_rereg", 0200, ddir, ns, + &nsim_psp_rereg_fops); return 0; } diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c index db43a1f8a07a27..22c555dd962ec7 100644 --- a/drivers/net/ovpn/io.c +++ b/drivers/net/ovpn/io.c @@ -85,17 +85,24 @@ static void ovpn_netdev_write(struct ovpn_peer *peer, struct sk_buff *skb) skb_scrub_packet(skb, true); /* network header reset in ovpn_decrypt_post() */ + skb_reset_mac_header(skb); skb_reset_transport_header(skb); skb_reset_inner_headers(skb); /* cause packet to be "received" by the interface */ pkt_len = skb->len; + /* we may get here in process context in case of TCP connections, + * therefore we have to disable BHs to ensure gro_cells_receive() + * and dev_dstats_rx_add() do not get corrupted or enter deadlock + */ + local_bh_disable(); ret = gro_cells_receive(&peer->ovpn->gro_cells, skb); if (likely(ret == NET_RX_SUCCESS)) { /* update RX stats with the size of decrypted packet */ ovpn_peer_stats_increment_rx(&peer->vpn_stats, pkt_len); dev_dstats_rx_add(peer->ovpn->dev, pkt_len); } + local_bh_enable(); } void ovpn_decrypt_post(void *data, int ret) diff --git a/drivers/net/phy/bcm-phy-lib.c b/drivers/net/phy/bcm-phy-lib.c index 5198d66dbbc048..b64beade8dd933 100644 --- a/drivers/net/phy/bcm-phy-lib.c +++ b/drivers/net/phy/bcm-phy-lib.c @@ -563,6 +563,15 @@ void bcm_phy_get_stats(struct phy_device *phydev, u64 *shadow, } EXPORT_SYMBOL_GPL(bcm_phy_get_stats); +void bcm_phy_update_stats_shadow(struct phy_device *phydev, u64 *shadow) +{ + unsigned int i; + + for (i = 0; i < ARRAY_SIZE(bcm_phy_hw_stats); i++) + bcm_phy_get_stat(phydev, shadow, i); +} +EXPORT_SYMBOL_GPL(bcm_phy_update_stats_shadow); + void bcm_phy_r_rc_cal_reset(struct phy_device *phydev) { /* Reset R_CAL/RC_CAL Engine */ diff --git a/drivers/net/phy/bcm-phy-lib.h b/drivers/net/phy/bcm-phy-lib.h index bceddbc860eb28..bba94ce9619543 100644 --- a/drivers/net/phy/bcm-phy-lib.h +++ b/drivers/net/phy/bcm-phy-lib.h @@ -85,6 +85,7 @@ int bcm_phy_get_sset_count(struct phy_device *phydev); void bcm_phy_get_strings(struct phy_device *phydev, u8 *data); void bcm_phy_get_stats(struct phy_device *phydev, u64 *shadow, struct ethtool_stats *stats, u64 *data); +void bcm_phy_update_stats_shadow(struct phy_device *phydev, u64 *shadow); void bcm_phy_r_rc_cal_reset(struct phy_device *phydev); int bcm_phy_28nm_a0b0_afe_config_init(struct phy_device *phydev); int bcm_phy_enable_jumbo(struct phy_device *phydev); diff --git a/drivers/net/phy/bcm7xxx.c b/drivers/net/phy/bcm7xxx.c index 00e8fa14aa773a..71a163f62c0eb6 100644 --- a/drivers/net/phy/bcm7xxx.c +++ b/drivers/net/phy/bcm7xxx.c @@ -807,6 +807,17 @@ static void bcm7xxx_28nm_get_phy_stats(struct phy_device *phydev, bcm_phy_get_stats(phydev, priv->stats, stats, data); } +static int bcm7xxx_28nm_suspend(struct phy_device *phydev) +{ + struct bcm7xxx_phy_priv *priv = phydev->priv; + + mutex_lock(&phydev->lock); + bcm_phy_update_stats_shadow(phydev, priv->stats); + mutex_unlock(&phydev->lock); + + return genphy_suspend(phydev); +} + static int bcm7xxx_28nm_probe(struct phy_device *phydev) { struct bcm7xxx_phy_priv *priv; @@ -849,6 +860,7 @@ static int bcm7xxx_28nm_probe(struct phy_device *phydev) .flags = PHY_IS_INTERNAL, \ .config_init = bcm7xxx_28nm_config_init, \ .resume = bcm7xxx_28nm_resume, \ + .suspend = bcm7xxx_28nm_suspend, \ .get_tunable = bcm7xxx_28nm_get_tunable, \ .set_tunable = bcm7xxx_28nm_set_tunable, \ .get_sset_count = bcm_phy_get_sset_count, \ @@ -866,6 +878,7 @@ static int bcm7xxx_28nm_probe(struct phy_device *phydev) .flags = PHY_IS_INTERNAL, \ .config_init = bcm7xxx_28nm_ephy_config_init, \ .resume = bcm7xxx_28nm_ephy_resume, \ + .suspend = bcm7xxx_28nm_suspend, \ .get_sset_count = bcm_phy_get_sset_count, \ .get_strings = bcm_phy_get_strings, \ .get_stats = bcm7xxx_28nm_get_phy_stats, \ @@ -902,6 +915,7 @@ static int bcm7xxx_28nm_probe(struct phy_device *phydev) .config_aneg = genphy_config_aneg, \ .read_status = genphy_read_status, \ .resume = bcm7xxx_16nm_ephy_resume, \ + .suspend = bcm7xxx_28nm_suspend, \ } static struct phy_driver bcm7xxx_driver[] = { diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c index bf0c6a04481ee2..d1a4edb34ad2ef 100644 --- a/drivers/net/phy/broadcom.c +++ b/drivers/net/phy/broadcom.c @@ -592,8 +592,13 @@ static int bcm54xx_set_wakeup_irq(struct phy_device *phydev, bool state) static int bcm54xx_suspend(struct phy_device *phydev) { + struct bcm54xx_phy_priv *priv = phydev->priv; int ret = 0; + mutex_lock(&phydev->lock); + bcm_phy_update_stats_shadow(phydev, priv->stats); + mutex_unlock(&phydev->lock); + bcm54xx_ptp_stop(phydev); /* Acknowledge any Wake-on-LAN interrupt prior to suspend */ diff --git a/drivers/net/phy/dp83tc811.c b/drivers/net/phy/dp83tc811.c index e480c2a0745057..252fb12b3e68eb 100644 --- a/drivers/net/phy/dp83tc811.c +++ b/drivers/net/phy/dp83tc811.c @@ -393,6 +393,7 @@ static struct phy_driver dp83811_driver[] = { .config_init = dp83811_config_init, .config_aneg = dp83811_config_aneg, .soft_reset = dp83811_phy_reset, + .get_features = genphy_c45_pma_read_ext_abilities, .get_wol = dp83811_get_wol, .set_wol = dp83811_set_wol, .config_intr = dp83811_config_intr, diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 2aa1dedd21b8eb..e211a523c2584e 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -4548,6 +4548,13 @@ static int lan8814_config_init(struct phy_device *phydev) struct kszphy_priv *lan8814 = phydev->priv; int ret; + if (phy_package_init_once(phydev)) + /* Reset the PHY */ + lanphy_modify_page_reg(phydev, LAN8814_PAGE_COMMON_REGS, + LAN8814_QSGMII_SOFT_RESET, + LAN8814_QSGMII_SOFT_RESET_BIT, + LAN8814_QSGMII_SOFT_RESET_BIT); + /* Based on the interface type select how the advertise ability is * encoded, to set as SGMII or as USGMII. */ @@ -4655,13 +4662,7 @@ static int lan8814_probe(struct phy_device *phydev) priv->is_ptp_available = err == LAN8814_REV_LAN8814 || err == LAN8814_REV_LAN8818; - if (phy_package_init_once(phydev)) { - /* Reset the PHY */ - lanphy_modify_page_reg(phydev, LAN8814_PAGE_COMMON_REGS, - LAN8814_QSGMII_SOFT_RESET, - LAN8814_QSGMII_SOFT_RESET_BIT, - LAN8814_QSGMII_SOFT_RESET_BIT); - + if (phy_package_probe_once(phydev)) { err = lan8814_release_coma_mode(phydev); if (err) return err; diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c index df0bcfedddbc36..293ef80c4e302e 100644 --- a/drivers/net/usb/asix_devices.c +++ b/drivers/net/usb/asix_devices.c @@ -756,6 +756,7 @@ static void ax88772_mac_link_down(struct phylink_config *config, struct usbnet *dev = netdev_priv(to_net_dev(config->dev)); asix_write_medium_mode(dev, 0, 0); + usbnet_link_change(dev, false, false); } static void ax88772_mac_link_up(struct phylink_config *config, @@ -786,6 +787,7 @@ static void ax88772_mac_link_up(struct phylink_config *config, m |= AX_MEDIUM_RFC; asix_write_medium_mode(dev, m, 0); + usbnet_link_change(dev, true, false); } static const struct phylink_mac_ops ax88772_phylink_mac_ops = { diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c index bb9929727eb932..0223a172851ecf 100644 --- a/drivers/net/usb/cdc_ncm.c +++ b/drivers/net/usb/cdc_ncm.c @@ -2012,6 +2012,14 @@ static const struct usb_device_id cdc_devs[] = { .driver_info = (unsigned long)&apple_private_interface_info, }, + /* Mac */ + { USB_DEVICE_INTERFACE_NUMBER(0x05ac, 0x1905, 0), + .driver_info = (unsigned long)&apple_private_interface_info, + }, + { USB_DEVICE_INTERFACE_NUMBER(0x05ac, 0x1905, 2), + .driver_info = (unsigned long)&apple_private_interface_info, + }, + /* Ericsson MBM devices like F5521gw */ { .match_flags = USB_DEVICE_ID_MATCH_INT_INFO | USB_DEVICE_ID_MATCH_VENDOR, diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index 7337bf1b7d6ad0..1ace1d2398c9c1 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -10138,6 +10138,7 @@ static const struct usb_device_id rtl8152_table[] = { { USB_DEVICE(VENDOR_ID_DELL, 0xb097) }, { USB_DEVICE(VENDOR_ID_ASUS, 0x1976) }, { USB_DEVICE(VENDOR_ID_TRENDNET, 0xe02b) }, + { USB_DEVICE(VENDOR_ID_TRENDNET, 0xe02c) }, {} }; diff --git a/drivers/net/veth.c b/drivers/net/veth.c index e35df717e65e28..0cfb19b760dd54 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -972,7 +972,8 @@ static int veth_poll(struct napi_struct *napi, int budget) /* NAPI functions as RCU section */ peer_dev = rcu_dereference_check(priv->peer, rcu_read_lock_bh_held()); - peer_txq = peer_dev ? netdev_get_tx_queue(peer_dev, queue_idx) : NULL; + peer_txq = (peer_dev && queue_idx < peer_dev->real_num_tx_queues) ? + netdev_get_tx_queue(peer_dev, queue_idx) : NULL; xdp_set_return_frame_no_direct(); done = veth_xdp_rcv(rq, budget, &bq, &stats); diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c index 3bd57527b1be63..809f21fb93f56e 100644 --- a/drivers/net/wan/fsl_ucc_hdlc.c +++ b/drivers/net/wan/fsl_ucc_hdlc.c @@ -740,6 +740,8 @@ static int uhdlc_open(struct net_device *dev) static void uhdlc_memclean(struct ucc_hdlc_private *priv) { + int i; + qe_muram_free(ioread16be(&priv->ucc_pram->riptr)); qe_muram_free(ioread16be(&priv->ucc_pram->tiptr)); @@ -770,14 +772,14 @@ static void uhdlc_memclean(struct ucc_hdlc_private *priv) kfree(priv->rx_skbuff); priv->rx_skbuff = NULL; + for (i = 0; i < TX_BD_RING_LEN; i++) { + dev_kfree_skb(priv->tx_skbuff[i]); + priv->tx_skbuff[i] = NULL; + } + kfree(priv->tx_skbuff); priv->tx_skbuff = NULL; - if (priv->uf_regs) { - iounmap(priv->uf_regs); - priv->uf_regs = NULL; - } - if (priv->uccf) { ucc_fast_free(priv->uccf); priv->uccf = NULL; @@ -1255,12 +1257,12 @@ static void ucc_hdlc_remove(struct platform_device *pdev) uhdlc_memclean(priv); - if (priv->utdm->si_regs) { + if (priv->utdm && priv->utdm->si_regs) { iounmap(priv->utdm->si_regs); priv->utdm->si_regs = NULL; } - if (priv->utdm->siram) { + if (priv->utdm && priv->utdm->siram) { iounmap(priv->utdm->siram); priv->utdm->siram = NULL; } diff --git a/drivers/net/wireless/ath/ath10k/Kconfig b/drivers/net/wireless/ath/ath10k/Kconfig index 876aed76583318..efb9f022d8c662 100644 --- a/drivers/net/wireless/ath/ath10k/Kconfig +++ b/drivers/net/wireless/ath/ath10k/Kconfig @@ -46,6 +46,7 @@ config ATH10K_SNOC depends on ARCH_QCOM || COMPILE_TEST depends on QCOM_SMEM depends on QCOM_RPROC_COMMON || QCOM_RPROC_COMMON=n + select POWER_SEQUENCING select QCOM_SCM select QCOM_QMI_HELPERS help diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c index 2519e2400d5891..980a12fb2c6e7e 100644 --- a/drivers/net/wireless/ath/ath12k/core.c +++ b/drivers/net/wireless/ath/ath12k/core.c @@ -1838,10 +1838,22 @@ static struct ath12k_hw_group *ath12k_core_hw_group_alloc(struct ath12k_base *ab return ag; } +static void ath12k_core_free_wsi_info(struct ath12k_hw_group *ag) +{ + int i; + + for (i = 0; i < ag->num_devices; i++) { + of_node_put(ag->wsi_node[i]); + ag->wsi_node[i] = NULL; + } + ag->num_devices = 0; +} + static void ath12k_core_hw_group_free(struct ath12k_hw_group *ag) { mutex_lock(&ath12k_hw_group_mutex); + ath12k_core_free_wsi_info(ag); list_del(&ag->list); kfree(ag); @@ -1867,52 +1879,59 @@ static struct ath12k_hw_group *ath12k_core_hw_group_find_by_dt(struct ath12k_bas static int ath12k_core_get_wsi_info(struct ath12k_hw_group *ag, struct ath12k_base *ab) { - struct device_node *wsi_dev = ab->dev->of_node, *next_wsi_dev; - struct device_node *tx_endpoint, *next_rx_endpoint; - int device_count = 0; - - next_wsi_dev = wsi_dev; + struct device_node *next_wsi_dev; + int device_count = 0, ret = 0; + struct device_node *wsi_dev; - if (!next_wsi_dev) + wsi_dev = of_node_get(ab->dev->of_node); + if (!wsi_dev) return -ENODEV; do { - ag->wsi_node[device_count] = next_wsi_dev; + if (device_count >= ATH12K_MAX_DEVICES) { + ath12k_warn(ab, "device count in DT %d is more than limit %d\n", + device_count, ATH12K_MAX_DEVICES); + ret = -EINVAL; + break; + } + + ag->wsi_node[device_count++] = of_node_get(wsi_dev); - tx_endpoint = of_graph_get_endpoint_by_regs(next_wsi_dev, 0, -1); + struct device_node *tx_endpoint __free(device_node) = + of_graph_get_endpoint_by_regs(wsi_dev, 0, -1); if (!tx_endpoint) { - of_node_put(next_wsi_dev); - return -ENODEV; + ret = -ENODEV; + break; } - next_rx_endpoint = of_graph_get_remote_endpoint(tx_endpoint); + struct device_node *next_rx_endpoint __free(device_node) = + of_graph_get_remote_endpoint(tx_endpoint); if (!next_rx_endpoint) { - of_node_put(next_wsi_dev); - of_node_put(tx_endpoint); - return -ENODEV; + ret = -ENODEV; + break; } - of_node_put(tx_endpoint); - of_node_put(next_wsi_dev); - next_wsi_dev = of_graph_get_port_parent(next_rx_endpoint); if (!next_wsi_dev) { - of_node_put(next_rx_endpoint); - return -ENODEV; + ret = -ENODEV; + break; } - of_node_put(next_rx_endpoint); + of_node_put(wsi_dev); + wsi_dev = next_wsi_dev; + } while (ab->dev->of_node != wsi_dev); - device_count++; - if (device_count > ATH12K_MAX_DEVICES) { - ath12k_warn(ab, "device count in DT %d is more than limit %d\n", - device_count, ATH12K_MAX_DEVICES); - of_node_put(next_wsi_dev); - return -EINVAL; + if (ret) { + while (--device_count >= 0) { + of_node_put(ag->wsi_node[device_count]); + ag->wsi_node[device_count] = NULL; } - } while (wsi_dev != next_wsi_dev); - of_node_put(next_wsi_dev); + of_node_put(wsi_dev); + return ret; + } + + of_node_put(wsi_dev); ag->num_devices = device_count; return 0; @@ -1983,9 +2002,9 @@ static struct ath12k_hw_group *ath12k_core_hw_group_assign(struct ath12k_base *a ath12k_core_get_wsi_index(ag, ab)) { ath12k_dbg(ab, ATH12K_DBG_BOOT, "unable to get wsi info from dt, grouping single device"); + ath12k_core_free_wsi_info(ag); ag->id = ATH12K_INVALID_GROUP_ID; ag->num_devices = 1; - memset(ag->wsi_node, 0, sizeof(ag->wsi_node)); wsi->index = 0; } diff --git a/drivers/net/wireless/ath/ath12k/dp_rx.c b/drivers/net/wireless/ath/ath12k/dp_rx.c index 250459facff367..b108ccd0f63703 100644 --- a/drivers/net/wireless/ath/ath12k/dp_rx.c +++ b/drivers/net/wireless/ath/ath12k/dp_rx.c @@ -565,6 +565,9 @@ static int ath12k_dp_prepare_reo_update_elem(struct ath12k_dp *dp, lockdep_assert_held(&dp->dp_lock); + if (!peer->primary_link) + return 0; + elem = kzalloc_obj(*elem, GFP_ATOMIC); if (!elem) return -ENOMEM; @@ -1337,7 +1340,7 @@ void ath12k_dp_rx_deliver_msdu(struct ath12k_pdev_dp *dp_pdev, struct napi_struc bool is_mcbc = rxcb->is_mcbc; bool is_eapol = rxcb->is_eapol; - peer = ath12k_dp_peer_find_by_peerid(dp_pdev, rx_info->peer_id); + peer = ath12k_dp_peer_find_by_peerid(dp_pdev, rxcb->peer_id); pubsta = peer ? peer->sta : NULL; diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c index fbdfe6424fd7c2..df2334f3bad645 100644 --- a/drivers/net/wireless/ath/ath12k/mac.c +++ b/drivers/net/wireless/ath/ath12k/mac.c @@ -788,7 +788,7 @@ struct ath12k_link_vif *ath12k_mac_get_arvif(struct ath12k *ar, u32 vdev_id) /* To use the arvif returned, caller must have held rcu read lock. */ - WARN_ON(!rcu_read_lock_any_held()); + lockdep_assert_in_rcu_read_lock(); arvif_iter.vdev_id = vdev_id; arvif_iter.ar = ar; diff --git a/drivers/net/wireless/ath/ath12k/p2p.c b/drivers/net/wireless/ath/ath12k/p2p.c index 59589748f1a8c2..19ebcd1d8eb256 100644 --- a/drivers/net/wireless/ath/ath12k/p2p.c +++ b/drivers/net/wireless/ath/ath12k/p2p.c @@ -123,7 +123,7 @@ static void ath12k_p2p_noa_update_vdev_iter(void *data, u8 *mac, struct ath12k_p2p_noa_arg *arg = data; struct ath12k_link_vif *arvif; - WARN_ON(!rcu_read_lock_any_held()); + lockdep_assert_in_rcu_read_lock(); arvif = &ahvif->deflink; if (!arvif->is_created || arvif->ar != arg->ar || arvif->vdev_id != arg->vdev_id) return; diff --git a/drivers/net/wireless/ath/ath12k/wmi.c b/drivers/net/wireless/ath/ath12k/wmi.c index 65a05a9520ffcb..b5e904a55aeabb 100644 --- a/drivers/net/wireless/ath/ath12k/wmi.c +++ b/drivers/net/wireless/ath/ath12k/wmi.c @@ -9778,7 +9778,7 @@ static void ath12k_wmi_rssi_dbm_conversion_params_info_event(struct ath12k_base *ab, struct sk_buff *skb) { - struct ath12k_wmi_rssi_dbm_conv_info_arg rssi_info; + struct ath12k_wmi_rssi_dbm_conv_info_arg rssi_info = {}; struct ath12k *ar; s32 noise_floor; u32 pdev_id; @@ -10251,7 +10251,7 @@ int ath12k_wmi_hw_data_filter_cmd(struct ath12k *ar, struct wmi_hw_data_filter_a { struct wmi_hw_data_filter_cmd *cmd; struct sk_buff *skb; - int len; + int ret, len; len = sizeof(*cmd); skb = ath12k_wmi_alloc_skb(ar->wmi->wmi_ab, len); @@ -10275,7 +10275,13 @@ int ath12k_wmi_hw_data_filter_cmd(struct ath12k *ar, struct wmi_hw_data_filter_a "wmi hw data filter enable %d filter_bitmap 0x%x\n", arg->enable, arg->hw_filter_bitmap); - return ath12k_wmi_cmd_send(ar->wmi, skb, WMI_HW_DATA_FILTER_CMDID); + ret = ath12k_wmi_cmd_send(ar->wmi, skb, WMI_HW_DATA_FILTER_CMDID); + if (ret) { + ath12k_warn(ar->ab, "failed to send WMI_HW_DATA_FILTER_CMDID\n"); + dev_kfree_skb(skb); + } + + return ret; } int ath12k_wmi_wow_host_wakeup_ind(struct ath12k *ar) @@ -10283,6 +10289,7 @@ int ath12k_wmi_wow_host_wakeup_ind(struct ath12k *ar) struct wmi_wow_host_wakeup_cmd *cmd; struct sk_buff *skb; size_t len; + int ret; len = sizeof(*cmd); skb = ath12k_wmi_alloc_skb(ar->wmi->wmi_ab, len); @@ -10295,14 +10302,20 @@ int ath12k_wmi_wow_host_wakeup_ind(struct ath12k *ar) ath12k_dbg(ar->ab, ATH12K_DBG_WMI, "wmi tlv wow host wakeup ind\n"); - return ath12k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_HOSTWAKEUP_FROM_SLEEP_CMDID); + ret = ath12k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_HOSTWAKEUP_FROM_SLEEP_CMDID); + if (ret) { + ath12k_warn(ar->ab, "failed to send WMI_WOW_HOSTWAKEUP_FROM_SLEEP_CMDID\n"); + dev_kfree_skb(skb); + } + + return ret; } int ath12k_wmi_wow_enable(struct ath12k *ar) { struct wmi_wow_enable_cmd *cmd; struct sk_buff *skb; - int len; + int ret, len; len = sizeof(*cmd); skb = ath12k_wmi_alloc_skb(ar->wmi->wmi_ab, len); @@ -10317,7 +10330,13 @@ int ath12k_wmi_wow_enable(struct ath12k *ar) cmd->pause_iface_config = cpu_to_le32(WOW_IFACE_PAUSE_ENABLED); ath12k_dbg(ar->ab, ATH12K_DBG_WMI, "wmi tlv wow enable\n"); - return ath12k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_ENABLE_CMDID); + ret = ath12k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_ENABLE_CMDID); + if (ret) { + ath12k_warn(ar->ab, "failed to send WMI_WOW_ENABLE_CMDID\n"); + dev_kfree_skb(skb); + } + + return ret; } int ath12k_wmi_wow_add_wakeup_event(struct ath12k *ar, u32 vdev_id, @@ -10327,6 +10346,7 @@ int ath12k_wmi_wow_add_wakeup_event(struct ath12k *ar, u32 vdev_id, struct wmi_wow_add_del_event_cmd *cmd; struct sk_buff *skb; size_t len; + int ret; len = sizeof(*cmd); skb = ath12k_wmi_alloc_skb(ar->wmi->wmi_ab, len); @@ -10343,7 +10363,13 @@ int ath12k_wmi_wow_add_wakeup_event(struct ath12k *ar, u32 vdev_id, ath12k_dbg(ar->ab, ATH12K_DBG_WMI, "wmi tlv wow add wakeup event %s enable %d vdev_id %d\n", wow_wakeup_event(event), enable, vdev_id); - return ath12k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_ENABLE_DISABLE_WAKE_EVENT_CMDID); + ret = ath12k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_ENABLE_DISABLE_WAKE_EVENT_CMDID); + if (ret) { + ath12k_warn(ar->ab, "failed to send WMI_WOW_ENABLE_DISABLE_WAKE_EVENT_CMDID\n"); + dev_kfree_skb(skb); + } + + return ret; } int ath12k_wmi_wow_add_pattern(struct ath12k *ar, u32 vdev_id, u32 pattern_id, @@ -10356,6 +10382,7 @@ int ath12k_wmi_wow_add_pattern(struct ath12k *ar, u32 vdev_id, u32 pattern_id, struct sk_buff *skb; void *ptr; size_t len; + int ret; len = sizeof(*cmd) + sizeof(*tlv) + /* array struct */ @@ -10435,7 +10462,13 @@ int ath12k_wmi_wow_add_pattern(struct ath12k *ar, u32 vdev_id, u32 pattern_id, ath12k_dbg_dump(ar->ab, ATH12K_DBG_WMI, NULL, "wow bitmask: ", bitmap->bitmaskbuf, pattern_len); - return ath12k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_ADD_WAKE_PATTERN_CMDID); + ret = ath12k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_ADD_WAKE_PATTERN_CMDID); + if (ret) { + ath12k_warn(ar->ab, "failed to send WMI_WOW_ADD_WAKE_PATTERN_CMDID\n"); + dev_kfree_skb(skb); + } + + return ret; } int ath12k_wmi_wow_del_pattern(struct ath12k *ar, u32 vdev_id, u32 pattern_id) @@ -10443,6 +10476,7 @@ int ath12k_wmi_wow_del_pattern(struct ath12k *ar, u32 vdev_id, u32 pattern_id) struct wmi_wow_del_pattern_cmd *cmd; struct sk_buff *skb; size_t len; + int ret; len = sizeof(*cmd); skb = ath12k_wmi_alloc_skb(ar->wmi->wmi_ab, len); @@ -10459,7 +10493,13 @@ int ath12k_wmi_wow_del_pattern(struct ath12k *ar, u32 vdev_id, u32 pattern_id) ath12k_dbg(ar->ab, ATH12K_DBG_WMI, "wmi tlv wow del pattern vdev_id %d pattern_id %d\n", vdev_id, pattern_id); - return ath12k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_DEL_WAKE_PATTERN_CMDID); + ret = ath12k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_DEL_WAKE_PATTERN_CMDID); + if (ret) { + ath12k_warn(ar->ab, "failed to send WMI_WOW_DEL_WAKE_PATTERN_CMDID\n"); + dev_kfree_skb(skb); + } + + return ret; } static struct sk_buff * @@ -10595,6 +10635,7 @@ int ath12k_wmi_wow_config_pno(struct ath12k *ar, u32 vdev_id, struct wmi_pno_scan_req_arg *pno_scan) { struct sk_buff *skb; + int ret; if (pno_scan->enable) skb = ath12k_wmi_op_gen_config_pno_start(ar, vdev_id, pno_scan); @@ -10604,7 +10645,13 @@ int ath12k_wmi_wow_config_pno(struct ath12k *ar, u32 vdev_id, if (IS_ERR_OR_NULL(skb)) return -ENOMEM; - return ath12k_wmi_cmd_send(ar->wmi, skb, WMI_NETWORK_LIST_OFFLOAD_CONFIG_CMDID); + ret = ath12k_wmi_cmd_send(ar->wmi, skb, WMI_NETWORK_LIST_OFFLOAD_CONFIG_CMDID); + if (ret) { + ath12k_warn(ar->ab, "failed to send WMI_NETWORK_LIST_OFFLOAD_CONFIG_CMDID\n"); + dev_kfree_skb(skb); + } + + return ret; } static void ath12k_wmi_fill_ns_offload(struct ath12k *ar, @@ -10717,6 +10764,7 @@ int ath12k_wmi_arp_ns_offload(struct ath12k *ar, void *buf_ptr; size_t len; u8 ns_cnt, ns_ext_tuples = 0; + int ret; ns_cnt = offload->ipv6_count; @@ -10752,7 +10800,13 @@ int ath12k_wmi_arp_ns_offload(struct ath12k *ar, if (ns_ext_tuples) ath12k_wmi_fill_ns_offload(ar, offload, &buf_ptr, enable, 1); - return ath12k_wmi_cmd_send(ar->wmi, skb, WMI_SET_ARP_NS_OFFLOAD_CMDID); + ret = ath12k_wmi_cmd_send(ar->wmi, skb, WMI_SET_ARP_NS_OFFLOAD_CMDID); + if (ret) { + ath12k_warn(ar->ab, "failed to send WMI_SET_ARP_NS_OFFLOAD_CMDID\n"); + dev_kfree_skb(skb); + } + + return ret; } int ath12k_wmi_gtk_rekey_offload(struct ath12k *ar, @@ -10762,7 +10816,7 @@ int ath12k_wmi_gtk_rekey_offload(struct ath12k *ar, struct wmi_gtk_rekey_offload_cmd *cmd; struct sk_buff *skb; __le64 replay_ctr; - int len; + int ret, len; len = sizeof(*cmd); skb = ath12k_wmi_alloc_skb(ar->wmi->wmi_ab, len); @@ -10789,7 +10843,13 @@ int ath12k_wmi_gtk_rekey_offload(struct ath12k *ar, ath12k_dbg(ar->ab, ATH12K_DBG_WMI, "offload gtk rekey vdev: %d %d\n", arvif->vdev_id, enable); - return ath12k_wmi_cmd_send(ar->wmi, skb, WMI_GTK_OFFLOAD_CMDID); + ret = ath12k_wmi_cmd_send(ar->wmi, skb, WMI_GTK_OFFLOAD_CMDID); + if (ret) { + ath12k_warn(ar->ab, "failed to send WMI_GTK_OFFLOAD_CMDID offload\n"); + dev_kfree_skb(skb); + } + + return ret; } int ath12k_wmi_gtk_rekey_getinfo(struct ath12k *ar, @@ -10797,7 +10857,7 @@ int ath12k_wmi_gtk_rekey_getinfo(struct ath12k *ar, { struct wmi_gtk_rekey_offload_cmd *cmd; struct sk_buff *skb; - int len; + int ret, len; len = sizeof(*cmd); skb = ath12k_wmi_alloc_skb(ar->wmi->wmi_ab, len); @@ -10811,7 +10871,13 @@ int ath12k_wmi_gtk_rekey_getinfo(struct ath12k *ar, ath12k_dbg(ar->ab, ATH12K_DBG_WMI, "get gtk rekey vdev_id: %d\n", arvif->vdev_id); - return ath12k_wmi_cmd_send(ar->wmi, skb, WMI_GTK_OFFLOAD_CMDID); + ret = ath12k_wmi_cmd_send(ar->wmi, skb, WMI_GTK_OFFLOAD_CMDID); + if (ret) { + ath12k_warn(ar->ab, "failed to send WMI_GTK_OFFLOAD_CMDID getinfo\n"); + dev_kfree_skb(skb); + } + + return ret; } int ath12k_wmi_sta_keepalive(struct ath12k *ar, @@ -10822,6 +10888,7 @@ int ath12k_wmi_sta_keepalive(struct ath12k *ar, struct wmi_sta_keepalive_cmd *cmd; struct sk_buff *skb; size_t len; + int ret; len = sizeof(*cmd) + sizeof(*arp); skb = ath12k_wmi_alloc_skb(wmi->wmi_ab, len); @@ -10849,7 +10916,13 @@ int ath12k_wmi_sta_keepalive(struct ath12k *ar, "wmi sta keepalive vdev %d enabled %d method %d interval %d\n", arg->vdev_id, arg->enabled, arg->method, arg->interval); - return ath12k_wmi_cmd_send(wmi, skb, WMI_STA_KEEPALIVE_CMDID); + ret = ath12k_wmi_cmd_send(wmi, skb, WMI_STA_KEEPALIVE_CMDID); + if (ret) { + ath12k_warn(ar->ab, "failed to send WMI_STA_KEEPALIVE_CMDID\n"); + dev_kfree_skb(skb); + } + + return ret; } int ath12k_wmi_mlo_setup(struct ath12k *ar, struct wmi_mlo_setup_arg *mlo_params) diff --git a/drivers/net/wireless/ath/ath5k/base.c b/drivers/net/wireless/ath/ath5k/base.c index 05c9c07591fcb1..6ca31d4ea437bd 100644 --- a/drivers/net/wireless/ath/ath5k/base.c +++ b/drivers/net/wireless/ath/ath5k/base.c @@ -1738,7 +1738,8 @@ ath5k_tx_frame_completed(struct ath5k_hw *ah, struct sk_buff *skb, } info->status.rates[ts->ts_final_idx].count = ts->ts_final_retry; - info->status.rates[ts->ts_final_idx + 1].idx = -1; + if (ts->ts_final_idx + 1 < IEEE80211_TX_MAX_RATES) + info->status.rates[ts->ts_final_idx + 1].idx = -1; if (unlikely(ts->ts_status)) { ah->stats.ack_fail++; diff --git a/drivers/net/wireless/broadcom/b43/xmit.c b/drivers/net/wireless/broadcom/b43/xmit.c index 7651b1bdb59266..f0b082596637ff 100644 --- a/drivers/net/wireless/broadcom/b43/xmit.c +++ b/drivers/net/wireless/broadcom/b43/xmit.c @@ -702,7 +702,8 @@ void b43_rx(struct b43_wldev *dev, struct sk_buff *skb, const void *_rxhdr) * key index, but the ucode passed it slightly different. */ keyidx = b43_kidx_to_raw(dev, keyidx); - B43_WARN_ON(keyidx >= ARRAY_SIZE(dev->key)); + if (B43_WARN_ON(keyidx >= ARRAY_SIZE(dev->key))) + goto drop; if (dev->key[keyidx].algorithm != B43_SEC_ALGO_NONE) { wlhdr_len = ieee80211_hdrlen(fctl); diff --git a/drivers/net/wireless/broadcom/b43legacy/xmit.c b/drivers/net/wireless/broadcom/b43legacy/xmit.c index efd63f4ce74f2b..ee199d4eaf039a 100644 --- a/drivers/net/wireless/broadcom/b43legacy/xmit.c +++ b/drivers/net/wireless/broadcom/b43legacy/xmit.c @@ -476,7 +476,8 @@ void b43legacy_rx(struct b43legacy_wldev *dev, * key index, but the ucode passed it slightly different. */ keyidx = b43legacy_kidx_to_raw(dev, keyidx); - B43legacy_WARN_ON(keyidx >= dev->max_nr_keys); + if (B43legacy_WARN_ON(keyidx >= dev->max_nr_keys)) + goto drop; if (dev->key[keyidx].algorithm != B43legacy_SEC_ALGO_NONE) { /* Remove PROTECTED flag to mark it as decrypted. */ diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c index 30f6fcb6863279..8fb595733b9c36 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c @@ -2476,8 +2476,9 @@ static void brcmf_sdio_bus_stop(struct device *dev) brcmf_dbg(TRACE, "Enter\n"); if (bus->watchdog_tsk) { + get_task_struct(bus->watchdog_tsk); send_sig(SIGTERM, bus->watchdog_tsk, 1); - kthread_stop(bus->watchdog_tsk); + kthread_stop_put(bus->watchdog_tsk); bus->watchdog_tsk = NULL; } @@ -4567,8 +4568,9 @@ void brcmf_sdio_remove(struct brcmf_sdio *bus) if (bus) { /* Stop watchdog task */ if (bus->watchdog_tsk) { + get_task_struct(bus->watchdog_tsk); send_sig(SIGTERM, bus->watchdog_tsk, 1); - kthread_stop(bus->watchdog_tsk); + kthread_stop_put(bus->watchdog_tsk); bus->watchdog_tsk = NULL; } diff --git a/drivers/net/wireless/marvell/libertas/if_usb.c b/drivers/net/wireless/marvell/libertas/if_usb.c index 4fae0e33513661..5cc0c5cac25745 100644 --- a/drivers/net/wireless/marvell/libertas/if_usb.c +++ b/drivers/net/wireless/marvell/libertas/if_usb.c @@ -310,6 +310,7 @@ static void if_usb_disconnect(struct usb_interface *intf) struct lbs_private *priv = cardp->priv; cardp->surprise_removed = 1; + wake_up(&cardp->fw_wq); if (priv) { lbs_stop_card(priv); @@ -633,9 +634,10 @@ static inline void process_cmdrequest(int recvlength, uint8_t *recvbuff, unsigned long flags; u8 i; - if (recvlength > LBS_CMD_BUFFER_SIZE) { + if (recvlength < MESSAGE_HEADER_LEN || + recvlength > LBS_CMD_BUFFER_SIZE) { lbs_deb_usbd(&cardp->udev->dev, - "The receive buffer is too large\n"); + "The receive buffer is invalid: %d\n", recvlength); kfree_skb(skb); return; } diff --git a/drivers/net/wireless/rsi/rsi_common.h b/drivers/net/wireless/rsi/rsi_common.h index 591602beeec689..3cdf9ded876d9c 100644 --- a/drivers/net/wireless/rsi/rsi_common.h +++ b/drivers/net/wireless/rsi/rsi_common.h @@ -70,12 +70,11 @@ static inline int rsi_create_kthread(struct rsi_common *common, return 0; } -static inline int rsi_kill_thread(struct rsi_thread *handle) +static inline void rsi_kill_thread(struct rsi_thread *handle) { atomic_inc(&handle->thread_done); rsi_set_event(&handle->event); - - return kthread_stop(handle->task); + wait_for_completion(&handle->completion); } void rsi_mac80211_detach(struct rsi_hw *hw); diff --git a/drivers/net/wireless/st/cw1200/pm.c b/drivers/net/wireless/st/cw1200/pm.c index 84eb15d729c70d..120f0379f81dd5 100644 --- a/drivers/net/wireless/st/cw1200/pm.c +++ b/drivers/net/wireless/st/cw1200/pm.c @@ -264,14 +264,12 @@ int cw1200_wow_suspend(struct ieee80211_hw *hw, struct cfg80211_wowlan *wowlan) wiphy_err(priv->hw->wiphy, "PM request failed: %d. WoW is disabled.\n", ret); cw1200_wow_resume(hw); - mutex_unlock(&priv->conf_mutex); return -EBUSY; } /* Force resume if event is coming from the device. */ if (atomic_read(&priv->bh_rx)) { cw1200_wow_resume(hw); - mutex_unlock(&priv->conf_mutex); return -EAGAIN; } diff --git a/drivers/net/wwan/t7xx/t7xx_modem_ops.c b/drivers/net/wwan/t7xx/t7xx_modem_ops.c index 7968e208dd37c1..adb29d30c63fe7 100644 --- a/drivers/net/wwan/t7xx/t7xx_modem_ops.c +++ b/drivers/net/wwan/t7xx/t7xx_modem_ops.c @@ -457,8 +457,20 @@ static int t7xx_parse_host_rt_data(struct t7xx_fsm_ctl *ctl, struct t7xx_sys_inf offset = sizeof(struct feature_query); for (i = 0; i < FEATURE_COUNT && offset < data_length; i++) { + size_t remaining = data_length - offset; + size_t feat_data_len, feat_total; + + if (remaining < sizeof(*rt_feature)) + break; + rt_feature = data + offset; - offset += sizeof(*rt_feature) + le32_to_cpu(rt_feature->data_len); + feat_data_len = le32_to_cpu(rt_feature->data_len); + + if (feat_data_len > remaining - sizeof(*rt_feature)) + break; + + feat_total = sizeof(*rt_feature) + feat_data_len; + offset += feat_total; ft_spt_cfg = FIELD_GET(FEATURE_MSK, core->feature_set[i]); if (ft_spt_cfg != MTK_FEATURE_MUST_BE_SUPPORTED) @@ -468,8 +480,10 @@ static int t7xx_parse_host_rt_data(struct t7xx_fsm_ctl *ctl, struct t7xx_sys_inf if (ft_spt_st != MTK_FEATURE_MUST_BE_SUPPORTED) return -EINVAL; - if (i == RT_ID_MD_PORT_ENUM || i == RT_ID_AP_PORT_ENUM) - t7xx_port_enum_msg_handler(ctl->md, rt_feature->data); + if (i == RT_ID_MD_PORT_ENUM || i == RT_ID_AP_PORT_ENUM) { + t7xx_port_enum_msg_handler(ctl->md, rt_feature->data, + feat_data_len); + } } return 0; diff --git a/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c b/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c index ae632ef966983e..f869e4ed9ee9a9 100644 --- a/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c +++ b/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c @@ -117,6 +117,7 @@ static int fsm_ee_message_handler(struct t7xx_port *port, struct t7xx_fsm_ctl *c * t7xx_port_enum_msg_handler() - Parse the port enumeration message to create/remove nodes. * @md: Modem context. * @msg: Message. + * @msg_len: Length of @msg in bytes. * * Used to control create/remove device node. * @@ -124,12 +125,18 @@ static int fsm_ee_message_handler(struct t7xx_port *port, struct t7xx_fsm_ctl *c * * 0 - Success. * * -EFAULT - Message check failure. */ -int t7xx_port_enum_msg_handler(struct t7xx_modem *md, void *msg) +int t7xx_port_enum_msg_handler(struct t7xx_modem *md, void *msg, size_t msg_len) { struct device *dev = &md->t7xx_dev->pdev->dev; unsigned int version, port_count, i; struct port_msg *port_msg = msg; + if (msg_len < sizeof(*port_msg)) { + dev_err(dev, "Port enum msg too short for header: need %zu, have %zu\n", + sizeof(*port_msg), msg_len); + return -EINVAL; + } + version = FIELD_GET(PORT_MSG_VERSION, le32_to_cpu(port_msg->info)); if (version != PORT_ENUM_VER || le32_to_cpu(port_msg->head_pattern) != PORT_ENUM_HEAD_PATTERN || @@ -141,6 +148,13 @@ int t7xx_port_enum_msg_handler(struct t7xx_modem *md, void *msg) } port_count = FIELD_GET(PORT_MSG_PRT_CNT, le32_to_cpu(port_msg->info)); + + if (msg_len < struct_size(port_msg, data, port_count)) { + dev_err(dev, "Port enum msg too short: need %zu, have %zu\n", + struct_size(port_msg, data, port_count), msg_len); + return -EINVAL; + } + for (i = 0; i < port_count; i++) { u32 port_info = le32_to_cpu(port_msg->data[i]); unsigned int ch_id; @@ -191,7 +205,7 @@ static int control_msg_handler(struct t7xx_port *port, struct sk_buff *skb) case CTL_ID_PORT_ENUM: skb_pull(skb, sizeof(*ctrl_msg_h)); - ret = t7xx_port_enum_msg_handler(ctl->md, (struct port_msg *)skb->data); + ret = t7xx_port_enum_msg_handler(ctl->md, (struct port_msg *)skb->data, skb->len); if (!ret) ret = port_ctl_send_msg_to_md(port, CTL_ID_PORT_ENUM, 0); else diff --git a/drivers/net/wwan/t7xx/t7xx_port_proxy.h b/drivers/net/wwan/t7xx/t7xx_port_proxy.h index f0918b36e899bd..7c3190bf0fcf39 100644 --- a/drivers/net/wwan/t7xx/t7xx_port_proxy.h +++ b/drivers/net/wwan/t7xx/t7xx_port_proxy.h @@ -103,7 +103,7 @@ void t7xx_port_proxy_reset(struct port_proxy *port_prox); void t7xx_port_proxy_uninit(struct port_proxy *port_prox); int t7xx_port_proxy_init(struct t7xx_modem *md); void t7xx_port_proxy_md_status_notify(struct port_proxy *port_prox, unsigned int state); -int t7xx_port_enum_msg_handler(struct t7xx_modem *md, void *msg); +int t7xx_port_enum_msg_handler(struct t7xx_modem *md, void *msg, size_t msg_len); int t7xx_port_proxy_chl_enable_disable(struct port_proxy *port_prox, unsigned int ch_id, bool en_flag); void t7xx_port_proxy_set_cfg(struct t7xx_modem *md, enum port_cfg_id cfg_id); diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c index 423c9c628e7bfa..c692fc73babfe5 100644 --- a/drivers/nvme/host/apple.c +++ b/drivers/nvme/host/apple.c @@ -1009,6 +1009,7 @@ static void apple_nvme_init_queue(struct apple_nvme_queue *q) unsigned int depth = apple_nvme_queue_depth(q); struct apple_nvme *anv = queue_to_apple_nvme(q); + q->sq_tail = 0; q->cq_head = 0; q->cq_phase = 1; if (anv->hw->has_lsq_nvmmu) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index dc388e24caadeb..c3032d6ad6b1e2 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -3749,6 +3749,10 @@ int nvme_init_ctrl_finish(struct nvme_ctrl *ctrl, bool was_suspended) ret = nvme_hwmon_init(ctrl); if (ret == -EINTR) return ret; + + if (!nvme_ctrl_sgl_supported(ctrl)) + dev_info(ctrl->device, + "passthrough uses implicit buffer lengths\n"); } clear_bit(NVME_CTRL_DIRTY_CAPABILITY, &ctrl->flags); @@ -5041,8 +5045,8 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl) nvme_mpath_update(ctrl); } - nvme_change_uevent(ctrl, "NVME_EVENT=connected"); set_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags); + nvme_change_uevent(ctrl, "NVME_EVENT=connected"); } EXPORT_SYMBOL_GPL(nvme_start_ctrl); diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index 9597a87cf05dc3..08889b20e5d8c4 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -120,21 +120,11 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer, struct nvme_ns *ns = q->queuedata; struct block_device *bdev = ns ? ns->disk->part0 : NULL; bool supports_metadata = bdev && blk_get_integrity(bdev->bd_disk); - struct nvme_ctrl *ctrl = nvme_req(req)->ctrl; bool has_metadata = meta_buffer && meta_len; - struct bio *bio = NULL; int ret; - if (!nvme_ctrl_sgl_supported(ctrl)) - dev_warn_once(ctrl->device, "using unchecked data buffer\n"); - if (has_metadata) { - if (!supports_metadata) - return -EINVAL; - - if (!nvme_ctrl_meta_sgl_supported(ctrl)) - dev_warn_once(ctrl->device, - "using unchecked metadata buffer\n"); - } + if (has_metadata && !supports_metadata) + return -EINVAL; if (iter) ret = blk_rq_map_user_iov(q, req, NULL, iter, GFP_KERNEL); @@ -154,8 +144,8 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer, return ret; out_unmap: - if (bio) - blk_rq_unmap_user(bio); + if (req->bio) + blk_rq_unmap_user(req->bio); return ret; } diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 9fd04cd7c5cb13..139a10cd687f9a 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2533,11 +2533,13 @@ static void nvme_free_host_mem_multi(struct nvme_dev *dev) static void nvme_free_host_mem(struct nvme_dev *dev) { - if (dev->hmb_sgt) + if (dev->hmb_sgt) { dma_free_noncontiguous(dev->dev, dev->host_mem_size, dev->hmb_sgt, DMA_BIDIRECTIONAL); - else + dev->hmb_sgt = NULL; + } else { nvme_free_host_mem_multi(dev); + } dma_free_coherent(dev->dev, dev->host_mem_descs_size, dev->host_mem_descs, dev->host_mem_descs_dma); @@ -4107,8 +4109,6 @@ static const struct pci_device_id nvme_id_table[] = { .driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY, }, { PCI_DEVICE(0x1c5f, 0x0555), /* Memblaze Pblaze5 adapter */ .driver_data = NVME_QUIRK_NO_NS_DESC_LIST, }, - { PCI_DEVICE(0x144d, 0xa808), /* Samsung PM981/983 */ - .driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, }, { PCI_DEVICE(0x144d, 0xa821), /* Samsung PM1725 */ .driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY, }, { PCI_DEVICE(0x144d, 0xa822), /* Samsung PM1725a */ diff --git a/drivers/nvme/target/Kconfig b/drivers/nvme/target/Kconfig index 4904097dfd490f..69bde270115e29 100644 --- a/drivers/nvme/target/Kconfig +++ b/drivers/nvme/target/Kconfig @@ -117,6 +117,15 @@ config NVME_TARGET_AUTH If unsure, say N. +config NVME_TARGET_AUTH_DEBUG + bool "NVMe over Fabrics In-band Authentication debug messages" + depends on NVME_TARGET_AUTH + help + This enables additional debug messages including the generated + DH-HMAC-CHAP secrets to help debugging authentication failures. + + If unsure, say N. + config NVME_TARGET_PCI_EPF tristate "NVMe PCI Endpoint Function target support" depends on NVME_TARGET && PCI_ENDPOINT diff --git a/drivers/nvme/target/auth.c b/drivers/nvme/target/auth.c index 9a2eccdc8b1332..edb9627d97b096 100644 --- a/drivers/nvme/target/auth.c +++ b/drivers/nvme/target/auth.c @@ -144,7 +144,6 @@ u8 nvmet_setup_auth(struct nvmet_ctrl *ctrl, struct nvmet_sq *sq, bool reset) goto out_unlock; list_for_each_entry(p, &ctrl->subsys->hosts, entry) { - pr_debug("check %s\n", nvmet_host_name(p->host)); if (strcmp(nvmet_host_name(p->host), ctrl->hostnqn)) continue; host = p->host; @@ -189,11 +188,12 @@ u8 nvmet_setup_auth(struct nvmet_ctrl *ctrl, struct nvmet_sq *sq, bool reset) ctrl->host_key = NULL; goto out_free_hash; } +#ifdef CONFIG_NVME_TARGET_AUTH_DEBUG pr_debug("%s: using hash %s key %*ph\n", __func__, ctrl->host_key->hash > 0 ? nvme_auth_hmac_name(ctrl->host_key->hash) : "none", (int)ctrl->host_key->len, ctrl->host_key->key); - +#endif nvme_auth_free_key(ctrl->ctrl_key); if (!host->dhchap_ctrl_secret) { ctrl->ctrl_key = NULL; @@ -207,11 +207,12 @@ u8 nvmet_setup_auth(struct nvmet_ctrl *ctrl, struct nvmet_sq *sq, bool reset) ctrl->ctrl_key = NULL; goto out_free_hash; } +#ifdef CONFIG_NVME_TARGET_AUTH_DEBUG pr_debug("%s: using ctrl hash %s key %*ph\n", __func__, ctrl->ctrl_key->hash > 0 ? nvme_auth_hmac_name(ctrl->ctrl_key->hash) : "none", (int)ctrl->ctrl_key->len, ctrl->ctrl_key->key); - +#endif out_free_hash: if (ret) { if (ctrl->host_key) { @@ -317,7 +318,6 @@ int nvmet_auth_host_hash(struct nvmet_req *req, u8 *response, if (ret) goto out_free_challenge; } - pr_debug("ctrl %d qid %d host response seq %u transaction %d\n", ctrl->cntlid, req->sq->qid, req->sq->dhchap_s1, req->sq->dhchap_tid); @@ -434,8 +434,10 @@ int nvmet_auth_ctrl_exponential(struct nvmet_req *req, ret = -EINVAL; } else { memcpy(buf, ctrl->dh_key, buf_size); +#ifdef CONFIG_NVME_TARGET_AUTH_DEBUG pr_debug("%s: ctrl %d public key %*ph\n", __func__, ctrl->cntlid, (int)buf_size, buf); +#endif } return ret; @@ -458,11 +460,12 @@ int nvmet_auth_ctrl_sesskey(struct nvmet_req *req, ctrl->shash_id); if (ret) pr_debug("failed to compute session key, err %d\n", ret); +#ifdef CONFIG_NVME_TARGET_AUTH_DEBUG else pr_debug("%s: session key %*ph\n", __func__, (int)req->sq->dhchap_skey_len, req->sq->dhchap_skey); - +#endif return ret; } diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index 164a564ba3b4e9..20f150d17a9625 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -1321,8 +1321,10 @@ static int nvmet_tcp_try_recv_ddgst(struct nvmet_tcp_queue *queue) queue->idx, cmd->req.cmd->common.command_id, queue->pdu.cmd.hdr.type, le32_to_cpu(cmd->recv_ddgst), le32_to_cpu(cmd->exp_ddgst)); - if (!(cmd->flags & NVMET_TCP_F_INIT_FAILED)) + if (!(cmd->flags & NVMET_TCP_F_INIT_FAILED)) { + cmd->req.cqe->status = NVME_SC_CMD_SEQ_ERROR; nvmet_req_uninit(&cmd->req); + } nvmet_tcp_free_cmd_buffers(cmd); ret = -EPROTO; goto out; diff --git a/drivers/parisc/lasi.c b/drivers/parisc/lasi.c index ef6125d838788b..a5b80cd5cc37d7 100644 --- a/drivers/parisc/lasi.c +++ b/drivers/parisc/lasi.c @@ -193,8 +193,7 @@ static int __init lasi_init_chip(struct parisc_device *dev) ret = request_irq(lasi->gsc_irq.irq, gsc_asic_intr, 0, "lasi", lasi); if (ret < 0) { - kfree(lasi); - return ret; + goto err_free; } /* enable IRQ's for devices below LASI */ @@ -203,8 +202,7 @@ static int __init lasi_init_chip(struct parisc_device *dev) /* Done init'ing, register this driver */ ret = gsc_common_setup(dev, lasi); if (ret) { - kfree(lasi); - return ret; + goto err_irq; } gsc_fixup_irqs(dev, lasi, lasi_choose_irq); @@ -214,6 +212,12 @@ static int __init lasi_init_chip(struct parisc_device *dev) SYS_OFF_PRIO_DEFAULT, lasi_power_off, lasi); return ret; + +err_irq: + free_irq(lasi->gsc_irq.irq, lasi); +err_free: + kfree(lasi); + return ret; } static struct parisc_device_id lasi_tbl[] __initdata = { diff --git a/drivers/parisc/led.c b/drivers/parisc/led.c index b299fcc48b0874..016c9d5a60a8a8 100644 --- a/drivers/parisc/led.c +++ b/drivers/parisc/led.c @@ -543,10 +543,8 @@ static void __init register_led_regions(void) static int __init startup_leds(void) { - if (platform_device_register(&platform_leds)) { - pr_info("LED: failed to register LEDs\n"); - platform_device_put(&platform_leds); - } + if (platform_device_register(&platform_leds)) + printk(KERN_INFO "LED: failed to register LEDs\n"); register_led_regions(); return 0; } diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index d10ece0889f0f4..e3f59001785a17 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -179,6 +179,11 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv, return NULL; } +static void _pci_free_device(struct device *dev) +{ + kfree(to_pci_dev(dev)); +} + /** * new_id_store - sysfs frontend to pci_add_dynid() * @driver: target device driver @@ -214,11 +219,13 @@ static ssize_t new_id_store(struct device_driver *driver, const char *buf, pdev->subsystem_vendor = subvendor; pdev->subsystem_device = subdevice; pdev->class = class; + pdev->dev.release = _pci_free_device; + device_initialize(&pdev->dev); if (pci_match_device(pdrv, pdev)) retval = -EEXIST; - kfree(pdev); + put_device(&pdev->dev); if (retval) return retval; diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 8f7cfcc000901d..d34266651ad09f 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5607,13 +5607,14 @@ static int pci_try_reset_bus(struct pci_bus *bus) * reset for affected devices * * This function will first try to reset the slots on this bus if the method is - * available. If slot reset fails or is not available, this will fall back to a + * available. If slot reset is not available, this will fall back to a * secondary bus reset. */ static int pci_reset_bridge(struct pci_dev *bridge, bool restore) { struct pci_bus *bus = bridge->subordinate; struct pci_slot *slot; + int ret = 0; if (!bus) return -ENOTTY; @@ -5627,19 +5628,17 @@ static int pci_reset_bridge(struct pci_dev *bridge, bool restore) goto bus_reset; list_for_each_entry(slot, &bus->slots, list) { - int ret; - if (restore) ret = pci_try_reset_slot(slot); else ret = pci_slot_reset(slot, PCI_RESET_DO_RESET); if (ret) - goto bus_reset; + break; } mutex_unlock(&pci_slot_mutex); - return 0; + return ret; bus_reset: mutex_unlock(&pci_slot_mutex); diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c index fbc05cda96ee01..991d3ed543f58c 100644 --- a/drivers/pci/setup-res.c +++ b/drivers/pci/setup-res.c @@ -102,6 +102,7 @@ static void pci_std_update_resource(struct pci_dev *dev, int resno) } pci_write_config_dword(dev, reg, new); + dev->saved_config_space[reg / 4] = new; pci_read_config_dword(dev, reg, &check); if ((new ^ check) & mask) { @@ -112,6 +113,7 @@ static void pci_std_update_resource(struct pci_dev *dev, int resno) if (res->flags & IORESOURCE_MEM_64) { new = region.start >> 16 >> 16; pci_write_config_dword(dev, reg + 4, new); + dev->saved_config_space[(reg + 4) / 4] = new; pci_read_config_dword(dev, reg + 4, &check); if (check != new) { pci_err(dev, "%s: error updating (high %#010x != %#010x)\n", diff --git a/drivers/platform/chrome/cros_typec_altmode.c b/drivers/platform/chrome/cros_typec_altmode.c index 557340b53af03b..66c546bf89b532 100644 --- a/drivers/platform/chrome/cros_typec_altmode.c +++ b/drivers/platform/chrome/cros_typec_altmode.c @@ -359,6 +359,7 @@ cros_typec_register_thunderbolt(struct cros_typec_port *port, } INIT_WORK(&adata->work, cros_typec_altmode_work); + mutex_init(&adata->lock); adata->alt = alt; adata->port = port; adata->ap_mode_entry = true; diff --git a/drivers/platform/wmi/core.c b/drivers/platform/wmi/core.c index 7aa40dab6145ab..5a2ffcbab6af25 100644 --- a/drivers/platform/wmi/core.c +++ b/drivers/platform/wmi/core.c @@ -411,6 +411,9 @@ int wmidev_invoke_method(struct wmi_device *wdev, u8 instance, u32 method_id, obj = aout.pointer; if (!obj) { + if (min_size != 0) + return -ENOMSG; + out->length = 0; out->data = ZERO_SIZE_PTR; diff --git a/drivers/platform/x86/asus-nb-wmi.c b/drivers/platform/x86/asus-nb-wmi.c index b4677c5bba5b44..8005c088e9eeed 100644 --- a/drivers/platform/x86/asus-nb-wmi.c +++ b/drivers/platform/x86/asus-nb-wmi.c @@ -544,6 +544,15 @@ static const struct dmi_system_id asus_quirks[] = { }, .driver_data = &quirk_asus_zenbook_duo_kbd, }, + { + .callback = dmi_matched, + .ident = "ASUS Zenbook Duo UX8407AA", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "ASUS"), + DMI_MATCH(DMI_PRODUCT_NAME, "Zenbook Duo UX8407AA"), + }, + .driver_data = &quirk_asus_zenbook_duo_kbd, + }, { .callback = dmi_matched, .ident = "ASUS ROG Z13", diff --git a/drivers/platform/x86/hp/hp-wmi.c b/drivers/platform/x86/hp/hp-wmi.c index d1cc6e7d176ca9..6950bec2a9d80d 100644 --- a/drivers/platform/x86/hp/hp-wmi.c +++ b/drivers/platform/x86/hp/hp-wmi.c @@ -205,6 +205,10 @@ static const struct dmi_system_id victus_s_thermal_profile_boards[] __initconst .matches = { DMI_MATCH(DMI_BOARD_NAME, "8BBE") }, .driver_data = (void *)&victus_s_thermal_params, }, + { + .matches = { DMI_MATCH(DMI_BOARD_NAME, "8BC2") }, + .driver_data = (void *)&omen_v1_thermal_params, + }, { .matches = { DMI_MATCH(DMI_BOARD_NAME, "8BCA") }, .driver_data = (void *)&omen_v1_thermal_params, @@ -243,7 +247,7 @@ static const struct dmi_system_id victus_s_thermal_profile_boards[] __initconst }, { .matches = { DMI_MATCH(DMI_BOARD_NAME, "8D41") }, - .driver_data = (void *)&victus_s_thermal_params, + .driver_data = (void *)&omen_v1_no_ec_thermal_params, }, { .matches = { DMI_MATCH(DMI_BOARD_NAME, "8D87") }, diff --git a/drivers/platform/x86/intel/plr_tpmi.c b/drivers/platform/x86/intel/plr_tpmi.c index 05727169f49c14..8faecc311038f9 100644 --- a/drivers/platform/x86/intel/plr_tpmi.c +++ b/drivers/platform/x86/intel/plr_tpmi.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include #include @@ -60,6 +61,8 @@ struct tpmi_plr { struct tpmi_plr_die *die_info; int num_dies; struct auxiliary_device *auxdev; + struct notifier_block nb; + struct mutex lock; /* Protect access to dbgfs_dir */ }; static const char * const plr_coarse_reasons[] = { @@ -255,6 +258,30 @@ static ssize_t plr_status_write(struct file *filp, const char __user *ubuf, } DEFINE_SHOW_STORE_ATTRIBUTE(plr_status); +static int intel_plr_notify(struct notifier_block *self, unsigned long action, void *data) +{ + struct tpmi_plr *plr = container_of(self, struct tpmi_plr, nb); + + if (action == TPMI_CORE_EXIT) { + guard(mutex)(&plr->lock); + plr->dbgfs_dir = NULL; + } + + return NOTIFY_DONE; +} + +static int intel_plr_register_notifier(struct notifier_block *nb) +{ + nb->notifier_call = intel_plr_notify; + nb->priority = 0; + return tpmi_register_notifier(nb); +} + +static void intel_plr_unregister_notifier(struct notifier_block *nb) +{ + tpmi_unregister_notifier(nb); +} + static int intel_plr_probe(struct auxiliary_device *auxdev, const struct auxiliary_device_id *id) { struct oobmsm_plat_info *plat_info; @@ -282,10 +309,18 @@ static int intel_plr_probe(struct auxiliary_device *auxdev, const struct auxilia if (!plr) return -ENOMEM; + err = devm_mutex_init(&auxdev->dev, &plr->lock); + if (err) + return err; + + intel_plr_register_notifier(&plr->nb); + plr->die_info = devm_kcalloc(&auxdev->dev, num_resources, sizeof(*plr->die_info), GFP_KERNEL); - if (!plr->die_info) - return -ENOMEM; + if (!plr->die_info) { + err = -ENOMEM; + goto err_notify; + } plr->num_dies = num_resources; plr->dbgfs_dir = debugfs_create_dir("plr", dentry); @@ -326,6 +361,9 @@ static int intel_plr_probe(struct auxiliary_device *auxdev, const struct auxilia err: debugfs_remove_recursive(plr->dbgfs_dir); +err_notify: + intel_plr_unregister_notifier(&plr->nb); + return err; } @@ -333,6 +371,9 @@ static void intel_plr_remove(struct auxiliary_device *auxdev) { struct tpmi_plr *plr = auxiliary_get_drvdata(auxdev); + intel_plr_unregister_notifier(&plr->nb); + + guard(mutex)(&plr->lock); debugfs_remove_recursive(plr->dbgfs_dir); } diff --git a/drivers/platform/x86/intel/vsec_tpmi.c b/drivers/platform/x86/intel/vsec_tpmi.c index 7fc6ff8d104063..16fd7aa41f20e2 100644 --- a/drivers/platform/x86/intel/vsec_tpmi.c +++ b/drivers/platform/x86/intel/vsec_tpmi.c @@ -56,6 +56,7 @@ #include #include #include +#include #include #include #include @@ -188,6 +189,20 @@ struct tpmi_feature_state { /* Used during auxbus device creation */ static DEFINE_IDA(intel_vsec_tpmi_ida); +static BLOCKING_NOTIFIER_HEAD(tpmi_notify_list); + +int tpmi_register_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_register(&tpmi_notify_list, nb); +} +EXPORT_SYMBOL_NS_GPL(tpmi_register_notifier, "INTEL_TPMI"); + +int tpmi_unregister_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_unregister(&tpmi_notify_list, nb); +} +EXPORT_SYMBOL_NS_GPL(tpmi_unregister_notifier, "INTEL_TPMI"); + struct oobmsm_plat_info *tpmi_get_platform_data(struct auxiliary_device *auxdev) { struct intel_vsec_device *vsec_dev = auxdev_to_ivdev(auxdev); @@ -817,10 +832,6 @@ static int intel_vsec_tpmi_init(struct auxiliary_device *auxdev) auxiliary_set_drvdata(auxdev, tpmi_info); - ret = tpmi_create_devices(tpmi_info); - if (ret) - return ret; - /* * Allow debugfs when security policy allows. Everything this debugfs * interface provides, can also be done via /dev/mem access. If @@ -830,6 +841,14 @@ static int intel_vsec_tpmi_init(struct auxiliary_device *auxdev) if (!security_locked_down(LOCKDOWN_DEV_MEM) && capable(CAP_SYS_RAWIO)) tpmi_dbgfs_register(tpmi_info); + ret = tpmi_create_devices(tpmi_info); + if (ret) { + debugfs_remove_recursive(tpmi_info->dbgfs_dir); + return ret; + } + + blocking_notifier_call_chain(&tpmi_notify_list, TPMI_CORE_INIT, auxdev); + return 0; } @@ -843,6 +862,8 @@ static void tpmi_remove(struct auxiliary_device *auxdev) { struct intel_tpmi_info *tpmi_info = auxiliary_get_drvdata(auxdev); + blocking_notifier_call_chain(&tpmi_notify_list, TPMI_CORE_EXIT, auxdev); + debugfs_remove_recursive(tpmi_info->dbgfs_dir); } diff --git a/drivers/platform/x86/lenovo/Kconfig b/drivers/platform/x86/lenovo/Kconfig index f885127b007f19..09b1b055d2e014 100644 --- a/drivers/platform/x86/lenovo/Kconfig +++ b/drivers/platform/x86/lenovo/Kconfig @@ -252,7 +252,6 @@ config LENOVO_WMI_GAMEZONE select ACPI_PLATFORM_PROFILE select LENOVO_WMI_EVENTS select LENOVO_WMI_HELPERS - select LENOVO_WMI_TUNING help Say Y here if you have a WMI aware Lenovo Legion device and would like to use the platform-profile firmware interface to manage power usage. diff --git a/drivers/platform/x86/lenovo/wmi-capdata.c b/drivers/platform/x86/lenovo/wmi-capdata.c index b73d378f0e8b3e..714aa6fd6f1fcb 100644 --- a/drivers/platform/x86/lenovo/wmi-capdata.c +++ b/drivers/platform/x86/lenovo/wmi-capdata.c @@ -27,7 +27,6 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include -#include #include #include #include @@ -48,6 +47,7 @@ #include #include "wmi-capdata.h" +#include "wmi-helpers.h" #define LENOVO_CAPABILITY_DATA_00_GUID "362A3AFE-3D96-4665-8530-96DAD5BB300E" #define LENOVO_CAPABILITY_DATA_01_GUID "7A8F5407-CB67-4D6E-B547-39B3BE018154" @@ -57,9 +57,9 @@ #define LWMI_FEATURE_ID_FAN_TEST 0x05 -#define LWMI_ATTR_ID_FAN_TEST \ - (FIELD_PREP(LWMI_ATTR_DEV_ID_MASK, LWMI_DEVICE_ID_FAN) | \ - FIELD_PREP(LWMI_ATTR_FEAT_ID_MASK, LWMI_FEATURE_ID_FAN_TEST)) +#define LWMI_ATTR_ID_FAN_TEST \ + lwmi_attr_id(LWMI_DEVICE_ID_FAN, LWMI_FEATURE_ID_FAN_TEST, \ + LWMI_GZ_THERMAL_MODE_NONE, LWMI_TYPE_ID_NONE) enum lwmi_cd_type { LENOVO_CAPABILITY_DATA_00, diff --git a/drivers/platform/x86/lenovo/wmi-capdata.h b/drivers/platform/x86/lenovo/wmi-capdata.h index 8c1df3efcc5533..c3e760b8c3c3df 100644 --- a/drivers/platform/x86/lenovo/wmi-capdata.h +++ b/drivers/platform/x86/lenovo/wmi-capdata.h @@ -6,6 +6,7 @@ #define _LENOVO_WMI_CAPDATA_H_ #include +#include #include #define LWMI_SUPP_VALID BIT(0) @@ -19,6 +20,8 @@ #define LWMI_DEVICE_ID_FAN 0x04 +#define LWMI_TYPE_ID_NONE 0x00 + struct component_match; struct device; struct cd_list; @@ -57,6 +60,23 @@ struct lwmi_cd_binder { cd_list_cb_t cd_fan_list_cb; }; +/** + * lwmi_attr_id() - Formats a capability data attribute ID + * @dev_id: The u8 corresponding to the device ID. + * @feat_id: The u8 corresponding to the feature ID on the device. + * @mode_id: The u8 corresponding to the wmi-gamezone mode for set/get. + * @type_id: The u8 corresponding to the sub-device. + * + * Return: encoded capability data attribute ID. + */ +static inline u32 lwmi_attr_id(u8 dev_id, u8 feat_id, u8 mode_id, u8 type_id) +{ + return (FIELD_PREP(LWMI_ATTR_DEV_ID_MASK, dev_id) | + FIELD_PREP(LWMI_ATTR_FEAT_ID_MASK, feat_id) | + FIELD_PREP(LWMI_ATTR_MODE_ID_MASK, mode_id) | + FIELD_PREP(LWMI_ATTR_TYPE_ID_MASK, type_id)); +} + void lwmi_cd_match_add_all(struct device *master, struct component_match **matchptr); int lwmi_cd00_get_data(struct cd_list *list, u32 attribute_id, struct capdata00 *output); int lwmi_cd01_get_data(struct cd_list *list, u32 attribute_id, struct capdata01 *output); diff --git a/drivers/platform/x86/lenovo/wmi-events.c b/drivers/platform/x86/lenovo/wmi-events.c index 4a6a2c82413acf..fc25bba68a7c67 100644 --- a/drivers/platform/x86/lenovo/wmi-events.c +++ b/drivers/platform/x86/lenovo/wmi-events.c @@ -17,7 +17,7 @@ #include #include "wmi-events.h" -#include "wmi-gamezone.h" +#include "wmi-helpers.h" #define THERMAL_MODE_EVENT_GUID "D320289E-8FEA-41E0-86F9-911D83151B5F" diff --git a/drivers/platform/x86/lenovo/wmi-gamezone.c b/drivers/platform/x86/lenovo/wmi-gamezone.c index c7fe7e3c9f1791..109c0b564a9f6b 100644 --- a/drivers/platform/x86/lenovo/wmi-gamezone.c +++ b/drivers/platform/x86/lenovo/wmi-gamezone.c @@ -21,9 +21,7 @@ #include #include "wmi-events.h" -#include "wmi-gamezone.h" #include "wmi-helpers.h" -#include "wmi-other.h" #define LENOVO_GAMEZONE_GUID "887B54E3-DDDC-4B2C-8B88-68A26A8835D0" @@ -201,7 +199,7 @@ static int lwmi_gz_profile_set(struct device *dev, enum platform_profile_option profile) { struct lwmi_gz_priv *priv = dev_get_drvdata(dev); - struct wmi_method_args_32 args; + struct wmi_method_args_32 args = {}; enum thermal_mode mode; int ret; @@ -383,7 +381,7 @@ static int lwmi_gz_probe(struct wmi_device *wdev, const void *context) return ret; priv->mode_nb.notifier_call = lwmi_gz_mode_call; - return devm_lwmi_om_register_notifier(&wdev->dev, &priv->mode_nb); + return devm_lwmi_tm_register_notifier(&wdev->dev, &priv->mode_nb); } static const struct wmi_device_id lwmi_gz_id_table[] = { @@ -405,7 +403,6 @@ module_wmi_driver(lwmi_gz_driver); MODULE_IMPORT_NS("LENOVO_WMI_EVENTS"); MODULE_IMPORT_NS("LENOVO_WMI_HELPERS"); -MODULE_IMPORT_NS("LENOVO_WMI_OTHER"); MODULE_DEVICE_TABLE(wmi, lwmi_gz_id_table); MODULE_AUTHOR("Derek J. Clark "); MODULE_DESCRIPTION("Lenovo GameZone WMI Driver"); diff --git a/drivers/platform/x86/lenovo/wmi-gamezone.h b/drivers/platform/x86/lenovo/wmi-gamezone.h deleted file mode 100644 index 6b163a5eeb959d..00000000000000 --- a/drivers/platform/x86/lenovo/wmi-gamezone.h +++ /dev/null @@ -1,20 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0-or-later */ - -/* Copyright (C) 2025 Derek J. Clark */ - -#ifndef _LENOVO_WMI_GAMEZONE_H_ -#define _LENOVO_WMI_GAMEZONE_H_ - -enum gamezone_events_type { - LWMI_GZ_GET_THERMAL_MODE = 1, -}; - -enum thermal_mode { - LWMI_GZ_THERMAL_MODE_QUIET = 0x01, - LWMI_GZ_THERMAL_MODE_BALANCED = 0x02, - LWMI_GZ_THERMAL_MODE_PERFORMANCE = 0x03, - LWMI_GZ_THERMAL_MODE_EXTREME = 0xE0, /* Ver 6+ */ - LWMI_GZ_THERMAL_MODE_CUSTOM = 0xFF, -}; - -#endif /* !_LENOVO_WMI_GAMEZONE_H_ */ diff --git a/drivers/platform/x86/lenovo/wmi-helpers.c b/drivers/platform/x86/lenovo/wmi-helpers.c index 7379defac50028..7a198259e3933c 100644 --- a/drivers/platform/x86/lenovo/wmi-helpers.c +++ b/drivers/platform/x86/lenovo/wmi-helpers.c @@ -21,11 +21,15 @@ #include #include #include +#include #include #include #include "wmi-helpers.h" +/* Thermal mode notifier chain. */ +static BLOCKING_NOTIFIER_HEAD(tm_chain_head); + /** * lwmi_dev_evaluate_int() - Helper function for calling WMI methods that * return an integer. @@ -46,7 +50,6 @@ int lwmi_dev_evaluate_int(struct wmi_device *wdev, u8 instance, u32 method_id, unsigned char *buf, size_t size, u32 *retval) { struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL }; - union acpi_object *ret_obj __free(kfree) = NULL; struct acpi_buffer input = { size, buf }; acpi_status status; @@ -55,8 +58,9 @@ int lwmi_dev_evaluate_int(struct wmi_device *wdev, u8 instance, u32 method_id, if (ACPI_FAILURE(status)) return -EIO; + union acpi_object *ret_obj __free(kfree) = output.pointer; + if (retval) { - ret_obj = output.pointer; if (!ret_obj) return -ENODATA; @@ -84,6 +88,103 @@ int lwmi_dev_evaluate_int(struct wmi_device *wdev, u8 instance, u32 method_id, }; EXPORT_SYMBOL_NS_GPL(lwmi_dev_evaluate_int, "LENOVO_WMI_HELPERS"); +/** + * lwmi_tm_register_notifier() - Add a notifier to the blocking notifier chain + * @nb: The notifier_block struct to register + * + * Call blocking_notifier_chain_register to register the notifier block to the + * thermal mode notifier chain. + * + * Return: 0 on success, %-EEXIST on error. + */ +int lwmi_tm_register_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_register(&tm_chain_head, nb); +} +EXPORT_SYMBOL_NS_GPL(lwmi_tm_register_notifier, "LENOVO_WMI_HELPERS"); + +/** + * lwmi_tm_unregister_notifier() - Remove a notifier from the blocking notifier + * chain. + * @nb: The notifier_block struct to register + * + * Call blocking_notifier_chain_unregister to unregister the notifier block from the + * thermal mode notifier chain. + * + * Return: 0 on success, %-ENOENT on error. + */ +int lwmi_tm_unregister_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_unregister(&tm_chain_head, nb); +} +EXPORT_SYMBOL_NS_GPL(lwmi_tm_unregister_notifier, "LENOVO_WMI_HELPERS"); + +/** + * devm_lwmi_tm_unregister_notifier() - Remove a notifier from the blocking + * notifier chain. + * @data: Void pointer to the notifier_block struct to register. + * + * Call lwmi_tm_unregister_notifier to unregister the notifier block from the + * thermal mode notifier chain. + * + * Return: 0 on success, %-ENOENT on error. + */ +static void devm_lwmi_tm_unregister_notifier(void *data) +{ + struct notifier_block *nb = data; + + lwmi_tm_unregister_notifier(nb); +} + +/** + * devm_lwmi_tm_register_notifier() - Add a notifier to the blocking notifier + * chain. + * @dev: The parent device of the notifier_block struct. + * @nb: The notifier_block struct to register + * + * Call lwmi_tm_register_notifier to register the notifier block to the + * thermal mode notifier chain. Then add devm_lwmi_tm_unregister_notifier + * as a device managed action to automatically unregister the notifier block + * upon parent device removal. + * + * Return: 0 on success, or an error code. + */ +int devm_lwmi_tm_register_notifier(struct device *dev, + struct notifier_block *nb) +{ + int ret; + + ret = lwmi_tm_register_notifier(nb); + if (ret < 0) + return ret; + + return devm_add_action_or_reset(dev, devm_lwmi_tm_unregister_notifier, + nb); +} +EXPORT_SYMBOL_NS_GPL(devm_lwmi_tm_register_notifier, "LENOVO_WMI_HELPERS"); + +/** + * lwmi_tm_notifier_call() - Call functions for the notifier call chain. + * @mode: Pointer to a thermal mode enum to retrieve the data from. + * + * Call blocking_notifier_call_chain to retrieve the thermal mode from the + * lenovo-wmi-gamezone driver. + * + * Return: 0 on success, or an error code. + */ +int lwmi_tm_notifier_call(enum thermal_mode *mode) +{ + int ret; + + ret = blocking_notifier_call_chain(&tm_chain_head, + LWMI_GZ_GET_THERMAL_MODE, &mode); + if ((ret & ~NOTIFY_STOP_MASK) != NOTIFY_OK) + return -EINVAL; + + return 0; +} +EXPORT_SYMBOL_NS_GPL(lwmi_tm_notifier_call, "LENOVO_WMI_HELPERS"); + MODULE_AUTHOR("Derek J. Clark "); MODULE_DESCRIPTION("Lenovo WMI Helpers Driver"); MODULE_LICENSE("GPL"); diff --git a/drivers/platform/x86/lenovo/wmi-helpers.h b/drivers/platform/x86/lenovo/wmi-helpers.h index 20fd217498035d..ed7db3ebba6cc1 100644 --- a/drivers/platform/x86/lenovo/wmi-helpers.h +++ b/drivers/platform/x86/lenovo/wmi-helpers.h @@ -7,6 +7,8 @@ #include +struct device; +struct notifier_block; struct wmi_device; struct wmi_method_args_32 { @@ -14,7 +16,26 @@ struct wmi_method_args_32 { u32 arg1; }; +enum lwmi_event_type { + LWMI_GZ_GET_THERMAL_MODE = 0x01, +}; + +enum thermal_mode { + LWMI_GZ_THERMAL_MODE_NONE = 0x00, + LWMI_GZ_THERMAL_MODE_QUIET = 0x01, + LWMI_GZ_THERMAL_MODE_BALANCED = 0x02, + LWMI_GZ_THERMAL_MODE_PERFORMANCE = 0x03, + LWMI_GZ_THERMAL_MODE_EXTREME = 0xE0, /* Ver 6+ */ + LWMI_GZ_THERMAL_MODE_CUSTOM = 0xFF, +}; + int lwmi_dev_evaluate_int(struct wmi_device *wdev, u8 instance, u32 method_id, unsigned char *buf, size_t size, u32 *retval); +int lwmi_tm_register_notifier(struct notifier_block *nb); +int lwmi_tm_unregister_notifier(struct notifier_block *nb); +int devm_lwmi_tm_register_notifier(struct device *dev, + struct notifier_block *nb); +int lwmi_tm_notifier_call(enum thermal_mode *mode); + #endif /* !_LENOVO_WMI_HELPERS_H_ */ diff --git a/drivers/platform/x86/lenovo/wmi-other.c b/drivers/platform/x86/lenovo/wmi-other.c index 6040f45aa2b0d6..d318ba432fdcc1 100644 --- a/drivers/platform/x86/lenovo/wmi-other.c +++ b/drivers/platform/x86/lenovo/wmi-other.c @@ -40,16 +40,13 @@ #include #include #include -#include #include #include #include #include "wmi-capdata.h" #include "wmi-events.h" -#include "wmi-gamezone.h" #include "wmi-helpers.h" -#include "wmi-other.h" #include "../firmware_attributes_class.h" #define LENOVO_OTHER_MODE_GUID "DC2A8805-3A8C-41BA-A6F7-092E0089CD3B" @@ -62,8 +59,6 @@ #define LWMI_FEATURE_ID_FAN_RPM 0x03 -#define LWMI_TYPE_ID_NONE 0x00 - #define LWMI_FEATURE_VALUE_GET 17 #define LWMI_FEATURE_VALUE_SET 18 @@ -71,17 +66,15 @@ #define LWMI_FAN_NR 4 #define LWMI_FAN_ID(x) ((x) + LWMI_FAN_ID_BASE) -#define LWMI_ATTR_ID_FAN_RPM(x) \ - (FIELD_PREP(LWMI_ATTR_DEV_ID_MASK, LWMI_DEVICE_ID_FAN) | \ - FIELD_PREP(LWMI_ATTR_FEAT_ID_MASK, LWMI_FEATURE_ID_FAN_RPM) | \ - FIELD_PREP(LWMI_ATTR_TYPE_ID_MASK, LWMI_FAN_ID(x))) - #define LWMI_FAN_DIV 100 +#define LWMI_ATTR_ID_FAN_RPM(x) \ + lwmi_attr_id(LWMI_DEVICE_ID_FAN, LWMI_FEATURE_ID_FAN_RPM, \ + LWMI_GZ_THERMAL_MODE_NONE, LWMI_FAN_ID(x)) + #define LWMI_OM_FW_ATTR_BASE_PATH "lenovo-wmi-other" #define LWMI_OM_HWMON_NAME "lenovo_wmi_other" -static BLOCKING_NOTIFIER_HEAD(om_chain_head); static DEFINE_IDA(lwmi_om_ida); enum attribute_property { @@ -109,7 +102,6 @@ struct lwmi_om_priv { struct device *hwmon_dev; struct device *fw_attr_dev; struct kset *fw_attr_kset; - struct notifier_block nb; struct wmi_device *wdev; int ida_id; @@ -166,7 +158,7 @@ MODULE_PARM_DESC(relax_fan_constraint, */ static int lwmi_om_fan_get_set(struct lwmi_om_priv *priv, int channel, u32 *val, bool set) { - struct wmi_method_args_32 args; + struct wmi_method_args_32 args = {}; u32 method_id, retval; int err; @@ -349,6 +341,8 @@ static int lwmi_om_hwmon_write(struct device *dev, enum hwmon_sensor_types type, */ if (!relax_fan_constraint) raw = val / LWMI_FAN_DIV * LWMI_FAN_DIV; + else + raw = val; err = lwmi_om_fan_get_set(priv, channel, &raw, true); if (err) @@ -546,13 +540,26 @@ static void lwmi_om_fan_info_collect_cd_fan(struct device *dev, struct cd_list * /* ======== fw_attributes (component: lenovo-wmi-capdata 01) ======== */ struct tunable_attr_01 { - struct capdata01 *capdata; struct device *dev; - u32 feature_id; - u32 device_id; - u32 type_id; + u8 feature_id; + u8 device_id; + u8 type_id; + u8 cd_mode_id; /* mode arg for searching capdata */ + u8 cv_mode_id; /* mode arg for set/get current_value */ }; +/** + * tunable_attr_01_id() - Formats a tunable_attr_01 to a capdata attribute ID + * @attr: The tunable_attr_01 to format. + * @mode: The u8 corresponding to the wmi-gamezone mode for set/get. + * + * Return: encoded capability data attribute ID. + */ +static u32 tunable_attr_01_id(struct tunable_attr_01 *attr, u8 mode) +{ + return lwmi_attr_id(attr->device_id, attr->feature_id, mode, attr->type_id); +} + static struct tunable_attr_01 ppt_pl1_spl = { .device_id = LWMI_DEVICE_ID_CPU, .feature_id = LWMI_FEATURE_ID_CPU_SPL, @@ -576,102 +583,6 @@ struct capdata01_attr_group { struct tunable_attr_01 *tunable_attr; }; -/** - * lwmi_om_register_notifier() - Add a notifier to the blocking notifier chain - * @nb: The notifier_block struct to register - * - * Call blocking_notifier_chain_register to register the notifier block to the - * lenovo-wmi-other driver notifier chain. - * - * Return: 0 on success, %-EEXIST on error. - */ -int lwmi_om_register_notifier(struct notifier_block *nb) -{ - return blocking_notifier_chain_register(&om_chain_head, nb); -} -EXPORT_SYMBOL_NS_GPL(lwmi_om_register_notifier, "LENOVO_WMI_OTHER"); - -/** - * lwmi_om_unregister_notifier() - Remove a notifier from the blocking notifier - * chain. - * @nb: The notifier_block struct to register - * - * Call blocking_notifier_chain_unregister to unregister the notifier block from the - * lenovo-wmi-other driver notifier chain. - * - * Return: 0 on success, %-ENOENT on error. - */ -int lwmi_om_unregister_notifier(struct notifier_block *nb) -{ - return blocking_notifier_chain_unregister(&om_chain_head, nb); -} -EXPORT_SYMBOL_NS_GPL(lwmi_om_unregister_notifier, "LENOVO_WMI_OTHER"); - -/** - * devm_lwmi_om_unregister_notifier() - Remove a notifier from the blocking - * notifier chain. - * @data: Void pointer to the notifier_block struct to register. - * - * Call lwmi_om_unregister_notifier to unregister the notifier block from the - * lenovo-wmi-other driver notifier chain. - * - * Return: 0 on success, %-ENOENT on error. - */ -static void devm_lwmi_om_unregister_notifier(void *data) -{ - struct notifier_block *nb = data; - - lwmi_om_unregister_notifier(nb); -} - -/** - * devm_lwmi_om_register_notifier() - Add a notifier to the blocking notifier - * chain. - * @dev: The parent device of the notifier_block struct. - * @nb: The notifier_block struct to register - * - * Call lwmi_om_register_notifier to register the notifier block to the - * lenovo-wmi-other driver notifier chain. Then add devm_lwmi_om_unregister_notifier - * as a device managed action to automatically unregister the notifier block - * upon parent device removal. - * - * Return: 0 on success, or an error code. - */ -int devm_lwmi_om_register_notifier(struct device *dev, - struct notifier_block *nb) -{ - int ret; - - ret = lwmi_om_register_notifier(nb); - if (ret < 0) - return ret; - - return devm_add_action_or_reset(dev, devm_lwmi_om_unregister_notifier, - nb); -} -EXPORT_SYMBOL_NS_GPL(devm_lwmi_om_register_notifier, "LENOVO_WMI_OTHER"); - -/** - * lwmi_om_notifier_call() - Call functions for the notifier call chain. - * @mode: Pointer to a thermal mode enum to retrieve the data from. - * - * Call blocking_notifier_call_chain to retrieve the thermal mode from the - * lenovo-wmi-gamezone driver. - * - * Return: 0 on success, or an error code. - */ -static int lwmi_om_notifier_call(enum thermal_mode *mode) -{ - int ret; - - ret = blocking_notifier_call_chain(&om_chain_head, - LWMI_GZ_GET_THERMAL_MODE, &mode); - if ((ret & ~NOTIFY_STOP_MASK) != NOTIFY_OK) - return -EINVAL; - - return 0; -} - /* Attribute Methods */ /** @@ -716,12 +627,7 @@ static ssize_t attr_capdata01_show(struct kobject *kobj, u32 attribute_id; int value, ret; - attribute_id = - FIELD_PREP(LWMI_ATTR_DEV_ID_MASK, tunable_attr->device_id) | - FIELD_PREP(LWMI_ATTR_FEAT_ID_MASK, tunable_attr->feature_id) | - FIELD_PREP(LWMI_ATTR_MODE_ID_MASK, - LWMI_GZ_THERMAL_MODE_CUSTOM) | - FIELD_PREP(LWMI_ATTR_TYPE_ID_MASK, tunable_attr->type_id); + attribute_id = tunable_attr_01_id(tunable_attr, tunable_attr->cd_mode_id); ret = lwmi_cd01_get_data(priv->cd01_list, attribute_id, &capdata); if (ret) @@ -773,27 +679,22 @@ static ssize_t attr_current_value_store(struct kobject *kobj, struct tunable_attr_01 *tunable_attr) { struct lwmi_om_priv *priv = dev_get_drvdata(tunable_attr->dev); - struct wmi_method_args_32 args; + struct wmi_method_args_32 args = {}; struct capdata01 capdata; enum thermal_mode mode; - u32 attribute_id; u32 value; int ret; - ret = lwmi_om_notifier_call(&mode); + ret = lwmi_tm_notifier_call(&mode); if (ret) return ret; if (mode != LWMI_GZ_THERMAL_MODE_CUSTOM) return -EBUSY; - attribute_id = - FIELD_PREP(LWMI_ATTR_DEV_ID_MASK, tunable_attr->device_id) | - FIELD_PREP(LWMI_ATTR_FEAT_ID_MASK, tunable_attr->feature_id) | - FIELD_PREP(LWMI_ATTR_MODE_ID_MASK, mode) | - FIELD_PREP(LWMI_ATTR_TYPE_ID_MASK, tunable_attr->type_id); + args.arg0 = tunable_attr_01_id(tunable_attr, tunable_attr->cd_mode_id); - ret = lwmi_cd01_get_data(priv->cd01_list, attribute_id, &capdata); + ret = lwmi_cd01_get_data(priv->cd01_list, args.arg0, &capdata); if (ret) return ret; @@ -804,7 +705,7 @@ static ssize_t attr_current_value_store(struct kobject *kobj, if (value < capdata.min_value || value > capdata.max_value) return -EINVAL; - args.arg0 = attribute_id; + args.arg0 = tunable_attr_01_id(tunable_attr, tunable_attr->cv_mode_id); args.arg1 = value; ret = lwmi_dev_evaluate_int(priv->wdev, 0x0, LWMI_FEATURE_VALUE_SET, @@ -836,23 +737,20 @@ static ssize_t attr_current_value_show(struct kobject *kobj, struct tunable_attr_01 *tunable_attr) { struct lwmi_om_priv *priv = dev_get_drvdata(tunable_attr->dev); - struct wmi_method_args_32 args; + struct wmi_method_args_32 args = {}; enum thermal_mode mode; - u32 attribute_id; int retval; int ret; - ret = lwmi_om_notifier_call(&mode); + ret = lwmi_tm_notifier_call(&mode); if (ret) return ret; - attribute_id = - FIELD_PREP(LWMI_ATTR_DEV_ID_MASK, tunable_attr->device_id) | - FIELD_PREP(LWMI_ATTR_FEAT_ID_MASK, tunable_attr->feature_id) | - FIELD_PREP(LWMI_ATTR_MODE_ID_MASK, mode) | - FIELD_PREP(LWMI_ATTR_TYPE_ID_MASK, tunable_attr->type_id); + /* If "no-mode" is the supported mode, ensure we never send current mode */ + if (tunable_attr->cv_mode_id == LWMI_GZ_THERMAL_MODE_NONE) + mode = tunable_attr->cv_mode_id; - args.arg0 = attribute_id; + args.arg0 = tunable_attr_01_id(tunable_attr, mode); ret = lwmi_dev_evaluate_int(priv->wdev, 0x0, LWMI_FEATURE_VALUE_GET, (unsigned char *)&args, sizeof(args), @@ -863,6 +761,81 @@ static ssize_t attr_current_value_show(struct kobject *kobj, return sysfs_emit(buf, "%d\n", retval); } +/** + * lwmi_attr_01_is_supported() - Determine if the given attribute is supported. + * @tunable_attr: The attribute to verify. + * + * For an attribute to be supported it must have a functional get/set method, + * as well as associated capability data stored in the capdata01 table. + * + * First check if the attribute has a corresponding data table under custom mode + * (0xff), then under no mode (0x00). If either of those passes, check if the + * supported field of the capdata struct is > 0. If it is supported, store the + * successful mode in the cd_mode_id field of tunable_attr. + * + * If the attribute capdata shows it is supported, attempt to determine the mode + * for the current value property get/set methods using a similar pattern to the + * capdata table check. If the value returned by either mode is 0 or an error, + * assume that mode is not supported. Otherwise, store the successful mode in the + * cv_mode_id field of tunable_attr. + * + * If any of the above checks fail then the attribute is not fully supported. + * + * Return: true if capdata and set/get modes are found, otherwise false. + */ +static bool lwmi_attr_01_is_supported(struct tunable_attr_01 *tunable_attr) +{ + u8 modes[2] = { LWMI_GZ_THERMAL_MODE_CUSTOM, LWMI_GZ_THERMAL_MODE_NONE }; + struct lwmi_om_priv *priv = dev_get_drvdata(tunable_attr->dev); + struct wmi_method_args_32 args = {}; + bool cd_mode_found = false; + bool cv_mode_found = false; + struct capdata01 capdata; + int retval, ret, i; + + /* Determine tunable_attr->cd_mode_id */ + for (i = 0; i < ARRAY_SIZE(modes); i++) { + args.arg0 = tunable_attr_01_id(tunable_attr, modes[i]); + + ret = lwmi_cd01_get_data(priv->cd01_list, args.arg0, &capdata); + if (ret || !capdata.supported) + continue; + + tunable_attr->cd_mode_id = modes[i]; + cd_mode_found = true; + break; + } + + if (!cd_mode_found) + return cd_mode_found; + + dev_dbg(tunable_attr->dev, + "cd_mode_id: %#010x\n", args.arg0); + + /* Determine tunable_attr->cv_mode_id, returns 1 if supported */ + for (i = 0; i < ARRAY_SIZE(modes); i++) { + args.arg0 = tunable_attr_01_id(tunable_attr, modes[i]); + + ret = lwmi_dev_evaluate_int(priv->wdev, 0x0, LWMI_FEATURE_VALUE_GET, + (u8 *)&args, sizeof(args), + &retval); + if (ret || !retval) + continue; + + tunable_attr->cv_mode_id = modes[i]; + cv_mode_found = true; + break; + } + + if (!cv_mode_found) + return cv_mode_found; + + dev_dbg(tunable_attr->dev, "cv_mode_id: %#010x, attribute support level: %#010x\n", + args.arg0, capdata.supported); + + return capdata.supported > 0; +} + /* Lenovo WMI Other Mode Attribute macros */ #define __LWMI_ATTR_RO(_func, _name) \ { \ @@ -957,17 +930,17 @@ static struct capdata01_attr_group cd01_attr_groups[] = { /** * lwmi_om_fw_attr_add() - Register all firmware_attributes_class members * @priv: The Other Mode driver data. - * - * Return: Either 0, or an error code. */ -static int lwmi_om_fw_attr_add(struct lwmi_om_priv *priv) +static void lwmi_om_fw_attr_add(struct lwmi_om_priv *priv) { unsigned int i; int err; - priv->ida_id = ida_alloc(&lwmi_om_ida, GFP_KERNEL); - if (priv->ida_id < 0) - return priv->ida_id; + err = ida_alloc(&lwmi_om_ida, GFP_KERNEL); + if (err < 0) + goto err_no_ida; + + priv->ida_id = err; priv->fw_attr_dev = device_create(&firmware_attributes_class, NULL, MKDEV(0, 0), NULL, "%s-%u", @@ -986,14 +959,16 @@ static int lwmi_om_fw_attr_add(struct lwmi_om_priv *priv) } for (i = 0; i < ARRAY_SIZE(cd01_attr_groups) - 1; i++) { + cd01_attr_groups[i].tunable_attr->dev = &priv->wdev->dev; + if (!lwmi_attr_01_is_supported(cd01_attr_groups[i].tunable_attr)) + continue; + err = sysfs_create_group(&priv->fw_attr_kset->kobj, cd01_attr_groups[i].attr_group); if (err) goto err_remove_groups; - - cd01_attr_groups[i].tunable_attr->dev = &priv->wdev->dev; } - return 0; + return; err_remove_groups: while (i--) @@ -1007,7 +982,12 @@ static int lwmi_om_fw_attr_add(struct lwmi_om_priv *priv) err_free_ida: ida_free(&lwmi_om_ida, priv->ida_id); - return err; + +err_no_ida: + priv->ida_id = -EIDRM; + + dev_warn(&priv->wdev->dev, + "failed to register firmware-attributes device: %d\n", err); } /** @@ -1016,12 +996,17 @@ static int lwmi_om_fw_attr_add(struct lwmi_om_priv *priv) */ static void lwmi_om_fw_attr_remove(struct lwmi_om_priv *priv) { + if (priv->ida_id < 0) + return; + for (unsigned int i = 0; i < ARRAY_SIZE(cd01_attr_groups) - 1; i++) sysfs_remove_group(&priv->fw_attr_kset->kobj, cd01_attr_groups[i].attr_group); kset_unregister(priv->fw_attr_kset); device_unregister(priv->fw_attr_dev); + ida_free(&lwmi_om_ida, priv->ida_id); + priv->ida_id = -EIDRM; } /* ======== Self (master: lenovo-wmi-other) ======== */ @@ -1058,12 +1043,17 @@ static int lwmi_om_master_bind(struct device *dev) priv->cd00_list = binder.cd00_list; priv->cd01_list = binder.cd01_list; - if (!priv->cd00_list || !priv->cd01_list) + if (!priv->cd00_list || !priv->cd01_list) { + component_unbind_all(dev, NULL); + return -ENODEV; + } lwmi_om_fan_info_collect_cd00(priv); - return lwmi_om_fw_attr_add(priv); + lwmi_om_fw_attr_add(priv); + + return 0; } /** @@ -1115,13 +1105,7 @@ static int lwmi_other_probe(struct wmi_device *wdev, const void *context) static void lwmi_other_remove(struct wmi_device *wdev) { - struct lwmi_om_priv *priv = dev_get_drvdata(&wdev->dev); - component_master_del(&wdev->dev, &lwmi_om_master_ops); - - /* No IDA to free if the driver is never bound to its components. */ - if (priv->ida_id >= 0) - ida_free(&lwmi_om_ida, priv->ida_id); } static const struct wmi_device_id lwmi_other_id_table[] = { diff --git a/drivers/platform/x86/lenovo/wmi-other.h b/drivers/platform/x86/lenovo/wmi-other.h deleted file mode 100644 index 8ebf5602bb9970..00000000000000 --- a/drivers/platform/x86/lenovo/wmi-other.h +++ /dev/null @@ -1,16 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0-or-later */ - -/* Copyright (C) 2025 Derek J. Clark */ - -#ifndef _LENOVO_WMI_OTHER_H_ -#define _LENOVO_WMI_OTHER_H_ - -struct device; -struct notifier_block; - -int lwmi_om_register_notifier(struct notifier_block *nb); -int lwmi_om_unregister_notifier(struct notifier_block *nb); -int devm_lwmi_om_register_notifier(struct device *dev, - struct notifier_block *nb); - -#endif /* !_LENOVO_WMI_OTHER_H_ */ diff --git a/drivers/platform/x86/samsung-galaxybook.c b/drivers/platform/x86/samsung-galaxybook.c index 755cb82bdb606b..6382af0b106c06 100644 --- a/drivers/platform/x86/samsung-galaxybook.c +++ b/drivers/platform/x86/samsung-galaxybook.c @@ -53,7 +53,7 @@ struct samsung_galaxybook { void *i8042_filter_ptr; struct work_struct block_recording_hotkey_work; - struct input_dev *camera_lens_cover_switch; + struct input_dev *input; struct acpi_battery_hook battery_hook; @@ -197,6 +197,9 @@ static const guid_t performance_mode_guid = #define GB_ACPI_NOTIFY_DEVICE_ON_TABLE 0x6c #define GB_ACPI_NOTIFY_DEVICE_OFF_TABLE 0x6d #define GB_ACPI_NOTIFY_HOTKEY_PERFORMANCE_MODE 0x70 +#define GB_ACPI_NOTIFY_HOTKEY_KBD_BACKLIGHT 0x7d +#define GB_ACPI_NOTIFY_HOTKEY_MICMUTE 0x6e +#define GB_ACPI_NOTIFY_HOTKEY_CAMERA 0x6f #define GB_KEY_KBD_BACKLIGHT_KEYDOWN 0x2c #define GB_KEY_KBD_BACKLIGHT_KEYUP 0xac @@ -859,13 +862,29 @@ static int block_recording_acpi_set(struct samsung_galaxybook *galaxybook, const if (err) return err; - input_report_switch(galaxybook->camera_lens_cover_switch, + input_report_switch(galaxybook->input, SW_CAMERA_LENS_COVER, value ? 1 : 0); - input_sync(galaxybook->camera_lens_cover_switch); + input_sync(galaxybook->input); return 0; } +static int galaxybook_input_init(struct samsung_galaxybook *galaxybook) +{ + galaxybook->input = devm_input_allocate_device(&galaxybook->platform->dev); + if (!galaxybook->input) + return -ENOMEM; + + galaxybook->input->name = "Samsung Galaxy Book Camera Lens Cover"; + galaxybook->input->phys = DRIVER_NAME "/input0"; + galaxybook->input->id.bustype = BUS_HOST; + + input_set_capability(galaxybook->input, EV_KEY, KEY_MICMUTE); + input_set_capability(galaxybook->input, EV_SW, SW_CAMERA_LENS_COVER); + + return input_register_device(galaxybook->input); +} + static int galaxybook_block_recording_init(struct samsung_galaxybook *galaxybook) { bool value; @@ -887,24 +906,8 @@ static int galaxybook_block_recording_init(struct samsung_galaxybook *galaxybook return GB_NOT_SUPPORTED; } - galaxybook->camera_lens_cover_switch = - devm_input_allocate_device(&galaxybook->platform->dev); - if (!galaxybook->camera_lens_cover_switch) - return -ENOMEM; - - galaxybook->camera_lens_cover_switch->name = "Samsung Galaxy Book Camera Lens Cover"; - galaxybook->camera_lens_cover_switch->phys = DRIVER_NAME "/input0"; - galaxybook->camera_lens_cover_switch->id.bustype = BUS_HOST; - - input_set_capability(galaxybook->camera_lens_cover_switch, EV_SW, SW_CAMERA_LENS_COVER); - - err = input_register_device(galaxybook->camera_lens_cover_switch); - if (err) - return err; - - input_report_switch(galaxybook->camera_lens_cover_switch, - SW_CAMERA_LENS_COVER, value ? 1 : 0); - input_sync(galaxybook->camera_lens_cover_switch); + input_report_switch(galaxybook->input, SW_CAMERA_LENS_COVER, value ? 1 : 0); + input_sync(galaxybook->input); return 0; } @@ -1260,6 +1263,25 @@ static void galaxybook_acpi_notify(acpi_handle handle, u32 event, void *data) if (galaxybook->has_performance_mode) platform_profile_cycle(); break; + case GB_ACPI_NOTIFY_HOTKEY_KBD_BACKLIGHT: + if (galaxybook->has_kbd_backlight) + schedule_work(&galaxybook->kbd_backlight_hotkey_work); + break; + case GB_ACPI_NOTIFY_HOTKEY_MICMUTE: + input_report_key(galaxybook->input, KEY_MICMUTE, 1); + input_sync(galaxybook->input); + input_report_key(galaxybook->input, KEY_MICMUTE, 0); + input_sync(galaxybook->input); + break; + case GB_ACPI_NOTIFY_HOTKEY_CAMERA: + if (galaxybook->has_block_recording) { + schedule_work(&galaxybook->block_recording_hotkey_work); + } else { + input_report_switch(galaxybook->input, SW_CAMERA_LENS_COVER, + !test_bit(SW_CAMERA_LENS_COVER, galaxybook->input->sw)); + input_sync(galaxybook->input); + } + break; default: dev_warn(&galaxybook->platform->dev, "unknown ACPI notification event: 0x%x\n", event); @@ -1392,6 +1414,11 @@ static int galaxybook_probe(struct platform_device *pdev) return dev_err_probe(&galaxybook->platform->dev, err, "failed to initialize kbd_backlight\n"); + err = galaxybook_input_init(galaxybook); + if (err) + return dev_err_probe(&galaxybook->platform->dev, err, + "failed to initialize input device\n"); + err = galaxybook_fw_attrs_init(galaxybook); if (err) return dev_err_probe(&galaxybook->platform->dev, err, diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c index 4d32fc676aaf53..71e930e80178e9 100644 --- a/drivers/pmdomain/core.c +++ b/drivers/pmdomain/core.c @@ -3089,6 +3089,7 @@ static const struct bus_type genpd_bus_type = { static void genpd_dev_pm_detach(struct device *dev, bool power_off) { struct generic_pm_domain *pd; + bool is_virt_dev; unsigned int i; int ret = 0; @@ -3098,6 +3099,13 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off) dev_dbg(dev, "removing from PM domain %s\n", pd->name); + /* Check if the device was created by genpd at attach. */ + is_virt_dev = dev->bus == &genpd_bus_type; + + /* Disable runtime PM if we enabled it at attach. */ + if (is_virt_dev) + pm_runtime_disable(dev); + /* Drop the default performance state */ if (dev_gpd_data(dev)->default_pstate) { dev_pm_genpd_set_performance_state(dev, 0); @@ -3123,7 +3131,7 @@ static void genpd_dev_pm_detach(struct device *dev, bool power_off) genpd_queue_power_off_work(pd); /* Unregister the device if it was created by genpd. */ - if (dev->bus == &genpd_bus_type) + if (is_virt_dev) device_unregister(dev); } diff --git a/drivers/pmdomain/mediatek/mtk-pm-domains.c b/drivers/pmdomain/mediatek/mtk-pm-domains.c index d2b8d033295153..e1cfd42234734f 100644 --- a/drivers/pmdomain/mediatek/mtk-pm-domains.c +++ b/drivers/pmdomain/mediatek/mtk-pm-domains.c @@ -1015,6 +1015,7 @@ static int scpsys_get_bus_protection_legacy(struct device *dev, struct scpsys *s struct device_node *node, *smi_np; int num_regmaps = 0, i, j; struct regmap *regmap[3]; + int ret = 0; /* * Legacy code retrieves a maximum of three bus protection handles: @@ -1065,11 +1066,14 @@ static int scpsys_get_bus_protection_legacy(struct device *dev, struct scpsys *s if (node) { regmap[2] = syscon_regmap_lookup_by_phandle(node, "mediatek,infracfg-nao"); num_regmaps++; - of_node_put(node); - if (IS_ERR(regmap[2])) - return dev_err_probe(dev, PTR_ERR(regmap[2]), + if (IS_ERR(regmap[2])) { + ret = dev_err_probe(dev, PTR_ERR(regmap[2]), "%pOF: failed to get infracfg regmap\n", node); + of_node_put(node); + return ret; + } + of_node_put(node); } else { regmap[2] = NULL; } diff --git a/drivers/regulator/qcom-rpmh-regulator.c b/drivers/regulator/qcom-rpmh-regulator.c index 6e4cb2871fca8d..0dcb50bf5c35b1 100644 --- a/drivers/regulator/qcom-rpmh-regulator.c +++ b/drivers/regulator/qcom-rpmh-regulator.c @@ -1512,7 +1512,7 @@ static const struct rpmh_vreg_init_data pmh0101_vreg_data[] = { RPMH_VREG("ldo13", LDO, 13, &pmic5_pldo530_mvp150, "vdd-l2-l13-l14"), RPMH_VREG("ldo14", LDO, 14, &pmic5_pldo530_mvp150, "vdd-l2-l13-l14"), RPMH_VREG("ldo15", LDO, 15, &pmic5_nldo530, "vdd-l15"), - RPMH_VREG("ldo16", LDO, 15, &pmic5_pldo530_mvp600, "vdd-l5-l16"), + RPMH_VREG("ldo16", LDO, 16, &pmic5_pldo530_mvp600, "vdd-l5-l16"), RPMH_VREG("ldo17", LDO, 17, &pmic5_pldo515_mv, "vdd-l17"), RPMH_VREG("ldo18", LDO, 18, &pmic5_nldo530, "vdd-l18"), RPMH_VREG("bob1", BOB, 1, &pmic5_bob, "vdd-bob1"), diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c index 41b14344b16f2d..988fc291241d23 100644 --- a/drivers/resctrl/mpam_devices.c +++ b/drivers/resctrl/mpam_devices.c @@ -164,11 +164,17 @@ static void mpam_free_garbage(void) /* * Once mpam is enabled, new requestors cannot further reduce the available * partid. Assert that the size is fixed, and new requestors will be turned - * away. + * away. This is needed when walking over structures sized by PARTID. + * + * During mpam_disable() these structures are not fixed, but the MSC state + * is still reset using whatever sizes have been discovered so far. As only + * PARTID 0 will be used after mpam_disable(), any race would be benign. + * Skip the check if a mpam_disable_reason has been set. */ static void mpam_assert_partid_sizes_fixed(void) { - WARN_ON_ONCE(!partid_max_published); + if (!mpam_disable_reason) + WARN_ON_ONCE(!partid_max_published); } static u32 __mpam_read_reg(struct mpam_msc *msc, u16 reg) @@ -728,10 +734,9 @@ static void mpam_enable_quirks(struct mpam_msc *msc) * Try and see what values stick in this bit. If we can write either value, * its probably not implemented by hardware. */ -static bool _mpam_ris_hw_probe_hw_nrdy(struct mpam_msc_ris *ris, u32 mon_reg) +static bool mpam_ris_hw_probe_csu_nrdy(struct mpam_msc_ris *ris) { - u32 now; - u64 mon_sel; + u32 now, mon_sel, ctl_val; bool can_set, can_clear; struct mpam_msc *msc = ris->vmsc->msc; @@ -740,23 +745,30 @@ static bool _mpam_ris_hw_probe_hw_nrdy(struct mpam_msc_ris *ris, u32 mon_reg) mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, 0) | FIELD_PREP(MSMON_CFG_MON_SEL_RIS, ris->ris_idx); - _mpam_write_monsel_reg(msc, mon_reg, mon_sel); + mpam_write_monsel_reg(msc, CFG_MON_SEL, mon_sel); + + /* Hardware might ignore nrdy if it's not enabled */ + ctl_val = MSMON_CFG_CSU_CTL_TYPE_CSU; + ctl_val |= MSMON_CFG_x_CTL_MATCH_PARTID; + ctl_val |= MSMON_CFG_x_CTL_MATCH_PMG; + ctl_val |= MSMON_CFG_x_CTL_EN; + mpam_write_monsel_reg(msc, CFG_CSU_FLT, 0); + mpam_write_monsel_reg(msc, CFG_CSU_CTL, ctl_val); - _mpam_write_monsel_reg(msc, mon_reg, MSMON___NRDY); - now = _mpam_read_monsel_reg(msc, mon_reg); + _mpam_write_monsel_reg(msc, MSMON_CSU, MSMON___NRDY); + now = _mpam_read_monsel_reg(msc, MSMON_CSU); can_set = now & MSMON___NRDY; - _mpam_write_monsel_reg(msc, mon_reg, 0); - now = _mpam_read_monsel_reg(msc, mon_reg); + _mpam_write_monsel_reg(msc, MSMON_CSU, 0); + /* Configuration change to try and coax hardware into setting nrdy */ + mpam_write_monsel_reg(msc, CFG_CSU_FLT, 0x1); + now = _mpam_read_monsel_reg(msc, MSMON_CSU); can_clear = !(now & MSMON___NRDY); mpam_mon_sel_unlock(msc); return (!can_set || !can_clear); } -#define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg) \ - _mpam_ris_hw_probe_hw_nrdy(_ris, MSMON_##_mon_reg) - static void mpam_ris_hw_probe(struct mpam_msc_ris *ris) { int err; @@ -873,20 +885,18 @@ static void mpam_ris_hw_probe(struct mpam_msc_ris *ris) mpam_set_feature(mpam_feat_msmon_csu_xcl, props); /* Is NRDY hardware managed? */ - hw_managed = mpam_ris_hw_probe_hw_nrdy(ris, CSU); - if (hw_managed) - mpam_set_feature(mpam_feat_msmon_csu_hw_nrdy, props); - } + hw_managed = mpam_ris_hw_probe_csu_nrdy(ris); - /* - * Accept the missing firmware property if NRDY appears - * un-implemented. - */ - if (err && mpam_has_feature(mpam_feat_msmon_csu_hw_nrdy, props)) - dev_err_once(dev, "Counters are not usable because not-ready timeout was not provided by firmware."); + /* + * Accept the missing firmware property if NRDY appears + * un-implemented. + */ + if (err && hw_managed) + dev_err_once(dev, "Counters are not usable because not-ready timeout was not provided by firmware."); + } } if (FIELD_GET(MPAMF_MSMON_IDR_MSMON_MBWU, msmon_features)) { - bool has_long, hw_managed; + bool has_long; u32 mbwumon_idr = mpam_read_partsel_reg(msc, MBWUMON_IDR); props->num_mbwu_mon = FIELD_GET(MPAMF_MBWUMON_IDR_NUM_MON, mbwumon_idr); @@ -905,16 +915,6 @@ static void mpam_ris_hw_probe(struct mpam_msc_ris *ris) } else { mpam_set_feature(mpam_feat_msmon_mbwu_31counter, props); } - - /* Is NRDY hardware managed? */ - hw_managed = mpam_ris_hw_probe_hw_nrdy(ris, MBWU); - if (hw_managed) - mpam_set_feature(mpam_feat_msmon_mbwu_hw_nrdy, props); - - /* - * Don't warn about any missing firmware property for - * MBWU NRDY - it doesn't make any sense! - */ } } } @@ -1197,7 +1197,6 @@ static void __ris_msmon_read(void *arg) bool reset_on_next_read = false; struct mpam_msc_ris *ris = m->ris; struct msmon_mbwu_state *mbwu_state; - struct mpam_props *rprops = &ris->props; struct mpam_msc *msc = m->ris->vmsc->msc; u32 mon_sel, ctl_val, flt_val, cur_ctl, cur_flt; @@ -1253,8 +1252,7 @@ static void __ris_msmon_read(void *arg) switch (m->type) { case mpam_feat_msmon_csu: now = mpam_read_monsel_reg(msc, CSU); - if (mpam_has_feature(mpam_feat_msmon_csu_hw_nrdy, rprops)) - nrdy = now & MSMON___NRDY; + nrdy = now & MSMON___NRDY; now = FIELD_GET(MSMON___VALUE, now); if (mpam_has_quirk(IGNORE_CSU_NRDY, msc) && m->waited_timeout) @@ -1266,8 +1264,7 @@ static void __ris_msmon_read(void *arg) case mpam_feat_msmon_mbwu_63counter: if (m->type != mpam_feat_msmon_mbwu_31counter) { now = mpam_msc_read_mbwu_l(msc); - if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops)) - nrdy = now & MSMON___L_NRDY; + nrdy = now & MSMON___L_NRDY; if (m->type == mpam_feat_msmon_mbwu_63counter) now = FIELD_GET(MSMON___LWD_VALUE, now); @@ -1275,8 +1272,7 @@ static void __ris_msmon_read(void *arg) now = FIELD_GET(MSMON___L_VALUE, now); } else { now = mpam_read_monsel_reg(msc, MBWU); - if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops)) - nrdy = now & MSMON___NRDY; + nrdy = now & MSMON___NRDY; now = FIELD_GET(MSMON___VALUE, now); } @@ -2585,6 +2581,9 @@ static void __destroy_component_cfg(struct mpam_component *comp) lockdep_assert_held(&mpam_list_lock); + if (!comp->cfg) + return; + add_to_garbage(comp->cfg); list_for_each_entry(vmsc, &comp->vmsc, comp_list) { msc = vmsc->msc; diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h index 1914aefdcba9ef..04d1a59f02afb4 100644 --- a/drivers/resctrl/mpam_internal.h +++ b/drivers/resctrl/mpam_internal.h @@ -181,14 +181,12 @@ enum mpam_device_features { mpam_feat_msmon_csu, mpam_feat_msmon_csu_capture, mpam_feat_msmon_csu_xcl, - mpam_feat_msmon_csu_hw_nrdy, mpam_feat_msmon_mbwu, mpam_feat_msmon_mbwu_31counter, mpam_feat_msmon_mbwu_44counter, mpam_feat_msmon_mbwu_63counter, mpam_feat_msmon_mbwu_capture, mpam_feat_msmon_mbwu_rwbw, - mpam_feat_msmon_mbwu_hw_nrdy, mpam_feat_partid_nrw, MPAM_FEATURE_LAST }; diff --git a/drivers/reset/reset-eyeq.c b/drivers/reset/reset-eyeq.c index 791b7283111e2d..1a385798389709 100644 --- a/drivers/reset/reset-eyeq.c +++ b/drivers/reset/reset-eyeq.c @@ -422,13 +422,6 @@ static int eqr_of_xlate_twocells(struct reset_controller_dev *rcdev, return eqr_of_xlate_internal(rcdev, reset_spec->args[0], reset_spec->args[1]); } -static void eqr_of_node_put(void *_dev) -{ - struct device *dev = _dev; - - of_node_put(dev->of_node); -} - static int eqr_probe(struct auxiliary_device *adev, const struct auxiliary_device_id *id) { @@ -439,21 +432,8 @@ static int eqr_probe(struct auxiliary_device *adev, int ret; /* - * We are an auxiliary device of clk-eyeq. We do not have an OF node by - * default; let's reuse our parent's OF node. - */ - WARN_ON(dev->of_node); - device_set_of_node_from_dev(dev, dev->parent); - if (!dev->of_node) - return -ENODEV; - - ret = devm_add_action_or_reset(dev, eqr_of_node_put, dev); - if (ret) - return ret; - - /* - * Using our newfound OF node, we can get match data. We cannot use - * device_get_match_data() because it does not match reused OF nodes. + * Get match data. We cannot use device_get_match_data() because it does + * not accept reused OF nodes; see device_set_of_node_from_dev(). */ match = of_match_node(dev->driver->of_match_table, dev->of_node); if (!match || !match->data) diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index efb08b9b145a14..80ab0ff921d430 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -37,7 +37,7 @@ #define TPGS_MODE_EXPLICIT 0x2 #define ALUA_RTPG_SIZE 128 -#define ALUA_FAILOVER_TIMEOUT 60 +#define ALUA_FAILOVER_TIMEOUT 255 /* max 255 (8-bit value) */ #define ALUA_FAILOVER_RETRIES 5 #define ALUA_RTPG_DELAY_MSECS 5 #define ALUA_RTPG_RETRY_DELAY 2 diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index fda07b193137af..14d563e82d2085 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -1491,7 +1491,7 @@ static void prep_ata_v3_hw(struct hisi_hba *hisi_hba, phy_id = device->phy->identify.phy_identifier; hdr->dw0 |= cpu_to_le32((1U << phy_id) << CMD_HDR_PHY_ID_OFF); - hdr->dw0 |= CMD_HDR_FORCE_PHY_MSK; + hdr->dw0 |= cpu_to_le32(CMD_HDR_FORCE_PHY_MSK); hdr->dw0 |= cpu_to_le32(4U << CMD_HDR_CMD_OFF); } diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index 6ff78855729424..12caffeed3a0d2 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -2738,8 +2738,20 @@ scsih_sdev_configure(struct scsi_device *sdev, struct queue_limits *lim) pcie_device->enclosure_level, pcie_device->connector_name); + /* + * The HBA firmware passes the NVMe drive's MDTS + * (Maximum Data Transfer Size) up to the driver. However, + * the driver hardcodes a 4K buffer size for the PRP list, + * accommodating at most 512 entries. This strictly limits + * the maximum supported NVMe I/O transfer to 2 MiB. + * + * Cap max_hw_sectors to the smaller of the drive's reported + * MDTS or the 2 MiB driver limit to prevent kernel oopses. + */ + lim->max_hw_sectors = SZ_2M >> SECTOR_SHIFT; if (pcie_device->nvme_mdts) - lim->max_hw_sectors = pcie_device->nvme_mdts / 512; + lim->max_hw_sectors = min(lim->max_hw_sectors, + pcie_device->nvme_mdts >> SECTOR_SHIFT); pcie_device_put(pcie_device); spin_unlock_irqrestore(&ioc->pcie_device_lock, flags); diff --git a/drivers/scsi/pmcraid.h b/drivers/scsi/pmcraid.h index 9f59930e8b4fdd..cd059b7599b4ce 100644 --- a/drivers/scsi/pmcraid.h +++ b/drivers/scsi/pmcraid.h @@ -657,7 +657,7 @@ struct pmcraid_hostrcb { */ struct pmcraid_instance { /* Array of allowed-to-be-exposed resources, initialized from - * Configutation Table, later updated with CCNs + * Configuration Table, later updated with CCNs */ struct pmcraid_resource_entry *res_entries; diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 2b4b2a1a8e442c..74cd4e8a61c2a0 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1801,7 +1801,7 @@ sg_start_req(Sg_request *srp, unsigned char *cmd) } res = blk_rq_map_user_io(rq, md, hp->dxferp, hp->dxfer_len, - GFP_ATOMIC, iov_count, iov_count, 1, rw); + GFP_KERNEL, iov_count, iov_count, 1, rw); if (!res) { srp->bio = rq->bio; diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index b4ed991976d065..2026ac645d6ab4 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -9427,6 +9427,7 @@ static void pqi_shutdown(struct pci_dev *pci_dev) pqi_crash_if_pending_command(ctrl_info); pqi_reset(ctrl_info); + pqi_ctrl_unblock_device_reset(ctrl_info); } static void pqi_process_lockup_action_param(void) diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c index 0490777fa40671..1ce708657d8c60 100644 --- a/drivers/soundwire/bus.c +++ b/drivers/soundwire/bus.c @@ -1372,6 +1372,37 @@ int sdw_slave_get_current_bank(struct sdw_slave *slave) } EXPORT_SYMBOL_GPL(sdw_slave_get_current_bank); +/** + * sdw_slave_wait_for_init - Wait for device initialisation + * @slave: Pointer to the SoundWire peripheral. + * @timeout_ms: Timeout in milliseconds. + * + * Wait for a peripheral device to enumerate and be initialised by the + * SoundWire core. + * + * Return: Zero on success, and a negative error code on failure. + */ +int sdw_slave_wait_for_init(struct sdw_slave *slave, int timeout_ms) +{ + unsigned long time; + + if (!slave->unattach_request) + return 0; + + time = wait_for_completion_timeout(&slave->initialization_complete, + msecs_to_jiffies(timeout_ms)); + if (!time) { + dev_err(&slave->dev, "Initialization not complete\n"); + sdw_show_ping_status(slave->bus, true); + return -ETIMEDOUT; + } + + slave->unattach_request = 0; + + return 0; +} +EXPORT_SYMBOL_GPL(sdw_slave_wait_for_init); + static int sdw_slave_set_frequency(struct sdw_slave *slave) { int scale_index; diff --git a/drivers/soundwire/intel_auxdevice.c b/drivers/soundwire/intel_auxdevice.c index ee951be1546577..0b8107bec9abe1 100644 --- a/drivers/soundwire/intel_auxdevice.c +++ b/drivers/soundwire/intel_auxdevice.c @@ -72,6 +72,7 @@ static struct wake_capable_part wake_capable_list[] = { {0x025d, 0x717}, {0x025d, 0x721}, {0x025d, 0x722}, + {0x04b3, 0x9356}, }; static bool is_wake_capable(struct sdw_slave *slave) diff --git a/drivers/spi/spi-amd.c b/drivers/spi/spi-amd.c index 4d1dce4f497406..71a6e5c475b037 100644 --- a/drivers/spi/spi-amd.c +++ b/drivers/spi/spi-amd.c @@ -868,7 +868,7 @@ static int amd_spi_probe(struct platform_device *pdev) dev_dbg(dev, "io_remap_address: %p\n", amd_spi->io_remap_addr); amd_spi->version = (uintptr_t)device_get_match_data(dev); - host->bus_num = 0; + host->bus_num = (amd_spi->version == AMD_HID2_SPI) ? 2 : 0; return amd_spi_probe_common(dev, host); } diff --git a/drivers/spi/spi-ch341.c b/drivers/spi/spi-ch341.c index 3eaa8f176f63aa..6448a44a8b67e1 100644 --- a/drivers/spi/spi-ch341.c +++ b/drivers/spi/spi-ch341.c @@ -250,5 +250,5 @@ static struct usb_driver ch341a_usb_driver = { module_usb_driver(ch341a_usb_driver); MODULE_AUTHOR("Johannes Thumshirn "); -MODULE_DESCRIPTION("QiHeng Electronics ch341 USB2SPI"); +MODULE_DESCRIPTION("Nanjing Qinheng Microelectronics CH341 USB2SPI driver"); MODULE_LICENSE("GPL v2"); diff --git a/drivers/spi/spi-imx.c b/drivers/spi/spi-imx.c index e5c907c45b8786..480d1e8b281f19 100644 --- a/drivers/spi/spi-imx.c +++ b/drivers/spi/spi-imx.c @@ -1382,9 +1382,7 @@ static int spi_imx_setupxfer(struct spi_device *spi, spi_imx->target_burst = t->len; } - spi_imx->devtype_data->prepare_transfer(spi_imx, spi, t); - - return 0; + return spi_imx->devtype_data->prepare_transfer(spi_imx, spi, t); } static void spi_imx_sdma_exit(struct spi_imx_data *spi_imx) @@ -1709,6 +1707,7 @@ static int spi_imx_dma_data_prepare(struct spi_imx_data *spi_imx, kfree(spi_imx->dma_data[0].dma_tx_buf); kfree(spi_imx->dma_data[0].dma_rx_buf); kfree(spi_imx->dma_data); + return ret; } } @@ -1836,7 +1835,7 @@ static void spi_imx_dma_max_wml_find(struct spi_imx_data *spi_imx, unsigned int i; for (i = spi_imx->devtype_data->fifo_size / 2; i > 0; i--) { - if (!dma_data->dma_len % (i * bytes_per_word)) + if (!(dma_data->dma_len % (i * bytes_per_word))) break; } /* Use 1 as wml in case no available burst length got */ diff --git a/drivers/spi/spi-microchip-core-qspi.c b/drivers/spi/spi-microchip-core-qspi.c index eab059fb0bc2ce..4dee0fea1df8dd 100644 --- a/drivers/spi/spi-microchip-core-qspi.c +++ b/drivers/spi/spi-microchip-core-qspi.c @@ -74,6 +74,13 @@ #define STATUS_FLAGSX4 BIT(8) #define STATUS_MASK GENMASK(8, 0) +/* + * QSPI Direct Access register defines + */ +#define DIRECT_ACCESS_EN_SSEL BIT(0) +#define DIRECT_ACCESS_OP_SSEL BIT(1) +#define DIRECT_ACCESS_OP_SSEL_SHIFT 1 + #define BYTESUPPER_MASK GENMASK(31, 16) #define BYTESLOWER_MASK GENMASK(15, 0) @@ -158,7 +165,39 @@ static int mchp_coreqspi_set_mode(struct mchp_coreqspi *qspi, const struct spi_m return 0; } -static inline void mchp_coreqspi_read_op(struct mchp_coreqspi *qspi) +static void mchp_coreqspi_set_cs(struct spi_device *spi, bool enable) +{ + struct mchp_coreqspi *qspi = spi_controller_get_devdata(spi->controller); + u32 val; + + val = readl(qspi->regs + REG_DIRECT_ACCESS); + + val &= ~DIRECT_ACCESS_OP_SSEL; + val |= !enable << DIRECT_ACCESS_OP_SSEL_SHIFT; + + writel(val, qspi->regs + REG_DIRECT_ACCESS); +} + +static int mchp_coreqspi_setup(struct spi_device *spi) +{ + struct mchp_coreqspi *qspi = spi_controller_get_devdata(spi->controller); + u32 val; + + /* + * Active low devices need to be specifically set to their inactive + * states during probe. + */ + if (spi->mode & SPI_CS_HIGH) + return 0; + + val = readl(qspi->regs + REG_DIRECT_ACCESS); + val |= DIRECT_ACCESS_OP_SSEL; + writel(val, qspi->regs + REG_DIRECT_ACCESS); + + return 0; +} + +static void mchp_coreqspi_read_op(struct mchp_coreqspi *qspi) { u32 control, data; @@ -194,7 +233,7 @@ static inline void mchp_coreqspi_read_op(struct mchp_coreqspi *qspi) } } -static inline void mchp_coreqspi_write_op(struct mchp_coreqspi *qspi) +static void mchp_coreqspi_write_op(struct mchp_coreqspi *qspi) { u32 control, data; @@ -222,7 +261,7 @@ static inline void mchp_coreqspi_write_op(struct mchp_coreqspi *qspi) } } -static inline void mchp_coreqspi_write_read_op(struct mchp_coreqspi *qspi) +static void mchp_coreqspi_write_read_op(struct mchp_coreqspi *qspi) { u32 control, data; @@ -380,20 +419,7 @@ static int mchp_coreqspi_setup_clock(struct mchp_coreqspi *qspi, struct spi_devi return 0; } -static int mchp_coreqspi_setup_op(struct spi_device *spi_dev) -{ - struct spi_controller *ctlr = spi_dev->controller; - struct mchp_coreqspi *qspi = spi_controller_get_devdata(ctlr); - u32 control = readl_relaxed(qspi->regs + REG_CONTROL); - - control |= (CONTROL_MASTER | CONTROL_ENABLE); - control &= ~CONTROL_CLKIDLE; - writel_relaxed(control, qspi->regs + REG_CONTROL); - - return 0; -} - -static inline void mchp_coreqspi_config_op(struct mchp_coreqspi *qspi, const struct spi_mem_op *op) +static void mchp_coreqspi_config_op(struct mchp_coreqspi *qspi, const struct spi_mem_op *op) { u32 idle_cycles = 0; int total_bytes, cmd_bytes, frames, ctrl; @@ -483,6 +509,7 @@ static int mchp_coreqspi_exec_op(struct spi_mem *mem, const struct spi_mem_op *o reinit_completion(&qspi->data_completion); mchp_coreqspi_config_op(qspi, op); + mchp_coreqspi_set_cs(mem->spi, true); if (op->cmd.opcode) { qspi->txbuf = &opcode; qspi->rxbuf = NULL; @@ -523,6 +550,7 @@ static int mchp_coreqspi_exec_op(struct spi_mem *mem, const struct spi_mem_op *o err = -ETIMEDOUT; error: + mchp_coreqspi_set_cs(mem->spi, false); mutex_unlock(&qspi->op_lock); mchp_coreqspi_disable_ints(qspi); @@ -662,18 +690,28 @@ static int mchp_coreqspi_transfer_one(struct spi_controller *ctlr, struct spi_de struct spi_transfer *t) { struct mchp_coreqspi *qspi = spi_controller_get_devdata(ctlr); + bool dual_quad = false; qspi->tx_len = t->len; + if (t->tx_nbits == SPI_NBITS_QUAD || t->rx_nbits == SPI_NBITS_QUAD || + t->tx_nbits == SPI_NBITS_DUAL || + t->rx_nbits == SPI_NBITS_DUAL) + dual_quad = true; + if (t->tx_buf) qspi->txbuf = (u8 *)t->tx_buf; if (!t->rx_buf) { mchp_coreqspi_write_op(qspi); - } else { + } else if (!dual_quad) { qspi->rxbuf = (u8 *)t->rx_buf; qspi->rx_len = t->len; mchp_coreqspi_write_read_op(qspi); + } else { + qspi->rxbuf = (u8 *)t->rx_buf; + qspi->rx_len = t->len; + mchp_coreqspi_read_op(qspi); } return 0; @@ -686,6 +724,7 @@ static int mchp_coreqspi_probe(struct platform_device *pdev) struct device *dev = &pdev->dev; struct device_node *np = dev->of_node; int ret; + u32 num_cs, val; ctlr = devm_spi_alloc_host(&pdev->dev, sizeof(*qspi)); if (!ctlr) @@ -718,10 +757,18 @@ static int mchp_coreqspi_probe(struct platform_device *pdev) return ret; } + /* + * The IP core only has a single CS, any more have to be provided via + * gpios + */ + if (of_property_read_u32(pdev->dev.of_node, "num-cs", &num_cs)) + num_cs = 1; + + ctlr->num_chipselect = num_cs; + ctlr->bits_per_word_mask = SPI_BPW_MASK(8); ctlr->mem_ops = &mchp_coreqspi_mem_ops; ctlr->mem_caps = &mchp_coreqspi_mem_caps; - ctlr->setup = mchp_coreqspi_setup_op; ctlr->mode_bits = SPI_CPOL | SPI_CPHA | SPI_RX_DUAL | SPI_RX_QUAD | SPI_TX_DUAL | SPI_TX_QUAD; ctlr->dev.of_node = np; @@ -729,9 +776,21 @@ static int mchp_coreqspi_probe(struct platform_device *pdev) ctlr->prepare_message = mchp_coreqspi_prepare_message; ctlr->unprepare_message = mchp_coreqspi_unprepare_message; ctlr->transfer_one = mchp_coreqspi_transfer_one; - ctlr->num_chipselect = 2; + ctlr->setup = mchp_coreqspi_setup; + ctlr->set_cs = mchp_coreqspi_set_cs; ctlr->use_gpio_descriptors = true; + val = readl_relaxed(qspi->regs + REG_CONTROL); + val |= (CONTROL_MASTER | CONTROL_ENABLE); + writel_relaxed(val, qspi->regs + REG_CONTROL); + + /* + * Put cs into software controlled mode + */ + val = readl_relaxed(qspi->regs + REG_DIRECT_ACCESS); + val |= DIRECT_ACCESS_EN_SSEL; + writel(val, qspi->regs + REG_DIRECT_ACCESS); + ret = spi_register_controller(ctlr); if (ret) return dev_err_probe(&pdev->dev, ret, diff --git a/drivers/staging/greybus/hid.c b/drivers/staging/greybus/hid.c index 1f58c907c03683..f1f9f6fbc00e52 100644 --- a/drivers/staging/greybus/hid.c +++ b/drivers/staging/greybus/hid.c @@ -201,7 +201,7 @@ static void gb_hid_init_report(struct gb_hid *ghid, struct hid_report *report) * we just need to setup the input fields, so using * hid_report_raw_event is safe. */ - hid_report_raw_event(ghid->hid, report->type, ghid->inbuf, size, 1); + hid_report_raw_event(ghid->hid, report->type, ghid->inbuf, ghid->bufsize, size, 1); } static void gb_hid_init_reports(struct gb_hid *ghid) diff --git a/drivers/staging/rtl8723bs/os_dep/osdep_service.c b/drivers/staging/rtl8723bs/os_dep/osdep_service.c index 7959daeabc6ff0..4cfdf7c623440a 100644 --- a/drivers/staging/rtl8723bs/os_dep/osdep_service.c +++ b/drivers/staging/rtl8723bs/os_dep/osdep_service.c @@ -194,7 +194,8 @@ struct rtw_cbuf *rtw_cbuf_alloc(u32 size) struct rtw_cbuf *cbuf; cbuf = kzalloc_flex(*cbuf, bufs, size); - cbuf->size = size; + if (cbuf) + cbuf->size = size; return cbuf; } diff --git a/drivers/staging/vme_user/vme_fake.c b/drivers/staging/vme_user/vme_fake.c index be4ad47ed526a9..8abaa3165fbb7b 100644 --- a/drivers/staging/vme_user/vme_fake.c +++ b/drivers/staging/vme_user/vme_fake.c @@ -1230,6 +1230,8 @@ static int __init fake_init(void) err_driver: kfree(fake_bridge); err_struct: + root_device_unregister(vme_root); + return retval; } diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index e80449f6ce159c..cb832fd523af18 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -995,6 +995,7 @@ int iscsit_setup_scsi_cmd(struct iscsit_conn *conn, struct iscsit_cmd *cmd, int data_direction, payload_length; struct iscsi_ecdb_ahdr *ecdb_ahdr; struct iscsi_scsi_req *hdr; + u16 ahslength, cdb_length; int iscsi_task_attr; unsigned char *cdb; int sam_task_attr; @@ -1108,14 +1109,27 @@ int iscsit_setup_scsi_cmd(struct iscsit_conn *conn, struct iscsit_cmd *cmd, ISCSI_REASON_CMD_NOT_SUPPORTED, buf); } - cdb = kmalloc(be16_to_cpu(ecdb_ahdr->ahslength) + 15, - GFP_KERNEL); + ahslength = be16_to_cpu(ecdb_ahdr->ahslength); + if (!ahslength) { + pr_err("Extended CDB AHS with zero length, protocol error.\n"); + return iscsit_add_reject_cmd(cmd, + ISCSI_REASON_PROTOCOL_ERROR, buf); + } + if (ahslength > (hdr->hlength * 4) - 3) { + pr_err("Extended CDB AHS length %u exceeds available PDU buffer.\n", + ahslength); + return iscsit_add_reject_cmd(cmd, + ISCSI_REASON_PROTOCOL_ERROR, buf); + } + + cdb_length = ahslength - 1 + ISCSI_CDB_SIZE; + + cdb = kmalloc(cdb_length, GFP_KERNEL); if (cdb == NULL) return iscsit_add_reject_cmd(cmd, ISCSI_REASON_BOOKMARK_NO_RESOURCES, buf); memcpy(cdb, hdr->cdb, ISCSI_CDB_SIZE); - memcpy(cdb + ISCSI_CDB_SIZE, ecdb_ahdr->ecdb, - be16_to_cpu(ecdb_ahdr->ahslength) - 1); + memcpy(cdb + ISCSI_CDB_SIZE, ecdb_ahdr->ecdb, cdb_length - ISCSI_CDB_SIZE); } data_direction = (hdr->flags & ISCSI_FLAG_CMD_WRITE) ? DMA_TO_DEVICE : diff --git a/drivers/target/target_core_configfs.c b/drivers/target/target_core_configfs.c index d93773b3227c3e..2b19a956007b73 100644 --- a/drivers/target/target_core_configfs.c +++ b/drivers/target/target_core_configfs.c @@ -3249,7 +3249,7 @@ static ssize_t target_tg_pt_gp_members_show(struct config_item *item, config_item_name(&lun->lun_group.cg_item)); cur_len++; /* Extra byte for NULL terminator */ - if ((cur_len + len) > PAGE_SIZE) { + if (cur_len > TG_PT_GROUP_NAME_BUF || (cur_len + len) > PAGE_SIZE) { pr_warn("Ran out of lu_gp_show_attr" "_members buffer\n"); break; diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index 4805e40ed4d78c..c3f08957d179ac 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -9259,6 +9259,30 @@ static void ufshcd_config_mcq(struct ufs_hba *hba) hba->nutrs); } +/** + * ufshcd_get_op_mode - get UFS operating mode. + * @hba: per-adapter instance + * + * Use the PA_PWRMODE value to represent the operating mode of UFS. + * + */ +static enum ufs_op_mode ufshcd_get_op_mode(struct ufs_hba *hba) +{ + u32 mode; + u8 rx_mode; + u8 tx_mode; + + ufshcd_dme_get(hba, UIC_ARG_MIB(PA_PWRMODE), &mode); + rx_mode = (mode >> PWRMODE_RX_OFFSET) & PWRMODE_MASK; + tx_mode = mode & PWRMODE_MASK; + + if ((rx_mode == SLOW_MODE || rx_mode == SLOWAUTO_MODE) && + (tx_mode == SLOW_MODE || tx_mode == SLOWAUTO_MODE)) + return LS_MODE; + + return HS_MODE; +} + static int ufshcd_post_device_init(struct ufs_hba *hba) { int ret; @@ -9281,11 +9305,13 @@ static int ufshcd_post_device_init(struct ufs_hba *hba) return 0; /* - * Set the right value to bRefClkFreq before attempting to + * Set the right value to bRefClkFreq in LS_MODE before attempting to * switch to HS gears. */ - if (hba->dev_ref_clk_freq != REF_CLK_FREQ_INVAL) + if (ufshcd_get_op_mode(hba) == LS_MODE && + hba->dev_ref_clk_freq != REF_CLK_FREQ_INVAL) ufshcd_set_dev_ref_clk(hba); + /* Gear up to HS gear. */ ret = ufshcd_config_pwr_mode(hba, &hba->max_pwr_info.info, UFSHCD_PMC_POLICY_DONT_FORCE); diff --git a/drivers/usb/class/usblp.c b/drivers/usb/class/usblp.c index 669b9e6879bfa5..746414763da5d6 100644 --- a/drivers/usb/class/usblp.c +++ b/drivers/usb/class/usblp.c @@ -1178,7 +1178,7 @@ static int usblp_probe(struct usb_interface *intf, } /* Allocate buffer for printer status */ - usblp->statusbuf = kmalloc(STATUS_BUF_SIZE, GFP_KERNEL); + usblp->statusbuf = kzalloc(STATUS_BUF_SIZE, GFP_KERNEL); if (!usblp->statusbuf) { retval = -ENOMEM; goto abort; @@ -1377,6 +1377,7 @@ static int usblp_cache_device_id_string(struct usblp *usblp) { int err, length; + memset(usblp->device_id_string, 0, USBLP_DEVICE_ID_SIZE); err = usblp_get_id(usblp, 0, usblp->device_id_string, USBLP_DEVICE_ID_SIZE - 1); if (err < 0) { dev_dbg(&usblp->intf->dev, diff --git a/drivers/usb/common/ulpi.c b/drivers/usb/common/ulpi.c index b34fb65813c45e..9b69148128e5bf 100644 --- a/drivers/usb/common/ulpi.c +++ b/drivers/usb/common/ulpi.c @@ -286,12 +286,15 @@ static int ulpi_register(struct device *dev, struct ulpi *ulpi) ACPI_COMPANION_SET(&ulpi->dev, ACPI_COMPANION(dev)); ret = ulpi_of_register(ulpi); - if (ret) + if (ret) { + kfree(ulpi); return ret; + } ret = ulpi_read_id(ulpi); if (ret) { of_node_put(ulpi->dev.of_node); + kfree(ulpi); return ret; } diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index 58899b1fa96d29..65213896de998a 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -1359,12 +1359,6 @@ int dwc3_core_init(struct dwc3 *dwc) hw_mode = DWC3_GHWPARAMS0_MODE(dwc->hwparams.hwparams0); - /* - * Write Linux Version Code to our GUID register so it's easy to figure - * out which kernel version a bug was found. - */ - dwc3_writel(dwc, DWC3_GUID, LINUX_VERSION_CODE); - ret = dwc3_phy_setup(dwc); if (ret) return ret; @@ -1398,6 +1392,12 @@ int dwc3_core_init(struct dwc3 *dwc) if (ret) goto err_exit_phy; + /* + * Write Linux Version Code to our GUID register so it's easy to figure + * out which kernel version a bug was found. + */ + dwc3_writel(dwc, DWC3_GUID, LINUX_VERSION_CODE); + dwc3_core_setup_global_control(dwc); dwc3_core_num_eps(dwc); diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c index 91139ae668f480..f3ca79cece1bea 100644 --- a/drivers/usb/gadget/udc/omap_udc.c +++ b/drivers/usb/gadget/udc/omap_udc.c @@ -733,8 +733,6 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel) if (status == 0) { omap_writew(reg, UDC_TXDMA_CFG); /* EMIFF or SDRC */ - omap_set_dma_src_burst_mode(ep->lch, - OMAP_DMA_DATA_BURST_4); omap_set_dma_src_data_pack(ep->lch, 1); /* TIPB */ omap_set_dma_dest_params(ep->lch, @@ -756,8 +754,6 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel) UDC_DATA_DMA, 0, 0); /* EMIFF or SDRC */ - omap_set_dma_dest_burst_mode(ep->lch, - OMAP_DMA_DATA_BURST_4); omap_set_dma_dest_data_pack(ep->lch, 1); } } diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c index c71461893d20c2..42e4cecd28aca6 100644 --- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -1513,7 +1513,11 @@ static const struct usb_device_id option_ids[] = { { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1231, 0xff), /* Telit LE910Cx (RNDIS) */ .driver_info = NCTRL(2) | RSVD(3) }, { USB_DEVICE_AND_INTERFACE_INFO(TELIT_VENDOR_ID, 0x1250, 0xff, 0x00, 0x00) }, /* Telit LE910Cx (rmnet) */ + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1251, 0xff) }, /* Telit LE910Cx (RNDIS) */ { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1252, 0xff) }, /* Telit LE910Cx (MBIM) */ + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1253, 0xff) }, /* Telit LE910Cx (ECM) */ + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1254, 0xff) }, /* Telit LE910Cx */ + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1255, 0xff) }, /* Telit LE910Cx */ { USB_DEVICE(TELIT_VENDOR_ID, 0x1260), .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) }, { USB_DEVICE(TELIT_VENDOR_ID, 0x1261), diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c index dfbb94ddc98a61..55fee96d3342ad 100644 --- a/drivers/usb/typec/tcpm/tcpm.c +++ b/drivers/usb/typec/tcpm/tcpm.c @@ -732,9 +732,14 @@ static const char * const pd_rev[] = { (tcpm_cc_is_source((port)->cc2) && \ !tcpm_cc_is_source((port)->cc1))) +#define tcpm_port_is_debug_source(port) \ + (tcpm_cc_is_source((port)->cc1) && tcpm_cc_is_source((port)->cc2)) + +#define tcpm_port_is_debug_sink(port) \ + (tcpm_cc_is_sink((port)->cc1) && tcpm_cc_is_sink((port)->cc2)) + #define tcpm_port_is_debug(port) \ - ((tcpm_cc_is_source((port)->cc1) && tcpm_cc_is_source((port)->cc2)) || \ - (tcpm_cc_is_sink((port)->cc1) && tcpm_cc_is_sink((port)->cc2))) + (tcpm_port_is_debug_source(port) || tcpm_port_is_debug_sink(port)) #define tcpm_port_is_audio(port) \ (tcpm_cc_is_audio((port)->cc1) && tcpm_cc_is_audio((port)->cc2)) @@ -5176,7 +5181,7 @@ static void run_state_machine(struct tcpm_port *port) tcpm_set_state(port, SNK_UNATTACHED, PD_T_DRP_SNK); break; case SRC_ATTACH_WAIT: - if (tcpm_port_is_debug(port)) + if (tcpm_port_is_debug_source(port)) tcpm_set_state(port, DEBUG_ACC_ATTACHED, port->timings.cc_debounce_time); else if (tcpm_port_is_audio(port)) @@ -5434,7 +5439,7 @@ static void run_state_machine(struct tcpm_port *port) tcpm_set_state(port, SRC_UNATTACHED, PD_T_DRP_SRC); break; case SNK_ATTACH_WAIT: - if (tcpm_port_is_debug(port)) + if (tcpm_port_is_debug_sink(port)) tcpm_set_state(port, DEBUG_ACC_ATTACHED, PD_T_CC_DEBOUNCE); else if (tcpm_port_is_audio(port)) @@ -5454,7 +5459,7 @@ static void run_state_machine(struct tcpm_port *port) if (tcpm_port_is_disconnected(port)) tcpm_set_state(port, SNK_UNATTACHED, PD_T_PD_DEBOUNCE); - else if (tcpm_port_is_debug(port)) + else if (tcpm_port_is_debug_sink(port)) tcpm_set_state(port, DEBUG_ACC_ATTACHED, PD_T_CC_DEBOUNCE); else if (tcpm_port_is_audio(port)) @@ -5935,6 +5940,8 @@ static void run_state_machine(struct tcpm_port *port) /* remove existing capabilities */ tcpm_partner_source_caps_reset(port); tcpm_pd_send_control(port, PD_CTRL_ACCEPT, TCPC_TX_SOP); + port->vdm_sm_running = false; + port->explicit_contract = false; tcpm_ams_finish(port); if (port->pwr_role == TYPEC_SOURCE) { port->upcoming_state = SRC_SEND_CAPABILITIES; @@ -6360,10 +6367,10 @@ static void _tcpm_cc_change(struct tcpm_port *port, enum typec_cc_status cc1, switch (port->state) { case TOGGLING: - if (tcpm_port_is_debug(port) || tcpm_port_is_audio(port) || + if (tcpm_port_is_debug_source(port) || tcpm_port_is_audio(port) || tcpm_port_is_source(port)) tcpm_set_state(port, SRC_ATTACH_WAIT, 0); - else if (tcpm_port_is_sink(port)) + else if (tcpm_port_is_debug_sink(port) || tcpm_port_is_sink(port)) tcpm_set_state(port, SNK_ATTACH_WAIT, 0); break; case CHECK_CONTAMINANT: @@ -6371,9 +6378,11 @@ static void _tcpm_cc_change(struct tcpm_port *port, enum typec_cc_status cc1, break; case SRC_UNATTACHED: case ACC_UNATTACHED: - if (tcpm_port_is_debug(port) || tcpm_port_is_audio(port) || + if (tcpm_port_is_debug_source(port) || tcpm_port_is_audio(port) || tcpm_port_is_source(port)) tcpm_set_state(port, SRC_ATTACH_WAIT, 0); + else if (tcpm_port_is_debug_sink(port)) + tcpm_set_state(port, SNK_ATTACH_WAIT, 0); break; case SRC_ATTACH_WAIT: if (tcpm_port_is_disconnected(port) || @@ -6395,7 +6404,7 @@ static void _tcpm_cc_change(struct tcpm_port *port, enum typec_cc_status cc1, } break; case SNK_UNATTACHED: - if (tcpm_port_is_debug(port) || tcpm_port_is_audio(port) || + if (tcpm_port_is_debug_sink(port) || tcpm_port_is_audio(port) || tcpm_port_is_sink(port)) tcpm_set_state(port, SNK_ATTACH_WAIT, 0); break; diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 3f8d093aacf8a0..050e7542952ed1 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -482,6 +482,40 @@ static int vfio_pci_core_runtime_resume(struct device *dev) } #endif /* CONFIG_PM */ +/* + * Eager-request BAR resources, and iomap them. Soft failures are + * allowed, and consumers must check the barmap before use in order to + * give compatible user-visible behaviour with the previous on-demand + * allocation method. + */ +static void vfio_pci_core_map_bars(struct vfio_pci_core_device *vdev) +{ + struct pci_dev *pdev = vdev->pdev; + int i; + + for (i = 0; i < PCI_STD_NUM_BARS; i++) { + int bar = i + PCI_STD_RESOURCES; + + vdev->barmap[bar] = IOMEM_ERR_PTR(-ENODEV); + + if (!pci_resource_len(pdev, i)) + continue; + + if (pci_request_selected_regions(pdev, 1 << bar, "vfio")) { + pci_dbg(pdev, "Failed to reserve region %d\n", bar); + vdev->barmap[bar] = IOMEM_ERR_PTR(-EBUSY); + continue; + } + + vdev->barmap[bar] = pci_iomap(pdev, bar, 0); + if (!vdev->barmap[bar]) { + pci_dbg(pdev, "Failed to iomap region %d\n", bar); + pci_release_selected_regions(pdev, 1 << bar); + vdev->barmap[bar] = IOMEM_ERR_PTR(-ENOMEM); + } + } +} + /* * The pci-driver core runtime PM routines always save the device state * before going into suspended state. If the device is going into low power @@ -568,6 +602,7 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev) if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev)) vdev->has_vga = true; + vfio_pci_core_map_bars(vdev); return 0; @@ -648,7 +683,7 @@ void vfio_pci_core_disable(struct vfio_pci_core_device *vdev) for (i = 0; i < PCI_STD_NUM_BARS; i++) { bar = i + PCI_STD_RESOURCES; - if (!vdev->barmap[bar]) + if (IS_ERR_OR_NULL(vdev->barmap[bar])) continue; pci_iounmap(pdev, vdev->barmap[bar]); pci_release_selected_regions(pdev, 1 << bar); diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c index f87fd32e4a0176..1a177ce7de546a 100644 --- a/drivers/vfio/pci/vfio_pci_dmabuf.c +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c @@ -244,9 +244,11 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, return -EINVAL; /* - * For PCI the region_index is the BAR number like everything else. + * For PCI the region_index is the BAR number like everything + * else. Check that PCI resources have been claimed for it. */ - if (get_dma_buf.region_index >= VFIO_PCI_ROM_REGION_INDEX) + if (get_dma_buf.region_index >= VFIO_PCI_ROM_REGION_INDEX || + vfio_pci_core_setup_barmap(vdev, get_dma_buf.region_index)) return -ENODEV; dma_ranges = memdup_array_user(&arg->dma_ranges, get_dma_buf.nr_ranges, @@ -354,19 +356,18 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked) if (revoked) { kref_put(&priv->kref, vfio_pci_dma_buf_done); wait_for_completion(&priv->comp); - } else { /* - * Kref is initialize again, because when revoke - * was performed the reference counter was decreased - * to zero to trigger completion. + * Re-arm the registered kref reference and the + * completion so the post-revoke state matches the + * post-creation state. An un-revoke followed by a + * new mapping needs the kref to be non-zero before + * kref_get(), and vfio_pci_dma_buf_cleanup() + * delegates its drain back through this revoke + * path on a possibly-already-revoked dma-buf. */ kref_init(&priv->kref); - /* - * There is no need to wait as no mapping was - * performed when the previous status was - * priv->revoked == true. - */ reinit_completion(&priv->comp); + } else { dma_resv_lock(priv->dmabuf->resv, NULL); priv->revoked = false; dma_resv_unlock(priv->dmabuf->resv); @@ -382,21 +383,22 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev) struct vfio_pci_dma_buf *tmp; down_write(&vdev->memory_lock); + + /* + * Drain any active mappings via the revoke path. The move is + * idempotent for dma-bufs already in the revoked state and + * leaves every priv with the kref re-armed and the completion + * ready, so cleanup itself does not need to participate in kref + * bookkeeping. + */ + vfio_pci_dma_buf_move(vdev, true); + list_for_each_entry_safe(priv, tmp, &vdev->dmabufs, dmabufs_elm) { if (!get_file_active(&priv->dmabuf->file)) continue; - dma_resv_lock(priv->dmabuf->resv, NULL); list_del_init(&priv->dmabufs_elm); priv->vdev = NULL; - priv->revoked = true; - dma_buf_invalidate_mappings(priv->dmabuf); - dma_resv_wait_timeout(priv->dmabuf->resv, - DMA_RESV_USAGE_BOOKKEEP, false, - MAX_SCHEDULE_TIMEOUT); - dma_resv_unlock(priv->dmabuf->resv); - kref_put(&priv->kref, vfio_pci_dma_buf_done); - wait_for_completion(&priv->comp); vfio_device_put_registration(&vdev->vdev); fput(priv->dmabuf->file); } diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c index 4251ee03e1463b..3bfbb879a00513 100644 --- a/drivers/vfio/pci/vfio_pci_rdwr.c +++ b/drivers/vfio/pci/vfio_pci_rdwr.c @@ -198,27 +198,15 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem, } EXPORT_SYMBOL_GPL(vfio_pci_core_do_io_rw); +/* + * The barmap is set up in vfio_pci_core_enable(). Callers use this + * function to check that the BAR resources are requested or that the + * pci_iomap() was done. + */ int vfio_pci_core_setup_barmap(struct vfio_pci_core_device *vdev, int bar) { - struct pci_dev *pdev = vdev->pdev; - int ret; - void __iomem *io; - - if (vdev->barmap[bar]) - return 0; - - ret = pci_request_selected_regions(pdev, 1 << bar, "vfio"); - if (ret) - return ret; - - io = pci_iomap(pdev, bar, 0); - if (!io) { - pci_release_selected_regions(pdev, 1 << bar); - return -ENOMEM; - } - - vdev->barmap[bar] = io; - + if (IS_ERR(vdev->barmap[bar])) + return PTR_ERR(vdev->barmap[bar]); return 0; } EXPORT_SYMBOL_GPL(vfio_pci_core_setup_barmap); diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c index a12dd25ab6979f..fd00b86e1ae602 100644 --- a/drivers/video/fbdev/core/fb_defio.c +++ b/drivers/video/fbdev/core/fb_defio.c @@ -14,7 +14,6 @@ #include #include #include -#include #include #include #include diff --git a/drivers/video/fbdev/udlfb.c b/drivers/video/fbdev/udlfb.c index c341d76bc5646b..fdbb8671a810c7 100644 --- a/drivers/video/fbdev/udlfb.c +++ b/drivers/video/fbdev/udlfb.c @@ -321,12 +321,32 @@ static int dlfb_set_video_mode(struct dlfb_data *dlfb, return retval; } +static void dlfb_vm_open(struct vm_area_struct *vma) +{ + struct dlfb_data *dlfb = vma->vm_private_data; + + atomic_inc(&dlfb->mmap_count); +} + +static void dlfb_vm_close(struct vm_area_struct *vma) +{ + struct dlfb_data *dlfb = vma->vm_private_data; + + atomic_dec(&dlfb->mmap_count); +} + +static const struct vm_operations_struct dlfb_vm_ops = { + .open = dlfb_vm_open, + .close = dlfb_vm_close, +}; + static int dlfb_ops_mmap(struct fb_info *info, struct vm_area_struct *vma) { unsigned long start = vma->vm_start; unsigned long size = vma->vm_end - vma->vm_start; unsigned long offset = vma->vm_pgoff << PAGE_SHIFT; unsigned long page, pos; + struct dlfb_data *dlfb = info->par; if (info->fbdefio) return fb_deferred_io_mmap(info, vma); @@ -358,6 +378,9 @@ static int dlfb_ops_mmap(struct fb_info *info, struct vm_area_struct *vma) size = 0; } + vma->vm_ops = &dlfb_vm_ops; + vma->vm_private_data = dlfb; + atomic_inc(&dlfb->mmap_count); return 0; } @@ -1176,7 +1199,6 @@ static void dlfb_deferred_vfree(struct dlfb_data *dlfb, void *mem) /* * Assumes &info->lock held by caller - * Assumes no active clients have framebuffer open */ static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info, u32 new_len) { @@ -1188,6 +1210,13 @@ static int dlfb_realloc_framebuffer(struct dlfb_data *dlfb, struct fb_info *info new_len = PAGE_ALIGN(new_len); if (new_len > old_len) { + if (atomic_read(&dlfb->mmap_count) > 0) { + dev_warn(info->dev, + "refusing realloc: %d active mmaps\n", + atomic_read(&dlfb->mmap_count)); + return -EBUSY; + } + /* * Alloc system memory for virtual framebuffer */ diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c index e001e6769a43fc..910a1de0d5a72f 100644 --- a/drivers/virt/coco/sev-guest/sev-guest.c +++ b/drivers/virt/coco/sev-guest/sev-guest.c @@ -176,7 +176,6 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques struct snp_guest_req req = {}; int ret, npages = 0, resp_len; sockptr_t certs_address; - struct page *page; if (sockptr_is_null(io->req_data) || sockptr_is_null(io->resp_data)) return -EINVAL; @@ -211,16 +210,15 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques * zeros to indicate that certificate data was not provided. */ npages = report_req->certs_len >> PAGE_SHIFT; - page = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, - get_order(report_req->certs_len)); - if (!page) + req.certs_data = alloc_pages_exact(npages << PAGE_SHIFT, + GFP_KERNEL_ACCOUNT | __GFP_ZERO); + if (!req.certs_data) return -ENOMEM; - req.certs_data = page_address(page); ret = set_memory_decrypted((unsigned long)req.certs_data, npages); if (ret) { pr_err("failed to mark page shared, ret=%d\n", ret); - __free_pages(page, get_order(report_req->certs_len)); + free_pages_exact(req.certs_data, npages << PAGE_SHIFT); return -EFAULT; } @@ -277,7 +275,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques if (set_memory_encrypted((unsigned long)req.certs_data, npages)) WARN_ONCE(ret, "failed to restore encryption mask (leak it)\n"); else - __free_pages(page, get_order(report_req->certs_len)); + free_pages_exact(req.certs_data, npages << PAGE_SHIFT); } return ret; } diff --git a/drivers/xen/xen-acpi-pad.c b/drivers/xen/xen-acpi-pad.c index 75a39862c1df3e..5b98e0e9380768 100644 --- a/drivers/xen/xen-acpi-pad.c +++ b/drivers/xen/xen-acpi-pad.c @@ -110,9 +110,13 @@ static void acpi_pad_notify(acpi_handle handle, u32 event, static int acpi_pad_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); + struct acpi_device *device; acpi_status status; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + strcpy(acpi_device_name(device), ACPI_PROCESSOR_AGGREGATOR_DEVICE_NAME); strcpy(acpi_device_class(device), ACPI_PROCESSOR_AGGREGATOR_CLASS); diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index e6f5a17a13e369..b611c64119dbca 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -2412,29 +2412,25 @@ static struct btrfs_block_group *btrfs_create_block_group( */ static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info) { - u64 start = 0; + struct rb_node *node; int ret = 0; - while (1) { + /* + * This is called during mount from btrfs_read_block_groups(), before + * any background threads are started, so no concurrent writers can + * modify the mapping_tree. No lock is needed here. + */ + for (node = rb_first_cached(&fs_info->mapping_tree); node; node = rb_next(node)) { struct btrfs_chunk_map *map; struct btrfs_block_group *bg; - /* - * btrfs_find_chunk_map() will return the first chunk map - * intersecting the range, so setting @length to 1 is enough to - * get the first chunk. - */ - map = btrfs_find_chunk_map(fs_info, start, 1); - if (!map) - break; - + map = rb_entry(node, struct btrfs_chunk_map, rb_node); bg = btrfs_lookup_block_group(fs_info, map->start); if (unlikely(!bg)) { btrfs_err(fs_info, "chunk start=%llu len=%llu doesn't have corresponding block group", map->start, map->chunk_len); ret = -EUCLEAN; - btrfs_free_chunk_map(map); break; } if (unlikely(bg->start != map->start || bg->length != map->chunk_len || @@ -2447,12 +2443,9 @@ static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info) bg->start, bg->length, bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK); ret = -EUCLEAN; - btrfs_free_chunk_map(map); btrfs_put_block_group(bg); break; } - start = map->start + map->chunk_len; - btrfs_free_chunk_map(map); btrfs_put_block_group(bg); } return ret; diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index b2393a48a8fe9c..a02b62e0a8f33e 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -407,22 +407,18 @@ static noinline int add_ra_bio_pages(struct inode *inode, end_index = (i_size_read(inode) - 1) >> PAGE_SHIFT; - /* - * Avoid direct reclaim when the caller does not allow it. Since - * add_ra_bio_pages() is always speculative, suppress allocation warnings - * in either case. - */ + /* Avoid direct reclaim when the caller does not allow it. */ + constraint_gfp = ~__GFP_FS; + cache_gfp = GFP_NOFS | __GFP_NOWARN; if (!direct_reclaim) { - constraint_gfp = ~(__GFP_FS | __GFP_DIRECT_RECLAIM) | __GFP_NOWARN; - cache_gfp = (GFP_NOFS & ~__GFP_DIRECT_RECLAIM) | __GFP_NOWARN; - } else { - constraint_gfp = (~__GFP_FS) | __GFP_NOWARN; - cache_gfp = GFP_NOFS | __GFP_NOWARN; + constraint_gfp &= ~__GFP_DIRECT_RECLAIM; + cache_gfp &= ~__GFP_DIRECT_RECLAIM; } while (cur < compressed_end) { pgoff_t page_end; pgoff_t pg_index = cur >> PAGE_SHIFT; + gfp_t masked_constraint_gfp; u32 add_size; if (pg_index > end_index) @@ -449,8 +445,14 @@ static noinline int add_ra_bio_pages(struct inode *inode, continue; } - folio = filemap_alloc_folio(mapping_gfp_constraint(mapping, constraint_gfp), - 0, NULL); + /* + * Since add_ra_bio_pages() is always speculative, suppress + * allocation warnings. + */ + masked_constraint_gfp = mapping_gfp_constraint(mapping, constraint_gfp); + masked_constraint_gfp |= __GFP_NOWARN; + + folio = filemap_alloc_folio(masked_constraint_gfp, 0, NULL); if (!folio) break; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 8a11be02eeb9be..c0a30bb213d7a0 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4686,6 +4686,7 @@ static void btrfs_destroy_marked_extents(struct btrfs_fs_info *fs_info, free_extent_buffer_stale(eb); } } + btrfs_extent_io_tree_release(dirty_pages); } static void btrfs_destroy_pinned_extent(struct btrfs_fs_info *fs_info, diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 906d5c21ebc477..75136a17271014 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -9299,10 +9299,38 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode, if (!(mode & FALLOC_FL_KEEP_SIZE) && (actual_len > inode->i_size) && (cur_offset > inode->i_size)) { + u64 range_start; + u64 range_end; + if (cur_offset > actual_len) i_size = actual_len; else i_size = cur_offset; + + /* + * Make sure the file_extent_tree covers the entire + * range [old_i_size, new_i_size) before we update + * disk_i_size. Without this, a previous KEEP_SIZE + * prealloc that extended past i_size (and was lost + * across umount/mount because file_extent_tree is + * only populated up to round_up(i_size) on inode + * load) can leave a gap inside this range. That gap + * would cause btrfs_inode_safe_disk_i_size_write() + * (via find_contiguous_extent_bit() starting at 0) + * to truncate disk_i_size to the start of the gap, + * making the persisted size smaller than i_size. + */ + range_start = round_down(inode->i_size, fs_info->sectorsize); + range_end = round_up(i_size, fs_info->sectorsize); + ret = btrfs_inode_set_file_extent_range(BTRFS_I(inode), + range_start, range_end - range_start); + if (ret) { + btrfs_abort_transaction(trans, ret); + if (own_trans) + btrfs_end_transaction(trans); + break; + } + i_size_write(inode, i_size); btrfs_inode_safe_disk_i_size_write(BTRFS_I(inode), 0); } diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 248adb785051b7..194f581b36f36e 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -1293,14 +1293,13 @@ static int btrfs_write_and_wait_transaction(struct btrfs_trans_handle *trans) blk_finish_plug(&plug); ret2 = btrfs_wait_extents(fs_info, dirty_pages); - btrfs_extent_io_tree_release(&trans->transaction->dirty_pages); - if (ret) return ret; - else if (ret2) + if (ret2) return ret2; - else - return 0; + + btrfs_extent_io_tree_release(&trans->transaction->dirty_pages); + return 0; } /* diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 1454760332ffce..0a86f672cc09ce 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -1336,6 +1336,7 @@ void ceph_process_folio_batch(struct address_space *mapping, ceph_wbc, folio); if (rc == -ENODATA) { folio_unlock(folio); + folio_put(folio); ceph_wbc->fbatch.folios[i] = NULL; continue; } else if (rc == -E2BIG) { @@ -1346,6 +1347,7 @@ void ceph_process_folio_batch(struct address_space *mapping, if (!folio_clear_dirty_for_io(folio)) { doutc(cl, "%p !folio_clear_dirty_for_io\n", folio); folio_unlock(folio); + folio_put(folio); ceph_wbc->fbatch.folios[i] = NULL; continue; } diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c index 4dc9426643e834..053d5bf0c9f07c 100644 --- a/fs/ceph/quota.c +++ b/fs/ceph/quota.c @@ -228,12 +228,19 @@ static int get_quota_realm(struct ceph_mds_client *mdsc, struct inode *inode, restart: realm = ceph_inode(inode)->i_snap_realm; - if (realm) + if (realm) { ceph_get_snap_realm(mdsc, realm); - else - pr_err_ratelimited_client(cl, - "%p %llx.%llx null i_snap_realm\n", - inode, ceph_vinop(inode)); + } else { + /* + * i_snap_realm is NULL when all caps have been released, e.g. + * after an MDS session rejection. This is a transient state; + * the realm will be restored once caps are re-granted. + * Treat it as "no quota realm found". + */ + doutc(cl, "%p %llx.%llx null i_snap_realm\n", + inode, ceph_vinop(inode)); + } + while (realm) { bool has_inode; @@ -340,12 +347,19 @@ static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op, down_read(&mdsc->snap_rwsem); restart: realm = ceph_inode(inode)->i_snap_realm; - if (realm) + if (realm) { ceph_get_snap_realm(mdsc, realm); - else - pr_err_ratelimited_client(cl, - "%p %llx.%llx null i_snap_realm\n", - inode, ceph_vinop(inode)); + } else { + /* + * i_snap_realm is NULL when all caps have been released, e.g. + * after an MDS session rejection. This is a transient state; + * the realm will be restored once caps are re-granted. + * Treat it as "quota not exceeded". + */ + doutc(cl, "%p %llx.%llx null i_snap_realm\n", + inode, ceph_vinop(inode)); + } + while (realm) { bool has_inode; @@ -496,6 +510,9 @@ bool ceph_quota_update_statfs(struct ceph_fs_client *fsc, struct kstatfs *buf) u64 total = 0, used, free; bool is_updated = false; + if (!ceph_has_realms_with_quotas(d_inode(fsc->sb->s_root))) + return false; + down_read(&mdsc->snap_rwsem); get_quota_realm(mdsc, d_inode(fsc->sb->s_root), QUOTA_GET_MAX_BYTES, &realm, true); diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c index 5f87f62091a144..e773be07f7674a 100644 --- a/fs/ceph/xattr.c +++ b/fs/ceph/xattr.c @@ -1254,6 +1254,22 @@ int __ceph_setxattr(struct inode *inode, const char *name, ceph_vinop(inode), name, ceph_cap_string(issued)); __build_xattrs(inode); + /* + * __build_xattrs() may have released and reacquired i_ceph_lock, + * during which handle_cap_grant() could have replaced i_xattrs.blob + * with a newer MDS-provided blob and bumped i_xattrs.version. If that + * caused __build_xattrs() to rebuild the rb-tree from the new blob, + * count/names_size/vals_size may now be larger than when + * required_blob_size was computed above. Recompute it here so the + * prealloc_blob size check below reflects the current tree state. + */ + required_blob_size = __get_required_blob_size(ci, name_len, val_len); + if (required_blob_size > mdsc->mdsmap->m_max_xattr_size) { + doutc(cl, "sync (size too large): %d > %llu\n", + required_blob_size, mdsc->mdsmap->m_max_xattr_size); + goto do_sync; + } + if (!ci->i_xattrs.prealloc_blob || required_blob_size > ci->i_xattrs.prealloc_blob->alloc_len) { struct ceph_buffer *blob; @@ -1294,6 +1310,7 @@ int __ceph_setxattr(struct inode *inode, const char *name, do_sync: spin_unlock(&ci->i_ceph_lock); + ceph_buffer_put(old_blob); do_sync_unlocked: if (lock_snap_rwsem) up_read(&mdsc->snap_rwsem); diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c index 1c5224cf183e6d..733c19571f1cfe 100644 --- a/fs/efivarfs/super.c +++ b/fs/efivarfs/super.c @@ -191,13 +191,10 @@ static const struct dentry_operations efivarfs_d_ops = { static struct dentry *efivarfs_alloc_dentry(struct dentry *parent, char *name) { + struct qstr q = QSTR(name); struct dentry *d; - struct qstr q; int err; - q.name = name; - q.len = strlen(name); - err = efivarfs_d_hash(parent, &q); if (err) return ERR_PTR(err); diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index b0a6549b38487b..b36ee619cdcdda 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -355,7 +355,7 @@ static ssize_t iomap_dio_bio_iter_one(struct iomap_iter *iter, if (dio->flags & IOMAP_DIO_BOUNCE) ret = bio_iov_iter_bounce(bio, dio->submit.iter, - iomap_max_bio_size(&iter->iomap)); + iomap_max_bio_size(&iter->iomap), alignment); else ret = bio_iov_iter_get_pages(bio, dio->submit.iter, alignment - 1); diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 85e94c30285a26..ab39ec88544050 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -1413,6 +1413,9 @@ nfsd4_clone(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, dst, clone->cl_dst_pos, clone->cl_count, EX_ISSYNC(cstate->current_fh.fh_export)); + if (!status && (READ_ONCE(dst->nf_file->f_mode) & FMODE_NOCMTIME) != 0) + nfsd_update_cmtime_attr(dst->nf_file, 0); + nfsd_file_put(dst); nfsd_file_put(src); out: @@ -2118,8 +2121,10 @@ static int nfsd4_do_async_copy(void *data) set_bit(NFSD4_COPY_F_COMPLETED, ©->cp_flags); trace_nfsd_copy_async_done(copy); - nfsd4_send_cb_offload(copy); atomic_dec(©->cp_nn->pending_async_copies); + if (copy->cp_res.wr_bytes_written > 0 && copy->attr_update) + nfsd_update_cmtime_attr(copy->nf_dst->nf_file, 0); + nfsd4_send_cb_offload(copy); return 0; } @@ -2179,6 +2184,9 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, memcpy(&result->cb_stateid, ©->cp_stateid.cs_stid, sizeof(result->cb_stateid)); dup_copy_fields(copy, async_copy); + if ((READ_ONCE(copy->nf_dst->nf_file->f_mode) & + FMODE_NOCMTIME) != 0) + async_copy->attr_update = true; memcpy(async_copy->cp_cb_offload.co_referring_sessionid.data, cstate->session->se_sessionid.data, NFS4_MAX_SESSIONID_LEN); @@ -2197,6 +2205,10 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, } else { status = nfsd4_do_copy(copy, copy->nf_src->nf_file, copy->nf_dst->nf_file, true); + if ((READ_ONCE(copy->nf_dst->nf_file->f_mode) & + FMODE_NOCMTIME) != 0 && + copy->cp_res.wr_bytes_written > 0) + nfsd_update_cmtime_attr(copy->nf_dst->nf_file, 0); } out: trace_nfsd_copy_done(copy, status); @@ -2535,10 +2547,6 @@ nfsd4_get_dir_delegation(struct svc_rqst *rqstp, dd = nfsd_get_dir_deleg(cstate, gdd, nf); nfsd_file_put(nf); if (IS_ERR(dd)) { - int err = PTR_ERR(dd); - - if (err != -EAGAIN) - return nfserrno(err); gdd->gddrnf_status = GDD4_UNAVAIL; return nfs_ok; } diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index c2d13b26a68763..6837b63d986451 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -1221,10 +1221,6 @@ static void put_deleg_file(struct nfs4_file *fp) static void nfsd4_finalize_deleg_timestamps(struct nfs4_delegation *dp, struct file *f) { - struct iattr ia = { .ia_valid = ATTR_ATIME | ATTR_CTIME | ATTR_MTIME | ATTR_DELEG }; - struct inode *inode = file_inode(f); - int ret; - /* don't do anything if FMODE_NOCMTIME isn't set */ if ((READ_ONCE(f->f_mode) & FMODE_NOCMTIME) == 0) return; @@ -1242,17 +1238,7 @@ static void nfsd4_finalize_deleg_timestamps(struct nfs4_delegation *dp, struct f return; /* Stamp everything to "now" */ - inode_lock(inode); - ret = notify_change(&nop_mnt_idmap, f->f_path.dentry, &ia, NULL); - inode_unlock(inode); - if (ret) { - struct inode *inode = file_inode(f); - - pr_notice_ratelimited("nfsd: Unable to update timestamps on inode %02x:%02x:%llu: %d\n", - MAJOR(inode->i_sb->s_dev), - MINOR(inode->i_sb->s_dev), - inode->i_ino, ret); - } + nfsd_update_cmtime_attr(f, ATTR_ATIME); } static void nfs4_unlock_deleg_lease(struct nfs4_delegation *dp) @@ -1865,6 +1851,13 @@ void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb) break; case SC_TYPE_LAYOUT: ls = layoutstateid(stid); + spin_lock(&clp->cl_lock); + if (stid->sc_status == 0) { + stid->sc_status |= + SC_STATUS_ADMIN_REVOKED; + atomic_inc(&clp->cl_admin_revoked); + } + spin_unlock(&clp->cl_lock); nfsd4_close_layout(ls); break; } @@ -6378,7 +6371,6 @@ nfs4_open_delegation(struct svc_rqst *rqstp, struct nfsd4_open *open, } open->op_delegate_type = deleg_ts ? OPEN_DELEGATE_WRITE_ATTRS_DELEG : OPEN_DELEGATE_WRITE; - dp->dl_cb_fattr.ncf_cur_fsize = stat.size; dp->dl_cb_fattr.ncf_initial_cinfo = nfsd4_change_attribute(&stat); dp->dl_atime = stat.atime; dp->dl_ctime = stat.ctime; @@ -9429,11 +9421,15 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct dentry *dentry, if (status != nfserr_jukebox || !nfsd_wait_for_delegreturn(rqstp, inode)) goto out_status; + status = nfs_ok; + goto out_status; + } + if (!ncf->ncf_file_modified) { + if (ncf->ncf_initial_cinfo != ncf->ncf_cb_change) + ncf->ncf_file_modified = true; + else if (i_size_read(inode) != ncf->ncf_cb_fsize) + ncf->ncf_file_modified = true; } - if (!ncf->ncf_file_modified && - (ncf->ncf_initial_cinfo != ncf->ncf_cb_change || - ncf->ncf_cur_fsize != ncf->ncf_cb_fsize)) - ncf->ncf_file_modified = true; if (ncf->ncf_file_modified) { int err; @@ -9560,3 +9556,31 @@ nfsd_get_dir_deleg(struct nfsd4_compound_state *cstate, put_nfs4_file(fp); return ERR_PTR(status); } + +/** + * nfsd_update_cmtime_attr - update file's delegated ctime/mtime, + * and optionally other attributes (ie ATTR_ATIME). + * @f: pointer to an opened file + * @flags: any additional flags that should be updated + * + * Given upon opening a file delegated attributes were issues, update + * @f attributes to current times. + */ +void nfsd_update_cmtime_attr(struct file *f, unsigned int flags) +{ + int ret; + struct inode *inode = file_inode(f); + struct iattr attr = { + .ia_valid = ATTR_CTIME | ATTR_MTIME | ATTR_DELEG | flags, + }; + + inode_lock(inode); + ret = notify_change(&nop_mnt_idmap, f->f_path.dentry, &attr, NULL); + inode_unlock(inode); + if (ret) + pr_notice_ratelimited("nfsd: Unable to update timestamps on " + "inode %02x:%02x:%llu: %d\n", + MAJOR(inode->i_sb->s_dev), + MINOR(inode->i_sb->s_dev), + inode->i_ino, ret); +} diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h index 953675eba5c366..c5ccea64c2817f 100644 --- a/fs/nfsd/state.h +++ b/fs/nfsd/state.h @@ -843,6 +843,7 @@ extern void nfsd4_shutdown_copy(struct nfs4_client *clp); void nfsd4_put_client(struct nfs4_client *clp); void nfsd4_async_copy_reaper(struct nfsd_net *nn); bool nfsd4_has_active_async_copies(struct nfs4_client *clp); +void nfsd_update_cmtime_attr(struct file *f, unsigned int flags); extern struct nfs4_client_reclaim *nfs4_client_to_reclaim(struct xdr_netobj name, struct xdr_netobj princhash, struct nfsd_net *nn); extern bool nfs4_has_reclaimed_state(struct xdr_netobj name, struct nfsd_net *nn); diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h index 417e9ad9fbb397..9a4124c77e049d 100644 --- a/fs/nfsd/xdr4.h +++ b/fs/nfsd/xdr4.h @@ -752,6 +752,7 @@ struct nfsd4_copy { struct nfsd_file *nf_src; struct nfsd_file *nf_dst; + bool attr_update; copy_stateid_t cp_stateid; diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c index 7b86a6bac6449a..b41f4788e4f06b 100644 --- a/fs/overlayfs/util.c +++ b/fs/overlayfs/util.c @@ -1354,7 +1354,7 @@ int ovl_ensure_verity_loaded(const struct path *datapath) struct inode *inode = d_inode(datapath->dentry); struct file *filp; - if (!fsverity_active(inode) && IS_VERITY(inode)) { + if (IS_VERITY(inode) && fsverity_get_info(inode) == NULL) { /* * If this inode was not yet opened, the verity info hasn't been * loaded yet, so we need to do that here to force it into memory. diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c index 02791ec3c5a169..88d5e9a32f28b3 100644 --- a/fs/smb/client/cached_dir.c +++ b/fs/smb/client/cached_dir.c @@ -286,6 +286,14 @@ int open_cached_dir(unsigned int xid, struct cifs_tcon *tcon, &rqst[0], &oplock, &oparms, utf16_path); if (rc) goto oshr_free; + + if (oplock != SMB2_OPLOCK_LEVEL_II) { + rc = -EINVAL; + cifs_dbg(FYI, "%s: Oplock level %d not suitable for cached directory\n", + __func__, oplock); + goto oshr_free; + } + smb2_set_next_command(tcon, &rqst[0]); memset(&qi_iov, 0, sizeof(qi_iov)); diff --git a/fs/smb/client/cifsacl.c b/fs/smb/client/cifsacl.c index ec5d477793040c..786dbbc43c5b99 100644 --- a/fs/smb/client/cifsacl.c +++ b/fs/smb/client/cifsacl.c @@ -1264,6 +1264,17 @@ static int parse_sid(struct smb_sid *psid, char *end_of_acl) return 0; } +static bool dacl_offset_valid(unsigned int acl_len, __u32 dacloffset) +{ + if (acl_len < sizeof(struct smb_acl)) + return false; + + if (dacloffset < sizeof(struct smb_ntsd)) + return false; + + return dacloffset <= acl_len - sizeof(struct smb_acl); +} + /* Convert CIFS ACL to POSIX form */ static int parse_sec_desc(struct cifs_sb_info *cifs_sb, @@ -1284,7 +1295,6 @@ static int parse_sec_desc(struct cifs_sb_info *cifs_sb, group_sid_ptr = (struct smb_sid *)((char *)pntsd + le32_to_cpu(pntsd->gsidoffset)); dacloffset = le32_to_cpu(pntsd->dacloffset); - dacl_ptr = (struct smb_acl *)((char *)pntsd + dacloffset); cifs_dbg(NOISY, "revision %d type 0x%x ooffset 0x%x goffset 0x%x sacloffset 0x%x dacloffset 0x%x\n", pntsd->revision, pntsd->type, le32_to_cpu(pntsd->osidoffset), le32_to_cpu(pntsd->gsidoffset), @@ -1315,11 +1325,18 @@ static int parse_sec_desc(struct cifs_sb_info *cifs_sb, return rc; } - if (dacloffset) + if (dacloffset) { + if (!dacl_offset_valid(acl_len, dacloffset)) { + cifs_dbg(VFS, "Server returned illegal DACL offset\n"); + return -EINVAL; + } + + dacl_ptr = (struct smb_acl *)((char *)pntsd + dacloffset); parse_dacl(dacl_ptr, end_of_acl, owner_sid_ptr, group_sid_ptr, fattr, get_mode_from_special_sid); - else + } else { cifs_dbg(FYI, "no ACL\n"); /* BB grant all or default perms? */ + } return rc; } @@ -1342,6 +1359,11 @@ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd, dacloffset = le32_to_cpu(pntsd->dacloffset); if (dacloffset) { + if (!dacl_offset_valid(secdesclen, dacloffset)) { + cifs_dbg(VFS, "Server returned illegal DACL offset\n"); + return -EINVAL; + } + dacl_ptr = (struct smb_acl *)((char *)pntsd + dacloffset); rc = validate_dacl(dacl_ptr, end_of_acl); if (rc) @@ -1710,6 +1732,12 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, nsecdesclen = sizeof(struct smb_ntsd) + (sizeof(struct smb_sid) * 2); dacloffset = le32_to_cpu(pntsd->dacloffset); if (dacloffset) { + if (!dacl_offset_valid(secdesclen, dacloffset)) { + cifs_dbg(VFS, "Server returned illegal DACL offset\n"); + rc = -EINVAL; + goto id_mode_to_cifs_acl_exit; + } + dacl_ptr = (struct smb_acl *)((char *)pntsd + dacloffset); rc = validate_dacl(dacl_ptr, (char *)pntsd + secdesclen); if (rc) { @@ -1732,7 +1760,7 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, * descriptor parameters, and security descriptor itself */ nsecdesclen = max_t(u32, nsecdesclen, DEFAULT_SEC_DESC_LEN); - pnntsd = kmalloc(nsecdesclen, GFP_KERNEL); + pnntsd = kzalloc(nsecdesclen, GFP_KERNEL); if (!pnntsd) { kfree(pntsd); cifs_put_tlink(tlink); @@ -1752,6 +1780,7 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode, rc = ops->set_acl(pnntsd, nsecdesclen, inode, path, aclflag); cifs_dbg(NOISY, "set_cifs_acl rc: %d\n", rc); } +id_mode_to_cifs_acl_exit: cifs_put_tlink(tlink); kfree(pnntsd); diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c index b63ec7ab6e5131..4c8b1b9ade8b83 100644 --- a/fs/smb/client/fs_context.c +++ b/fs/smb/client/fs_context.c @@ -736,7 +736,7 @@ static int smb3_fs_context_parse_param(struct fs_context *fc, static int smb3_fs_context_parse_monolithic(struct fs_context *fc, void *data); static int smb3_get_tree(struct fs_context *fc); -static void smb3_sync_ses_chan_max(struct cifs_ses *ses, unsigned int max_channels); +static void smb3_sync_ses_chan_max(struct cifs_ses *ses, size_t max_channels); static int smb3_reconfigure(struct fs_context *fc); static const struct fs_context_operations smb3_fs_context_ops = { @@ -1010,25 +1010,34 @@ do { \ int smb3_sync_session_ctx_passwords(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses) { + char *password = NULL, *password2 = NULL; + if (ses->password && cifs_sb->ctx->password && strcmp(ses->password, cifs_sb->ctx->password)) { - kfree_sensitive(cifs_sb->ctx->password); - cifs_sb->ctx->password = kstrdup(ses->password, GFP_KERNEL); - if (!cifs_sb->ctx->password) + password = kstrdup(ses->password, GFP_KERNEL); + if (!password) return -ENOMEM; } if (ses->password2 && cifs_sb->ctx->password2 && strcmp(ses->password2, cifs_sb->ctx->password2)) { - kfree_sensitive(cifs_sb->ctx->password2); - cifs_sb->ctx->password2 = kstrdup(ses->password2, GFP_KERNEL); - if (!cifs_sb->ctx->password2) { - kfree_sensitive(cifs_sb->ctx->password); - cifs_sb->ctx->password = NULL; + password2 = kstrdup(ses->password2, GFP_KERNEL); + if (!password2) { + kfree_sensitive(password); return -ENOMEM; } } + + if (password) { + kfree_sensitive(cifs_sb->ctx->password); + cifs_sb->ctx->password = password; + } + if (password2) { + kfree_sensitive(cifs_sb->ctx->password2); + cifs_sb->ctx->password2 = password2; + } + return 0; } @@ -1041,7 +1050,7 @@ int smb3_sync_session_ctx_passwords(struct cifs_sb_info *cifs_sb, struct cifs_se * with the session's channel lock. This should be called whenever the maximum * allowed channels for a session changes (e.g., after a remount or reconfigure). */ -static void smb3_sync_ses_chan_max(struct cifs_ses *ses, unsigned int max_channels) +static void smb3_sync_ses_chan_max(struct cifs_ses *ses, size_t max_channels) { spin_lock(&ses->chan_lock); ses->chan_max = max_channels; @@ -1051,12 +1060,15 @@ static void smb3_sync_ses_chan_max(struct cifs_ses *ses, unsigned int max_channe static int smb3_reconfigure(struct fs_context *fc) { struct smb3_fs_context *ctx = smb3_fc2context(fc); + struct smb3_fs_context *new_ctx = NULL; + struct smb3_fs_context *old_ctx = NULL; struct dentry *root = fc->root; struct cifs_sb_info *cifs_sb = CIFS_SB(root->d_sb); struct cifs_ses *ses = cifs_sb_master_tcon(cifs_sb)->ses; unsigned int rsize = ctx->rsize, wsize = ctx->wsize; char *new_password = NULL, *new_password2 = NULL; bool need_recon = false; + bool need_mchan_update; int rc; if (ses->expired_pwd) @@ -1066,6 +1078,16 @@ static int smb3_reconfigure(struct fs_context *fc) if (rc) return rc; + old_ctx = kzalloc_obj(*old_ctx); + if (!old_ctx) + return -ENOMEM; + + rc = smb3_fs_context_dup(old_ctx, cifs_sb->ctx); + if (rc) { + kfree(old_ctx); + return rc; + } + /* * We can not change UNC/username/password/domainname/ * workstation_name/nodename/iocharset @@ -1075,16 +1097,22 @@ static int smb3_reconfigure(struct fs_context *fc) STEAL_STRING(cifs_sb, ctx, UNC); STEAL_STRING(cifs_sb, ctx, source); STEAL_STRING(cifs_sb, ctx, username); + STEAL_STRING(cifs_sb, ctx, domainname); + STEAL_STRING(cifs_sb, ctx, nodename); + STEAL_STRING(cifs_sb, ctx, iocharset); - if (need_recon == false) + if (!need_recon) { STEAL_STRING_SENSITIVE(cifs_sb, ctx, password); - else { + } else { if (ctx->password) { new_password = kstrdup(ctx->password, GFP_KERNEL); - if (!new_password) - return -ENOMEM; - } else + if (!new_password) { + rc = -ENOMEM; + goto restore_ctx; + } + } else { STEAL_STRING_SENSITIVE(cifs_sb, ctx, password); + } } /* @@ -1094,11 +1122,29 @@ static int smb3_reconfigure(struct fs_context *fc) if (ctx->password2) { new_password2 = kstrdup(ctx->password2, GFP_KERNEL); if (!new_password2) { - kfree_sensitive(new_password); - return -ENOMEM; + rc = -ENOMEM; + goto restore_ctx; } - } else + } else { STEAL_STRING_SENSITIVE(cifs_sb, ctx, password2); + } + + /* if rsize or wsize not passed in on remount, use previous values */ + ctx->rsize = rsize ? CIFS_ALIGN_RSIZE(fc, rsize) : cifs_sb->ctx->rsize; + ctx->wsize = wsize ? CIFS_ALIGN_WSIZE(fc, wsize) : cifs_sb->ctx->wsize; + + new_ctx = kzalloc_obj(*new_ctx); + if (!new_ctx) { + rc = -ENOMEM; + goto restore_ctx; + } + + rc = smb3_fs_context_dup(new_ctx, ctx); + if (rc) + goto restore_ctx; + + need_mchan_update = ctx->multichannel != cifs_sb->ctx->multichannel || + ctx->max_channels != cifs_sb->ctx->max_channels; /* * we may update the passwords in the ses struct below. Make sure we do @@ -1109,54 +1155,55 @@ static int smb3_reconfigure(struct fs_context *fc) /* * smb2_reconnect may swap password and password2 in case session setup * failed. First get ctx passwords in sync with ses passwords. It should - * be okay to do this even if this function were to return an error at a - * later stage + * be done before committing new passwords. */ rc = smb3_sync_session_ctx_passwords(cifs_sb, ses); if (rc) { mutex_unlock(&ses->session_mutex); - kfree_sensitive(new_password); - kfree_sensitive(new_password2); - return rc; + goto cleanup_new_ctx; + } + + /* + * If multichannel or max_channels has changed, update the session's channels accordingly. + * This may add or remove channels to match the new configuration. + */ + if (need_mchan_update) { + /* Prevent concurrent scaling operations */ + spin_lock(&ses->ses_lock); + if (ses->flags & CIFS_SES_FLAG_SCALE_CHANNELS) { + spin_unlock(&ses->ses_lock); + mutex_unlock(&ses->session_mutex); + rc = -EINVAL; + goto cleanup_new_ctx; + } + ses->flags |= CIFS_SES_FLAG_SCALE_CHANNELS; + spin_unlock(&ses->ses_lock); } /* - * now that allocations for passwords are done, commit them + * Commit session passwords before any channel work so newly added + * channels authenticate with the new credentials. */ if (new_password) { kfree_sensitive(ses->password); ses->password = new_password; + new_password = NULL; } if (new_password2) { kfree_sensitive(ses->password2); ses->password2 = new_password2; + new_password2 = NULL; } - /* - * If multichannel or max_channels has changed, update the session's channels accordingly. - * This may add or remove channels to match the new configuration. - */ - if ((ctx->multichannel != cifs_sb->ctx->multichannel) || - (ctx->max_channels != cifs_sb->ctx->max_channels)) { - + if (need_mchan_update) { /* Synchronize ses->chan_max with the new mount context */ smb3_sync_ses_chan_max(ses, ctx->max_channels); - /* Now update the session's channels to match the new configuration */ - /* Prevent concurrent scaling operations */ - spin_lock(&ses->ses_lock); - if (ses->flags & CIFS_SES_FLAG_SCALE_CHANNELS) { - spin_unlock(&ses->ses_lock); - mutex_unlock(&ses->session_mutex); - return -EINVAL; - } - ses->flags |= CIFS_SES_FLAG_SCALE_CHANNELS; - spin_unlock(&ses->ses_lock); mutex_unlock(&ses->session_mutex); - rc = smb3_update_ses_channels(ses, ses->server, - false /* from_reconnect */, - false /* disable_mchan */); + smb3_update_ses_channels(ses, ses->server, + false /* from_reconnect */, + false /* disable_mchan */); /* Clear scaling flag after operation */ spin_lock(&ses->ses_lock); @@ -1166,22 +1213,30 @@ static int smb3_reconfigure(struct fs_context *fc) mutex_unlock(&ses->session_mutex); } - STEAL_STRING(cifs_sb, ctx, domainname); - STEAL_STRING(cifs_sb, ctx, nodename); - STEAL_STRING(cifs_sb, ctx, iocharset); - - /* if rsize or wsize not passed in on remount, use previous values */ - ctx->rsize = rsize ? CIFS_ALIGN_RSIZE(fc, rsize) : cifs_sb->ctx->rsize; - ctx->wsize = wsize ? CIFS_ALIGN_WSIZE(fc, wsize) : cifs_sb->ctx->wsize; - smb3_cleanup_fs_context_contents(cifs_sb->ctx); - rc = smb3_fs_context_dup(cifs_sb->ctx, ctx); + memcpy(cifs_sb->ctx, new_ctx, sizeof(*new_ctx)); + kfree(new_ctx); + new_ctx = NULL; + smb3_cleanup_fs_context(old_ctx); + old_ctx = NULL; smb3_update_mnt_flags(cifs_sb); #ifdef CONFIG_CIFS_DFS_UPCALL if (!rc) rc = dfs_cache_remount_fs(cifs_sb); #endif + return rc; + +cleanup_new_ctx: + smb3_cleanup_fs_context_contents(new_ctx); +restore_ctx: + kfree(new_ctx); + kfree_sensitive(new_password); + kfree_sensitive(new_password2); + smb3_cleanup_fs_context_contents(cifs_sb->ctx); + memcpy(cifs_sb->ctx, old_ctx, sizeof(*old_ctx)); + kfree(old_ctx); + return rc; } diff --git a/fs/smb/client/ioctl.c b/fs/smb/client/ioctl.c index 9afab3237e54c3..17408bb8ab65bf 100644 --- a/fs/smb/client/ioctl.c +++ b/fs/smb/client/ioctl.c @@ -296,7 +296,7 @@ static int cifs_dump_full_key(struct cifs_tcon *tcon, struct smb3_full_key_debug break; case SMB2_ENCRYPTION_AES256_CCM: case SMB2_ENCRYPTION_AES256_GCM: - out.session_key_length = CIFS_SESS_KEY_SIZE; + out.session_key_length = ses->auth_key.len; out.server_in_key_length = out.server_out_key_length = SMB3_GCM256_CRYPTKEY_SIZE; break; default: diff --git a/fs/smb/client/smb2file.c b/fs/smb/client/smb2file.c index b292aa94a5932a..6860eff3169329 100644 --- a/fs/smb/client/smb2file.c +++ b/fs/smb/client/smb2file.c @@ -49,6 +49,9 @@ static struct smb2_symlink_err_rsp *symlink_data(const struct kvec *iov) __func__, le32_to_cpu(p->ErrorId)); len = ALIGN(le32_to_cpu(p->ErrorDataLength), 8); + if (len > end - ((u8 *)p + sizeof(*p))) + return ERR_PTR(-EINVAL); + p = (struct smb2_error_context_rsp *)(p->ErrorContextData + len); } } else if (le32_to_cpu(err->ByteCount) >= sizeof(*sym) && diff --git a/fs/smb/client/smb2inode.c b/fs/smb/client/smb2inode.c index 286912616c7339..6c9c229b91f654 100644 --- a/fs/smb/client/smb2inode.c +++ b/fs/smb/client/smb2inode.c @@ -111,7 +111,7 @@ static int check_wsl_eas(struct kvec *rsp_iov) u32 outlen, next; u16 vlen; u8 nlen; - u8 *end; + u8 *ea_end, *iov_end; outlen = le32_to_cpu(rsp->OutputBufferLength); if (outlen < SMB2_WSL_MIN_QUERY_EA_RESP_SIZE || @@ -120,15 +120,19 @@ static int check_wsl_eas(struct kvec *rsp_iov) ea = (void *)((u8 *)rsp_iov->iov_base + le16_to_cpu(rsp->OutputBufferOffset)); - end = (u8 *)rsp_iov->iov_base + rsp_iov->iov_len; + ea_end = (u8 *)ea + outlen; + iov_end = (u8 *)rsp_iov->iov_base + rsp_iov->iov_len; + if (ea_end > iov_end) + return -EINVAL; + for (;;) { - if ((u8 *)ea > end - sizeof(*ea)) + if ((u8 *)ea > ea_end - sizeof(*ea)) return -EINVAL; nlen = ea->ea_name_length; vlen = le16_to_cpu(ea->ea_value_length); if (nlen != SMB2_WSL_XATTR_NAME_LEN || - (u8 *)ea->ea_data + nlen + 1 + vlen > end) + (u8 *)ea->ea_data + nlen + 1 + vlen > ea_end) return -EINVAL; switch (vlen) { diff --git a/fs/smb/client/smb2misc.c b/fs/smb/client/smb2misc.c index 973fce3c959c4b..2a7355ce1a0783 100644 --- a/fs/smb/client/smb2misc.c +++ b/fs/smb/client/smb2misc.c @@ -241,7 +241,8 @@ smb2_check_message(char *buf, unsigned int pdu_len, unsigned int len, if (len != calc_len) { /* create failed on symlink */ if (command == SMB2_CREATE_HE && - shdr->Status == STATUS_STOPPED_ON_SYMLINK) + shdr->Status == STATUS_STOPPED_ON_SYMLINK && + len > calc_len) return 0; /* Windows 7 server returns 24 bytes more */ if (calc_len + 24 == len && command == SMB2_OPLOCK_BREAK_HE) diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c index e6cb9b144530d2..3738204984f5a0 100644 --- a/fs/smb/client/smb2ops.c +++ b/fs/smb/client/smb2ops.c @@ -4721,6 +4721,7 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid, { unsigned int data_offset; unsigned int data_len; + unsigned int end_off; unsigned int cur_off; unsigned int cur_page_idx; unsigned int pad_len; @@ -4836,7 +4837,8 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid, } rdata->got_bytes = buffer_len; - } else if (buf_len >= data_offset + data_len) { + } else if (!check_add_overflow(data_offset, data_len, &end_off) && + buf_len >= end_off) { /* read response payload is in buf */ WARN_ONCE(buffer, "read data can be either in buf or in buffer"); copied = copy_to_iter(buf + data_offset, data_len, &rdata->subreq.io_iter); diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c index cb61051f9af3b3..995fcdd306815f 100644 --- a/fs/smb/client/smb2pdu.c +++ b/fs/smb/client/smb2pdu.c @@ -1713,17 +1713,30 @@ SMB2_auth_kerberos(struct SMB2_sess_data *sess_data) is_binding = (ses->ses_status == SES_GOOD); spin_unlock(&ses->ses_lock); + /* + * Per MS-SMB2 3.2.5.3, Session.SessionKey is the first 16 bytes of the + * GSS cryptographic key, right-padded with zero bytes if shorter. + * Allocate at least SMB2_NTLMV2_SESSKEY_SIZE bytes (zeroed) so the KDF + * input buffer is always valid for HMAC-SHA256 even with deprecated + * Kerberos enctypes that return a short session key. + */ + if (unlikely(msg->sesskey_len < SMB2_NTLMV2_SESSKEY_SIZE)) + cifs_dbg(VFS, + "short GSS session key (%u bytes); zero-padding per MS-SMB2 3.2.5.3\n", + msg->sesskey_len); + kfree_sensitive(ses->auth_key.response); - ses->auth_key.response = kmemdup(msg->data, - msg->sesskey_len, - GFP_KERNEL); + ses->auth_key.len = max_t(unsigned int, msg->sesskey_len, + SMB2_NTLMV2_SESSKEY_SIZE); + ses->auth_key.response = kzalloc(ses->auth_key.len, GFP_KERNEL); if (!ses->auth_key.response) { cifs_dbg(VFS, "%s: can't allocate (%u bytes) memory\n", - __func__, msg->sesskey_len); + __func__, ses->auth_key.len); + ses->auth_key.len = 0; rc = -ENOMEM; goto out_put_spnego_key; } - ses->auth_key.len = msg->sesskey_len; + memcpy(ses->auth_key.response, msg->data, msg->sesskey_len); sess_data->iov[1].iov_base = msg->data + msg->sesskey_len; sess_data->iov[1].iov_len = msg->secblob_len; diff --git a/fs/smb/client/smb2transport.c b/fs/smb/client/smb2transport.c index 41009039b4cbe6..e8eeff9e50d67f 100644 --- a/fs/smb/client/smb2transport.c +++ b/fs/smb/client/smb2transport.c @@ -251,7 +251,8 @@ smb2_calc_signature(struct smb_rqst *rqst, struct TCP_Server_Info *server) } static void generate_key(struct cifs_ses *ses, struct kvec label, - struct kvec context, __u8 *key, unsigned int key_size) + struct kvec context, __u8 *key, unsigned int key_size, + unsigned int full_key_size) { unsigned char zero = 0x0; __u8 i[4] = {0, 0, 0, 1}; @@ -265,7 +266,7 @@ static void generate_key(struct cifs_ses *ses, struct kvec label, memset(key, 0x0, key_size); hmac_sha256_init_usingrawkey(&hmac_ctx, ses->auth_key.response, - SMB2_NTLMV2_SESSKEY_SIZE); + full_key_size); hmac_sha256_update(&hmac_ctx, i, 4); hmac_sha256_update(&hmac_ctx, label.iov_base, label.iov_len); hmac_sha256_update(&hmac_ctx, &zero, 1); @@ -298,6 +299,7 @@ generate_smb3signingkey(struct cifs_ses *ses, struct TCP_Server_Info *server, const struct derivation_triplet *ptriplet) { + unsigned int full_key_size = SMB2_NTLMV2_SESSKEY_SIZE; bool is_binding = false; int chan_index = 0; @@ -330,12 +332,24 @@ generate_smb3signingkey(struct cifs_ses *ses, if (is_binding) { generate_key(ses, ptriplet->signing.label, ptriplet->signing.context, - ses->chans[chan_index].signkey, - SMB3_SIGN_KEY_SIZE); + ses->chans[chan_index].signkey, SMB3_SIGN_KEY_SIZE, + SMB2_NTLMV2_SESSKEY_SIZE); } else { generate_key(ses, ptriplet->signing.label, - ptriplet->signing.context, - ses->smb3signingkey, SMB3_SIGN_KEY_SIZE); + ptriplet->signing.context, ses->smb3signingkey, + SMB3_SIGN_KEY_SIZE, SMB2_NTLMV2_SESSKEY_SIZE); + + /* + * Per MS-SMB2 3.2.5.3.1, signing key always uses Session.SessionKey + * (first 16 bytes). Encryption/decryption keys use + * Session.FullSessionKey when dialect is 3.1.1 and cipher is + * AES-256-CCM or AES-256-GCM, otherwise Session.SessionKey. + */ + + if (server->dialect == SMB311_PROT_ID && + (server->cipher_type == SMB2_ENCRYPTION_AES256_CCM || + server->cipher_type == SMB2_ENCRYPTION_AES256_GCM)) + full_key_size = ses->auth_key.len; /* safe to access primary channel, since it will never go away */ spin_lock(&ses->chan_lock); @@ -345,10 +359,13 @@ generate_smb3signingkey(struct cifs_ses *ses, generate_key(ses, ptriplet->encryption.label, ptriplet->encryption.context, - ses->smb3encryptionkey, SMB3_ENC_DEC_KEY_SIZE); + ses->smb3encryptionkey, SMB3_ENC_DEC_KEY_SIZE, + full_key_size); + generate_key(ses, ptriplet->decryption.label, ptriplet->decryption.context, - ses->smb3decryptionkey, SMB3_ENC_DEC_KEY_SIZE); + ses->smb3decryptionkey, SMB3_ENC_DEC_KEY_SIZE, + full_key_size); } #ifdef CONFIG_CIFS_DEBUG_DUMP_KEYS @@ -361,7 +378,7 @@ generate_smb3signingkey(struct cifs_ses *ses, &ses->Suid); cifs_dbg(VFS, "Cipher type %d\n", server->cipher_type); cifs_dbg(VFS, "Session Key %*ph\n", - SMB2_NTLMV2_SESSKEY_SIZE, ses->auth_key.response); + (int)ses->auth_key.len, ses->auth_key.response); cifs_dbg(VFS, "Signing Key %*ph\n", SMB3_SIGN_KEY_SIZE, ses->smb3signingkey); if ((server->cipher_type == SMB2_ENCRYPTION_AES256_CCM) || diff --git a/fs/smb/client/smbdirect.c b/fs/smb/client/smbdirect.c index 75f9f91a7ec968..563ef488a22585 100644 --- a/fs/smb/client/smbdirect.c +++ b/fs/smb/client/smbdirect.c @@ -9,7 +9,6 @@ #include "cifs_debug.h" #include "cifsproto.h" #include "smb2proto.h" -#include "../smbdirect/public.h" /* Port numbers for SMBD transport */ #define SMB_PORT 445 @@ -558,3 +557,5 @@ void smbd_debug_proc_show(struct TCP_Server_Info *server, struct seq_file *m) server->rdma_readwrite_threshold, m); } + +MODULE_IMPORT_NS("SMBDIRECT"); diff --git a/fs/smb/client/smbdirect.h b/fs/smb/client/smbdirect.h index 287ac849213d40..be205ec02077e6 100644 --- a/fs/smb/client/smbdirect.h +++ b/fs/smb/client/smbdirect.h @@ -12,7 +12,7 @@ #include "cifsglob.h" -#include "../smbdirect/smbdirect.h" +#include extern int rdma_readwrite_threshold; extern int smbd_max_frmr_depth; diff --git a/fs/smb/client/transport.c b/fs/smb/client/transport.c index 05f8099047e1a4..fdf4e50c27ceb4 100644 --- a/fs/smb/client/transport.c +++ b/fs/smb/client/transport.c @@ -1158,7 +1158,7 @@ int cifs_readv_receive(struct TCP_Server_Info *server, struct mid_q_entry *mid) { int length, len; - unsigned int data_offset, data_len; + unsigned int data_offset, data_len, end_off; struct cifs_io_subrequest *rdata = mid->callback_data; char *buf = server->smallbuf; unsigned int buflen = server->pdu_size; @@ -1256,11 +1256,14 @@ cifs_readv_receive(struct TCP_Server_Info *server, struct mid_q_entry *mid) use_rdma_mr = rdata->mr; #endif data_len = server->ops->read_data_length(buf, use_rdma_mr); - if (!use_rdma_mr && (data_offset + data_len > buflen)) { - /* data_len is corrupt -- discard frame */ - rdata->result = smb_EIO2(smb_eio_trace_read_rsp_malformed, - data_offset + data_len, buflen); - return cifs_readv_discard(server, mid); + if (!use_rdma_mr) { + if (check_add_overflow(data_offset, data_len, &end_off) || + end_off > buflen) { + /* data_len is corrupt -- discard frame */ + rdata->result = smb_EIO2(smb_eio_trace_read_rsp_malformed, + end_off, buflen); + return cifs_readv_discard(server, mid); + } } #ifdef CONFIG_CIFS_SMB_DIRECT diff --git a/fs/smb/common/fscc.h b/fs/smb/common/fscc.h index b4ccddca925656..bc3012cc295da6 100644 --- a/fs/smb/common/fscc.h +++ b/fs/smb/common/fscc.h @@ -260,12 +260,12 @@ typedef struct { char FileName[]; } __packed FILE_DIRECTORY_INFO; /* level 0x101 FF resp data */ -/* See MS-FSCC 2.4.13 */ +/* See MS-FSCC 2.4.14 */ struct smb2_file_eof_info { /* encoding of request for level 10 */ __le64 EndOfFile; /* new end of file value */ } __packed; /* level 20 Set */ -/* See MS-FSCC 2.4.14 */ +/* See MS-FSCC 2.4.15 */ typedef struct { __le32 NextEntryOffset; __u32 FileIndex; diff --git a/fs/smb/common/smb2pdu.h b/fs/smb/common/smb2pdu.h index a4b12eb8df81ec..aeb0a245c5324e 100644 --- a/fs/smb/common/smb2pdu.h +++ b/fs/smb/common/smb2pdu.h @@ -1566,6 +1566,10 @@ struct validate_negotiate_info_rsp { #define FILE_STANDARD_LINK_INFORMATION 54 #define FILE_ID_INFORMATION 59 #define FILE_ID_EXTD_DIRECTORY_INFORMATION 60 /* also for QUERY_DIR */ +#define FileId64ExtdDirectoryInformation 78 /* also for QUERY_DIR */ +#define FileId64ExtdBothDirectoryInformation 79 /* also for QUERY_DIR */ +#define FileIdAllExtdDirectoryInformation 80 /* also for QUERY_DIR */ +#define FileIdAllExtdBothDirectoryInformation 81 /* also for QUERY_DIR */ /* Used for Query Info and Find File POSIX Info for SMB3.1.1 and SMB1 */ #define SMB_FIND_FILE_POSIX_INFO 0x064 diff --git a/fs/smb/server/connection.c b/fs/smb/server/connection.c index c5aac4946cbe52..8347495dbc6284 100644 --- a/fs/smb/server/connection.c +++ b/fs/smb/server/connection.c @@ -79,6 +79,85 @@ static int create_proc_clients(void) { return 0; } static void delete_proc_clients(void) {} #endif +static struct workqueue_struct *ksmbd_conn_wq; + +int ksmbd_conn_wq_init(void) +{ + ksmbd_conn_wq = alloc_workqueue("ksmbd-conn-release", + WQ_UNBOUND | WQ_MEM_RECLAIM, 0); + if (!ksmbd_conn_wq) + return -ENOMEM; + return 0; +} + +void ksmbd_conn_wq_destroy(void) +{ + if (ksmbd_conn_wq) { + destroy_workqueue(ksmbd_conn_wq); + ksmbd_conn_wq = NULL; + } +} + +/* + * __ksmbd_conn_release_work() - perform the final, once-per-struct cleanup + * of a ksmbd_conn whose refcount has just dropped to zero. + * + * This is the common release path used by ksmbd_conn_put() for the embedded + * state that outlives the connection thread: async_ida and the attached + * transport (which owns the socket and iov for TCP). Called from a workqueue + * so that sleep-allowed teardown (sock_release -> tcp_close -> + * lock_sock_nested) never runs from an RCU softirq callback (free_opinfo_rcu) + * or any other non-sleeping putter context. + */ +static void __ksmbd_conn_release_work(struct work_struct *work) +{ + struct ksmbd_conn *conn = + container_of(work, struct ksmbd_conn, release_work); + + ida_destroy(&conn->async_ida); + conn->transport->ops->free_transport(conn->transport); + kfree(conn); +} + +/** + * ksmbd_conn_get() - take a reference on @conn and return it. + * + * @conn: connection instance to get a reference to + * + * Returns @conn unchanged so callers can write + * "fp->conn = ksmbd_conn_get(work->conn);" in one expression. Returns NULL + * if @conn is NULL. + */ +struct ksmbd_conn *ksmbd_conn_get(struct ksmbd_conn *conn) +{ + if (!conn) + return NULL; + + atomic_inc(&conn->refcnt); + return conn; +} + +/** + * ksmbd_conn_put() - drop a reference and, if it was the last, queue the + * release onto ksmbd_conn_wq so it runs from process context. + * + * @conn: connection instance to put a reference to + * + * Callable from any context including RCU softirq callbacks and non-sleeping + * locks; the actual release is deferred to the workqueue. ksmbd_conn_wq is + * created in ksmbd_server_init() before any conn can be allocated and is + * destroyed in ksmbd_server_exit() after rcu_barrier(), so it is always + * non-NULL while a conn reference is held. + */ +void ksmbd_conn_put(struct ksmbd_conn *conn) +{ + if (!conn) + return; + + if (atomic_dec_and_test(&conn->refcnt)) + queue_work(ksmbd_conn_wq, &conn->release_work); +} + /** * ksmbd_conn_free() - free resources of the connection instance * @@ -93,23 +172,19 @@ void ksmbd_conn_free(struct ksmbd_conn *conn) hash_del(&conn->hlist); up_write(&conn_list_lock); + /* + * request_buf / preauth_info / mechToken are only ever accessed by the + * connection handler thread that owns @conn. ksmbd_conn_free() is + * called from the transport free_transport() path when that thread is + * exiting, so it is safe to release them unconditionally even when + * ksmbd_conn_put() below is not the final putter (oplock / ksmbd_file + * holders only retain the conn pointer, not these per-thread buffers). + */ xa_destroy(&conn->sessions); kvfree(conn->request_buf); kfree(conn->preauth_info); kfree(conn->mechToken); - if (atomic_dec_and_test(&conn->refcnt)) { - /* - * async_ida is embedded in struct ksmbd_conn, so pair - * ida_destroy() with the final kfree() rather than with - * the unconditional field teardown above. This keeps - * the IDA valid for the entire lifetime of the struct, - * even while other refcount holders (oplock / vfs - * durable handles) still reference the connection. - */ - ida_destroy(&conn->async_ida); - conn->transport->ops->free_transport(conn->transport); - kfree(conn); - } + ksmbd_conn_put(conn); } /** @@ -136,6 +211,7 @@ struct ksmbd_conn *ksmbd_conn_alloc(void) conn->um = ERR_PTR(-EOPNOTSUPP); if (IS_ERR(conn->um)) conn->um = NULL; + INIT_WORK(&conn->release_work, __ksmbd_conn_release_work); atomic_set(&conn->req_running, 0); atomic_set(&conn->r_count, 0); atomic_set(&conn->refcnt, 1); @@ -512,8 +588,7 @@ void ksmbd_conn_r_count_dec(struct ksmbd_conn *conn) if (!atomic_dec_return(&conn->r_count) && waitqueue_active(&conn->r_count_q)) wake_up(&conn->r_count_q); - if (atomic_dec_and_test(&conn->refcnt)) - kfree(conn); + ksmbd_conn_put(conn); } int ksmbd_conn_transport_init(void) diff --git a/fs/smb/server/connection.h b/fs/smb/server/connection.h index de2d46941c930b..e074be9425823b 100644 --- a/fs/smb/server/connection.h +++ b/fs/smb/server/connection.h @@ -16,6 +16,7 @@ #include #include #include +#include #include "smb_common.h" #include "ksmbd_work.h" @@ -120,6 +121,7 @@ struct ksmbd_conn { bool binding; atomic_t refcnt; bool is_aapl; + struct work_struct release_work; }; struct ksmbd_conn_ops { @@ -164,6 +166,10 @@ void ksmbd_conn_wait_idle(struct ksmbd_conn *conn); int ksmbd_conn_wait_idle_sess_id(struct ksmbd_conn *curr_conn, u64 sess_id); struct ksmbd_conn *ksmbd_conn_alloc(void); void ksmbd_conn_free(struct ksmbd_conn *conn); +struct ksmbd_conn *ksmbd_conn_get(struct ksmbd_conn *conn); +void ksmbd_conn_put(struct ksmbd_conn *conn); +int ksmbd_conn_wq_init(void); +void ksmbd_conn_wq_destroy(void); bool ksmbd_conn_lookup_dialect(struct ksmbd_conn *c); int ksmbd_conn_write(struct ksmbd_work *work); int ksmbd_conn_rdma_read(struct ksmbd_conn *conn, diff --git a/fs/smb/server/mgmt/share_config.c b/fs/smb/server/mgmt/share_config.c index 53f44ff4d376f3..6f97f8d39657cd 100644 --- a/fs/smb/server/mgmt/share_config.c +++ b/fs/smb/server/mgmt/share_config.c @@ -167,7 +167,10 @@ static struct ksmbd_share_config *share_config_request(struct ksmbd_work *work, share->path = kstrndup(ksmbd_share_config_path(resp), path_len, KSMBD_DEFAULT_GFP); - if (share->path) { + if (!share->path) { + ret = -ENOMEM; + } else { + ret = 0; share->path_sz = strlen(share->path); while (share->path_sz > 1 && share->path[share->path_sz - 1] == '/') @@ -179,9 +182,10 @@ static struct ksmbd_share_config *share_config_request(struct ksmbd_work *work, share->force_directory_mode = resp->force_directory_mode; share->force_uid = resp->force_uid; share->force_gid = resp->force_gid; - ret = parse_veto_list(share, - KSMBD_SHARE_CONFIG_VETO_LIST(resp), - resp->veto_list_sz); + if (!ret) + ret = parse_veto_list(share, + KSMBD_SHARE_CONFIG_VETO_LIST(resp), + resp->veto_list_sz); if (!ret && share->path) { if (__ksmbd_override_fsids(work, share)) { kill_share(share); diff --git a/fs/smb/server/oplock.c b/fs/smb/server/oplock.c index cd3f28b0e7cb24..8feca02ddbf259 100644 --- a/fs/smb/server/oplock.c +++ b/fs/smb/server/oplock.c @@ -30,7 +30,6 @@ static DEFINE_RWLOCK(lease_list_lock); static struct oplock_info *alloc_opinfo(struct ksmbd_work *work, u64 id, __u16 Tid) { - struct ksmbd_conn *conn = work->conn; struct ksmbd_session *sess = work->sess; struct oplock_info *opinfo; @@ -39,7 +38,7 @@ static struct oplock_info *alloc_opinfo(struct ksmbd_work *work, return NULL; opinfo->sess = sess; - opinfo->conn = conn; + opinfo->conn = ksmbd_conn_get(work->conn); opinfo->level = SMB2_OPLOCK_LEVEL_NONE; opinfo->op_state = OPLOCK_STATE_NONE; opinfo->pending_break = 0; @@ -50,7 +49,6 @@ static struct oplock_info *alloc_opinfo(struct ksmbd_work *work, init_waitqueue_head(&opinfo->oplock_brk); atomic_set(&opinfo->refcount, 1); atomic_set(&opinfo->breaking_cnt, 0); - atomic_inc(&opinfo->conn->refcnt); return opinfo; } @@ -132,8 +130,7 @@ static void __free_opinfo(struct oplock_info *opinfo) { if (opinfo->is_lease) free_lease(opinfo); - if (opinfo->conn && atomic_dec_and_test(&opinfo->conn->refcnt)) - kfree(opinfo->conn); + ksmbd_conn_put(opinfo->conn); kfree(opinfo); } diff --git a/fs/smb/server/server.c b/fs/smb/server/server.c index 58ef02c423fcef..5d799b2d4c62f4 100644 --- a/fs/smb/server/server.c +++ b/fs/smb/server/server.c @@ -596,8 +596,14 @@ static int __init ksmbd_server_init(void) if (ret) goto err_crypto_destroy; + ret = ksmbd_conn_wq_init(); + if (ret) + goto err_workqueue_destroy; + return 0; +err_workqueue_destroy: + ksmbd_workqueue_destroy(); err_crypto_destroy: ksmbd_crypto_destroy(); err_release_inode_hash: @@ -623,6 +629,12 @@ static void __exit ksmbd_server_exit(void) { ksmbd_server_shutdown(); rcu_barrier(); + /* + * ksmbd_conn_put() defers the final release onto ksmbd_conn_wq, + * so drain it after rcu_barrier() has fired any pending RCU + * callbacks that may have queued a release. + */ + ksmbd_conn_wq_destroy(); ksmbd_release_inode_hash(); } diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index 47b7af631f7b35..62d4399a993d80 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -3767,8 +3767,10 @@ int smb2_open(struct ksmbd_work *work) err_out2: if (!rc) { - ksmbd_update_fstate(&work->sess->file_table, fp, FP_INITED); - rc = ksmbd_iov_pin_rsp(work, (void *)rsp, iov_len); + rc = ksmbd_update_fstate(&work->sess->file_table, fp, + FP_INITED); + if (!rc) + rc = ksmbd_iov_pin_rsp(work, (void *)rsp, iov_len); } if (rc) { if (rc == -EINVAL) diff --git a/fs/smb/server/smbacl.c b/fs/smb/server/smbacl.c index 4bbc2c27e6805e..c1d1f34581d69d 100644 --- a/fs/smb/server/smbacl.c +++ b/fs/smb/server/smbacl.c @@ -1068,7 +1068,26 @@ static void smb_set_ace(struct smb_ace *ace, const struct smb_sid *sid, u8 type, ace->flags = flags; ace->access_req = access_req; smb_copy_sid(&ace->sid, sid); - ace->size = cpu_to_le16(1 + 1 + 2 + 4 + 1 + 1 + 6 + (sid->num_subauth * 4)); + ace->size = cpu_to_le16(1 + 1 + 2 + 4 + 1 + 1 + 6 + + (ace->sid.num_subauth * 4)); +} + +static int smb_append_inherited_ace(struct smb_ace **ace, int *nt_size, + u16 *ace_cnt, const struct smb_sid *sid, + u8 type, u8 flags, __le32 access_req) +{ + int ace_size; + + smb_set_ace(*ace, sid, type, flags, access_req); + ace_size = le16_to_cpu((*ace)->size); + /* pdacl->size is __le16 and includes struct smb_acl. */ + if (check_add_overflow(*nt_size, ace_size, nt_size) || + *nt_size > U16_MAX - (int)sizeof(struct smb_acl)) + return -EINVAL; + + (*ace_cnt)++; + *ace = (struct smb_ace *)((char *)*ace + ace_size); + return 0; } int smb_inherit_dacl(struct ksmbd_conn *conn, @@ -1157,6 +1176,12 @@ int smb_inherit_dacl(struct ksmbd_conn *conn, CIFS_SID_BASE_SIZE) break; + if (parent_aces->sid.num_subauth > SID_MAX_SUB_AUTHORITIES || + pace_size < offsetof(struct smb_ace, sid) + + CIFS_SID_BASE_SIZE + + sizeof(__le32) * parent_aces->sid.num_subauth) + break; + aces_size -= pace_size; flags = parent_aces->flags; @@ -1186,22 +1211,24 @@ int smb_inherit_dacl(struct ksmbd_conn *conn, } if (is_dir && creator && flags & CONTAINER_INHERIT_ACE) { - smb_set_ace(aces, psid, parent_aces->type, inherited_flags, - parent_aces->access_req); - nt_size += le16_to_cpu(aces->size); - ace_cnt++; - aces = (struct smb_ace *)((char *)aces + le16_to_cpu(aces->size)); + rc = smb_append_inherited_ace(&aces, &nt_size, &ace_cnt, + psid, parent_aces->type, + inherited_flags, + parent_aces->access_req); + if (rc) + goto free_aces_base; flags |= INHERIT_ONLY_ACE; psid = creator; } else if (is_dir && !(parent_aces->flags & NO_PROPAGATE_INHERIT_ACE)) { psid = &parent_aces->sid; } - smb_set_ace(aces, psid, parent_aces->type, flags | inherited_flags, - parent_aces->access_req); - nt_size += le16_to_cpu(aces->size); - aces = (struct smb_ace *)((char *)aces + le16_to_cpu(aces->size)); - ace_cnt++; + rc = smb_append_inherited_ace(&aces, &nt_size, &ace_cnt, psid, + parent_aces->type, + flags | inherited_flags, + parent_aces->access_req); + if (rc) + goto free_aces_base; pass: parent_aces = (struct smb_ace *)((char *)parent_aces + pace_size); } @@ -1211,7 +1238,7 @@ int smb_inherit_dacl(struct ksmbd_conn *conn, struct smb_acl *pdacl; struct smb_sid *powner_sid = NULL, *pgroup_sid = NULL; int powner_sid_size = 0, pgroup_sid_size = 0, pntsd_size; - int pntsd_alloc_size; + size_t pntsd_alloc_size; if (parent_pntsd->osidoffset) { powner_sid = (struct smb_sid *)((char *)parent_pntsd + @@ -1224,8 +1251,19 @@ int smb_inherit_dacl(struct ksmbd_conn *conn, pgroup_sid_size = 1 + 1 + 6 + (pgroup_sid->num_subauth * 4); } - pntsd_alloc_size = sizeof(struct smb_ntsd) + powner_sid_size + - pgroup_sid_size + sizeof(struct smb_acl) + nt_size; + if (check_add_overflow(sizeof(struct smb_ntsd), + (size_t)powner_sid_size, + &pntsd_alloc_size) || + check_add_overflow(pntsd_alloc_size, + (size_t)pgroup_sid_size, + &pntsd_alloc_size) || + check_add_overflow(pntsd_alloc_size, sizeof(struct smb_acl), + &pntsd_alloc_size) || + check_add_overflow(pntsd_alloc_size, (size_t)nt_size, + &pntsd_alloc_size)) { + rc = -EINVAL; + goto free_aces_base; + } pntsd = kzalloc(pntsd_alloc_size, KSMBD_DEFAULT_GFP); if (!pntsd) { diff --git a/fs/smb/server/transport_rdma.c b/fs/smb/server/transport_rdma.c index a8242c00096f37..b6d63ff8a8a3d5 100644 --- a/fs/smb/server/transport_rdma.c +++ b/fs/smb/server/transport_rdma.c @@ -18,7 +18,6 @@ #include "smb_common.h" #include "../common/smb2status.h" #include "transport_rdma.h" -#include "../smbdirect/public.h" #define SMB_DIRECT_PORT_IWARP 5445 @@ -540,3 +539,5 @@ static const struct ksmbd_transport_ops ksmbd_smb_direct_transport_ops = { .rdma_write = smb_direct_rdma_write, .free_transport = smb_direct_free_transport, }; + +MODULE_IMPORT_NS("SMBDIRECT"); diff --git a/fs/smb/server/transport_rdma.h b/fs/smb/server/transport_rdma.h index bde3d88aecc719..8b78917a179580 100644 --- a/fs/smb/server/transport_rdma.h +++ b/fs/smb/server/transport_rdma.h @@ -25,6 +25,6 @@ static inline void init_smbd_max_io_size(unsigned int sz) { } static inline unsigned int get_smbd_max_read_write_size(struct ksmbd_transport *kt) { return 0; } #endif -#include "../smbdirect/smbdirect.h" +#include #endif /* __KSMBD_TRANSPORT_RDMA_H__ */ diff --git a/fs/smb/server/vfs_cache.c b/fs/smb/server/vfs_cache.c index 3551f01a3fa035..354c4d8a1cfbd1 100644 --- a/fs/smb/server/vfs_cache.c +++ b/fs/smb/server/vfs_cache.c @@ -418,6 +418,14 @@ static void __ksmbd_remove_durable_fd(struct ksmbd_file *fp) return; idr_remove(global_ft.idr, fp->persistent_id); + /* + * Clear persistent_id so a later __ksmbd_close_fd() that runs from a + * delayed putter (e.g. when a concurrent ksmbd_lookup_fd_inode() + * walker held the final reference) does not re-issue idr_remove() on + * an id that idr_alloc_cyclic() may have already handed out to a new + * durable handle. + */ + fp->persistent_id = KSMBD_NO_FID; } static void ksmbd_remove_durable_fd(struct ksmbd_file *fp) @@ -431,13 +439,13 @@ static void ksmbd_remove_durable_fd(struct ksmbd_file *fp) static void __ksmbd_remove_fd(struct ksmbd_file_table *ft, struct ksmbd_file *fp) { - if (!has_file_id(fp->volatile_id)) - return; - down_write(&fp->f_ci->m_lock); list_del_init(&fp->node); up_write(&fp->f_ci->m_lock); + if (!has_file_id(fp->volatile_id)) + return; + write_lock(&ft->lock); idr_remove(ft->idr, fp->volatile_id); write_unlock(&ft->lock); @@ -475,6 +483,17 @@ static void __ksmbd_close_fd(struct ksmbd_file_table *ft, struct ksmbd_file *fp) kfree(smb_lock); } + /* + * Drop fp's strong reference on conn (taken in ksmbd_open_fd() / + * ksmbd_reopen_durable_fd()). Durable fps that reached the + * scavenger have already had fp->conn cleared by session_fd_check(), + * in which case there is nothing to drop here. + */ + if (fp->conn) { + ksmbd_conn_put(fp->conn); + fp->conn = NULL; + } + if (ksmbd_stream_fd(fp)) kfree(fp->stream.name); kfree(fp->owner.name); @@ -510,6 +529,20 @@ static struct ksmbd_file *__ksmbd_lookup_fd(struct ksmbd_file_table *ft, static void __put_fd_final(struct ksmbd_work *work, struct ksmbd_file *fp) { + /* + * Detached durable fp -- session_fd_check() cleared fp->conn at + * preserve, so this fp is no longer tracked by any conn's + * stats.open_files_count. This happens when + * ksmbd_scavenger_dispose_dh() hands the final close off to an + * m_fp_list walker (e.g. ksmbd_lookup_fd_inode()) whose work->conn + * is unrelated to the conn that originally opened the handle; close + * via the NULL-ft path so we do not underflow that unrelated + * counter. + */ + if (!fp->conn) { + __ksmbd_close_fd(NULL, fp); + return; + } __ksmbd_close_fd(&work->sess->file_table, fp); atomic_dec(&work->conn->stats.open_files_count); } @@ -752,7 +785,14 @@ struct ksmbd_file *ksmbd_open_fd(struct ksmbd_work *work, struct file *filp) atomic_set(&fp->refcount, 1); fp->filp = filp; - fp->conn = work->conn; + /* + * fp owns a strong reference on fp->conn for as long as fp->conn is + * non-NULL, so session_fd_check() and __ksmbd_close_fd() never + * dereference a dangling pointer. Paired with ksmbd_conn_put() in + * session_fd_check() (durable preserve), in __ksmbd_close_fd() + * (final close), and on the error paths below. + */ + fp->conn = ksmbd_conn_get(work->conn); fp->tcon = work->tcon; fp->volatile_id = KSMBD_NO_FID; fp->persistent_id = KSMBD_NO_FID; @@ -774,19 +814,64 @@ struct ksmbd_file *ksmbd_open_fd(struct ksmbd_work *work, struct file *filp) return fp; err_out: + /* fp->conn was set and refcounted before every branch here. */ + ksmbd_conn_put(fp->conn); kmem_cache_free(filp_cache, fp); return ERR_PTR(ret); } -void ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp, - unsigned int state) +/** + * ksmbd_update_fstate() - update an fp state under the file-table lock + * @ft: file table that publishes @fp's volatile id + * @fp: file pointer to update + * @state: new state + * + * Return: 0 on success. The FP_NEW -> FP_INITED transition is special: + * -ENOENT if teardown already unpublished @fp by advancing the state or + * clearing the volatile id. Other state updates preserve the historical + * fire-and-forget behavior. + */ +int ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp, + unsigned int state) { + int ret; + if (!fp) - return; + return -ENOENT; write_lock(&ft->lock); - fp->f_state = state; + if (state == FP_INITED && + (fp->f_state != FP_NEW || !has_file_id(fp->volatile_id))) { + ret = -ENOENT; + } else { + fp->f_state = state; + ret = 0; + } write_unlock(&ft->lock); + + return ret; +} + +/* + * ksmbd_mark_fp_closed() - mark fp closed under ft->lock and return how many + * refs the teardown path owns. + * + * FP_INITED has a normal idr-owned reference, so teardown owns both that + * reference and the transient lookup reference. FP_NEW is still owned by the + * in-flight opener/reopener, which will drop the original reference after + * ksmbd_update_fstate(..., FP_INITED) observes the cleared volatile id. + * FP_CLOSED on entry means an earlier ksmbd_close_fd() already consumed the + * idr-owned ref. + */ +static int ksmbd_mark_fp_closed(struct ksmbd_file *fp) +{ + if (fp->f_state == FP_INITED) { + set_close_state_blocked_works(fp); + fp->f_state = FP_CLOSED; + return 2; + } + + return 1; } static int @@ -794,7 +879,8 @@ __close_file_table_ids(struct ksmbd_session *sess, struct ksmbd_tree_connect *tcon, bool (*skip)(struct ksmbd_tree_connect *tcon, struct ksmbd_file *fp, - struct ksmbd_user *user)) + struct ksmbd_user *user), + bool skip_preserves_fp) { struct ksmbd_file_table *ft = &sess->file_table; struct ksmbd_file *fp; @@ -802,32 +888,120 @@ __close_file_table_ids(struct ksmbd_session *sess, int num = 0; while (1) { + int n_to_drop; + write_lock(&ft->lock); fp = idr_get_next(ft->idr, &id); if (!fp) { write_unlock(&ft->lock); break; } - - if (skip(tcon, fp, sess->user) || - !atomic_dec_and_test(&fp->refcount)) { + if (!atomic_inc_not_zero(&fp->refcount)) { id++; write_unlock(&ft->lock); continue; } - set_close_state_blocked_works(fp); - idr_remove(ft->idr, fp->volatile_id); - fp->volatile_id = KSMBD_NO_FID; - write_unlock(&ft->lock); + if (skip_preserves_fp) { + /* + * Session teardown: skip() is session_fd_check(), + * which may sleep and mutates fp->conn / fp->tcon / + * fp->volatile_id when it chooses to preserve fp + * for durable reconnect. Unpublish fp from the + * session idr here, under ft->lock, so that + * __ksmbd_lookup_fd() through this session cannot + * grant a new ksmbd_fp_get() reference to an fp + * whose fields are about to be rewritten outside + * the lock. Durable reconnect still reaches fp via + * global_ft. + */ + idr_remove(ft->idr, id); + fp->volatile_id = KSMBD_NO_FID; + write_unlock(&ft->lock); + if (skip(tcon, fp, sess->user)) { + /* + * session_fd_check() has converted fp to + * durable-preserve state and cleared its + * per-conn fields. fp is already unpublished + * above; the original idr-owned ref keeps it + * alive for the durable scavenger. Drop only + * the transient ref. atomic_dec() is safe -- + * atomic_inc_not_zero() succeeded on a + * positive value and we added one more, so + * refcount cannot be zero here. + */ + atomic_dec(&fp->refcount); + id++; + continue; + } + + /* + * Keep the close-state decision under the same lock + * observed by ksmbd_update_fstate(), which is how an + * in-flight FP_NEW opener learns that teardown has + * cleared its volatile id. + */ + write_lock(&ft->lock); + n_to_drop = ksmbd_mark_fp_closed(fp); + write_unlock(&ft->lock); + } else { + /* + * Tree teardown: skip() is tree_conn_fd_check(), a + * cheap pointer compare that doesn't sleep and has + * no side effects, so keep the skip decision plus + * the unpublish-and-mark-closed sequence atomic + * under ft->lock. fps belonging to other tree + * connects (skip() == true) stay fully published in + * the session idr with no lock window. + */ + if (skip(tcon, fp, sess->user)) { + atomic_dec(&fp->refcount); + write_unlock(&ft->lock); + id++; + continue; + } + idr_remove(ft->idr, id); + fp->volatile_id = KSMBD_NO_FID; + n_to_drop = ksmbd_mark_fp_closed(fp); + write_unlock(&ft->lock); + } + + /* + * fp->volatile_id is already cleared to prevent stale idr + * removal from a deferred final close. Remove fp from + * m_fp_list here because __ksmbd_remove_fd() will skip the + * list unlink when volatile_id is KSMBD_NO_FID. + */ down_write(&fp->f_ci->m_lock); list_del_init(&fp->node); up_write(&fp->f_ci->m_lock); - __ksmbd_close_fd(ft, fp); - - num++; + /* + * Drop the references this iteration owns: + * + * n_to_drop == 2: we observed FP_INITED and committed + * the FP_CLOSED transition ourselves, so we own the + * transient (+1) and the still-intact idr-owned ref. + * + * n_to_drop == 1: either a prior ksmbd_close_fd() + * already consumed the idr-owned ref, or fp was still + * FP_NEW and the in-flight opener/reopener must keep + * the original reference until ksmbd_update_fstate() + * observes the cleared volatile id. + * + * If we end up as the final putter, finalize fp and + * account the open_files_count decrement via the caller's + * atomic_sub(num, ...). Otherwise the remaining user's + * ksmbd_fd_put() reaches __put_fd_final(), which does its + * own atomic_dec(&open_files_count), so we must not count + * this fp here -- doing so would double-decrement the + * connection-wide counter. + */ + if (atomic_sub_and_test(n_to_drop, &fp->refcount)) { + __ksmbd_close_fd(NULL, fp); + num++; + } id++; } @@ -881,24 +1055,37 @@ static bool ksmbd_durable_scavenger_alive(void) return true; } -static void ksmbd_scavenger_dispose_dh(struct list_head *head) +static void ksmbd_scavenger_dispose_dh(struct ksmbd_file *fp) { - while (!list_empty(head)) { - struct ksmbd_file *fp; + /* + * Durable-preserved fp can remain linked on f_ci->m_fp_list for + * share-mode checks. Unlink it before final close; fp->node is not + * available as a scavenger-private list node because re-adding it to + * another list corrupts m_fp_list. + */ + down_write(&fp->f_ci->m_lock); + list_del_init(&fp->node); + up_write(&fp->f_ci->m_lock); - fp = list_first_entry(head, struct ksmbd_file, node); - list_del_init(&fp->node); + /* + * Drop both the durable lifetime reference and the transient reference + * taken by the scavenger under global_ft.lock. If a concurrent + * ksmbd_lookup_fd_inode() (or any other m_fp_list walker) snatched fp + * before the unlink above, that holder owns the final close via + * ksmbd_fd_put() -> __ksmbd_close_fd(). Otherwise the scavenger is + * the last putter and finalises fp here. + */ + if (atomic_sub_and_test(2, &fp->refcount)) __ksmbd_close_fd(NULL, fp); - } } static int ksmbd_durable_scavenger(void *dummy) { struct ksmbd_file *fp = NULL; + struct ksmbd_file *expired_fp; unsigned int id; unsigned int min_timeout = 1; bool found_fp_timeout; - LIST_HEAD(scavenger_list); unsigned long remaining_jiffies; __module_get(THIS_MODULE); @@ -908,8 +1095,6 @@ static int ksmbd_durable_scavenger(void *dummy) if (try_to_freeze()) continue; - found_fp_timeout = false; - remaining_jiffies = wait_event_timeout(dh_wq, ksmbd_durable_scavenger_alive() == false, __msecs_to_jiffies(min_timeout)); @@ -918,23 +1103,39 @@ static int ksmbd_durable_scavenger(void *dummy) else min_timeout = DURABLE_HANDLE_MAX_TIMEOUT; - write_lock(&global_ft.lock); - idr_for_each_entry(global_ft.idr, fp, id) { - if (!fp->durable_timeout) - continue; + do { + expired_fp = NULL; + found_fp_timeout = false; - if (atomic_read(&fp->refcount) > 1 || - fp->conn) - continue; - - found_fp_timeout = true; - if (fp->durable_scavenger_timeout <= - jiffies_to_msecs(jiffies)) { - __ksmbd_remove_durable_fd(fp); - list_add(&fp->node, &scavenger_list); - } else { + write_lock(&global_ft.lock); + idr_for_each_entry(global_ft.idr, fp, id) { unsigned long durable_timeout; + if (!fp->durable_timeout) + continue; + + if (atomic_read(&fp->refcount) > 1 || + fp->conn) + continue; + + found_fp_timeout = true; + if (fp->durable_scavenger_timeout <= + jiffies_to_msecs(jiffies)) { + __ksmbd_remove_durable_fd(fp); + /* + * Take a transient reference so fp + * cannot be freed by an in-flight + * ksmbd_lookup_fd_inode() that found + * it through f_ci->m_fp_list while we + * drop global_ft.lock and reach the + * m_fp_list unlink in + * ksmbd_scavenger_dispose_dh(). + */ + atomic_inc(&fp->refcount); + expired_fp = fp; + break; + } + durable_timeout = fp->durable_scavenger_timeout - jiffies_to_msecs(jiffies); @@ -942,10 +1143,11 @@ static int ksmbd_durable_scavenger(void *dummy) if (min_timeout > durable_timeout) min_timeout = durable_timeout; } - } - write_unlock(&global_ft.lock); + write_unlock(&global_ft.lock); - ksmbd_scavenger_dispose_dh(&scavenger_list); + if (expired_fp) + ksmbd_scavenger_dispose_dh(expired_fp); + } while (expired_fp); if (found_fp_timeout == false) break; @@ -1062,25 +1264,35 @@ static bool session_fd_check(struct ksmbd_tree_connect *tcon, if (!is_reconnectable(fp)) return false; + if (fp->f_state != FP_INITED) + return false; + + if (WARN_ON_ONCE(!fp->conn)) + return false; + if (ksmbd_vfs_copy_durable_owner(fp, user)) return false; + /* + * fp owns a strong reference on fp->conn (taken in ksmbd_open_fd() + * / ksmbd_reopen_durable_fd()), so conn stays valid for the whole + * body of this function regardless of any op->conn puts below. + */ conn = fp->conn; ci = fp->f_ci; down_write(&ci->m_lock); list_for_each_entry_rcu(op, &ci->m_op_list, op_entry) { if (op->conn != conn) continue; - if (op->conn && atomic_dec_and_test(&op->conn->refcnt)) - kfree(op->conn); + ksmbd_conn_put(op->conn); op->conn = NULL; } up_write(&ci->m_lock); list_for_each_entry_safe(smb_lock, tmp_lock, &fp->lock_list, flist) { - spin_lock(&fp->conn->llist_lock); + spin_lock(&conn->llist_lock); list_del_init(&smb_lock->clist); - spin_unlock(&fp->conn->llist_lock); + spin_unlock(&conn->llist_lock); } fp->conn = NULL; @@ -1091,6 +1303,8 @@ static bool session_fd_check(struct ksmbd_tree_connect *tcon, fp->durable_scavenger_timeout = jiffies_to_msecs(jiffies) + fp->durable_timeout; + /* Drop fp's own reference on conn. */ + ksmbd_conn_put(conn); return true; } @@ -1098,7 +1312,8 @@ void ksmbd_close_tree_conn_fds(struct ksmbd_work *work) { int num = __close_file_table_ids(work->sess, work->tcon, - tree_conn_fd_check); + tree_conn_fd_check, + false); atomic_sub(num, &work->conn->stats.open_files_count); } @@ -1107,7 +1322,8 @@ void ksmbd_close_session_fds(struct ksmbd_work *work) { int num = __close_file_table_ids(work->sess, work->tcon, - session_fd_check); + session_fd_check, + true); atomic_sub(num, &work->conn->stats.open_files_count); } @@ -1178,15 +1394,27 @@ int ksmbd_reopen_durable_fd(struct ksmbd_work *work, struct ksmbd_file *fp) old_f_state = fp->f_state; fp->f_state = FP_NEW; + + /* + * Initialize fp's connection binding before publishing fp into the + * session's file table. If __open_id() is ordered first, a + * concurrent teardown that iterates the table can observe a valid + * volatile_id with fp->conn == NULL and preserve a + * partially-initialized fp. fp owns a strong reference on the new + * conn (see ksmbd_open_fd()); undo it on __open_id() failure. + */ + fp->conn = ksmbd_conn_get(conn); + fp->tcon = work->tcon; + __open_id(&work->sess->file_table, fp, OPEN_ID_TYPE_VOLATILE_ID); if (!has_file_id(fp->volatile_id)) { + fp->conn = NULL; + fp->tcon = NULL; + ksmbd_conn_put(conn); fp->f_state = old_f_state; return -EBADF; } - fp->conn = conn; - fp->tcon = work->tcon; - list_for_each_entry(smb_lock, &fp->lock_list, flist) { spin_lock(&conn->llist_lock); list_add_tail(&smb_lock->clist, &conn->lock_list); @@ -1198,8 +1426,7 @@ int ksmbd_reopen_durable_fd(struct ksmbd_work *work, struct ksmbd_file *fp) list_for_each_entry_rcu(op, &ci->m_op_list, op_entry) { if (op->conn) continue; - op->conn = fp->conn; - atomic_inc(&op->conn->refcnt); + op->conn = ksmbd_conn_get(fp->conn); } up_write(&ci->m_lock); @@ -1228,7 +1455,7 @@ void ksmbd_destroy_file_table(struct ksmbd_session *sess) if (!ft->idr) return; - __close_file_table_ids(sess, NULL, session_fd_check); + __close_file_table_ids(sess, NULL, session_fd_check, true); idr_destroy(ft->idr); kfree(ft->idr); ft->idr = NULL; diff --git a/fs/smb/server/vfs_cache.h b/fs/smb/server/vfs_cache.h index 866f32c10d4dda..e6871266a94bad 100644 --- a/fs/smb/server/vfs_cache.h +++ b/fs/smb/server/vfs_cache.h @@ -172,8 +172,8 @@ int ksmbd_close_inode_fds(struct ksmbd_work *work, struct inode *inode); int ksmbd_init_global_file_table(void); void ksmbd_free_global_file_table(void); void ksmbd_set_fd_limit(unsigned long limit); -void ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp, - unsigned int state); +int ksmbd_update_fstate(struct ksmbd_file_table *ft, struct ksmbd_file *fp, + unsigned int state); bool ksmbd_vfs_compare_durable_owner(struct ksmbd_file *fp, struct ksmbd_user *user); diff --git a/fs/smb/smbdirect/accept.c b/fs/smb/smbdirect/accept.c index 704b271af3a8c0..5297400058385e 100644 --- a/fs/smb/smbdirect/accept.c +++ b/fs/smb/smbdirect/accept.c @@ -854,4 +854,4 @@ struct smbdirect_socket *smbdirect_socket_accept(struct smbdirect_socket *lsc, return nsc; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_socket_accept); +EXPORT_SYMBOL_GPL(smbdirect_socket_accept); diff --git a/fs/smb/smbdirect/connect.c b/fs/smb/smbdirect/connect.c index 8addee43a38113..cd726b399afecb 100644 --- a/fs/smb/smbdirect/connect.c +++ b/fs/smb/smbdirect/connect.c @@ -60,7 +60,7 @@ int smbdirect_connect(struct smbdirect_socket *sc, const struct sockaddr *dst) */ return 0; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connect); +EXPORT_SYMBOL_GPL(smbdirect_connect); static int smbdirect_connect_setup_connection(struct smbdirect_socket *sc) { @@ -922,4 +922,4 @@ int smbdirect_connect_sync(struct smbdirect_socket *sc, return 0; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connect_sync); +EXPORT_SYMBOL_GPL(smbdirect_connect_sync); diff --git a/fs/smb/smbdirect/connection.c b/fs/smb/smbdirect/connection.c index 822366718d457d..8adf580975344b 100644 --- a/fs/smb/smbdirect/connection.c +++ b/fs/smb/smbdirect/connection.c @@ -706,7 +706,7 @@ bool smbdirect_connection_is_connected(struct smbdirect_socket *sc) return false; return true; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connection_is_connected); +EXPORT_SYMBOL_GPL(smbdirect_connection_is_connected); int smbdirect_connection_wait_for_connected(struct smbdirect_socket *sc) { @@ -779,7 +779,7 @@ int smbdirect_connection_wait_for_connected(struct smbdirect_socket *sc) return 0; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connection_wait_for_connected); +EXPORT_SYMBOL_GPL(smbdirect_connection_wait_for_connected); void smbdirect_connection_idle_timer_work(struct work_struct *work) { @@ -958,7 +958,7 @@ int smbdirect_connection_send_batch_flush(struct smbdirect_socket *sc, return ret; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connection_send_batch_flush); +EXPORT_SYMBOL_GPL(smbdirect_connection_send_batch_flush); struct smbdirect_send_batch * smbdirect_init_send_batch_storage(struct smbdirect_send_batch_storage *storage, @@ -976,7 +976,7 @@ smbdirect_init_send_batch_storage(struct smbdirect_send_batch_storage *storage, return batch; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_init_send_batch_storage); +EXPORT_SYMBOL_GPL(smbdirect_init_send_batch_storage); static int smbdirect_connection_wait_for_send_bcredit(struct smbdirect_socket *sc, struct smbdirect_send_batch *batch) @@ -1263,7 +1263,7 @@ int smbdirect_connection_send_single_iter(struct smbdirect_socket *sc, bcredit_failed: return ret; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connection_send_single_iter); +EXPORT_SYMBOL_GPL(smbdirect_connection_send_single_iter); int smbdirect_connection_send_wait_zero_pending(struct smbdirect_socket *sc) { @@ -1288,7 +1288,7 @@ int smbdirect_connection_send_wait_zero_pending(struct smbdirect_socket *sc) return 0; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connection_send_wait_zero_pending); +EXPORT_SYMBOL_GPL(smbdirect_connection_send_wait_zero_pending); int smbdirect_connection_send_iter(struct smbdirect_socket *sc, struct iov_iter *iter, @@ -1373,7 +1373,7 @@ int smbdirect_connection_send_iter(struct smbdirect_socket *sc, return total_count; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connection_send_iter); +EXPORT_SYMBOL_GPL(smbdirect_connection_send_iter); static void smbdirect_connection_send_io_done(struct ib_cq *cq, struct ib_wc *wc) { @@ -1937,7 +1937,7 @@ int smbdirect_connection_recvmsg(struct smbdirect_socket *sc, goto again; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connection_recvmsg); +EXPORT_SYMBOL_GPL(smbdirect_connection_recvmsg); static bool smbdirect_map_sges_single_page(struct smbdirect_map_sges *state, struct page *page, size_t off, size_t len) @@ -2168,7 +2168,7 @@ static ssize_t smbdirect_map_sges_from_iter(struct iov_iter *iter, size_t len, if (ret < 0) { while (state->num_sge > before) { - struct ib_sge *sge = &state->sge[state->num_sge--]; + struct ib_sge *sge = &state->sge[--state->num_sge]; ib_dma_unmap_page(state->device, sge->addr, diff --git a/fs/smb/smbdirect/debug.c b/fs/smb/smbdirect/debug.c index a66a19d4a46341..05ba7c8d165eec 100644 --- a/fs/smb/smbdirect/debug.c +++ b/fs/smb/smbdirect/debug.c @@ -85,4 +85,4 @@ void smbdirect_connection_legacy_debug_proc_show(struct smbdirect_socket *sc, atomic_read(&sc->mr_io.ready.count), atomic_read(&sc->mr_io.used.count)); } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connection_legacy_debug_proc_show); +EXPORT_SYMBOL_GPL(smbdirect_connection_legacy_debug_proc_show); diff --git a/fs/smb/smbdirect/devices.c b/fs/smb/smbdirect/devices.c index 44962f221c352f..7adacbdfe12e71 100644 --- a/fs/smb/smbdirect/devices.c +++ b/fs/smb/smbdirect/devices.c @@ -238,7 +238,7 @@ u8 smbdirect_netdev_rdma_capable_node_type(struct net_device *netdev) return RDMA_NODE_UNSPECIFIED; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_netdev_rdma_capable_node_type); +EXPORT_SYMBOL_GPL(smbdirect_netdev_rdma_capable_node_type); __init int smbdirect_devices_init(void) { diff --git a/fs/smb/smbdirect/internal.h b/fs/smb/smbdirect/internal.h index 2d5acf2c21bc55..e9959e6dc13ae8 100644 --- a/fs/smb/smbdirect/internal.h +++ b/fs/smb/smbdirect/internal.h @@ -6,11 +6,11 @@ #ifndef __FS_SMB_COMMON_SMBDIRECT_INTERNAL_H__ #define __FS_SMB_COMMON_SMBDIRECT_INTERNAL_H__ +#define DEFAULT_SYMBOL_NAMESPACE "SMBDIRECT" #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt -#include "smbdirect.h" +#include #include "pdu.h" -#include "public.h" #include diff --git a/fs/smb/smbdirect/listen.c b/fs/smb/smbdirect/listen.c index 143a7618d95f3c..2f78bcaedbf82e 100644 --- a/fs/smb/smbdirect/listen.c +++ b/fs/smb/smbdirect/listen.c @@ -90,7 +90,7 @@ int smbdirect_socket_listen(struct smbdirect_socket *sc, int backlog) */ return 0; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_socket_listen); +EXPORT_SYMBOL_GPL(smbdirect_socket_listen); static int smbdirect_new_rdma_event_handler(struct rdma_cm_id *new_id, struct rdma_cm_event *event) diff --git a/fs/smb/smbdirect/mr.c b/fs/smb/smbdirect/mr.c index 5228e699cd5d41..15c6363a2f97af 100644 --- a/fs/smb/smbdirect/mr.c +++ b/fs/smb/smbdirect/mr.c @@ -269,7 +269,7 @@ smbdirect_connection_register_mr_io(struct smbdirect_socket *sc, { const struct smbdirect_socket_parameters *sp = &sc->parameters; struct smbdirect_mr_io *mr; - int ret, num_pages; + int ret, num_pages, num_mapped; struct ib_reg_wr *reg_wr; num_pages = iov_iter_npages(iter, sp->max_frmr_depth + 1); @@ -300,19 +300,22 @@ smbdirect_connection_register_mr_io(struct smbdirect_socket *sc, num_pages, iov_iter_count(iter), sp->max_frmr_depth); smbdirect_iter_to_sgt(iter, &mr->sgt, sp->max_frmr_depth); - ret = ib_dma_map_sg(sc->ib.dev, mr->sgt.sgl, mr->sgt.nents, mr->dir); - if (!ret) { + num_mapped = ib_dma_map_sg(sc->ib.dev, mr->sgt.sgl, mr->sgt.nents, mr->dir); + if (!num_mapped) { smbdirect_log_rdma_mr(sc, SMBDIRECT_LOG_ERR, - "ib_dma_map_sg num_pages=%u dir=%x ret=%d (%1pe)\n", - num_pages, mr->dir, ret, SMBDIRECT_DEBUG_ERR_PTR(ret)); + "ib_dma_map_sg num_pages=%u dir=%x num_mapped=%d\n", + num_pages, mr->dir, num_mapped); + ret = -EIO; goto dma_map_error; } - ret = ib_map_mr_sg(mr->mr, mr->sgt.sgl, mr->sgt.nents, NULL, PAGE_SIZE); - if (ret != mr->sgt.nents) { + ret = ib_map_mr_sg(mr->mr, mr->sgt.sgl, num_mapped, NULL, PAGE_SIZE); + if (ret != num_mapped) { smbdirect_log_rdma_mr(sc, SMBDIRECT_LOG_ERR, - "ib_map_mr_sg failed ret = %d nents = %u\n", - ret, mr->sgt.nents); + "ib_map_mr_sg failed ret = %d num_mapped = %u\n", + ret, num_mapped); + if (ret >= 0) + ret = -EIO; goto map_mr_error; } @@ -380,7 +383,7 @@ smbdirect_connection_register_mr_io(struct smbdirect_socket *sc, mutex_unlock(&mr->mutex); return NULL; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connection_register_mr_io); +EXPORT_SYMBOL_GPL(smbdirect_connection_register_mr_io); void smbdirect_mr_io_fill_buffer_descriptor(struct smbdirect_mr_io *mr, struct smbdirect_buffer_descriptor_v1 *v1) @@ -397,7 +400,7 @@ void smbdirect_mr_io_fill_buffer_descriptor(struct smbdirect_mr_io *mr, } mutex_unlock(&mr->mutex); } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_mr_io_fill_buffer_descriptor); +EXPORT_SYMBOL_GPL(smbdirect_mr_io_fill_buffer_descriptor); /* * Deregister a MR after I/O is done @@ -490,4 +493,4 @@ void smbdirect_connection_deregister_mr_io(struct smbdirect_mr_io *mr) if (!kref_put(&mr->kref, smbdirect_mr_io_free_locked)) mutex_unlock(&mr->mutex); } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connection_deregister_mr_io); +EXPORT_SYMBOL_GPL(smbdirect_connection_deregister_mr_io); diff --git a/fs/smb/smbdirect/rw.c b/fs/smb/smbdirect/rw.c index c2f46b17731ec0..6fe38042cfb968 100644 --- a/fs/smb/smbdirect/rw.c +++ b/fs/smb/smbdirect/rw.c @@ -252,4 +252,4 @@ int smbdirect_connection_rdma_xmit(struct smbdirect_socket *sc, kfree(msg); goto out; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_connection_rdma_xmit); +EXPORT_SYMBOL_GPL(smbdirect_connection_rdma_xmit); diff --git a/fs/smb/smbdirect/smbdirect.h b/fs/smb/smbdirect/smbdirect.h deleted file mode 100644 index bbab5f7f7cc9b1..00000000000000 --- a/fs/smb/smbdirect/smbdirect.h +++ /dev/null @@ -1,52 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0-or-later */ -/* - * Copyright (C) 2025 Stefan Metzmacher - */ - -#ifndef __FS_SMB_COMMON_SMBDIRECT_SMBDIRECT_H__ -#define __FS_SMB_COMMON_SMBDIRECT_SMBDIRECT_H__ - -#include - -/* SMB-DIRECT buffer descriptor V1 structure [MS-SMBD] 2.2.3.1 */ -struct smbdirect_buffer_descriptor_v1 { - __le64 offset; - __le32 token; - __le32 length; -} __packed; - -/* - * Connection parameters mostly from [MS-SMBD] 3.1.1.1 - * - * These are setup and negotiated at the beginning of a - * connection and remain constant unless explicitly changed. - * - * Some values are important for the upper layer. - */ -struct smbdirect_socket_parameters { - __u64 flags; -#define SMBDIRECT_FLAG_PORT_RANGE_ONLY_IB ((__u64)0x1) -#define SMBDIRECT_FLAG_PORT_RANGE_ONLY_IW ((__u64)0x2) - __u32 resolve_addr_timeout_msec; - __u32 resolve_route_timeout_msec; - __u32 rdma_connect_timeout_msec; - __u32 negotiate_timeout_msec; - __u16 initiator_depth; /* limited to U8_MAX */ - __u16 responder_resources; /* limited to U8_MAX */ - __u16 recv_credit_max; - __u16 send_credit_target; - __u32 max_send_size; - __u32 max_fragmented_send_size; - __u32 max_recv_size; - __u32 max_fragmented_recv_size; - __u32 max_read_write_size; - __u32 max_frmr_depth; - __u32 keepalive_interval_msec; - __u32 keepalive_timeout_msec; -} __packed; - -#define SMBDIRECT_FLAG_PORT_RANGE_MASK ( \ - SMBDIRECT_FLAG_PORT_RANGE_ONLY_IB | \ - SMBDIRECT_FLAG_PORT_RANGE_ONLY_IW) - -#endif /* __FS_SMB_COMMON_SMBDIRECT_SMBDIRECT_H__ */ diff --git a/fs/smb/smbdirect/socket.c b/fs/smb/smbdirect/socket.c index 1b4ab01b745e65..39cca7219c4df8 100644 --- a/fs/smb/smbdirect/socket.c +++ b/fs/smb/smbdirect/socket.c @@ -20,7 +20,7 @@ bool smbdirect_frwr_is_supported(const struct ib_device_attr *attrs) return false; return true; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_frwr_is_supported); +EXPORT_SYMBOL_GPL(smbdirect_frwr_is_supported); static void smbdirect_socket_cleanup_work(struct work_struct *work); @@ -107,7 +107,7 @@ int smbdirect_socket_create_kern(struct net *net, struct smbdirect_socket **_sc) alloc_failed: return ret; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_socket_create_kern); +EXPORT_SYMBOL_GPL(smbdirect_socket_create_kern); int smbdirect_socket_init_accepting(struct rdma_cm_id *id, struct smbdirect_socket *sc) { @@ -148,7 +148,7 @@ int smbdirect_socket_create_accepting(struct rdma_cm_id *id, struct smbdirect_so alloc_failed: return ret; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_socket_create_accepting); +EXPORT_SYMBOL_GPL(smbdirect_socket_create_accepting); int smbdirect_socket_set_initial_parameters(struct smbdirect_socket *sc, const struct smbdirect_socket_parameters *sp) @@ -189,14 +189,14 @@ int smbdirect_socket_set_initial_parameters(struct smbdirect_socket *sc, return 0; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_socket_set_initial_parameters); +EXPORT_SYMBOL_GPL(smbdirect_socket_set_initial_parameters); const struct smbdirect_socket_parameters * smbdirect_socket_get_current_parameters(struct smbdirect_socket *sc) { return &sc->parameters; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_socket_get_current_parameters); +EXPORT_SYMBOL_GPL(smbdirect_socket_get_current_parameters); int smbdirect_socket_set_kernel_settings(struct smbdirect_socket *sc, enum ib_poll_context poll_ctx, @@ -220,7 +220,7 @@ int smbdirect_socket_set_kernel_settings(struct smbdirect_socket *sc, return 0; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_socket_set_kernel_settings); +EXPORT_SYMBOL_GPL(smbdirect_socket_set_kernel_settings); void smbdirect_socket_set_logging(struct smbdirect_socket *sc, void *private_ptr, @@ -240,7 +240,7 @@ void smbdirect_socket_set_logging(struct smbdirect_socket *sc, sc->logging.needed = needed; sc->logging.vaprintf = vaprintf; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_socket_set_logging); +EXPORT_SYMBOL_GPL(smbdirect_socket_set_logging); static void smbdirect_socket_wake_up_all(struct smbdirect_socket *sc) { @@ -663,13 +663,13 @@ int smbdirect_socket_bind(struct smbdirect_socket *sc, struct sockaddr *addr) return 0; } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_socket_bind); +EXPORT_SYMBOL_GPL(smbdirect_socket_bind); void smbdirect_socket_shutdown(struct smbdirect_socket *sc) { smbdirect_socket_schedule_cleanup_lvl(sc, SMBDIRECT_LOG_INFO, -ESHUTDOWN); } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_socket_shutdown); +EXPORT_SYMBOL_GPL(smbdirect_socket_shutdown); static void smbdirect_socket_release_disconnect(struct kref *kref) { @@ -712,7 +712,7 @@ void smbdirect_socket_release(struct smbdirect_socket *sc) */ kref_put(&sc->refs.destroy, smbdirect_socket_release_destroy); } -__SMBDIRECT_EXPORT_SYMBOL__(smbdirect_socket_release); +EXPORT_SYMBOL_GPL(smbdirect_socket_release); int smbdirect_socket_wait_for_credits(struct smbdirect_socket *sc, enum smbdirect_socket_status expected_status, diff --git a/fs/xfs/libxfs/xfs_dir2_data.c b/fs/xfs/libxfs/xfs_dir2_data.c index 80ba94f51e5c7c..aecbab61014c77 100644 --- a/fs/xfs/libxfs/xfs_dir2_data.c +++ b/fs/xfs/libxfs/xfs_dir2_data.c @@ -382,6 +382,7 @@ xfs_dir3_data_write_verify( struct xfs_mount *mp = bp->b_mount; struct xfs_buf_log_item *bip = bp->b_log_item; struct xfs_dir3_blk_hdr *hdr3 = bp->b_addr; + struct xfs_dir3_data_hdr *datahdr3 = bp->b_addr; xfs_failaddr_t fa; fa = xfs_dir3_data_verify(bp); @@ -396,6 +397,11 @@ xfs_dir3_data_write_verify( if (bip) hdr3->lsn = cpu_to_be64(bip->bli_item.li_lsn); + /* + * Zero padding that may be stale from old kernels. + */ + datahdr3->pad = 0; + xfs_buf_update_cksum(bp, XFS_DIR3_DATA_CRC_OFF); } @@ -728,7 +734,6 @@ xfs_dir3_data_init( struct xfs_dir2_data_unused *dup; struct xfs_dir2_data_free *bf; int error; - int i; /* * Get the buffer set up for the block. @@ -741,13 +746,16 @@ xfs_dir3_data_init( xfs_trans_buf_set_type(tp, bp, XFS_BLFT_DIR_DATA_BUF); /* - * Initialize the header. + * Initialize the whole directory header region to zero + * so that all padding, bestfree entries, and any + * future header fields are clean. */ hdr = bp->b_addr; + memset(hdr, 0, geo->data_entry_offset); + if (xfs_has_crc(mp)) { struct xfs_dir3_blk_hdr *hdr3 = bp->b_addr; - memset(hdr3, 0, sizeof(*hdr3)); hdr3->magic = cpu_to_be32(XFS_DIR3_DATA_MAGIC); hdr3->blkno = cpu_to_be64(xfs_buf_daddr(bp)); hdr3->owner = cpu_to_be64(args->owner); @@ -759,10 +767,6 @@ xfs_dir3_data_init( bf = xfs_dir2_data_bestfree_p(mp, hdr); bf[0].offset = cpu_to_be16(geo->data_entry_offset); bf[0].length = cpu_to_be16(geo->blksize - geo->data_entry_offset); - for (i = 1; i < XFS_DIR2_DATA_FD_COUNT; i++) { - bf[i].length = 0; - bf[i].offset = 0; - } /* * Set up an unused entry for the block's body. diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index 40c7f0ff6cf3a7..0ec6ccd8b4dcf5 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -1414,8 +1414,7 @@ xfs_refcount_finish_one( if (rcur == NULL) { struct xfs_perag *pag = to_perag(ri->ri_group); - error = xfs_alloc_read_agf(pag, tp, - XFS_ALLOC_FLAG_FREEING, &agbp); + error = xfs_alloc_read_agf(pag, tp, 0, &agbp); if (error) return error; diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c index 20e63069088b39..3d40cb0b2496c4 100644 --- a/fs/xfs/scrub/common.c +++ b/fs/xfs/scrub/common.c @@ -251,6 +251,17 @@ xchk_ino_set_preen( trace_xchk_ino_preen(sc, ino, __return_address); } +/* Record a block indexed by a file fork that could be optimized. */ +void +xchk_fblock_set_preen( + struct xfs_scrub *sc, + int whichfork, + xfs_fileoff_t offset) +{ + sc->sm->sm_flags |= XFS_SCRUB_OFLAG_PREEN; + trace_xchk_fblock_preen(sc, whichfork, offset, __return_address); +} + /* Record something being wrong with the filesystem primary superblock. */ void xchk_set_corrupt( diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h index f2ecc68538f0c3..b494d747c0084a 100644 --- a/fs/xfs/scrub/common.h +++ b/fs/xfs/scrub/common.h @@ -25,6 +25,8 @@ bool xchk_fblock_xref_process_error(struct xfs_scrub *sc, void xchk_block_set_preen(struct xfs_scrub *sc, struct xfs_buf *bp); void xchk_ino_set_preen(struct xfs_scrub *sc, xfs_ino_t ino); +void xchk_fblock_set_preen(struct xfs_scrub *sc, + int whichfork, xfs_fileoff_t offset); void xchk_set_corrupt(struct xfs_scrub *sc); void xchk_block_set_corrupt(struct xfs_scrub *sc, diff --git a/fs/xfs/scrub/dabtree.c b/fs/xfs/scrub/dabtree.c index 1a71d36898b1d7..c2d6ad59d03ef2 100644 --- a/fs/xfs/scrub/dabtree.c +++ b/fs/xfs/scrub/dabtree.c @@ -454,7 +454,12 @@ xchk_da_btree_block( } } - /* XXX: Check hdr3.pad32 once we know how to fix it. */ + if (xfs_has_crc(ip->i_mount)) { + struct xfs_da3_node_hdr *nodehdr3 = blk->bp->b_addr; + + if (nodehdr3->__pad32) + xchk_da_set_preen(ds, level); + } break; default: xchk_da_set_corrupt(ds, level); diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c index e09724cd372553..09715a4aa154bf 100644 --- a/fs/xfs/scrub/dir.c +++ b/fs/xfs/scrub/dir.c @@ -492,7 +492,12 @@ xchk_directory_data_bestfree( goto out; xchk_buffer_recheck(sc, bp); - /* XXX: Check xfs_dir3_data_hdr.pad is zero once we start setting it. */ + if (xfs_has_crc(sc->mp)) { + struct xfs_dir3_data_hdr *hdr3 = bp->b_addr; + + if (hdr3->pad) + xchk_fblock_set_preen(sc, XFS_DATA_FORK, lblk); + } if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) goto out_buf; diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index beaa26ec62da40..9978ac1422fc4f 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -699,12 +699,6 @@ xfs_create( */ error = xfs_trans_alloc_icreate(mp, tres, udqp, gdqp, pdqp, resblks, &tp); - if (error == -ENOSPC) { - /* flush outstanding delalloc blocks and retry */ - xfs_flush_inodes(mp); - error = xfs_trans_alloc_icreate(mp, tres, udqp, gdqp, pdqp, - resblks, &tp); - } if (error) goto out_parent; diff --git a/fs/xfs/xfs_notify_failure.c b/fs/xfs/xfs_notify_failure.c index 64c8afb935c261..b994ff15d5e455 100644 --- a/fs/xfs/xfs_notify_failure.c +++ b/fs/xfs/xfs_notify_failure.c @@ -350,7 +350,7 @@ xfs_dax_notify_dev_failure( /* * Shutdown fs from a force umount in pre-remove case which won't fail, * so errors can be ignored. Otherwise, shutdown the filesystem with - * CORRUPT flag if error occured or notify.want_shutdown was set during + * CORRUPT flag if error occurred or notify.want_shutdown was set during * RMAP querying. */ if (mf_flags & MF_MEM_PRE_REMOVE) diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c index bcc470f56e4662..148cc32449c1f8 100644 --- a/fs/xfs/xfs_trans.c +++ b/fs/xfs/xfs_trans.c @@ -1199,10 +1199,21 @@ xfs_trans_alloc_icreate( { struct xfs_trans *tp; bool retried = false; + bool flushed = false; int error; retry: error = xfs_trans_alloc(mp, resv, dblocks, 0, 0, &tp); + if (error == -ENOSPC && !flushed) { + /* + * Flush all delalloc blocks to reclaim space from speculative + * preallocation. This is similar to the quota retry below + * but targets FS-wide ENOSPC. + */ + xfs_flush_inodes(mp); + flushed = true; + goto retry; + } if (error) return error; diff --git a/fs/xfs/xfs_zone_alloc.c b/fs/xfs/xfs_zone_alloc.c index a851b98143c0b4..5e297b75a85f63 100644 --- a/fs/xfs/xfs_zone_alloc.c +++ b/fs/xfs/xfs_zone_alloc.c @@ -1170,7 +1170,7 @@ xfs_calc_open_zones( if (bdev_open_zones && bdev_open_zones < mp->m_max_open_zones) { mp->m_max_open_zones = bdev_open_zones; - xfs_info(mp, "limiting open zones to %u due to hardware limit.\n", + xfs_info(mp, "limiting open zones to %u due to hardware limit.", bdev_open_zones); } @@ -1217,7 +1217,7 @@ xfs_alloc_zone_info( return zi; out_free_bitmaps: - while (--i > 0) + while (--i >= 0) kvfree(zi->zi_used_bucket_bitmap[i]); kfree(zi); return NULL; diff --git a/fs/xfs/xfs_zone_gc.c b/fs/xfs/xfs_zone_gc.c index fedcc47048aff5..c8a1d5c0332c52 100644 --- a/fs/xfs/xfs_zone_gc.c +++ b/fs/xfs/xfs_zone_gc.c @@ -1221,7 +1221,7 @@ xfs_zone_gc_mount( if (data->oz) xfs_open_zone_put(data->oz); out_free_gc_data: - kfree(data); + xfs_zone_gc_data_free(data); return error; } diff --git a/include/asm-generic/kprobes.h b/include/asm-generic/kprobes.h index 060eab094e5a22..5290a2b2e15a0f 100644 --- a/include/asm-generic/kprobes.h +++ b/include/asm-generic/kprobes.h @@ -14,7 +14,7 @@ static unsigned long __used \ _kbl_addr_##fname = (unsigned long)fname; # define NOKPROBE_SYMBOL(fname) __NOKPROBE_SYMBOL(fname) /* Use this to forbid a kprobes attach on very low level functions */ -# define __kprobes __section(".kprobes.text") +# define __kprobes notrace __section(".kprobes.text") # define nokprobe_inline __always_inline #else # define NOKPROBE_SYMBOL(fname) diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h index 33e80f30b8b822..a5d386583fb6e0 100644 --- a/include/drm/ttm/ttm_resource.h +++ b/include/drm/ttm/ttm_resource.h @@ -448,6 +448,8 @@ void ttm_resource_add_bulk_move(struct ttm_resource *res, struct ttm_buffer_object *bo); void ttm_resource_del_bulk_move(struct ttm_resource *res, struct ttm_buffer_object *bo); +void ttm_resource_del_bulk_move_unevictable(struct ttm_resource *res, + struct ttm_buffer_object *bo); void ttm_resource_move_to_lru_tail(struct ttm_resource *res); void ttm_resource_init(struct ttm_buffer_object *bo, diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h index 50b47eba7d0152..e7195750d21bb4 100644 --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -105,6 +105,12 @@ ARM_SMCCC_SMC_32, \ 0, 0x3fff) +/* C1-Pro erratum 4193714: SME DVMSync early acknowledgement */ +#define ARM_SMCCC_CPU_WORKAROUND_4193714 \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_32, \ + ARM_SMCCC_OWNER_CPU, 0x10) + #define ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID \ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ ARM_SMCCC_SMC_32, \ diff --git a/include/linux/bio.h b/include/linux/bio.h index 97d747320b35bc..dc17780d6c1e36 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -475,7 +475,8 @@ void __bio_release_pages(struct bio *bio, bool mark_dirty); extern void bio_set_pages_dirty(struct bio *bio); extern void bio_check_pages_dirty(struct bio *bio); -int bio_iov_iter_bounce(struct bio *bio, struct iov_iter *iter, size_t maxlen); +int bio_iov_iter_bounce(struct bio *bio, struct iov_iter *iter, size_t maxlen, + size_t minsize); void bio_iov_iter_unbounce(struct bio *bio, bool is_error, bool mark_dirty); extern void bio_copy_data_iter(struct bio *dst, struct bvec_iter *dst_iter, diff --git a/include/linux/bpf.h b/include/linux/bpf.h index b4b703c90ca94f..01e20396489287 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -3725,6 +3725,7 @@ extern const struct bpf_func_proto bpf_for_each_map_elem_proto; extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto; extern const struct bpf_func_proto bpf_sk_setsockopt_proto; extern const struct bpf_func_proto bpf_sk_getsockopt_proto; +extern const struct bpf_func_proto bpf_sk_setsockopt_nodelay_proto; extern const struct bpf_func_proto bpf_unlocked_sk_setsockopt_proto; extern const struct bpf_func_proto bpf_unlocked_sk_getsockopt_proto; extern const struct bpf_func_proto bpf_find_vma_proto; diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h index f42563739d2e53..50a784da7a81aa 100644 --- a/include/linux/cgroup-defs.h +++ b/include/linux/cgroup-defs.h @@ -611,8 +611,8 @@ struct cgroup { /* used to wait for offlining of csses */ wait_queue_head_t offline_waitq; - /* used by cgroup_rmdir() to wait for dying tasks to leave */ - wait_queue_head_t dying_populated_waitq; + /* defers killing csses after removal until cgroup is depopulated */ + struct work_struct finish_destroy_work; /* used to schedule release agent */ struct work_struct release_agent_work; diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index e52160e85af4b5..f6d037a30fd899 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -53,6 +53,7 @@ struct kernel_clone_args; enum css_task_iter_flags { CSS_TASK_ITER_PROCS = (1U << 0), /* walk only threadgroup leaders */ CSS_TASK_ITER_THREADED = (1U << 1), /* walk all threaded css_sets in the domain */ + CSS_TASK_ITER_WITH_DEAD = (1U << 2), /* include exiting tasks */ CSS_TASK_ITER_SKIPPED = (1U << 16), /* internal flags */ }; diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h index 0a3bcd1718f379..be1b38c981d4d9 100644 --- a/include/linux/fprobe.h +++ b/include/linux/fprobe.h @@ -94,6 +94,7 @@ int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num); int register_fprobe_syms(struct fprobe *fp, const char **syms, int num); int unregister_fprobe(struct fprobe *fp); +int unregister_fprobe_async(struct fprobe *fp); bool fprobe_is_registered(struct fprobe *fp); int fprobe_count_ips_from_filter(const char *filter, const char *notfilter); #else @@ -113,6 +114,10 @@ static inline int unregister_fprobe(struct fprobe *fp) { return -EOPNOTSUPP; } +static inline int unregister_fprobe_async(struct fprobe *fp) +{ + return -EOPNOTSUPP; +} static inline bool fprobe_is_registered(struct fprobe *fp) { return false; diff --git a/include/linux/hid.h b/include/linux/hid.h index 442a80d79e899d..bfb9859f391ee5 100644 --- a/include/linux/hid.h +++ b/include/linux/hid.h @@ -1030,6 +1030,8 @@ struct hid_field *hid_find_field(struct hid_device *hdev, unsigned int report_ty int hid_set_field(struct hid_field *, unsigned, __s32); int hid_input_report(struct hid_device *hid, enum hid_report_type type, u8 *data, u32 size, int interrupt); +int hid_safe_input_report(struct hid_device *hid, enum hid_report_type type, u8 *data, + size_t bufsize, u32 size, int interrupt); struct hid_field *hidinput_get_led_field(struct hid_device *hid); unsigned int hidinput_count_leds(struct hid_device *hid); __s32 hidinput_calc_abs_res(const struct hid_field *field, __u16 code); @@ -1298,8 +1300,8 @@ static inline u32 hid_report_len(struct hid_report *report) return DIV_ROUND_UP(report->size, 8) + (report->id > 0); } -int hid_report_raw_event(struct hid_device *hid, enum hid_report_type type, u8 *data, u32 size, - int interrupt); +int hid_report_raw_event(struct hid_device *hid, enum hid_report_type type, u8 *data, + size_t bufsize, u32 size, int interrupt); /* HID quirks API */ unsigned long hid_lookup_quirk(const struct hid_device *hdev); diff --git a/include/linux/hid_bpf.h b/include/linux/hid_bpf.h index a2e47dbcf82c8b..19fffa4574a47c 100644 --- a/include/linux/hid_bpf.h +++ b/include/linux/hid_bpf.h @@ -72,8 +72,8 @@ struct hid_ops { int (*hid_hw_output_report)(struct hid_device *hdev, __u8 *buf, size_t len, u64 source, bool from_bpf); int (*hid_input_report)(struct hid_device *hid, enum hid_report_type type, - u8 *data, u32 size, int interrupt, u64 source, bool from_bpf, - bool lock_already_taken); + u8 *data, size_t bufsize, u32 size, int interrupt, u64 source, + bool from_bpf, bool lock_already_taken); struct module *owner; const struct bus_type *bus_type; }; @@ -200,7 +200,8 @@ struct hid_bpf { #ifdef CONFIG_HID_BPF u8 *dispatch_hid_bpf_device_event(struct hid_device *hid, enum hid_report_type type, u8 *data, - u32 *size, int interrupt, u64 source, bool from_bpf); + size_t *buf_size, u32 *size, int interrupt, u64 source, + bool from_bpf); int dispatch_hid_bpf_raw_requests(struct hid_device *hdev, unsigned char reportnum, __u8 *buf, u32 size, enum hid_report_type rtype, @@ -215,8 +216,11 @@ int hid_bpf_device_init(struct hid_device *hid); const u8 *call_hid_bpf_rdesc_fixup(struct hid_device *hdev, const u8 *rdesc, unsigned int *size); #else /* CONFIG_HID_BPF */ static inline u8 *dispatch_hid_bpf_device_event(struct hid_device *hid, enum hid_report_type type, - u8 *data, u32 *size, int interrupt, - u64 source, bool from_bpf) { return data; } + u8 *data, size_t *buf_size, u32 *size, + int interrupt, u64 source, bool from_bpf) +{ + return data; +} static inline int dispatch_hid_bpf_raw_requests(struct hid_device *hdev, unsigned char reportnum, u8 *buf, u32 size, enum hid_report_type rtype, diff --git a/include/linux/intel_tpmi.h b/include/linux/intel_tpmi.h index 94c06bf214fb6f..15f02422e9ca5e 100644 --- a/include/linux/intel_tpmi.h +++ b/include/linux/intel_tpmi.h @@ -28,6 +28,12 @@ enum intel_tpmi_id { TPMI_INFO_ID = 0x81, /* Special ID for PCI BDF and Package ID information */ }; +#define TPMI_CORE_INIT 0 +#define TPMI_CORE_EXIT 1 + +int tpmi_register_notifier(struct notifier_block *nb); +int tpmi_unregister_notifier(struct notifier_block *nb); + struct oobmsm_plat_info *tpmi_get_platform_data(struct auxiliary_device *auxdev); struct resource *tpmi_get_resource_at_index(struct auxiliary_device *auxdev, int index); int tpmi_get_resource_count(struct auxiliary_device *auxdev); diff --git a/include/linux/irq-entry-common.h b/include/linux/irq-entry-common.h index 167fba7dbf043f..1fabf0f5ea8e76 100644 --- a/include/linux/irq-entry-common.h +++ b/include/linux/irq-entry-common.h @@ -218,14 +218,6 @@ static __always_inline void __exit_to_user_mode_validate(void) lockdep_sys_exit(); } -/* Temporary workaround to keep ARM64 alive */ -static __always_inline void exit_to_user_mode_prepare_legacy(struct pt_regs *regs) -{ - __exit_to_user_mode_prepare(regs, EXIT_TO_USER_MODE_WORK); - rseq_exit_to_user_mode_legacy(); - __exit_to_user_mode_validate(); -} - /** * syscall_exit_to_user_mode_prepare - call exit_to_user_mode_loop() if required * @regs: Pointer to pt_regs on entry stack diff --git a/include/linux/irqchip/arm-gic-v5.h b/include/linux/irqchip/arm-gic-v5.h index 40d2fce682940a..f78787e654f4c6 100644 --- a/include/linux/irqchip/arm-gic-v5.h +++ b/include/linux/irqchip/arm-gic-v5.h @@ -425,9 +425,6 @@ struct gicv5_its_itt_cfg { void gicv5_init_lpis(u32 max); void gicv5_deinit_lpis(void); -int gicv5_alloc_lpi(void); -void gicv5_free_lpi(u32 lpi); - void __init gicv5_its_of_probe(struct device_node *parent); void __init gicv5_its_acpi_probe(void); #endif diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 77c778d84d4cba..5a1c5c336fa4fe 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -146,6 +146,9 @@ struct xt_match { /* Called when user tries to insert an entry of this type. */ int (*checkentry)(const struct xt_mtchk_param *); + /* Called to validate hooks based on the match configuration. */ + int (*check_hooks)(const struct xt_mtchk_param *); + /* Called when entry of this type deleted. */ void (*destroy)(const struct xt_mtdtor_param *); #ifdef CONFIG_NETFILTER_XTABLES_COMPAT @@ -187,6 +190,9 @@ struct xt_target { /* Should return 0 on success or an error code otherwise (-Exxxx). */ int (*checkentry)(const struct xt_tgchk_param *); + /* Called to validate hooks based on the target configuration. */ + int (*check_hooks)(const struct xt_tgchk_param *); + /* Called when entry of this type deleted. */ void (*destroy)(const struct xt_tgdtor_param *); #ifdef CONFIG_NETFILTER_XTABLES_COMPAT @@ -279,8 +285,10 @@ bool xt_find_jump_offset(const unsigned int *offsets, int xt_check_proc_name(const char *name, unsigned int size); +int xt_check_hooks_match(struct xt_mtchk_param *par); int xt_check_match(struct xt_mtchk_param *, unsigned int size, u16 proto, bool inv_proto); +int xt_check_hooks_target(struct xt_tgchk_param *par); int xt_check_target(struct xt_tgchk_param *, unsigned int size, u16 proto, bool inv_proto); @@ -297,9 +305,11 @@ struct xt_counters *xt_counters_alloc(unsigned int counters); struct xt_table *xt_register_table(struct net *net, const struct xt_table *table, + const struct nf_hook_ops *template_ops, struct xt_table_info *bootstrap, struct xt_table_info *newinfo); -void *xt_unregister_table(struct xt_table *table); +void xt_unregister_table_pre_exit(struct net *net, u8 af, const char *name); +struct xt_table *xt_unregister_table_exit(struct net *net, u8 af, const char *name); struct xt_table_info *xt_replace_table(struct xt_table *table, unsigned int num_counters, diff --git a/include/linux/netfilter_arp/arp_tables.h b/include/linux/netfilter_arp/arp_tables.h index a40aaf645fa479..05631a25e62293 100644 --- a/include/linux/netfilter_arp/arp_tables.h +++ b/include/linux/netfilter_arp/arp_tables.h @@ -53,7 +53,6 @@ int arpt_register_table(struct net *net, const struct xt_table *table, const struct arpt_replace *repl, const struct nf_hook_ops *ops); void arpt_unregister_table(struct net *net, const char *name); -void arpt_unregister_table_pre_exit(struct net *net, const char *name); extern unsigned int arpt_do_table(void *priv, struct sk_buff *skb, const struct nf_hook_state *state); diff --git a/include/linux/netfilter_ipv4/ip_tables.h b/include/linux/netfilter_ipv4/ip_tables.h index 132b0e4a6d4df6..13593391d6058d 100644 --- a/include/linux/netfilter_ipv4/ip_tables.h +++ b/include/linux/netfilter_ipv4/ip_tables.h @@ -26,7 +26,6 @@ int ipt_register_table(struct net *net, const struct xt_table *table, const struct ipt_replace *repl, const struct nf_hook_ops *ops); -void ipt_unregister_table_pre_exit(struct net *net, const char *name); void ipt_unregister_table_exit(struct net *net, const char *name); /* Standard entry. */ diff --git a/include/linux/netfilter_ipv6/ip6_tables.h b/include/linux/netfilter_ipv6/ip6_tables.h index 8b8885a73c7649..c6d5b927830dd3 100644 --- a/include/linux/netfilter_ipv6/ip6_tables.h +++ b/include/linux/netfilter_ipv6/ip6_tables.h @@ -27,7 +27,6 @@ extern void *ip6t_alloc_initial_table(const struct xt_table *); int ip6t_register_table(struct net *net, const struct xt_table *table, const struct ip6t_replace *repl, const struct nf_hook_ops *ops); -void ip6t_unregister_table_pre_exit(struct net *net, const char *name); void ip6t_unregister_table_exit(struct net *net, const char *name); extern unsigned int ip6t_do_table(void *priv, struct sk_buff *skb, const struct nf_hook_state *state); diff --git a/include/linux/of.h b/include/linux/of.h index 959786f8f19662..28153616e616a3 100644 --- a/include/linux/of.h +++ b/include/linux/of.h @@ -1476,6 +1476,13 @@ static inline int of_property_read_s32(const struct device_node *np, return of_property_read_u32(np, propname, (u32*) out_value); } +static inline int of_property_read_s32_index(const struct device_node *np, + const char *propname, u32 index, + s32 *out_value) +{ + return of_property_read_u32_index(np, propname, index, (u32 *)out_value); +} + #define of_for_each_phandle(it, err, np, ln, cn, cc) \ for (of_phandle_iterator_init((it), (np), (ln), (cn), (cc)), \ err = of_phandle_iterator_next(it); \ diff --git a/include/linux/rseq.h b/include/linux/rseq.h index b9d62fc2140dd1..7ef79b25e714b9 100644 --- a/include/linux/rseq.h +++ b/include/linux/rseq.h @@ -9,6 +9,11 @@ void __rseq_handle_slowpath(struct pt_regs *regs); +static __always_inline bool rseq_v2(struct task_struct *t) +{ + return IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY) && likely(t->rseq.event.has_rseq > 1); +} + /* Invoked from resume_user_mode_work() */ static inline void rseq_handle_slowpath(struct pt_regs *regs) { @@ -16,8 +21,7 @@ static inline void rseq_handle_slowpath(struct pt_regs *regs) if (current->rseq.event.slowpath) __rseq_handle_slowpath(regs); } else { - /* '&' is intentional to spare one conditional branch */ - if (current->rseq.event.sched_switch & current->rseq.event.has_rseq) + if (current->rseq.event.sched_switch && current->rseq.event.has_rseq) __rseq_handle_slowpath(regs); } } @@ -30,9 +34,9 @@ void __rseq_signal_deliver(int sig, struct pt_regs *regs); */ static inline void rseq_signal_deliver(struct ksignal *ksig, struct pt_regs *regs) { - if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY)) { - /* '&' is intentional to spare one conditional branch */ - if (current->rseq.event.has_rseq & current->rseq.event.user_irq) + if (rseq_v2(current)) { + /* has_rseq is implied in rseq_v2() */ + if (current->rseq.event.user_irq) __rseq_signal_deliver(ksig->sig, regs); } else { if (current->rseq.event.has_rseq) @@ -50,15 +54,22 @@ static __always_inline void rseq_sched_switch_event(struct task_struct *t) { struct rseq_event *ev = &t->rseq.event; - if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY)) { + /* + * Only apply the user_irq optimization for RSEQ ABI V2 registrations. + * Legacy users like TCMalloc rely on the original ABI V1 behaviour + * which updates IDs on every context swtich. + */ + if (rseq_v2(t)) { /* - * Avoid a boat load of conditionals by using simple logic - * to determine whether NOTIFY_RESUME needs to be raised. + * Avoid a boat load of conditionals by using simple logic to + * determine whether TIF_NOTIFY_RESUME or TIF_RSEQ needs to be + * raised. * - * It's required when the CPU or MM CID has changed or - * the entry was from user space. + * It's required when the CPU or MM CID has changed or the entry + * was via interrupt from user space. ev->has_rseq does not have + * to be evaluated here because rseq_v2() implies has_rseq. */ - bool raise = (ev->user_irq | ev->ids_changed) & ev->has_rseq; + bool raise = ev->user_irq | ev->ids_changed; if (raise) { ev->sched_switch = true; @@ -66,6 +77,7 @@ static __always_inline void rseq_sched_switch_event(struct task_struct *t) } } else { if (ev->has_rseq) { + t->rseq.event.ids_changed = true; t->rseq.event.sched_switch = true; rseq_raise_notify_resume(t); } @@ -119,6 +131,8 @@ static inline void rseq_virt_userspace_exit(void) static inline void rseq_reset(struct task_struct *t) { + /* Protect against preemption and membarrier IPI */ + guard(irqsave)(); memset(&t->rseq, 0, sizeof(t->rseq)); t->rseq.ids.cpu_id = RSEQ_CPU_ID_UNINITIALIZED; } @@ -159,6 +173,7 @@ static inline unsigned int rseq_alloc_align(void) } #else /* CONFIG_RSEQ */ +static inline bool rseq_v2(struct task_struct *t) { return false; } static inline void rseq_handle_slowpath(struct pt_regs *regs) { } static inline void rseq_signal_deliver(struct ksignal *ksig, struct pt_regs *regs) { } static inline void rseq_sched_switch_event(struct task_struct *t) { } diff --git a/include/linux/rseq_entry.h b/include/linux/rseq_entry.h index f11ebd34f8b955..63bc72086e75bb 100644 --- a/include/linux/rseq_entry.h +++ b/include/linux/rseq_entry.h @@ -111,6 +111,20 @@ static __always_inline void rseq_slice_clear_grant(struct task_struct *t) t->rseq.slice.state.granted = false; } +/* + * Open coded, so it can be invoked within a user access region. + * + * This clears the user space state of the time slice extensions field only when + * the task has registered the optimized RSEQ_ABI V2. Some legacy registrations, + * e.g. TCMalloc, have conflicting non-ABI fields in struct RSEQ, which would be + * overwritten by an unconditional write. + */ +#define rseq_slice_clear_user(rseq, efault) \ +do { \ + if (rseq_slice_extension_enabled()) \ + unsafe_put_user(0U, &rseq->slice_ctrl.all, efault); \ +} while (0) + static __always_inline bool __rseq_grant_slice_extension(bool work_pending) { struct task_struct *curr = current; @@ -230,10 +244,10 @@ static __always_inline bool rseq_slice_extension_enabled(void) { return false; } static __always_inline bool rseq_arm_slice_extension_timer(void) { return false; } static __always_inline void rseq_slice_clear_grant(struct task_struct *t) { } static __always_inline bool rseq_grant_slice_extension(unsigned long ti_work, unsigned long mask) { return false; } +#define rseq_slice_clear_user(rseq, efault) do { } while (0) #endif /* !CONFIG_RSEQ_SLICE_EXTENSION */ bool rseq_debug_update_user_cs(struct task_struct *t, struct pt_regs *regs, unsigned long csaddr); -bool rseq_debug_validate_ids(struct task_struct *t); static __always_inline void rseq_note_user_irq_entry(void) { @@ -353,43 +367,6 @@ bool rseq_debug_update_user_cs(struct task_struct *t, struct pt_regs *regs, return false; } -/* - * On debug kernels validate that user space did not mess with it if the - * debug branch is enabled. - */ -bool rseq_debug_validate_ids(struct task_struct *t) -{ - struct rseq __user *rseq = t->rseq.usrptr; - u32 cpu_id, uval, node_id; - - /* - * On the first exit after registering the rseq region CPU ID is - * RSEQ_CPU_ID_UNINITIALIZED and node_id in user space is 0! - */ - node_id = t->rseq.ids.cpu_id != RSEQ_CPU_ID_UNINITIALIZED ? - cpu_to_node(t->rseq.ids.cpu_id) : 0; - - scoped_user_read_access(rseq, efault) { - unsafe_get_user(cpu_id, &rseq->cpu_id_start, efault); - if (cpu_id != t->rseq.ids.cpu_id) - goto die; - unsafe_get_user(uval, &rseq->cpu_id, efault); - if (uval != cpu_id) - goto die; - unsafe_get_user(uval, &rseq->node_id, efault); - if (uval != node_id) - goto die; - unsafe_get_user(uval, &rseq->mm_cid, efault); - if (uval != t->rseq.ids.mm_cid) - goto die; - } - return true; -die: - t->rseq.event.fatal = true; -efault: - return false; -} - #endif /* RSEQ_BUILD_SLOW_PATH */ /* @@ -499,37 +476,50 @@ rseq_update_user_cs(struct task_struct *t, struct pt_regs *regs, unsigned long c * faults in task context are fatal too. */ static rseq_inline -bool rseq_set_ids_get_csaddr(struct task_struct *t, struct rseq_ids *ids, - u32 node_id, u64 *csaddr) +bool rseq_set_ids_get_csaddr(struct task_struct *t, struct rseq_ids *ids, u64 *csaddr) { struct rseq __user *rseq = t->rseq.usrptr; - if (static_branch_unlikely(&rseq_debug_enabled)) { - if (!rseq_debug_validate_ids(t)) - return false; - } - scoped_user_rw_access(rseq, efault) { + /* Validate the R/O fields for debug and optimized mode */ + if (static_branch_unlikely(&rseq_debug_enabled) || rseq_v2(t)) { + u32 cpu_id, uval; + + unsafe_get_user(cpu_id, &rseq->cpu_id_start, efault); + if (cpu_id != t->rseq.ids.cpu_id) + goto die; + unsafe_get_user(uval, &rseq->cpu_id, efault); + if (uval != cpu_id) + goto die; + unsafe_get_user(uval, &rseq->node_id, efault); + if (uval != t->rseq.ids.node_id) + goto die; + unsafe_get_user(uval, &rseq->mm_cid, efault); + if (uval != t->rseq.ids.mm_cid) + goto die; + } + unsafe_put_user(ids->cpu_id, &rseq->cpu_id_start, efault); unsafe_put_user(ids->cpu_id, &rseq->cpu_id, efault); - unsafe_put_user(node_id, &rseq->node_id, efault); + unsafe_put_user(ids->node_id, &rseq->node_id, efault); unsafe_put_user(ids->mm_cid, &rseq->mm_cid, efault); if (csaddr) unsafe_get_user(*csaddr, &rseq->rseq_cs, efault); - /* Open coded, so it's in the same user access region */ - if (rseq_slice_extension_enabled()) { - /* Unconditionally clear it, no point in conditionals */ - unsafe_put_user(0U, &rseq->slice_ctrl.all, efault); - } + /* RSEQ ABI V2 only operations */ + if (rseq_v2(t)) + rseq_slice_clear_user(rseq, efault); } rseq_slice_clear_grant(t); /* Cache the new values */ - t->rseq.ids.cpu_cid = ids->cpu_cid; + t->rseq.ids = *ids; rseq_stat_inc(rseq_stats.ids); rseq_trace_update(t, ids); return true; + +die: + t->rseq.event.fatal = true; efault: return false; } @@ -539,11 +529,11 @@ bool rseq_set_ids_get_csaddr(struct task_struct *t, struct rseq_ids *ids, * is in a critical section. */ static rseq_inline bool rseq_update_usr(struct task_struct *t, struct pt_regs *regs, - struct rseq_ids *ids, u32 node_id) + struct rseq_ids *ids) { u64 csaddr; - if (!rseq_set_ids_get_csaddr(t, ids, node_id, &csaddr)) + if (!rseq_set_ids_get_csaddr(t, ids, &csaddr)) return false; /* @@ -612,6 +602,14 @@ static __always_inline bool rseq_exit_user_update(struct pt_regs *regs, struct t * interrupts disabled */ guard(pagefault)(); + /* + * This optimization is only valid when the task registered for the + * optimized RSEQ_ABI_V2 variant. Some legacy users rely on the original + * RSEQ implementation behaviour which unconditionally updated the IDs. + * rseq_sched_switch_event() ensures that legacy registrations always + * have both sched_switch and ids_changed set, which is compatible with + * the historical TIF_NOTIFY_RESUME behaviour. + */ if (likely(!t->rseq.event.ids_changed)) { struct rseq __user *rseq = t->rseq.usrptr; /* @@ -623,11 +621,9 @@ static __always_inline bool rseq_exit_user_update(struct pt_regs *regs, struct t scoped_user_rw_access(rseq, efault) { unsafe_get_user(csaddr, &rseq->rseq_cs, efault); - /* Open coded, so it's in the same user access region */ - if (rseq_slice_extension_enabled()) { - /* Unconditionally clear it, no point in conditionals */ - unsafe_put_user(0U, &rseq->slice_ctrl.all, efault); - } + /* RSEQ ABI V2 only operations */ + if (rseq_v2(t)) + rseq_slice_clear_user(rseq, efault); } rseq_slice_clear_grant(t); @@ -640,12 +636,12 @@ static __always_inline bool rseq_exit_user_update(struct pt_regs *regs, struct t } struct rseq_ids ids = { - .cpu_id = task_cpu(t), - .mm_cid = task_mm_cid(t), + .cpu_id = task_cpu(t), + .mm_cid = task_mm_cid(t), + .node_id = cpu_to_node(ids.cpu_id), }; - u32 node_id = cpu_to_node(ids.cpu_id); - return rseq_update_usr(t, regs, &ids, node_id); + return rseq_update_usr(t, regs, &ids); efault: return false; } @@ -753,24 +749,6 @@ static __always_inline void rseq_irqentry_exit_to_user_mode(void) ev->events = 0; } -/* Required to keep ARM64 working */ -static __always_inline void rseq_exit_to_user_mode_legacy(void) -{ - struct rseq_event *ev = ¤t->rseq.event; - - rseq_stat_inc(rseq_stats.exit); - - if (static_branch_unlikely(&rseq_debug_enabled)) - WARN_ON_ONCE(ev->sched_switch); - - /* - * Ensure that event (especially user_irq) is cleared when the - * interrupt did not result in a schedule and therefore the - * rseq processing did not clear it. - */ - ev->events = 0; -} - void __rseq_debug_syscall_return(struct pt_regs *regs); static __always_inline void rseq_debug_syscall_return(struct pt_regs *regs) @@ -786,7 +764,6 @@ static inline bool rseq_exit_to_user_mode_restart(struct pt_regs *regs, unsigned } static inline void rseq_syscall_exit_to_user_mode(void) { } static inline void rseq_irqentry_exit_to_user_mode(void) { } -static inline void rseq_exit_to_user_mode_legacy(void) { } static inline void rseq_debug_syscall_return(struct pt_regs *regs) { } static inline bool rseq_grant_slice_extension(unsigned long ti_work, unsigned long mask) { return false; } #endif /* !CONFIG_RSEQ */ diff --git a/include/linux/rseq_types.h b/include/linux/rseq_types.h index 0b42045988db00..85739a63e85e6f 100644 --- a/include/linux/rseq_types.h +++ b/include/linux/rseq_types.h @@ -9,6 +9,12 @@ #ifdef CONFIG_RSEQ struct rseq; +/* + * rseq_event::has_rseq contains the ABI version number so preserving it + * in AND operations requires a mask. + */ +#define RSEQ_HAS_RSEQ_VERSION_MASK 0xff + /** * struct rseq_event - Storage for rseq related event management * @all: Compound to initialize and clear the data efficiently @@ -17,7 +23,8 @@ struct rseq; * exit to user * @ids_changed: Indicator that IDs need to be updated * @user_irq: True on interrupt entry from user mode - * @has_rseq: True if the task has a rseq pointer installed + * @has_rseq: Greater than 0 if the task has a rseq pointer installed. + * Contains the RSEQ version number * @error: Compound error code for the slow path to analyze * @fatal: User space data corrupted or invalid * @slowpath: Indicator that slow path processing via TIF_NOTIFY_RESUME @@ -59,8 +66,9 @@ struct rseq_event { * compiler emit a single compare on 64-bit * @cpu_id: The CPU ID which was written last to user space * @mm_cid: The MM CID which was written last to user space + * @node_id: The node ID which was written last to user space * - * @cpu_id and @mm_cid are updated when the data is written to user space. + * @cpu_id, @mm_cid and @node_id are updated when the data is written to user space. */ struct rseq_ids { union { @@ -70,6 +78,7 @@ struct rseq_ids { u32 mm_cid; }; }; + u32 node_id; }; /** diff --git a/include/linux/sched.h b/include/linux/sched.h index 368c7b4d7cb510..ee06cba5c6f538 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1002,6 +1002,9 @@ struct task_struct { unsigned sched_rt_mutex:1; #endif + /* Save user-dumpable when mm goes away */ + unsigned user_dumpable:1; + /* Bit to tell TOMOYO we're in execve(): */ unsigned in_execve:1; unsigned in_iowait:1; diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h index 1198138cb839eb..273538200a4459 100644 --- a/include/linux/sched/deadline.h +++ b/include/linux/sched/deadline.h @@ -33,6 +33,15 @@ struct root_domain; extern void dl_add_task_root_domain(struct task_struct *p); extern void dl_clear_root_domain(struct root_domain *rd); extern void dl_clear_root_domain_cpu(int cpu); +/* + * Return whether moving DL task @p to @new_mask requires moving DL + * bandwidth accounting between root domains. This helper is specific to + * DL bandwidth move accounting semantics and is shared by + * cpuset_can_attach() and set_cpus_allowed_dl() so both paths use the + * same source root-domain test. + */ +extern bool dl_task_needs_bw_move(struct task_struct *p, + const struct cpumask *new_mask); extern u64 dl_cookie; extern bool dl_bw_visited(int cpu, u64 cookie); diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h index 1a3af2ea2a794a..2129e18ada58ba 100644 --- a/include/linux/sched/ext.h +++ b/include/linux/sched/ext.h @@ -103,21 +103,25 @@ enum scx_ent_flags { SCX_TASK_IMMED = 1 << 5, /* task is on local DSQ with %SCX_ENQ_IMMED */ /* - * Bits 8 and 9 are used to carry task state: + * Bits 8 to 10 are used to carry task state: * * NONE ops.init_task() not called yet + * INIT_BEGIN ops.init_task() in flight; see sched_ext_dead() * INIT ops.init_task() succeeded, but task can be cancelled * READY fully initialized, but not in sched_ext * ENABLED fully initialized and in sched_ext + * DEAD terminal state set by sched_ext_dead() */ - SCX_TASK_STATE_SHIFT = 8, /* bits 8 and 9 are used to carry task state */ - SCX_TASK_STATE_BITS = 2, + SCX_TASK_STATE_SHIFT = 8, + SCX_TASK_STATE_BITS = 3, SCX_TASK_STATE_MASK = ((1 << SCX_TASK_STATE_BITS) - 1) << SCX_TASK_STATE_SHIFT, SCX_TASK_NONE = 0 << SCX_TASK_STATE_SHIFT, - SCX_TASK_INIT = 1 << SCX_TASK_STATE_SHIFT, - SCX_TASK_READY = 2 << SCX_TASK_STATE_SHIFT, - SCX_TASK_ENABLED = 3 << SCX_TASK_STATE_SHIFT, + SCX_TASK_INIT_BEGIN = 1 << SCX_TASK_STATE_SHIFT, + SCX_TASK_INIT = 2 << SCX_TASK_STATE_SHIFT, + SCX_TASK_READY = 3 << SCX_TASK_STATE_SHIFT, + SCX_TASK_ENABLED = 4 << SCX_TASK_STATE_SHIFT, + SCX_TASK_DEAD = 5 << SCX_TASK_STATE_SHIFT, /* * Bits 12 and 13 are used to carry reenqueue reason. In addition to diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h index dc3975ff1b2e1a..cf0fd03dd7a24e 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -20,6 +20,11 @@ enum hk_type { HK_TYPE_KERNEL_NOISE, HK_TYPE_MAX, + /* + * HK_TYPE_KTHREAD is now an alias of HK_TYPE_DOMAIN + */ + HK_TYPE_KTHREAD = HK_TYPE_DOMAIN, + /* * The following housekeeping types are only set by the nohz_full * boot commandline option. So they can share the same value. @@ -29,7 +34,6 @@ enum hk_type { HK_TYPE_RCU = HK_TYPE_KERNEL_NOISE, HK_TYPE_MISC = HK_TYPE_KERNEL_NOISE, HK_TYPE_WQ = HK_TYPE_KERNEL_NOISE, - HK_TYPE_KTHREAD = HK_TYPE_KERNEL_NOISE }; #ifdef CONFIG_CPU_ISOLATION diff --git a/include/linux/slab.h b/include/linux/slab.h index 15a60b501b95b6..2b5ab488e96b07 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -1234,6 +1234,9 @@ void *kvrealloc_node_align_noprof(const void *p, size_t size, unsigned long alig extern void kvfree(const void *addr); DEFINE_FREE(kvfree, void *, if (!IS_ERR_OR_NULL(_T)) kvfree(_T)) +extern void kvfree_atomic(const void *addr); +DEFINE_FREE(kvfree_atomic, void *, if (!IS_ERR_OR_NULL(_T)) kvfree_atomic(_T)) + extern void kvfree_sensitive(const void *addr, size_t len); unsigned int kmem_cache_size(struct kmem_cache *s); diff --git a/fs/smb/smbdirect/public.h b/include/linux/smbdirect.h similarity index 76% rename from fs/smb/smbdirect/public.h rename to include/linux/smbdirect.h index 50088155e7c37f..97f5ba730fa74c 100644 --- a/fs/smb/smbdirect/public.h +++ b/include/linux/smbdirect.h @@ -3,18 +3,56 @@ * Copyright (C) 2025, Stefan Metzmacher */ -#ifndef __FS_SMB_COMMON_SMBDIRECT_SMBDIRECT_PUBLIC_H__ -#define __FS_SMB_COMMON_SMBDIRECT_SMBDIRECT_PUBLIC_H__ +#ifndef __LINUX_SMBDIRECT_H__ +#define __LINUX_SMBDIRECT_H__ -struct smbdirect_buffer_descriptor_v1; -struct smbdirect_socket_parameters; +#include + +/* SMB-DIRECT buffer descriptor V1 structure [MS-SMBD] 2.2.3.1 */ +struct smbdirect_buffer_descriptor_v1 { + __le64 offset; + __le32 token; + __le32 length; +} __packed; + +/* + * Connection parameters mostly from [MS-SMBD] 3.1.1.1 + * + * These are setup and negotiated at the beginning of a + * connection and remain constant unless explicitly changed. + * + * Some values are important for the upper layer. + */ +struct smbdirect_socket_parameters { + __u64 flags; +#define SMBDIRECT_FLAG_PORT_RANGE_ONLY_IB ((__u64)0x1) +#define SMBDIRECT_FLAG_PORT_RANGE_ONLY_IW ((__u64)0x2) + __u32 resolve_addr_timeout_msec; + __u32 resolve_route_timeout_msec; + __u32 rdma_connect_timeout_msec; + __u32 negotiate_timeout_msec; + __u16 initiator_depth; /* limited to U8_MAX */ + __u16 responder_resources; /* limited to U8_MAX */ + __u16 recv_credit_max; + __u16 send_credit_target; + __u32 max_send_size; + __u32 max_fragmented_send_size; + __u32 max_recv_size; + __u32 max_fragmented_recv_size; + __u32 max_read_write_size; + __u32 max_frmr_depth; + __u32 keepalive_interval_msec; + __u32 keepalive_timeout_msec; +} __packed; + +#define SMBDIRECT_FLAG_PORT_RANGE_MASK ( \ + SMBDIRECT_FLAG_PORT_RANGE_ONLY_IB | \ + SMBDIRECT_FLAG_PORT_RANGE_ONLY_IW) struct smbdirect_socket; struct smbdirect_send_batch; struct smbdirect_mr_io; -#define __SMBDIRECT_EXPORT_SYMBOL__(__sym) EXPORT_SYMBOL_FOR_MODULES(__sym, "cifs,ksmbd") - #include u8 smbdirect_netdev_rdma_capable_node_type(struct net_device *netdev); @@ -145,4 +183,4 @@ void smbdirect_connection_legacy_debug_proc_show(struct smbdirect_socket *sc, unsigned int rdma_readwrite_threshold, struct seq_file *m); -#endif /* __FS_SMB_COMMON_SMBDIRECT_SMBDIRECT_PUBLIC_H__ */ +#endif /* __LINUX_SMBDIRECT_H__ */ diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h index 6147eb1fb210d6..a46cbaec594912 100644 --- a/include/linux/soundwire/sdw.h +++ b/include/linux/soundwire/sdw.h @@ -1093,6 +1093,8 @@ int sdw_slave_get_current_bank(struct sdw_slave *sdev); int sdw_slave_get_scale_index(struct sdw_slave *slave, u8 *base); +int sdw_slave_wait_for_init(struct sdw_slave *slave, int timeout_ms); + /* messaging and data APIs */ int sdw_read(struct sdw_slave *slave, u32 addr); int sdw_write(struct sdw_slave *slave, u32 addr, u8 value); @@ -1136,6 +1138,12 @@ static inline int sdw_slave_get_current_bank(struct sdw_slave *sdev) return -EINVAL; } +static inline int sdw_slave_wait_for_init(struct sdw_slave *slave, int timeout_ms) +{ + WARN_ONCE(1, "SoundWire API is disabled"); + return -EINVAL; +} + /* messaging and data APIs */ static inline int sdw_read(struct sdw_slave *slave, u32 addr) { diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 2ebba746c18f7b..89165b769e5c16 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -21,7 +21,7 @@ #define VFIO_PCI_CORE_H #define VFIO_PCI_OFFSET_SHIFT 40 -#define VFIO_PCI_OFFSET_TO_INDEX(off) (off >> VFIO_PCI_OFFSET_SHIFT) +#define VFIO_PCI_OFFSET_TO_INDEX(off) ((u64)(off) >> VFIO_PCI_OFFSET_SHIFT) #define VFIO_PCI_INDEX_TO_OFFSET(index) ((u64)(index) << VFIO_PCI_OFFSET_SHIFT) #define VFIO_PCI_OFFSET_MASK (((u64)(1) << VFIO_PCI_OFFSET_SHIFT) - 1) diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h index ab6cb70ca1a52c..6177624539b3b3 100644 --- a/include/linux/workqueue.h +++ b/include/linux/workqueue.h @@ -534,8 +534,10 @@ alloc_workqueue_noprof(const char *fmt, unsigned int flags, int max_active, ...) * Pointer to the allocated workqueue on success, %NULL on failure. */ __printf(2, 5) struct workqueue_struct * -devm_alloc_workqueue(struct device *dev, const char *fmt, unsigned int flags, - int max_active, ...); +devm_alloc_workqueue_noprof(struct device *dev, const char *fmt, + unsigned int flags, int max_active, ...); +#define devm_alloc_workqueue(...) \ + alloc_hooks(devm_alloc_workqueue_noprof(__VA_ARGS__)) #ifdef CONFIG_LOCKDEP /** diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h index a7bffb908c1ec9..aa600fbf9a5359 100644 --- a/include/net/bluetooth/hci_core.h +++ b/include/net/bluetooth/hci_core.h @@ -2495,7 +2495,7 @@ void mgmt_adv_monitor_device_lost(struct hci_dev *hdev, u16 handle, bdaddr_t *bdaddr, u8 addr_type); int hci_abort_conn(struct hci_conn *conn, u8 reason); -u8 hci_le_conn_update(struct hci_conn *conn, u16 min, u16 max, u16 latency, +void hci_le_conn_update(struct hci_conn *conn, u16 min, u16 max, u16 latency, u16 to_multiplier); void hci_le_start_enc(struct hci_conn *conn, __le16 ediv, __le64 rand, __u8 ltk[16], __u8 key_size); diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h index e0ca3904ff8e0c..2f312d1f67d698 100644 --- a/include/net/dropreason-core.h +++ b/include/net/dropreason-core.h @@ -99,6 +99,7 @@ FN(FRAG_TOO_FAR) \ FN(TCP_MINTTL) \ FN(IPV6_BAD_EXTHDR) \ + FN(IPV6_TOO_MANY_EXTHDRS) \ FN(IPV6_NDISC_FRAG) \ FN(IPV6_NDISC_HOP_LIMIT) \ FN(IPV6_NDISC_BAD_CODE) \ @@ -494,6 +495,11 @@ enum skb_drop_reason { SKB_DROP_REASON_TCP_MINTTL, /** @SKB_DROP_REASON_IPV6_BAD_EXTHDR: Bad IPv6 extension header. */ SKB_DROP_REASON_IPV6_BAD_EXTHDR, + /** + * @SKB_DROP_REASON_IPV6_TOO_MANY_EXTHDRS: Number of IPv6 extension + * headers in the packet exceeds IP6_MAX_EXT_HDRS_CNT. + */ + SKB_DROP_REASON_IPV6_TOO_MANY_EXTHDRS, /** @SKB_DROP_REASON_IPV6_NDISC_FRAG: invalid frag (suppress_frag_ndisc). */ SKB_DROP_REASON_IPV6_NDISC_FRAG, /** @SKB_DROP_REASON_IPV6_NDISC_HOP_LIMIT: invalid hop limit. */ diff --git a/include/net/genetlink.h b/include/net/genetlink.h index 7b84f2cef8b1f9..d70510ac31ab54 100644 --- a/include/net/genetlink.h +++ b/include/net/genetlink.h @@ -489,8 +489,10 @@ genlmsg_multicast_netns_filtered(const struct genl_family *family, netlink_filter_fn filter, void *filter_data) { - if (WARN_ON_ONCE(group >= family->n_mcgrps)) + if (WARN_ON_ONCE(group >= family->n_mcgrps)) { + nlmsg_free(skb); return -EINVAL; + } group = family->mcgrp_offset + group; return nlmsg_multicast_filtered(net->genl_sock, skb, portid, group, flags, filter, filter_data); diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h index 72d325c813132d..02762ce73a0ce4 100644 --- a/include/net/ip_vs.h +++ b/include/net/ip_vs.h @@ -491,6 +491,7 @@ struct ip_vs_est_kt_data { DECLARE_BITMAP(avail, IPVS_EST_NTICKS); /* tick has space for ests */ unsigned long est_timer; /* estimation timer (jiffies) */ struct ip_vs_stats *calc_stats; /* Used for calculation */ + int needed; /* task is needed */ int tick_len[IPVS_EST_NTICKS]; /* est count */ int id; /* ktid per netns */ int chain_max; /* max ests per tick chain */ @@ -1411,7 +1412,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs) return ipvs->sysctl_run_estimation; } -static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs) +static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs) { if (ipvs->est_cpulist_valid) return ipvs->sysctl_est_cpulist; @@ -1529,7 +1530,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs) return 1; } -static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs) +static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs) { return housekeeping_cpumask(HK_TYPE_KTHREAD); } @@ -1564,6 +1565,18 @@ static inline int sysctl_svc_lfactor(struct netns_ipvs *ipvs) return READ_ONCE(ipvs->sysctl_svc_lfactor); } +static inline bool sysctl_est_cpulist_empty(struct netns_ipvs *ipvs) +{ + guard(rcu)(); + return cpumask_empty(__sysctl_est_cpulist(ipvs)); +} + +static inline unsigned int sysctl_est_cpulist_weight(struct netns_ipvs *ipvs) +{ + guard(rcu)(); + return cpumask_weight(__sysctl_est_cpulist(ipvs)); +} + /* IPVS core functions * (from ip_vs_core.c) */ @@ -1884,18 +1897,26 @@ int ip_vs_start_estimator(struct netns_ipvs *ipvs, struct ip_vs_stats *stats); void ip_vs_stop_estimator(struct netns_ipvs *ipvs, struct ip_vs_stats *stats); void ip_vs_zero_estimator(struct ip_vs_stats *stats); void ip_vs_read_estimator(struct ip_vs_kstats *dst, struct ip_vs_stats *stats); -void ip_vs_est_reload_start(struct netns_ipvs *ipvs); +void ip_vs_est_reload_start(struct netns_ipvs *ipvs, bool restart); int ip_vs_est_kthread_start(struct netns_ipvs *ipvs, struct ip_vs_est_kt_data *kd); void ip_vs_est_kthread_stop(struct ip_vs_est_kt_data *kd); +static inline void ip_vs_stop_estimator_tot_stats(struct netns_ipvs *ipvs) +{ +#ifdef CONFIG_SYSCTL + ip_vs_stop_estimator(ipvs, &ipvs->tot_stats->s); + ipvs->tot_stats->s.est.ktid = -2; +#endif +} + static inline void ip_vs_est_stopped_recalc(struct netns_ipvs *ipvs) { #ifdef CONFIG_SYSCTL /* Stop tasks while cpulist is empty or if disabled with flag */ ipvs->est_stopped = !sysctl_run_estimation(ipvs) || (ipvs->est_cpulist_valid && - cpumask_empty(sysctl_est_cpulist(ipvs))); + sysctl_est_cpulist_empty(ipvs)); #endif } @@ -1911,7 +1932,7 @@ static inline bool ip_vs_est_stopped(struct netns_ipvs *ipvs) static inline int ip_vs_est_max_threads(struct netns_ipvs *ipvs) { unsigned int limit = IPVS_EST_CPU_KTHREADS * - cpumask_weight(sysctl_est_cpulist(ipvs)); + sysctl_est_cpulist_weight(ipvs); return max(1U, limit); } diff --git a/include/net/ipv6.h b/include/net/ipv6.h index d042afe7a24564..1dec81faff282f 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -90,6 +90,9 @@ struct ip_tunnel_info; #define IP6_DEFAULT_MAX_DST_OPTS_LEN INT_MAX /* No limit */ #define IP6_DEFAULT_MAX_HBH_OPTS_LEN INT_MAX /* No limit */ +/* Hard limit on traversed IPv6 extension headers */ +#define IP6_MAX_EXT_HDRS_CNT 12 + /* * Addr type * diff --git a/include/net/macsec.h b/include/net/macsec.h index bc7de5b53e5431..d962093ee92375 100644 --- a/include/net/macsec.h +++ b/include/net/macsec.h @@ -9,6 +9,7 @@ #include #include +#include #include #include @@ -123,6 +124,7 @@ struct macsec_dev_stats { * @key: key structure * @ssci: short secure channel identifier * @stats: per-SA stats + * @destroy_work: deferred work to free the SA in process context after RCU grace period */ struct macsec_rx_sa { struct macsec_key key; @@ -136,7 +138,7 @@ struct macsec_rx_sa { bool active; struct macsec_rx_sa_stats __percpu *stats; struct macsec_rx_sc *sc; - struct rcu_head rcu; + struct rcu_work destroy_work; }; struct pcpu_rx_sc_stats { @@ -174,6 +176,7 @@ struct macsec_rx_sc { * @key: key structure * @ssci: short secure channel identifier * @stats: per-SA stats + * @destroy_work: deferred work to free the SA in process context after RCU grace period */ struct macsec_tx_sa { struct macsec_key key; @@ -186,7 +189,7 @@ struct macsec_tx_sa { refcount_t refcnt; bool active; struct macsec_tx_sa_stats __percpu *stats; - struct rcu_head rcu; + struct rcu_work destroy_work; }; /** diff --git a/include/net/mana/shm_channel.h b/include/net/mana/shm_channel.h index 5199b41497fff9..dbabcfb95daf3e 100644 --- a/include/net/mana/shm_channel.h +++ b/include/net/mana/shm_channel.h @@ -4,6 +4,12 @@ #ifndef _SHM_CHANNEL_H #define _SHM_CHANNEL_H +#define SMC_APERTURE_BITS 256 +#define SMC_BASIC_UNIT (sizeof(u32)) +#define SMC_APERTURE_DWORDS (SMC_APERTURE_BITS / (SMC_BASIC_UNIT * 8)) +#define SMC_LAST_DWORD (SMC_APERTURE_DWORDS - 1) +#define SMC_APERTURE_SIZE (SMC_APERTURE_BITS / 8) + struct shm_channel { struct device *dev; void __iomem *base; diff --git a/include/net/netfilter/nf_conntrack_expect.h b/include/net/netfilter/nf_conntrack_expect.h index e9a8350e7ccfb0..80f50fd0f7ad27 100644 --- a/include/net/netfilter/nf_conntrack_expect.h +++ b/include/net/netfilter/nf_conntrack_expect.h @@ -45,9 +45,12 @@ struct nf_conntrack_expect { void (*expectfn)(struct nf_conn *new, struct nf_conntrack_expect *this); - /* Helper to assign to new connection */ + /* Helper that created this expectation */ struct nf_conntrack_helper __rcu *helper; + /* Helper to assign to new connection */ + struct nf_conntrack_helper __rcu *assign_helper; + /* The conntrack of the master connection */ struct nf_conn *master; diff --git a/include/net/netfilter/nf_dup_netdev.h b/include/net/netfilter/nf_dup_netdev.h index b175d271aec953..609bcf422a9b31 100644 --- a/include/net/netfilter/nf_dup_netdev.h +++ b/include/net/netfilter/nf_dup_netdev.h @@ -3,10 +3,23 @@ #define _NF_DUP_NETDEV_H_ #include +#include +#include void nf_dup_netdev_egress(const struct nft_pktinfo *pkt, int oif); void nf_fwd_netdev_egress(const struct nft_pktinfo *pkt, int oif); +#define NF_RECURSION_LIMIT 2 + +static inline u8 *nf_get_nf_dup_skb_recursion(void) +{ +#ifndef CONFIG_PREEMPT_RT + return this_cpu_ptr(&softnet_data.xmit.nf_dup_skb_recursion); +#else + return ¤t->net_xmit.nf_dup_skb_recursion; +#endif +} + struct nft_offload_ctx; struct nft_flow_rule; diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h index b09c11c048d519..7b23b245a5a86a 100644 --- a/include/net/netfilter/nf_flow_table.h +++ b/include/net/netfilter/nf_flow_table.h @@ -148,9 +148,10 @@ struct flow_offload_tuple { /* All members above are keys for lookups, see flow_offload_hash(). */ struct { } __hash; - u8 dir:2, + u16 dir:2, xmit_type:3, encap_num:2, + needs_gso_segment:1, tun_num:2, in_vlan_ingress:2; u16 mtu; @@ -232,6 +233,7 @@ struct nf_flow_route { u32 hw_ifindex; u8 h_source[ETH_ALEN]; u8 h_dest[ETH_ALEN]; + u8 needs_gso_segment:1; } out; enum flow_offload_xmit_type xmit_type; } tuple[FLOW_OFFLOAD_DIR_MAX]; diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index 80ccd4dda8e0fc..6e27c56514df57 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -275,7 +275,7 @@ struct netns_ipv4 { #ifdef CONFIG_IP_MROUTE #ifndef CONFIG_IP_MROUTE_MULTIPLE_TABLES - struct mr_table *mrt; + struct mr_table __rcu *mrt; #else struct list_head mr_tables; struct fib_rules_ops *mr_rules_ops; diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h index 499e4288170fc6..875916d60bfe4e 100644 --- a/include/net/netns/ipv6.h +++ b/include/net/netns/ipv6.h @@ -119,6 +119,7 @@ struct netns_ipv6 { struct fib_notifier_ops *notifier_ops; struct fib_notifier_ops *ip6mr_notifier_ops; atomic_t ipmr_seq; + int flowlabel_count; struct { struct hlist_head head; spinlock_t lock; diff --git a/include/net/nsh.h b/include/net/nsh.h index 16a75109389693..15a26c59081514 100644 --- a/include/net/nsh.h +++ b/include/net/nsh.h @@ -247,10 +247,10 @@ struct nshhdr { #define NSH_M_TYPE1_LEN 24 /* NSH header maximum Length. */ -#define NSH_HDR_MAX_LEN 256 +#define NSH_HDR_MAX_LEN ((NSH_LEN_MASK >> NSH_LEN_SHIFT) * 4) /* NSH context headers maximum Length. */ -#define NSH_CTX_HDRS_MAX_LEN 248 +#define NSH_CTX_HDRS_MAX_LEN (NSH_HDR_MAX_LEN - NSH_BASE_HDR_LEN) static inline struct nshhdr *nsh_hdr(struct sk_buff *skb) { diff --git a/include/sound/emux_synth.h b/include/sound/emux_synth.h index 3f7f365ed248cd..2c0a976e00abb4 100644 --- a/include/sound/emux_synth.h +++ b/include/sound/emux_synth.h @@ -125,7 +125,6 @@ struct snd_emux { */ struct snd_emux_port { - struct snd_midi_channel_set chset; struct snd_emux *emu; char port_mode; /* operation mode */ @@ -138,6 +137,7 @@ struct snd_emux_port { #if IS_ENABLED(CONFIG_SND_SEQUENCER_OSS) struct snd_seq_oss_arg *oss_arg; #endif + struct snd_midi_channel_set chset; }; /* port_mode */ diff --git a/include/sound/seq_midi_emul.h b/include/sound/seq_midi_emul.h index 88799d1e1f53df..afc7654378701e 100644 --- a/include/sound/seq_midi_emul.h +++ b/include/sound/seq_midi_emul.h @@ -55,14 +55,14 @@ struct snd_midi_channel_set { int client; /* Client for this port */ int port; /* The port number */ - int max_channels; /* Size of the channels array */ - struct snd_midi_channel *channels; - unsigned char midi_mode; /* MIDI operating mode */ unsigned char gs_master_volume; /* SYSEX master volume: 0-127 */ unsigned char gs_chorus_mode; unsigned char gs_reverb_mode; + int max_channels; /* Size of the channels array */ + struct snd_midi_channel channels[] __counted_by(max_channels); + }; struct snd_midi_op { diff --git a/include/sound/soc.h b/include/sound/soc.h index 5e3eb617d8323a..50d1fd110a4ab1 100644 --- a/include/sound/soc.h +++ b/include/sound/soc.h @@ -431,7 +431,7 @@ struct snd_soc_jack_pin; int snd_soc_register_card(struct snd_soc_card *card); void snd_soc_unregister_card(struct snd_soc_card *card); int devm_snd_soc_register_card(struct device *dev, struct snd_soc_card *card); -int devm_snd_soc_register_deferrable_card(struct device *dev, struct snd_soc_card *card); +#define devm_snd_soc_register_deferrable_card(d, c) devm_snd_soc_register_card(d, c) #ifdef CONFIG_PM_SLEEP int snd_soc_suspend(struct device *dev); int snd_soc_resume(struct device *dev); diff --git a/include/sound/soc_sdw_utils.h b/include/sound/soc_sdw_utils.h index d713ab2f66203f..79c21966220b24 100644 --- a/include/sound/soc_sdw_utils.h +++ b/include/sound/soc_sdw_utils.h @@ -223,6 +223,17 @@ int asoc_sdw_cs42l43_spk_init(struct snd_soc_card *card, struct asoc_sdw_codec_info *info, bool playback); +/* es9356 codec support */ +int asoc_sdw_es9356_init(struct snd_soc_card *card, + struct snd_soc_dai_link *dai_links, + struct asoc_sdw_codec_info *info, + bool playback); +int asoc_sdw_es9356_amp_init(struct snd_soc_card *card, + struct snd_soc_dai_link *dai_links, + struct asoc_sdw_codec_info *info, + bool playback); +int asoc_sdw_es9356_exit(struct snd_soc_card *card, struct snd_soc_dai_link *dai_link); + /* CS AMP support */ int asoc_sdw_bridge_cs35l56_count_sidecar(struct snd_soc_card *card, int *num_dais, int *num_devs); @@ -278,5 +289,8 @@ int asoc_sdw_ti_amp_initial_settings(struct snd_soc_card *card, const char *name_prefix); int asoc_sdw_ti_dmic_rtd_init(struct snd_soc_pcm_runtime *rtd, struct snd_soc_dai *dai); int asoc_sdw_ti_sdca_jack_rtd_init(struct snd_soc_pcm_runtime *rtd, struct snd_soc_dai *dai); +int asoc_sdw_es9356_rtd_init(struct snd_soc_pcm_runtime *rtd, struct snd_soc_dai *dai); +int asoc_sdw_es9356_spk_rtd_init(struct snd_soc_pcm_runtime *rtd, struct snd_soc_dai *dai); +int asoc_sdw_es9356_dmic_rtd_init(struct snd_soc_pcm_runtime *rtd, struct snd_soc_dai *dai); #endif diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index 8ad7a2d76c1d57..ec1df8b94517c5 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -771,10 +771,8 @@ TRACE_EVENT(btrfs_sync_file, TP_fast_assign( struct dentry *dentry = file_dentry(file); struct inode *inode = file_inode(file); - struct dentry *parent = dget_parent(dentry); - struct inode *parent_inode = d_inode(parent); + struct inode *parent_inode = d_inode(dentry->d_parent); - dput(parent); TP_fast_assign_fsid(btrfs_sb(inode->i_sb)); __entry->ino = btrfs_ino(BTRFS_I(inode)); __entry->parent = btrfs_ino(BTRFS_I(parent_inode)); diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h index f69344fe6c0863..ca6fe1f9d05e7e 100644 --- a/include/uapi/linux/rseq.h +++ b/include/uapi/linux/rseq.h @@ -28,7 +28,7 @@ enum rseq_cs_flags_bit { RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT = 0, RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT = 1, RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT = 2, - /* (3) Intentional gap to put new bits into a separate byte */ + /* (3) Intentional gap to keep new bits separate */ /* User read only feature flags */ RSEQ_CS_FLAG_SLICE_EXT_AVAILABLE_BIT = 4, @@ -161,6 +161,9 @@ struct rseq { * - RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT * - RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL * - RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE + * + * It is now used for feature status advertisement by the kernel. + * See: enum rseq_cs_flags_bit for further information. */ __u32 flags; diff --git a/include/ufs/unipro.h b/include/ufs/unipro.h index f849a2a101ae29..9c168703b10480 100644 --- a/include/ufs/unipro.h +++ b/include/ufs/unipro.h @@ -333,6 +333,11 @@ enum ufs_eom_eye_mask { #define DME_LocalTC0ReplayTimeOutVal 0xD042 #define DME_LocalAFC0ReqTimeOutVal 0xD043 +enum ufs_op_mode { + LS_MODE = 1, + HS_MODE = 2, +}; + /* PA power modes */ enum ufs_pa_pwr_mode { FAST_MODE = 1, diff --git a/include/video/imx-ipu-image-convert.h b/include/video/imx-ipu-image-convert.h index 003b3927ede5cd..6b77968a6a150f 100644 --- a/include/video/imx-ipu-image-convert.h +++ b/include/video/imx-ipu-image-convert.h @@ -27,12 +27,13 @@ struct ipu_image_convert_run { int status; + /* private: */ /* internal to image converter, callers don't touch */ struct list_head list; }; /** - * ipu_image_convert_cb_t - conversion callback function prototype + * typedef ipu_image_convert_cb_t - conversion callback function prototype * * @run: the completed conversion run pointer * @ctx: a private context pointer for the callback @@ -60,7 +61,7 @@ void ipu_image_convert_adjust(struct ipu_image *in, struct ipu_image *out, * @out: output image format * @rot_mode: rotation mode * - * Returns 0 if the formats and rotation mode meet IPU restrictions, + * Returns: 0 if the formats and rotation mode meet IPU restrictions, * -EINVAL otherwise. */ int ipu_image_convert_verify(struct ipu_image *in, struct ipu_image *out, @@ -77,11 +78,11 @@ int ipu_image_convert_verify(struct ipu_image *in, struct ipu_image *out, * @complete: run completion callback * @complete_context: a context pointer for the completion callback * - * Returns an opaque conversion context pointer on success, error pointer + * In V4L2, drivers should call ipu_image_convert_prepare() at streamon. + * + * Returns: an opaque conversion context pointer on success, error pointer * on failure. The input/output formats and rotation mode must already meet * IPU retrictions. - * - * In V4L2, drivers should call ipu_image_convert_prepare() at streamon. */ struct ipu_image_convert_ctx * ipu_image_convert_prepare(struct ipu_soc *ipu, enum ipu_ic_task ic_task, @@ -122,6 +123,8 @@ void ipu_image_convert_unprepare(struct ipu_image_convert_ctx *ctx); * In V4L2, drivers should call ipu_image_convert_queue() while * streaming to queue the conversion of a received input buffer. * For example mem2mem devices this would be called in .device_run. + * + * Returns: 0 on success or -errno on error. */ int ipu_image_convert_queue(struct ipu_image_convert_run *run); @@ -155,6 +158,9 @@ void ipu_image_convert_abort(struct ipu_image_convert_ctx *ctx); * On successful return the caller can queue more run requests if needed, using * the prepared context in run->ctx. The caller is responsible for unpreparing * the context when no more conversion requests are needed. + * + * Returns: pointer to the created &struct ipu_image_convert_run that has + * been queued on success; an ERR_PTR(errno) on error. */ struct ipu_image_convert_run * ipu_image_convert(struct ipu_soc *ipu, enum ipu_ic_task ic_task, diff --git a/include/video/udlfb.h b/include/video/udlfb.h index 58fb5732831a43..ab34790d57ecd6 100644 --- a/include/video/udlfb.h +++ b/include/video/udlfb.h @@ -56,6 +56,7 @@ struct dlfb_data { spinlock_t damage_lock; struct work_struct damage_work; struct fb_ops ops; + atomic_t mmap_count; /* blit-only rendering path metrics, exposed through sysfs */ atomic_t bytes_rendered; /* raw pixel-bytes driver asked to render */ atomic_t bytes_identical; /* saved effort with backbuffer comparison */ diff --git a/include/xen/arm/interface.h b/include/xen/arm/interface.h index c3eada2642aa9a..61360b89da405e 100644 --- a/include/xen/arm/interface.h +++ b/include/xen/arm/interface.h @@ -30,7 +30,7 @@ #define __HYPERVISOR_platform_op_raw __HYPERVISOR_platform_op -#ifndef __ASSEMBLY__ +#ifndef __ASSEMBLER__ /* Explicitly size integers that represent pfns in the interface with * Xen so that we can have one ABI that works for 32 and 64 bit guests. * Note that this means that the xen_pfn_t type may be capable of diff --git a/io_uring/cancel.c b/io_uring/cancel.c index 5e5eb9cfc7cd6f..4aa3103ba9c39c 100644 --- a/io_uring/cancel.c +++ b/io_uring/cancel.c @@ -561,8 +561,8 @@ __cold bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx, ret |= io_waitid_remove_all(ctx, tctx, cancel_all); ret |= io_futex_remove_all(ctx, tctx, cancel_all); ret |= io_uring_try_cancel_uring_cmd(ctx, tctx, cancel_all); - mutex_unlock(&ctx->uring_lock); ret |= io_kill_timeouts(ctx, tctx, cancel_all); + mutex_unlock(&ctx->uring_lock); if (tctx) ret |= io_run_task_work() > 0; else diff --git a/io_uring/eventfd.c b/io_uring/eventfd.c index 3da028500f76a6..d656cc2a0b9ba5 100644 --- a/io_uring/eventfd.c +++ b/io_uring/eventfd.c @@ -43,6 +43,7 @@ static void io_eventfd_do_signal(struct rcu_head *rcu) { struct io_ev_fd *ev_fd = container_of(rcu, struct io_ev_fd, rcu); + atomic_andnot(BIT(IO_EVENTFD_OP_SIGNAL_BIT), &ev_fd->ops); eventfd_signal_mask(ev_fd->cq_ev_fd, EPOLL_URING_WAKE); io_eventfd_put(ev_fd); } diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c index c2d3e45544bb4e..001fb542dc11a8 100644 --- a/io_uring/fdinfo.c +++ b/io_uring/fdinfo.c @@ -190,8 +190,9 @@ static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m) get_task_struct(tsk); rcu_read_unlock(); usec = io_sq_cpu_usec(tsk); + sq_pid = task_pid_nr_ns(tsk, + proc_pid_ns(file_inode(m->file)->i_sb)); put_task_struct(tsk); - sq_pid = sq->task_pid; sq_cpu = sq->sq_cpu; sq_total_time = usec; sq_work_time = sq->work_time; diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c index 7a9f94a0ce6f2f..8cc7b47d30894a 100644 --- a/io_uring/io-wq.c +++ b/io_uring/io-wq.c @@ -1124,7 +1124,8 @@ static inline void io_wq_remove_pending(struct io_wq *wq, if (io_wq_is_hashed(work) && work == wq->hash_tail[hash]) { if (prev) prev_work = container_of(prev, struct io_wq_work, list); - if (prev_work && io_get_work_hash(prev_work) == hash) + if (prev_work && io_wq_is_hashed(prev_work) && + io_get_work_hash(prev_work) == hash) wq->hash_tail[hash] = prev_work; else wq->hash_tail[hash] = NULL; diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 4ed998d60c09cb..036145ee466cd2 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -686,13 +686,27 @@ static struct io_overflow_cqe *io_alloc_ocqe(struct io_ring_ctx *ctx, return ocqe; } +/* + * Compute queued CQEs for free-space calculation, clamped to cq_entries. + */ +static unsigned int io_cqring_queued(struct io_ring_ctx *ctx) +{ + struct io_rings *rings = io_get_rings(ctx); + int diff; + + diff = (int)(ctx->cached_cq_tail - READ_ONCE(rings->cq.head)); + if (diff >= 0) + return min((unsigned int)diff, ctx->cq_entries); + return 0; +} + /* * Fill an empty dummy CQE, in case alignment is off for posting a 32b CQE * because the ring is a single 16b entry away from wrapping. */ static bool io_fill_nop_cqe(struct io_ring_ctx *ctx, unsigned int off) { - if (__io_cqring_events(ctx) < ctx->cq_entries) { + if (io_cqring_queued(ctx) < ctx->cq_entries) { struct io_uring_cqe *cqe = &ctx->rings->cqes[off]; cqe->user_data = 0; @@ -713,7 +727,7 @@ bool io_cqe_cache_refill(struct io_ring_ctx *ctx, bool overflow, bool cqe32) { struct io_rings *rings = ctx->rings; unsigned int off = ctx->cached_cq_tail & (ctx->cq_entries - 1); - unsigned int free, queued, len; + unsigned int free, len; /* * Posting into the CQ when there are pending overflowed CQEs may break @@ -733,9 +747,7 @@ bool io_cqe_cache_refill(struct io_ring_ctx *ctx, bool overflow, bool cqe32) off = 0; } - /* userspace may cheat modifying the tail, be safe and do min */ - queued = min(__io_cqring_events(ctx), ctx->cq_entries); - free = ctx->cq_entries - queued; + free = ctx->cq_entries - io_cqring_queued(ctx); /* we need a contiguous range, limit based on the current array offset */ len = min(free, ctx->cq_entries - off); if (len < (cqe32 + 1)) @@ -1452,8 +1464,13 @@ struct io_wq_work *io_wq_free_work(struct io_wq_work *work) struct io_kiocb *nxt = NULL; if (req_ref_put_and_test_atomic(req)) { - if (req->flags & IO_REQ_LINK_FLAGS) + if (req->flags & IO_REQ_LINK_FLAGS) { + struct io_ring_ctx *ctx = req->ctx; + + mutex_lock(&ctx->uring_lock); nxt = io_req_find_next(req); + mutex_unlock(&ctx->uring_lock); + } io_free_req(req); } return nxt ? &nxt->work : NULL; diff --git a/io_uring/napi.c b/io_uring/napi.c index 8d68366a4b9039..bfc77144591203 100644 --- a/io_uring/napi.c +++ b/io_uring/napi.c @@ -38,7 +38,8 @@ static inline ktime_t net_to_ktime(unsigned long t) return ns_to_ktime(t << 10); } -int __io_napi_add_id(struct io_ring_ctx *ctx, unsigned int napi_id) +int __io_napi_add_id(struct io_ring_ctx *ctx, unsigned int napi_id, + unsigned int mode) { struct hlist_head *hash_list; struct io_napi_entry *e; @@ -69,6 +70,11 @@ int __io_napi_add_id(struct io_ring_ctx *ctx, unsigned int napi_id) * kfree() */ spin_lock(&ctx->napi_lock); + if (unlikely(READ_ONCE(ctx->napi_track_mode) != mode)) { + spin_unlock(&ctx->napi_lock); + kfree(e); + return -EINVAL; + } if (unlikely(io_napi_hash_find(hash_list, napi_id))) { spin_unlock(&ctx->napi_lock); kfree(e); @@ -196,9 +202,14 @@ __io_napi_do_busy_loop(struct io_ring_ctx *ctx, bool (*loop_end)(void *, unsigned long), void *loop_end_arg) { - if (READ_ONCE(ctx->napi_track_mode) == IO_URING_NAPI_TRACKING_STATIC) + switch (READ_ONCE(ctx->napi_track_mode)) { + case IO_URING_NAPI_TRACKING_STATIC: return static_tracking_do_busy_loop(ctx, loop_end, loop_end_arg); - return dynamic_tracking_do_busy_loop(ctx, loop_end, loop_end_arg); + case IO_URING_NAPI_TRACKING_DYNAMIC: + return dynamic_tracking_do_busy_loop(ctx, loop_end, loop_end_arg); + default: + return false; + } } static void io_napi_blocking_busy_loop(struct io_ring_ctx *ctx, @@ -273,13 +284,13 @@ static int io_napi_register_napi(struct io_ring_ctx *ctx, default: return -EINVAL; } - /* clean the napi list for new settings */ + WRITE_ONCE(ctx->napi_track_mode, IO_URING_NAPI_TRACKING_INACTIVE); io_napi_free(ctx); - WRITE_ONCE(ctx->napi_track_mode, napi->op_param); /* cap NAPI at 10 msec of spin time */ napi->busy_poll_to = min(10000, napi->busy_poll_to); WRITE_ONCE(ctx->napi_busy_poll_dt, napi->busy_poll_to * NSEC_PER_USEC); WRITE_ONCE(ctx->napi_prefer_busy_poll, !!napi->prefer_busy_poll); + WRITE_ONCE(ctx->napi_track_mode, napi->op_param); return 0; } @@ -315,7 +326,8 @@ int io_register_napi(struct io_ring_ctx *ctx, void __user *arg) case IO_URING_NAPI_STATIC_ADD_ID: if (curr.op_param != IO_URING_NAPI_TRACKING_STATIC) return -EINVAL; - return __io_napi_add_id(ctx, napi.op_param); + return __io_napi_add_id(ctx, napi.op_param, + IO_URING_NAPI_TRACKING_STATIC); case IO_URING_NAPI_STATIC_DEL_ID: if (curr.op_param != IO_URING_NAPI_TRACKING_STATIC) return -EINVAL; @@ -343,9 +355,10 @@ int io_unregister_napi(struct io_ring_ctx *ctx, void __user *arg) if (arg && copy_to_user(arg, &curr, sizeof(curr))) return -EFAULT; + WRITE_ONCE(ctx->napi_track_mode, IO_URING_NAPI_TRACKING_INACTIVE); WRITE_ONCE(ctx->napi_busy_poll_dt, 0); WRITE_ONCE(ctx->napi_prefer_busy_poll, false); - WRITE_ONCE(ctx->napi_track_mode, IO_URING_NAPI_TRACKING_INACTIVE); + io_napi_free(ctx); return 0; } diff --git a/io_uring/napi.h b/io_uring/napi.h index fa742f42e09b47..e0aecccc5065b9 100644 --- a/io_uring/napi.h +++ b/io_uring/napi.h @@ -15,7 +15,8 @@ void io_napi_free(struct io_ring_ctx *ctx); int io_register_napi(struct io_ring_ctx *ctx, void __user *arg); int io_unregister_napi(struct io_ring_ctx *ctx, void __user *arg); -int __io_napi_add_id(struct io_ring_ctx *ctx, unsigned int napi_id); +int __io_napi_add_id(struct io_ring_ctx *ctx, unsigned int napi_id, + unsigned int mode); void __io_napi_busy_loop(struct io_ring_ctx *ctx, struct io_wait_queue *iowq); int io_napi_sqpoll_busy_poll(struct io_ring_ctx *ctx); @@ -43,13 +44,14 @@ static inline void io_napi_add(struct io_kiocb *req) { struct io_ring_ctx *ctx = req->ctx; struct socket *sock; + unsigned int mode = IO_URING_NAPI_TRACKING_DYNAMIC; - if (READ_ONCE(ctx->napi_track_mode) != IO_URING_NAPI_TRACKING_DYNAMIC) + if (READ_ONCE(ctx->napi_track_mode) != mode) return; sock = sock_from_file(req->file); if (sock && sock->sk) - __io_napi_add_id(ctx, READ_ONCE(sock->sk->sk_napi_id)); + __io_napi_add_id(ctx, READ_ONCE(sock->sk->sk_napi_id), mode); } #else diff --git a/io_uring/rw.c b/io_uring/rw.c index e729e0e7657ecc..0c48346452797a 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -230,7 +230,7 @@ static inline void io_meta_restore(struct io_async_rw *io, struct kiocb *kiocb) } static int io_prep_rw_pi(struct io_kiocb *req, struct io_rw *rw, int ddir, - u64 attr_ptr, u64 attr_type_mask) + u64 attr_ptr) { struct io_uring_attr_pi pi_attr; struct io_async_rw *io; @@ -305,7 +305,7 @@ static int __io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe, return -EINVAL; attr_ptr = READ_ONCE(sqe->attr_ptr); - return io_prep_rw_pi(req, rw, ddir, attr_ptr, attr_type_mask); + return io_prep_rw_pi(req, rw, ddir, attr_ptr); } return 0; } diff --git a/io_uring/timeout.c b/io_uring/timeout.c index 4cfdfc519770ab..6353a4d979dc00 100644 --- a/io_uring/timeout.c +++ b/io_uring/timeout.c @@ -3,6 +3,7 @@ #include #include #include +#include #include @@ -35,6 +36,22 @@ struct io_timeout_rem { bool ltimeout; }; +static clockid_t io_flags_to_clock(unsigned flags) +{ + switch (flags & IORING_TIMEOUT_CLOCK_MASK) { + case IORING_TIMEOUT_BOOTTIME: + return CLOCK_BOOTTIME; + case IORING_TIMEOUT_REALTIME: + return CLOCK_REALTIME; + default: + /* can't happen, vetted at prep time */ + WARN_ON_ONCE(1); + fallthrough; + case 0: + return CLOCK_MONOTONIC; + } +} + static int io_parse_user_time(ktime_t *time, u64 arg, unsigned flags) { struct timespec64 ts; @@ -43,7 +60,7 @@ static int io_parse_user_time(ktime_t *time, u64 arg, unsigned flags) *time = ns_to_ktime(arg); if (*time < 0) return -EINVAL; - return 0; + goto out; } if (get_timespec64(&ts, u64_to_user_ptr(arg))) @@ -51,6 +68,9 @@ static int io_parse_user_time(ktime_t *time, u64 arg, unsigned flags) if (ts.tv_sec < 0 || ts.tv_nsec < 0) return -EINVAL; *time = timespec64_to_ktime(ts); +out: + if (flags & IORING_TIMEOUT_ABS) + *time = timens_ktime_to_host(io_flags_to_clock(flags), *time); return 0; } @@ -264,6 +284,10 @@ static struct io_kiocb *__io_disarm_linked_timeout(struct io_kiocb *req, struct io_timeout *timeout = io_kiocb_to_cmd(link, struct io_timeout); io_remove_next_linked(req); + + /* If this is NULL, then timer already claimed it and will complete it */ + if (!timeout->head) + return NULL; timeout->head = NULL; if (hrtimer_try_to_cancel(&io->timer) != -1) { list_del(&timeout->list); @@ -347,6 +371,14 @@ static void io_req_task_link_timeout(struct io_tw_req tw_req, io_tw_token_t tw) int ret; if (prev) { + /* + * splice the linked timeout out of prev's chain if the regular + * completion path didn't already do it. + */ + if (prev->link == req) + prev->link = req->link; + req->link = NULL; + if (!tw.cancel) { struct io_cancel_data cd = { .ctx = req->ctx, @@ -381,10 +413,10 @@ static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer) /* * We don't expect the list to be empty, that will only happen if we - * race with the completion of the linked work. + * race with the completion of the linked work. Splice of prev is + * done in io_req_task_link_timeout(), if needed. */ if (prev) { - io_remove_next_linked(prev); if (!req_ref_inc_not_zero(prev)) prev = NULL; } @@ -399,18 +431,7 @@ static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer) static clockid_t io_timeout_get_clock(struct io_timeout_data *data) { - switch (data->flags & IORING_TIMEOUT_CLOCK_MASK) { - case IORING_TIMEOUT_BOOTTIME: - return CLOCK_BOOTTIME; - case IORING_TIMEOUT_REALTIME: - return CLOCK_REALTIME; - default: - /* can't happen, vetted at prep time */ - WARN_ON_ONCE(1); - fallthrough; - case 0: - return CLOCK_MONOTONIC; - } + return io_flags_to_clock(data->flags); } static int io_linked_timeout_update(struct io_ring_ctx *ctx, __u64 user_data, diff --git a/io_uring/wait.c b/io_uring/wait.c index 91df86ce0d18c1..ec01e78a216d6c 100644 --- a/io_uring/wait.c +++ b/io_uring/wait.c @@ -5,6 +5,7 @@ #include #include #include +#include #include @@ -229,7 +230,10 @@ int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags, if (ext_arg->ts_set) { iowq.timeout = timespec64_to_ktime(ext_arg->ts); - if (!(flags & IORING_ENTER_ABS_TIMER)) + if (flags & IORING_ENTER_ABS_TIMER) + iowq.timeout = timens_ktime_to_host(ctx->clockid, + iowq.timeout); + else iowq.timeout = ktime_add(iowq.timeout, start_time); } diff --git a/kernel/audit.c b/kernel/audit.c index e1d489bc2dff99..34dc7cb246ff27 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1468,6 +1468,8 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh, err = audit_list_rules_send(skb, seq); break; case AUDIT_TRIM: + if (audit_enabled == AUDIT_LOCKED) + return -EPERM; audit_trim_trees(); audit_log_common_recv_msg(audit_context(), &ab, AUDIT_CONFIG_CHANGE); @@ -1480,6 +1482,8 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh, size_t msglen = data_len; char *old, *new; + if (audit_enabled == AUDIT_LOCKED) + return -EPERM; err = -EINVAL; if (msglen < 2 * sizeof(u32)) break; diff --git a/kernel/auditsc.c b/kernel/auditsc.c index ab54fccba215ca..abdf8da3be9340 100644 --- a/kernel/auditsc.c +++ b/kernel/auditsc.c @@ -2786,7 +2786,7 @@ void __audit_log_capset(const struct cred *new, const struct cred *old) context->capset.pid = task_tgid_nr(current); context->capset.cap.effective = new->cap_effective; - context->capset.cap.inheritable = new->cap_effective; + context->capset.cap.inheritable = new->cap_inheritable; context->capset.cap.permitted = new->cap_permitted; context->capset.cap.ambient = new->cap_ambient; context->type = AUDIT_CAPSET; diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c index 802656c6fd3c03..49a8f7b1beef59 100644 --- a/kernel/bpf/arena.c +++ b/kernel/bpf/arena.c @@ -511,7 +511,7 @@ static int arena_map_direct_value_addr(const struct bpf_map *map, u64 *imm, u32 { struct bpf_arena *arena = container_of(map, struct bpf_arena, map); - if ((u64)off > arena->user_vm_end - arena->user_vm_start) + if ((u64)off >= arena->user_vm_end - arena->user_vm_start) return -ERANGE; *imm = (unsigned long)arena->user_vm_start; return 0; diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c index 332e6e003f270a..58197d73b12010 100644 --- a/kernel/bpf/liveness.c +++ b/kernel/bpf/liveness.c @@ -1914,26 +1914,15 @@ int bpf_compute_subprog_arg_access(struct bpf_verifier_env *env) return -ENOMEM; } - instance = call_instance(env, NULL, 0, 0); - if (IS_ERR(instance)) { - err = PTR_ERR(instance); - goto out; - } - err = analyze_subprog(env, NULL, info, instance, callsites); - if (err) - goto out; - /* - * Subprogs and callbacks that don't receive FP-derived arguments - * cannot access ancestor stack frames, so they were skipped during - * the recursive walk above. Async callbacks (timer, workqueue) are - * also not reachable from the main program's call graph. Analyze - * all unvisited subprogs as independent roots at depth 0. + * Analyze every subprog in reverse topological order (callers + * before callees) so that each subprog is analyzed before its + * callees, allowing the recursive walk inside analyze_subprog() + * to naturally reach callees that receive FP-derived args. * - * Use reverse topological order (callers before callees) so that - * each subprog is analyzed before its callees, allowing the - * recursive walk inside analyze_subprog() to naturally - * reach nested callees that also lack FP-derived args. + * Subprogs and callbacks that don't receive FP-derived arguments + * cannot access ancestor stack frames are analyzed independently. + * Async callbacks (timer, workqueue) are handled the same way. */ for (k = env->subprog_cnt - 1; k >= 0; k--) { int sub = env->subprog_topo_order[k]; diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 45c0b1ed687acb..6152add0c5eb6e 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -264,10 +264,12 @@ static void cgroup_finalize_control(struct cgroup *cgrp, int ret); static void css_task_iter_skip(struct css_task_iter *it, struct task_struct *task); static int cgroup_destroy_locked(struct cgroup *cgrp); +static void cgroup_finish_destroy(struct cgroup *cgrp); +static void kill_css_sync(struct cgroup_subsys_state *css); +static void kill_css_finish(struct cgroup_subsys_state *css); static struct cgroup_subsys_state *css_create(struct cgroup *cgrp, struct cgroup_subsys *ss); static void css_release(struct percpu_ref *ref); -static void kill_css(struct cgroup_subsys_state *css); static int cgroup_addrm_files(struct cgroup_subsys_state *css, struct cgroup *cgrp, struct cftype cfts[], bool is_add); @@ -797,6 +799,16 @@ static void cgroup_update_populated(struct cgroup *cgrp, bool populated) if (was_populated == cgroup_is_populated(cgrp)) break; + /* + * Subtree just emptied below an offlined cgrp. Fire deferred + * destroy. The transition is one-shot. + */ + if (was_populated && !css_is_online(&cgrp->self)) { + cgroup_get(cgrp); + WARN_ON_ONCE(!queue_work(cgroup_offline_wq, + &cgrp->finish_destroy_work)); + } + cgroup1_check_for_release(cgrp); TRACE_CGROUP_PATH(notify_populated, cgrp, cgroup_is_populated(cgrp)); @@ -2039,6 +2051,16 @@ static int cgroup_reconfigure(struct fs_context *fc) return 0; } +static void cgroup_finish_destroy_work_fn(struct work_struct *work) +{ + struct cgroup *cgrp = container_of(work, struct cgroup, finish_destroy_work); + + cgroup_lock(); + cgroup_finish_destroy(cgrp); + cgroup_unlock(); + cgroup_put(cgrp); +} + static void init_cgroup_housekeeping(struct cgroup *cgrp) { struct cgroup_subsys *ss; @@ -2065,7 +2087,7 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp) #endif init_waitqueue_head(&cgrp->offline_waitq); - init_waitqueue_head(&cgrp->dying_populated_waitq); + INIT_WORK(&cgrp->finish_destroy_work, cgroup_finish_destroy_work_fn); INIT_WORK(&cgrp->release_agent_work, cgroup1_release_agent); } @@ -3375,7 +3397,8 @@ static void cgroup_apply_control_disable(struct cgroup *cgrp) if (css->parent && !(cgroup_ss_mask(dsct) & (1 << ss->id))) { - kill_css(css); + kill_css_sync(css); + kill_css_finish(css); } else if (!css_visible(css)) { css_clear_dir(css); if (ss->css_reset) @@ -5067,10 +5090,12 @@ static void css_task_iter_advance(struct css_task_iter *it) task = list_entry(it->task_pos, struct task_struct, cg_list); /* - * Hide tasks that are exiting but not yet removed. Keep zombie - * leaders with live threads visible. + * Hide tasks that are exiting but not yet removed by default. Keep + * zombie leaders with live threads visible. Usages that need to walk + * every existing task can opt out via CSS_TASK_ITER_WITH_DEAD. */ - if ((task->flags & PF_EXITING) && !atomic_read(&task->signal->live)) + if (!(it->flags & CSS_TASK_ITER_WITH_DEAD) && + (task->flags & PF_EXITING) && !atomic_read(&task->signal->live)) goto repeat; if (it->flags & CSS_TASK_ITER_PROCS) { @@ -5514,7 +5539,7 @@ static struct cftype cgroup_psi_files[] = { * css destruction is four-stage process. * * 1. Destruction starts. Killing of the percpu_ref is initiated. - * Implemented in kill_css(). + * Implemented in kill_css_finish(). * * 2. When the percpu_ref is confirmed to be visible as killed on all CPUs * and thus css_tryget_online() is guaranteed to fail, the css can be @@ -5993,7 +6018,7 @@ int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name, umode_t mode) /* * This is called when the refcnt of a css is confirmed to be killed. * css_tryget_online() is now guaranteed to fail. Tell the subsystem to - * initiate destruction and put the css ref from kill_css(). + * initiate destruction and put the css ref from kill_css_finish(). */ static void css_killed_work_fn(struct work_struct *work) { @@ -6026,15 +6051,12 @@ static void css_killed_ref_fn(struct percpu_ref *ref) } /** - * kill_css - destroy a css - * @css: css to destroy + * kill_css_sync - synchronous half of css teardown + * @css: css being killed * - * This function initiates destruction of @css by removing cgroup interface - * files and putting its base reference. ->css_offline() will be invoked - * asynchronously once css_tryget_online() is guaranteed to fail and when - * the reference count reaches zero, @css will be released. + * See cgroup_destroy_locked(). */ -static void kill_css(struct cgroup_subsys_state *css) +static void kill_css_sync(struct cgroup_subsys_state *css) { struct cgroup_subsys *ss = css->ss; @@ -6057,24 +6079,6 @@ static void kill_css(struct cgroup_subsys_state *css) */ css_clear_dir(css); - /* - * Killing would put the base ref, but we need to keep it alive - * until after ->css_offline(). - */ - css_get(css); - - /* - * cgroup core guarantees that, by the time ->css_offline() is - * invoked, no new css reference will be given out via - * css_tryget_online(). We can't simply call percpu_ref_kill() and - * proceed to offlining css's because percpu_ref_kill() doesn't - * guarantee that the ref is seen as killed on all CPUs on return. - * - * Use percpu_ref_kill_and_confirm() to get notifications as each - * css is confirmed to be seen as killed on all CPUs. - */ - percpu_ref_kill_and_confirm(&css->refcnt, css_killed_ref_fn); - css->cgroup->nr_dying_subsys[ss->id]++; /* * Parent css and cgroup cannot be freed until after the freeing @@ -6087,44 +6091,88 @@ static void kill_css(struct cgroup_subsys_state *css) } /** - * cgroup_destroy_locked - the first stage of cgroup destruction + * kill_css_finish - deferred half of css teardown + * @css: css being killed + * + * See cgroup_destroy_locked(). + */ +static void kill_css_finish(struct cgroup_subsys_state *css) +{ + lockdep_assert_held(&cgroup_mutex); + + /* + * Skip on re-entry: cgroup_apply_control_disable() may have killed @css + * earlier. cgroup_destroy_locked() can still walk it because + * offline_css() (which NULLs cgrp->subsys[ssid]) runs async. + */ + if (percpu_ref_is_dying(&css->refcnt)) + return; + + /* + * Killing would put the base ref, but we need to keep it alive until + * after ->css_offline(). + */ + css_get(css); + + /* + * cgroup core guarantees that, by the time ->css_offline() is invoked, + * no new css reference will be given out via css_tryget_online(). We + * can't simply call percpu_ref_kill() and proceed to offlining css's + * because percpu_ref_kill() doesn't guarantee that the ref is seen as + * killed on all CPUs on return. + * + * Use percpu_ref_kill_and_confirm() to get notifications as each css is + * confirmed to be seen as killed on all CPUs. + */ + percpu_ref_kill_and_confirm(&css->refcnt, css_killed_ref_fn); +} + +/** + * cgroup_destroy_locked - destroy @cgrp (called on rmdir) * @cgrp: cgroup to be destroyed * - * css's make use of percpu refcnts whose killing latency shouldn't be - * exposed to userland and are RCU protected. Also, cgroup core needs to - * guarantee that css_tryget_online() won't succeed by the time - * ->css_offline() is invoked. To satisfy all the requirements, - * destruction is implemented in the following two steps. - * - * s1. Verify @cgrp can be destroyed and mark it dying. Remove all - * userland visible parts and start killing the percpu refcnts of - * css's. Set up so that the next stage will be kicked off once all - * the percpu refcnts are confirmed to be killed. - * - * s2. Invoke ->css_offline(), mark the cgroup dead and proceed with the - * rest of destruction. Once all cgroup references are gone, the - * cgroup is RCU-freed. - * - * This function implements s1. After this step, @cgrp is gone as far as - * the userland is concerned and a new cgroup with the same name may be - * created. As cgroup doesn't care about the names internally, this - * doesn't cause any problem. + * Tear down @cgrp on behalf of rmdir. Constraints: + * + * - Userspace: rmdir must succeed when cgroup.procs and friends are empty. + * + * - Kernel: subsystem ->css_offline() must not run while any task in @cgrp's + * subtree is still doing kernel work. A task hidden from cgroup.procs (past + * exit_signals() with signal->live cleared) can still schedule, allocate, and + * consume resources until its final context switch. Dying descendants in the + * subtree can host such tasks too. + * + * - Kernel: css_tryget_online() must fail by the time ->css_offline() runs. + * + * The destruction runs in three parts: + * + * - This function: synchronous user-visible state teardown plus kill_css_sync() + * on each subsystem css. + * + * - cgroup_finish_destroy(): kicks the percpu_ref kill via kill_css_finish() on + * each subsystem css. Fires once @cgrp's subtree is fully drained, either + * inline here or from cgroup_update_populated(). + * + * - The percpu_ref kill chain: css_killed_ref_fn -> css_killed_work_fn -> + * ->css_offline() -> release/free. + * + * Return 0 on success, -EBUSY if a userspace-visible task or an online child + * remains. */ static int cgroup_destroy_locked(struct cgroup *cgrp) - __releases(&cgroup_mutex) __acquires(&cgroup_mutex) { struct cgroup *tcgrp, *parent = cgroup_parent(cgrp); struct cgroup_subsys_state *css; struct cgrp_cset_link *link; + struct css_task_iter it; + struct task_struct *task; int ssid, ret; lockdep_assert_held(&cgroup_mutex); - /* - * Only migration can raise populated from zero and we're already - * holding cgroup_mutex. - */ - if (cgroup_is_populated(cgrp)) + css_task_iter_start(&cgrp->self, 0, &it); + task = css_task_iter_next(&it); + css_task_iter_end(&it); + if (task) return -EBUSY; /* @@ -6148,9 +6196,8 @@ static int cgroup_destroy_locked(struct cgroup *cgrp) link->cset->dead = true; spin_unlock_irq(&css_set_lock); - /* initiate massacre of all css's */ for_each_css(css, ssid, cgrp) - kill_css(css); + kill_css_sync(css); /* clear and remove @cgrp dir, @cgrp has an extra ref on its kn */ css_clear_dir(&cgrp->self); @@ -6181,79 +6228,27 @@ static int cgroup_destroy_locked(struct cgroup *cgrp) /* put the base reference */ percpu_ref_kill(&cgrp->self.refcnt); + if (!cgroup_is_populated(cgrp)) + cgroup_finish_destroy(cgrp); + return 0; }; /** - * cgroup_drain_dying - wait for dying tasks to leave before rmdir - * @cgrp: the cgroup being removed - * - * cgroup.procs and cgroup.threads use css_task_iter which filters out - * PF_EXITING tasks so that userspace doesn't see tasks that have already been - * reaped via waitpid(). However, cgroup_has_tasks() - which tests whether the - * cgroup has non-empty css_sets - is only updated when dying tasks pass through - * cgroup_task_dead() in finish_task_switch(). This creates a window where - * cgroup.procs reads empty but cgroup_has_tasks() is still true, making rmdir - * fail with -EBUSY from cgroup_destroy_locked() even though userspace sees no - * tasks. - * - * This function aligns cgroup_has_tasks() with what userspace can observe. If - * cgroup_has_tasks() but the task iterator sees nothing (all remaining tasks are - * PF_EXITING), we wait for cgroup_task_dead() to finish processing them. As the - * window between PF_EXITING and cgroup_task_dead() is short, the wait is brief. + * cgroup_finish_destroy - deferred half of @cgrp destruction + * @cgrp: cgroup whose subtree just became empty * - * This function only concerns itself with this cgroup's own dying tasks. - * Whether the cgroup has children is cgroup_destroy_locked()'s problem. - * - * Each cgroup_task_dead() kicks the waitqueue via cset->cgrp_links, and we - * retry the full check from scratch. - * - * Must be called with cgroup_mutex held. + * See cgroup_destroy_locked() for the rationale. */ -static int cgroup_drain_dying(struct cgroup *cgrp) - __releases(&cgroup_mutex) __acquires(&cgroup_mutex) +static void cgroup_finish_destroy(struct cgroup *cgrp) { - struct css_task_iter it; - struct task_struct *task; - DEFINE_WAIT(wait); + struct cgroup_subsys_state *css; + int ssid; lockdep_assert_held(&cgroup_mutex); -retry: - if (!cgroup_has_tasks(cgrp)) - return 0; - - /* Same iterator as cgroup.threads - if any task is visible, it's busy */ - css_task_iter_start(&cgrp->self, 0, &it); - task = css_task_iter_next(&it); - css_task_iter_end(&it); - - if (task) - return -EBUSY; - /* - * All remaining tasks are PF_EXITING and will pass through - * cgroup_task_dead() shortly. Wait for a kick and retry. - * - * cgroup_has_tasks() can't transition from false to true while we're - * holding cgroup_mutex, but the true to false transition happens - * under css_set_lock (via cgroup_task_dead()). We must retest and - * prepare_to_wait() under css_set_lock. Otherwise, the transition - * can happen between our first test and prepare_to_wait(), and we - * sleep with no one to wake us. - */ - spin_lock_irq(&css_set_lock); - if (!cgroup_has_tasks(cgrp)) { - spin_unlock_irq(&css_set_lock); - return 0; - } - prepare_to_wait(&cgrp->dying_populated_waitq, &wait, - TASK_UNINTERRUPTIBLE); - spin_unlock_irq(&css_set_lock); - mutex_unlock(&cgroup_mutex); - schedule(); - finish_wait(&cgrp->dying_populated_waitq, &wait); - mutex_lock(&cgroup_mutex); - goto retry; + for_each_css(css, ssid, cgrp) + kill_css_finish(css); } int cgroup_rmdir(struct kernfs_node *kn) @@ -6265,12 +6260,9 @@ int cgroup_rmdir(struct kernfs_node *kn) if (!cgrp) return 0; - ret = cgroup_drain_dying(cgrp); - if (!ret) { - ret = cgroup_destroy_locked(cgrp); - if (!ret) - TRACE_CGROUP_PATH(rmdir, cgrp); - } + ret = cgroup_destroy_locked(cgrp); + if (!ret) + TRACE_CGROUP_PATH(rmdir, cgrp); cgroup_kn_unlock(kn); return ret; @@ -7030,7 +7022,6 @@ void cgroup_task_exit(struct task_struct *tsk) static void do_cgroup_task_dead(struct task_struct *tsk) { - struct cgrp_cset_link *link; struct css_set *cset; unsigned long flags; @@ -7044,11 +7035,6 @@ static void do_cgroup_task_dead(struct task_struct *tsk) if (thread_group_leader(tsk) && atomic_read(&tsk->signal->live)) list_add_tail(&tsk->cg_list, &cset->dying_tasks); - /* kick cgroup_drain_dying() waiters, see cgroup_rmdir() */ - list_for_each_entry(link, &cset->cgrp_links, cgrp_link) - if (waitqueue_active(&link->cgrp->dying_populated_waitq)) - wake_up(&link->cgrp->dying_populated_waitq); - if (dl_task(tsk)) dec_dl_tasks_cs(tsk); diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h index bb4e692bea300c..f7aaf01f7cd5e3 100644 --- a/kernel/cgroup/cpuset-internal.h +++ b/kernel/cgroup/cpuset-internal.h @@ -167,6 +167,7 @@ struct cpuset { */ int nr_deadline_tasks; int nr_migrate_dl_tasks; + /* DL bandwidth that needs destination reservation for this attach. */ u64 sum_migrate_dl_bw; /* * CPU used for temporary DL bandwidth allocation during attach; diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index e3a081a07c6d51..5c33ab20cc208b 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1718,7 +1718,8 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd, */ if (is_partition_valid(parent)) adding = cpumask_and(tmp->addmask, - xcpus, parent->effective_xcpus); + cs->effective_xcpus, + parent->effective_xcpus); if (old_prs > 0) new_prs = -old_prs; @@ -2993,7 +2994,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset) struct cpuset *cs, *oldcs; struct task_struct *task; bool setsched_check; - int ret; + int cpu, ret; /* used later by cpuset_attach() */ cpuset_attach_old_cs = task_cs(cgroup_taskset_first(tset, &css)); @@ -3038,31 +3039,31 @@ static int cpuset_can_attach(struct cgroup_taskset *tset) } if (dl_task(task)) { + /* + * Count all migrating DL tasks for cpuset task accounting. + * Only tasks that need a root-domain bandwidth move + * contribute to sum_migrate_dl_bw. + */ cs->nr_migrate_dl_tasks++; - cs->sum_migrate_dl_bw += task->dl.dl_bw; + if (dl_task_needs_bw_move(task, cs->effective_cpus)) + cs->sum_migrate_dl_bw += task->dl.dl_bw; } } - if (!cs->nr_migrate_dl_tasks) + if (!cs->sum_migrate_dl_bw) goto out_success; - if (!cpumask_intersects(oldcs->effective_cpus, cs->effective_cpus)) { - int cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus); - - if (unlikely(cpu >= nr_cpu_ids)) { - reset_migrate_dl_data(cs); - ret = -EINVAL; - goto out_unlock; - } + cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus); + if (unlikely(cpu >= nr_cpu_ids)) { + ret = -EINVAL; + goto out_unlock; + } - ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw); - if (ret) { - reset_migrate_dl_data(cs); - goto out_unlock; - } + ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw); + if (ret) + goto out_unlock; - cs->dl_bw_cpu = cpu; - } + cs->dl_bw_cpu = cpu; out_success: /* @@ -3070,7 +3071,10 @@ static int cpuset_can_attach(struct cgroup_taskset *tset) * changes which zero cpus/mems_allowed. */ cs->attach_in_progress++; + out_unlock: + if (ret) + reset_migrate_dl_data(cs); mutex_unlock(&cpuset_mutex); return ret; } @@ -4176,11 +4180,11 @@ static struct cpuset *nearest_hardwall_ancestor(struct cpuset *cs) * current's mems_allowed, yes. If it's not a __GFP_HARDWALL request and this * node is set in the nearest hardwalled cpuset ancestor to current's cpuset, * yes. If current has access to memory reserves as an oom victim, yes. - * Otherwise, no. + * If the current task is PF_EXITING, yes. Otherwise, no. * * GFP_USER allocations are marked with the __GFP_HARDWALL bit, * and do not allow allocations outside the current tasks cpuset - * unless the task has been OOM killed. + * unless the task has been OOM killed or is exiting. * GFP_KERNEL allocations are not so marked, so can escape to the * nearest enclosing hardwalled ancestor cpuset. * @@ -4194,7 +4198,9 @@ static struct cpuset *nearest_hardwall_ancestor(struct cpuset *cs) * The first call here from mm/page_alloc:get_page_from_freelist() * has __GFP_HARDWALL set in gfp_mask, enforcing hardwall cpusets, * so no allocation on a node outside the cpuset is allowed (unless - * in interrupt, of course). + * in interrupt, of course). The PF_EXITING check must therefore + * come before the __GFP_HARDWALL check, otherwise a dying task + * would be blocked on the fast path. * * The second pass through get_page_from_freelist() doesn't even call * here for GFP_ATOMIC calls. For those calls, the __alloc_pages() @@ -4204,6 +4210,7 @@ static struct cpuset *nearest_hardwall_ancestor(struct cpuset *cs) * in_interrupt - any node ok (current task context irrelevant) * GFP_ATOMIC - any node ok * tsk_is_oom_victim - any node ok + * PF_EXITING - any node ok (let dying task exit quickly) * GFP_KERNEL - any node in enclosing hardwalled cpuset ok * GFP_USER - only nodes in current tasks mems allowed ok. */ @@ -4223,11 +4230,10 @@ bool cpuset_current_node_allowed(int node, gfp_t gfp_mask) */ if (unlikely(tsk_is_oom_victim(current))) return true; - if (gfp_mask & __GFP_HARDWALL) /* If hardwall request, stop here */ - return false; - if (current->flags & PF_EXITING) /* Let dying task have memory */ return true; + if (gfp_mask & __GFP_HARDWALL) /* If hardwall request, stop here */ + return false; /* Not hardwall and node outside mems_allowed: scan up cpusets */ spin_lock_irqsave(&callback_lock, flags); diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c index 1ab1fb47f2711e..4753a67d0f0f2b 100644 --- a/kernel/cgroup/dmem.c +++ b/kernel/cgroup/dmem.c @@ -602,6 +602,7 @@ get_cg_pool_unlocked(struct dmemcg_state *cg, struct dmem_cgroup_region *region) pool = NULL; continue; } + pool = ERR_PTR(-ENOMEM); } } diff --git a/kernel/events/core.c b/kernel/events/core.c index 6d1f8bad7e1c52..7935d5663944ee 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7006,6 +7006,7 @@ static void perf_mmap_open(struct vm_area_struct *vma) } static void perf_pmu_output_stop(struct perf_event *event); +static void perf_mmap_unaccount(struct vm_area_struct *vma, struct perf_buffer *rb); /* * A buffer can be mmap()ed multiple times; either directly through the same @@ -7021,8 +7022,6 @@ static void perf_mmap_close(struct vm_area_struct *vma) mapped_f unmapped = get_mapped(event, event_unmapped); struct perf_buffer *rb = ring_buffer_get(event); struct user_struct *mmap_user = rb->mmap_user; - int mmap_locked = rb->mmap_locked; - unsigned long size = perf_data_size(rb); bool detach_rest = false; /* FIXIES vs perf_pmu_unregister() */ @@ -7117,11 +7116,7 @@ static void perf_mmap_close(struct vm_area_struct *vma) * Aside from that, this buffer is 'fully' detached and unmapped, * undo the VM accounting. */ - - atomic_long_sub((size >> PAGE_SHIFT) + 1 - mmap_locked, - &mmap_user->locked_vm); - atomic64_sub(mmap_locked, &vma->vm_mm->pinned_vm); - free_uid(mmap_user); + perf_mmap_unaccount(vma, rb); out_put: ring_buffer_put(rb); /* could be last */ @@ -7261,6 +7256,15 @@ static void perf_mmap_account(struct vm_area_struct *vma, long user_extra, long atomic64_add(extra, &vma->vm_mm->pinned_vm); } +static void perf_mmap_unaccount(struct vm_area_struct *vma, struct perf_buffer *rb) +{ + struct user_struct *user = rb->mmap_user; + + atomic_long_sub((perf_data_size(rb) >> PAGE_SHIFT) + 1 - rb->mmap_locked, + &user->locked_vm); + atomic64_sub(rb->mmap_locked, &vma->vm_mm->pinned_vm); +} + static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event, unsigned long nr_pages) { @@ -7323,8 +7327,6 @@ static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event, if (!rb) return -ENOMEM; - refcount_set(&rb->mmap_count, 1); - rb->mmap_user = get_current_user(); rb->mmap_locked = extra; ring_buffer_attach(event, rb); @@ -7474,16 +7476,54 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) mapped(event, vma->vm_mm); /* - * Try to map it into the page table. On fail, invoke - * perf_mmap_close() to undo the above, as the callsite expects - * full cleanup in this case and therefore does not invoke - * vmops::close(). + * Try to map it into the page table. On fail undo the above, + * as the callsite expects full cleanup in this case and + * therefore does not invoke vmops::close(). */ ret = map_range(event->rb, vma); - if (ret) - perf_mmap_close(vma); + if (likely(!ret)) + return 0; + + /* Error path */ + + /* + * If this is the first mmap(), then event->mmap_count should + * be stable at 1. It is only modified by: + * perf_mmap_{open,close}() and perf_mmap(). + * + * The former are not possible because this mmap() hasn't been + * successful yet, and the latter is serialized by + * event->mmap_mutex which we still hold (note that mmap_lock + * is not strictly sufficient here, because the event fd can + * be passed to another process through trivial means like + * fork(), leading to concurrent mmap() from different mm). + * + * Make sure to remove event->rb before releasing + * event->mmap_mutex, such that any concurrent mmap() will not + * attempt use this failed buffer. + */ + if (refcount_read(&event->mmap_count) == 1) { + /* + * Minimal perf_mmap_close(); there can't be AUX or + * other events on account of this being the first. + */ + mapped = get_mapped(event, event_unmapped); + if (mapped) + mapped(event, vma->vm_mm); + perf_mmap_unaccount(vma, event->rb); + ring_buffer_attach(event, NULL); /* drops last rb->refcount */ + refcount_set(&event->mmap_count, 0); + return ret; + } + + /* + * Otherwise this is an already existing buffer, and there is + * no race vs first exposure, so fall-through and call + * perf_mmap_close(). + */ } + perf_mmap_close(vma); return ret; } diff --git a/kernel/events/internal.h b/kernel/events/internal.h index d9cc5708309184..c03c4f2eea5718 100644 --- a/kernel/events/internal.h +++ b/kernel/events/internal.h @@ -67,6 +67,7 @@ static inline void rb_free_rcu(struct rcu_head *rcu_head) struct perf_buffer *rb; rb = container_of(rcu_head, struct perf_buffer, rcu_head); + free_uid(rb->mmap_user); rb_free(rb); } diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 3e7de266141729..9fe92161715e09 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -340,6 +340,8 @@ ring_buffer_init(struct perf_buffer *rb, long watermark, int flags) rb->paused = 1; mutex_init(&rb->aux_mutex); + rb->mmap_user = get_current_user(); + refcount_set(&rb->mmap_count, 1); } void perf_aux_output_flag(struct perf_output_handle *handle, u64 flags) diff --git a/kernel/exit.c b/kernel/exit.c index 25e9cb6de7e793..f50d73c272d6ee 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -571,6 +571,7 @@ static void exit_mm(void) */ smp_mb__after_spinlock(); local_irq_disable(); + current->user_dumpable = (get_dumpable(mm) == SUID_DUMP_USER); current->mm = NULL; membarrier_update_current_mm(NULL); enter_lazy_tlb(mm, current); @@ -1073,6 +1074,7 @@ void __noreturn make_task_dead(int signr) futex_exit_recursive(tsk); tsk->exit_state = EXIT_DEAD; refcount_inc(&tsk->rcu_users); + preempt_disable(); do_task_dead(); } diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index 6c9b1dc4e7d467..b635e3c5d5b6b9 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -893,7 +894,10 @@ void handle_percpu_irq(struct irq_desc *desc) * * action->percpu_dev_id is a pointer to percpu variables which * contain the real device id for the cpu on which this handler is - * called + * called. + * + * May be used for NMI interrupt lines, and so may be called in IRQ or NMI + * context. */ void handle_percpu_devid_irq(struct irq_desc *desc) { @@ -930,7 +934,8 @@ void handle_percpu_devid_irq(struct irq_desc *desc) enabled ? " and unmasked" : "", irq, cpu); } - add_interrupt_randomness(irq); + if (!in_nmi()) + add_interrupt_randomness(irq); if (chip->irq_eoi) chip->irq_eoi(&desc->irq_data); diff --git a/kernel/irq_work.c b/kernel/irq_work.c index 120fd7365fbe29..f7e2dc2c30c621 100644 --- a/kernel/irq_work.c +++ b/kernel/irq_work.c @@ -292,6 +292,12 @@ void irq_work_sync(struct irq_work *work) !arch_irq_work_has_interrupt()) { rcuwait_wait_event(&work->irqwait, !irq_work_is_busy(work), TASK_UNINTERRUPTIBLE); + /* + * Ensure irq_work_single() does not access @work + * after removing IRQ_WORK_BUSY. It is always + * accessed within a RCU-read section. + */ + synchronize_rcu(); return; } @@ -302,6 +308,7 @@ EXPORT_SYMBOL_GPL(irq_work_sync); static void run_irq_workd(unsigned int cpu) { + guard(rcu)(); irq_work_run_list(this_cpu_ptr(&lazy_list)); } diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c index 18509d8082ea75..2592f7ca16e2e3 100644 --- a/kernel/liveupdate/kexec_handover.c +++ b/kernel/liveupdate/kexec_handover.c @@ -1707,7 +1707,7 @@ int kho_fill_kimage(struct kimage *image) int err = 0; struct kexec_buf scratch; - if (!kho_enable) + if (!kho_enable || image->type == KEXEC_TYPE_CRASH) return 0; image->kho.fdt = virt_to_phys(kho_out.fdt); diff --git a/kernel/ptrace.c b/kernel/ptrace.c index 68c17daef8d40b..130043bfc2091c 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -272,11 +272,24 @@ static bool ptrace_has_cap(struct user_namespace *ns, unsigned int mode) return ns_capable(ns, CAP_SYS_PTRACE); } +static bool task_still_dumpable(struct task_struct *task, unsigned int mode) +{ + struct mm_struct *mm = task->mm; + if (mm) { + if (get_dumpable(mm) == SUID_DUMP_USER) + return true; + return ptrace_has_cap(mm->user_ns, mode); + } + + if (task->user_dumpable) + return true; + return ptrace_has_cap(&init_user_ns, mode); +} + /* Returns 0 on success, -errno on denial. */ static int __ptrace_may_access(struct task_struct *task, unsigned int mode) { const struct cred *cred = current_cred(), *tcred; - struct mm_struct *mm; kuid_t caller_uid; kgid_t caller_gid; @@ -337,11 +350,8 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode) * Pairs with a write barrier in commit_creds(). */ smp_rmb(); - mm = task->mm; - if (mm && - ((get_dumpable(mm) != SUID_DUMP_USER) && - !ptrace_has_cap(mm->user_ns, mode))) - return -EPERM; + if (!task_still_dumpable(task, mode)) + return -EPERM; return security_ptrace_access_check(task, mode); } diff --git a/kernel/rseq.c b/kernel/rseq.c index 38d3ef540760f4..e75e3a5e312c8a 100644 --- a/kernel/rseq.c +++ b/kernel/rseq.c @@ -236,11 +236,6 @@ static int __init rseq_debugfs_init(void) } __initcall(rseq_debugfs_init); -static bool rseq_set_ids(struct task_struct *t, struct rseq_ids *ids, u32 node_id) -{ - return rseq_set_ids_get_csaddr(t, ids, node_id, NULL); -} - static bool rseq_handle_cs(struct task_struct *t, struct pt_regs *regs) { struct rseq __user *urseq = t->rseq.usrptr; @@ -258,14 +253,16 @@ static bool rseq_handle_cs(struct task_struct *t, struct pt_regs *regs) static void rseq_slowpath_update_usr(struct pt_regs *regs) { /* - * Preserve rseq state and user_irq state. The generic entry code - * clears user_irq on the way out, the non-generic entry - * architectures are not having user_irq. + * Preserve has_rseq and user_irq state. The generic entry code clears + * user_irq on the way out, the non-generic entry architectures are not + * setting user_irq. */ - const struct rseq_event evt_mask = { .has_rseq = true, .user_irq = true, }; + const struct rseq_event evt_mask = { + .has_rseq = RSEQ_HAS_RSEQ_VERSION_MASK, + .user_irq = true, + }; struct task_struct *t = current; struct rseq_ids ids; - u32 node_id; bool event; if (unlikely(t->flags & PF_EXITING)) @@ -301,9 +298,9 @@ static void rseq_slowpath_update_usr(struct pt_regs *regs) if (!event) return; - node_id = cpu_to_node(ids.cpu_id); + ids.node_id = cpu_to_node(ids.cpu_id); - if (unlikely(!rseq_update_usr(t, regs, &ids, node_id))) { + if (unlikely(!rseq_update_usr(t, regs, &ids))) { /* * Clear the errors just in case this might survive magically, but * leave the rest intact. @@ -335,8 +332,9 @@ void __rseq_handle_slowpath(struct pt_regs *regs) void __rseq_signal_deliver(int sig, struct pt_regs *regs) { rseq_stat_inc(rseq_stats.signal); + /* - * Don't update IDs, they are handled on exit to user if + * Don't update IDs yet, they are handled on exit to user if * necessary. The important thing is to abort a critical section of * the interrupted context as after this point the instruction * pointer in @regs points to the signal handler. @@ -349,6 +347,13 @@ void __rseq_signal_deliver(int sig, struct pt_regs *regs) current->rseq.event.error = 0; force_sigsegv(sig); } + + /* + * In legacy mode, force the update of IDs before returning to user + * space to stay compatible. + */ + if (!rseq_v2(current)) + rseq_force_update(); } /* @@ -384,19 +389,22 @@ void rseq_syscall(struct pt_regs *regs) static bool rseq_reset_ids(void) { - struct rseq_ids ids = { - .cpu_id = RSEQ_CPU_ID_UNINITIALIZED, - .mm_cid = 0, - }; + struct rseq __user *rseq = current->rseq.usrptr; /* * If this fails, terminate it because this leaves the kernel in * stupid state as exit to user space will try to fixup the ids * again. */ - if (rseq_set_ids(current, &ids, 0)) - return true; + scoped_user_rw_access(rseq, efault) { + unsafe_put_user(0, &rseq->cpu_id_start, efault); + unsafe_put_user(RSEQ_CPU_ID_UNINITIALIZED, &rseq->cpu_id, efault); + unsafe_put_user(0, &rseq->node_id, efault); + unsafe_put_user(0, &rseq->mm_cid, efault); + } + return true; +efault: force_sig(SIGSEGV); return false; } @@ -404,70 +412,29 @@ static bool rseq_reset_ids(void) /* The original rseq structure size (including padding) is 32 bytes. */ #define ORIG_RSEQ_SIZE 32 -/* - * sys_rseq - setup restartable sequences for caller thread. - */ -SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len, int, flags, u32, sig) +static long rseq_register(struct rseq __user * rseq, u32 rseq_len, int flags, u32 sig) { u32 rseqfl = 0; + u8 version = 1; - if (flags & RSEQ_FLAG_UNREGISTER) { - if (flags & ~RSEQ_FLAG_UNREGISTER) - return -EINVAL; - /* Unregister rseq for current thread. */ - if (current->rseq.usrptr != rseq || !current->rseq.usrptr) - return -EINVAL; - if (rseq_len != current->rseq.len) - return -EINVAL; - if (current->rseq.sig != sig) - return -EPERM; - if (!rseq_reset_ids()) - return -EFAULT; - rseq_reset(current); - return 0; - } - - if (unlikely(flags & ~(RSEQ_FLAG_SLICE_EXT_DEFAULT_ON))) - return -EINVAL; - - if (current->rseq.usrptr) { - /* - * If rseq is already registered, check whether - * the provided address differs from the prior - * one. - */ - if (current->rseq.usrptr != rseq || rseq_len != current->rseq.len) - return -EINVAL; - if (current->rseq.sig != sig) - return -EPERM; - /* Already registered. */ - return -EBUSY; - } - - /* - * If there was no rseq previously registered, ensure the provided rseq - * is properly aligned, as communcated to user-space through the ELF - * auxiliary vector AT_RSEQ_ALIGN. If rseq_len is the original rseq - * size, the required alignment is the original struct rseq alignment. - * - * The rseq_len is required to be greater or equal to the original rseq - * size. In order to be valid, rseq_len is either the original rseq size, - * or large enough to contain all supported fields, as communicated to - * user-space through the ELF auxiliary vector AT_RSEQ_FEATURE_SIZE. - */ - if (rseq_len < ORIG_RSEQ_SIZE || - (rseq_len == ORIG_RSEQ_SIZE && !IS_ALIGNED((unsigned long)rseq, ORIG_RSEQ_SIZE)) || - (rseq_len != ORIG_RSEQ_SIZE && (!IS_ALIGNED((unsigned long)rseq, rseq_alloc_align()) || - rseq_len < offsetof(struct rseq, end)))) - return -EINVAL; if (!access_ok(rseq, rseq_len)) return -EFAULT; - if (IS_ENABLED(CONFIG_RSEQ_SLICE_EXTENSION)) { - rseqfl |= RSEQ_CS_FLAG_SLICE_EXT_AVAILABLE; - if (rseq_slice_extension_enabled() && - (flags & RSEQ_FLAG_SLICE_EXT_DEFAULT_ON)) - rseqfl |= RSEQ_CS_FLAG_SLICE_EXT_ENABLED; + /* + * Architectures, which use the generic IRQ entry code (at least) enable + * registrations with a size greater than the original v1 fixed sized + * @rseq_len, which has been validated already to utilize the optimized + * v2 ABI mode which also enables extended RSEQ features beyond MMCID. + */ + if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY) && rseq_len > ORIG_RSEQ_SIZE) + version = 2; + + if (IS_ENABLED(CONFIG_RSEQ_SLICE_EXTENSION) && version > 1) { + if (rseq_slice_extension_enabled()) { + rseqfl |= RSEQ_CS_FLAG_SLICE_EXT_AVAILABLE; + if (flags & RSEQ_FLAG_SLICE_EXT_DEFAULT_ON) + rseqfl |= RSEQ_CS_FLAG_SLICE_EXT_ENABLED; + } } scoped_user_write_access(rseq, efault) { @@ -485,7 +452,15 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len, int, flags, u32 unsafe_put_user(RSEQ_CPU_ID_UNINITIALIZED, &rseq->cpu_id, efault); unsafe_put_user(0U, &rseq->node_id, efault); unsafe_put_user(0U, &rseq->mm_cid, efault); - unsafe_put_user(0U, &rseq->slice_ctrl.all, efault); + + /* + * All fields past mm_cid are only valid for non-legacy v2 + * registrations. + */ + if (version > 1) { + if (IS_ENABLED(CONFIG_RSEQ_SLICE_EXTENSION)) + unsafe_put_user(0U, &rseq->slice_ctrl.all, efault); + } } /* @@ -501,11 +476,10 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len, int, flags, u32 #endif /* - * If rseq was previously inactive, and has just been - * registered, ensure the cpu_id_start and cpu_id fields - * are updated before returning to user-space. + * Ensure the cpu_id_start and cpu_id fields are updated before + * returning to user-space. */ - current->rseq.event.has_rseq = true; + current->rseq.event.has_rseq = version; rseq_force_update(); return 0; @@ -513,6 +487,80 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len, int, flags, u32 return -EFAULT; } +static long rseq_unregister(struct rseq __user * rseq, u32 rseq_len, int flags, u32 sig) +{ + if (flags & ~RSEQ_FLAG_UNREGISTER) + return -EINVAL; + if (current->rseq.usrptr != rseq || !current->rseq.usrptr) + return -EINVAL; + if (rseq_len != current->rseq.len) + return -EINVAL; + if (current->rseq.sig != sig) + return -EPERM; + if (!rseq_reset_ids()) + return -EFAULT; + rseq_reset(current); + return 0; +} + +static long rseq_reregister(struct rseq __user * rseq, u32 rseq_len, u32 sig) +{ + /* + * If rseq is already registered, check whether the provided address + * differs from the prior one. + */ + if (current->rseq.usrptr != rseq || rseq_len != current->rseq.len) + return -EINVAL; + if (current->rseq.sig != sig) + return -EPERM; + /* Already registered. */ + return -EBUSY; +} + +static bool rseq_length_valid(struct rseq __user *rseq, unsigned int rseq_len) +{ + /* + * Ensure the provided rseq is properly aligned, as communicated to + * user-space through the ELF auxiliary vector AT_RSEQ_ALIGN. If + * rseq_len is the original rseq size, the required alignment is the + * original struct rseq alignment. + * + * In order to be valid, rseq_len is either the original rseq size, or + * large enough to contain all supported fields, as communicated to + * user-space through the ELF auxiliary vector AT_RSEQ_FEATURE_SIZE. + */ + if (rseq_len < ORIG_RSEQ_SIZE) + return false; + + if (rseq_len == ORIG_RSEQ_SIZE) + return IS_ALIGNED((unsigned long)rseq, ORIG_RSEQ_SIZE); + + return IS_ALIGNED((unsigned long)rseq, rseq_alloc_align()) && + rseq_len >= offsetof(struct rseq, end); +} + +#define RSEQ_FLAGS_SUPPORTED (RSEQ_FLAG_SLICE_EXT_DEFAULT_ON) + +/* + * sys_rseq - Register or unregister restartable sequences for the caller thread. + */ +SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len, int, flags, u32, sig) +{ + if (flags & RSEQ_FLAG_UNREGISTER) + return rseq_unregister(rseq, rseq_len, flags, sig); + + if (unlikely(flags & ~RSEQ_FLAGS_SUPPORTED)) + return -EINVAL; + + if (current->rseq.usrptr) + return rseq_reregister(rseq, rseq_len, sig); + + if (!rseq_length_valid(rseq, rseq_len)) + return -EINVAL; + + return rseq_register(rseq, rseq_len, flags, sig); +} + #ifdef CONFIG_RSEQ_SLICE_EXTENSION struct slice_timer { struct hrtimer timer; @@ -713,6 +761,8 @@ int rseq_slice_extension_prctl(unsigned long arg2, unsigned long arg3) return -ENOTSUPP; if (!current->rseq.usrptr) return -ENXIO; + if (!rseq_v2(current)) + return -ENOTSUPP; /* No change? */ if (enable == !!current->rseq.slice.state.enabled) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index edca7849b165d9..7db4c87df83b0a 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -3107,20 +3107,18 @@ static void task_woken_dl(struct rq *rq, struct task_struct *p) static void set_cpus_allowed_dl(struct task_struct *p, struct affinity_context *ctx) { - struct root_domain *src_rd; struct rq *rq; WARN_ON_ONCE(!dl_task(p)); rq = task_rq(p); - src_rd = rq->rd; /* * Migrating a SCHED_DEADLINE task between exclusive * cpusets (different root_domains) entails a bandwidth * update. We already made space for us in the destination * domain (see cpuset_can_attach()). */ - if (!cpumask_intersects(src_rd->span, ctx->new_mask)) { + if (dl_task_needs_bw_move(p, ctx->new_mask)) { struct dl_bw *src_dl_b; src_dl_b = dl_bw_of(cpu_of(rq)); @@ -3137,6 +3135,15 @@ static void set_cpus_allowed_dl(struct task_struct *p, set_cpus_allowed_common(p, ctx); } +bool dl_task_needs_bw_move(struct task_struct *p, + const struct cpumask *new_mask) +{ + if (!dl_task(p)) + return false; + + return !cpumask_intersects(task_rq(p)->rd->span, new_mask); +} + /* Assumes rq->lock is held */ static void rq_online_dl(struct rq *rq) { diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 345aa11b84b28e..d3ce708991c3ed 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -297,7 +297,6 @@ static void scx_set_task_sched(struct task_struct *p, struct scx_sched *sch) #else /* CONFIG_EXT_SUB_SCHED */ static struct scx_sched *scx_parent(struct scx_sched *sch) { return NULL; } static struct scx_sched *scx_next_descendant_pre(struct scx_sched *pos, struct scx_sched *root) { return pos ? NULL : root; } -static struct scx_sched *scx_find_sub_sched(u64 cgroup_id) { return NULL; } static void scx_set_task_sched(struct task_struct *p, struct scx_sched *sch) {} #endif /* CONFIG_EXT_SUB_SCHED */ @@ -712,6 +711,51 @@ struct bpf_iter_scx_dsq { } __attribute__((aligned(8))); +static u32 scx_get_task_state(const struct task_struct *p) +{ + return p->scx.flags & SCX_TASK_STATE_MASK; +} + +static void scx_set_task_state(struct task_struct *p, u32 state) +{ + u32 prev_state = scx_get_task_state(p); + bool warn = false; + + switch (state) { + case SCX_TASK_NONE: + warn = prev_state == SCX_TASK_DEAD; + break; + case SCX_TASK_INIT_BEGIN: + warn = prev_state != SCX_TASK_NONE; + break; + case SCX_TASK_INIT: + warn = prev_state != SCX_TASK_INIT_BEGIN; + p->scx.flags |= SCX_TASK_RESET_RUNNABLE_AT; + break; + case SCX_TASK_READY: + warn = !(prev_state == SCX_TASK_INIT || + prev_state == SCX_TASK_ENABLED); + break; + case SCX_TASK_ENABLED: + warn = prev_state != SCX_TASK_READY; + break; + case SCX_TASK_DEAD: + warn = !(prev_state == SCX_TASK_NONE || + prev_state == SCX_TASK_INIT_BEGIN); + break; + default: + WARN_ONCE(1, "sched_ext: Invalid task state %d -> %d for %s[%d]", + prev_state, state, p->comm, p->pid); + return; + } + + WARN_ONCE(warn, "sched_ext: Invalid task state transition 0x%x -> 0x%x for %s[%d]", + prev_state, state, p->comm, p->pid); + + p->scx.flags &= ~SCX_TASK_STATE_MASK; + p->scx.flags |= state; +} + /* * SCX task iterator. */ @@ -766,7 +810,8 @@ static void scx_task_iter_start(struct scx_task_iter *iter, struct cgroup *cgrp) lockdep_assert_held(&cgroup_mutex); iter->cgrp = cgrp; iter->css_pos = css_next_descendant_pre(NULL, &iter->cgrp->self); - css_task_iter_start(iter->css_pos, 0, &iter->css_iter); + css_task_iter_start(iter->css_pos, CSS_TASK_ITER_WITH_DEAD, + &iter->css_iter); return; } #endif @@ -866,7 +911,8 @@ static struct task_struct *scx_task_iter_next(struct scx_task_iter *iter) iter->css_pos = css_next_descendant_pre(iter->css_pos, &iter->cgrp->self); if (iter->css_pos) - css_task_iter_start(iter->css_pos, 0, &iter->css_iter); + css_task_iter_start(iter->css_pos, CSS_TASK_ITER_WITH_DEAD, + &iter->css_iter); } return NULL; } @@ -926,16 +972,27 @@ static struct task_struct *scx_task_iter_next_locked(struct scx_task_iter *iter) * * Test for idle_sched_class as only init_tasks are on it. */ - if (p->sched_class != &idle_sched_class) - break; - } - if (!p) - return NULL; + if (p->sched_class == &idle_sched_class) + continue; - iter->rq = task_rq_lock(p, &iter->rf); - iter->locked_task = p; + iter->rq = task_rq_lock(p, &iter->rf); + iter->locked_task = p; - return p; + /* + * cgroup_task_dead() removes the dead tasks from cset->tasks + * after sched_ext_dead() and cgroup iteration may see tasks + * which already finished sched_ext_dead(). %SCX_TASK_DEAD is + * set by sched_ext_dead() under @p's rq lock. Test it to + * avoid visiting tasks which are already dead from SCX POV. + */ + if (scx_get_task_state(p) == SCX_TASK_DEAD) { + __scx_task_iter_rq_unlock(iter); + continue; + } + + return p; + } + return NULL; } /** @@ -3487,41 +3544,6 @@ static struct cgroup *tg_cgrp(struct task_group *tg) #endif /* CONFIG_EXT_GROUP_SCHED */ -static u32 scx_get_task_state(const struct task_struct *p) -{ - return p->scx.flags & SCX_TASK_STATE_MASK; -} - -static void scx_set_task_state(struct task_struct *p, u32 state) -{ - u32 prev_state = scx_get_task_state(p); - bool warn = false; - - switch (state) { - case SCX_TASK_NONE: - break; - case SCX_TASK_INIT: - warn = prev_state != SCX_TASK_NONE; - break; - case SCX_TASK_READY: - warn = prev_state == SCX_TASK_NONE; - break; - case SCX_TASK_ENABLED: - warn = prev_state != SCX_TASK_READY; - break; - default: - WARN_ONCE(1, "sched_ext: Invalid task state %d -> %d for %s[%d]", - prev_state, state, p->comm, p->pid); - return; - } - - WARN_ONCE(warn, "sched_ext: Invalid task state transition 0x%x -> 0x%x for %s[%d]", - prev_state, state, p->comm, p->pid); - - p->scx.flags &= ~SCX_TASK_STATE_MASK; - p->scx.flags |= state; -} - static int __scx_init_task(struct scx_sched *sch, struct task_struct *p, bool fork) { int ret; @@ -3573,22 +3595,6 @@ static int __scx_init_task(struct scx_sched *sch, struct task_struct *p, bool fo return 0; } -static int scx_init_task(struct scx_sched *sch, struct task_struct *p, bool fork) -{ - int ret; - - ret = __scx_init_task(sch, p, fork); - if (!ret) { - /* - * While @p's rq is not locked. @p is not visible to the rest of - * SCX yet and it's safe to update the flags and state. - */ - p->scx.flags |= SCX_TASK_RESET_RUNNABLE_AT; - scx_set_task_state(p, SCX_TASK_INIT); - } - return ret; -} - static void __scx_enable_task(struct scx_sched *sch, struct task_struct *p) { struct rq *rq = task_rq(p); @@ -3703,7 +3709,8 @@ static void scx_disable_and_exit_task(struct scx_sched *sch, * If set, @p exited between __scx_init_task() and scx_enable_task() in * scx_sub_enable() and is initialized for both the associated sched and * its parent. Exit for the child too - scx_enable_task() never ran for - * it, so undo only init_task. + * it, so undo only init_task. The flag is only set on the sub-enable + * path, so it's always clear when @p arrives here in %SCX_TASK_NONE. */ if (p->scx.flags & SCX_TASK_SUB_INIT) { if (!WARN_ON_ONCE(!scx_enabling_sub_sched)) @@ -3751,10 +3758,14 @@ int scx_fork(struct task_struct *p, struct kernel_clone_args *kargs) #else struct scx_sched *sch = scx_root; #endif - ret = scx_init_task(sch, p, true); - if (!ret) - scx_set_task_sched(p, sch); - return ret; + scx_set_task_state(p, SCX_TASK_INIT_BEGIN); + ret = __scx_init_task(sch, p, true); + if (unlikely(ret)) { + scx_set_task_state(p, SCX_TASK_NONE); + return ret; + } + scx_set_task_state(p, SCX_TASK_INIT); + scx_set_task_sched(p, sch); } return 0; @@ -3848,13 +3859,24 @@ void sched_ext_dead(struct task_struct *p) /* * @p is off scx_tasks and wholly ours. scx_root_enable()'s READY -> * ENABLED transitions can't race us. Disable ops for @p. + * + * %SCX_TASK_DEAD synchronizes against cgroup task iteration - see + * scx_task_iter_next_locked(). NONE tasks need no marking: cgroup + * iteration is only used from sub-sched paths, which require root + * enabled. Root enable transitions every live task to at least READY. + * + * %INIT_BEGIN means ops.init_task() is running for @p. Don't call + * into ops; transition to %DEAD so the post-init recheck unwinds + * via scx_sub_init_cancel_task(). */ if (scx_get_task_state(p) != SCX_TASK_NONE) { struct rq_flags rf; struct rq *rq; rq = task_rq_lock(p, &rf); - scx_disable_and_exit_task(scx_task_sched(p), p); + if (scx_get_task_state(p) != SCX_TASK_INIT_BEGIN) + scx_disable_and_exit_task(scx_task_sched(p), p); + scx_set_task_state(p, SCX_TASK_DEAD); task_rq_unlock(rq, p, &rf); } } @@ -3900,6 +3922,16 @@ static void switched_from_scx(struct rq *rq, struct task_struct *p) if (task_dead_and_done(p)) return; + /* + * %NONE means SCX is no longer tracking @p at the task level (e.g. + * scx_fail_parent() handed @p back to the parent at NONE pending the + * parent's own teardown). There is nothing to disable; calling + * scx_disable_task() would WARN on the non-%ENABLED state and trigger a + * NONE -> READY validation failure. + */ + if (scx_get_task_state(p) == SCX_TASK_NONE) + return; + scx_disable_task(scx_task_sched(p), p); } @@ -4789,6 +4821,8 @@ static void scx_sched_free_rcu_work(struct work_struct *work) kfree(sch->cgrp_path); if (sch_cgroup(sch)) cgroup_put(sch_cgroup(sch)); + if (sch->sub_kset) + kobject_put(&sch->sub_kset->kobj); #endif /* CONFIG_EXT_SUB_SCHED */ for_each_possible_cpu(cpu) { @@ -5566,10 +5600,12 @@ static void refresh_watchdog(void) static s32 scx_link_sched(struct scx_sched *sch) { + const char *err_msg = ""; + s32 ret = 0; + scoped_guard(raw_spinlock_irq, &scx_sched_lock) { #ifdef CONFIG_EXT_SUB_SCHED struct scx_sched *parent = scx_parent(sch); - s32 ret; if (parent) { /* @@ -5579,15 +5615,16 @@ static s32 scx_link_sched(struct scx_sched *sch) * parent can shoot us down. */ if (atomic_read(&parent->exit_kind) != SCX_EXIT_NONE) { - scx_error(sch, "parent disabled"); - return -ENOENT; + err_msg = "parent disabled"; + ret = -ENOENT; + break; } ret = rhashtable_lookup_insert_fast(&scx_sched_hash, &sch->hash_node, scx_sched_hash_params); if (ret) { - scx_error(sch, "failed to insert into scx_sched_hash (%d)", ret); - return ret; + err_msg = "failed to insert into scx_sched_hash"; + break; } list_add_tail(&sch->sibling, &parent->children); @@ -5597,6 +5634,15 @@ static s32 scx_link_sched(struct scx_sched *sch) list_add_tail_rcu(&sch->all, &scx_sched_all); } + /* + * scx_error() takes scx_sched_lock via scx_claim_exit(), so it must run after + * the guard above is released. + */ + if (ret) { + scx_error(sch, "%s (%d)", err_msg, ret); + return ret; + } + refresh_watchdog(); return 0; } @@ -5666,7 +5712,7 @@ static void scx_fail_parent(struct scx_sched *sch, scoped_guard (sched_change, p, DEQUEUE_SAVE | DEQUEUE_MOVE) { scx_disable_and_exit_task(sch, p); - rcu_assign_pointer(p->scx.sched, parent); + scx_set_task_sched(p, parent); } } scx_task_iter_stop(&sti); @@ -5744,6 +5790,21 @@ static void scx_sub_disable(struct scx_sched *sch) } rq = task_rq_lock(p, &rf); + + if (scx_get_task_state(p) == SCX_TASK_DEAD) { + /* + * sched_ext_dead() raced us between __scx_init_task() + * and this rq lock and ran exit_task() on @sch (the + * sched @p was on at that point), not on $parent. + * $parent's just-completed init is owed an exit_task() + * and we issue it here. + */ + scx_sub_init_cancel_task(parent, p); + task_rq_unlock(rq, p, &rf); + put_task_struct(p); + continue; + } + scoped_guard (sched_change, p, DEQUEUE_SAVE | DEQUEUE_MOVE) { /* * $p is initialized for $parent and still attached to @@ -5752,13 +5813,14 @@ static void scx_sub_disable(struct scx_sched *sch) * $p having already been initialized, and then enable. */ scx_disable_and_exit_task(sch, p); + scx_set_task_state(p, SCX_TASK_INIT_BEGIN); scx_set_task_state(p, SCX_TASK_INIT); - rcu_assign_pointer(p->scx.sched, parent); + scx_set_task_sched(p, parent); scx_set_task_state(p, SCX_TASK_READY); scx_enable_task(parent, p); } - task_rq_unlock(rq, p, &rf); + task_rq_unlock(rq, p, &rf); put_task_struct(p); } scx_task_iter_stop(&sti); @@ -5801,7 +5863,7 @@ static void scx_sub_disable(struct scx_sched *sch) if (sch->ops.exit) SCX_CALL_OP(sch, exit, NULL, sch->exit_info); if (sch->sub_kset) - kset_unregister(sch->sub_kset); + kobject_del(&sch->sub_kset->kobj); kobject_del(&sch->kobj); } #else /* CONFIG_EXT_SUB_SCHED */ @@ -5935,7 +5997,7 @@ static void scx_root_disable(struct scx_sched *sch) */ #ifdef CONFIG_EXT_SUB_SCHED if (sch->sub_kset) - kset_unregister(sch->sub_kset); + kobject_del(&sch->sub_kset->kobj); #endif kobject_del(&sch->kobj); @@ -6559,7 +6621,7 @@ static struct scx_sched *scx_alloc_and_add_sched(struct sched_ext_ops *ops, sch->slice_dfl = SCX_SLICE_DFL; atomic_set(&sch->exit_kind, SCX_EXIT_NONE); - init_irq_work(&sch->disable_irq_work, scx_disable_irq_workfn); + sch->disable_irq_work = IRQ_WORK_INIT_HARD(scx_disable_irq_workfn); kthread_init_work(&sch->disable_work, scx_disable_workfn); timer_setup(&sch->bypass_lb_timer, scx_bypass_lb_timerfn, 0); @@ -6575,6 +6637,7 @@ static struct scx_sched *scx_alloc_and_add_sched(struct sched_ext_ops *ops, rcu_assign_pointer(ops->priv, sch); sch->kobj.kset = scx_kset; + INIT_LIST_HEAD(&sch->all); #ifdef CONFIG_EXT_SUB_SCHED char *buf = kzalloc(PATH_MAX, GFP_KERNEL); @@ -6602,6 +6665,7 @@ static struct scx_sched *scx_alloc_and_add_sched(struct sched_ext_ops *ops, ret = kobject_init_and_add(&sch->kobj, &scx_ktype, NULL, "root"); if (ret < 0) { + RCU_INIT_POINTER(ops->priv, NULL); kobject_put(&sch->kobj); return ERR_PTR(ret); } @@ -6609,6 +6673,7 @@ static struct scx_sched *scx_alloc_and_add_sched(struct sched_ext_ops *ops, if (ops->sub_attach) { sch->sub_kset = kset_create_and_add("sub", NULL, &sch->kobj); if (!sch->sub_kset) { + RCU_INIT_POINTER(ops->priv, NULL); kobject_put(&sch->kobj); return ERR_PTR(-ENOMEM); } @@ -6616,14 +6681,18 @@ static struct scx_sched *scx_alloc_and_add_sched(struct sched_ext_ops *ops, #else /* CONFIG_EXT_SUB_SCHED */ ret = kobject_init_and_add(&sch->kobj, &scx_ktype, NULL, "root"); if (ret < 0) { + RCU_INIT_POINTER(ops->priv, NULL); kobject_put(&sch->kobj); return ERR_PTR(ret); } #endif /* CONFIG_EXT_SUB_SCHED */ return sch; +#ifdef CONFIG_EXT_SUB_SCHED err_free_lb_resched: + RCU_INIT_POINTER(ops->priv, NULL); free_cpumask_var(sch->bypass_lb_resched_cpumask); +#endif err_free_lb_cpumask: free_cpumask_var(sch->bypass_lb_donee_cpumask); err_stop_helper: @@ -6733,6 +6802,19 @@ static void scx_root_enable_workfn(struct kthread_work *work) goto err_unlock; } + /* + * @ops->priv binds @ops to its scx_sched instance. It is set here by + * scx_alloc_and_add_sched() and cleared at the tail of bpf_scx_unreg(), + * which runs after scx_root_disable() has dropped scx_enable_mutex. If + * it's still non-NULL here, a previous attachment on @ops has not + * finished tearing down; proceeding would let the in-flight unreg's + * RCU_INIT_POINTER(NULL) clobber the @ops->priv we are about to assign. + */ + if (rcu_access_pointer(ops->priv)) { + ret = -EBUSY; + goto err_unlock; + } + ret = alloc_kick_syncs(); if (ret) goto err_unlock; @@ -6855,6 +6937,9 @@ static void scx_root_enable_workfn(struct kthread_work *work) scx_task_iter_start(&sti, NULL); while ((p = scx_task_iter_next_locked(&sti))) { + struct rq_flags rf; + struct rq *rq; + /* * @p may already be dead, have lost all its usages counts and * be waiting for RCU grace period before being freed. @p can't @@ -6863,20 +6948,47 @@ static void scx_root_enable_workfn(struct kthread_work *work) if (!tryget_task_struct(p)) continue; + /* + * Set %INIT_BEGIN under the iter's rq lock so that a concurrent + * sched_ext_dead() does not call ops.exit_task() on @p while + * ops.init_task() is running. If sched_ext_dead() runs before + * this store, it has already removed @p from scx_tasks and the + * iter won't visit @p; if it runs after, it observes + * %INIT_BEGIN and transitions to %DEAD without calling ops, + * leaving the post-init recheck below to unwind. + */ + scx_set_task_state(p, SCX_TASK_INIT_BEGIN); scx_task_iter_unlock(&sti); - ret = scx_init_task(sch, p, false); - if (ret) { - put_task_struct(p); + ret = __scx_init_task(sch, p, false); + + rq = task_rq_lock(p, &rf); + + if (unlikely(ret)) { + if (scx_get_task_state(p) != SCX_TASK_DEAD) + scx_set_task_state(p, SCX_TASK_NONE); + task_rq_unlock(rq, p, &rf); scx_task_iter_stop(&sti); scx_error(sch, "ops.init_task() failed (%d) for %s[%d]", ret, p->comm, p->pid); + put_task_struct(p); goto err_disable_unlock_all; } - scx_set_task_sched(p, sch); - scx_set_task_state(p, SCX_TASK_READY); + if (scx_get_task_state(p) == SCX_TASK_DEAD) { + /* + * sched_ext_dead() observed %INIT_BEGIN and set %DEAD. + * ops.exit_task() is owed to the sched __scx_init_task() + * ran against; call it now. + */ + scx_sub_init_cancel_task(sch, p); + } else { + scx_set_task_state(p, SCX_TASK_INIT); + scx_set_task_sched(p, sch); + scx_set_task_state(p, SCX_TASK_READY); + } + task_rq_unlock(rq, p, &rf); put_task_struct(p); } scx_task_iter_stop(&sti); @@ -7020,6 +7132,12 @@ static void scx_sub_enable_workfn(struct kthread_work *work) goto out_unlock; } + /* See scx_root_enable_workfn() for the @ops->priv check. */ + if (rcu_access_pointer(ops->priv)) { + ret = -EBUSY; + goto out_unlock; + } + cgrp = cgroup_get_from_id(ops->sub_cgroup_id); if (IS_ERR(cgrp)) { ret = PTR_ERR(cgrp); @@ -7146,6 +7264,21 @@ static void scx_sub_enable_workfn(struct kthread_work *work) goto abort; rq = task_rq_lock(p, &rf); + + if (scx_get_task_state(p) == SCX_TASK_DEAD) { + /* + * sched_ext_dead() raced us between __scx_init_task() + * and this rq lock and ran exit_task() on $parent (the + * sched @p was on at that point), not on @sch. @sch's + * just-completed init is owed an exit_task() and we + * issue it here. + */ + scx_sub_init_cancel_task(sch, p); + task_rq_unlock(rq, p, &rf); + put_task_struct(p); + continue; + } + p->scx.flags |= SCX_TASK_SUB_INIT; task_rq_unlock(rq, p, &rf); @@ -7180,7 +7313,7 @@ static void scx_sub_enable_workfn(struct kthread_work *work) * $p is now only initialized for @sch and READY, which * is what we want. Assign it to @sch and enable. */ - rcu_assign_pointer(p->scx.sched, sch); + scx_set_task_sched(p, sch); scx_enable_task(sch, p); p->scx.flags &= ~SCX_TASK_SUB_INIT; @@ -7282,8 +7415,7 @@ static s32 scx_enable(struct sched_ext_ops *ops, struct bpf_link *link) static DEFINE_MUTEX(helper_mutex); struct scx_enable_cmd cmd; - if (!cpumask_equal(housekeeping_cpumask(HK_TYPE_DOMAIN), - cpu_possible_mask)) { + if (housekeeping_enabled(HK_TYPE_DOMAIN_BOOT)) { pr_err("sched_ext: Not compatible with \"isolcpus=\" domain isolation\n"); return -EINVAL; } diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c index 7468560a6d8041..6e1980763270df 100644 --- a/kernel/sched/ext_idle.c +++ b/kernel/sched/ext_idle.c @@ -465,12 +465,6 @@ s32 scx_select_cpu_dfl(struct task_struct *p, s32 prev_cpu, u64 wake_flags, preempt_disable(); - /* - * Check whether @prev_cpu is still within the allowed set. If not, - * we can still try selecting a nearby CPU. - */ - is_prev_allowed = cpumask_test_cpu(prev_cpu, allowed); - /* * Determine the subset of CPUs usable by @p within @cpus_allowed. */ @@ -487,6 +481,12 @@ s32 scx_select_cpu_dfl(struct task_struct *p, s32 prev_cpu, u64 wake_flags, } } + /* + * Check whether @prev_cpu is still within the allowed set. If not, + * we can still try selecting a nearby CPU. + */ + is_prev_allowed = cpumask_test_cpu(prev_cpu, allowed); + /* * This is necessary to protect llc_cpus. */ diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 728965851842e1..3ebec186f98236 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -882,11 +882,11 @@ bool update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se) * * lag_i >= 0 -> V >= v_i * - * \Sum (v_i - v)*w_i - * V = ------------------ + v + * \Sum (v_i - v0)*w_i + * V = ------------------- + v0 * \Sum w_i * - * lag_i >= 0 -> \Sum (v_i - v)*w_i >= (v_i - v)*(\Sum w_i) + * lag_i >= 0 -> \Sum (v_i - v0)*w_i >= (v_i - v0)*(\Sum w_i) * * Note: using 'avg_vruntime() > se->vruntime' is inaccurate due * to the loss in precision caused by the division. @@ -894,7 +894,7 @@ bool update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se) static int vruntime_eligible(struct cfs_rq *cfs_rq, u64 vruntime) { struct sched_entity *curr = cfs_rq->curr; - s64 avg = cfs_rq->sum_w_vruntime; + s64 key, avg = cfs_rq->sum_w_vruntime; long load = cfs_rq->sum_weight; if (curr && curr->on_rq) { @@ -904,7 +904,36 @@ static int vruntime_eligible(struct cfs_rq *cfs_rq, u64 vruntime) load += weight; } - return avg >= vruntime_op(vruntime, "-", cfs_rq->zero_vruntime) * load; + key = vruntime_op(vruntime, "-", cfs_rq->zero_vruntime); + + /* + * The worst case term for @key includes 'NSEC_TICK * NICE_0_LOAD' + * and @load obviously includes NICE_0_LOAD. NSEC_TICK is around 24 + * bits, while NICE_0_LOAD is 20 on 64bit and 10 otherwise. + * + * This gives that on 64bit the product will be at least 64bit which + * overflows s64, while on 32bit it will only be 44bits and should fit + * comfortably. + */ +#ifdef CONFIG_64BIT +#ifdef CONFIG_ARCH_SUPPORTS_INT128 + /* This often results in simpler code than __builtin_mul_overflow(). */ + return avg >= (__int128)key * load; +#else + s64 rhs; + /* + * On overflow, the sign of key tells us the correct answer: a large + * positive key means vruntime >> V, so not eligible; a large negative + * key means vruntime << V, so eligible. + */ + if (check_mul_overflow(key, load, &rhs)) + return key <= 0; + + return avg >= rhs; +#endif +#else /* 32bit */ + return avg >= key * load; +#endif } int entity_eligible(struct cfs_rq *cfs_rq, struct sched_entity *se) @@ -9145,9 +9174,10 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f /* * Because p is enqueued, nse being null can only mean that we - * dequeued a delayed task. + * dequeued a delayed task. If there are still entities queued in + * cfs, check if the next one will be p. */ - if (!nse) + if (!nse && cfs_rq->nr_queued) goto pick; if (sched_feat(RUN_TO_PARITY)) diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c index 62344560372526..226a6329f3e928 100644 --- a/kernel/sched/membarrier.c +++ b/kernel/sched/membarrier.c @@ -199,7 +199,16 @@ static void ipi_rseq(void *info) * is negligible. */ smp_mb(); - rseq_sched_switch_event(current); + /* + * Legacy mode requires that IDs are written and the critical section is + * evaluated. V2 optimized mode handles the critical section and IDs are + * only updated if they change as a consequence of preemption after + * return from this IPI. + */ + if (rseq_v2(current)) + rseq_sched_switch_event(current); + else + rseq_force_update(); } static void ipi_sync_rq_state(void *info) diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c index 155eeaea411337..1d0d3a4058d597 100644 --- a/kernel/time/timer_migration.c +++ b/kernel/time/timer_migration.c @@ -1860,19 +1860,37 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node, * child to the new parents. So tmigr_active_up() activates the * new parents while walking up from the old root to the new. * - * * It is ensured that @start is active, as this setup path is - * executed in hotplug prepare callback. This is executed by an - * already connected and !idle CPU. Even if all other CPUs go idle, - * the CPU executing the setup will be responsible up to current top - * level group. And the next time it goes inactive, it will release - * the new childmask and parent to subsequent walkers through this - * @child. Therefore propagate active state unconditionally. + * * It is ensured that @start is active, (or on the way to be activated + * by another CPU that woke up before the current one) as this setup path + * is executed in hotplug prepare callback. This is executed by an already + * connected and !idle CPU in the hierarchy. + * + * * The below RmW atomic operation ensures that: + * + * 1) If the old root has been completely activated, the latest state is + * acquired (the below implicit acquire pairs with the implicit release + * from cmpxchg() in tmigr_active_up()). + * + * 2) If the old root is still on the way to be activated, the lagging behind + * CPU performing the activation will acquire the links up to the new root. + * (The below implicit release pairs with the implicit acquire from cmpxchg() + * in tmigr_active_up()). + * + * 3) Every subsequent CPU below the old root will acquire the new links while + * walking through the old root (The below implicit release pairs with the + * implicit acquire from cmpxchg() in either tmigr_active_up()) or + * tmigr_inactive_up(). */ - state.state = atomic_read(&start->migr_state); - WARN_ON_ONCE(!state.active); + state.state = atomic_fetch_or(0, &start->migr_state); WARN_ON_ONCE(!start->parent); - data.childmask = start->groupmask; - __walk_groups_from(tmigr_active_up, &data, start, start->parent); + /* + * If the state of the old root is inactive, another CPU is on its way to activate + * it and propagate to the new root. + */ + if (state.active) { + data.childmask = start->groupmask; + __walk_groups_from(tmigr_active_up, &data, start, start->parent); + } } /* Root update */ diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile index 1decdce8cbef25..9b0834134cae24 100644 --- a/kernel/trace/Makefile +++ b/kernel/trace/Makefile @@ -143,8 +143,8 @@ obj-$(CONFIG_TRACE_REMOTE_TEST) += remote_test.o targets += undefsyms_base.o KASAN_SANITIZE_undefsyms_base.o := y -UNDEFINED_ALLOWLIST = __asan __gcov __kasan __kcsan __hwasan __sancov __sanitizer __tsan __ubsan __x86_indirect_thunk \ - __msan simple_ring_buffer \ +UNDEFINED_ALLOWLIST = __asan __gcov __kasan __kcsan __hwasan __sancov __sanitizer __tsan __ubsan __msan \ + __aeabi_unwind_cpp __s390_indirect_jump __x86_indirect_thunk simple_ring_buffer \ $(shell $(NM) -u $(obj)/undefsyms_base.o 2>/dev/null | awk '{print $$2}') quiet_cmd_check_undefined = NM $< diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index af7079aa0f36d9..a02bd258677ee1 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -2384,7 +2384,8 @@ static void bpf_kprobe_multi_link_release(struct bpf_link *link) struct bpf_kprobe_multi_link *kmulti_link; kmulti_link = container_of(link, struct bpf_kprobe_multi_link, link); - unregister_fprobe(&kmulti_link->fp); + /* Don't wait for RCU GP here. */ + unregister_fprobe_async(&kmulti_link->fp); kprobe_multi_put_modules(kmulti_link->mods, kmulti_link->mods_cnt); } diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c index cc49ebd2a7737b..f378613ad1209c 100644 --- a/kernel/trace/fprobe.c +++ b/kernel/trace/fprobe.c @@ -1093,14 +1093,15 @@ static int unregister_fprobe_nolock(struct fprobe *fp) } /** - * unregister_fprobe() - Unregister fprobe. + * unregister_fprobe_async() - Unregister fprobe without RCU GP wait * @fp: A fprobe data structure to be unregistered. * * Unregister fprobe (and remove ftrace hooks from the function entries). + * This function will NOT wait until the fprobe is no longer used. * * Return 0 if @fp is unregistered successfully, -errno if not. */ -int unregister_fprobe(struct fprobe *fp) +int unregister_fprobe_async(struct fprobe *fp) { guard(mutex)(&fprobe_mutex); if (!fp || !fprobe_registered(fp)) @@ -1108,6 +1109,24 @@ int unregister_fprobe(struct fprobe *fp) return unregister_fprobe_nolock(fp); } + +/** + * unregister_fprobe() - Unregister fprobe with RCU GP wait + * @fp: A fprobe data structure to be unregistered. + * + * Unregister fprobe (and remove ftrace hooks from the function entries). + * This function will block until the fprobe is no longer used. + * + * Return 0 if @fp is unregistered successfully, -errno if not. + */ +int unregister_fprobe(struct fprobe *fp) +{ + int ret = unregister_fprobe_async(fp); + + if (!ret) + synchronize_rcu(); + return ret; +} EXPORT_SYMBOL_GPL(unregister_fprobe); static int __init fprobe_initcall(void) diff --git a/kernel/trace/remote_test.c b/kernel/trace/remote_test.c index 6c1b7701ddae82..a3e2c9b606eb1f 100644 --- a/kernel/trace/remote_test.c +++ b/kernel/trace/remote_test.c @@ -110,9 +110,9 @@ static struct trace_buffer_desc *remote_test_load(unsigned long size, void *unus return remote_test_buffer_desc; err_unload: - for_each_ring_buffer_desc(rb_desc, cpu, remote_test_buffer_desc) + for_each_ring_buffer_desc(rb_desc, cpu, desc) remote_test_unload_simple_rb(rb_desc->cpu); - trace_remote_free_buffer(remote_test_buffer_desc); + trace_remote_free_buffer(desc); err_free_desc: kfree(desc); diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 5f747f241a5f1e..33b721a9af0223 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -2296,6 +2296,18 @@ static void __queue_work(int cpu, struct workqueue_struct *wq, if (unlikely(wq->flags & (__WQ_DESTROYING | __WQ_DRAINING) && WARN_ONCE(!is_chained_work(wq), "workqueue: cannot queue %ps on wq %s\n", work->func, wq->name))) { + struct work_offq_data offqd; + + /* + * State on entry: PENDING is set, work is off-queue (no + * insert_work() has run). + * + * Returning without clearing PENDING would leave the work + * in a weird state (PENDING=1, PWQ=0, entry empty) + */ + work_offqd_unpack(&offqd, *work_data_bits(work)); + set_work_pool_and_clear_pending(work, offqd.pool_id, + work_offqd_pack_flags(&offqd)); return; } rcu_read_lock(); @@ -5642,7 +5654,9 @@ static int alloc_and_link_pwqs(struct workqueue_struct *wq) ret = apply_workqueue_attrs_locked(wq, unbound_std_wq_attrs[highpri]); } - return ret; + if (ret) + goto enomem; + return 0; enomem: if (wq->cpu_pwq) { @@ -5906,6 +5920,21 @@ static struct workqueue_struct *__alloc_workqueue(const char *fmt, return NULL; } +__printf(1, 0) +static struct workqueue_struct *alloc_workqueue_va(const char *fmt, + unsigned int flags, + int max_active, + va_list args) +{ + struct workqueue_struct *wq; + + wq = __alloc_workqueue(fmt, flags, max_active, args); + if (wq) + wq_init_lockdep(wq); + + return wq; +} + __printf(1, 4) struct workqueue_struct *alloc_workqueue_noprof(const char *fmt, unsigned int flags, @@ -5915,12 +5944,8 @@ struct workqueue_struct *alloc_workqueue_noprof(const char *fmt, va_list args; va_start(args, max_active); - wq = __alloc_workqueue(fmt, flags, max_active, args); + wq = alloc_workqueue_va(fmt, flags, max_active, args); va_end(args); - if (!wq) - return NULL; - - wq_init_lockdep(wq); return wq; } @@ -5932,15 +5957,15 @@ static void devm_workqueue_release(void *res) } __printf(2, 5) struct workqueue_struct * -devm_alloc_workqueue(struct device *dev, const char *fmt, unsigned int flags, - int max_active, ...) +devm_alloc_workqueue_noprof(struct device *dev, const char *fmt, + unsigned int flags, int max_active, ...) { struct workqueue_struct *wq; va_list args; int ret; va_start(args, max_active); - wq = alloc_workqueue(fmt, flags, max_active, args); + wq = alloc_workqueue_va(fmt, flags, max_active, args); va_end(args); if (!wq) return NULL; @@ -5951,7 +5976,7 @@ devm_alloc_workqueue(struct device *dev, const char *fmt, unsigned int flags, return wq; } -EXPORT_SYMBOL_GPL(devm_alloc_workqueue); +EXPORT_SYMBOL_GPL(devm_alloc_workqueue_noprof); #ifdef CONFIG_LOCKDEP __printf(1, 5) diff --git a/lib/fonts/font_rotate.c b/lib/fonts/font_rotate.c index 065e0fc0667baf..275406008823b6 100644 --- a/lib/fonts/font_rotate.c +++ b/lib/fonts/font_rotate.c @@ -106,7 +106,7 @@ static void __font_glyph_rotate_180(const unsigned char *glyph, for (y = 0; y < height; y++) { for (x = 0; x < width; x++) { if (font_glyph_test_bit(glyph, x, y, bit_pitch)) { - font_glyph_set_bit(out, width - (1 + x + shift), height - (1 + y), + font_glyph_set_bit(out, bit_pitch - 1 - x - shift, height - 1 - y, bit_pitch); } } diff --git a/lib/kunit/Kconfig b/lib/kunit/Kconfig index 498cc51e493dc9..94ff8e4089bfbd 100644 --- a/lib/kunit/Kconfig +++ b/lib/kunit/Kconfig @@ -16,8 +16,9 @@ menuconfig KUNIT if KUNIT config KUNIT_DEBUGFS - bool "KUnit - Enable /sys/kernel/debug/kunit debugfs representation" if !KUNIT_ALL_TESTS - default KUNIT_ALL_TESTS + bool "KUnit - Enable /sys/kernel/debug/kunit debugfs representation" + depends on DEBUG_FS + default y help Enable debugfs representation for kunit. Currently this consists of /sys/kernel/debug/kunit//results files for each diff --git a/lib/rhashtable.c b/lib/rhashtable.c index 7a67ef5b67b666..04b3a808fca9f2 100644 --- a/lib/rhashtable.c +++ b/lib/rhashtable.c @@ -114,6 +114,14 @@ static void bucket_table_free(const struct bucket_table *tbl) kvfree(tbl); } +static void bucket_table_free_atomic(const struct bucket_table *tbl) +{ + if (tbl->nest) + nested_bucket_table_free(tbl); + + kvfree_atomic(tbl); +} + static void bucket_table_free_rcu(struct rcu_head *head) { bucket_table_free(container_of(head, struct bucket_table, rcu)); @@ -496,7 +504,7 @@ static int rhashtable_insert_rehash(struct rhashtable *ht, err = rhashtable_rehash_attach(ht, tbl, new_tbl); if (err) { - bucket_table_free(new_tbl); + bucket_table_free_atomic(new_tbl); if (err == -EEXIST) err = 0; } else @@ -1166,6 +1174,11 @@ static void rhashtable_free_one(struct rhashtable *ht, struct rhash_head *obj, * This function will eventually sleep to wait for an async resize * to complete. The caller is responsible that no further write operations * occurs in parallel. + * + * After cancel_work_sync() has returned, the deferred rehash worker is + * quiesced and, per the contract above, no other concurrent access to the + * rhashtable is possible. The tables are therefore owned exclusively by + * this function and can be walked without ht->mutex held. */ void rhashtable_free_and_destroy(struct rhashtable *ht, void (*free_fn)(void *ptr, void *arg), @@ -1177,8 +1190,15 @@ void rhashtable_free_and_destroy(struct rhashtable *ht, irq_work_sync(&ht->run_irq_work); cancel_work_sync(&ht->run_work); - mutex_lock(&ht->mutex); - tbl = rht_dereference(ht->tbl, ht); + /* + * Do NOT take ht->mutex here. The rehash worker establishes + * ht->mutex -> fs_reclaim via GFP_KERNEL bucket allocation under + * the mutex; callers on the reclaim path (e.g. simple_xattr_ht_free() + * from evict() under the dcache shrinker for shmem/kernfs/pidfs + * inodes) would otherwise close a circular dependency + * fs_reclaim -> ht->mutex. + */ + tbl = rcu_dereference_raw(ht->tbl); restart: if (free_fn) { for (i = 0; i < tbl->size; i++) { @@ -1187,22 +1207,21 @@ void rhashtable_free_and_destroy(struct rhashtable *ht, cond_resched(); for (pos = rht_ptr_exclusive(rht_bucket(tbl, i)), next = !rht_is_a_nulls(pos) ? - rht_dereference(pos->next, ht) : NULL; + rcu_dereference_raw(pos->next) : NULL; !rht_is_a_nulls(pos); pos = next, next = !rht_is_a_nulls(pos) ? - rht_dereference(pos->next, ht) : NULL) + rcu_dereference_raw(pos->next) : NULL) rhashtable_free_one(ht, pos, free_fn, arg); } } - next_tbl = rht_dereference(tbl->future_tbl, ht); + next_tbl = rcu_dereference_raw(tbl->future_tbl); bucket_table_free(tbl); if (next_tbl) { tbl = next_tbl; goto restart; } - mutex_unlock(&ht->mutex); } EXPORT_SYMBOL_GPL(rhashtable_free_and_destroy); diff --git a/lib/tests/test_kprobes.c b/lib/tests/test_kprobes.c index b7582010125c3f..06e729e4de0516 100644 --- a/lib/tests/test_kprobes.c +++ b/lib/tests/test_kprobes.c @@ -12,6 +12,12 @@ #define div_factor 3 +#define KP_CLEAR(_kp) \ +do { \ + (_kp).addr = NULL; \ + (_kp).flags = 0; \ +} while (0) + static u32 rand1, preh_val, posth_val; static u32 (*target)(u32 value); static u32 (*recursed_target)(u32 value); @@ -125,10 +131,6 @@ static void test_kprobes(struct kunit *test) current_test = test; - /* addr and flags should be cleard for reusing kprobe. */ - kp.addr = NULL; - kp.flags = 0; - KUNIT_EXPECT_EQ(test, 0, register_kprobes(kps, 2)); preh_val = 0; posth_val = 0; @@ -226,9 +228,6 @@ static void test_kretprobes(struct kunit *test) struct kretprobe *rps[2] = {&rp, &rp2}; current_test = test; - /* addr and flags should be cleard for reusing kprobe. */ - rp.kp.addr = NULL; - rp.kp.flags = 0; KUNIT_EXPECT_EQ(test, 0, register_kretprobes(rps, 2)); krph_val = 0; @@ -290,8 +289,6 @@ static void test_stacktrace_on_kretprobe(struct kunit *test) unsigned long myretaddr = (unsigned long)__builtin_return_address(0); current_test = test; - rp3.kp.addr = NULL; - rp3.kp.flags = 0; /* * Run the stacktrace_driver() to record correct return address in @@ -352,8 +349,6 @@ static void test_stacktrace_on_nested_kretprobe(struct kunit *test) struct kretprobe *rps[2] = {&rp3, &rp4}; current_test = test; - rp3.kp.addr = NULL; - rp3.kp.flags = 0; //KUNIT_ASSERT_NE(test, myretaddr, stacktrace_driver()); @@ -367,6 +362,18 @@ static void test_stacktrace_on_nested_kretprobe(struct kunit *test) static int kprobes_test_init(struct kunit *test) { + KP_CLEAR(kp); + KP_CLEAR(kp2); + KP_CLEAR(kp_missed); +#ifdef CONFIG_KRETPROBES + KP_CLEAR(rp.kp); + KP_CLEAR(rp2.kp); +#ifdef CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE + KP_CLEAR(rp3.kp); + KP_CLEAR(rp4.kp); +#endif +#endif + target = kprobe_target; target2 = kprobe_target2; recursed_target = kprobe_recursed_target; diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c index a5798bd26d20b1..da224011fafd93 100644 --- a/lib/vdso/gettimeofday.c +++ b/lib/vdso/gettimeofday.c @@ -248,11 +248,10 @@ bool do_aux(const struct vdso_time_data *vd, clockid_t clock, struct __kernel_ti vc = &vd->aux_clock_data[idx]; do { - if (vdso_read_begin_timens(vc, &seq)) { + while (vdso_read_begin_timens(vc, &seq)) { + /* Re-read from the real time data page, reload seq by looping */ vd = __arch_get_vdso_u_timens_data(vd); vc = &vd->aux_clock_data[idx]; - /* Re-read from the real time data page */ - continue; } /* Auxclock disabled? */ diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c index b02b503c750df0..59de210bee5f9e 100644 --- a/mm/memfd_luo.c +++ b/mm/memfd_luo.c @@ -50,6 +50,11 @@ * memfds are always opened with ``O_RDWR`` and ``O_LARGEFILE``. This property * is maintained. * + * Seals + * File seals set on the memfd are preserved and re-applied on restore. + * Only seals known to this LUO version (see ``MEMFD_LUO_ALL_SEALS``) may + * be present; preservation fails with ``-EOPNOTSUPP`` otherwise. + * * Non-Preserved Properties * ======================== * @@ -61,10 +66,6 @@ * A memfd can be created with the ``MFD_CLOEXEC`` flag that sets the * ``FD_CLOEXEC`` on the file. This flag is not preserved and must be set * again after restore via ``fcntl()``. - * - * Seals - * File seals are not preserved. The file is unsealed on restore and if - * needed, must be sealed again via ``fcntl()``. */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt @@ -259,7 +260,7 @@ static int memfd_luo_preserve(struct liveupdate_file_op_args *args) struct inode *inode = file_inode(args->file); struct memfd_luo_folio_ser *folios_ser; struct memfd_luo_ser *ser; - u64 nr_folios; + u64 nr_folios, inode_size; int err = 0, seals; inode_lock(inode); @@ -285,7 +286,18 @@ static int memfd_luo_preserve(struct liveupdate_file_op_args *args) } ser->pos = args->file->f_pos; - ser->size = i_size_read(inode); + inode_size = i_size_read(inode); + + /* + * memfd_pin_folios() caps at UINT_MAX folios; refuse larger + * files to avoid silently preserving only a prefix. + */ + if (DIV_ROUND_UP_ULL(inode_size, PAGE_SIZE) > UINT_MAX) { + err = -EFBIG; + goto err_free_ser; + } + + ser->size = inode_size; ser->seals = seals; err = memfd_luo_preserve_folios(args->file, &ser->folios, @@ -427,6 +439,7 @@ static int memfd_luo_retrieve_folios(struct file *file, if (!folio) { pr_err("Unable to restore folio at physical address: %llx\n", phys); + err = -EIO; goto put_folios; } index = pfolio->index; diff --git a/mm/slub.c b/mm/slub.c index 0baa906f39ab84..8f900453672968 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -6882,6 +6882,22 @@ void kvfree(const void *addr) } EXPORT_SYMBOL(kvfree); +/** + * kvfree_atomic() - Free memory. + * @addr: Pointer to allocated memory. + * + * Same as kvfree(), but uses vfree_atomic() for vmalloc + * backed memory. Must not be called from NMI context. + */ +void kvfree_atomic(const void *addr) +{ + if (is_vmalloc_addr(addr)) + vfree_atomic(addr); + else + kfree(addr); +} +EXPORT_SYMBOL(kvfree_atomic); + /** * kvfree_sensitive - Free a data object containing sensitive information. * @addr: address of the data object to be freed. diff --git a/net/atm/signaling.c b/net/atm/signaling.c index 358fbe5e4d1d06..b991d937205af3 100644 --- a/net/atm/signaling.c +++ b/net/atm/signaling.c @@ -179,6 +179,7 @@ static int sigd_send(struct atm_vcc *vcc, struct sk_buff *skb) break; default: pr_alert("bad message type %d\n", (int)msg->type); + dev_kfree_skb(skb); /* Paired with find_get_vcc(msg->vcc) above */ sock_put(sk); return -EINVAL; diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c index f28e9cbf8ad5f2..74ef7dc2b2f981 100644 --- a/net/batman-adv/bat_iv_ogm.c +++ b/net/batman-adv/bat_iv_ogm.c @@ -173,19 +173,12 @@ batadv_iv_ogm_orig_get(struct batadv_priv *bat_priv, const u8 *addr) static struct batadv_neigh_node * batadv_iv_ogm_neigh_new(struct batadv_hard_iface *hard_iface, const u8 *neigh_addr, - struct batadv_orig_node *orig_node, - struct batadv_orig_node *orig_neigh) + struct batadv_orig_node *orig_node) { struct batadv_neigh_node *neigh_node; neigh_node = batadv_neigh_node_get_or_create(orig_node, hard_iface, neigh_addr); - if (!neigh_node) - goto out; - - neigh_node->orig_node = orig_neigh; - -out: return neigh_node; } @@ -335,7 +328,7 @@ static void batadv_iv_ogm_send_to_if(struct batadv_forw_packet *forw_packet, struct batadv_priv *bat_priv = netdev_priv(hard_iface->mesh_iface); const char *fwd_str; u8 packet_num; - s16 buff_pos; + int buff_pos; struct batadv_ogm_packet *batadv_ogm_packet; struct sk_buff *skb; u8 *packet_pos; @@ -906,6 +899,31 @@ static u8 batadv_iv_orig_ifinfo_sum(struct batadv_orig_node *orig_node, return sum; } +/** + * batadv_iv_ogm_neigh_ifinfo_sum() - Get bcast_own sum for a last-hop neighbor + * @bat_priv: the bat priv with all the mesh interface information + * @neigh_node: last-hop neighbor of an originator + * + * Return: Number of replied (rebroadcasted) OGMs for the originator currently + * announced by the neighbor. Returns 0 if the neighbor's originator entry is + * not available anymore. + */ +static u8 batadv_iv_ogm_neigh_ifinfo_sum(struct batadv_priv *bat_priv, + const struct batadv_neigh_node *neigh_node) +{ + struct batadv_orig_node *orig_neigh; + u8 sum; + + orig_neigh = batadv_orig_hash_find(bat_priv, neigh_node->addr); + if (!orig_neigh) + return 0; + + sum = batadv_iv_orig_ifinfo_sum(orig_neigh, neigh_node->if_incoming); + batadv_orig_node_put(orig_neigh); + + return sum; +} + /** * batadv_iv_ogm_orig_update() - use OGM to update corresponding data in an * originator @@ -975,17 +993,9 @@ batadv_iv_ogm_orig_update(struct batadv_priv *bat_priv, } if (!neigh_node) { - struct batadv_orig_node *orig_tmp; - - orig_tmp = batadv_iv_ogm_orig_get(bat_priv, ethhdr->h_source); - if (!orig_tmp) - goto unlock; - neigh_node = batadv_iv_ogm_neigh_new(if_incoming, ethhdr->h_source, - orig_node, orig_tmp); - - batadv_orig_node_put(orig_tmp); + orig_node); if (!neigh_node) goto unlock; } else { @@ -1037,10 +1047,9 @@ batadv_iv_ogm_orig_update(struct batadv_priv *bat_priv, */ if (router_ifinfo && neigh_ifinfo->bat_iv.tq_avg == router_ifinfo->bat_iv.tq_avg) { - sum_orig = batadv_iv_orig_ifinfo_sum(router->orig_node, - router->if_incoming); - sum_neigh = batadv_iv_orig_ifinfo_sum(neigh_node->orig_node, - neigh_node->if_incoming); + sum_orig = batadv_iv_ogm_neigh_ifinfo_sum(bat_priv, router); + sum_neigh = batadv_iv_ogm_neigh_ifinfo_sum(bat_priv, + neigh_node); if (sum_orig >= sum_neigh) goto out; } @@ -1106,7 +1115,6 @@ static bool batadv_iv_ogm_calc_tq(struct batadv_orig_node *orig_node, if (!neigh_node) neigh_node = batadv_iv_ogm_neigh_new(if_incoming, orig_neigh_node->orig, - orig_neigh_node, orig_neigh_node); if (!neigh_node) @@ -1302,6 +1310,32 @@ batadv_iv_ogm_update_seqnos(const struct ethhdr *ethhdr, return ret; } +/** + * batadv_orig_to_direct_router() - get direct next hop neighbor to an orig address + * @bat_priv: the bat priv with all the mesh interface information + * @orig_addr: the originator MAC address to search the best next hop router for + * @if_outgoing: the interface where the OGM should be sent to + * + * Return: A neighbor node which is the best router towards the given originator + * address. Bonding candidates are ignored. + */ +static struct batadv_neigh_node * +batadv_orig_to_direct_router(struct batadv_priv *bat_priv, u8 *orig_addr, + struct batadv_hard_iface *if_outgoing) +{ + struct batadv_neigh_node *neigh_node; + struct batadv_orig_node *orig_node; + + orig_node = batadv_orig_hash_find(bat_priv, orig_addr); + if (!orig_node) + return NULL; + + neigh_node = batadv_orig_router_get(orig_node, if_outgoing); + batadv_orig_node_put(orig_node); + + return neigh_node; +} + /** * batadv_iv_ogm_process_per_outif() - process a batman iv OGM for an outgoing * interface @@ -1372,8 +1406,9 @@ batadv_iv_ogm_process_per_outif(const struct sk_buff *skb, int ogm_offset, router = batadv_orig_router_get(orig_node, if_outgoing); if (router) { - router_router = batadv_orig_router_get(router->orig_node, - if_outgoing); + router_router = batadv_orig_to_direct_router(bat_priv, + router->addr, + if_outgoing); router_ifinfo = batadv_neigh_ifinfo_get(router, if_outgoing); } diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c index 51fe028b90881e..cec11f1251d66a 100644 --- a/net/batman-adv/bridge_loop_avoidance.c +++ b/net/batman-adv/bridge_loop_avoidance.c @@ -318,8 +318,8 @@ batadv_bla_del_backbone_claims(struct batadv_bla_backbone_gw *backbone_gw) if (claim->backbone_gw != backbone_gw) continue; - batadv_claim_put(claim); hlist_del_rcu(&claim->hash_entry); + batadv_claim_put(claim); } spin_unlock_bh(list_lock); } @@ -723,6 +723,7 @@ static void batadv_bla_add_claim(struct batadv_priv *bat_priv, if (unlikely(hash_added != 0)) { /* only local changes happened. */ + batadv_backbone_gw_put(backbone_gw); kfree(claim); return; } @@ -1288,6 +1289,13 @@ static void batadv_bla_purge_claims(struct batadv_priv *bat_priv, rcu_read_lock(); hlist_for_each_entry_rcu(claim, head, hash_entry) { + /* only purge claims not currently in the process of being released. + * Such claims could otherwise have a NULL-ptr backbone_gw set because + * they already went through batadv_claim_release() + */ + if (!kref_get_unless_zero(&claim->refcount)) + continue; + backbone_gw = batadv_bla_claim_get_backbone_gw(claim); if (now) goto purge_now; @@ -1313,6 +1321,7 @@ static void batadv_bla_purge_claims(struct batadv_priv *bat_priv, claim->addr, claim->vid); skip: batadv_backbone_gw_put(backbone_gw); + batadv_claim_put(claim); } rcu_read_unlock(); } diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c index 3a35aadd8b4191..a4d33ee0fda59e 100644 --- a/net/batman-adv/main.c +++ b/net/batman-adv/main.c @@ -249,6 +249,7 @@ void batadv_mesh_free(struct net_device *mesh_iface) atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING); batadv_purge_outstanding_packets(bat_priv, NULL); + batadv_tp_stop_all(bat_priv); batadv_gw_node_free(bat_priv); diff --git a/net/batman-adv/tp_meter.c b/net/batman-adv/tp_meter.c index 2e42f6b348c83d..066c76113fc433 100644 --- a/net/batman-adv/tp_meter.c +++ b/net/batman-adv/tp_meter.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include @@ -365,23 +366,38 @@ static void batadv_tp_vars_put(struct batadv_tp_vars *tp_vars) } /** - * batadv_tp_sender_cleanup() - cleanup sender data and drop and timer - * @bat_priv: the bat priv with all the mesh interface information - * @tp_vars: the private data of the current TP meter session to cleanup + * batadv_tp_list_detach() - remove tp session from mesh session list once + * @tp_vars: the private data of the current TP meter session */ -static void batadv_tp_sender_cleanup(struct batadv_priv *bat_priv, - struct batadv_tp_vars *tp_vars) +static void batadv_tp_list_detach(struct batadv_tp_vars *tp_vars) { - cancel_delayed_work(&tp_vars->finish_work); + bool detached = false; spin_lock_bh(&tp_vars->bat_priv->tp_list_lock); - hlist_del_rcu(&tp_vars->list); + if (!hlist_unhashed(&tp_vars->list)) { + hlist_del_init_rcu(&tp_vars->list); + detached = true; + } spin_unlock_bh(&tp_vars->bat_priv->tp_list_lock); + if (!detached) + return; + + atomic_dec(&tp_vars->bat_priv->tp_num); + /* drop list reference */ batadv_tp_vars_put(tp_vars); +} - atomic_dec(&tp_vars->bat_priv->tp_num); +/** + * batadv_tp_sender_cleanup() - cleanup sender data and drop and timer + * @tp_vars: the private data of the current TP meter session to cleanup + */ +static void batadv_tp_sender_cleanup(struct batadv_tp_vars *tp_vars) +{ + cancel_delayed_work_sync(&tp_vars->finish_work); + + batadv_tp_list_detach(tp_vars); /* kill the timer and remove its reference */ timer_delete_sync(&tp_vars->timer); @@ -886,7 +902,8 @@ static int batadv_tp_send(void *arg) batadv_orig_node_put(orig_node); batadv_tp_sender_end(bat_priv, tp_vars); - batadv_tp_sender_cleanup(bat_priv, tp_vars); + batadv_tp_sender_cleanup(tp_vars); + complete(&tp_vars->finished); batadv_tp_vars_put(tp_vars); @@ -918,7 +935,8 @@ static void batadv_tp_start_kthread(struct batadv_tp_vars *tp_vars) batadv_tp_vars_put(tp_vars); /* cleanup of failed tp meter variables */ - batadv_tp_sender_cleanup(bat_priv, tp_vars); + batadv_tp_sender_cleanup(tp_vars); + complete(&tp_vars->finished); return; } @@ -947,6 +965,13 @@ void batadv_tp_start(struct batadv_priv *bat_priv, const u8 *dst, /* look for an already existing test towards this node */ spin_lock_bh(&bat_priv->tp_list_lock); + if (atomic_read(&bat_priv->mesh_state) != BATADV_MESH_ACTIVE) { + spin_unlock_bh(&bat_priv->tp_list_lock); + batadv_tp_batctl_error_notify(BATADV_TP_REASON_DST_UNREACHABLE, + dst, bat_priv, session_cookie); + return; + } + tp_vars = batadv_tp_list_find(bat_priv, dst); if (tp_vars) { spin_unlock_bh(&bat_priv->tp_list_lock); @@ -969,6 +994,7 @@ void batadv_tp_start(struct batadv_priv *bat_priv, const u8 *dst, tp_vars = kmalloc_obj(*tp_vars, GFP_ATOMIC); if (!tp_vars) { + atomic_dec(&bat_priv->tp_num); spin_unlock_bh(&bat_priv->tp_list_lock); batadv_dbg(BATADV_DBG_TP_METER, bat_priv, "Meter: %s cannot allocate list elements\n", @@ -1017,6 +1043,7 @@ void batadv_tp_start(struct batadv_priv *bat_priv, const u8 *dst, tp_vars->start_time = jiffies; init_waitqueue_head(&tp_vars->more_bytes); + init_completion(&tp_vars->finished); spin_lock_init(&tp_vars->unacked_lock); INIT_LIST_HEAD(&tp_vars->unacked_list); @@ -1119,14 +1146,7 @@ static void batadv_tp_receiver_shutdown(struct timer_list *t) "Shutting down for inactivity (more than %dms) from %pM\n", BATADV_TP_RECV_TIMEOUT, tp_vars->other_end); - spin_lock_bh(&tp_vars->bat_priv->tp_list_lock); - hlist_del_rcu(&tp_vars->list); - spin_unlock_bh(&tp_vars->bat_priv->tp_list_lock); - - /* drop list reference */ - batadv_tp_vars_put(tp_vars); - - atomic_dec(&bat_priv->tp_num); + batadv_tp_list_detach(tp_vars); spin_lock_bh(&tp_vars->unacked_lock); list_for_each_entry_safe(un, safe, &tp_vars->unacked_list, list) { @@ -1329,9 +1349,12 @@ static struct batadv_tp_vars * batadv_tp_init_recv(struct batadv_priv *bat_priv, const struct batadv_icmp_tp_packet *icmp) { - struct batadv_tp_vars *tp_vars; + struct batadv_tp_vars *tp_vars = NULL; spin_lock_bh(&bat_priv->tp_list_lock); + if (atomic_read(&bat_priv->mesh_state) != BATADV_MESH_ACTIVE) + goto out_unlock; + tp_vars = batadv_tp_list_find_session(bat_priv, icmp->orig, icmp->session); if (tp_vars) @@ -1344,8 +1367,10 @@ batadv_tp_init_recv(struct batadv_priv *bat_priv, } tp_vars = kmalloc_obj(*tp_vars, GFP_ATOMIC); - if (!tp_vars) + if (!tp_vars) { + atomic_dec(&bat_priv->tp_num); goto out_unlock; + } ether_addr_copy(tp_vars->other_end, icmp->orig); tp_vars->role = BATADV_TP_RECEIVER; @@ -1464,6 +1489,9 @@ void batadv_tp_meter_recv(struct batadv_priv *bat_priv, struct sk_buff *skb) { struct batadv_icmp_tp_packet *icmp; + if (atomic_read(&bat_priv->mesh_state) != BATADV_MESH_ACTIVE) + goto out; + icmp = (struct batadv_icmp_tp_packet *)skb->data; switch (icmp->subtype) { @@ -1478,9 +1506,57 @@ void batadv_tp_meter_recv(struct batadv_priv *bat_priv, struct sk_buff *skb) "Received unknown TP Metric packet type %u\n", icmp->subtype); } + +out: consume_skb(skb); } +/** + * batadv_tp_stop_all() - stop all currently running tp meter sessions + * @bat_priv: the bat priv with all the mesh interface information + */ +void batadv_tp_stop_all(struct batadv_priv *bat_priv) +{ + struct batadv_tp_vars *tp_vars[BATADV_TP_MAX_NUM]; + struct batadv_tp_vars *tp_var; + size_t count = 0; + size_t i; + + spin_lock_bh(&bat_priv->tp_list_lock); + hlist_for_each_entry(tp_var, &bat_priv->tp_list, list) { + if (WARN_ON_ONCE(count >= BATADV_TP_MAX_NUM)) + break; + + if (!kref_get_unless_zero(&tp_var->refcount)) + continue; + + tp_vars[count++] = tp_var; + } + spin_unlock_bh(&bat_priv->tp_list_lock); + + for (i = 0; i < count; i++) { + tp_var = tp_vars[i]; + + switch (tp_var->role) { + case BATADV_TP_SENDER: + batadv_tp_sender_shutdown(tp_var, + BATADV_TP_REASON_CANCEL); + wake_up(&tp_var->more_bytes); + wait_for_completion(&tp_var->finished); + break; + case BATADV_TP_RECEIVER: + batadv_tp_list_detach(tp_var); + if (timer_shutdown_sync(&tp_var->timer)) + batadv_tp_vars_put(tp_var); + break; + } + + batadv_tp_vars_put(tp_var); + } + + synchronize_net(); +} + /** * batadv_tp_meter_init() - initialize global tp_meter structures */ diff --git a/net/batman-adv/tp_meter.h b/net/batman-adv/tp_meter.h index f0046d366eac65..4e97cd10cd0259 100644 --- a/net/batman-adv/tp_meter.h +++ b/net/batman-adv/tp_meter.h @@ -17,6 +17,7 @@ void batadv_tp_start(struct batadv_priv *bat_priv, const u8 *dst, u32 test_length, u32 *cookie); void batadv_tp_stop(struct batadv_priv *bat_priv, const u8 *dst, u8 return_value); +void batadv_tp_stop_all(struct batadv_priv *bat_priv); void batadv_tp_meter_recv(struct batadv_priv *bat_priv, struct sk_buff *skb); #endif /* _NET_BATMAN_ADV_TP_METER_H_ */ diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h index 8fc5fe0e9b0539..daa06f42115429 100644 --- a/net/batman-adv/types.h +++ b/net/batman-adv/types.h @@ -14,6 +14,7 @@ #include #include #include +#include #include #include #include @@ -1328,6 +1329,9 @@ struct batadv_tp_vars { /** @finish_work: work item for the finishing procedure */ struct delayed_work finish_work; + /** @finished: completion signaled when a sender thread exits */ + struct completion finished; + /** @test_length: test length in milliseconds */ u32 test_length; diff --git a/net/bluetooth/bnep/core.c b/net/bluetooth/bnep/core.c index d44987d4515c0b..853c8d7644b558 100644 --- a/net/bluetooth/bnep/core.c +++ b/net/bluetooth/bnep/core.c @@ -330,11 +330,18 @@ static int bnep_rx_frame(struct bnep_session *s, struct sk_buff *skb) goto badframe; break; case BNEP_FILTER_MULTI_ADDR_SET: - case BNEP_FILTER_NET_TYPE_SET: - /* Pull: ctrl type (1 b), len (2 b), data (len bytes) */ - if (!skb_pull(skb, 3 + *(u16 *)(skb->data + 1) * 2)) + case BNEP_FILTER_NET_TYPE_SET: { + u8 *hdr; + + /* Pull ctrl type (1 b) + len (2 b) */ + hdr = skb_pull_data(skb, 3); + if (!hdr) + goto badframe; + /* Pull data (len bytes); length is big-endian */ + if (!skb_pull(skb, get_unaligned_be16(&hdr[1]))) goto badframe; break; + } default: kfree_skb(skb); return 0; diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c index 3a059259908615..17b46ad6a34964 100644 --- a/net/bluetooth/hci_conn.c +++ b/net/bluetooth/hci_conn.c @@ -480,40 +480,107 @@ bool hci_setup_sync(struct hci_conn *conn, __u16 handle) return hci_setup_sync_conn(conn, handle); } -u8 hci_le_conn_update(struct hci_conn *conn, u16 min, u16 max, u16 latency, - u16 to_multiplier) +struct le_conn_update_data { + struct hci_conn *conn; + u16 min; + u16 max; + u16 latency; + u16 to_multiplier; +}; + +static int le_conn_update_sync(struct hci_dev *hdev, void *data) { - struct hci_dev *hdev = conn->hdev; + struct le_conn_update_data *d = data; + struct hci_conn *conn = d->conn; struct hci_conn_params *params; struct hci_cp_le_conn_update cp; + u16 timeout; + u8 store_hint; + int err; + /* Verify connection is still alive and read conn fields under + * the same lock to prevent a concurrent disconnect from freeing + * or reusing the connection while we build the HCI command. + */ hci_dev_lock(hdev); - params = hci_conn_params_lookup(hdev, &conn->dst, conn->dst_type); - if (params) { - params->conn_min_interval = min; - params->conn_max_interval = max; - params->conn_latency = latency; - params->supervision_timeout = to_multiplier; + if (!hci_conn_valid(hdev, conn)) { + hci_dev_unlock(hdev); + return -ECANCELED; } - hci_dev_unlock(hdev); - memset(&cp, 0, sizeof(cp)); cp.handle = cpu_to_le16(conn->handle); - cp.conn_interval_min = cpu_to_le16(min); - cp.conn_interval_max = cpu_to_le16(max); - cp.conn_latency = cpu_to_le16(latency); - cp.supervision_timeout = cpu_to_le16(to_multiplier); + cp.conn_interval_min = cpu_to_le16(d->min); + cp.conn_interval_max = cpu_to_le16(d->max); + cp.conn_latency = cpu_to_le16(d->latency); + cp.supervision_timeout = cpu_to_le16(d->to_multiplier); cp.min_ce_len = cpu_to_le16(0x0000); cp.max_ce_len = cpu_to_le16(0x0000); + timeout = conn->conn_timeout; + + hci_dev_unlock(hdev); - hci_send_cmd(hdev, HCI_OP_LE_CONN_UPDATE, sizeof(cp), &cp); + err = __hci_cmd_sync_status_sk(hdev, HCI_OP_LE_CONN_UPDATE, + sizeof(cp), &cp, + HCI_EV_LE_CONN_UPDATE_COMPLETE, + timeout, NULL); + if (err) + return err; + + /* Update stored connection parameters after the controller has + * confirmed the update via the LE Connection Update Complete event. + */ + hci_dev_lock(hdev); + + params = hci_conn_params_lookup(hdev, &conn->dst, conn->dst_type); + if (params) { + params->conn_min_interval = d->min; + params->conn_max_interval = d->max; + params->conn_latency = d->latency; + params->supervision_timeout = d->to_multiplier; + store_hint = 0x01; + } else { + store_hint = 0x00; + } - if (params) - return 0x01; + hci_dev_unlock(hdev); - return 0x00; + mgmt_new_conn_param(hdev, &conn->dst, conn->dst_type, store_hint, + d->min, d->max, d->latency, d->to_multiplier); + + return 0; +} + +static void le_conn_update_complete(struct hci_dev *hdev, void *data, int err) +{ + struct le_conn_update_data *d = data; + + hci_conn_put(d->conn); + kfree(d); +} + +void hci_le_conn_update(struct hci_conn *conn, u16 min, u16 max, u16 latency, + u16 to_multiplier) +{ + struct le_conn_update_data *d; + + d = kzalloc_obj(*d); + if (!d) + return; + + hci_conn_get(conn); + d->conn = conn; + d->min = min; + d->max = max; + d->latency = latency; + d->to_multiplier = to_multiplier; + + if (hci_cmd_sync_queue(conn->hdev, le_conn_update_sync, d, + le_conn_update_complete) < 0) { + hci_conn_put(conn); + kfree(d); + } } void hci_le_start_enc(struct hci_conn *conn, __le16 ediv, __le64 rand, @@ -2130,6 +2197,9 @@ static int create_big_sync(struct hci_dev *hdev, void *data) u32 flags = 0; int err; + if (!hci_conn_valid(hdev, conn)) + return -ECANCELED; + if (qos->bcast.out.phys == BIT(1)) flags |= MGMT_ADV_FLAG_SEC_2M; @@ -2204,11 +2274,24 @@ static void create_big_complete(struct hci_dev *hdev, void *data, int err) bt_dev_dbg(hdev, "conn %p", conn); + if (err == -ECANCELED) + goto done; + + hci_dev_lock(hdev); + + if (!hci_conn_valid(hdev, conn)) + goto unlock; + if (err) { bt_dev_err(hdev, "Unable to create BIG: %d", err); hci_connect_cfm(conn, err); hci_conn_del(conn); } + +unlock: + hci_dev_unlock(hdev); +done: + hci_conn_put(conn); } struct hci_conn *hci_bind_bis(struct hci_dev *hdev, bdaddr_t *dst, __u8 sid, @@ -2336,10 +2419,11 @@ struct hci_conn *hci_connect_bis(struct hci_dev *hdev, bdaddr_t *dst, BT_BOUND, &data); /* Queue start periodic advertising and create BIG */ - err = hci_cmd_sync_queue(hdev, create_big_sync, conn, + err = hci_cmd_sync_queue(hdev, create_big_sync, hci_conn_get(conn), create_big_complete); if (err < 0) { hci_conn_drop(conn); + hci_conn_put(conn); return ERR_PTR(err); } diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index b2ee6b6a0f5650..eea2f810aafab3 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -7118,9 +7118,29 @@ static void hci_le_create_big_complete_evt(struct hci_dev *hdev, void *data, continue; } + if (ev->num_bis <= i) { + bt_dev_err(hdev, + "Not enough BIS handles for BIG 0x%2.2x", + ev->handle); + ev->status = HCI_ERROR_UNSPECIFIED; + hci_connect_cfm(conn, ev->status); + hci_conn_del(conn); + continue; + } + if (hci_conn_set_handle(conn, - __le16_to_cpu(ev->bis_handle[i++]))) + __le16_to_cpu(ev->bis_handle[i++]))) { + bt_dev_err(hdev, + "Failed to set BIS handle for BIG 0x%2.2x", + ev->handle); + /* Force error so BIG gets terminated as not all BIS + * could be connected. + */ + ev->status = HCI_ERROR_UNSPECIFIED; + hci_connect_cfm(conn, ev->status); + hci_conn_del(conn); continue; + } conn->state = BT_CONNECTED; set_bit(HCI_CONN_BIG_CREATED, &conn->flags); @@ -7129,7 +7149,10 @@ static void hci_le_create_big_complete_evt(struct hci_dev *hdev, void *data, hci_iso_setup_path(conn); } - if (!ev->status && !i) + /* If there is an unexpected error or if no BISes have been connected + * for the BIG, terminate it. + */ + if (ev->status == HCI_ERROR_UNSPECIFIED || (!ev->status && !i)) /* If no BISes have been connected for the BIG, * terminate. This is in case all bound connections * have been closed before the BIG creation @@ -7168,7 +7191,7 @@ static void hci_le_big_sync_established_evt(struct hci_dev *hdev, void *data, clear_bit(HCI_CONN_CREATE_BIG_SYNC, &conn->flags); conn->num_bis = 0; - memset(conn->bis, 0, sizeof(conn->num_bis)); + memset(conn->bis, 0, sizeof(conn->bis)); for (i = 0; i < ev->num_bis; i++) { u16 handle = le16_to_cpu(ev->bis[i]); diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c index 7bcf8c5ceaeedc..976f91eeb745e4 100644 --- a/net/bluetooth/hidp/core.c +++ b/net/bluetooth/hidp/core.c @@ -1035,6 +1035,28 @@ static struct hidp_session *hidp_session_find(const bdaddr_t *bdaddr) return session; } +/* + * Consume session->conn: clear the member under hidp_session_sem, then + * l2cap_unregister_user() and l2cap_conn_put() the snapshot outside the + * sem. At most one caller wins; later callers see NULL and skip. The + * reference is the one hidp_session_new() took via l2cap_conn_get(). + */ +static void hidp_session_unregister_conn(struct hidp_session *session) +{ + struct l2cap_conn *conn; + + down_write(&hidp_session_sem); + conn = session->conn; + if (conn) + session->conn = NULL; + up_write(&hidp_session_sem); + + if (conn) { + l2cap_unregister_user(conn, &session->user); + l2cap_conn_put(conn); + } +} + /* * Start session synchronously * This starts a session thread and waits until initialization @@ -1311,8 +1333,7 @@ static int hidp_session_thread(void *arg) * Instead, this call has the same semantics as if user-space tried to * delete the session. */ - if (session->conn) - l2cap_unregister_user(session->conn, &session->user); + hidp_session_unregister_conn(session); hidp_session_put(session); @@ -1418,7 +1439,7 @@ int hidp_connection_del(struct hidp_conndel_req *req) HIDP_CTRL_VIRTUAL_CABLE_UNPLUG, NULL, 0); else - l2cap_unregister_user(session->conn, &session->user); + hidp_session_unregister_conn(session); hidp_session_put(session); diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c index be145e2736b783..7cb2864fe87241 100644 --- a/net/bluetooth/iso.c +++ b/net/bluetooth/iso.c @@ -347,6 +347,7 @@ static int iso_connect_bis(struct sock *sk) return -EHOSTUNREACH; hci_dev_lock(hdev); + lock_sock(sk); if (!bis_capable(hdev)) { err = -EOPNOTSUPP; @@ -399,13 +400,9 @@ static int iso_connect_bis(struct sock *sk) goto unlock; } - lock_sock(sk); - err = iso_chan_add(conn, sk, NULL); - if (err) { - release_sock(sk); + if (err) goto unlock; - } /* Update source addr of the socket */ bacpy(&iso_pi(sk)->src, &hcon->src); @@ -421,9 +418,8 @@ static int iso_connect_bis(struct sock *sk) iso_sock_set_timer(sk, READ_ONCE(sk->sk_sndtimeo)); } - release_sock(sk); - unlock: + release_sock(sk); hci_dev_unlock(hdev); hci_dev_put(hdev); return err; @@ -444,6 +440,7 @@ static int iso_connect_cis(struct sock *sk) return -EHOSTUNREACH; hci_dev_lock(hdev); + lock_sock(sk); if (!cis_central_capable(hdev)) { err = -EOPNOTSUPP; @@ -498,13 +495,9 @@ static int iso_connect_cis(struct sock *sk) goto unlock; } - lock_sock(sk); - err = iso_chan_add(conn, sk, NULL); - if (err) { - release_sock(sk); + if (err) goto unlock; - } /* Update source addr of the socket */ bacpy(&iso_pi(sk)->src, &hcon->src); @@ -520,9 +513,8 @@ static int iso_connect_cis(struct sock *sk) iso_sock_set_timer(sk, READ_ONCE(sk->sk_sndtimeo)); } - release_sock(sk); - unlock: + release_sock(sk); hci_dev_unlock(hdev); hci_dev_put(hdev); return err; @@ -1193,7 +1185,7 @@ static int iso_sock_connect(struct socket *sock, struct sockaddr_unsized *addr, release_sock(sk); - if (bacmp(&iso_pi(sk)->dst, BDADDR_ANY)) + if (bacmp(&sa->iso_bdaddr, BDADDR_ANY)) err = iso_connect_cis(sk); else err = iso_connect_bis(sk); @@ -2256,8 +2248,10 @@ int iso_connect_ind(struct hci_dev *hdev, bdaddr_t *bdaddr, __u8 *flags) sk = iso_get_sock(hdev, &hdev->bdaddr, bdaddr, BT_LISTEN, iso_match_sid, ev1); if (sk && !ev1->status) { + lock_sock(sk); iso_pi(sk)->sync_handle = le16_to_cpu(ev1->handle); iso_pi(sk)->bc_sid = ev1->sid; + release_sock(sk); } goto done; @@ -2268,8 +2262,10 @@ int iso_connect_ind(struct hci_dev *hdev, bdaddr_t *bdaddr, __u8 *flags) sk = iso_get_sock(hdev, &hdev->bdaddr, bdaddr, BT_LISTEN, iso_match_sid_past, ev1a); if (sk && !ev1a->status) { + lock_sock(sk); iso_pi(sk)->sync_handle = le16_to_cpu(ev1a->sync_handle); iso_pi(sk)->bc_sid = ev1a->sid; + release_sock(sk); } goto done; @@ -2296,27 +2292,35 @@ int iso_connect_ind(struct hci_dev *hdev, bdaddr_t *bdaddr, __u8 *flags) ev2); if (sk) { - int err; - struct hci_conn *hcon = iso_pi(sk)->conn->hcon; + int err = 0; + bool big_sync; + struct hci_conn *hcon; + lock_sock(sk); + + hcon = iso_pi(sk)->conn->hcon; iso_pi(sk)->qos.bcast.encryption = ev2->encryption; if (ev2->num_bis < iso_pi(sk)->bc_num_bis) iso_pi(sk)->bc_num_bis = ev2->num_bis; - if (!test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags) && - !test_and_set_bit(BT_SK_BIG_SYNC, &iso_pi(sk)->flags)) { + big_sync = !test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags) && + !test_and_set_bit(BT_SK_BIG_SYNC, &iso_pi(sk)->flags); + + if (big_sync) err = hci_conn_big_create_sync(hdev, hcon, &iso_pi(sk)->qos, iso_pi(sk)->sync_handle, iso_pi(sk)->bc_num_bis, iso_pi(sk)->bc_bis); - if (err) { - bt_dev_err(hdev, "hci_le_big_create_sync: %d", - err); - sock_put(sk); - sk = NULL; - } + + release_sock(sk); + + if (big_sync && err) { + bt_dev_err(hdev, "hci_le_big_create_sync: %d", + err); + sock_put(sk); + sk = NULL; } } @@ -2370,8 +2374,10 @@ int iso_connect_ind(struct hci_dev *hdev, bdaddr_t *bdaddr, __u8 *flags) if (!base || base_len > BASE_MAX_LENGTH) goto done; + lock_sock(sk); memcpy(iso_pi(sk)->base, base, base_len); iso_pi(sk)->base_len = base_len; + release_sock(sk); } else { /* This is a PA data fragment. Keep pa_data_len set to 0 * until all data has been reassembled. diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index 77dec104a9c367..7701528f11677c 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -4706,16 +4706,8 @@ static inline int l2cap_conn_param_update_req(struct l2cap_conn *conn, l2cap_send_cmd(conn, cmd->ident, L2CAP_CONN_PARAM_UPDATE_RSP, sizeof(rsp), &rsp); - if (!err) { - u8 store_hint; - - store_hint = hci_le_conn_update(hcon, min, max, latency, - to_multiplier); - mgmt_new_conn_param(hcon->hdev, &hcon->dst, hcon->dst_type, - store_hint, min, max, latency, - to_multiplier); - - } + if (!err) + hci_le_conn_update(hcon, min, max, latency, to_multiplier); return 0; } @@ -5428,7 +5420,7 @@ static inline int l2cap_ecred_reconf_req(struct l2cap_conn *conn, * configured, the MPS field may be less than the current MPS * of that channel. */ - if (chan[i]->remote_mps >= mps && i) { + if (chan[i]->remote_mps > mps && num_scid > 1) { BT_ERR("chan %p decreased MPS %u -> %u", chan[i], chan[i]->remote_mps, mps); result = L2CAP_RECONF_INVALID_MPS; diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c index 71e8c1b45bcee1..cf590a67d3641c 100644 --- a/net/bluetooth/l2cap_sock.c +++ b/net/bluetooth/l2cap_sock.c @@ -1498,6 +1498,9 @@ static struct l2cap_chan *l2cap_sock_new_connection_cb(struct l2cap_chan *chan) { struct sock *sk, *parent = chan->data; + if (!parent) + return NULL; + lock_sock(parent); /* Check for backlog size */ @@ -1657,6 +1660,9 @@ static void l2cap_sock_state_change_cb(struct l2cap_chan *chan, int state, { struct sock *sk = chan->data; + if (!sk) + return; + sk->sk_state = state; if (err) @@ -1758,6 +1764,9 @@ static long l2cap_sock_get_sndtimeo_cb(struct l2cap_chan *chan) { struct sock *sk = chan->data; + if (!sk) + return 0; + return READ_ONCE(sk->sk_sndtimeo); } diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c index 611a9a94151ecf..d11bd5337d573e 100644 --- a/net/bluetooth/rfcomm/core.c +++ b/net/bluetooth/rfcomm/core.c @@ -1715,9 +1715,12 @@ static int rfcomm_recv_data(struct rfcomm_session *s, u8 dlci, int pf, struct sk } if (pf && d->cfc) { - u8 credits = *(u8 *) skb->data; skb_pull(skb, 1); + u8 *credits = skb_pull_data(skb, 1); - d->tx_credits += credits; + if (!credits) + goto drop; + + d->tx_credits += *credits; if (d->tx_credits) clear_bit(RFCOMM_TX_THROTTLED, &d->flags); } diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c index 18826d4b9c0bf8..eba44525d41d98 100644 --- a/net/bluetooth/sco.c +++ b/net/bluetooth/sco.c @@ -472,9 +472,13 @@ static struct sock *sco_get_sock_listen(bdaddr_t *src) sk1 = sk; } + sk = sk ? sk : sk1; + if (sk) + sock_hold(sk); + read_unlock(&sco_sk_list.lock); - return sk ? sk : sk1; + return sk; } static void sco_sock_destruct(struct sock *sk) @@ -515,11 +519,13 @@ static void sco_sock_kill(struct sock *sk) BT_DBG("sk %p state %d", sk, sk->sk_state); /* Sock is dead, so set conn->sk to NULL to avoid possible UAF */ + lock_sock(sk); if (sco_pi(sk)->conn) { sco_conn_lock(sco_pi(sk)->conn); sco_pi(sk)->conn->sk = NULL; sco_conn_unlock(sco_pi(sk)->conn); } + release_sock(sk); /* Kill poor orphan */ bt_sock_unlink(&sco_sk_list, sk); @@ -1365,40 +1371,51 @@ static int sco_sock_release(struct socket *sock) static void sco_conn_ready(struct sco_conn *conn) { - struct sock *parent; - struct sock *sk = conn->sk; + struct sock *parent, *sk; + + sco_conn_lock(conn); + sk = sco_sock_hold(conn); + sco_conn_unlock(conn); BT_DBG("conn %p", conn); if (sk) { lock_sock(sk); - sco_sock_clear_timer(sk); - sk->sk_state = BT_CONNECTED; - sk->sk_state_change(sk); + + /* conn->sk may have become NULL if racing with sk close, but + * due to held hdev->lock, it can't become different sk. + */ + if (conn->sk) { + sco_sock_clear_timer(sk); + sk->sk_state = BT_CONNECTED; + sk->sk_state_change(sk); + } + release_sock(sk); + sock_put(sk); } else { - sco_conn_lock(conn); - - if (!conn->hcon) { - sco_conn_unlock(conn); + if (!conn->hcon) return; - } + + lockdep_assert_held(&conn->hcon->hdev->lock); parent = sco_get_sock_listen(&conn->hcon->src); - if (!parent) { - sco_conn_unlock(conn); + if (!parent) return; - } lock_sock(parent); + sco_conn_lock(conn); + + /* hdev->lock guarantees conn->sk == NULL still here */ + + if (parent->sk_state != BT_LISTEN) + goto release; + sk = sco_sock_alloc(sock_net(parent), NULL, BTPROTO_SCO, GFP_ATOMIC, 0); - if (!sk) { - release_sock(parent); - sco_conn_unlock(conn); - return; - } + if (!sk) + goto release; sco_sock_init(sk, parent); @@ -1417,9 +1434,10 @@ static void sco_conn_ready(struct sco_conn *conn) /* Wake up parent */ parent->sk_data_ready(parent); - release_sock(parent); - +release: sco_conn_unlock(conn); + release_sock(parent); + sock_put(parent); } } diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c index 74136021955257..f05c79f215ea06 100644 --- a/net/bridge/netfilter/ebtable_broute.c +++ b/net/bridge/netfilter/ebtable_broute.c @@ -112,24 +112,22 @@ static struct pernet_operations broute_net_ops = { static int __init ebtable_broute_init(void) { - int ret = ebt_register_template(&broute_table, broute_table_init); + int ret = register_pernet_subsys(&broute_net_ops); if (ret) return ret; - ret = register_pernet_subsys(&broute_net_ops); - if (ret) { - ebt_unregister_template(&broute_table); - return ret; - } + ret = ebt_register_template(&broute_table, broute_table_init); + if (ret) + unregister_pernet_subsys(&broute_net_ops); - return 0; + return ret; } static void __exit ebtable_broute_fini(void) { - unregister_pernet_subsys(&broute_net_ops); ebt_unregister_template(&broute_table); + unregister_pernet_subsys(&broute_net_ops); } module_init(ebtable_broute_init); diff --git a/net/bridge/netfilter/ebtable_filter.c b/net/bridge/netfilter/ebtable_filter.c index dacd81b12e6264..0fc03b07e62aeb 100644 --- a/net/bridge/netfilter/ebtable_filter.c +++ b/net/bridge/netfilter/ebtable_filter.c @@ -93,24 +93,22 @@ static struct pernet_operations frame_filter_net_ops = { static int __init ebtable_filter_init(void) { - int ret = ebt_register_template(&frame_filter, frame_filter_table_init); + int ret = register_pernet_subsys(&frame_filter_net_ops); if (ret) return ret; - ret = register_pernet_subsys(&frame_filter_net_ops); - if (ret) { - ebt_unregister_template(&frame_filter); - return ret; - } + ret = ebt_register_template(&frame_filter, frame_filter_table_init); + if (ret) + unregister_pernet_subsys(&frame_filter_net_ops); - return 0; + return ret; } static void __exit ebtable_filter_fini(void) { - unregister_pernet_subsys(&frame_filter_net_ops); ebt_unregister_template(&frame_filter); + unregister_pernet_subsys(&frame_filter_net_ops); } module_init(ebtable_filter_init); diff --git a/net/bridge/netfilter/ebtable_nat.c b/net/bridge/netfilter/ebtable_nat.c index 0f2a8c6118d42e..8a10375d890992 100644 --- a/net/bridge/netfilter/ebtable_nat.c +++ b/net/bridge/netfilter/ebtable_nat.c @@ -93,24 +93,22 @@ static struct pernet_operations frame_nat_net_ops = { static int __init ebtable_nat_init(void) { - int ret = ebt_register_template(&frame_nat, frame_nat_table_init); + int ret = register_pernet_subsys(&frame_nat_net_ops); if (ret) return ret; - ret = register_pernet_subsys(&frame_nat_net_ops); - if (ret) { - ebt_unregister_template(&frame_nat); - return ret; - } + ret = ebt_register_template(&frame_nat, frame_nat_table_init); + if (ret) + unregister_pernet_subsys(&frame_nat_net_ops); return ret; } static void __exit ebtable_nat_fini(void) { - unregister_pernet_subsys(&frame_nat_net_ops); ebt_unregister_template(&frame_nat); + unregister_pernet_subsys(&frame_nat_net_ops); } module_init(ebtable_nat_init); diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c index aea3e19875c69d..b9f4daac09af36 100644 --- a/net/bridge/netfilter/ebtables.c +++ b/net/bridge/netfilter/ebtables.c @@ -42,6 +42,7 @@ struct ebt_pernet { struct list_head tables; + struct list_head dead_tables; }; struct ebt_template { @@ -1162,11 +1163,6 @@ static int do_replace(struct net *net, sockptr_t arg, unsigned int len) static void __ebt_unregister_table(struct net *net, struct ebt_table *table) { - mutex_lock(&ebt_mutex); - list_del(&table->list); - mutex_unlock(&ebt_mutex); - audit_log_nfcfg(table->name, AF_BRIDGE, table->private->nentries, - AUDIT_XT_OP_UNREGISTER, GFP_KERNEL); EBT_ENTRY_ITERATE(table->private->entries, table->private->entries_size, ebt_cleanup_entry, net, NULL); if (table->private->nentries) @@ -1267,13 +1263,15 @@ int ebt_register_table(struct net *net, const struct ebt_table *input_table, for (i = 0; i < num_ops; i++) ops[i].priv = table; - list_add(&table->list, &ebt_net->tables); - mutex_unlock(&ebt_mutex); - table->ops = ops; ret = nf_register_net_hooks(net, ops, num_ops); - if (ret) + if (ret) { + synchronize_rcu(); __ebt_unregister_table(net, table); + } else { + list_add(&table->list, &ebt_net->tables); + } + mutex_unlock(&ebt_mutex); audit_log_nfcfg(repl->name, AF_BRIDGE, repl->nentries, AUDIT_XT_OP_REGISTER, GFP_KERNEL); @@ -1339,7 +1337,7 @@ void ebt_unregister_template(const struct ebt_table *t) } EXPORT_SYMBOL(ebt_unregister_template); -static struct ebt_table *__ebt_find_table(struct net *net, const char *name) +void ebt_unregister_table_pre_exit(struct net *net, const char *name) { struct ebt_pernet *ebt_net = net_generic(net, ebt_pernet_id); struct ebt_table *t; @@ -1348,30 +1346,36 @@ static struct ebt_table *__ebt_find_table(struct net *net, const char *name) list_for_each_entry(t, &ebt_net->tables, list) { if (strcmp(t->name, name) == 0) { + list_move(&t->list, &ebt_net->dead_tables); mutex_unlock(&ebt_mutex); - return t; + nf_unregister_net_hooks(net, t->ops, hweight32(t->valid_hooks)); + return; } } mutex_unlock(&ebt_mutex); - return NULL; -} - -void ebt_unregister_table_pre_exit(struct net *net, const char *name) -{ - struct ebt_table *table = __ebt_find_table(net, name); - - if (table) - nf_unregister_net_hooks(net, table->ops, hweight32(table->valid_hooks)); } EXPORT_SYMBOL(ebt_unregister_table_pre_exit); void ebt_unregister_table(struct net *net, const char *name) { - struct ebt_table *table = __ebt_find_table(net, name); + struct ebt_pernet *ebt_net = net_generic(net, ebt_pernet_id); + struct ebt_table *t; - if (table) - __ebt_unregister_table(net, table); + mutex_lock(&ebt_mutex); + + list_for_each_entry(t, &ebt_net->dead_tables, list) { + if (strcmp(t->name, name) == 0) { + list_del(&t->list); + audit_log_nfcfg(t->name, AF_BRIDGE, t->private->nentries, + AUDIT_XT_OP_UNREGISTER, GFP_KERNEL); + __ebt_unregister_table(net, t); + mutex_unlock(&ebt_mutex); + return; + } + } + + mutex_unlock(&ebt_mutex); } /* userspace just supplied us with counters */ @@ -2556,11 +2560,21 @@ static int __net_init ebt_pernet_init(struct net *net) struct ebt_pernet *ebt_net = net_generic(net, ebt_pernet_id); INIT_LIST_HEAD(&ebt_net->tables); + INIT_LIST_HEAD(&ebt_net->dead_tables); return 0; } +static void __net_exit ebt_pernet_exit(struct net *net) +{ + struct ebt_pernet *ebt_net = net_generic(net, ebt_pernet_id); + + WARN_ON_ONCE(!list_empty(&ebt_net->tables)); + WARN_ON_ONCE(!list_empty(&ebt_net->dead_tables)); +} + static struct pernet_operations ebt_net_ops = { .init = ebt_pernet_init, + .exit = ebt_pernet_exit, .id = &ebt_pernet_id, .size = sizeof(struct ebt_pernet), }; @@ -2569,19 +2583,20 @@ static int __init ebtables_init(void) { int ret; - ret = xt_register_target(&ebt_standard_target); + ret = register_pernet_subsys(&ebt_net_ops); if (ret < 0) return ret; - ret = nf_register_sockopt(&ebt_sockopts); + + ret = xt_register_target(&ebt_standard_target); if (ret < 0) { - xt_unregister_target(&ebt_standard_target); + unregister_pernet_subsys(&ebt_net_ops); return ret; } - ret = register_pernet_subsys(&ebt_net_ops); + ret = nf_register_sockopt(&ebt_sockopts); if (ret < 0) { - nf_unregister_sockopt(&ebt_sockopts); xt_unregister_target(&ebt_standard_target); + unregister_pernet_subsys(&ebt_net_ops); return ret; } diff --git a/net/ceph/auth_x.c b/net/ceph/auth_x.c index 692e0b86882238..9e64e82d0b63bf 100644 --- a/net/ceph/auth_x.c +++ b/net/ceph/auth_x.c @@ -115,6 +115,11 @@ static int __ceph_x_decrypt(const struct ceph_crypto_key *key, int usage_slot, if (ret) return ret; + if (plaintext_len < sizeof(*hdr)) { + pr_err("%s plaintext too small %d\n", __func__, plaintext_len); + return -EINVAL; + } + hdr = p + ceph_crypt_data_offset(key); if (le64_to_cpu(hdr->magic) != CEPHX_ENC_MAGIC) { pr_err("%s bad magic\n", __func__); diff --git a/net/ceph/crush/crush.c b/net/ceph/crush/crush.c index 254ded0b05f6a1..521aec1d5fc060 100644 --- a/net/ceph/crush/crush.c +++ b/net/ceph/crush/crush.c @@ -47,7 +47,6 @@ int crush_get_bucket_item_weight(const struct crush_bucket *b, int p) void crush_destroy_bucket_uniform(struct crush_bucket_uniform *b) { kfree(b->h.items); - kfree(b); } void crush_destroy_bucket_list(struct crush_bucket_list *b) @@ -55,14 +54,12 @@ void crush_destroy_bucket_list(struct crush_bucket_list *b) kfree(b->item_weights); kfree(b->sum_weights); kfree(b->h.items); - kfree(b); } void crush_destroy_bucket_tree(struct crush_bucket_tree *b) { kfree(b->h.items); kfree(b->node_weights); - kfree(b); } void crush_destroy_bucket_straw(struct crush_bucket_straw *b) @@ -70,14 +67,12 @@ void crush_destroy_bucket_straw(struct crush_bucket_straw *b) kfree(b->straws); kfree(b->item_weights); kfree(b->h.items); - kfree(b); } void crush_destroy_bucket_straw2(struct crush_bucket_straw2 *b) { kfree(b->item_weights); kfree(b->h.items); - kfree(b); } void crush_destroy_bucket(struct crush_bucket *b) @@ -99,6 +94,7 @@ void crush_destroy_bucket(struct crush_bucket *b) crush_destroy_bucket_straw2((struct crush_bucket_straw2 *)b); break; } + kfree(b); } /** diff --git a/net/ceph/osdmap.c b/net/ceph/osdmap.c index c89e66d4fcb7fe..8b5b0587a0cfa2 100644 --- a/net/ceph/osdmap.c +++ b/net/ceph/osdmap.c @@ -72,8 +72,7 @@ static int crush_decode_uniform_bucket(void **p, void *end, struct crush_bucket_uniform *b) { dout("crush_decode_uniform_bucket %p to %p\n", *p, end); - ceph_decode_need(p, end, (1+b->h.size) * sizeof(u32), bad); - b->item_weight = ceph_decode_32(p); + ceph_decode_32_safe(p, end, b->item_weight, bad); return 0; bad: return -EINVAL; @@ -389,11 +388,15 @@ static int decode_choose_args(void **p, void *end, struct crush_map *c) goto fail; if (arg->ids_size && - arg->ids_size != c->buckets[bucket_index]->size) + (!c->buckets[bucket_index] || + arg->ids_size != c->buckets[bucket_index]->size)) goto e_inval; } - insert_choose_arg_map(&c->choose_args, arg_map); + if (!__insert_choose_arg_map(&c->choose_args, arg_map)) { + ret = -EEXIST; + goto fail; + } } return 0; @@ -516,6 +519,10 @@ static struct crush_map *crush_decode(void *pbyval, void *end) b->id = ceph_decode_32(p); b->type = ceph_decode_16(p); b->alg = ceph_decode_8(p); + if (b->alg != alg) { + b->alg = 0; + goto bad; + } b->hash = ceph_decode_8(p); b->weight = ceph_decode_32(p); b->size = ceph_decode_32(p); @@ -1702,7 +1709,7 @@ static int osdmap_decode(void **p, void *end, bool msgr2, ceph_decode_need(p, end, 3*sizeof(u32) + map->max_osd*(struct_v >= 5 ? sizeof(u32) : sizeof(u8)) + - sizeof(*map->osd_weight), e_inval); + map->max_osd*sizeof(*map->osd_weight), e_inval); if (ceph_decode_32(p) != map->max_osd) goto e_inval; diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c index 14eb7812bda4a9..ecd659f79fd4a0 100644 --- a/net/core/bpf_sk_storage.c +++ b/net/core/bpf_sk_storage.c @@ -172,7 +172,7 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk) struct bpf_map *map; smap = rcu_dereference(SDATA(selem)->smap); - if (!(smap->map.map_flags & BPF_F_CLONE)) + if (!smap || !(smap->map.map_flags & BPF_F_CLONE)) continue; /* Note that for lockless listeners adding new element @@ -531,10 +531,10 @@ bpf_sk_storage_diag_alloc(const struct nlattr *nla_stgs) } EXPORT_SYMBOL_GPL(bpf_sk_storage_diag_alloc); -static int diag_get(struct bpf_local_storage_data *sdata, struct sk_buff *skb) +static int diag_get(struct bpf_local_storage_map *smap, + struct bpf_local_storage_data *sdata, struct sk_buff *skb) { struct nlattr *nla_stg, *nla_value; - struct bpf_local_storage_map *smap; /* It cannot exceed max nlattr's payload */ BUILD_BUG_ON(U16_MAX - NLA_HDRLEN < BPF_LOCAL_STORAGE_MAX_VALUE_SIZE); @@ -543,7 +543,6 @@ static int diag_get(struct bpf_local_storage_data *sdata, struct sk_buff *skb) if (!nla_stg) return -EMSGSIZE; - smap = rcu_dereference(sdata->smap); if (nla_put_u32(skb, SK_DIAG_BPF_STORAGE_MAP_ID, smap->map.id)) goto errout; @@ -558,6 +557,7 @@ static int diag_get(struct bpf_local_storage_data *sdata, struct sk_buff *skb) sdata->data, true); else copy_map_value(&smap->map, nla_data(nla_value), sdata->data); + check_and_init_map_value(&smap->map, nla_data(nla_value)); nla_nest_end(skb, nla_stg); return 0; @@ -596,9 +596,11 @@ static int bpf_sk_storage_diag_put_all(struct sock *sk, struct sk_buff *skb, saved_len = skb->len; hlist_for_each_entry_rcu(selem, &sk_storage->list, snode) { smap = rcu_dereference(SDATA(selem)->smap); + if (!smap) + continue; diag_size += nla_value_size(smap->map.value_size); - if (nla_stgs && diag_get(SDATA(selem), skb)) + if (nla_stgs && diag_get(smap, SDATA(selem), skb)) /* Continue to learn diag_size */ err = -EMSGSIZE; } @@ -665,7 +667,7 @@ int bpf_sk_storage_diag_put(struct bpf_sk_storage_diag *diag, diag_size += nla_value_size(diag->maps[i]->value_size); - if (nla_stgs && diag_get(sdata, skb)) + if (nla_stgs && diag_get((struct bpf_local_storage_map *)diag->maps[i], sdata, skb)) /* Continue to learn diag_size */ err = -EMSGSIZE; } diff --git a/net/core/dev.c b/net/core/dev.c index 06c195906231a3..0c6c270d9f7d11 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -371,7 +371,7 @@ static void netdev_name_node_alt_free(struct rcu_head *head) static void __netdev_name_node_alt_destroy(struct netdev_name_node *name_node) { netdev_name_node_del(name_node); - list_del(&name_node->list); + list_del_rcu(&name_node->list); call_rcu(&name_node->rcu, netdev_name_node_alt_free); } @@ -6862,9 +6862,9 @@ static void skb_defer_free_flush(void) #if defined(CONFIG_NET_RX_BUSY_POLL) -static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule) +static void __busy_poll_stop(struct napi_struct *napi, unsigned long timeout) { - if (!skip_schedule) { + if (!timeout) { gro_normal_list(&napi->gro); __napi_schedule(napi); return; @@ -6874,6 +6874,8 @@ static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule) gro_flush_normal(&napi->gro, HZ >= 1000); clear_bit(NAPI_STATE_SCHED, &napi->state); + hrtimer_start(&napi->timer, ns_to_ktime(timeout), + HRTIMER_MODE_REL_PINNED); } enum { @@ -6885,8 +6887,7 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, unsigned flags, u16 budget) { struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; - bool skip_schedule = false; - unsigned long timeout; + unsigned long timeout = 0; int rc; /* Busy polling means there is a high chance device driver hard irq @@ -6906,10 +6907,12 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, if (flags & NAPI_F_PREFER_BUSY_POLL) { napi->defer_hard_irqs_count = napi_get_defer_hard_irqs(napi); - timeout = napi_get_gro_flush_timeout(napi); - if (napi->defer_hard_irqs_count && timeout) { - hrtimer_start(&napi->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED); - skip_schedule = true; + if (napi->defer_hard_irqs_count) { + /* A short enough gro flush timeout and long enough + * poll can result in timer firing too early. + * Timer will be armed later if necessary. + */ + timeout = napi_get_gro_flush_timeout(napi); } } @@ -6924,7 +6927,7 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, trace_napi_poll(napi, rc, budget); netpoll_poll_unlock(have_poll_lock); if (rc == budget) - __busy_poll_stop(napi, skip_schedule); + __busy_poll_stop(napi, timeout); bpf_net_ctx_clear(bpf_net_ctx); local_bh_enable(); } diff --git a/net/core/failover.c b/net/core/failover.c index 11bb183c7a1bac..e43c59cd686853 100644 --- a/net/core/failover.c +++ b/net/core/failover.c @@ -12,6 +12,7 @@ #include #include #include +#include #include static LIST_HEAD(failover_list); @@ -221,8 +222,11 @@ failover_existing_slave_register(struct net_device *failover_dev) for_each_netdev(net, dev) { if (netif_is_failover(dev)) continue; - if (ether_addr_equal(failover_dev->perm_addr, dev->perm_addr)) + if (ether_addr_equal(failover_dev->perm_addr, dev->perm_addr)) { + netdev_lock_ops(dev); failover_slave_register(dev); + netdev_unlock_ops(dev); + } } rtnl_unlock(); } diff --git a/net/core/filter.c b/net/core/filter.c index 80a3b702a2d4da..9590877b0714f7 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -1654,15 +1654,24 @@ int sk_reuseport_attach_bpf(u32 ufd, struct sock *sk) return err; } +static void sk_reuseport_prog_free_rcu(struct rcu_head *rcu) +{ + struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu); + struct bpf_prog *prog = aux->prog; + + bpf_release_orig_filter(prog); + bpf_prog_free(prog); +} + void sk_reuseport_prog_free(struct bpf_prog *prog) { if (!prog) return; - if (prog->type == BPF_PROG_TYPE_SK_REUSEPORT) - bpf_prog_put(prog); + if (bpf_prog_was_classic(prog)) + call_rcu(&prog->aux->rcu, sk_reuseport_prog_free_rcu); else - bpf_prog_destroy(prog); + bpf_prog_put(prog); } static inline int __bpf_try_make_writable(struct sk_buff *skb, @@ -5481,7 +5490,7 @@ static int sol_tcp_sockopt(struct sock *sk, int optname, char *optval, int *optlen, bool getopt) { - if (sk->sk_protocol != IPPROTO_TCP) + if (!sk_is_tcp(sk)) return -EINVAL; switch (optname) { @@ -5688,6 +5697,30 @@ const struct bpf_func_proto bpf_sk_getsockopt_proto = { .arg5_type = ARG_CONST_SIZE, }; +BPF_CALL_5(bpf_sk_setsockopt_nodelay, struct sock *, sk, int, level, + int, optname, char *, optval, int, optlen) +{ + /* + * TCP_NODELAY triggers tcp_push_pending_frames() and re-enters + * CA_EVENT_TX_START in bpf_tcp_cc. + */ + if (level == SOL_TCP && optname == TCP_NODELAY) + return -EOPNOTSUPP; + + return _bpf_setsockopt(sk, level, optname, optval, optlen); +} + +const struct bpf_func_proto bpf_sk_setsockopt_nodelay_proto = { + .func = bpf_sk_setsockopt_nodelay, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON, + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_ANYTHING, + .arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY, + .arg5_type = ARG_CONST_SIZE, +}; + BPF_CALL_5(bpf_unlocked_sk_setsockopt, struct sock *, sk, int, level, int, optname, char *, optval, int, optlen) { @@ -5833,6 +5866,12 @@ BPF_CALL_5(bpf_sock_ops_setsockopt, struct bpf_sock_ops_kern *, bpf_sock, if (!is_locked_tcp_sock_ops(bpf_sock)) return -EOPNOTSUPP; + /* TCP_NODELAY triggers tcp_push_pending_frames() and re-enters these callbacks. */ + if ((bpf_sock->op == BPF_SOCK_OPS_HDR_OPT_LEN_CB || + bpf_sock->op == BPF_SOCK_OPS_WRITE_HDR_OPT_CB) && + level == SOL_TCP && optname == TCP_NODELAY) + return -EOPNOTSUPP; + return _bpf_setsockopt(bpf_sock->sk, level, optname, optval, optlen); } @@ -6443,6 +6482,8 @@ BPF_CALL_4(bpf_skb_fib_lookup, struct sk_buff *, skb, * against MTU of FIB lookup resulting net_device */ dev = dev_get_by_index_rcu(net, params->ifindex); + if (unlikely(!dev)) + return -ENODEV; if (!is_skb_forwardable(dev, skb)) rc = BPF_FIB_LKUP_RET_FRAG_NEEDED; @@ -7443,7 +7484,7 @@ u32 bpf_tcp_sock_convert_ctx_access(enum bpf_access_type type, BPF_CALL_1(bpf_tcp_sock, struct sock *, sk) { - if (sk_fullsock(sk) && sk->sk_protocol == IPPROTO_TCP) + if (sk_fullsock(sk) && sk_is_tcp(sk)) return (unsigned long)sk; return (unsigned long)NULL; @@ -11915,7 +11956,7 @@ BPF_CALL_1(bpf_skc_to_tcp6_sock, struct sock *, sk) */ BTF_TYPE_EMIT(struct tcp6_sock); if (sk && sk_fullsock(sk) && sk->sk_protocol == IPPROTO_TCP && - sk->sk_family == AF_INET6) + sk->sk_type == SOCK_STREAM && sk->sk_family == AF_INET6) return (unsigned long)sk; return (unsigned long)NULL; @@ -11931,7 +11972,7 @@ const struct bpf_func_proto bpf_skc_to_tcp6_sock_proto = { BPF_CALL_1(bpf_skc_to_tcp_sock, struct sock *, sk) { - if (sk && sk_fullsock(sk) && sk->sk_protocol == IPPROTO_TCP) + if (sk && sk_fullsock(sk) && sk_is_tcp(sk)) return (unsigned long)sk; return (unsigned long)NULL; diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 4381e0fc25bf47..84faace50ac281 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -608,14 +608,16 @@ EXPORT_SYMBOL_GPL(__netpoll_setup); /* * Returns a pointer to a string representation of the identifier used * to select the egress interface for the given netpoll instance. buf - * must be a buffer of length at least MAC_ADDR_STR_LEN + 1. + * is used to format np->dev_mac when np->dev_name is empty; bufsz must + * be at least MAC_ADDR_STR_LEN + 1 to fit the formatted MAC address + * and its NUL terminator. */ -static char *egress_dev(struct netpoll *np, char *buf) +static char *egress_dev(struct netpoll *np, char *buf, size_t bufsz) { if (np->dev_name[0]) return np->dev_name; - snprintf(buf, MAC_ADDR_STR_LEN, "%pM", np->dev_mac); + snprintf(buf, bufsz, "%pM", np->dev_mac); return buf; } @@ -645,7 +647,7 @@ static int netpoll_take_ipv6(struct netpoll *np, struct net_device *ndev) if (!IS_ENABLED(CONFIG_IPV6)) { np_err(np, "IPv6 is not supported %s, aborting\n", - egress_dev(np, buf)); + egress_dev(np, buf, sizeof(buf))); return -EINVAL; } @@ -667,7 +669,7 @@ static int netpoll_take_ipv6(struct netpoll *np, struct net_device *ndev) } if (err) { np_err(np, "no IPv6 address for %s, aborting\n", - egress_dev(np, buf)); + egress_dev(np, buf, sizeof(buf))); return err; } @@ -687,14 +689,14 @@ static int netpoll_take_ipv4(struct netpoll *np, struct net_device *ndev) in_dev = __in_dev_get_rtnl(ndev); if (!in_dev) { np_err(np, "no IP address for %s, aborting\n", - egress_dev(np, buf)); + egress_dev(np, buf, sizeof(buf))); return -EDESTADDRREQ; } ifa = rtnl_dereference(in_dev->ifa_list); if (!ifa) { np_err(np, "no IP address for %s, aborting\n", - egress_dev(np, buf)); + egress_dev(np, buf, sizeof(buf))); return -EDESTADDRREQ; } @@ -736,7 +738,8 @@ int netpoll_setup(struct netpoll *np) ndev = dev_getbyhwaddr(net, ARPHRD_ETHER, np->dev_mac); if (!ndev) { - np_err(np, "%s doesn't exist, aborting\n", egress_dev(np, buf)); + np_err(np, "%s doesn't exist, aborting\n", + egress_dev(np, buf, sizeof(buf))); err = -ENODEV; goto unlock; } @@ -744,14 +747,14 @@ int netpoll_setup(struct netpoll *np) if (netdev_master_upper_dev_get(ndev)) { np_err(np, "%s is a slave device, aborting\n", - egress_dev(np, buf)); + egress_dev(np, buf, sizeof(buf))); err = -EBUSY; goto put; } if (!netif_running(ndev)) { np_info(np, "device %s not up yet, forcing it\n", - egress_dev(np, buf)); + egress_dev(np, buf, sizeof(buf))); err = dev_open(ndev, NULL); if (err) { diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index b613bb6e07df65..df042da422ef31 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1572,6 +1572,7 @@ static noinline_for_stack int rtnl_fill_vfinfo(struct sk_buff *skb, port_guid.vf = ivi.vf; memcpy(vf_mac.mac, ivi.mac, sizeof(ivi.mac)); + memset(&vf_broadcast, 0, sizeof(vf_broadcast)); memcpy(vf_broadcast.broadcast, dev->broadcast, dev->addr_len); vf_vlan.vlan = ivi.vlan; vf_vlan.qos = ivi.qos; diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 02a68be3002a2e..99e3789492a09e 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -1630,18 +1630,23 @@ void sock_map_unhash(struct sock *sk) void (*saved_unhash)(struct sock *sk); struct sk_psock *psock; +retry: rcu_read_lock(); psock = sk_psock(sk); if (unlikely(!psock)) { rcu_read_unlock(); saved_unhash = READ_ONCE(sk->sk_prot)->unhash; + if (unlikely(saved_unhash == sock_map_unhash)) + goto retry; } else { saved_unhash = psock->saved_unhash; sock_map_remove_links(sk, psock); rcu_read_unlock(); + + if (WARN_ON_ONCE(saved_unhash == sock_map_unhash)) + return; } - if (WARN_ON_ONCE(saved_unhash == sock_map_unhash)) - return; + if (saved_unhash) saved_unhash(sk); } @@ -1652,20 +1657,25 @@ void sock_map_destroy(struct sock *sk) void (*saved_destroy)(struct sock *sk); struct sk_psock *psock; +retry: rcu_read_lock(); psock = sk_psock_get(sk); if (unlikely(!psock)) { rcu_read_unlock(); saved_destroy = READ_ONCE(sk->sk_prot)->destroy; + if (unlikely(saved_destroy == sock_map_destroy)) + goto retry; } else { saved_destroy = psock->saved_destroy; sock_map_remove_links(sk, psock); rcu_read_unlock(); sk_psock_stop(psock); sk_psock_put(sk, psock); + + if (WARN_ON_ONCE(saved_destroy == sock_map_destroy)) + return; } - if (WARN_ON_ONCE(saved_destroy == sock_map_destroy)) - return; + if (saved_destroy) saved_destroy(sk); } @@ -1676,32 +1686,33 @@ void sock_map_close(struct sock *sk, long timeout) void (*saved_close)(struct sock *sk, long timeout); struct sk_psock *psock; +retry: lock_sock(sk); rcu_read_lock(); - psock = sk_psock(sk); + psock = sk_psock_get(sk); if (likely(psock)) { saved_close = psock->saved_close; sock_map_remove_links(sk, psock); - psock = sk_psock_get(sk); - if (unlikely(!psock)) - goto no_psock; rcu_read_unlock(); sk_psock_stop(psock); release_sock(sk); cancel_delayed_work_sync(&psock->work); sk_psock_put(sk, psock); + + /* Make sure we do not recurse. This is a bug. + * Leak the socket instead of crashing on a stack overflow. + */ + if (WARN_ON_ONCE(saved_close == sock_map_close)) + return; } else { saved_close = READ_ONCE(sk->sk_prot)->close; -no_psock: rcu_read_unlock(); release_sock(sk); + + if (unlikely(saved_close == sock_map_close)) + goto retry; } - /* Make sure we do not recurse. This is a bug. - * Leak the socket instead of crashing on a stack overflow. - */ - if (WARN_ON_ONCE(saved_close == sock_map_close)) - return; saved_close(sk, timeout); } EXPORT_SYMBOL_GPL(sock_map_close); diff --git a/net/ethtool/bitset.c b/net/ethtool/bitset.c index 8bb98d3ea3db21..a3a2cc6480a0e8 100644 --- a/net/ethtool/bitset.c +++ b/net/ethtool/bitset.c @@ -92,7 +92,7 @@ static bool ethnl_bitmap32_not_zero(const u32 *map, unsigned int start, u32 mask; if (end <= start) - return true; + return false; if (start % 32) { mask = ethnl_upper_bits(start); @@ -105,11 +105,11 @@ static bool ethnl_bitmap32_not_zero(const u32 *map, unsigned int start, start_word++; } - if (!memchr_inv(map + start_word, '\0', - (end_word - start_word) * sizeof(u32))) + if (memchr_inv(map + start_word, '\0', + (end_word - start_word) * sizeof(u32))) return true; if (end % 32 == 0) - return true; + return false; return map[end_word] & ethnl_lower_bits(end); } diff --git a/net/ethtool/phy.c b/net/ethtool/phy.c index d4e6887055ab15..ddc6eab701ed65 100644 --- a/net/ethtool/phy.c +++ b/net/ethtool/phy.c @@ -76,6 +76,7 @@ static int phy_prepare_data(const struct ethnl_req_info *req_info, struct nlattr **tb = info->attrs; struct phy_device_node *pdn; struct phy_device *phydev; + int ret; /* RTNL is held by the caller */ phydev = ethnl_req_get_phydev(req_info, tb, ETHTOOL_A_PHY_HEADER, @@ -88,8 +89,19 @@ static int phy_prepare_data(const struct ethnl_req_info *req_info, return -EOPNOTSUPP; rep_data->phyindex = phydev->phyindex; + rep_data->name = kstrdup(dev_name(&phydev->mdio.dev), GFP_KERNEL); - rep_data->drvname = kstrdup(phydev->drv->name, GFP_KERNEL); + if (!rep_data->name) + return -ENOMEM; + + if (phydev->drv) { + rep_data->drvname = kstrdup(phydev->drv->name, GFP_KERNEL); + if (!rep_data->drvname) { + ret = -ENOMEM; + goto err_free_name; + } + } + rep_data->upstream_type = pdn->upstream_type; if (pdn->upstream_type == PHY_UPSTREAM_PHY) { @@ -97,15 +109,33 @@ static int phy_prepare_data(const struct ethnl_req_info *req_info, rep_data->upstream_index = upstream->phyindex; } - if (pdn->parent_sfp_bus) + if (pdn->parent_sfp_bus) { rep_data->upstream_sfp_name = kstrdup(sfp_get_name(pdn->parent_sfp_bus), GFP_KERNEL); + if (!rep_data->upstream_sfp_name) { + ret = -ENOMEM; + goto err_free_drvname; + } + } - if (phydev->sfp_bus) + if (phydev->sfp_bus) { rep_data->downstream_sfp_name = kstrdup(sfp_get_name(phydev->sfp_bus), GFP_KERNEL); + if (!rep_data->downstream_sfp_name) { + ret = -ENOMEM; + goto err_free_upstream_sfp; + } + } return 0; + +err_free_upstream_sfp: + kfree(rep_data->upstream_sfp_name); +err_free_drvname: + kfree(rep_data->drvname); +err_free_name: + kfree(rep_data->name); + return ret; } static int phy_fill_reply(struct sk_buff *skb, diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index d09875b335885e..124619920d3863 100644 --- a/net/hsr/hsr_framereg.c +++ b/net/hsr/hsr_framereg.c @@ -889,7 +889,10 @@ int hsr_get_node_data(struct hsr_priv *hsr, if (node->addr_B_port != HSR_PT_NONE) { port = hsr_port_get_hsr(hsr, node->addr_B_port); - *addr_b_ifindex = port->dev->ifindex; + if (port) + *addr_b_ifindex = port->dev->ifindex; + else + *addr_b_ifindex = -1; } else { *addr_b_ifindex = -1; } diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c index 5fb812443a08f2..4366cbac3f06c5 100644 --- a/net/ipv4/ah4.c +++ b/net/ipv4/ah4.c @@ -124,9 +124,14 @@ static void ah_output_done(void *data, int err) struct iphdr *top_iph = ip_hdr(skb); struct ip_auth_hdr *ah = ip_auth_hdr(skb); int ihl = ip_hdrlen(skb); + int seqhi_len = 0; + __be32 *seqhi; + if (x->props.flags & XFRM_STATE_ESN) + seqhi_len = sizeof(*seqhi); iph = AH_SKB_CB(skb)->tmp; - icv = ah_tmp_icv(iph, ihl); + seqhi = (__be32 *)((char *)iph + ihl); + icv = ah_tmp_icv(seqhi, seqhi_len); memcpy(ah->auth_data, icv, ahp->icv_trunc_len); top_iph->tos = iph->tos; @@ -270,12 +275,17 @@ static void ah_input_done(void *data, int err) struct ip_auth_hdr *ah = ip_auth_hdr(skb); int ihl = ip_hdrlen(skb); int ah_hlen = (ah->hdrlen + 2) << 2; + int seqhi_len = 0; + __be32 *seqhi; if (err) goto out; + if (x->props.flags & XFRM_STATE_ESN) + seqhi_len = sizeof(*seqhi); work_iph = AH_SKB_CB(skb)->tmp; - auth_data = ah_tmp_auth(work_iph, ihl); + seqhi = (__be32 *)((char *)work_iph + ihl); + auth_data = ah_tmp_auth(seqhi, seqhi_len); icv = ah_tmp_icv(auth_data, ahp->icv_trunc_len); err = crypto_memneq(icv, auth_data, ahp->icv_trunc_len) ? -EBADMSG : 0; diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c index 008edc7f668852..791e15063237c9 100644 --- a/net/ipv4/bpf_tcp_ca.c +++ b/net/ipv4/bpf_tcp_ca.c @@ -168,7 +168,7 @@ bpf_tcp_ca_get_func_proto(enum bpf_func_id func_id, */ if (prog_ops_moff(prog) != offsetof(struct tcp_congestion_ops, release)) - return &bpf_sk_setsockopt_proto; + return &bpf_sk_setsockopt_nodelay_proto; return NULL; case BPF_FUNC_getsockopt: /* Since get/setsockopt is usually expected to diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index 6dfc0bcdef6542..6a5febbdbee493 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -873,7 +873,8 @@ static int esp_input(struct xfrm_state *x, struct sk_buff *skb) nfrags = 1; goto skip_cow; - } else if (!skb_has_frag_list(skb)) { + } else if (!skb_has_frag_list(skb) && + !skb_has_shared_frag(skb)) { nfrags = skb_shinfo(skb)->nr_frags; nfrags++; diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index a674fb44ec25ba..a9ad39064f3bb7 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -122,16 +122,29 @@ * contradict to specs provided this delay is small enough. */ -#define IGMP_V1_SEEN(in_dev) \ - (IPV4_DEVCONF_ALL_RO(dev_net(in_dev->dev), FORCE_IGMP_VERSION) == 1 || \ - IN_DEV_CONF_GET((in_dev), FORCE_IGMP_VERSION) == 1 || \ - ((in_dev)->mr_v1_seen && \ - time_before(jiffies, (in_dev)->mr_v1_seen))) -#define IGMP_V2_SEEN(in_dev) \ - (IPV4_DEVCONF_ALL_RO(dev_net(in_dev->dev), FORCE_IGMP_VERSION) == 2 || \ - IN_DEV_CONF_GET((in_dev), FORCE_IGMP_VERSION) == 2 || \ - ((in_dev)->mr_v2_seen && \ - time_before(jiffies, (in_dev)->mr_v2_seen))) +static bool IGMP_V1_SEEN(const struct in_device *in_dev) +{ + unsigned long seen; + + if (IPV4_DEVCONF_ALL_RO(dev_net(in_dev->dev), FORCE_IGMP_VERSION) == 1) + return true; + if (IN_DEV_CONF_GET((in_dev), FORCE_IGMP_VERSION) == 1) + return true; + seen = READ_ONCE(in_dev->mr_v1_seen); + return seen && time_before(jiffies, seen); +} + +static bool IGMP_V2_SEEN(const struct in_device *in_dev) +{ + unsigned long seen; + + if (IPV4_DEVCONF_ALL_RO(dev_net(in_dev->dev), FORCE_IGMP_VERSION) == 2) + return true; + if (IN_DEV_CONF_GET((in_dev), FORCE_IGMP_VERSION) == 2) + return true; + seen = READ_ONCE(in_dev->mr_v2_seen); + return seen && time_before(jiffies, seen); +} static int unsolicited_report_interval(struct in_device *in_dev) { @@ -954,23 +967,21 @@ static bool igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb, int max_delay; int mark = 0; struct net *net = dev_net(in_dev->dev); - + unsigned long seen; if (len == 8) { + seen = jiffies + READ_ONCE(in_dev->mr_qrv) * READ_ONCE(in_dev->mr_qi) + + READ_ONCE(in_dev->mr_qri); if (ih->code == 0) { /* Alas, old v1 router presents here. */ max_delay = IGMP_QUERY_RESPONSE_INTERVAL; - in_dev->mr_v1_seen = jiffies + - (in_dev->mr_qrv * in_dev->mr_qi) + - in_dev->mr_qri; + WRITE_ONCE(in_dev->mr_v1_seen, seen); group = 0; } else { /* v2 router present */ max_delay = ih->code*(HZ/IGMP_TIMER_SCALE); - in_dev->mr_v2_seen = jiffies + - (in_dev->mr_qrv * in_dev->mr_qi) + - in_dev->mr_qri; + WRITE_ONCE(in_dev->mr_v2_seen, seen); } /* cancel the interface change timer */ WRITE_ONCE(in_dev->mr_ifc_count, 0); @@ -995,6 +1006,8 @@ static bool igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb, if (!max_delay) max_delay = 1; /* can't mod w/ 0 */ } else { /* v3 */ + unsigned long mr_qi; + if (!pskb_may_pull(skb, sizeof(struct igmpv3_query))) return true; @@ -1015,15 +1028,16 @@ static bool igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb, * received value was zero, use the default or statically * configured value. */ - in_dev->mr_qrv = ih3->qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv); - in_dev->mr_qi = IGMPV3_QQIC(ih3->qqic)*HZ ?: IGMP_QUERY_INTERVAL; - + WRITE_ONCE(in_dev->mr_qrv, + ih3->qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv)); + mr_qi = IGMPV3_QQIC(ih3->qqic)*HZ ?: IGMP_QUERY_INTERVAL; + WRITE_ONCE(in_dev->mr_qi, mr_qi); /* RFC3376, 8.3. Query Response Interval: * The number of seconds represented by the [Query Response * Interval] must be less than the [Query Interval]. */ - if (in_dev->mr_qri >= in_dev->mr_qi) - in_dev->mr_qri = (in_dev->mr_qi/HZ - 1)*HZ; + if (READ_ONCE(in_dev->mr_qri) >= mr_qi) + WRITE_ONCE(in_dev->mr_qri, (mr_qi/HZ - 1) * HZ); if (!group) { /* general query */ if (ih3->nsrcs) diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index 928654c34156b6..dbcd37dfdc15b1 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -1108,7 +1108,7 @@ static void reqsk_timer_handler(struct timer_list *t) if (!inet_ehash_insert(req_to_sk(nreq), req_to_sk(oreq), NULL)) { /* delete timer */ - __inet_csk_reqsk_queue_drop(sk_listener, nreq, true); + __inet_csk_reqsk_queue_drop(sk_listener, nreq, false); goto no_ownership; } @@ -1134,7 +1134,7 @@ static void reqsk_timer_handler(struct timer_list *t) } drop: - __inet_csk_reqsk_queue_drop(sk_listener, oreq, true); + __inet_csk_reqsk_queue_drop(oreq->rsk_listener, oreq, true); reqsk_put(oreq); } diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c index d8083b9033c272..5b957a831e7c39 100644 --- a/net/ipv4/inetpeer.c +++ b/net/ipv4/inetpeer.c @@ -179,7 +179,8 @@ struct inet_peer *inet_getpeer(struct inet_peer_base *base, seq = read_seqbegin(&base->lock); p = lookup(daddr, base, seq, NULL, &gc_cnt, &parent, &pp); - if (p) + /* Make sure tree was not modified during our lookup. */ + if (p && !read_seqretry(&base->lock, seq)) return p; /* retry an exact lookup, taking the lock before. diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index e4790cc7b5c2ec..5bcd73cbdb41c0 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1233,6 +1233,8 @@ static int __ip_append_data(struct sock *sk, if (err < 0) goto error; copy = err; + if (!(flags & MSG_NO_SHARED_FRAGS)) + skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG; wmem_alloc_delta += copy; } else if (!zc) { int i = skb_shinfo(skb)->nr_frags; diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index 2058ca860294b0..2628cd3a93a684 100644 --- a/net/ipv4/ipmr.c +++ b/net/ipv4/ipmr.c @@ -537,15 +537,16 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev) }; int err; + rcu_read_lock(); err = ipmr_fib_lookup(net, &fl4, &mrt); if (err < 0) { + rcu_read_unlock(); kfree_skb(skb); return err; } DEV_STATS_ADD(dev, tx_bytes, skb->len); DEV_STATS_INC(dev, tx_packets); - rcu_read_lock(); /* Pairs with WRITE_ONCE() in vif_add() and vif_delete() */ ipmr_cache_report(mrt, skb, READ_ONCE(mrt->mroute_reg_vif_num), @@ -1112,11 +1113,12 @@ static int ipmr_cache_report(const struct mr_table *mrt, msg->im_vif_hi = vifi >> 8; ipv4_pktinfo_prepare(mroute_sk, pkt, false); memcpy(skb->cb, pkt->cb, sizeof(skb->cb)); - /* Add our header */ - igmp = skb_put(skb, sizeof(struct igmphdr)); + /* Add our header. + * Note that code, csum and group fields are cleared. + */ + igmp = skb_put_zero(skb, sizeof(struct igmphdr)); igmp->type = assert; msg->im_msgtype = assert; - igmp->code = 0; ip_hdr(skb)->tot_len = htons(skb->len); /* Fix the length */ skb->transport_header = skb->network_header; } diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 97ead883e4a13b..ad2259678c7854 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -1501,13 +1501,11 @@ static int do_arpt_get_ctl(struct sock *sk, int cmd, void __user *user, int *len static void __arpt_unregister_table(struct net *net, struct xt_table *table) { - struct xt_table_info *private; - void *loc_cpu_entry; + struct xt_table_info *private = table->private; struct module *table_owner = table->me; + void *loc_cpu_entry; struct arpt_entry *iter; - private = xt_unregister_table(table); - /* Decrease module usage counts and free resources */ loc_cpu_entry = private->entries; xt_entry_foreach(iter, loc_cpu_entry, private->size) @@ -1515,6 +1513,7 @@ static void __arpt_unregister_table(struct net *net, struct xt_table *table) if (private->number > private->initial_entries) module_put(table_owner); xt_free_table_info(private); + kfree(table); } int arpt_register_table(struct net *net, @@ -1522,13 +1521,11 @@ int arpt_register_table(struct net *net, const struct arpt_replace *repl, const struct nf_hook_ops *template_ops) { - struct nf_hook_ops *ops; - unsigned int num_ops; - int ret, i; - struct xt_table_info *newinfo; struct xt_table_info bootstrap = {0}; - void *loc_cpu_entry; + struct xt_table_info *newinfo; struct xt_table *new_table; + void *loc_cpu_entry; + int ret; newinfo = xt_alloc_table_info(repl->size); if (!newinfo) @@ -1543,7 +1540,7 @@ int arpt_register_table(struct net *net, return ret; } - new_table = xt_register_table(net, table, &bootstrap, newinfo); + new_table = xt_register_table(net, table, template_ops, &bootstrap, newinfo); if (IS_ERR(new_table)) { struct arpt_entry *iter; @@ -1553,46 +1550,12 @@ int arpt_register_table(struct net *net, return PTR_ERR(new_table); } - num_ops = hweight32(table->valid_hooks); - if (num_ops == 0) { - ret = -EINVAL; - goto out_free; - } - - ops = kmemdup_array(template_ops, num_ops, sizeof(*ops), GFP_KERNEL); - if (!ops) { - ret = -ENOMEM; - goto out_free; - } - - for (i = 0; i < num_ops; i++) - ops[i].priv = new_table; - - new_table->ops = ops; - - ret = nf_register_net_hooks(net, ops, num_ops); - if (ret != 0) - goto out_free; - return ret; - -out_free: - __arpt_unregister_table(net, new_table); - return ret; -} - -void arpt_unregister_table_pre_exit(struct net *net, const char *name) -{ - struct xt_table *table = xt_find_table(net, NFPROTO_ARP, name); - - if (table) - nf_unregister_net_hooks(net, table->ops, hweight32(table->valid_hooks)); } -EXPORT_SYMBOL(arpt_unregister_table_pre_exit); void arpt_unregister_table(struct net *net, const char *name) { - struct xt_table *table = xt_find_table(net, NFPROTO_ARP, name); + struct xt_table *table = xt_unregister_table_exit(net, NFPROTO_ARP, name); if (table) __arpt_unregister_table(net, table); diff --git a/net/ipv4/netfilter/arptable_filter.c b/net/ipv4/netfilter/arptable_filter.c index 78cd5ee24448f6..370b635e3523b9 100644 --- a/net/ipv4/netfilter/arptable_filter.c +++ b/net/ipv4/netfilter/arptable_filter.c @@ -43,7 +43,7 @@ static int arptable_filter_table_init(struct net *net) static void __net_exit arptable_filter_net_pre_exit(struct net *net) { - arpt_unregister_table_pre_exit(net, "filter"); + xt_unregister_table_pre_exit(net, NFPROTO_ARP, "filter"); } static void __net_exit arptable_filter_net_exit(struct net *net) @@ -58,32 +58,33 @@ static struct pernet_operations arptable_filter_net_ops = { static int __init arptable_filter_init(void) { - int ret = xt_register_template(&packet_filter, - arptable_filter_table_init); - - if (ret < 0) - return ret; + int ret; arpfilter_ops = xt_hook_ops_alloc(&packet_filter, arpt_do_table); - if (IS_ERR(arpfilter_ops)) { - xt_unregister_template(&packet_filter); + if (IS_ERR(arpfilter_ops)) return PTR_ERR(arpfilter_ops); - } ret = register_pernet_subsys(&arptable_filter_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&packet_filter, + arptable_filter_table_init); if (ret < 0) { - xt_unregister_template(&packet_filter); - kfree(arpfilter_ops); - return ret; + unregister_pernet_subsys(&arptable_filter_net_ops); + goto err_free; } + return 0; +err_free: + kfree(arpfilter_ops); return ret; } static void __exit arptable_filter_fini(void) { - unregister_pernet_subsys(&arptable_filter_net_ops); xt_unregister_template(&packet_filter); + unregister_pernet_subsys(&arptable_filter_net_ops); kfree(arpfilter_ops); } diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index 23c8deff8095ae..5cbdb0815857f4 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1704,12 +1704,10 @@ do_ipt_get_ctl(struct sock *sk, int cmd, void __user *user, int *len) static void __ipt_unregister_table(struct net *net, struct xt_table *table) { - struct xt_table_info *private; - void *loc_cpu_entry; + struct xt_table_info *private = table->private; struct module *table_owner = table->me; struct ipt_entry *iter; - - private = xt_unregister_table(table); + void *loc_cpu_entry; /* Decrease module usage counts and free resources */ loc_cpu_entry = private->entries; @@ -1718,19 +1716,18 @@ static void __ipt_unregister_table(struct net *net, struct xt_table *table) if (private->number > private->initial_entries) module_put(table_owner); xt_free_table_info(private); + kfree(table); } int ipt_register_table(struct net *net, const struct xt_table *table, const struct ipt_replace *repl, const struct nf_hook_ops *template_ops) { - struct nf_hook_ops *ops; - unsigned int num_ops; - int ret, i; - struct xt_table_info *newinfo; struct xt_table_info bootstrap = {0}; - void *loc_cpu_entry; + struct xt_table_info *newinfo; struct xt_table *new_table; + void *loc_cpu_entry; + int ret; newinfo = xt_alloc_table_info(repl->size); if (!newinfo) @@ -1745,7 +1742,7 @@ int ipt_register_table(struct net *net, const struct xt_table *table, return ret; } - new_table = xt_register_table(net, table, &bootstrap, newinfo); + new_table = xt_register_table(net, table, template_ops, &bootstrap, newinfo); if (IS_ERR(new_table)) { struct ipt_entry *iter; @@ -1755,51 +1752,12 @@ int ipt_register_table(struct net *net, const struct xt_table *table, return PTR_ERR(new_table); } - /* No template? No need to do anything. This is used by 'nat' table, it registers - * with the nat core instead of the netfilter core. - */ - if (!template_ops) - return 0; - - num_ops = hweight32(table->valid_hooks); - if (num_ops == 0) { - ret = -EINVAL; - goto out_free; - } - - ops = kmemdup_array(template_ops, num_ops, sizeof(*ops), GFP_KERNEL); - if (!ops) { - ret = -ENOMEM; - goto out_free; - } - - for (i = 0; i < num_ops; i++) - ops[i].priv = new_table; - - new_table->ops = ops; - - ret = nf_register_net_hooks(net, ops, num_ops); - if (ret != 0) - goto out_free; - return ret; - -out_free: - __ipt_unregister_table(net, new_table); - return ret; -} - -void ipt_unregister_table_pre_exit(struct net *net, const char *name) -{ - struct xt_table *table = xt_find_table(net, NFPROTO_IPV4, name); - - if (table) - nf_unregister_net_hooks(net, table->ops, hweight32(table->valid_hooks)); } void ipt_unregister_table_exit(struct net *net, const char *name) { - struct xt_table *table = xt_find_table(net, NFPROTO_IPV4, name); + struct xt_table *table = xt_unregister_table_exit(net, NFPROTO_IPV4, name); if (table) __ipt_unregister_table(net, table); @@ -1887,7 +1845,6 @@ static void __exit ip_tables_fini(void) } EXPORT_SYMBOL(ipt_register_table); -EXPORT_SYMBOL(ipt_unregister_table_pre_exit); EXPORT_SYMBOL(ipt_unregister_table_exit); EXPORT_SYMBOL(ipt_do_table); module_init(ip_tables_init); diff --git a/net/ipv4/netfilter/iptable_filter.c b/net/ipv4/netfilter/iptable_filter.c index 3ab908b7479517..672d7da1071d31 100644 --- a/net/ipv4/netfilter/iptable_filter.c +++ b/net/ipv4/netfilter/iptable_filter.c @@ -61,7 +61,7 @@ static int __net_init iptable_filter_net_init(struct net *net) static void __net_exit iptable_filter_net_pre_exit(struct net *net) { - ipt_unregister_table_pre_exit(net, "filter"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "filter"); } static void __net_exit iptable_filter_net_exit(struct net *net) @@ -77,32 +77,33 @@ static struct pernet_operations iptable_filter_net_ops = { static int __init iptable_filter_init(void) { - int ret = xt_register_template(&packet_filter, - iptable_filter_table_init); - - if (ret < 0) - return ret; + int ret; filter_ops = xt_hook_ops_alloc(&packet_filter, ipt_do_table); - if (IS_ERR(filter_ops)) { - xt_unregister_template(&packet_filter); + if (IS_ERR(filter_ops)) return PTR_ERR(filter_ops); - } ret = register_pernet_subsys(&iptable_filter_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&packet_filter, + iptable_filter_table_init); if (ret < 0) { - xt_unregister_template(&packet_filter); - kfree(filter_ops); - return ret; + unregister_pernet_subsys(&iptable_filter_net_ops); + goto err_free; } return 0; +err_free: + kfree(filter_ops); + return ret; } static void __exit iptable_filter_fini(void) { - unregister_pernet_subsys(&iptable_filter_net_ops); xt_unregister_template(&packet_filter); + unregister_pernet_subsys(&iptable_filter_net_ops); kfree(filter_ops); } diff --git a/net/ipv4/netfilter/iptable_mangle.c b/net/ipv4/netfilter/iptable_mangle.c index 385d945d8ebea0..13d25d9a4610e3 100644 --- a/net/ipv4/netfilter/iptable_mangle.c +++ b/net/ipv4/netfilter/iptable_mangle.c @@ -96,7 +96,7 @@ static int iptable_mangle_table_init(struct net *net) static void __net_exit iptable_mangle_net_pre_exit(struct net *net) { - ipt_unregister_table_pre_exit(net, "mangle"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "mangle"); } static void __net_exit iptable_mangle_net_exit(struct net *net) @@ -111,32 +111,33 @@ static struct pernet_operations iptable_mangle_net_ops = { static int __init iptable_mangle_init(void) { - int ret = xt_register_template(&packet_mangler, - iptable_mangle_table_init); - if (ret < 0) - return ret; + int ret; mangle_ops = xt_hook_ops_alloc(&packet_mangler, iptable_mangle_hook); - if (IS_ERR(mangle_ops)) { - xt_unregister_template(&packet_mangler); - ret = PTR_ERR(mangle_ops); - return ret; - } + if (IS_ERR(mangle_ops)) + return PTR_ERR(mangle_ops); ret = register_pernet_subsys(&iptable_mangle_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&packet_mangler, + iptable_mangle_table_init); if (ret < 0) { - xt_unregister_template(&packet_mangler); - kfree(mangle_ops); - return ret; + unregister_pernet_subsys(&iptable_mangle_net_ops); + goto err_free; } + return 0; +err_free: + kfree(mangle_ops); return ret; } static void __exit iptable_mangle_fini(void) { - unregister_pernet_subsys(&iptable_mangle_net_ops); xt_unregister_template(&packet_mangler); + unregister_pernet_subsys(&iptable_mangle_net_ops); kfree(mangle_ops); } diff --git a/net/ipv4/netfilter/iptable_nat.c b/net/ipv4/netfilter/iptable_nat.c index 625a1ca13b1bad..a0df7255402514 100644 --- a/net/ipv4/netfilter/iptable_nat.c +++ b/net/ipv4/netfilter/iptable_nat.c @@ -119,8 +119,11 @@ static int iptable_nat_table_init(struct net *net) } ret = ipt_nat_register_lookups(net); - if (ret < 0) + if (ret < 0) { + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "nat"); + synchronize_rcu(); ipt_unregister_table_exit(net, "nat"); + } kfree(repl); return ret; @@ -129,6 +132,7 @@ static int iptable_nat_table_init(struct net *net) static void __net_exit iptable_nat_net_pre_exit(struct net *net) { ipt_nat_unregister_lookups(net); + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "nat"); } static void __net_exit iptable_nat_net_exit(struct net *net) diff --git a/net/ipv4/netfilter/iptable_raw.c b/net/ipv4/netfilter/iptable_raw.c index 0e7f53964d0af6..2745c22f4034d8 100644 --- a/net/ipv4/netfilter/iptable_raw.c +++ b/net/ipv4/netfilter/iptable_raw.c @@ -53,7 +53,7 @@ static int iptable_raw_table_init(struct net *net) static void __net_exit iptable_raw_net_pre_exit(struct net *net) { - ipt_unregister_table_pre_exit(net, "raw"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "raw"); } static void __net_exit iptable_raw_net_exit(struct net *net) @@ -77,32 +77,32 @@ static int __init iptable_raw_init(void) pr_info("Enabling raw table before defrag\n"); } - ret = xt_register_template(table, - iptable_raw_table_init); - if (ret < 0) - return ret; - rawtable_ops = xt_hook_ops_alloc(table, ipt_do_table); - if (IS_ERR(rawtable_ops)) { - xt_unregister_template(table); + if (IS_ERR(rawtable_ops)) return PTR_ERR(rawtable_ops); - } ret = register_pernet_subsys(&iptable_raw_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(table, + iptable_raw_table_init); if (ret < 0) { - xt_unregister_template(table); - kfree(rawtable_ops); - return ret; + unregister_pernet_subsys(&iptable_raw_net_ops); + goto err_free; } + return 0; +err_free: + kfree(rawtable_ops); return ret; } static void __exit iptable_raw_fini(void) { + xt_unregister_template(&packet_raw); unregister_pernet_subsys(&iptable_raw_net_ops); kfree(rawtable_ops); - xt_unregister_template(&packet_raw); } module_init(iptable_raw_init); diff --git a/net/ipv4/netfilter/iptable_security.c b/net/ipv4/netfilter/iptable_security.c index d885443cb26798..491894511c5441 100644 --- a/net/ipv4/netfilter/iptable_security.c +++ b/net/ipv4/netfilter/iptable_security.c @@ -50,7 +50,7 @@ static int iptable_security_table_init(struct net *net) static void __net_exit iptable_security_net_pre_exit(struct net *net) { - ipt_unregister_table_pre_exit(net, "security"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "security"); } static void __net_exit iptable_security_net_exit(struct net *net) @@ -65,33 +65,34 @@ static struct pernet_operations iptable_security_net_ops = { static int __init iptable_security_init(void) { - int ret = xt_register_template(&security_table, - iptable_security_table_init); - - if (ret < 0) - return ret; + int ret; sectbl_ops = xt_hook_ops_alloc(&security_table, ipt_do_table); - if (IS_ERR(sectbl_ops)) { - xt_unregister_template(&security_table); + if (IS_ERR(sectbl_ops)) return PTR_ERR(sectbl_ops); - } ret = register_pernet_subsys(&iptable_security_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&security_table, + iptable_security_table_init); if (ret < 0) { - xt_unregister_template(&security_table); - kfree(sectbl_ops); - return ret; + unregister_pernet_subsys(&iptable_security_net_ops); + goto err_free; } + return 0; +err_free: + kfree(sectbl_ops); return ret; } static void __exit iptable_security_fini(void) { + xt_unregister_template(&security_table); unregister_pernet_subsys(&iptable_security_net_ops); kfree(sectbl_ops); - xt_unregister_template(&security_table); } module_init(iptable_security_init); diff --git a/net/ipv4/netfilter/nf_socket_ipv4.c b/net/ipv4/netfilter/nf_socket_ipv4.c index 5080fa5fbf6a02..f9c6755f5ec576 100644 --- a/net/ipv4/netfilter/nf_socket_ipv4.c +++ b/net/ipv4/netfilter/nf_socket_ipv4.c @@ -94,6 +94,9 @@ struct sock *nf_sk_lookup_slow_v4(struct net *net, const struct sk_buff *skb, #endif int doff = 0; + if (ntohs(iph->frag_off) & IP_OFFSET) + return NULL; + if (iph->protocol == IPPROTO_UDP || iph->protocol == IPPROTO_TCP) { struct tcphdr _hdr; struct udphdr *hp; diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index a97cdf3e6af4cf..0a4b38b315fed4 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -116,7 +116,8 @@ struct tcp_ao_key *tcp_ao_established_key(const struct sock *sk, { struct tcp_ao_key *key; - hlist_for_each_entry_rcu(key, &ao->head, node, lockdep_sock_is_held(sk)) { + hlist_for_each_entry_rcu(key, &ao->head, node, + sk_fullsock(sk) && lockdep_sock_is_held(sk)) { if ((sndid >= 0 && key->sndid != sndid) || (rcvid >= 0 && key->rcvid != rcvid)) continue; diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 8fc24c3743c5f9..c0526cc0398049 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1827,7 +1827,6 @@ INDIRECT_CALLABLE_DECLARE(struct dst_entry *ipv4_dst_check(struct dst_entry *, int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) { enum skb_drop_reason reason; - struct sock *rsk; reason = psp_sk_rx_policy_check(sk, skb); if (reason) @@ -1863,24 +1862,21 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) return 0; if (nsk != sk) { reason = tcp_child_process(sk, nsk, skb); - if (reason) { - rsk = nsk; + sock_put(nsk); + if (reason) goto reset; - } return 0; } } else sock_rps_save_rxhash(sk, skb); reason = tcp_rcv_state_process(sk, skb); - if (reason) { - rsk = sk; + if (reason) goto reset; - } return 0; reset: - tcp_v4_send_reset(rsk, skb, sk_rst_convert_drop_reason(reason)); + tcp_v4_send_reset(sk, skb, sk_rst_convert_drop_reason(reason)); discard: sk_skb_reason_drop(sk, skb, reason); /* Be careful here. If this function gets more complicated and @@ -2193,8 +2189,10 @@ int tcp_v4_rcv(struct sk_buff *skb) rst_reason = sk_rst_convert_drop_reason(drop_reason); tcp_v4_send_reset(nsk, skb, rst_reason); + sock_put(nsk); goto discard_and_relse; } + sock_put(nsk); sock_put(sk); return 0; } diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 199f0b579e89cf..e6092c3ac840bd 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -1012,6 +1012,6 @@ enum skb_drop_reason tcp_child_process(struct sock *parent, struct sock *child, } bh_unlock_sock(child); - sock_put(child); + return reason; } diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig index c024aa77f25ba6..c3806c6ac96f95 100644 --- a/net/ipv6/Kconfig +++ b/net/ipv6/Kconfig @@ -164,7 +164,7 @@ config IPV6_SIT select INET_TUNNEL select NET_IP_TUNNEL select IPV6_NDISC_NODETYPE - default y + default m help Tunneling means encapsulating data of one protocol type within another protocol and sending it over a channel that understands the @@ -172,7 +172,7 @@ config IPV6_SIT into IPv4 packets. This is useful if you want to connect two IPv6 networks over an IPv4-only path. - Saying M here will produce a module called sit. If unsure, say Y. + Saying M here will produce a module called sit. If unsure, say M. config IPV6_SIT_6RD bool "IPv6: IPv6 Rapid Deployment (6RD)" diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c index cb26beea439824..de1e68199a0145 100644 --- a/net/ipv6/ah6.c +++ b/net/ipv6/ah6.c @@ -317,14 +317,19 @@ static void ah6_output_done(void *data, int err) struct ipv6hdr *top_iph = ipv6_hdr(skb); struct ip_auth_hdr *ah = ip_auth_hdr(skb); struct tmp_ext *iph_ext; + int seqhi_len = 0; + __be32 *seqhi; extlen = skb_network_header_len(skb) - sizeof(struct ipv6hdr); if (extlen) extlen += sizeof(*iph_ext); + if (x->props.flags & XFRM_STATE_ESN) + seqhi_len = sizeof(*seqhi); iph_base = AH_SKB_CB(skb)->tmp; iph_ext = ah_tmp_ext(iph_base); - icv = ah_tmp_icv(iph_ext, extlen); + seqhi = (__be32 *)((char *)iph_ext + extlen); + icv = ah_tmp_icv(seqhi, seqhi_len); memcpy(ah->auth_data, icv, ahp->icv_trunc_len); memcpy(top_iph, iph_base, IPV6HDR_BASELEN); @@ -471,13 +476,18 @@ static void ah6_input_done(void *data, int err) struct ip_auth_hdr *ah = ip_auth_hdr(skb); int hdr_len = skb_network_header_len(skb); int ah_hlen = ipv6_authlen(ah); + int seqhi_len = 0; + __be32 *seqhi; if (err) goto out; + if (x->props.flags & XFRM_STATE_ESN) + seqhi_len = sizeof(*seqhi); work_iph = AH_SKB_CB(skb)->tmp; auth_data = ah_tmp_auth(work_iph, hdr_len); - icv = ah_tmp_icv(auth_data, ahp->icv_trunc_len); + seqhi = (__be32 *)(auth_data + ahp->icv_trunc_len); + icv = ah_tmp_icv(seqhi, seqhi_len); err = crypto_memneq(icv, auth_data, ahp->icv_trunc_len) ? -EBADMSG : 0; if (err) diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c index 9f75313734f8cd..9c06c5a1419dc4 100644 --- a/net/ipv6/esp6.c +++ b/net/ipv6/esp6.c @@ -915,7 +915,8 @@ static int esp6_input(struct xfrm_state *x, struct sk_buff *skb) nfrags = 1; goto skip_cow; - } else if (!skb_has_frag_list(skb)) { + } else if (!skb_has_frag_list(skb) && + !skb_has_shared_frag(skb)) { nfrags = skb_shinfo(skb)->nr_frags; nfrags++; diff --git a/net/ipv6/exthdrs_core.c b/net/ipv6/exthdrs_core.c index 49e31e4ae7b7f6..9d06d487e8b103 100644 --- a/net/ipv6/exthdrs_core.c +++ b/net/ipv6/exthdrs_core.c @@ -73,6 +73,7 @@ int ipv6_skip_exthdr(const struct sk_buff *skb, int start, u8 *nexthdrp, __be16 *frag_offp) { u8 nexthdr = *nexthdrp; + int exthdr_cnt = 0; *frag_offp = 0; @@ -82,6 +83,8 @@ int ipv6_skip_exthdr(const struct sk_buff *skb, int start, u8 *nexthdrp, if (nexthdr == NEXTHDR_NONE) return -1; + if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT)) + return -1; hp = skb_header_pointer(skb, start, sizeof(_hdr), &_hdr); if (!hp) return -1; @@ -190,6 +193,7 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset, { unsigned int start = skb_network_offset(skb) + sizeof(struct ipv6hdr); u8 nexthdr = ipv6_hdr(skb)->nexthdr; + int exthdr_cnt = 0; bool found; if (fragoff) @@ -216,6 +220,9 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset, return -ENOENT; } + if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT)) + return -EBADMSG; + hp = skb_header_pointer(skb, start, sizeof(_hdr), &_hdr); if (!hp) return -EBADMSG; diff --git a/net/ipv6/ip6_flowlabel.c b/net/ipv6/ip6_flowlabel.c index c92f98c6f6ecca..b1ccdf0dc64699 100644 --- a/net/ipv6/ip6_flowlabel.c +++ b/net/ipv6/ip6_flowlabel.c @@ -36,11 +36,11 @@ /* FL hash table */ #define FL_MAX_PER_SOCK 32 -#define FL_MAX_SIZE 4096 +#define FL_MAX_SIZE 8192 #define FL_HASH_MASK 255 #define FL_HASH(l) (ntohl(l)&FL_HASH_MASK) -static atomic_t fl_size = ATOMIC_INIT(0); +static int fl_size; static struct ip6_flowlabel __rcu *fl_ht[FL_HASH_MASK+1]; static void ip6_fl_gc(struct timer_list *unused); @@ -162,8 +162,9 @@ static void ip6_fl_gc(struct timer_list *unused) ttd = fl->expires; if (time_after_eq(now, ttd)) { *flp = fl->next; + fl_size--; + fl->fl_net->ipv6.flowlabel_count--; fl_free(fl); - atomic_dec(&fl_size); continue; } if (!sched || time_before(ttd, sched)) @@ -172,7 +173,7 @@ static void ip6_fl_gc(struct timer_list *unused) flp = &fl->next; } } - if (!sched && atomic_read(&fl_size)) + if (!sched && fl_size) sched = now + FL_MAX_LINGER; if (sched) { mod_timer(&ip6_fl_gc_timer, sched); @@ -196,7 +197,8 @@ static void __net_exit ip6_fl_purge(struct net *net) atomic_read(&fl->users) == 0) { *flp = fl->next; fl_free(fl); - atomic_dec(&fl_size); + fl_size--; + net->ipv6.flowlabel_count--; continue; } flp = &fl->next; @@ -210,10 +212,10 @@ static struct ip6_flowlabel *fl_intern(struct net *net, { struct ip6_flowlabel *lfl; + lockdep_assert_held(&ip6_fl_lock); + fl->label = label & IPV6_FLOWLABEL_MASK; - rcu_read_lock(); - spin_lock_bh(&ip6_fl_lock); if (label == 0) { for (;;) { fl->label = htonl(get_random_u32())&IPV6_FLOWLABEL_MASK; @@ -235,8 +237,6 @@ static struct ip6_flowlabel *fl_intern(struct net *net, lfl = __fl_lookup(net, fl->label); if (lfl) { atomic_inc(&lfl->users); - spin_unlock_bh(&ip6_fl_lock); - rcu_read_unlock(); return lfl; } } @@ -244,9 +244,8 @@ static struct ip6_flowlabel *fl_intern(struct net *net, fl->lastuse = jiffies; fl->next = fl_ht[FL_HASH(fl->label)]; rcu_assign_pointer(fl_ht[FL_HASH(fl->label)], fl); - atomic_inc(&fl_size); - spin_unlock_bh(&ip6_fl_lock); - rcu_read_unlock(); + fl_size++; + net->ipv6.flowlabel_count++; return NULL; } @@ -464,10 +463,17 @@ fl_create(struct net *net, struct sock *sk, struct in6_flowlabel_req *freq, static int mem_check(struct sock *sk) { - int room = FL_MAX_SIZE - atomic_read(&fl_size); + const int unpriv_total_limit = FL_MAX_SIZE - (FL_MAX_SIZE / 4); + const int unpriv_user_limit = unpriv_total_limit / 2; + struct net *net = sock_net(sk); + int room; struct ipv6_fl_socklist *sfl; int count = 0; + lockdep_assert_held(&ip6_fl_lock); + + room = FL_MAX_SIZE - fl_size; + if (room > FL_MAX_SIZE - FL_MAX_PER_SOCK) return 0; @@ -478,7 +484,9 @@ static int mem_check(struct sock *sk) if (room <= 0 || ((count >= FL_MAX_PER_SOCK || - (count > 0 && room < FL_MAX_SIZE/2) || room < FL_MAX_SIZE/4) && + (count > 0 && room < FL_MAX_SIZE / 2) || + room < FL_MAX_SIZE / 4 || + net->ipv6.flowlabel_count >= unpriv_user_limit) && !capable(CAP_NET_ADMIN))) return -ENOBUFS; @@ -692,11 +700,19 @@ static int ipv6_flowlabel_get(struct sock *sk, struct in6_flowlabel_req *freq, if (!sfl1) goto done; + rcu_read_lock(); + spin_lock_bh(&ip6_fl_lock); err = mem_check(sk); + if (err == 0) + fl1 = fl_intern(net, fl, freq->flr_label); + else + fl1 = NULL; + spin_unlock_bh(&ip6_fl_lock); + rcu_read_unlock(); + if (err != 0) goto done; - fl1 = fl_intern(net, fl, freq->flr_label); if (fl1) goto recheck; diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 63fc8556b475e5..365b4059eb2035 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -2262,10 +2262,11 @@ static int ip6erspan_changelink(struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) { - struct ip6gre_net *ign = net_generic(dev_net(dev), ip6gre_net_id); + struct ip6_tnl *t = netdev_priv(dev); struct __ip6_tnl_parm p; - struct ip6_tnl *t; + struct ip6gre_net *ign; + ign = net_generic(t->net, ip6gre_net_id); t = ip6gre_changelink_common(dev, tb, data, &p, extack); if (IS_ERR(t)) return PTR_ERR(t); diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c index 967b07aeb6831e..8972863c93ee52 100644 --- a/net/ipv6/ip6_input.c +++ b/net/ipv6/ip6_input.c @@ -403,6 +403,7 @@ INDIRECT_CALLABLE_DECLARE(int tcp_v6_rcv(struct sk_buff *)); void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr, bool have_final) { + int exthdr_cnt = IP6CB(skb)->flags & IP6SKB_HOPBYHOP ? 1 : 0; const struct inet6_protocol *ipprot; struct inet6_dev *idev; unsigned int nhoff; @@ -487,6 +488,10 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr, nexthdr = ret; goto resubmit_final; } else { + if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT)) { + SKB_DR_SET(reason, IPV6_TOO_MANY_EXTHDRS); + goto discard; + } goto resubmit; } } else if (ret == 0) { diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 7e92909ab5be3f..c14adcdd43960d 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -468,6 +468,7 @@ static int ip6_forward_proxy_check(struct sk_buff *skb) default: break; } + hdr = ipv6_hdr(skb); } /* @@ -582,6 +583,8 @@ int ip6_forward(struct sk_buff *skb) if (READ_ONCE(net->ipv6.devconf_all->proxy_ndp) && pneigh_lookup(&nd_tbl, net, &hdr->daddr, skb->dev)) { int proxied = ip6_forward_proxy_check(skb); + + hdr = ipv6_hdr(skb); if (proxied > 0) { /* It's tempting to decrease the hop limit * here by 1, as we do at the end of the @@ -1794,6 +1797,8 @@ static int __ip6_append_data(struct sock *sk, if (err < 0) goto error; copy = err; + if (!(flags & MSG_NO_SHARED_FRAGS)) + skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG; wmem_alloc_delta += copy; } else if (!zc) { int i = skb_shinfo(skb)->nr_frags; diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index c468c83af0f20e..9d1037ac082f6a 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -399,11 +399,15 @@ __u16 ip6_tnl_parse_tlv_enc_lim(struct sk_buff *skb, __u8 *raw) unsigned int nhoff = raw - skb->data; unsigned int off = nhoff + sizeof(*ipv6h); u8 nexthdr = ipv6h->nexthdr; + int exthdr_cnt = 0; while (ipv6_ext_hdr(nexthdr) && nexthdr != NEXTHDR_NONE) { struct ipv6_opt_hdr *hdr; u16 optlen; + if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT)) + break; + if (!pskb_may_pull(skb, off + sizeof(*hdr))) break; diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index d585ac3c111335..9d9c3763f2f5e9 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -1713,12 +1713,10 @@ do_ip6t_get_ctl(struct sock *sk, int cmd, void __user *user, int *len) static void __ip6t_unregister_table(struct net *net, struct xt_table *table) { - struct xt_table_info *private; - void *loc_cpu_entry; + struct xt_table_info *private = table->private; struct module *table_owner = table->me; struct ip6t_entry *iter; - - private = xt_unregister_table(table); + void *loc_cpu_entry; /* Decrease module usage counts and free resources */ loc_cpu_entry = private->entries; @@ -1727,19 +1725,18 @@ static void __ip6t_unregister_table(struct net *net, struct xt_table *table) if (private->number > private->initial_entries) module_put(table_owner); xt_free_table_info(private); + kfree(table); } int ip6t_register_table(struct net *net, const struct xt_table *table, const struct ip6t_replace *repl, const struct nf_hook_ops *template_ops) { - struct nf_hook_ops *ops; - unsigned int num_ops; - int ret, i; - struct xt_table_info *newinfo; struct xt_table_info bootstrap = {0}; - void *loc_cpu_entry; + struct xt_table_info *newinfo; struct xt_table *new_table; + void *loc_cpu_entry; + int ret; newinfo = xt_alloc_table_info(repl->size); if (!newinfo) @@ -1754,7 +1751,7 @@ int ip6t_register_table(struct net *net, const struct xt_table *table, return ret; } - new_table = xt_register_table(net, table, &bootstrap, newinfo); + new_table = xt_register_table(net, table, template_ops, &bootstrap, newinfo); if (IS_ERR(new_table)) { struct ip6t_entry *iter; @@ -1764,48 +1761,12 @@ int ip6t_register_table(struct net *net, const struct xt_table *table, return PTR_ERR(new_table); } - if (!template_ops) - return 0; - - num_ops = hweight32(table->valid_hooks); - if (num_ops == 0) { - ret = -EINVAL; - goto out_free; - } - - ops = kmemdup_array(template_ops, num_ops, sizeof(*ops), GFP_KERNEL); - if (!ops) { - ret = -ENOMEM; - goto out_free; - } - - for (i = 0; i < num_ops; i++) - ops[i].priv = new_table; - - new_table->ops = ops; - - ret = nf_register_net_hooks(net, ops, num_ops); - if (ret != 0) - goto out_free; - return ret; - -out_free: - __ip6t_unregister_table(net, new_table); - return ret; -} - -void ip6t_unregister_table_pre_exit(struct net *net, const char *name) -{ - struct xt_table *table = xt_find_table(net, NFPROTO_IPV6, name); - - if (table) - nf_unregister_net_hooks(net, table->ops, hweight32(table->valid_hooks)); } void ip6t_unregister_table_exit(struct net *net, const char *name) { - struct xt_table *table = xt_find_table(net, NFPROTO_IPV6, name); + struct xt_table *table = xt_unregister_table_exit(net, NFPROTO_IPV6, name); if (table) __ip6t_unregister_table(net, table); @@ -1894,7 +1855,6 @@ static void __exit ip6_tables_fini(void) } EXPORT_SYMBOL(ip6t_register_table); -EXPORT_SYMBOL(ip6t_unregister_table_pre_exit); EXPORT_SYMBOL(ip6t_unregister_table_exit); EXPORT_SYMBOL(ip6t_do_table); diff --git a/net/ipv6/netfilter/ip6table_filter.c b/net/ipv6/netfilter/ip6table_filter.c index e8992693e14a04..b074fc4776764c 100644 --- a/net/ipv6/netfilter/ip6table_filter.c +++ b/net/ipv6/netfilter/ip6table_filter.c @@ -60,7 +60,7 @@ static int __net_init ip6table_filter_net_init(struct net *net) static void __net_exit ip6table_filter_net_pre_exit(struct net *net) { - ip6t_unregister_table_pre_exit(net, "filter"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "filter"); } static void __net_exit ip6table_filter_net_exit(struct net *net) @@ -76,32 +76,32 @@ static struct pernet_operations ip6table_filter_net_ops = { static int __init ip6table_filter_init(void) { - int ret = xt_register_template(&packet_filter, - ip6table_filter_table_init); - - if (ret < 0) - return ret; + int ret; filter_ops = xt_hook_ops_alloc(&packet_filter, ip6t_do_table); - if (IS_ERR(filter_ops)) { - xt_unregister_template(&packet_filter); + if (IS_ERR(filter_ops)) return PTR_ERR(filter_ops); - } ret = register_pernet_subsys(&ip6table_filter_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&packet_filter, ip6table_filter_table_init); if (ret < 0) { - xt_unregister_template(&packet_filter); - kfree(filter_ops); - return ret; + unregister_pernet_subsys(&ip6table_filter_net_ops); + goto err_free; } + return 0; +err_free: + kfree(filter_ops); return ret; } static void __exit ip6table_filter_fini(void) { - unregister_pernet_subsys(&ip6table_filter_net_ops); xt_unregister_template(&packet_filter); + unregister_pernet_subsys(&ip6table_filter_net_ops); kfree(filter_ops); } diff --git a/net/ipv6/netfilter/ip6table_mangle.c b/net/ipv6/netfilter/ip6table_mangle.c index 8dd4cd0c47bd4d..e6ee036a9b2c58 100644 --- a/net/ipv6/netfilter/ip6table_mangle.c +++ b/net/ipv6/netfilter/ip6table_mangle.c @@ -89,7 +89,7 @@ static int ip6table_mangle_table_init(struct net *net) static void __net_exit ip6table_mangle_net_pre_exit(struct net *net) { - ip6t_unregister_table_pre_exit(net, "mangle"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "mangle"); } static void __net_exit ip6table_mangle_net_exit(struct net *net) @@ -104,32 +104,33 @@ static struct pernet_operations ip6table_mangle_net_ops = { static int __init ip6table_mangle_init(void) { - int ret = xt_register_template(&packet_mangler, - ip6table_mangle_table_init); - - if (ret < 0) - return ret; + int ret; mangle_ops = xt_hook_ops_alloc(&packet_mangler, ip6table_mangle_hook); - if (IS_ERR(mangle_ops)) { - xt_unregister_template(&packet_mangler); + if (IS_ERR(mangle_ops)) return PTR_ERR(mangle_ops); - } ret = register_pernet_subsys(&ip6table_mangle_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&packet_mangler, + ip6table_mangle_table_init); if (ret < 0) { - xt_unregister_template(&packet_mangler); - kfree(mangle_ops); - return ret; + unregister_pernet_subsys(&ip6table_mangle_net_ops); + goto err_free; } + return 0; +err_free: + kfree(mangle_ops); return ret; } static void __exit ip6table_mangle_fini(void) { - unregister_pernet_subsys(&ip6table_mangle_net_ops); xt_unregister_template(&packet_mangler); + unregister_pernet_subsys(&ip6table_mangle_net_ops); kfree(mangle_ops); } diff --git a/net/ipv6/netfilter/ip6table_nat.c b/net/ipv6/netfilter/ip6table_nat.c index 5be723232df8f1..c2394e2c94b562 100644 --- a/net/ipv6/netfilter/ip6table_nat.c +++ b/net/ipv6/netfilter/ip6table_nat.c @@ -121,8 +121,11 @@ static int ip6table_nat_table_init(struct net *net) } ret = ip6t_nat_register_lookups(net); - if (ret < 0) + if (ret < 0) { + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "nat"); + synchronize_rcu(); ip6t_unregister_table_exit(net, "nat"); + } kfree(repl); return ret; @@ -131,6 +134,7 @@ static int ip6table_nat_table_init(struct net *net) static void __net_exit ip6table_nat_net_pre_exit(struct net *net) { ip6t_nat_unregister_lookups(net); + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "nat"); } static void __net_exit ip6table_nat_net_exit(struct net *net) diff --git a/net/ipv6/netfilter/ip6table_raw.c b/net/ipv6/netfilter/ip6table_raw.c index fc9f6754028f2c..3b161ee875bcc1 100644 --- a/net/ipv6/netfilter/ip6table_raw.c +++ b/net/ipv6/netfilter/ip6table_raw.c @@ -52,7 +52,7 @@ static int ip6table_raw_table_init(struct net *net) static void __net_exit ip6table_raw_net_pre_exit(struct net *net) { - ip6t_unregister_table_pre_exit(net, "raw"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "raw"); } static void __net_exit ip6table_raw_net_exit(struct net *net) @@ -75,31 +75,31 @@ static int __init ip6table_raw_init(void) pr_info("Enabling raw table before defrag\n"); } - ret = xt_register_template(table, ip6table_raw_table_init); - if (ret < 0) - return ret; - /* Register hooks */ rawtable_ops = xt_hook_ops_alloc(table, ip6t_do_table); - if (IS_ERR(rawtable_ops)) { - xt_unregister_template(table); + if (IS_ERR(rawtable_ops)) return PTR_ERR(rawtable_ops); - } ret = register_pernet_subsys(&ip6table_raw_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(table, ip6table_raw_table_init); if (ret < 0) { - kfree(rawtable_ops); - xt_unregister_template(table); - return ret; + unregister_pernet_subsys(&ip6table_raw_net_ops); + goto err_free; } + return 0; +err_free: + kfree(rawtable_ops); return ret; } static void __exit ip6table_raw_fini(void) { - unregister_pernet_subsys(&ip6table_raw_net_ops); xt_unregister_template(&packet_raw); + unregister_pernet_subsys(&ip6table_raw_net_ops); kfree(rawtable_ops); } diff --git a/net/ipv6/netfilter/ip6table_security.c b/net/ipv6/netfilter/ip6table_security.c index 4df14a9bae782d..4bd5d97b8ab65d 100644 --- a/net/ipv6/netfilter/ip6table_security.c +++ b/net/ipv6/netfilter/ip6table_security.c @@ -49,7 +49,7 @@ static int ip6table_security_table_init(struct net *net) static void __net_exit ip6table_security_net_pre_exit(struct net *net) { - ip6t_unregister_table_pre_exit(net, "security"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "security"); } static void __net_exit ip6table_security_net_exit(struct net *net) @@ -64,32 +64,33 @@ static struct pernet_operations ip6table_security_net_ops = { static int __init ip6table_security_init(void) { - int ret = xt_register_template(&security_table, - ip6table_security_table_init); - - if (ret < 0) - return ret; + int ret; sectbl_ops = xt_hook_ops_alloc(&security_table, ip6t_do_table); - if (IS_ERR(sectbl_ops)) { - xt_unregister_template(&security_table); + if (IS_ERR(sectbl_ops)) return PTR_ERR(sectbl_ops); - } ret = register_pernet_subsys(&ip6table_security_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&security_table, + ip6table_security_table_init); if (ret < 0) { - kfree(sectbl_ops); - xt_unregister_template(&security_table); - return ret; + unregister_pernet_subsys(&ip6table_security_net_ops); + goto err_free; } + return 0; +err_free: + kfree(sectbl_ops); return ret; } static void __exit ip6table_security_fini(void) { - unregister_pernet_subsys(&ip6table_security_net_ops); xt_unregister_template(&security_table); + unregister_pernet_subsys(&ip6table_security_net_ops); kfree(sectbl_ops); } diff --git a/net/ipv6/netfilter/nf_socket_ipv6.c b/net/ipv6/netfilter/nf_socket_ipv6.c index ced8bd44828ef7..893f2aeb471145 100644 --- a/net/ipv6/netfilter/nf_socket_ipv6.c +++ b/net/ipv6/netfilter/nf_socket_ipv6.c @@ -100,6 +100,7 @@ struct sock *nf_sk_lookup_slow_v6(struct net *net, const struct sk_buff *skb, const struct in6_addr *daddr = NULL, *saddr = NULL; struct ipv6hdr *iph = ipv6_hdr(skb), ipv6_var; struct sk_buff *data_skb = NULL; + unsigned short fragoff = 0; int doff = 0; int thoff = 0, tproto; #if IS_ENABLED(CONFIG_NF_CONNTRACK) @@ -107,8 +108,8 @@ struct sock *nf_sk_lookup_slow_v6(struct net *net, const struct sk_buff *skb, struct nf_conn const *ct; #endif - tproto = ipv6_find_hdr(skb, &thoff, -1, NULL, NULL); - if (tproto < 0) { + tproto = ipv6_find_hdr(skb, &thoff, -1, &fragoff, NULL); + if (tproto < 0 || fragoff) { pr_debug("unable to find transport header in IPv6 packet, dropping\n"); return NULL; } diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 19eb6b70222785..e3d355d1fbd627 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1645,6 +1645,10 @@ static unsigned int fib6_mtu(const struct fib6_result *res) rcu_read_lock(); idev = __in6_dev_get(dev); + if (!idev) { + rcu_read_unlock(); + return 0; + } mtu = READ_ONCE(idev->cnf.mtu6); rcu_read_unlock(); } @@ -4995,6 +4999,7 @@ static int fib6_ifdown(struct fib6_info *rt, void *p_arg) rt->fib6_flags & (RTF_LOCAL | RTF_ANYCAST)) break; rt->fib6_nh->fib_nh_flags |= RTNH_F_LINKDOWN; + fib6_update_sernum(net, rt); rt6_multipath_rebalance(rt); break; } diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 2c3f7a739709d7..d13d49bfef1945 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -288,8 +288,10 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr_unsized *uaddr, saddr = &fl6->saddr; err = inet_bhash2_update_saddr(sk, saddr, AF_INET6); - if (err) + if (err) { + dst_release(dst); goto failure; + } } /* set the source address */ @@ -1617,12 +1619,13 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb) if (sk->sk_state == TCP_LISTEN) { struct sock *nsk = tcp_v6_cookie_check(sk, skb); + if (!nsk) + return 0; if (nsk != sk) { - if (nsk) { - reason = tcp_child_process(sk, nsk, skb); - if (reason) - goto reset; - } + reason = tcp_child_process(sk, nsk, skb); + sock_put(nsk); + if (reason) + goto reset; return 0; } } else @@ -1827,8 +1830,10 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb) rst_reason = sk_rst_convert_drop_reason(drop_reason); tcp_v6_send_reset(nsk, skb, rst_reason); + sock_put(nsk); goto discard_and_relse; } + sock_put(nsk); sock_put(sk); return 0; } diff --git a/net/ipv6/xfrm6_protocol.c b/net/ipv6/xfrm6_protocol.c index ea2f805d3b014c..9b586fcec4850b 100644 --- a/net/ipv6/xfrm6_protocol.c +++ b/net/ipv6/xfrm6_protocol.c @@ -88,8 +88,10 @@ int xfrm6_rcv_encap(struct sk_buff *skb, int nexthdr, __be32 spi, dst = ip6_route_input_lookup(dev_net(skb->dev), skb->dev, &fl6, skb, flags); - if (dst->error) + if (dst->error) { + dst_release(dst); goto drop; + } skb_dst_set(skb, dst); } diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index 160ae65a5c6458..0a0f27836d5706 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -437,6 +437,15 @@ ieee80211_verify_sta_ht_mcs_support(struct ieee80211_sub_if_data *sdata, memcpy(&sta_ht_cap, &sband->ht_cap, sizeof(sta_ht_cap)); ieee80211_apply_htcap_overrides(sdata, &sta_ht_cap); + /* + * Some Xfinity XB8 firmware advertises >1 spatial stream MCS indexes in + * their basic HT-MCS set. On cards with lower spatial streams, the check + * would fail, and we'd be stuck with no HT when it in fact work fine with + * its own supported rate. So check it only in strict mode. + */ + if (!ieee80211_hw_check(&sdata->local->hw, STRICT)) + return true; + /* * P802.11REVme/D7.0 - 6.5.4.2.4 * ... @@ -9140,7 +9149,7 @@ static int ieee80211_prep_connection(struct ieee80211_sub_if_data *sdata, struct ieee80211_bss *bss = (void *)cbss->priv; struct sta_info *new_sta = NULL; struct ieee80211_link_data *link; - bool have_sta = false; + struct sta_info *have_sta = NULL; bool mlo; int err; u16 new_links; @@ -9159,11 +9168,8 @@ static int ieee80211_prep_connection(struct ieee80211_sub_if_data *sdata, mlo = false; } - if (assoc) { - rcu_read_lock(); + if (assoc) have_sta = sta_info_get(sdata, ap_mld_addr); - rcu_read_unlock(); - } if (mlo && !have_sta && WARN_ON(sdata->vif.valid_links || sdata->vif.active_links)) @@ -9327,6 +9333,8 @@ static int ieee80211_prep_connection(struct ieee80211_sub_if_data *sdata, out_release_chan: ieee80211_link_release_channel(link); out_err: + if (mlo && have_sta) + WARN_ON(__sta_info_destroy(have_sta)); ieee80211_vif_set_links(sdata, 0, 0); return err; } diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index 3e5d1c47a5b067..d18e962126ce4d 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -4971,7 +4971,7 @@ static bool ieee80211_invoke_fast_rx(struct ieee80211_rx_data *rx, struct sk_buff *skb = rx->skb; struct ieee80211_hdr *hdr = (void *)skb->data; struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb); - static ieee80211_rx_result res; + ieee80211_rx_result res; int orig_len = skb->len; int hdrlen = ieee80211_hdrlen(hdr->frame_control); int snap_offs = hdrlen; @@ -5380,7 +5380,9 @@ static void __ieee80211_rx_handle_packet(struct ieee80211_hw *hw, if (!link_sta) goto out; - ieee80211_rx_data_set_link(&rx, link_sta->link_id); + if (!ieee80211_rx_data_set_link(&rx, + link_sta->link_id)) + goto out; } if (ieee80211_prepare_and_rx_handle(&rx, skb, true)) diff --git a/net/mac80211/tests/chan-mode.c b/net/mac80211/tests/chan-mode.c index adc069065e73dd..fa370831d61711 100644 --- a/net/mac80211/tests/chan-mode.c +++ b/net/mac80211/tests/chan-mode.c @@ -65,6 +65,7 @@ static const struct determine_chan_mode_case { .ht_capa_mask = { .mcs.rx_mask[0] = 0xf7, }, + .strict = true, }, { .desc = "Masking out a RX rate in VHT capabilities", .conn_mode = IEEE80211_CONN_MODE_EHT, diff --git a/net/mac80211/util.c b/net/mac80211/util.c index b093bc203c8159..2529b01e2cd55c 100644 --- a/net/mac80211/util.c +++ b/net/mac80211/util.c @@ -3700,11 +3700,11 @@ void ieee80211_dfs_radar_detected_work(struct wiphy *wiphy, struct ieee80211_local *local = container_of(work, struct ieee80211_local, radar_detected_work); struct cfg80211_chan_def chandef; - struct ieee80211_chanctx *ctx; + struct ieee80211_chanctx *ctx, *tmp; lockdep_assert_wiphy(local->hw.wiphy); - list_for_each_entry(ctx, &local->chanctx_list, list) { + list_for_each_entry_safe(ctx, tmp, &local->chanctx_list, list) { if (ctx->replace_state == IEEE80211_CHANCTX_REPLACES_OTHER) continue; diff --git a/net/mctp/test/route-test.c b/net/mctp/test/route-test.c index e1033643fab0e2..e4b230ef609969 100644 --- a/net/mctp/test/route-test.c +++ b/net/mctp/test/route-test.c @@ -920,9 +920,9 @@ static void mctp_test_route_input_cloned_frag(struct kunit *test) static void mctp_test_route_input_null_eid(struct kunit *test) { struct mctp_hdr hdr = RX_HDR(1, 10, 0, FL_S | FL_E | FL_TO); + struct sockaddr_mctp addr = { 0 }; struct sk_buff *skb_pkt, *skb_sk; struct mctp_test_dev *dev; - struct sockaddr_mctp addr; struct socket *sock; u8 type = 0; int rc; diff --git a/net/mctp/test/utils.c b/net/mctp/test/utils.c index c3987d5ade7adc..6eef8d485c25e2 100644 --- a/net/mctp/test/utils.c +++ b/net/mctp/test/utils.c @@ -116,7 +116,7 @@ void mctp_test_destroy_dev(struct mctp_test_dev *dev) static int mctp_test_dst_output(struct mctp_dst *dst, struct sk_buff *skb) { skb->dev = dst->dev->dev; - dev_queue_xmit(skb); + dev_direct_xmit(skb, 0); return 0; } diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c index 8a16672b94e238..4cc16cbeb32817 100644 --- a/net/mptcp/bpf.c +++ b/net/mptcp/bpf.c @@ -14,7 +14,7 @@ struct mptcp_sock *bpf_mptcp_sock_from_subflow(struct sock *sk) { - if (sk && sk_fullsock(sk) && sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk)) + if (sk && sk_fullsock(sk) && sk_is_tcp(sk) && sk_is_mptcp(sk)) return mptcp_sk(mptcp_subflow_ctx(sk)->conn); return NULL; diff --git a/net/mptcp/fastopen.c b/net/mptcp/fastopen.c index 82ec15bcfd7f56..082c46c0f50ee7 100644 --- a/net/mptcp/fastopen.c +++ b/net/mptcp/fastopen.c @@ -12,6 +12,7 @@ void mptcp_fastopen_subflow_synack_set_params(struct mptcp_subflow_context *subf struct sock *sk, *ssk; struct sk_buff *skb; struct tcp_sock *tp; + bool has_rxtstamp; /* on early fallback the subflow context is deleted by * subflow_syn_recv_sock() @@ -40,12 +41,13 @@ void mptcp_fastopen_subflow_synack_set_params(struct mptcp_subflow_context *subf */ tp->copied_seq += skb->len; subflow->ssn_offset += skb->len; + has_rxtstamp = TCP_SKB_CB(skb)->has_rxtstamp; /* Only the sequence delta is relevant */ MPTCP_SKB_CB(skb)->map_seq = -skb->len; MPTCP_SKB_CB(skb)->end_seq = 0; MPTCP_SKB_CB(skb)->offset = 0; - MPTCP_SKB_CB(skb)->has_rxtstamp = TCP_SKB_CB(skb)->has_rxtstamp; + MPTCP_SKB_CB(skb)->has_rxtstamp = has_rxtstamp; MPTCP_SKB_CB(skb)->cant_coalesce = 1; mptcp_data_lock(sk); diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index 57a45669040679..3c152bf66cd5ac 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -16,6 +16,7 @@ struct mptcp_pm_add_entry { struct list_head list; struct mptcp_addr_info addr; u8 retrans_times; + bool timer_done; struct timer_list add_timer; struct mptcp_sock *sock; struct rcu_head rcu; @@ -283,6 +284,9 @@ int mptcp_pm_mp_prio_send_ack(struct mptcp_sock *msk, struct sock *ssk = mptcp_subflow_tcp_sock(subflow); struct mptcp_addr_info local, remote; + if (!__mptcp_subflow_active(subflow)) + continue; + mptcp_local_address((struct sock_common *)ssk, &local); if (!mptcp_addresses_equal(&local, addr, addr->port)) continue; @@ -305,18 +309,31 @@ static unsigned int mptcp_adjust_add_addr_timeout(struct mptcp_sock *msk) const struct net *net = sock_net((struct sock *)msk); unsigned int rto = mptcp_get_add_addr_timeout(net); struct mptcp_subflow_context *subflow; - unsigned int max = 0; + unsigned int max = 0, max_stale = 0; + + if (!rto) + return 0; mptcp_for_each_subflow(msk, subflow) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); struct inet_connection_sock *icsk = inet_csk(ssk); - if (icsk->icsk_rto > max) + if (!__mptcp_subflow_active(subflow)) + continue; + + if (unlikely(subflow->stale)) { + if (icsk->icsk_rto > max_stale) + max_stale = icsk->icsk_rto; + } else if (icsk->icsk_rto > max) { max = icsk->icsk_rto; + } } - if (max && max < rto) - rto = max; + if (max) + return min(max, rto); + + if (max_stale) + return min(max_stale, rto); return rto; } @@ -327,26 +344,22 @@ static void mptcp_pm_add_timer(struct timer_list *timer) add_timer); struct mptcp_sock *msk = entry->sock; struct sock *sk = (struct sock *)msk; - unsigned int timeout; + unsigned int timeout = 0; pr_debug("msk=%p\n", msk); - if (!msk) - return; - - if (inet_sk_state_load(sk) == TCP_CLOSE) - return; - - if (!entry->addr.id) - return; + bh_lock_sock(sk); + if (unlikely(inet_sk_state_load(sk) == TCP_CLOSE)) + goto out; - if (mptcp_pm_should_add_signal_addr(msk)) { - sk_reset_timer(sk, timer, jiffies + TCP_RTO_MAX / 8); + if (sock_owned_by_user(sk)) { + /* Try again later. */ + timeout = HZ / 20; goto out; } timeout = mptcp_adjust_add_addr_timeout(msk); - if (!timeout) + if (!timeout || mptcp_pm_should_add_signal_addr(msk)) goto out; spin_lock_bh(&msk->pm.lock); @@ -359,8 +372,9 @@ static void mptcp_pm_add_timer(struct timer_list *timer) } if (entry->retrans_times < ADD_ADDR_RETRANS_MAX) - sk_reset_timer(sk, timer, - jiffies + (timeout << entry->retrans_times)); + timeout <<= entry->retrans_times; + else + timeout = 0; spin_unlock_bh(&msk->pm.lock); @@ -368,7 +382,13 @@ static void mptcp_pm_add_timer(struct timer_list *timer) mptcp_pm_subflow_established(msk); out: - __sock_put(sk); + if (timeout) + sk_reset_timer(sk, timer, jiffies + timeout); + else + /* if sock_put calls sk_free: avoid waiting for this timer */ + entry->timer_done = true; + bh_unlock_sock(sk); + sock_put(sk); } struct mptcp_pm_add_entry * @@ -431,6 +451,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0); reset_timer: + add_entry->timer_done = false; timeout = mptcp_adjust_add_addr_timeout(msk); if (timeout) sk_reset_timer(sk, &add_entry->add_timer, jiffies + timeout); @@ -451,7 +472,8 @@ static void mptcp_pm_free_anno_list(struct mptcp_sock *msk) spin_unlock_bh(&msk->pm.lock); list_for_each_entry_safe(entry, tmp, &free_list, list) { - sk_stop_timer_sync(sk, &entry->add_timer); + if (!entry->timer_done) + sk_stop_timer_sync(sk, &entry->add_timer); kfree_rcu(entry, rcu); } } diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c index c9f1e5af3cd3ec..fc818b63752e37 100644 --- a/net/mptcp/pm_kernel.c +++ b/net/mptcp/pm_kernel.c @@ -347,6 +347,8 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) /* check first for announce */ if (msk->pm.add_addr_signaled < endp_signal_max) { + u8 endp_id; + /* due to racing events on both ends we can reach here while * previous add address is still running: if we invoke now * mptcp_pm_announce_addr(), that will fail and the @@ -360,19 +362,20 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) if (!select_signal_address(pernet, msk, &local)) goto subflow; + /* Special case for ID0: set the correct ID */ + endp_id = local.addr.id; + if (endp_id == msk->mpc_endpoint_id) + local.addr.id = 0; + /* If the alloc fails, we are on memory pressure, not worth * continuing, and trying to create subflows. */ if (!mptcp_pm_alloc_anno_list(msk, &local.addr)) return; - __clear_bit(local.addr.id, msk->pm.id_avail_bitmap); + __clear_bit(endp_id, msk->pm.id_avail_bitmap); msk->pm.add_addr_signaled++; - /* Special case for ID0: set the correct ID */ - if (local.addr.id == msk->mpc_endpoint_id) - local.addr.id = 0; - mptcp_pm_announce_addr(msk, &local.addr, false); mptcp_pm_addr_send_ack(msk); diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c index 0efe40be2fde07..1cf608e7357bda 100644 --- a/net/mptcp/sockopt.c +++ b/net/mptcp/sockopt.c @@ -812,6 +812,10 @@ static int mptcp_setsockopt_all_sf(struct mptcp_sock *msk, int level, if (ret) break; } + + if (!ret) + sockopt_seq_inc(msk); + return ret; } diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index e2cb9d23e4a0b2..d562e149606f60 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -581,7 +581,7 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb) subflow->backup); if (!subflow_thmac_valid(subflow)) { - MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_JOINACKMAC); + MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_JOINSYNACKMAC); subflow->reset_reason = MPTCP_RST_EMPTCP; goto do_reset; } @@ -908,7 +908,7 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk, if (!subflow_hmac_valid(subflow_req, &mp_opt)) { SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_JOINACKMAC); - subflow_add_reset_reason(skb, MPTCP_RST_EPROHIBIT); + subflow_add_reset_reason(skb, MPTCP_RST_EMPTCP); goto dispose_child; } diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c index 2082bfb2d93cdf..9ea6b4fa78bf00 100644 --- a/net/netfilter/ipvs/ip_vs_conn.c +++ b/net/netfilter/ipvs/ip_vs_conn.c @@ -267,27 +267,20 @@ static inline int ip_vs_conn_hash(struct ip_vs_conn *cp) hash_key2 = hash_key; use2 = false; } + conn_tab_lock(t, cp, hash_key, hash_key2, use2, true /* new_hash */, &head, &head2); - spin_lock(&cp->lock); - - if (!(cp->flags & IP_VS_CONN_F_HASHED)) { - cp->flags |= IP_VS_CONN_F_HASHED; - WRITE_ONCE(cp->hn0.hash_key, hash_key); - WRITE_ONCE(cp->hn1.hash_key, hash_key2); - refcount_inc(&cp->refcnt); - hlist_bl_add_head_rcu(&cp->hn0.node, head); - if (use2) - hlist_bl_add_head_rcu(&cp->hn1.node, head2); - ret = 1; - } else { - pr_err("%s(): request for already hashed, called from %pS\n", - __func__, __builtin_return_address(0)); - ret = 0; - } - spin_unlock(&cp->lock); + cp->flags |= IP_VS_CONN_F_HASHED; + WRITE_ONCE(cp->hn0.hash_key, hash_key); + WRITE_ONCE(cp->hn1.hash_key, hash_key2); + refcount_inc(&cp->refcnt); + hlist_bl_add_head_rcu(&cp->hn0.node, head); + if (use2) + hlist_bl_add_head_rcu(&cp->hn1.node, head2); + conn_tab_unlock(head, head2); + ret = 1; /* Schedule resizing if load increases */ if (atomic_read(&ipvs->conn_count) > t->u_thresh && @@ -321,7 +314,6 @@ static inline bool ip_vs_conn_unlink(struct ip_vs_conn *cp) conn_tab_lock(t, cp, hash_key, hash_key2, use2, false /* new_hash */, &head, &head2); - spin_lock(&cp->lock); if (cp->flags & IP_VS_CONN_F_HASHED) { /* Decrease refcnt and unlink conn only if we are last user */ @@ -334,7 +326,6 @@ static inline bool ip_vs_conn_unlink(struct ip_vs_conn *cp) } } - spin_unlock(&cp->lock); conn_tab_unlock(head, head2); rcu_read_unlock(); @@ -637,6 +628,7 @@ void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport) struct ip_vs_conn_hnode *hn; u32 hash_key, hash_key_new; struct ip_vs_conn_param p; + bool by_me = false; int ntbl; int dir; @@ -664,8 +656,16 @@ void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport) t = rcu_dereference(t->new_tbl); ntbl++; /* We are lost? */ - if (ntbl >= 2) + if (ntbl >= 2) { + spin_lock_bh(&cp->lock); + if (cp->flags & IP_VS_CONN_F_NO_CPORT && by_me) + cp->cport = 0; + /* hn1 will be rehashed on next packet */ + spin_unlock_bh(&cp->lock); + IP_VS_ERR_RL("%s(): Too many ht changes for dir %d\n", + __func__, dir); return; + } } /* Rehashing during resize? Use the recent table for adds */ @@ -683,10 +683,13 @@ void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport) if (head > head2 && t == t2) swap(head, head2); + /* Protect the cp->flags modification */ + spin_lock_bh(&cp->lock); + /* Lock seqcount only for the old bucket, even if we are on new table * because it affects the del operation, not the adding. */ - spin_lock_bh(&t->lock[hash_key & t->lock_mask].l); + spin_lock(&t->lock[hash_key & t->lock_mask].l); preempt_disable_nested(); write_seqcount_begin(&t->seqc[hash_key & t->seqc_mask]); @@ -704,14 +707,23 @@ void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport) hlist_bl_unlock(head); write_seqcount_end(&t->seqc[hash_key & t->seqc_mask]); preempt_enable_nested(); - spin_unlock_bh(&t->lock[hash_key & t->lock_mask].l); + spin_unlock(&t->lock[hash_key & t->lock_mask].l); + spin_unlock_bh(&cp->lock); hash_key = hash_key_new; goto retry; } - spin_lock(&cp->lock); - if ((cp->flags & IP_VS_CONN_F_NO_CPORT) && - (cp->flags & IP_VS_CONN_F_HASHED)) { + /* Fill cport once, even if multiple packets try to do it */ + if (cp->flags & IP_VS_CONN_F_NO_CPORT && (!cp->cport || by_me)) { + /* If we race with resizing make sure cport is set for dir 1 */ + if (!cp->cport) { + cp->cport = cport; + by_me = true; + } + if (!dir) { + atomic_dec(&ipvs->no_cport_conns[af_id]); + cp->flags &= ~IP_VS_CONN_F_NO_CPORT; + } /* We do not recalc hash_key_r under lock, we assume the * parameters in cp do not change, i.e. cport is * the only possible change. @@ -726,21 +738,17 @@ void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport) hlist_bl_del_rcu(&hn->node); hlist_bl_add_head_rcu(&hn->node, head_new); } - if (!dir) { - atomic_dec(&ipvs->no_cport_conns[af_id]); - cp->flags &= ~IP_VS_CONN_F_NO_CPORT; - cp->cport = cport; - } } - spin_unlock(&cp->lock); if (head != head2) hlist_bl_unlock(head2); hlist_bl_unlock(head); write_seqcount_end(&t->seqc[hash_key & t->seqc_mask]); preempt_enable_nested(); - spin_unlock_bh(&t->lock[hash_key & t->lock_mask].l); - if (dir--) + spin_unlock(&t->lock[hash_key & t->lock_mask].l); + + spin_unlock_bh(&cp->lock); + if (dir-- && by_me) goto next_dir; } @@ -1835,7 +1843,7 @@ static void ip_vs_conn_flush(struct netns_ipvs *ipvs) if (!rcu_dereference_protected(ipvs->conn_tab, 1)) return; - cancel_delayed_work_sync(&ipvs->conn_resize_work); + disable_delayed_work_sync(&ipvs->conn_resize_work); if (!atomic_read(&ipvs->conn_count)) goto unreg; diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c index f5b7a204729134..d40b404c1bf646 100644 --- a/net/netfilter/ipvs/ip_vs_core.c +++ b/net/netfilter/ipvs/ip_vs_core.c @@ -237,7 +237,7 @@ int ip_vs_rht_desired_size(struct netns_ipvs *ipvs, struct ip_vs_rht *t, int n, { if (!t) return 1 << min_bits; - n = roundup_pow_of_two(n); + n = n > 0 ? roundup_pow_of_two(n) : 1; if (lfactor < 0) { int factor = min(-lfactor, max_bits); diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index 6632daa87ded1c..c7c7f6a7a9f6df 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -261,12 +261,28 @@ static void est_reload_work_handler(struct work_struct *work) if (!kd) continue; /* New config ? Stop kthread tasks */ - if (genid != genid_done) - ip_vs_est_kthread_stop(kd); + if (genid != genid_done) { + if (!id) { + /* Only we can stop kt 0 but not under mutex */ + mutex_unlock(&ipvs->est_mutex); + ip_vs_est_kthread_stop(kd); + mutex_lock(&ipvs->est_mutex); + if (!READ_ONCE(ipvs->enable)) + goto unlock; + /* kd for kt 0 is never destroyed */ + } else { + ip_vs_est_kthread_stop(kd); + } + } if (!kd->task && !ip_vs_est_stopped(ipvs)) { + bool start; + /* Do not start kthreads above 0 in calc phase */ - if ((!id || !ipvs->est_calc_phase) && - ip_vs_est_kthread_start(ipvs, kd) < 0) + if (id) + start = !ipvs->est_calc_phase; + else + start = kd->needed; + if (start && ip_vs_est_kthread_start(ipvs, kd) < 0) repeat = true; } } @@ -1102,6 +1118,24 @@ ip_vs_trash_get_dest(struct ip_vs_service *svc, int dest_af, return dest; } +/* Put destination in trash */ +static void ip_vs_trash_put_dest(struct netns_ipvs *ipvs, + struct ip_vs_dest *dest, unsigned long istart, + bool cleanup) +{ + spin_lock_bh(&ipvs->dest_trash_lock); + IP_VS_DBG_BUF(3, "Moving dest %s:%u into trash, dest->refcnt=%d\n", + IP_VS_DBG_ADDR(dest->af, &dest->addr), ntohs(dest->port), + refcount_read(&dest->refcnt)); + if (list_empty(&ipvs->dest_trash) && !cleanup) + mod_timer(&ipvs->dest_trash_timer, + jiffies + (IP_VS_DEST_TRASH_PERIOD >> 1)); + /* dest lives in trash with reference */ + list_add(&dest->t_list, &ipvs->dest_trash); + dest->idle_start = istart; + spin_unlock_bh(&ipvs->dest_trash_lock); +} + static void ip_vs_dest_rcu_free(struct rcu_head *head) { struct ip_vs_dest *dest; @@ -1461,9 +1495,12 @@ ip_vs_add_dest(struct ip_vs_service *svc, struct ip_vs_dest_user_kern *udest) ntohs(dest->vport)); ret = ip_vs_start_estimator(svc->ipvs, &dest->stats); + /* On error put back dest into the trash */ if (ret < 0) - return ret; - __ip_vs_update_dest(svc, dest, udest, 1); + ip_vs_trash_put_dest(svc->ipvs, dest, dest->idle_start, + false); + else + __ip_vs_update_dest(svc, dest, udest, 1); } else { /* * Allocate and initialize the dest structure @@ -1533,17 +1570,7 @@ static void __ip_vs_del_dest(struct netns_ipvs *ipvs, struct ip_vs_dest *dest, */ ip_vs_rs_unhash(dest); - spin_lock_bh(&ipvs->dest_trash_lock); - IP_VS_DBG_BUF(3, "Moving dest %s:%u into trash, dest->refcnt=%d\n", - IP_VS_DBG_ADDR(dest->af, &dest->addr), ntohs(dest->port), - refcount_read(&dest->refcnt)); - if (list_empty(&ipvs->dest_trash) && !cleanup) - mod_timer(&ipvs->dest_trash_timer, - jiffies + (IP_VS_DEST_TRASH_PERIOD >> 1)); - /* dest lives in trash with reference */ - list_add(&dest->t_list, &ipvs->dest_trash); - dest->idle_start = 0; - spin_unlock_bh(&ipvs->dest_trash_lock); + ip_vs_trash_put_dest(ipvs, dest, 0, cleanup); /* Queue up delayed work to expire all no destination connections. * No-op when CONFIG_SYSCTL is disabled. @@ -1812,11 +1839,16 @@ ip_vs_add_service(struct netns_ipvs *ipvs, struct ip_vs_service_user_kern *u, *svc_p = svc; if (!READ_ONCE(ipvs->enable)) { + mutex_lock(&ipvs->est_mutex); + /* Now there is a service - full throttle */ WRITE_ONCE(ipvs->enable, 1); + ipvs->est_max_threads = ip_vs_est_max_threads(ipvs); + /* Start estimation for first time */ - ip_vs_est_reload_start(ipvs); + ip_vs_est_reload_start(ipvs, true); + mutex_unlock(&ipvs->est_mutex); } return 0; @@ -2032,6 +2064,9 @@ static int ip_vs_del_service(struct ip_vs_service *svc) cancel_delayed_work_sync(&ipvs->svc_resize_work); if (t) { rcu_assign_pointer(ipvs->svc_table, NULL); + /* Inform readers that table is removed */ + smp_mb__before_atomic(); + atomic_inc(&ipvs->svc_table_changes); while (1) { p = rcu_dereference_protected(t->new_tbl, 1); call_rcu(&t->rcu_head, ip_vs_rht_rcu_free); @@ -2078,6 +2113,9 @@ static int ip_vs_flush(struct netns_ipvs *ipvs, bool cleanup) t = rcu_dereference_protected(ipvs->svc_table, 1); if (t) { rcu_assign_pointer(ipvs->svc_table, NULL); + /* Inform readers that table is removed */ + smp_mb__before_atomic(); + atomic_inc(&ipvs->svc_table_changes); while (1) { p = rcu_dereference_protected(t->new_tbl, 1); call_rcu(&t->rcu_head, ip_vs_rht_rcu_free); @@ -2086,6 +2124,11 @@ static int ip_vs_flush(struct netns_ipvs *ipvs, bool cleanup) t = p; } } + /* Stop the tot_stats estimator early under service_mutex + * to avoid locking it again later. + */ + if (cleanup) + ip_vs_stop_estimator_tot_stats(ipvs); return 0; } @@ -2331,7 +2374,7 @@ static int ipvs_proc_est_cpumask_set(const struct ctl_table *table, /* est_max_threads may depend on cpulist size */ ipvs->est_max_threads = ip_vs_est_max_threads(ipvs); ipvs->est_calc_phase = 1; - ip_vs_est_reload_start(ipvs); + ip_vs_est_reload_start(ipvs, true); unlock: mutex_unlock(&ipvs->est_mutex); @@ -2351,11 +2394,14 @@ static int ipvs_proc_est_cpumask_get(const struct ctl_table *table, mutex_lock(&ipvs->est_mutex); - if (ipvs->est_cpulist_valid) - mask = *valp; - else - mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD); - ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask)); + /* HK_TYPE_KTHREAD cpumask needs RCU protection */ + scoped_guard(rcu) { + if (ipvs->est_cpulist_valid) + mask = *valp; + else + mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD); + ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask)); + } mutex_unlock(&ipvs->est_mutex); @@ -2411,7 +2457,7 @@ static int ipvs_proc_est_nice(const struct ctl_table *table, int write, mutex_lock(&ipvs->est_mutex); if (*valp != val) { *valp = val; - ip_vs_est_reload_start(ipvs); + ip_vs_est_reload_start(ipvs, true); } mutex_unlock(&ipvs->est_mutex); } @@ -2438,7 +2484,7 @@ static int ipvs_proc_run_estimation(const struct ctl_table *table, int write, mutex_lock(&ipvs->est_mutex); if (*valp != val) { *valp = val; - ip_vs_est_reload_start(ipvs); + ip_vs_est_reload_start(ipvs, true); } mutex_unlock(&ipvs->est_mutex); } @@ -2463,7 +2509,7 @@ static int ipvs_proc_conn_lfactor(const struct ctl_table *table, int write, if (val < -8 || val > 8) { ret = -EINVAL; } else { - *valp = val; + WRITE_ONCE(*valp, val); if (rcu_access_pointer(ipvs->conn_tab)) mod_delayed_work(system_unbound_wq, &ipvs->conn_resize_work, 0); @@ -2490,10 +2536,16 @@ static int ipvs_proc_svc_lfactor(const struct ctl_table *table, int write, if (val < -8 || val > 8) { ret = -EINVAL; } else { - *valp = val; - if (rcu_access_pointer(ipvs->svc_table)) + mutex_lock(&ipvs->service_mutex); + WRITE_ONCE(*valp, val); + /* Make sure the services are present */ + if (rcu_access_pointer(ipvs->svc_table) && + READ_ONCE(ipvs->enable) && + !test_bit(IP_VS_WORK_SVC_NORESIZE, + &ipvs->work_flags)) mod_delayed_work(system_unbound_wq, &ipvs->svc_resize_work, 0); + mutex_unlock(&ipvs->service_mutex); } } return ret; @@ -3004,7 +3056,8 @@ static int ip_vs_status_show(struct seq_file *seq, void *v) int old_gen, new_gen; u32 counts[8]; u32 bucket; - int count; + u32 count; + int loops; u32 sum1; u32 sum; int i; @@ -3020,6 +3073,7 @@ static int ip_vs_status_show(struct seq_file *seq, void *v) if (!atomic_read(&ipvs->conn_count)) goto after_conns; old_gen = atomic_read(&ipvs->conn_tab_changes); + loops = 0; repeat_conn: smp_rmb(); /* ipvs->conn_tab and conn_tab_changes */ @@ -3032,8 +3086,11 @@ static int ip_vs_status_show(struct seq_file *seq, void *v) resched_score++; ip_vs_rht_walk_bucket_rcu(t, bucket, head) { count = 0; - hlist_bl_for_each_entry_rcu(hn, e, head, node) + hlist_bl_for_each_entry_rcu(hn, e, head, node) { count++; + if (count >= ARRAY_SIZE(counts) - 1) + break; + } } resched_score += count; if (resched_score >= 100) { @@ -3042,37 +3099,41 @@ static int ip_vs_status_show(struct seq_file *seq, void *v) new_gen = atomic_read(&ipvs->conn_tab_changes); /* New table installed ? */ if (old_gen != new_gen) { + /* Too many changes? */ + if (++loops >= 5) + goto after_conns; old_gen = new_gen; goto repeat_conn; } } - counts[min(count, (int)ARRAY_SIZE(counts) - 1)]++; + counts[count]++; } } for (sum = 0, i = 0; i < ARRAY_SIZE(counts); i++) sum += counts[i]; sum1 = sum - counts[0]; - seq_printf(seq, "Conn buckets empty:\t%u (%lu%%)\n", - counts[0], (unsigned long)counts[0] * 100 / max(sum, 1U)); + seq_printf(seq, "Conn buckets empty:\t%u (%llu%%)\n", + counts[0], div_u64((u64)counts[0] * 100U, max(sum, 1U))); for (i = 1; i < ARRAY_SIZE(counts); i++) { if (!counts[i]) continue; - seq_printf(seq, "Conn buckets len-%d:\t%u (%lu%%)\n", + seq_printf(seq, "Conn buckets len-%d:\t%u (%llu%%)\n", i, counts[i], - (unsigned long)counts[i] * 100 / max(sum1, 1U)); + div_u64((u64)counts[i] * 100U, max(sum1, 1U))); } after_conns: t = rcu_dereference(ipvs->svc_table); count = ip_vs_get_num_services(ipvs); - seq_printf(seq, "Services:\t%d\n", count); + seq_printf(seq, "Services:\t%u\n", count); seq_printf(seq, "Service buckets:\t%d (%d bits, lfactor %d)\n", t ? t->size : 0, t ? t->bits : 0, t ? t->lfactor : 0); if (!count) goto after_svc; old_gen = atomic_read(&ipvs->svc_table_changes); + loops = 0; repeat_svc: smp_rmb(); /* ipvs->svc_table and svc_table_changes */ @@ -3086,8 +3147,11 @@ static int ip_vs_status_show(struct seq_file *seq, void *v) ip_vs_rht_walk_bucket_rcu(t, bucket, head) { count = 0; hlist_bl_for_each_entry_rcu(svc, e, head, - s_list) + s_list) { count++; + if (count >= ARRAY_SIZE(counts) - 1) + break; + } } resched_score += count; if (resched_score >= 100) { @@ -3096,24 +3160,27 @@ static int ip_vs_status_show(struct seq_file *seq, void *v) new_gen = atomic_read(&ipvs->svc_table_changes); /* New table installed ? */ if (old_gen != new_gen) { + /* Too many changes? */ + if (++loops >= 5) + goto after_svc; old_gen = new_gen; goto repeat_svc; } } - counts[min(count, (int)ARRAY_SIZE(counts) - 1)]++; + counts[count]++; } } for (sum = 0, i = 0; i < ARRAY_SIZE(counts); i++) sum += counts[i]; sum1 = sum - counts[0]; - seq_printf(seq, "Service buckets empty:\t%u (%lu%%)\n", - counts[0], (unsigned long)counts[0] * 100 / max(sum, 1U)); + seq_printf(seq, "Service buckets empty:\t%u (%llu%%)\n", + counts[0], div_u64((u64)counts[0] * 100U, max(sum, 1U))); for (i = 1; i < ARRAY_SIZE(counts); i++) { if (!counts[i]) continue; - seq_printf(seq, "Service buckets len-%d:\t%u (%lu%%)\n", + seq_printf(seq, "Service buckets len-%d:\t%u (%llu%%)\n", i, counts[i], - (unsigned long)counts[i] * 100 / max(sum1, 1U)); + div_u64((u64)counts[i] * 100U, max(sum1, 1U))); } after_svc: @@ -4967,7 +5034,14 @@ static void __net_exit ip_vs_control_net_cleanup_sysctl(struct netns_ipvs *ipvs) cancel_delayed_work_sync(&ipvs->defense_work); cancel_work_sync(&ipvs->defense_work.work); unregister_net_sysctl_table(ipvs->sysctl_hdr); - ip_vs_stop_estimator(ipvs, &ipvs->tot_stats->s); + if (ipvs->tot_stats->s.est.ktid != -2) { + /* Not stopped yet? This happens only on netns init error and + * we even do not need to lock the service_mutex for this case. + */ + mutex_lock(&ipvs->service_mutex); + ip_vs_stop_estimator(ipvs, &ipvs->tot_stats->s); + mutex_unlock(&ipvs->service_mutex); + } if (ipvs->est_cpulist_valid) free_cpumask_var(ipvs->sysctl_est_cpulist); @@ -5039,7 +5113,7 @@ int __net_init ip_vs_control_net_init(struct netns_ipvs *ipvs) ipvs->net->proc_net, ip_vs_stats_percpu_show, NULL)) goto err_percpu; - if (!proc_create_net_single("ip_vs_status", 0, ipvs->net->proc_net, + if (!proc_create_net_single("ip_vs_status", 0440, ipvs->net->proc_net, ip_vs_status_show, NULL)) goto err_status; #endif diff --git a/net/netfilter/ipvs/ip_vs_est.c b/net/netfilter/ipvs/ip_vs_est.c index 433ba3cab58c2c..ab09f518295122 100644 --- a/net/netfilter/ipvs/ip_vs_est.c +++ b/net/netfilter/ipvs/ip_vs_est.c @@ -68,6 +68,11 @@ and the limit of estimators per kthread - est_add_ktid: ktid where to add new ests, can point to empty slot where we should add kt data + - data protected by service_mutex: est_temp_list, est_add_ktid, + est_kt_count(R/W), est_kt_arr(R/W), est_genid_done, kd->needed(R/W) + - data protected by est_mutex: est_genid, est_max_threads, sysctl_est_cpulist, + est_cpulist_valid, sysctl_est_nice, est_stopped, sysctl_run_estimation, + est_kt_count(R), est_kt_arr(R), kd->needed(R), kd->task (id > 0) */ static struct lock_class_key __ipvs_est_key; @@ -227,14 +232,17 @@ static int ip_vs_estimation_kthread(void *data) } /* Schedule stop/start for kthread tasks */ -void ip_vs_est_reload_start(struct netns_ipvs *ipvs) +void ip_vs_est_reload_start(struct netns_ipvs *ipvs, bool restart) { + lockdep_assert_held(&ipvs->est_mutex); + /* Ignore reloads before first service is added */ if (!READ_ONCE(ipvs->enable)) return; ip_vs_est_stopped_recalc(ipvs); - /* Bump the kthread configuration genid */ - atomic_inc(&ipvs->est_genid); + /* Bump the kthread configuration genid if stopping is requested */ + if (restart) + atomic_inc(&ipvs->est_genid); queue_delayed_work(system_long_wq, &ipvs->est_reload_work, 0); } @@ -304,12 +312,17 @@ static int ip_vs_est_add_kthread(struct netns_ipvs *ipvs) void *arr = NULL; int i; - if ((unsigned long)ipvs->est_kt_count >= ipvs->est_max_threads && - READ_ONCE(ipvs->enable) && ipvs->est_max_threads) - return -EINVAL; - mutex_lock(&ipvs->est_mutex); + /* Allow kt 0 data to be created before the services are added + * and limit the kthreads when services are present. + */ + if ((unsigned long)ipvs->est_kt_count >= ipvs->est_max_threads && + READ_ONCE(ipvs->enable) && ipvs->est_max_threads) { + ret = -EINVAL; + goto out; + } + for (i = 0; i < id; i++) { if (!ipvs->est_kt_arr[i]) break; @@ -333,6 +346,7 @@ static int ip_vs_est_add_kthread(struct netns_ipvs *ipvs) kd->est_timer = jiffies; kd->id = id; ip_vs_est_set_params(ipvs, kd); + kd->needed = 1; /* Pre-allocate stats used in calc phase */ if (!id && !kd->calc_stats) { @@ -341,12 +355,8 @@ static int ip_vs_est_add_kthread(struct netns_ipvs *ipvs) goto out; } - /* Start kthread tasks only when services are present */ - if (READ_ONCE(ipvs->enable) && !ip_vs_est_stopped(ipvs)) { - ret = ip_vs_est_kthread_start(ipvs, kd); - if (ret < 0) - goto out; - } + /* Request kthread to be started */ + ip_vs_est_reload_start(ipvs, false); if (arr) ipvs->est_kt_count++; @@ -482,12 +492,11 @@ static int ip_vs_enqueue_estimator(struct netns_ipvs *ipvs, /* Start estimation for stats */ int ip_vs_start_estimator(struct netns_ipvs *ipvs, struct ip_vs_stats *stats) { + struct ip_vs_est_kt_data *kd = ipvs->est_kt_count > 0 ? + ipvs->est_kt_arr[0] : NULL; struct ip_vs_estimator *est = &stats->est; int ret; - if (!ipvs->est_max_threads && READ_ONCE(ipvs->enable)) - ipvs->est_max_threads = ip_vs_est_max_threads(ipvs); - est->ktid = -1; est->ktrow = IPVS_EST_NTICKS - 1; /* Initial delay */ @@ -496,8 +505,15 @@ int ip_vs_start_estimator(struct netns_ipvs *ipvs, struct ip_vs_stats *stats) * will not allocate much memory, just for kt 0. */ ret = 0; - if (!ipvs->est_kt_count || !ipvs->est_kt_arr[0]) + if (!kd) { ret = ip_vs_est_add_kthread(ipvs); + } else if (!kd->needed) { + mutex_lock(&ipvs->est_mutex); + /* We have job for the kt 0 task */ + kd->needed = 1; + ip_vs_est_reload_start(ipvs, true); + mutex_unlock(&ipvs->est_mutex); + } if (ret >= 0) hlist_add_head(&est->list, &ipvs->est_temp_list); else @@ -578,16 +594,14 @@ void ip_vs_stop_estimator(struct netns_ipvs *ipvs, struct ip_vs_stats *stats) } end_kt0: - /* kt 0 is freed after all other kthreads and chains are empty */ + /* kt 0 task is stopped after all other kt slots and chains are empty */ if (ipvs->est_kt_count == 1 && hlist_empty(&ipvs->est_temp_list)) { kd = ipvs->est_kt_arr[0]; - if (!kd || !kd->est_count) { + if (kd && !kd->est_count) { mutex_lock(&ipvs->est_mutex); - if (kd) { - ip_vs_est_kthread_destroy(kd); - ipvs->est_kt_arr[0] = NULL; - } - ipvs->est_kt_count--; + /* Keep the kt0 data but request kthread_stop */ + kd->needed = 0; + ip_vs_est_reload_start(ipvs, true); mutex_unlock(&ipvs->est_mutex); ipvs->est_add_ktid = 0; } @@ -647,9 +661,9 @@ static int ip_vs_est_calc_limits(struct netns_ipvs *ipvs, int *chain_max) u64 val; INIT_HLIST_HEAD(&chain); - mutex_lock(&ipvs->service_mutex); + mutex_lock(&ipvs->est_mutex); kd = ipvs->est_kt_arr[0]; - mutex_unlock(&ipvs->service_mutex); + mutex_unlock(&ipvs->est_mutex); s = kd ? kd->calc_stats : NULL; if (!s) goto out; @@ -748,16 +762,16 @@ static void ip_vs_est_calc_phase(struct netns_ipvs *ipvs) if (!ip_vs_est_calc_limits(ipvs, &chain_max)) return; - mutex_lock(&ipvs->service_mutex); - /* Stop all other tasks, so that we can immediately move the * estimators to est_temp_list without RCU grace period */ mutex_lock(&ipvs->est_mutex); for (id = 1; id < ipvs->est_kt_count; id++) { /* netns clean up started, abort */ - if (!READ_ONCE(ipvs->enable)) - goto unlock2; + if (kthread_should_stop() || !READ_ONCE(ipvs->enable)) { + mutex_unlock(&ipvs->est_mutex); + return; + } kd = ipvs->est_kt_arr[id]; if (!kd) continue; @@ -765,9 +779,11 @@ static void ip_vs_est_calc_phase(struct netns_ipvs *ipvs) } mutex_unlock(&ipvs->est_mutex); + mutex_lock(&ipvs->service_mutex); + /* Move all estimators to est_temp_list but carefully, * all estimators and kthread data can be released while - * we reschedule. Even for kthread 0. + * we reschedule. */ step = 0; @@ -849,9 +865,7 @@ static void ip_vs_est_calc_phase(struct netns_ipvs *ipvs) ip_vs_stop_estimator(ipvs, stats); /* Tasks are stopped, move without RCU grace period */ est->ktid = -1; - est->ktrow = row - kd->est_row; - if (est->ktrow < 0) - est->ktrow += IPVS_EST_NTICKS; + est->ktrow = delay; hlist_add_head(&est->list, &ipvs->est_temp_list); /* kd freed ? */ if (last) @@ -889,7 +903,6 @@ static void ip_vs_est_calc_phase(struct netns_ipvs *ipvs) if (genid == atomic_read(&ipvs->est_genid)) ipvs->est_calc_phase = 0; -unlock2: mutex_unlock(&ipvs->est_mutex); unlock: diff --git a/net/netfilter/nf_conntrack_broadcast.c b/net/netfilter/nf_conntrack_broadcast.c index 4f39bf7c843f2d..75e53fde6b2974 100644 --- a/net/netfilter/nf_conntrack_broadcast.c +++ b/net/netfilter/nf_conntrack_broadcast.c @@ -72,6 +72,7 @@ int nf_conntrack_broadcast_help(struct sk_buff *skb, exp->flags = NF_CT_EXPECT_PERMANENT; exp->class = NF_CT_EXPECT_CLASS_DEFAULT; rcu_assign_pointer(exp->helper, helper); + rcu_assign_pointer(exp->assign_helper, NULL); write_pnet(&exp->net, net); #ifdef CONFIG_NF_CONNTRACK_ZONES exp->zone = ct->zone; diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index b081892263201c..8ba5b22a1eef2f 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -1811,14 +1811,17 @@ init_conntrack(struct net *net, struct nf_conn *tmpl, spin_lock_bh(&nf_conntrack_expect_lock); exp = nf_ct_find_expectation(net, zone, tuple, !tmpl || nf_ct_is_confirmed(tmpl)); if (exp) { + struct nf_conntrack_helper *assign_helper; + /* Welcome, Mr. Bond. We've been expecting you... */ __set_bit(IPS_EXPECTED_BIT, &ct->status); /* exp->master safe, refcnt bumped in nf_ct_find_expectation */ ct->master = exp->master; - if (exp->helper) { + assign_helper = rcu_dereference(exp->assign_helper); + if (assign_helper) { help = nf_ct_helper_ext_add(ct, GFP_ATOMIC); if (help) - rcu_assign_pointer(help->helper, exp->helper); + rcu_assign_pointer(help->helper, assign_helper); } #ifdef CONFIG_NF_CONNTRACK_MARK diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c index 24d0576d84b7f6..8e943efbdf0a52 100644 --- a/net/netfilter/nf_conntrack_expect.c +++ b/net/netfilter/nf_conntrack_expect.c @@ -344,6 +344,7 @@ void nf_ct_expect_init(struct nf_conntrack_expect *exp, unsigned int class, helper = rcu_dereference(help->helper); rcu_assign_pointer(exp->helper, helper); + rcu_assign_pointer(exp->assign_helper, NULL); write_pnet(&exp->net, net); #ifdef CONFIG_NF_CONNTRACK_ZONES exp->zone = ct->zone; diff --git a/net/netfilter/nf_conntrack_h323_main.c b/net/netfilter/nf_conntrack_h323_main.c index 3f5c50455b716a..b2fe6554b9cf43 100644 --- a/net/netfilter/nf_conntrack_h323_main.c +++ b/net/netfilter/nf_conntrack_h323_main.c @@ -643,7 +643,7 @@ static int expect_h245(struct sk_buff *skb, struct nf_conn *ct, &ct->tuplehash[!dir].tuple.src.u3, &ct->tuplehash[!dir].tuple.dst.u3, IPPROTO_TCP, NULL, &port); - rcu_assign_pointer(exp->helper, &nf_conntrack_helper_h245); + rcu_assign_pointer(exp->assign_helper, &nf_conntrack_helper_h245); nathook = rcu_dereference(nfct_h323_nat_hook); if (memcmp(&ct->tuplehash[dir].tuple.src.u3, @@ -767,7 +767,7 @@ static int expect_callforwarding(struct sk_buff *skb, nf_ct_expect_init(exp, NF_CT_EXPECT_CLASS_DEFAULT, nf_ct_l3num(ct), &ct->tuplehash[!dir].tuple.src.u3, &addr, IPPROTO_TCP, NULL, &port); - rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); + rcu_assign_pointer(exp->assign_helper, nf_conntrack_helper_q931); nathook = rcu_dereference(nfct_h323_nat_hook); if (memcmp(&ct->tuplehash[dir].tuple.src.u3, @@ -1234,7 +1234,7 @@ static int expect_q931(struct sk_buff *skb, struct nf_conn *ct, &ct->tuplehash[!dir].tuple.src.u3 : NULL, &ct->tuplehash[!dir].tuple.dst.u3, IPPROTO_TCP, NULL, &port); - rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); + rcu_assign_pointer(exp->assign_helper, nf_conntrack_helper_q931); exp->flags = NF_CT_EXPECT_PERMANENT; /* Accept multiple calls */ nathook = rcu_dereference(nfct_h323_nat_hook); @@ -1306,7 +1306,7 @@ static int process_gcf(struct sk_buff *skb, struct nf_conn *ct, nf_ct_expect_init(exp, NF_CT_EXPECT_CLASS_DEFAULT, nf_ct_l3num(ct), &ct->tuplehash[!dir].tuple.src.u3, &addr, IPPROTO_UDP, NULL, &port); - rcu_assign_pointer(exp->helper, nf_conntrack_helper_ras); + rcu_assign_pointer(exp->assign_helper, nf_conntrack_helper_ras); if (nf_ct_expect_related(exp, 0) == 0) { pr_debug("nf_ct_ras: expect RAS "); @@ -1523,7 +1523,7 @@ static int process_acf(struct sk_buff *skb, struct nf_conn *ct, &ct->tuplehash[!dir].tuple.src.u3, &addr, IPPROTO_TCP, NULL, &port); exp->flags = NF_CT_EXPECT_PERMANENT; - rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); + rcu_assign_pointer(exp->assign_helper, nf_conntrack_helper_q931); if (nf_ct_expect_related(exp, 0) == 0) { pr_debug("nf_ct_ras: expect Q.931 "); @@ -1577,7 +1577,7 @@ static int process_lcf(struct sk_buff *skb, struct nf_conn *ct, &ct->tuplehash[!dir].tuple.src.u3, &addr, IPPROTO_TCP, NULL, &port); exp->flags = NF_CT_EXPECT_PERMANENT; - rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); + rcu_assign_pointer(exp->assign_helper, nf_conntrack_helper_q931); if (nf_ct_expect_related(exp, 0) == 0) { pr_debug("nf_ct_ras: expect Q.931 "); diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c index a715304a53d8c2..b594cd244fe1d4 100644 --- a/net/netfilter/nf_conntrack_helper.c +++ b/net/netfilter/nf_conntrack_helper.c @@ -400,6 +400,11 @@ static bool expect_iter_me(struct nf_conntrack_expect *exp, void *data) this = rcu_dereference_protected(exp->helper, lockdep_is_held(&nf_conntrack_expect_lock)); + if (this == me) + return true; + + this = rcu_dereference_protected(exp->assign_helper, + lockdep_is_held(&nf_conntrack_expect_lock)); return this == me; } diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index eda5fe4a75c827..befa7e83ee49f5 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -2634,6 +2634,7 @@ static const struct nla_policy exp_nla_policy[CTA_EXPECT_MAX+1] = { static struct nf_conntrack_expect * ctnetlink_alloc_expect(const struct nlattr *const cda[], struct nf_conn *ct, + const struct nf_conntrack_helper *assign_helper, struct nf_conntrack_tuple *tuple, struct nf_conntrack_tuple *mask); @@ -2860,6 +2861,7 @@ static int ctnetlink_glue_attach_expect(const struct nlattr *attr, struct nf_conn *ct, u32 portid, u32 report) { + struct nf_conntrack_helper *assign_helper = NULL; struct nlattr *cda[CTA_EXPECT_MAX+1]; struct nf_conntrack_tuple tuple, mask; struct nf_conntrack_expect *exp; @@ -2870,13 +2872,26 @@ ctnetlink_glue_attach_expect(const struct nlattr *attr, struct nf_conn *ct, if (err < 0) return err; + if (!cda[CTA_EXPECT_TUPLE] || !cda[CTA_EXPECT_MASK]) + return -EINVAL; + err = ctnetlink_glue_exp_parse((const struct nlattr * const *)cda, ct, &tuple, &mask); if (err < 0) return err; + if (cda[CTA_EXPECT_HELP_NAME]) { + const char *helpname = nla_data(cda[CTA_EXPECT_HELP_NAME]); + + assign_helper = __nf_conntrack_helper_find(helpname, + nf_ct_l3num(ct), + tuple.dst.protonum); + if (!assign_helper) + return -EOPNOTSUPP; + } + exp = ctnetlink_alloc_expect((const struct nlattr * const *)cda, ct, - &tuple, &mask); + assign_helper, &tuple, &mask); if (IS_ERR(exp)) return PTR_ERR(exp); @@ -3515,6 +3530,7 @@ ctnetlink_parse_expect_nat(const struct nlattr *attr, static struct nf_conntrack_expect * ctnetlink_alloc_expect(const struct nlattr * const cda[], struct nf_conn *ct, + const struct nf_conntrack_helper *assign_helper, struct nf_conntrack_tuple *tuple, struct nf_conntrack_tuple *mask) { @@ -3568,6 +3584,7 @@ ctnetlink_alloc_expect(const struct nlattr * const cda[], struct nf_conn *ct, exp->zone = ct->zone; #endif rcu_assign_pointer(exp->helper, helper); + rcu_assign_pointer(exp->assign_helper, assign_helper); exp->tuple = *tuple; exp->mask.src.u3 = mask->src.u3; exp->mask.src.u.all = mask->src.u.all; @@ -3623,7 +3640,7 @@ ctnetlink_create_expect(struct net *net, ct = nf_ct_tuplehash_to_ctrack(h); rcu_read_lock(); - exp = ctnetlink_alloc_expect(cda, ct, &tuple, &mask); + exp = ctnetlink_alloc_expect(cda, ct, NULL, &tuple, &mask); if (IS_ERR(exp)) { err = PTR_ERR(exp); goto err_rcu; diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c index 1eb55907d470d6..e69941f1a10161 100644 --- a/net/netfilter/nf_conntrack_sip.c +++ b/net/netfilter/nf_conntrack_sip.c @@ -1366,6 +1366,10 @@ static int process_register_request(struct sk_buff *skb, unsigned int protoff, goto store_cseq; } + helper = rcu_dereference(nfct_help(ct)->helper); + if (!helper) + return NF_DROP; + exp = nf_ct_expect_alloc(ct); if (!exp) { nf_ct_helper_log(skb, ct, "cannot alloc expectation"); @@ -1376,14 +1380,10 @@ static int process_register_request(struct sk_buff *skb, unsigned int protoff, if (sip_direct_signalling) saddr = &ct->tuplehash[!dir].tuple.src.u3; - helper = rcu_dereference(nfct_help(ct)->helper); - if (!helper) - return NF_DROP; - nf_ct_expect_init(exp, SIP_EXPECT_SIGNALLING, nf_ct_l3num(ct), saddr, &daddr, proto, NULL, &port); exp->timeout.expires = sip_timeout * HZ; - rcu_assign_pointer(exp->helper, helper); + rcu_assign_pointer(exp->assign_helper, helper); exp->flags = NF_CT_EXPECT_PERMANENT | NF_CT_EXPECT_INACTIVE; hooks = rcu_dereference(nf_nat_sip_hooks); diff --git a/net/netfilter/nf_dup_netdev.c b/net/netfilter/nf_dup_netdev.c index e348fb90b8dc3b..3b0a70e154cd8f 100644 --- a/net/netfilter/nf_dup_netdev.c +++ b/net/netfilter/nf_dup_netdev.c @@ -13,22 +13,6 @@ #include #include -#define NF_RECURSION_LIMIT 2 - -#ifndef CONFIG_PREEMPT_RT -static u8 *nf_get_nf_dup_skb_recursion(void) -{ - return this_cpu_ptr(&softnet_data.xmit.nf_dup_skb_recursion); -} -#else - -static u8 *nf_get_nf_dup_skb_recursion(void) -{ - return ¤t->net_xmit.nf_dup_skb_recursion; -} - -#endif - static void nf_do_netdev_egress(struct sk_buff *skb, struct net_device *dev, enum nf_dev_hooks hook) { diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c index 2c4140e6f53c51..785d8c244a7715 100644 --- a/net/netfilter/nf_flow_table_core.c +++ b/net/netfilter/nf_flow_table_core.c @@ -122,6 +122,7 @@ static int flow_offload_fill_route(struct flow_offload *flow, flow_tuple->tun = route->tuple[dir].in.tun; flow_tuple->encap_num = route->tuple[dir].in.num_encaps; + flow_tuple->needs_gso_segment = route->tuple[dir].out.needs_gso_segment; flow_tuple->tun_num = route->tuple[dir].in.num_tuns; switch (route->tuple[dir].xmit_type) { diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c index fd56d663cb5b02..9c05a50d601384 100644 --- a/net/netfilter/nf_flow_table_ip.c +++ b/net/netfilter/nf_flow_table_ip.c @@ -445,13 +445,13 @@ static void nf_flow_encap_pop(struct nf_flowtable_ctx *ctx, switch (skb->protocol) { case htons(ETH_P_8021Q): vlan_hdr = (struct vlan_hdr *)skb->data; - __skb_pull(skb, VLAN_HLEN); + skb_pull_rcsum(skb, VLAN_HLEN); vlan_set_encap_proto(skb, vlan_hdr); skb_reset_network_header(skb); break; case htons(ETH_P_PPP_SES): skb->protocol = __nf_flow_pppoe_proto(skb); - skb_pull(skb, PPPOE_SES_HLEN); + skb_pull_rcsum(skb, PPPOE_SES_HLEN); skb_reset_network_header(skb); break; } @@ -462,23 +462,6 @@ static void nf_flow_encap_pop(struct nf_flowtable_ctx *ctx, nf_flow_ip_tunnel_pop(ctx, skb); } -struct nf_flow_xmit { - const void *dest; - const void *source; - struct net_device *outdev; -}; - -static unsigned int nf_flow_queue_xmit(struct net *net, struct sk_buff *skb, - struct nf_flow_xmit *xmit) -{ - skb->dev = xmit->outdev; - dev_hard_header(skb, skb->dev, ntohs(skb->protocol), - xmit->dest, xmit->source, skb->len); - dev_queue_xmit(skb); - - return NF_STOLEN; -} - static struct flow_offload_tuple_rhash * nf_flow_offload_lookup(struct nf_flowtable_ctx *ctx, struct nf_flowtable *flow_table, struct sk_buff *skb) @@ -524,7 +507,7 @@ static int nf_flow_offload_forward(struct nf_flowtable_ctx *ctx, return 0; } - if (skb_try_make_writable(skb, thoff + ctx->hdrsize)) + if (skb_ensure_writable(skb, thoff + ctx->hdrsize)) return -1; flow_offload_refresh(flow_table, flow, false); @@ -544,7 +527,34 @@ static int nf_flow_offload_forward(struct nf_flowtable_ctx *ctx, return 1; } -static int nf_flow_pppoe_push(struct sk_buff *skb, u16 id) +/* Similar to skb_vlan_push. */ +static int nf_flow_vlan_push(struct sk_buff *skb, __be16 proto, u16 id, + u32 needed_headroom) +{ + if (skb_vlan_tag_present(skb)) { + struct vlan_hdr *vhdr; + + if (skb_cow_head(skb, needed_headroom + VLAN_HLEN)) + return -1; + + __skb_push(skb, VLAN_HLEN); + if (skb_mac_header_was_set(skb)) + skb->mac_header -= VLAN_HLEN; + + vhdr = (struct vlan_hdr *)skb->data; + skb->network_header -= VLAN_HLEN; + vhdr->h_vlan_TCI = htons(skb_vlan_tag_get(skb)); + vhdr->h_vlan_encapsulated_proto = skb->protocol; + skb->protocol = skb->vlan_proto; + skb_postpush_rcsum(skb, skb->data, VLAN_HLEN); + } + __vlan_hwaccel_put_tag(skb, proto, id); + + return 0; +} + +static int nf_flow_pppoe_push(struct sk_buff *skb, u16 id, + u32 needed_headroom) { int data_len = skb->len + sizeof(__be16); struct ppp_hdr { @@ -553,7 +563,7 @@ static int nf_flow_pppoe_push(struct sk_buff *skb, u16 id) } *ph; __be16 proto; - if (skb_cow_head(skb, PPPOE_SES_HLEN)) + if (skb_cow_head(skb, needed_headroom + PPPOE_SES_HLEN)) return -1; switch (skb->protocol) { @@ -730,21 +740,24 @@ static int nf_flow_tunnel_v6_push(struct net *net, struct sk_buff *skb, } static int nf_flow_encap_push(struct sk_buff *skb, - struct flow_offload_tuple *tuple) + struct flow_offload_tuple *tuple, + struct net_device *outdev) { + u32 needed_headroom = LL_RESERVED_SPACE(outdev); int i; - for (i = 0; i < tuple->encap_num; i++) { + for (i = tuple->encap_num - 1; i >= 0; i--) { switch (tuple->encap[i].proto) { case htons(ETH_P_8021Q): case htons(ETH_P_8021AD): - skb_reset_mac_header(skb); - if (skb_vlan_push(skb, tuple->encap[i].proto, - tuple->encap[i].id) < 0) + if (nf_flow_vlan_push(skb, tuple->encap[i].proto, + tuple->encap[i].id, + needed_headroom) < 0) return -1; break; case htons(ETH_P_PPP_SES): - if (nf_flow_pppoe_push(skb, tuple->encap[i].id) < 0) + if (nf_flow_pppoe_push(skb, tuple->encap[i].id, + needed_headroom) < 0) return -1; break; } @@ -753,6 +766,76 @@ static int nf_flow_encap_push(struct sk_buff *skb, return 0; } +struct nf_flow_xmit { + const void *dest; + const void *source; + struct net_device *outdev; + struct flow_offload_tuple *tuple; + bool needs_gso_segment; +}; + +static void __nf_flow_queue_xmit(struct net *net, struct sk_buff *skb, + struct nf_flow_xmit *xmit) +{ + struct net_device *dev = xmit->outdev; + unsigned int hh_len = LL_RESERVED_SPACE(dev); + + if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) { + skb = skb_expand_head(skb, hh_len); + if (!skb) + return; + } + + skb->dev = dev; + dev_hard_header(skb, dev, ntohs(skb->protocol), + xmit->dest, xmit->source, skb->len); + dev_queue_xmit(skb); +} + +static unsigned int nf_flow_encap_gso_xmit(struct net *net, struct sk_buff *skb, + struct nf_flow_xmit *xmit) +{ + struct sk_buff *segs, *nskb; + + segs = skb_gso_segment(skb, 0); + if (IS_ERR(segs)) + return NF_DROP; + + if (segs) + consume_skb(skb); + else + segs = skb; + + skb_list_walk_safe(segs, segs, nskb) { + skb_mark_not_on_list(segs); + + if (nf_flow_encap_push(segs, xmit->tuple, xmit->outdev) < 0) { + kfree_skb(segs); + kfree_skb_list(nskb); + return NF_STOLEN; + } + __nf_flow_queue_xmit(net, segs, xmit); + } + + return NF_STOLEN; +} + +static unsigned int nf_flow_queue_xmit(struct net *net, struct sk_buff *skb, + struct nf_flow_xmit *xmit) +{ + if (xmit->tuple->encap_num) { + if (skb_is_gso(skb) && xmit->needs_gso_segment) + return nf_flow_encap_gso_xmit(net, skb, xmit); + + if (nf_flow_encap_push(skb, xmit->tuple, xmit->outdev) < 0) + return NF_DROP; + } + + __nf_flow_queue_xmit(net, skb, xmit); + + return NF_STOLEN; +} + unsigned int nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *state) @@ -797,9 +880,6 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, if (nf_flow_tunnel_v4_push(state->net, skb, other_tuple, &ip_daddr) < 0) return NF_DROP; - if (nf_flow_encap_push(skb, other_tuple) < 0) - return NF_DROP; - switch (tuplehash->tuple.xmit_type) { case FLOW_OFFLOAD_XMIT_NEIGH: rt = dst_rtable(tuplehash->tuple.dst_cache); @@ -829,6 +909,8 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, WARN_ON_ONCE(1); return NF_DROP; } + xmit.tuple = other_tuple; + xmit.needs_gso_segment = tuplehash->tuple.needs_gso_segment; return nf_flow_queue_xmit(state->net, skb, &xmit); } @@ -1037,7 +1119,7 @@ static int nf_flow_offload_ipv6_forward(struct nf_flowtable_ctx *ctx, return 0; } - if (skb_try_make_writable(skb, thoff + ctx->hdrsize)) + if (skb_ensure_writable(skb, thoff + ctx->hdrsize)) return -1; flow_offload_refresh(flow_table, flow, false); @@ -1119,9 +1201,6 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb, &ip6_daddr, encap_limit) < 0) return NF_DROP; - if (nf_flow_encap_push(skb, other_tuple) < 0) - return NF_DROP; - switch (tuplehash->tuple.xmit_type) { case FLOW_OFFLOAD_XMIT_NEIGH: rt = dst_rt6_info(tuplehash->tuple.dst_cache); @@ -1151,6 +1230,8 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb, WARN_ON_ONCE(1); return NF_DROP; } + xmit.tuple = other_tuple; + xmit.needs_gso_segment = tuplehash->tuple.needs_gso_segment; return nf_flow_queue_xmit(state->net, skb, &xmit); } diff --git a/net/netfilter/nf_flow_table_path.c b/net/netfilter/nf_flow_table_path.c index 6bb9579dcc2abb..9e88ea6a2eef78 100644 --- a/net/netfilter/nf_flow_table_path.c +++ b/net/netfilter/nf_flow_table_path.c @@ -86,6 +86,7 @@ struct nft_forward_info { u8 ingress_vlans; u8 h_source[ETH_ALEN]; u8 h_dest[ETH_ALEN]; + bool needs_gso_segment; enum flow_offload_xmit_type xmit_type; }; @@ -138,8 +139,11 @@ static void nft_dev_path_info(const struct net_device_path_stack *stack, path->encap.proto; info->num_encaps++; } - if (path->type == DEV_PATH_PPPOE) + if (path->type == DEV_PATH_PPPOE) { memcpy(info->h_dest, path->encap.h_dest, ETH_ALEN); + info->xmit_type = FLOW_OFFLOAD_XMIT_DIRECT; + info->needs_gso_segment = 1; + } break; case DEV_PATH_BRIDGE: if (is_zero_ether_addr(info->h_source)) @@ -279,6 +283,7 @@ static void nft_dev_forward_path(const struct nft_pktinfo *pkt, memcpy(route->tuple[dir].out.h_dest, info.h_dest, ETH_ALEN); route->tuple[dir].xmit_type = info.xmit_type; } + route->tuple[dir].out.needs_gso_segment = info.needs_gso_segment; } int nft_flow_route(const struct nft_pktinfo *pkt, const struct nf_conn *ct, diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index d20ce5c36d3187..87387adbca655f 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -407,6 +407,7 @@ static void nft_netdev_unregister_trans_hook(struct net *net, } static void nft_netdev_unregister_hooks(struct net *net, + const struct nft_table *table, struct list_head *hook_list, bool release_netdev) { @@ -414,8 +415,10 @@ static void nft_netdev_unregister_hooks(struct net *net, struct nf_hook_ops *ops; list_for_each_entry_safe(hook, next, hook_list, list) { - list_for_each_entry(ops, &hook->ops_list, list) - nf_unregister_net_hook(net, ops); + if (!(table->flags & NFT_TABLE_F_DORMANT)) { + list_for_each_entry(ops, &hook->ops_list, list) + nf_unregister_net_hook(net, ops); + } if (release_netdev) nft_netdev_hook_unlink_free_rcu(hook); } @@ -452,20 +455,25 @@ static void __nf_tables_unregister_hook(struct net *net, struct nft_base_chain *basechain; const struct nf_hook_ops *ops; - if (table->flags & NFT_TABLE_F_DORMANT || - !nft_is_base_chain(chain)) + if (!nft_is_base_chain(chain)) return; basechain = nft_base_chain(chain); ops = &basechain->ops; + /* must also be called for dormant tables */ + if (nft_base_chain_netdev(table->family, basechain->ops.hooknum)) { + nft_netdev_unregister_hooks(net, table, &basechain->hook_list, + release_netdev); + return; + } + + if (table->flags & NFT_TABLE_F_DORMANT) + return; + if (basechain->type->ops_unregister) return basechain->type->ops_unregister(net, ops); - if (nft_base_chain_netdev(table->family, basechain->ops.hooknum)) - nft_netdev_unregister_hooks(net, &basechain->hook_list, - release_netdev); - else - nf_unregister_net_hook(net, &basechain->ops); + nf_unregister_net_hook(net, &basechain->ops); } static void nf_tables_unregister_hook(struct net *net, @@ -4205,6 +4213,7 @@ static int nft_table_validate(struct net *net, const struct nft_table *table) struct nft_chain *chain; struct nft_ctx ctx = { .net = net, + .table = (struct nft_table *)table, .family = table->family, }; int err = 0; @@ -11281,11 +11290,9 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) break; case NFT_MSG_NEWCHAIN: if (nft_trans_chain_update(trans)) { - if (!(table->flags & NFT_TABLE_F_DORMANT)) { - nft_netdev_unregister_hooks(net, - &nft_trans_chain_hooks(trans), - true); - } + nft_netdev_unregister_hooks(net, table, + &nft_trans_chain_hooks(trans), + true); free_percpu(nft_trans_chain_stats(trans)); kfree(nft_trans_chain_name(trans)); nft_trans_destroy(trans); diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c index 5ddd5b6e135f1c..8ab186f86dd43a 100644 --- a/net/netfilter/nf_tables_core.c +++ b/net/netfilter/nf_tables_core.c @@ -153,7 +153,7 @@ static bool nft_payload_fast_eval(const struct nft_expr *expr, if (priv->base == NFT_PAYLOAD_NETWORK_HEADER) ptr = skb_network_header(skb) + pkt->nhoff; else { - if (!(pkt->flags & NFT_PKTINFO_L4PROTO)) + if (!(pkt->flags & NFT_PKTINFO_L4PROTO) || pkt->fragoff) return false; ptr = skb->data + nft_thoff(pkt); } diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c index decc725a33c251..0caa9304d2d030 100644 --- a/net/netfilter/nft_compat.c +++ b/net/netfilter/nft_compat.c @@ -261,10 +261,10 @@ nft_target_init(const struct nft_ctx *ctx, const struct nft_expr *expr, return ret; } - nft_target_set_tgchk_param(&par, ctx, target, info, &e, proto, inv); - nft_compat_wait_for_destructors(ctx->net); + nft_target_set_tgchk_param(&par, ctx, target, info, &e, proto, inv); + ret = xt_check_target(&par, size, proto, inv); if (ret < 0) { if (ret == -ENOENT) { @@ -353,8 +353,6 @@ static int nft_target_dump(struct sk_buff *skb, static int nft_target_validate(const struct nft_ctx *ctx, const struct nft_expr *expr) { - struct xt_target *target = expr->ops->data; - unsigned int hook_mask = 0; int ret; if (ctx->family != NFPROTO_IPV4 && @@ -377,11 +375,21 @@ static int nft_target_validate(const struct nft_ctx *ctx, const struct nft_base_chain *basechain = nft_base_chain(ctx->chain); const struct nf_hook_ops *ops = &basechain->ops; + unsigned int hook_mask = 1 << ops->hooknum; + struct xt_target *target = expr->ops->data; + void *info = nft_expr_priv(expr); + struct xt_tgchk_param par; + union nft_entry e = {}; - hook_mask = 1 << ops->hooknum; if (target->hooks && !(hook_mask & target->hooks)) return -EINVAL; + nft_target_set_tgchk_param(&par, ctx, target, info, &e, 0, false); + + ret = xt_check_hooks_target(&par); + if (ret < 0) + return ret; + ret = nft_compat_chain_validate_dependency(ctx, target->table); if (ret < 0) return ret; @@ -515,10 +523,10 @@ __nft_match_init(const struct nft_ctx *ctx, const struct nft_expr *expr, return ret; } - nft_match_set_mtchk_param(&par, ctx, match, info, &e, proto, inv); - nft_compat_wait_for_destructors(ctx->net); + nft_match_set_mtchk_param(&par, ctx, match, info, &e, proto, inv); + return xt_check_match(&par, size, proto, inv); } @@ -614,8 +622,6 @@ static int nft_match_large_dump(struct sk_buff *skb, static int nft_match_validate(const struct nft_ctx *ctx, const struct nft_expr *expr) { - struct xt_match *match = expr->ops->data; - unsigned int hook_mask = 0; int ret; if (ctx->family != NFPROTO_IPV4 && @@ -638,11 +644,30 @@ static int nft_match_validate(const struct nft_ctx *ctx, const struct nft_base_chain *basechain = nft_base_chain(ctx->chain); const struct nf_hook_ops *ops = &basechain->ops; + unsigned int hook_mask = 1 << ops->hooknum; + struct xt_match *match = expr->ops->data; + size_t size = XT_ALIGN(match->matchsize); + struct xt_mtchk_param par; + union nft_entry e = {}; + void *info; - hook_mask = 1 << ops->hooknum; if (match->hooks && !(hook_mask & match->hooks)) return -EINVAL; + if (NFT_EXPR_SIZE(size) > NFT_MATCH_LARGE_THRESH) { + struct nft_xt_match_priv *priv = nft_expr_priv(expr); + + info = priv->info; + } else { + info = nft_expr_priv(expr); + } + + nft_match_set_mtchk_param(&par, ctx, match, info, &e, 0, false); + + ret = xt_check_hooks_match(&par); + if (ret < 0) + return ret; + ret = nft_compat_chain_validate_dependency(ctx, match->table); if (ret < 0) return ret; diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c index 60ee8d932fcb36..fa2cc556331cf5 100644 --- a/net/netfilter/nft_ct.c +++ b/net/netfilter/nft_ct.c @@ -1334,6 +1334,8 @@ static void nft_ct_expect_obj_eval(struct nft_object *obj, if (nf_ct_expect_related(exp, 0) != 0) regs->verdict.code = NF_DROP; + + nf_ct_expect_put(exp); } static const struct nla_policy nft_ct_expect_policy[NFTA_CT_EXPECT_MAX + 1] = { diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c index 0407d6f708ae9e..e6a07c0df2079f 100644 --- a/net/netfilter/nft_exthdr.c +++ b/net/netfilter/nft_exthdr.c @@ -376,7 +376,7 @@ static void nft_exthdr_sctp_eval(const struct nft_expr *expr, const struct sctp_chunkhdr *sch; struct sctp_chunkhdr _sch; - if (pkt->tprot != IPPROTO_SCTP) + if (pkt->tprot != IPPROTO_SCTP || pkt->fragoff) goto err; do { diff --git a/net/netfilter/nft_fwd_netdev.c b/net/netfilter/nft_fwd_netdev.c index 4bce36c3a6a070..b9e88d7cf3081a 100644 --- a/net/netfilter/nft_fwd_netdev.c +++ b/net/netfilter/nft_fwd_netdev.c @@ -95,12 +95,15 @@ static void nft_fwd_neigh_eval(const struct nft_expr *expr, struct nft_regs *regs, const struct nft_pktinfo *pkt) { + u8 *nf_dup_skb_recursion = nf_get_nf_dup_skb_recursion(); struct nft_fwd_neigh *priv = nft_expr_priv(expr); void *addr = ®s->data[priv->sreg_addr]; int oif = regs->data[priv->sreg_dev]; unsigned int verdict = NF_STOLEN; struct sk_buff *skb = pkt->skb; + int nhoff = skb_network_offset(skb); struct net_device *dev; + unsigned int hh_len; int neigh_table; switch (priv->nfproto) { @@ -111,7 +114,7 @@ static void nft_fwd_neigh_eval(const struct nft_expr *expr, verdict = NFT_BREAK; goto out; } - if (skb_try_make_writable(skb, sizeof(*iph))) { + if (skb_ensure_writable(skb, nhoff + sizeof(*iph))) { verdict = NF_DROP; goto out; } @@ -132,7 +135,7 @@ static void nft_fwd_neigh_eval(const struct nft_expr *expr, verdict = NFT_BREAK; goto out; } - if (skb_try_make_writable(skb, sizeof(*ip6h))) { + if (skb_ensure_writable(skb, nhoff + sizeof(*ip6h))) { verdict = NF_DROP; goto out; } @@ -151,13 +154,31 @@ static void nft_fwd_neigh_eval(const struct nft_expr *expr, goto out; } + if (*nf_dup_skb_recursion > NF_RECURSION_LIMIT) { + verdict = NF_DROP; + goto out; + } + dev = dev_get_by_index_rcu(nft_net(pkt), oif); - if (dev == NULL) - return; + if (dev == NULL) { + verdict = NF_DROP; + goto out; + } + + hh_len = LL_RESERVED_SPACE(dev); + if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) { + skb = skb_expand_head(skb, hh_len); + if (!skb) { + verdict = NF_STOLEN; + goto out; + } + } skb->dev = dev; skb_clear_tstamp(skb); + (*nf_dup_skb_recursion)++; neigh_xmit(neigh_table, dev, addr, skb); + (*nf_dup_skb_recursion)--; out: regs->verdict.code = verdict; } diff --git a/net/netfilter/nft_osf.c b/net/netfilter/nft_osf.c index c02d5cb52143bc..45fe56da50442b 100644 --- a/net/netfilter/nft_osf.c +++ b/net/netfilter/nft_osf.c @@ -33,7 +33,7 @@ static void nft_osf_eval(const struct nft_expr *expr, struct nft_regs *regs, return; } - if (pkt->tprot != IPPROTO_TCP) { + if (pkt->tprot != IPPROTO_TCP || pkt->fragoff) { regs->verdict.code = NFT_BREAK; return; } diff --git a/net/netfilter/nft_tproxy.c b/net/netfilter/nft_tproxy.c index f2101af8c867f4..89be443734f66f 100644 --- a/net/netfilter/nft_tproxy.c +++ b/net/netfilter/nft_tproxy.c @@ -30,8 +30,8 @@ static void nft_tproxy_eval_v4(const struct nft_expr *expr, __be16 tport = 0; struct sock *sk; - if (pkt->tprot != IPPROTO_TCP && - pkt->tprot != IPPROTO_UDP) { + if ((pkt->tprot != IPPROTO_TCP && + pkt->tprot != IPPROTO_UDP) || pkt->fragoff) { regs->verdict.code = NFT_BREAK; return; } @@ -97,8 +97,8 @@ static void nft_tproxy_eval_v6(const struct nft_expr *expr, memset(&taddr, 0, sizeof(taddr)); - if (pkt->tprot != IPPROTO_TCP && - pkt->tprot != IPPROTO_UDP) { + if ((pkt->tprot != IPPROTO_TCP && + pkt->tprot != IPPROTO_UDP) || pkt->fragoff) { regs->verdict.code = NFT_BREAK; return; } diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index 9f837fb5ceb47d..4e6708c23922b9 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -55,6 +55,9 @@ static struct list_head xt_templates[NFPROTO_NUMPROTO]; struct xt_pernet { struct list_head tables[NFPROTO_NUMPROTO]; + + /* stash area used during netns exit */ + struct list_head dead_tables[NFPROTO_NUMPROTO]; }; struct compat_delta { @@ -477,11 +480,9 @@ int xt_check_proc_name(const char *name, unsigned int size) } EXPORT_SYMBOL(xt_check_proc_name); -int xt_check_match(struct xt_mtchk_param *par, - unsigned int size, u16 proto, bool inv_proto) +static int xt_check_match_common(struct xt_mtchk_param *par, + unsigned int size, u16 proto, bool inv_proto) { - int ret; - if (XT_ALIGN(par->match->matchsize) != size && par->match->matchsize != -1) { /* @@ -530,6 +531,14 @@ int xt_check_match(struct xt_mtchk_param *par, par->match->proto); return -EINVAL; } + + return 0; +} + +static int xt_checkentry_match(struct xt_mtchk_param *par) +{ + int ret; + if (par->match->checkentry != NULL) { ret = par->match->checkentry(par); if (ret < 0) @@ -538,8 +547,34 @@ int xt_check_match(struct xt_mtchk_param *par, /* Flag up potential errors. */ return -EIO; } + + return 0; +} + +int xt_check_hooks_match(struct xt_mtchk_param *par) +{ + if (par->match->check_hooks != NULL) + return par->match->check_hooks(par); + return 0; } +EXPORT_SYMBOL_GPL(xt_check_hooks_match); + +int xt_check_match(struct xt_mtchk_param *par, + unsigned int size, u16 proto, bool inv_proto) +{ + int ret; + + ret = xt_check_match_common(par, size, proto, inv_proto); + if (ret < 0) + return ret; + + ret = xt_check_hooks_match(par); + if (ret < 0) + return ret; + + return xt_checkentry_match(par); +} EXPORT_SYMBOL_GPL(xt_check_match); /** xt_check_entry_match - check that matches end before start of target @@ -1012,11 +1047,9 @@ bool xt_find_jump_offset(const unsigned int *offsets, } EXPORT_SYMBOL(xt_find_jump_offset); -int xt_check_target(struct xt_tgchk_param *par, - unsigned int size, u16 proto, bool inv_proto) +static int xt_check_target_common(struct xt_tgchk_param *par, + unsigned int size, u16 proto, bool inv_proto) { - int ret; - if (XT_ALIGN(par->target->targetsize) != size) { pr_err_ratelimited("%s_tables: %s.%u target: invalid size %u (kernel) != (user) %u\n", xt_prefix[par->family], par->target->name, @@ -1061,6 +1094,23 @@ int xt_check_target(struct xt_tgchk_param *par, par->target->proto); return -EINVAL; } + + return 0; +} + +int xt_check_hooks_target(struct xt_tgchk_param *par) +{ + if (par->target->check_hooks != NULL) + return par->target->check_hooks(par); + + return 0; +} +EXPORT_SYMBOL_GPL(xt_check_hooks_target); + +static int xt_checkentry_target(struct xt_tgchk_param *par) +{ + int ret; + if (par->target->checkentry != NULL) { ret = par->target->checkentry(par); if (ret < 0) @@ -1071,6 +1121,22 @@ int xt_check_target(struct xt_tgchk_param *par, } return 0; } + +int xt_check_target(struct xt_tgchk_param *par, + unsigned int size, u16 proto, bool inv_proto) +{ + int ret; + + ret = xt_check_target_common(par, size, proto, inv_proto); + if (ret < 0) + return ret; + + ret = xt_check_hooks_target(par); + if (ret < 0) + return ret; + + return xt_checkentry_target(par); +} EXPORT_SYMBOL_GPL(xt_check_target); /** @@ -1409,11 +1475,9 @@ struct xt_counters *xt_counters_alloc(unsigned int counters) } EXPORT_SYMBOL(xt_counters_alloc); -struct xt_table_info * -xt_replace_table(struct xt_table *table, - unsigned int num_counters, - struct xt_table_info *newinfo, - int *error) +static struct xt_table_info * +do_replace_table(struct xt_table *table, unsigned int num_counters, + struct xt_table_info *newinfo, int *error) { struct xt_table_info *private; unsigned int cpu; @@ -1468,30 +1532,54 @@ xt_replace_table(struct xt_table *table, } } - audit_log_nfcfg(table->name, table->af, private->number, - !private->number ? AUDIT_XT_OP_REGISTER : - AUDIT_XT_OP_REPLACE, - GFP_KERNEL); + return private; +} + +struct xt_table_info * +xt_replace_table(struct xt_table *table, unsigned int num_counters, + struct xt_table_info *newinfo, + int *error) +{ + struct xt_table_info *private; + + private = do_replace_table(table, num_counters, newinfo, error); + if (private) + audit_log_nfcfg(table->name, table->af, private->number, + AUDIT_XT_OP_REPLACE, + GFP_KERNEL); + return private; } EXPORT_SYMBOL_GPL(xt_replace_table); struct xt_table *xt_register_table(struct net *net, const struct xt_table *input_table, + const struct nf_hook_ops *template_ops, struct xt_table_info *bootstrap, struct xt_table_info *newinfo) { struct xt_pernet *xt_net = net_generic(net, xt_pernet_id); + struct xt_table *t, *table = NULL; + struct nf_hook_ops *ops = NULL; struct xt_table_info *private; - struct xt_table *t, *table; - int ret; + unsigned int num_ops; + int ret = -EINVAL; + + num_ops = hweight32(input_table->valid_hooks); + if (num_ops == 0) + goto out; + + ret = -ENOMEM; + if (template_ops) { + ops = kmemdup_array(template_ops, num_ops, sizeof(*ops), GFP_KERNEL); + if (!ops) + goto out; + } /* Don't add one object to multiple lists. */ table = kmemdup(input_table, sizeof(struct xt_table), GFP_KERNEL); - if (!table) { - ret = -ENOMEM; + if (!table) goto out; - } mutex_lock(&xt[table->af].mutex); /* Don't autoload: we'd eat our tail... */ @@ -1505,7 +1593,7 @@ struct xt_table *xt_register_table(struct net *net, /* Simplifies replace_table code. */ table->private = bootstrap; - if (!xt_replace_table(table, 0, newinfo, &ret)) + if (!do_replace_table(table, 0, newinfo, &ret)) goto unlock; private = table->private; @@ -1514,34 +1602,122 @@ struct xt_table *xt_register_table(struct net *net, /* save number of initial entries */ private->initial_entries = private->number; + if (ops) { + int i; + + for (i = 0; i < num_ops; i++) + ops[i].priv = table; + + ret = nf_register_net_hooks(net, ops, num_ops); + if (ret != 0) { + mutex_unlock(&xt[table->af].mutex); + /* nf_register_net_hooks() might have published a + * base chain before internal error unwind. + */ + synchronize_rcu(); + goto out; + } + + table->ops = ops; + } + + audit_log_nfcfg(table->name, table->af, private->number, + AUDIT_XT_OP_REGISTER, GFP_KERNEL); + list_add(&table->list, &xt_net->tables[table->af]); mutex_unlock(&xt[table->af].mutex); return table; unlock: mutex_unlock(&xt[table->af].mutex); - kfree(table); out: + kfree(table); + kfree(ops); return ERR_PTR(ret); } EXPORT_SYMBOL_GPL(xt_register_table); -void *xt_unregister_table(struct xt_table *table) +/** + * xt_unregister_table_pre_exit - pre-shutdown unregister of a table + * @net: network namespace + * @af: address family (e.g., NFPROTO_IPV4, NFPROTO_IPV6) + * @name: name of the table to unregister + * + * Unregisters the specified netfilter table from the given network namespace + * and also unregisters the hooks from netfilter core: no new packets will be + * processed. + * + * This must be called prior to xt_unregister_table_exit() from the pernet + * .pre_exit callback. After this call, the table is no longer visible to + * the get/setsockopt path. In case of rmmod, module exit path must have + * called xt_unregister_template() prior to unregistering pernet ops to + * prevent re-instantiation of the table. + * + * See also: xt_unregister_table_exit() + */ +void xt_unregister_table_pre_exit(struct net *net, u8 af, const char *name) { - struct xt_table_info *private; + struct xt_pernet *xt_net = net_generic(net, xt_pernet_id); + struct xt_table *t; - mutex_lock(&xt[table->af].mutex); - private = table->private; - list_del(&table->list); - mutex_unlock(&xt[table->af].mutex); - audit_log_nfcfg(table->name, table->af, private->number, - AUDIT_XT_OP_UNREGISTER, GFP_KERNEL); - kfree(table->ops); - kfree(table); + mutex_lock(&xt[af].mutex); + list_for_each_entry(t, &xt_net->tables[af], list) { + if (strcmp(t->name, name) == 0) { + list_move(&t->list, &xt_net->dead_tables[af]); + mutex_unlock(&xt[af].mutex); - return private; + if (t->ops) /* nat table registers with nat core, t->ops is NULL. */ + nf_unregister_net_hooks(net, t->ops, hweight32(t->valid_hooks)); + return; + } + } + mutex_unlock(&xt[af].mutex); +} +EXPORT_SYMBOL(xt_unregister_table_pre_exit); + +/** + * xt_unregister_table_exit - remove a table during namespace teardown + * @net: the network namespace from which to unregister the table + * @af: address family (e.g., NFPROTO_IPV4, NFPROTO_IPV6) + * @name: name of the table to unregister + * + * Completes the unregister process for a table. This must be called from + * the pernet ops .exit callback. This is the second stage after + * xt_unregister_table_pre_exit(). + * + * pair with xt_unregister_table_pre_exit() during namespace shutdown. + * + * Return: the unregistered table or NULL if the table was never + * instantiated. The caller needs to kfree() the table after it + * has removed the family specific matches/targets. + */ +struct xt_table *xt_unregister_table_exit(struct net *net, u8 af, const char *name) +{ + struct xt_pernet *xt_net = net_generic(net, xt_pernet_id); + struct xt_table *table; + + mutex_lock(&xt[af].mutex); + list_for_each_entry(table, &xt_net->dead_tables[af], list) { + struct nf_hook_ops *ops = NULL; + + if (strcmp(table->name, name) != 0) + continue; + + list_del(&table->list); + + audit_log_nfcfg(table->name, table->af, table->private->number, + AUDIT_XT_OP_UNREGISTER, GFP_KERNEL); + swap(table->ops, ops); + mutex_unlock(&xt[af].mutex); + + kfree(ops); + return table; + } + mutex_unlock(&xt[af].mutex); + + return NULL; } -EXPORT_SYMBOL_GPL(xt_unregister_table); +EXPORT_SYMBOL_GPL(xt_unregister_table_exit); #endif #ifdef CONFIG_PROC_FS @@ -1988,8 +2164,10 @@ static int __net_init xt_net_init(struct net *net) struct xt_pernet *xt_net = net_generic(net, xt_pernet_id); int i; - for (i = 0; i < NFPROTO_NUMPROTO; i++) + for (i = 0; i < NFPROTO_NUMPROTO; i++) { INIT_LIST_HEAD(&xt_net->tables[i]); + INIT_LIST_HEAD(&xt_net->dead_tables[i]); + } return 0; } @@ -1998,8 +2176,10 @@ static void __net_exit xt_net_exit(struct net *net) struct xt_pernet *xt_net = net_generic(net, xt_pernet_id); int i; - for (i = 0; i < NFPROTO_NUMPROTO; i++) + for (i = 0; i < NFPROTO_NUMPROTO; i++) { WARN_ON_ONCE(!list_empty(&xt_net->tables[i])); + WARN_ON_ONCE(!list_empty(&xt_net->dead_tables[i])); + } } static struct pernet_operations xt_net_ops = { diff --git a/net/netfilter/xt_CT.c b/net/netfilter/xt_CT.c index 498f5871c84a0e..d2aeacf94230f8 100644 --- a/net/netfilter/xt_CT.c +++ b/net/netfilter/xt_CT.c @@ -354,7 +354,7 @@ static struct xt_target xt_ct_tg_reg[] __read_mostly = { .family = NFPROTO_IPV4, .revision = 1, .targetsize = sizeof(struct xt_ct_target_info_v1), - .usersize = offsetof(struct xt_ct_target_info, ct), + .usersize = offsetof(struct xt_ct_target_info_v1, ct), .checkentry = xt_ct_tg_check_v1, .destroy = xt_ct_tg_destroy_v1, .target = xt_ct_target_v1, @@ -366,7 +366,7 @@ static struct xt_target xt_ct_tg_reg[] __read_mostly = { .family = NFPROTO_IPV4, .revision = 2, .targetsize = sizeof(struct xt_ct_target_info_v1), - .usersize = offsetof(struct xt_ct_target_info, ct), + .usersize = offsetof(struct xt_ct_target_info_v1, ct), .checkentry = xt_ct_tg_check_v2, .destroy = xt_ct_tg_destroy_v1, .target = xt_ct_target_v1, @@ -398,7 +398,7 @@ static struct xt_target xt_ct_tg_reg[] __read_mostly = { .family = NFPROTO_IPV6, .revision = 1, .targetsize = sizeof(struct xt_ct_target_info_v1), - .usersize = offsetof(struct xt_ct_target_info, ct), + .usersize = offsetof(struct xt_ct_target_info_v1, ct), .checkentry = xt_ct_tg_check_v1, .destroy = xt_ct_tg_destroy_v1, .target = xt_ct_target_v1, @@ -410,7 +410,7 @@ static struct xt_target xt_ct_tg_reg[] __read_mostly = { .family = NFPROTO_IPV6, .revision = 2, .targetsize = sizeof(struct xt_ct_target_info_v1), - .usersize = offsetof(struct xt_ct_target_info, ct), + .usersize = offsetof(struct xt_ct_target_info_v1, ct), .checkentry = xt_ct_tg_check_v2, .destroy = xt_ct_tg_destroy_v1, .target = xt_ct_target_v1, diff --git a/net/netfilter/xt_TCPMSS.c b/net/netfilter/xt_TCPMSS.c index 116a885adb3cd6..80e1634bc51f85 100644 --- a/net/netfilter/xt_TCPMSS.c +++ b/net/netfilter/xt_TCPMSS.c @@ -247,6 +247,21 @@ tcpmss_tg6(struct sk_buff *skb, const struct xt_action_param *par) } #endif +static int tcpmss_tg4_check_hooks(const struct xt_tgchk_param *par) +{ + const struct xt_tcpmss_info *info = par->targinfo; + + if (info->mss == XT_TCPMSS_CLAMP_PMTU && + (par->hook_mask & ~((1 << NF_INET_FORWARD) | + (1 << NF_INET_LOCAL_OUT) | + (1 << NF_INET_POST_ROUTING))) != 0) { + pr_info_ratelimited("path-MTU clamping only supported in FORWARD, OUTPUT and POSTROUTING hooks\n"); + return -EINVAL; + } + + return 0; +} + /* Must specify -p tcp --syn */ static inline bool find_syn_match(const struct xt_entry_match *m) { @@ -262,17 +277,9 @@ static inline bool find_syn_match(const struct xt_entry_match *m) static int tcpmss_tg4_check(const struct xt_tgchk_param *par) { - const struct xt_tcpmss_info *info = par->targinfo; const struct ipt_entry *e = par->entryinfo; const struct xt_entry_match *ematch; - if (info->mss == XT_TCPMSS_CLAMP_PMTU && - (par->hook_mask & ~((1 << NF_INET_FORWARD) | - (1 << NF_INET_LOCAL_OUT) | - (1 << NF_INET_POST_ROUTING))) != 0) { - pr_info_ratelimited("path-MTU clamping only supported in FORWARD, OUTPUT and POSTROUTING hooks\n"); - return -EINVAL; - } if (par->nft_compat) return 0; @@ -286,17 +293,9 @@ static int tcpmss_tg4_check(const struct xt_tgchk_param *par) #if IS_ENABLED(CONFIG_IP6_NF_IPTABLES) static int tcpmss_tg6_check(const struct xt_tgchk_param *par) { - const struct xt_tcpmss_info *info = par->targinfo; const struct ip6t_entry *e = par->entryinfo; const struct xt_entry_match *ematch; - if (info->mss == XT_TCPMSS_CLAMP_PMTU && - (par->hook_mask & ~((1 << NF_INET_FORWARD) | - (1 << NF_INET_LOCAL_OUT) | - (1 << NF_INET_POST_ROUTING))) != 0) { - pr_info_ratelimited("path-MTU clamping only supported in FORWARD, OUTPUT and POSTROUTING hooks\n"); - return -EINVAL; - } if (par->nft_compat) return 0; @@ -312,6 +311,7 @@ static struct xt_target tcpmss_tg_reg[] __read_mostly = { { .family = NFPROTO_IPV4, .name = "TCPMSS", + .check_hooks = tcpmss_tg4_check_hooks, .checkentry = tcpmss_tg4_check, .target = tcpmss_tg4, .targetsize = sizeof(struct xt_tcpmss_info), @@ -322,6 +322,7 @@ static struct xt_target tcpmss_tg_reg[] __read_mostly = { { .family = NFPROTO_IPV6, .name = "TCPMSS", + .check_hooks = tcpmss_tg4_check_hooks, .checkentry = tcpmss_tg6_check, .target = tcpmss_tg6, .targetsize = sizeof(struct xt_tcpmss_info), diff --git a/net/netfilter/xt_TPROXY.c b/net/netfilter/xt_TPROXY.c index e4bea1d346cf92..5f60e7298a1ea9 100644 --- a/net/netfilter/xt_TPROXY.c +++ b/net/netfilter/xt_TPROXY.c @@ -86,6 +86,9 @@ tproxy_tg4_v0(struct sk_buff *skb, const struct xt_action_param *par) { const struct xt_tproxy_target_info *tgi = par->targinfo; + if (par->fragoff) + return NF_DROP; + return tproxy_tg4(xt_net(par), skb, tgi->laddr, tgi->lport, tgi->mark_mask, tgi->mark_value); } @@ -95,6 +98,9 @@ tproxy_tg4_v1(struct sk_buff *skb, const struct xt_action_param *par) { const struct xt_tproxy_target_info_v1 *tgi = par->targinfo; + if (par->fragoff) + return NF_DROP; + return tproxy_tg4(xt_net(par), skb, tgi->laddr.ip, tgi->lport, tgi->mark_mask, tgi->mark_value); } @@ -106,6 +112,7 @@ tproxy_tg6_v1(struct sk_buff *skb, const struct xt_action_param *par) { const struct ipv6hdr *iph = ipv6_hdr(skb); const struct xt_tproxy_target_info_v1 *tgi = par->targinfo; + unsigned short fragoff = 0; struct udphdr _hdr, *hp; struct sock *sk; const struct in6_addr *laddr; @@ -113,8 +120,8 @@ tproxy_tg6_v1(struct sk_buff *skb, const struct xt_action_param *par) int thoff = 0; int tproto; - tproto = ipv6_find_hdr(skb, &thoff, -1, NULL, NULL); - if (tproto < 0) + tproto = ipv6_find_hdr(skb, &thoff, -1, &fragoff, NULL); + if (tproto < 0 || fragoff) return NF_DROP; hp = skb_header_pointer(skb, thoff, sizeof(_hdr), &_hdr); diff --git a/net/netfilter/xt_addrtype.c b/net/netfilter/xt_addrtype.c index a7708894310716..913dbe3aa5e29e 100644 --- a/net/netfilter/xt_addrtype.c +++ b/net/netfilter/xt_addrtype.c @@ -153,14 +153,10 @@ addrtype_mt_v1(const struct sk_buff *skb, struct xt_action_param *par) return ret; } -static int addrtype_mt_checkentry_v1(const struct xt_mtchk_param *par) +static int addrtype_mt_check_hooks(const struct xt_mtchk_param *par) { - const char *errmsg = "both incoming and outgoing interface limitation cannot be selected"; struct xt_addrtype_info_v1 *info = par->matchinfo; - - if (info->flags & XT_ADDRTYPE_LIMIT_IFACE_IN && - info->flags & XT_ADDRTYPE_LIMIT_IFACE_OUT) - goto err; + const char *errmsg; if (par->hook_mask & ((1 << NF_INET_PRE_ROUTING) | (1 << NF_INET_LOCAL_IN)) && @@ -176,6 +172,21 @@ static int addrtype_mt_checkentry_v1(const struct xt_mtchk_param *par) goto err; } + return 0; +err: + pr_info_ratelimited("%s\n", errmsg); + return -EINVAL; +} + +static int addrtype_mt_checkentry_v1(const struct xt_mtchk_param *par) +{ + const char *errmsg = "both incoming and outgoing interface limitation cannot be selected"; + struct xt_addrtype_info_v1 *info = par->matchinfo; + + if (info->flags & XT_ADDRTYPE_LIMIT_IFACE_IN && + info->flags & XT_ADDRTYPE_LIMIT_IFACE_OUT) + goto err; + #if IS_ENABLED(CONFIG_IP6_NF_IPTABLES) if (par->family == NFPROTO_IPV6) { if ((info->source | info->dest) & XT_ADDRTYPE_BLACKHOLE) { @@ -211,6 +222,7 @@ static struct xt_match addrtype_mt_reg[] __read_mostly = { .family = NFPROTO_IPV4, .revision = 1, .match = addrtype_mt_v1, + .check_hooks = addrtype_mt_check_hooks, .checkentry = addrtype_mt_checkentry_v1, .matchsize = sizeof(struct xt_addrtype_info_v1), .me = THIS_MODULE @@ -221,6 +233,7 @@ static struct xt_match addrtype_mt_reg[] __read_mostly = { .family = NFPROTO_IPV6, .revision = 1, .match = addrtype_mt_v1, + .check_hooks = addrtype_mt_check_hooks, .checkentry = addrtype_mt_checkentry_v1, .matchsize = sizeof(struct xt_addrtype_info_v1), .me = THIS_MODULE diff --git a/net/netfilter/xt_devgroup.c b/net/netfilter/xt_devgroup.c index 9520dd00070b22..6d1a44ab5eeeb6 100644 --- a/net/netfilter/xt_devgroup.c +++ b/net/netfilter/xt_devgroup.c @@ -33,14 +33,10 @@ static bool devgroup_mt(const struct sk_buff *skb, struct xt_action_param *par) return true; } -static int devgroup_mt_checkentry(const struct xt_mtchk_param *par) +static int devgroup_mt_check_hooks(const struct xt_mtchk_param *par) { const struct xt_devgroup_info *info = par->matchinfo; - if (info->flags & ~(XT_DEVGROUP_MATCH_SRC | XT_DEVGROUP_INVERT_SRC | - XT_DEVGROUP_MATCH_DST | XT_DEVGROUP_INVERT_DST)) - return -EINVAL; - if (info->flags & XT_DEVGROUP_MATCH_SRC && par->hook_mask & ~((1 << NF_INET_PRE_ROUTING) | (1 << NF_INET_LOCAL_IN) | @@ -56,9 +52,21 @@ static int devgroup_mt_checkentry(const struct xt_mtchk_param *par) return 0; } +static int devgroup_mt_checkentry(const struct xt_mtchk_param *par) +{ + const struct xt_devgroup_info *info = par->matchinfo; + + if (info->flags & ~(XT_DEVGROUP_MATCH_SRC | XT_DEVGROUP_INVERT_SRC | + XT_DEVGROUP_MATCH_DST | XT_DEVGROUP_INVERT_DST)) + return -EINVAL; + + return 0; +} + static struct xt_match devgroup_mt_reg __read_mostly = { .name = "devgroup", .match = devgroup_mt, + .check_hooks = devgroup_mt_check_hooks, .checkentry = devgroup_mt_checkentry, .matchsize = sizeof(struct xt_devgroup_info), .family = NFPROTO_UNSPEC, diff --git a/net/netfilter/xt_ecn.c b/net/netfilter/xt_ecn.c index b96e8203ac5498..a8503f5d26bf41 100644 --- a/net/netfilter/xt_ecn.c +++ b/net/netfilter/xt_ecn.c @@ -30,6 +30,10 @@ static bool match_tcp(const struct sk_buff *skb, struct xt_action_param *par) struct tcphdr _tcph; const struct tcphdr *th; + /* this is fine for IPv6 as ecn_mt_check6() enforces -p tcp */ + if (par->fragoff) + return false; + /* In practice, TCP match does this, so can't fail. But let's * be good citizens. */ diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c index 3bd127bfc11494..2704b4b60d1e04 100644 --- a/net/netfilter/xt_hashlimit.c +++ b/net/netfilter/xt_hashlimit.c @@ -658,6 +658,8 @@ hashlimit_init_dst(const struct xt_hashlimit_htable *hinfo, if (!(hinfo->cfg.mode & (XT_HASHLIMIT_HASH_DPT | XT_HASHLIMIT_HASH_SPT))) return 0; + if (ntohs(ip_hdr(skb)->frag_off) & IP_OFFSET) + return -1; nexthdr = ip_hdr(skb)->protocol; break; #if IS_ENABLED(CONFIG_IP6_NF_IPTABLES) @@ -681,7 +683,7 @@ hashlimit_init_dst(const struct xt_hashlimit_htable *hinfo, return 0; nexthdr = ipv6_hdr(skb)->nexthdr; protoff = ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr), &nexthdr, &frag_off); - if ((int)protoff < 0) + if ((int)protoff < 0 || ntohs(frag_off) & IP6_OFFSET) return -1; break; } diff --git a/net/netfilter/xt_osf.c b/net/netfilter/xt_osf.c index dc9485854002aa..e8807caede68e6 100644 --- a/net/netfilter/xt_osf.c +++ b/net/netfilter/xt_osf.c @@ -27,6 +27,9 @@ static bool xt_osf_match_packet(const struct sk_buff *skb, struct xt_action_param *p) { + if (p->fragoff) + return false; + return nf_osf_match(skb, xt_family(p), xt_hooknum(p), xt_in(p), xt_out(p), p->matchinfo, xt_net(p), nf_osf_fingers); } diff --git a/net/netfilter/xt_physdev.c b/net/netfilter/xt_physdev.c index d2b0b52434fa90..dd98f758176c2e 100644 --- a/net/netfilter/xt_physdev.c +++ b/net/netfilter/xt_physdev.c @@ -91,14 +91,10 @@ physdev_mt(const struct sk_buff *skb, struct xt_action_param *par) return (!!ret ^ !(info->invert & XT_PHYSDEV_OP_OUT)); } -static int physdev_mt_check(const struct xt_mtchk_param *par) +static int physdev_mt_check_hooks(const struct xt_mtchk_param *par) { const struct xt_physdev_info *info = par->matchinfo; - static bool brnf_probed __read_mostly; - if (!(info->bitmask & XT_PHYSDEV_OP_MASK) || - info->bitmask & ~XT_PHYSDEV_OP_MASK) - return -EINVAL; if (info->bitmask & (XT_PHYSDEV_OP_OUT | XT_PHYSDEV_OP_ISOUT) && (!(info->bitmask & XT_PHYSDEV_OP_BRIDGED) || info->invert & XT_PHYSDEV_OP_BRIDGED) && @@ -107,6 +103,18 @@ static int physdev_mt_check(const struct xt_mtchk_param *par) return -EINVAL; } + return 0; +} + +static int physdev_mt_check(const struct xt_mtchk_param *par) +{ + const struct xt_physdev_info *info = par->matchinfo; + static bool brnf_probed __read_mostly; + + if (!(info->bitmask & XT_PHYSDEV_OP_MASK) || + info->bitmask & ~XT_PHYSDEV_OP_MASK) + return -EINVAL; + #define X(memb) strnlen(info->memb, sizeof(info->memb)) >= sizeof(info->memb) if (info->bitmask & XT_PHYSDEV_OP_IN) { if (info->physindev[0] == '\0') @@ -141,6 +149,7 @@ static struct xt_match physdev_mt_reg[] __read_mostly = { { .name = "physdev", .family = NFPROTO_IPV4, + .check_hooks = physdev_mt_check_hooks, .checkentry = physdev_mt_check, .match = physdev_mt, .matchsize = sizeof(struct xt_physdev_info), @@ -149,6 +158,7 @@ static struct xt_match physdev_mt_reg[] __read_mostly = { { .name = "physdev", .family = NFPROTO_IPV6, + .check_hooks = physdev_mt_check_hooks, .checkentry = physdev_mt_check, .match = physdev_mt, .matchsize = sizeof(struct xt_physdev_info), diff --git a/net/netfilter/xt_policy.c b/net/netfilter/xt_policy.c index b5fa65558318f5..ff54e3a8581e1f 100644 --- a/net/netfilter/xt_policy.c +++ b/net/netfilter/xt_policy.c @@ -126,13 +126,10 @@ policy_mt(const struct sk_buff *skb, struct xt_action_param *par) return ret; } -static int policy_mt_check(const struct xt_mtchk_param *par) +static int policy_mt_check_hooks(const struct xt_mtchk_param *par) { const struct xt_policy_info *info = par->matchinfo; - const char *errmsg = "neither incoming nor outgoing policy selected"; - - if (!(info->flags & (XT_POLICY_MATCH_IN|XT_POLICY_MATCH_OUT))) - goto err; + const char *errmsg; if (par->hook_mask & ((1 << NF_INET_PRE_ROUTING) | (1 << NF_INET_LOCAL_IN)) && info->flags & XT_POLICY_MATCH_OUT) { @@ -144,6 +141,21 @@ static int policy_mt_check(const struct xt_mtchk_param *par) errmsg = "input policy not valid in POSTROUTING and OUTPUT"; goto err; } + + return 0; +err: + pr_info_ratelimited("%s\n", errmsg); + return -EINVAL; +} + +static int policy_mt_check(const struct xt_mtchk_param *par) +{ + const struct xt_policy_info *info = par->matchinfo; + const char *errmsg = "neither incoming nor outgoing policy selected"; + + if (!(info->flags & (XT_POLICY_MATCH_IN|XT_POLICY_MATCH_OUT))) + goto err; + if (info->len > XT_POLICY_MAX_ELEM) { errmsg = "too many policy elements"; goto err; @@ -158,6 +170,7 @@ static struct xt_match policy_mt_reg[] __read_mostly = { { .name = "policy", .family = NFPROTO_IPV4, + .check_hooks = policy_mt_check_hooks, .checkentry = policy_mt_check, .match = policy_mt, .matchsize = sizeof(struct xt_policy_info), @@ -166,6 +179,7 @@ static struct xt_match policy_mt_reg[] __read_mostly = { { .name = "policy", .family = NFPROTO_IPV6, + .check_hooks = policy_mt_check_hooks, .checkentry = policy_mt_check, .match = policy_mt, .matchsize = sizeof(struct xt_policy_info), diff --git a/net/netfilter/xt_set.c b/net/netfilter/xt_set.c index 731bc2cafae4bc..4ae04bba93581c 100644 --- a/net/netfilter/xt_set.c +++ b/net/netfilter/xt_set.c @@ -430,6 +430,29 @@ set_target_v3(struct sk_buff *skb, const struct xt_action_param *par) return XT_CONTINUE; } +static int +set_target_v3_check_hooks(const struct xt_tgchk_param *par) +{ + const struct xt_set_info_target_v3 *info = par->targinfo; + + if (info->map_set.index != IPSET_INVALID_ID) { + if (strncmp(par->table, "mangle", 7)) { + pr_info_ratelimited("--map-set only usable from mangle table\n"); + return -EINVAL; + } + if (((info->flags & IPSET_FLAG_MAP_SKBPRIO) | + (info->flags & IPSET_FLAG_MAP_SKBQUEUE)) && + (par->hook_mask & ~(1 << NF_INET_FORWARD | + 1 << NF_INET_LOCAL_OUT | + 1 << NF_INET_POST_ROUTING))) { + pr_info_ratelimited("mapping of prio or/and queue is allowed only from OUTPUT/FORWARD/POSTROUTING chains\n"); + return -EINVAL; + } + } + + return 0; +} + static int set_target_v3_checkentry(const struct xt_tgchk_param *par) { @@ -459,20 +482,6 @@ set_target_v3_checkentry(const struct xt_tgchk_param *par) } if (info->map_set.index != IPSET_INVALID_ID) { - if (strncmp(par->table, "mangle", 7)) { - pr_info_ratelimited("--map-set only usable from mangle table\n"); - ret = -EINVAL; - goto cleanup_del; - } - if (((info->flags & IPSET_FLAG_MAP_SKBPRIO) | - (info->flags & IPSET_FLAG_MAP_SKBQUEUE)) && - (par->hook_mask & ~(1 << NF_INET_FORWARD | - 1 << NF_INET_LOCAL_OUT | - 1 << NF_INET_POST_ROUTING))) { - pr_info_ratelimited("mapping of prio or/and queue is allowed only from OUTPUT/FORWARD/POSTROUTING chains\n"); - ret = -EINVAL; - goto cleanup_del; - } index = ip_set_nfnl_get_byindex(par->net, info->map_set.index); if (index == IPSET_INVALID_ID) { @@ -672,6 +681,7 @@ static struct xt_target set_targets[] __read_mostly = { .family = NFPROTO_IPV4, .target = set_target_v3, .targetsize = sizeof(struct xt_set_info_target_v3), + .check_hooks = set_target_v3_check_hooks, .checkentry = set_target_v3_checkentry, .destroy = set_target_v3_destroy, .me = THIS_MODULE @@ -682,6 +692,7 @@ static struct xt_target set_targets[] __read_mostly = { .family = NFPROTO_IPV6, .target = set_target_v3, .targetsize = sizeof(struct xt_set_info_target_v3), + .check_hooks = set_target_v3_check_hooks, .checkentry = set_target_v3_checkentry, .destroy = set_target_v3_destroy, .me = THIS_MODULE diff --git a/net/netfilter/xt_tcpmss.c b/net/netfilter/xt_tcpmss.c index 0d32d4841cb325..b9da8269161d8f 100644 --- a/net/netfilter/xt_tcpmss.c +++ b/net/netfilter/xt_tcpmss.c @@ -32,6 +32,10 @@ tcpmss_mt(const struct sk_buff *skb, struct xt_action_param *par) u8 _opt[15 * 4 - sizeof(_tcph)]; unsigned int i, optlen; + /* this is fine for IPv6 as xt_tcpmss enforces -p tcp */ + if (par->fragoff) + return false; + /* If we don't have the whole header, drop packet. */ th = skb_header_pointer(skb, par->thoff, sizeof(_tcph), &_tcph); if (th == NULL) diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c index d251d894afd468..0da39eaed255f4 100644 --- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -1972,8 +1972,10 @@ int genlmsg_multicast_allns(const struct genl_family *family, struct sk_buff *skb, u32 portid, unsigned int group) { - if (WARN_ON_ONCE(group >= family->n_mcgrps)) + if (WARN_ON_ONCE(group >= family->n_mcgrps)) { + kfree_skb(skb); return -EINVAL; + } group = family->mcgrp_offset + group; return genlmsg_mcast(skb, portid, group); @@ -1986,8 +1988,10 @@ void genl_notify(const struct genl_family *family, struct sk_buff *skb, struct net *net = genl_info_net(info); struct sock *sk = net->genl_sock; - if (WARN_ON_ONCE(group >= family->n_mcgrps)) + if (WARN_ON_ONCE(group >= family->n_mcgrps)) { + kfree_skb(skb); return; + } group = family->mcgrp_offset + group; nlmsg_notify(sk, skb, info->snd_portid, group, diff --git a/net/openvswitch/vport-geneve.c b/net/openvswitch/vport-geneve.c index b10e1602c6b147..cb5ea4424ffc80 100644 --- a/net/openvswitch/vport-geneve.c +++ b/net/openvswitch/vport-geneve.c @@ -97,6 +97,9 @@ static struct vport *geneve_tnl_create(const struct vport_parms *parms) goto error; } + vport->dev = dev; + netdev_hold(vport->dev, &vport->dev_tracker, GFP_KERNEL); + rtnl_unlock(); return vport; error: @@ -111,7 +114,7 @@ static struct vport *geneve_create(const struct vport_parms *parms) if (IS_ERR(vport)) return vport; - return ovs_netdev_link(vport, parms->name); + return ovs_netdev_link(vport, true); } static struct vport_ops ovs_geneve_vport_ops = { diff --git a/net/openvswitch/vport-gre.c b/net/openvswitch/vport-gre.c index 4014c9b5eb7987..6cb5a697b396a3 100644 --- a/net/openvswitch/vport-gre.c +++ b/net/openvswitch/vport-gre.c @@ -63,6 +63,9 @@ static struct vport *gre_tnl_create(const struct vport_parms *parms) return ERR_PTR(err); } + vport->dev = dev; + netdev_hold(vport->dev, &vport->dev_tracker, GFP_KERNEL); + rtnl_unlock(); return vport; } @@ -75,7 +78,7 @@ static struct vport *gre_create(const struct vport_parms *parms) if (IS_ERR(vport)) return vport; - return ovs_netdev_link(vport, parms->name); + return ovs_netdev_link(vport, true); } static struct vport_ops ovs_gre_vport_ops = { diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c index 12055af832dc08..c42642075685d7 100644 --- a/net/openvswitch/vport-netdev.c +++ b/net/openvswitch/vport-netdev.c @@ -73,37 +73,21 @@ static struct net_device *get_dpdev(const struct datapath *dp) return local->dev; } -struct vport *ovs_netdev_link(struct vport *vport, const char *name) +struct vport *ovs_netdev_link(struct vport *vport, bool tunnel) { int err; - vport->dev = dev_get_by_name(ovs_dp_get_net(vport->dp), name); - if (!vport->dev) { + if (WARN_ON_ONCE(!vport->dev)) { err = -ENODEV; goto error_free_vport; } - /* Ensure that the device exists and that the provided - * name is not one of its aliases. - */ - if (strcmp(name, ovs_vport_name(vport))) { - err = -ENODEV; - goto error_put; - } - netdev_tracker_alloc(vport->dev, &vport->dev_tracker, GFP_KERNEL); - if (vport->dev->flags & IFF_LOOPBACK || - (vport->dev->type != ARPHRD_ETHER && - vport->dev->type != ARPHRD_NONE) || - ovs_is_internal_dev(vport->dev)) { - err = -EINVAL; - goto error_put; - } rtnl_lock(); err = netdev_master_upper_dev_link(vport->dev, get_dpdev(vport->dp), NULL, NULL, NULL); if (err) - goto error_unlock; + goto error_put_unlock; err = netdev_rx_handler_register(vport->dev, netdev_frame_hook, vport); @@ -119,10 +103,11 @@ struct vport *ovs_netdev_link(struct vport *vport, const char *name) error_master_upper_dev_unlink: netdev_upper_dev_unlink(vport->dev, get_dpdev(vport->dp)); -error_unlock: - rtnl_unlock(); -error_put: +error_put_unlock: + if (tunnel && vport->dev->reg_state == NETREG_REGISTERED) + rtnl_delete_link(vport->dev, 0, NULL); netdev_put(vport->dev, &vport->dev_tracker); + rtnl_unlock(); error_free_vport: ovs_vport_free(vport); return ERR_PTR(err); @@ -132,12 +117,39 @@ EXPORT_SYMBOL_GPL(ovs_netdev_link); static struct vport *netdev_create(const struct vport_parms *parms) { struct vport *vport; + int err; vport = ovs_vport_alloc(0, &ovs_netdev_vport_ops, parms); if (IS_ERR(vport)) return vport; - return ovs_netdev_link(vport, parms->name); + vport->dev = dev_get_by_name(ovs_dp_get_net(vport->dp), parms->name); + if (!vport->dev) { + err = -ENODEV; + goto error_free_vport; + } + netdev_tracker_alloc(vport->dev, &vport->dev_tracker, GFP_KERNEL); + + /* Ensure that the provided name is not an alias. */ + if (strcmp(parms->name, ovs_vport_name(vport))) { + err = -ENODEV; + goto error_put; + } + + if (vport->dev->flags & IFF_LOOPBACK || + (vport->dev->type != ARPHRD_ETHER && + vport->dev->type != ARPHRD_NONE) || + ovs_is_internal_dev(vport->dev)) { + err = -EINVAL; + goto error_put; + } + + return ovs_netdev_link(vport, false); +error_put: + netdev_put(vport->dev, &vport->dev_tracker); +error_free_vport: + ovs_vport_free(vport); + return ERR_PTR(err); } static void vport_netdev_free(struct rcu_head *rcu) @@ -196,9 +208,13 @@ void ovs_netdev_tunnel_destroy(struct vport *vport) */ if (vport->dev->reg_state == NETREG_REGISTERED) rtnl_delete_link(vport->dev, 0, NULL); - rtnl_unlock(); + /* We can't put the device reference yet, since it can still be in + * use, but rtnl_unlock()->netdev_run_todo() will block until all + * the references are released, so the RCU call must be before it. + */ call_rcu(&vport->rcu, vport_netdev_free); + rtnl_unlock(); } EXPORT_SYMBOL_GPL(ovs_netdev_tunnel_destroy); diff --git a/net/openvswitch/vport-netdev.h b/net/openvswitch/vport-netdev.h index c5d83a43bfc49b..6c0d7366f9862e 100644 --- a/net/openvswitch/vport-netdev.h +++ b/net/openvswitch/vport-netdev.h @@ -13,7 +13,7 @@ struct vport *ovs_netdev_get_vport(struct net_device *dev); -struct vport *ovs_netdev_link(struct vport *vport, const char *name); +struct vport *ovs_netdev_link(struct vport *vport, bool tunnel); void ovs_netdev_detach_dev(struct vport *); int __init ovs_netdev_init(void); diff --git a/net/openvswitch/vport-vxlan.c b/net/openvswitch/vport-vxlan.c index 0b881b043bcf4a..c1b37b50d29e15 100644 --- a/net/openvswitch/vport-vxlan.c +++ b/net/openvswitch/vport-vxlan.c @@ -126,6 +126,9 @@ static struct vport *vxlan_tnl_create(const struct vport_parms *parms) goto error; } + vport->dev = dev; + netdev_hold(vport->dev, &vport->dev_tracker, GFP_KERNEL); + rtnl_unlock(); return vport; error: @@ -140,7 +143,7 @@ static struct vport *vxlan_create(const struct vport_parms *parms) if (IS_ERR(vport)) return vport; - return ovs_netdev_link(vport, parms->name); + return ovs_netdev_link(vport, true); } static struct vport_ops ovs_vxlan_netdev_vport_ops = { diff --git a/net/psp/psp_main.c b/net/psp/psp_main.c index 9508b6c380038b..e45549f08eef8a 100644 --- a/net/psp/psp_main.c +++ b/net/psp/psp_main.c @@ -263,15 +263,16 @@ EXPORT_SYMBOL(psp_dev_encapsulate); /* Receive handler for PSP packets. * - * Presently it accepts only already-authenticated packets and does not - * support optional fields, such as virtualization cookies. The caller should - * ensure that skb->data is pointing to the mac header, and that skb->mac_len - * is set. This function does not currently adjust skb->csum (CHECKSUM_COMPLETE - * is not supported). + * Accepts only already-authenticated packets. The full PSP header is + * stripped according to psph->hdrlen; any optional fields it advertises + * (virtualization cookies, etc.) are ignored and discarded along with the + * rest of the header. The caller should ensure that skb->data is pointing + * to the mac header, and that skb->mac_len is set. This function does not + * currently adjust skb->csum (CHECKSUM_COMPLETE is not supported). */ int psp_dev_rcv(struct sk_buff *skb, u16 dev_id, u8 generation, bool strip_icv) { - int l2_hlen = 0, l3_hlen, encap; + int l2_hlen = 0, l3_hlen, encap, psp_hlen; struct psp_skb_ext *pse; struct psphdr *psph; struct ethhdr *eth; @@ -312,18 +313,36 @@ int psp_dev_rcv(struct sk_buff *skb, u16 dev_id, u8 generation, bool strip_icv) if (unlikely(uh->dest != htons(PSP_DEFAULT_UDP_PORT))) return -EINVAL; - pse = skb_ext_add(skb, SKB_EXT_PSP); - if (!pse) + psph = (struct psphdr *)(skb->data + l2_hlen + l3_hlen + + sizeof(struct udphdr)); + + /* Strip the full PSP header per psph->hdrlen; VC/options are pulled + * into the linear region only so they can be discarded with the + * rest of the header. + */ + psp_hlen = (psph->hdrlen + 1) * 8; + + if (unlikely(psp_hlen < sizeof(struct psphdr))) + return -EINVAL; + + if (psp_hlen > sizeof(struct psphdr) && + !pskb_may_pull(skb, l2_hlen + l3_hlen + + sizeof(struct udphdr) + psp_hlen)) return -EINVAL; psph = (struct psphdr *)(skb->data + l2_hlen + l3_hlen + sizeof(struct udphdr)); + + pse = skb_ext_add(skb, SKB_EXT_PSP); + if (!pse) + return -EINVAL; + pse->spi = psph->spi; pse->dev_id = dev_id; pse->generation = generation; pse->version = FIELD_GET(PSPHDR_VERFL_VERSION, psph->verfl); - encap = PSP_ENCAP_HLEN; + encap = sizeof(struct udphdr) + psp_hlen; encap += strip_icv ? PSP_TRL_SIZE : 0; if (proto == htons(ETH_P_IP)) { @@ -340,8 +359,9 @@ int psp_dev_rcv(struct sk_buff *skb, u16 dev_id, u8 generation, bool strip_icv) ipv6h->payload_len = htons(ntohs(ipv6h->payload_len) - encap); } - memmove(skb->data + PSP_ENCAP_HLEN, skb->data, l2_hlen + l3_hlen); - skb_pull(skb, PSP_ENCAP_HLEN); + memmove(skb->data + sizeof(struct udphdr) + psp_hlen, + skb->data, l2_hlen + l3_hlen); + skb_pull(skb, sizeof(struct udphdr) + psp_hlen); if (strip_icv) pskb_trim(skb, skb->len - PSP_TRL_SIZE); diff --git a/net/rds/message.c b/net/rds/message.c index eaa6f22601a447..7feb0eb6537db8 100644 --- a/net/rds/message.c +++ b/net/rds/message.c @@ -131,24 +131,34 @@ static void rds_rm_zerocopy_callback(struct rds_sock *rs, */ static void rds_message_purge(struct rds_message *rm) { + struct rds_znotifier *znotifier; unsigned long i, flags; - bool zcopy = false; + bool zcopy; if (unlikely(test_bit(RDS_MSG_PAGEVEC, &rm->m_flags))) return; spin_lock_irqsave(&rm->m_rs_lock, flags); + znotifier = rm->data.op_mmp_znotifier; + rm->data.op_mmp_znotifier = NULL; + zcopy = !!znotifier; + if (rm->m_rs) { struct rds_sock *rs = rm->m_rs; - if (rm->data.op_mmp_znotifier) { - zcopy = true; - rds_rm_zerocopy_callback(rs, rm->data.op_mmp_znotifier); + if (znotifier) { + rds_rm_zerocopy_callback(rs, znotifier); rds_wake_sk_sleep(rs); - rm->data.op_mmp_znotifier = NULL; } sock_put(rds_rs_to_sk(rs)); rm->m_rs = NULL; + } else if (znotifier) { + /* + * Zerocopy can fail before the message is queued on the + * socket, so there is no rs to carry the notification. + */ + mm_unaccount_pinned_pages(&znotifier->z_mmp); + kfree(rds_info_from_znotifier(znotifier)); } spin_unlock_irqrestore(&rm->m_rs_lock, flags); @@ -438,6 +448,7 @@ static int rds_message_zcopy_from_user(struct rds_message *rm, struct iov_iter * for (i = 0; i < rm->data.op_nents; i++) put_page(sg_page(&rm->data.op_sg[i])); + rm->data.op_nents = 0; mmp = &rm->data.op_mmp_znotifier->z_mmp; mm_unaccount_pinned_pages(mmp); ret = -EFAULT; diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c index fdd683261226cf..2b19b252225e55 100644 --- a/net/rxrpc/call_event.c +++ b/net/rxrpc/call_event.c @@ -334,7 +334,9 @@ bool rxrpc_input_call_event(struct rxrpc_call *call) if (sp->hdr.type == RXRPC_PACKET_TYPE_DATA && sp->hdr.securityIndex != 0 && - skb_cloned(skb)) { + (skb_cloned(skb) || + skb_has_frag_list(skb) || + skb_has_shared_frag(skb))) { /* Unshare the packet so that it can be * modified by in-place decryption. */ diff --git a/net/rxrpc/conn_event.c b/net/rxrpc/conn_event.c index a2130d25aaa9b7..442414d90ba1cd 100644 --- a/net/rxrpc/conn_event.c +++ b/net/rxrpc/conn_event.c @@ -245,7 +245,8 @@ static int rxrpc_verify_response(struct rxrpc_connection *conn, { int ret; - if (skb_cloned(skb)) { + if (skb_cloned(skb) || skb_has_frag_list(skb) || + skb_has_shared_frag(skb)) { /* Copy the packet if shared so that we can do in-place * decryption. */ diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c index 13c6d1869a1447..5862933be8d746 100644 --- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -399,14 +399,14 @@ static void cake_configure_rates(struct Qdisc *sch, u64 rate, bool rate_adjust); * Here, invsqrt is a fixed point number (< 1.0), 32bit mantissa, aka Q0.32 */ -static void cobalt_newton_step(struct cobalt_vars *vars) +static void cobalt_newton_step(struct cobalt_vars *vars, u32 count) { u32 invsqrt, invsqrt2; u64 val; invsqrt = vars->rec_inv_sqrt; invsqrt2 = ((u64)invsqrt * invsqrt) >> 32; - val = (3LL << 32) - ((u64)vars->count * invsqrt2); + val = (3LL << 32) - ((u64)count * invsqrt2); val >>= 2; /* avoid overflow in following multiply */ val = (val * invsqrt) >> (32 - 2 + 1); @@ -414,12 +414,12 @@ static void cobalt_newton_step(struct cobalt_vars *vars) vars->rec_inv_sqrt = val; } -static void cobalt_invsqrt(struct cobalt_vars *vars) +static void cobalt_invsqrt(struct cobalt_vars *vars, u32 count) { - if (vars->count < REC_INV_SQRT_CACHE) - vars->rec_inv_sqrt = inv_sqrt_cache[vars->count]; + if (count < REC_INV_SQRT_CACHE) + vars->rec_inv_sqrt = inv_sqrt_cache[count]; else - cobalt_newton_step(vars); + cobalt_newton_step(vars, count); } static void cobalt_vars_init(struct cobalt_vars *vars) @@ -449,16 +449,19 @@ static bool cobalt_queue_full(struct cobalt_vars *vars, bool up = false; if (ktime_to_ns(ktime_sub(now, vars->blue_timer)) > p->target) { - up = !vars->p_drop; - vars->p_drop += p->p_inc; - if (vars->p_drop < p->p_inc) - vars->p_drop = ~0; - vars->blue_timer = now; - } - vars->dropping = true; - vars->drop_next = now; + u32 p_drop = vars->p_drop; + + up = !p_drop; + p_drop += p->p_inc; + if (p_drop < p->p_inc) + p_drop = ~0; + WRITE_ONCE(vars->p_drop, p_drop); + WRITE_ONCE(vars->blue_timer, now); + } + WRITE_ONCE(vars->dropping, true); + WRITE_ONCE(vars->drop_next, now); if (!vars->count) - vars->count = 1; + WRITE_ONCE(vars->count, 1); return up; } @@ -475,20 +478,20 @@ static bool cobalt_queue_empty(struct cobalt_vars *vars, if (vars->p_drop && ktime_to_ns(ktime_sub(now, vars->blue_timer)) > p->target) { if (vars->p_drop < p->p_dec) - vars->p_drop = 0; + WRITE_ONCE(vars->p_drop, 0); else - vars->p_drop -= p->p_dec; - vars->blue_timer = now; + WRITE_ONCE(vars->p_drop, vars->p_drop - p->p_dec); + WRITE_ONCE(vars->blue_timer, now); down = !vars->p_drop; } - vars->dropping = false; + WRITE_ONCE(vars->dropping, false); if (vars->count && ktime_to_ns(ktime_sub(now, vars->drop_next)) >= 0) { - vars->count--; - cobalt_invsqrt(vars); - vars->drop_next = cobalt_control(vars->drop_next, - p->interval, - vars->rec_inv_sqrt); + WRITE_ONCE(vars->count, vars->count - 1); + cobalt_invsqrt(vars, vars->count); + WRITE_ONCE(vars->drop_next, + cobalt_control(vars->drop_next, p->interval, + vars->rec_inv_sqrt)); } return down; @@ -507,6 +510,7 @@ static enum qdisc_drop_reason cobalt_should_drop(struct cobalt_vars *vars, bool next_due, over_target; ktime_t schedule; u64 sojourn; + u32 count; /* The 'schedule' variable records, in its sign, whether 'now' is before or * after 'drop_next'. This allows 'drop_next' to be updated before the next @@ -528,21 +532,22 @@ static enum qdisc_drop_reason cobalt_should_drop(struct cobalt_vars *vars, over_target = sojourn > p->target && sojourn > p->mtu_time * bulk_flows * 2 && sojourn > p->mtu_time * 4; - next_due = vars->count && ktime_to_ns(schedule) >= 0; + count = vars->count; + next_due = count && ktime_to_ns(schedule) >= 0; vars->ecn_marked = false; if (over_target) { if (!vars->dropping) { - vars->dropping = true; - vars->drop_next = cobalt_control(now, - p->interval, - vars->rec_inv_sqrt); + WRITE_ONCE(vars->dropping, true); + WRITE_ONCE(vars->drop_next, + cobalt_control(now, p->interval, + vars->rec_inv_sqrt)); } - if (!vars->count) - vars->count = 1; + if (!count) + count = 1; } else if (vars->dropping) { - vars->dropping = false; + WRITE_ONCE(vars->dropping, false); } if (next_due && vars->dropping) { @@ -550,23 +555,23 @@ static enum qdisc_drop_reason cobalt_should_drop(struct cobalt_vars *vars, if (!(vars->ecn_marked = INET_ECN_set_ce(skb))) reason = QDISC_DROP_CONGESTED; - vars->count++; - if (!vars->count) - vars->count--; - cobalt_invsqrt(vars); - vars->drop_next = cobalt_control(vars->drop_next, - p->interval, - vars->rec_inv_sqrt); + count++; + if (!count) + count--; + cobalt_invsqrt(vars, count); + WRITE_ONCE(vars->drop_next, + cobalt_control(vars->drop_next, p->interval, + vars->rec_inv_sqrt)); schedule = ktime_sub(now, vars->drop_next); } else { while (next_due) { - vars->count--; - cobalt_invsqrt(vars); - vars->drop_next = cobalt_control(vars->drop_next, - p->interval, - vars->rec_inv_sqrt); + count--; + cobalt_invsqrt(vars, count); + WRITE_ONCE(vars->drop_next, + cobalt_control(vars->drop_next, p->interval, + vars->rec_inv_sqrt)); schedule = ktime_sub(now, vars->drop_next); - next_due = vars->count && ktime_to_ns(schedule) >= 0; + next_due = count && ktime_to_ns(schedule) >= 0; } } @@ -575,11 +580,12 @@ static enum qdisc_drop_reason cobalt_should_drop(struct cobalt_vars *vars, get_random_u32() < vars->p_drop) reason = QDISC_DROP_FLOOD_PROTECTION; + WRITE_ONCE(vars->count, count); /* Overload the drop_next field as an activity timeout */ - if (!vars->count) - vars->drop_next = ktime_add_ns(now, p->interval); + if (!count) + WRITE_ONCE(vars->drop_next, ktime_add_ns(now, p->interval)); else if (ktime_to_ns(schedule) > 0 && reason == QDISC_DROP_UNSPEC) - vars->drop_next = now; + WRITE_ONCE(vars->drop_next, now); return reason; } @@ -914,7 +920,7 @@ static struct sk_buff *dequeue_head(struct cake_flow *flow) struct sk_buff *skb = flow->head; if (skb) { - flow->head = skb->next; + WRITE_ONCE(flow->head, skb->next); skb_mark_not_on_list(skb); } @@ -926,7 +932,7 @@ static struct sk_buff *dequeue_head(struct cake_flow *flow) static void flow_queue_add(struct cake_flow *flow, struct sk_buff *skb) { if (!flow->head) - flow->head = skb; + WRITE_ONCE(flow->head, skb); else flow->tail->next = skb; flow->tail = skb; @@ -1357,7 +1363,7 @@ static struct sk_buff *cake_ack_filter(struct cake_sched_data *q, if (elig_ack_prev) elig_ack_prev->next = elig_ack->next; else - flow->head = elig_ack->next; + WRITE_ONCE(flow->head, elig_ack->next); skb_mark_not_on_list(elig_ack); @@ -1595,11 +1601,11 @@ static unsigned int cake_drop(struct Qdisc *sch, struct sk_buff **to_free) len = qdisc_pkt_len(skb); q->buffer_used -= skb->truesize; - b->backlogs[idx] -= len; WRITE_ONCE(b->tin_backlog, b->tin_backlog - len); + WRITE_ONCE(b->backlogs[idx], b->backlogs[idx] - len); sch->qstats.backlog -= len; - flow->dropped++; + WRITE_ONCE(flow->dropped, flow->dropped + 1); WRITE_ONCE(b->tin_dropped, b->tin_dropped + 1); if (q->config->rate_flags & CAKE_FLAG_INGRESS) @@ -1824,11 +1830,11 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch, } /* stats */ - b->backlogs[idx] += slen; sch->qstats.backlog += slen; q->avg_window_bytes += slen; WRITE_ONCE(b->bytes, b->bytes + slen); WRITE_ONCE(b->tin_backlog, b->tin_backlog + slen); + WRITE_ONCE(b->backlogs[idx], b->backlogs[idx] + slen); qdisc_tree_reduce_backlog(sch, 1-numsegs, len-slen); consume_skb(skb); @@ -1861,11 +1867,11 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch, /* stats */ WRITE_ONCE(b->packets, b->packets + 1); - b->backlogs[idx] += len - ack_pkt_len; sch->qstats.backlog += len - ack_pkt_len; q->avg_window_bytes += len - ack_pkt_len; WRITE_ONCE(b->bytes, b->bytes + len - ack_pkt_len); WRITE_ONCE(b->tin_backlog, b->tin_backlog + len - ack_pkt_len); + WRITE_ONCE(b->backlogs[idx], b->backlogs[idx] + len - ack_pkt_len); } if (q->overflow_timeout) @@ -1924,7 +1930,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch, flow->set = CAKE_SET_SPARSE; WRITE_ONCE(b->sparse_flow_count, b->sparse_flow_count + 1); - flow->deficit = cake_get_flow_quantum(b, flow, q->config->flow_mode); + WRITE_ONCE(flow->deficit, cake_get_flow_quantum(b, flow, q->config->flow_mode)); } else if (flow->set == CAKE_SET_SPARSE_WAIT) { /* this flow was empty, accounted as a sparse flow, but actually * in the bulk rotation. @@ -1977,7 +1983,7 @@ static struct sk_buff *cake_dequeue_one(struct Qdisc *sch) if (flow->head) { skb = dequeue_head(flow); len = qdisc_pkt_len(skb); - b->backlogs[q->cur_flow] -= len; + WRITE_ONCE(b->backlogs[q->cur_flow], b->backlogs[q->cur_flow] - len); WRITE_ONCE(b->tin_backlog, b->tin_backlog - len); sch->qstats.backlog -= len; q->buffer_used -= skb->truesize; @@ -2166,7 +2172,8 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch) } } - flow->deficit += cake_get_flow_quantum(b, flow, q->config->flow_mode); + WRITE_ONCE(flow->deficit, + flow->deficit + cake_get_flow_quantum(b, flow, q->config->flow_mode)); list_move_tail(&flow->flowchain, &b->old_flows); goto retry; @@ -2232,10 +2239,10 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch) if (q->config->rate_flags & CAKE_FLAG_INGRESS) { len = cake_advance_shaper(q, b, skb, now, true); - flow->deficit -= len; + WRITE_ONCE(flow->deficit, flow->deficit - len); b->tin_deficit -= len; } - flow->dropped++; + WRITE_ONCE(flow->dropped, flow->dropped + 1); WRITE_ONCE(b->tin_dropped, b->tin_dropped + 1); qdisc_tree_reduce_backlog(sch, 1, qdisc_pkt_len(skb)); qdisc_qstats_drop(sch); @@ -2259,7 +2266,7 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch) delay < b->base_delay ? 2 : 8)); len = cake_advance_shaper(q, b, skb, now, false); - flow->deficit -= len; + WRITE_ONCE(flow->deficit, flow->deficit - len); b->tin_deficit -= len; if (ktime_after(q->time_next_packet, now) && sch->q.qlen) { @@ -3137,7 +3144,7 @@ static int cake_dump_class_stats(struct Qdisc *sch, unsigned long cl, flow = &b->flows[idx % CAKE_QUEUES]; - if (flow->head) { + if (READ_ONCE(flow->head)) { sch_tree_lock(sch); skb = flow->head; while (skb) { @@ -3146,13 +3153,15 @@ static int cake_dump_class_stats(struct Qdisc *sch, unsigned long cl, } sch_tree_unlock(sch); } - qs.backlog = b->backlogs[idx % CAKE_QUEUES]; - qs.drops = flow->dropped; + qs.backlog = READ_ONCE(b->backlogs[idx % CAKE_QUEUES]); + qs.drops = READ_ONCE(flow->dropped); } if (gnet_stats_copy_queue(d, NULL, &qs, qs.qlen) < 0) return -1; if (flow) { ktime_t now = ktime_get(); + bool dropping; + u32 p_drop; stats = nla_nest_start_noflag(d->skb, TCA_STATS_APP); if (!stats) @@ -3167,21 +3176,23 @@ static int cake_dump_class_stats(struct Qdisc *sch, unsigned long cl, goto nla_put_failure; \ } while (0) - PUT_STAT_S32(DEFICIT, flow->deficit); - PUT_STAT_U32(DROPPING, flow->cvars.dropping); - PUT_STAT_U32(COBALT_COUNT, flow->cvars.count); - PUT_STAT_U32(P_DROP, flow->cvars.p_drop); - if (flow->cvars.p_drop) { + PUT_STAT_S32(DEFICIT, READ_ONCE(flow->deficit)); + dropping = READ_ONCE(flow->cvars.dropping); + PUT_STAT_U32(DROPPING, dropping); + PUT_STAT_U32(COBALT_COUNT, READ_ONCE(flow->cvars.count)); + p_drop = READ_ONCE(flow->cvars.p_drop); + PUT_STAT_U32(P_DROP, p_drop); + if (p_drop) { PUT_STAT_S32(BLUE_TIMER_US, ktime_to_us( ktime_sub(now, - flow->cvars.blue_timer))); + READ_ONCE(flow->cvars.blue_timer)))); } - if (flow->cvars.dropping) { + if (dropping) { PUT_STAT_S32(DROP_NEXT_US, ktime_to_us( ktime_sub(now, - flow->cvars.drop_next))); + READ_ONCE(flow->cvars.drop_next)))); } if (nla_nest_end(d->skb, stats) < 0) diff --git a/net/sched/sch_cbs.c b/net/sched/sch_cbs.c index 8c9a0400c8622c..0f953bd46b5814 100644 --- a/net/sched/sch_cbs.c +++ b/net/sched/sch_cbs.c @@ -243,6 +243,20 @@ static struct sk_buff *cbs_dequeue(struct Qdisc *sch) return q->dequeue(sch); } +static void cbs_reset(struct Qdisc *sch) +{ + struct cbs_sched_data *q = qdisc_priv(sch); + + /* Nothing to do if we couldn't create the underlying qdisc */ + if (!q->qdisc) + return; + + qdisc_reset(q->qdisc); + qdisc_watchdog_cancel(&q->watchdog); + q->credits = 0; + q->last = 0; +} + static const struct nla_policy cbs_policy[TCA_CBS_MAX + 1] = { [TCA_CBS_PARMS] = { .len = sizeof(struct tc_cbs_qopt) }, }; @@ -540,7 +554,7 @@ static struct Qdisc_ops cbs_qdisc_ops __read_mostly = { .dequeue = cbs_dequeue, .peek = qdisc_peek_dequeued, .init = cbs_init, - .reset = qdisc_reset_queue, + .reset = cbs_reset, .destroy = cbs_destroy, .change = cbs_change, .dump = cbs_dump, diff --git a/net/sched/sch_dualpi2.c b/net/sched/sch_dualpi2.c index 241e6a46bd00e3..a22489c14458e6 100644 --- a/net/sched/sch_dualpi2.c +++ b/net/sched/sch_dualpi2.c @@ -938,6 +938,8 @@ static int dualpi2_init(struct Qdisc *sch, struct nlattr *opt, int err; sch->flags |= TCQ_F_DEQUEUE_DROPS; + hrtimer_setup(&q->pi2_timer, dualpi2_timer, CLOCK_MONOTONIC, + HRTIMER_MODE_ABS_PINNED_SOFT); q->l_queue = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops, TC_H_MAKE(sch->handle, 1), extack); @@ -950,8 +952,6 @@ static int dualpi2_init(struct Qdisc *sch, struct nlattr *opt, q->sch = sch; dualpi2_reset_default(sch); - hrtimer_setup(&q->pi2_timer, dualpi2_timer, CLOCK_MONOTONIC, - HRTIMER_MODE_ABS_PINNED_SOFT); if (opt && nla_len(opt)) { err = dualpi2_change(sch, opt, extack); diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c index 0664b2f2d6f280..24db54684e8a59 100644 --- a/net/sched/sch_fq_codel.c +++ b/net/sched/sch_fq_codel.c @@ -117,7 +117,7 @@ static inline struct sk_buff *dequeue_head(struct fq_codel_flow *flow) { struct sk_buff *skb = flow->head; - flow->head = skb->next; + WRITE_ONCE(flow->head, skb->next); skb_mark_not_on_list(skb); return skb; } @@ -127,7 +127,7 @@ static inline void flow_queue_add(struct fq_codel_flow *flow, struct sk_buff *skb) { if (flow->head == NULL) - flow->head = skb; + WRITE_ONCE(flow->head, skb); else flow->tail->next = skb; flow->tail = skb; @@ -173,8 +173,8 @@ static unsigned int fq_codel_drop(struct Qdisc *sch, unsigned int max_packets, } while (++i < max_packets && len < threshold); /* Tell codel to increase its signal strength also */ - flow->cvars.count += i; - q->backlogs[idx] -= len; + WRITE_ONCE(flow->cvars.count, flow->cvars.count + i); + WRITE_ONCE(q->backlogs[idx], q->backlogs[idx] - len); q->memory_usage -= mem; sch->qstats.drops += i; sch->qstats.backlog -= len; @@ -204,13 +204,13 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch, codel_set_enqueue_time(skb); flow = &q->flows[idx]; flow_queue_add(flow, skb); - q->backlogs[idx] += qdisc_pkt_len(skb); + WRITE_ONCE(q->backlogs[idx], q->backlogs[idx] + qdisc_pkt_len(skb)); qdisc_qstats_backlog_inc(sch, skb); if (list_empty(&flow->flowchain)) { list_add_tail(&flow->flowchain, &q->new_flows); q->new_flow_count++; - flow->deficit = q->quantum; + WRITE_ONCE(flow->deficit, q->quantum); } get_codel_cb(skb)->mem_usage = skb->truesize; q->memory_usage += get_codel_cb(skb)->mem_usage; @@ -263,7 +263,8 @@ static struct sk_buff *dequeue_func(struct codel_vars *vars, void *ctx) flow = container_of(vars, struct fq_codel_flow, cvars); if (flow->head) { skb = dequeue_head(flow); - q->backlogs[flow - q->flows] -= qdisc_pkt_len(skb); + WRITE_ONCE(q->backlogs[flow - q->flows], + q->backlogs[flow - q->flows] - qdisc_pkt_len(skb)); q->memory_usage -= get_codel_cb(skb)->mem_usage; sch->q.qlen--; sch->qstats.backlog -= qdisc_pkt_len(skb); @@ -296,7 +297,7 @@ static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch) flow = list_first_entry(head, struct fq_codel_flow, flowchain); if (flow->deficit <= 0) { - flow->deficit += q->quantum; + WRITE_ONCE(flow->deficit, flow->deficit + q->quantum); list_move_tail(&flow->flowchain, &q->old_flows); goto begin; } @@ -314,7 +315,7 @@ static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch) goto begin; } qdisc_bstats_update(sch, skb); - flow->deficit -= qdisc_pkt_len(skb); + WRITE_ONCE(flow->deficit, flow->deficit - qdisc_pkt_len(skb)); if (q->cstats.drop_count) { qdisc_tree_reduce_backlog(sch, q->cstats.drop_count, @@ -328,7 +329,7 @@ static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch) static void fq_codel_flow_purge(struct fq_codel_flow *flow) { rtnl_kfree_skbs(flow->head, flow->tail); - flow->head = NULL; + WRITE_ONCE(flow->head, NULL); } static void fq_codel_reset(struct Qdisc *sch) @@ -656,21 +657,21 @@ static int fq_codel_dump_class_stats(struct Qdisc *sch, unsigned long cl, memset(&xstats, 0, sizeof(xstats)); xstats.type = TCA_FQ_CODEL_XSTATS_CLASS; - xstats.class_stats.deficit = flow->deficit; + xstats.class_stats.deficit = READ_ONCE(flow->deficit); xstats.class_stats.ldelay = - codel_time_to_us(flow->cvars.ldelay); - xstats.class_stats.count = flow->cvars.count; - xstats.class_stats.lastcount = flow->cvars.lastcount; - xstats.class_stats.dropping = flow->cvars.dropping; - if (flow->cvars.dropping) { - codel_tdiff_t delta = flow->cvars.drop_next - + codel_time_to_us(READ_ONCE(flow->cvars.ldelay)); + xstats.class_stats.count = READ_ONCE(flow->cvars.count); + xstats.class_stats.lastcount = READ_ONCE(flow->cvars.lastcount); + xstats.class_stats.dropping = READ_ONCE(flow->cvars.dropping); + if (xstats.class_stats.dropping) { + codel_tdiff_t delta = READ_ONCE(flow->cvars.drop_next) - codel_get_time(); xstats.class_stats.drop_next = (delta >= 0) ? codel_time_to_us(delta) : -codel_time_to_us(-delta); } - if (flow->head) { + if (READ_ONCE(flow->head)) { sch_tree_lock(sch); skb = flow->head; while (skb) { @@ -679,7 +680,7 @@ static int fq_codel_dump_class_stats(struct Qdisc *sch, unsigned long cl, } sch_tree_unlock(sch); } - qs.backlog = q->backlogs[idx]; + qs.backlog = READ_ONCE(q->backlogs[idx]); qs.drops = 0; } if (gnet_stats_copy_queue(d, NULL, &qs, qs.qlen) < 0) diff --git a/net/sched/sch_pie.c b/net/sched/sch_pie.c index fb53fbf0e32857..b41f2def2e2cc5 100644 --- a/net/sched/sch_pie.c +++ b/net/sched/sch_pie.c @@ -219,16 +219,14 @@ void pie_process_dequeue(struct sk_buff *skb, struct pie_params *params, * packet timestamp. */ if (!params->dq_rate_estimator) { - vars->qdelay = now - pie_get_enqueue_time(skb); + WRITE_ONCE(vars->qdelay, + backlog ? now - pie_get_enqueue_time(skb) : 0); if (vars->dq_tstamp != DTIME_INVALID) dtime = now - vars->dq_tstamp; vars->dq_tstamp = now; - if (backlog == 0) - vars->qdelay = 0; - if (dtime == 0) return; @@ -376,7 +374,7 @@ void pie_calculate_probability(struct pie_params *params, struct pie_vars *vars, if (qdelay > (PSCHED_NS2TICKS(250 * NSEC_PER_MSEC))) delta += MAX_PROB / (100 / 2); - vars->prob += delta; + WRITE_ONCE(vars->prob, vars->prob + delta); if (delta > 0) { /* prevent overflow */ @@ -401,7 +399,7 @@ void pie_calculate_probability(struct pie_params *params, struct pie_vars *vars, if (qdelay == 0 && qdelay_old == 0 && update_prob) /* Reduce drop probability to 98.4% */ - vars->prob -= vars->prob / 64; + WRITE_ONCE(vars->prob, vars->prob - vars->prob / 64); WRITE_ONCE(vars->qdelay, qdelay); vars->backlog_old = backlog; @@ -501,7 +499,7 @@ static int pie_dump_stats(struct Qdisc *sch, struct gnet_dump *d) { struct pie_sched_data *q = qdisc_priv(sch); struct tc_pie_xstats st = { - .prob = q->vars.prob << BITS_PER_BYTE, + .prob = READ_ONCE(q->vars.prob) << BITS_PER_BYTE, .delay = ((u32)PSCHED_TICKS2NS(READ_ONCE(q->vars.qdelay))) / NSEC_PER_USEC, .packets_in = READ_ONCE(q->stats.packets_in), @@ -512,7 +510,7 @@ static int pie_dump_stats(struct Qdisc *sch, struct gnet_dump *d) }; /* avg_dq_rate is only valid if dq_rate_estimator is enabled */ - st.dq_rate_estimating = q->params.dq_rate_estimator; + st.dq_rate_estimating = READ_ONCE(q->params.dq_rate_estimator); /* unscale and return dq_rate in bytes per sec */ if (st.dq_rate_estimating) diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c index 432b8a3000a57b..4d0e44a2e7c664 100644 --- a/net/sched/sch_red.c +++ b/net/sched/sch_red.c @@ -162,7 +162,7 @@ static struct sk_buff *red_dequeue(struct Qdisc *sch) struct red_sched_data *q = qdisc_priv(sch); struct Qdisc *child = q->qdisc; - skb = child->dequeue(child); + skb = qdisc_dequeue_peeked(child); if (skb) { qdisc_bstats_update(sch, skb); qdisc_qstats_backlog_dec(sch, skb); diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c index bd5ef561030fea..d3ee8e5479b35e 100644 --- a/net/sched/sch_sfb.c +++ b/net/sched/sch_sfb.c @@ -441,7 +441,7 @@ static struct sk_buff *sfb_dequeue(struct Qdisc *sch) struct Qdisc *child = q->qdisc; struct sk_buff *skb; - skb = child->dequeue(q->qdisc); + skb = qdisc_dequeue_peeked(child); if (skb) { qdisc_bstats_update(sch, skb); diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index c3f3181dba5424..f39822babf88be 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -225,7 +225,8 @@ static inline void sfq_dec(struct sfq_sched_data *q, sfq_index x) sfq_unlink(q, x, n, p); - d = q->slots[x].qlen--; + d = q->slots[x].qlen; + WRITE_ONCE(q->slots[x].qlen, d - 1); if (n == p && q->cur_depth == d) q->cur_depth--; sfq_link(q, x); @@ -238,7 +239,8 @@ static inline void sfq_inc(struct sfq_sched_data *q, sfq_index x) sfq_unlink(q, x, n, p); - d = ++q->slots[x].qlen; + d = q->slots[x].qlen + 1; + WRITE_ONCE(q->slots[x].qlen, d); if (q->cur_depth < d) q->cur_depth = d; sfq_link(q, x); @@ -298,7 +300,7 @@ static unsigned int sfq_drop(struct Qdisc *sch, struct sk_buff **to_free) drop: skb = q->headdrop ? slot_dequeue_head(slot) : slot_dequeue_tail(slot); len = qdisc_pkt_len(skb); - slot->backlog -= len; + WRITE_ONCE(slot->backlog, slot->backlog - len); sfq_dec(q, x); sch->q.qlen--; qdisc_qstats_backlog_dec(sch, skb); @@ -314,7 +316,7 @@ static unsigned int sfq_drop(struct Qdisc *sch, struct sk_buff **to_free) q->tail = NULL; /* no more active slots */ else q->tail->next = slot->next; - q->ht[slot->hash] = SFQ_EMPTY_SLOT; + WRITE_ONCE(q->ht[slot->hash], SFQ_EMPTY_SLOT); goto drop; } @@ -364,10 +366,10 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) x = q->dep[0].next; /* get a free slot */ if (x >= SFQ_MAX_FLOWS) return qdisc_drop_reason(skb, sch, to_free, QDISC_DROP_MAXFLOWS); - q->ht[hash] = x; + WRITE_ONCE(q->ht[hash], x); slot = &q->slots[x]; slot->hash = hash; - slot->backlog = 0; /* should already be 0 anyway... */ + WRITE_ONCE(slot->backlog, 0); /* should already be 0 anyway... */ red_set_vars(&slot->vars); goto enqueue; } @@ -426,7 +428,7 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) head = slot_dequeue_head(slot); delta = qdisc_pkt_len(head) - qdisc_pkt_len(skb); sch->qstats.backlog -= delta; - slot->backlog -= delta; + WRITE_ONCE(slot->backlog, slot->backlog - delta); qdisc_drop_reason(head, sch, to_free, QDISC_DROP_FLOW_LIMIT); slot_queue_add(slot, skb); @@ -436,7 +438,7 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) enqueue: qdisc_qstats_backlog_inc(sch, skb); - slot->backlog += qdisc_pkt_len(skb); + WRITE_ONCE(slot->backlog, slot->backlog + qdisc_pkt_len(skb)); slot_queue_add(slot, skb); sfq_inc(q, x); if (slot->qlen == 1) { /* The flow is new */ @@ -452,7 +454,7 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) */ q->tail = slot; /* We could use a bigger initial quantum for new flows */ - slot->allot = q->quantum; + WRITE_ONCE(slot->allot, q->quantum); } if (++sch->q.qlen <= q->limit) return NET_XMIT_SUCCESS; @@ -489,7 +491,7 @@ sfq_dequeue(struct Qdisc *sch) slot = &q->slots[a]; if (slot->allot <= 0) { q->tail = slot; - slot->allot += q->quantum; + WRITE_ONCE(slot->allot, slot->allot + q->quantum); goto next_slot; } skb = slot_dequeue_head(slot); @@ -497,10 +499,10 @@ sfq_dequeue(struct Qdisc *sch) qdisc_bstats_update(sch, skb); sch->q.qlen--; qdisc_qstats_backlog_dec(sch, skb); - slot->backlog -= qdisc_pkt_len(skb); + WRITE_ONCE(slot->backlog, slot->backlog - qdisc_pkt_len(skb)); /* Is the slot empty? */ if (slot->qlen == 0) { - q->ht[slot->hash] = SFQ_EMPTY_SLOT; + WRITE_ONCE(q->ht[slot->hash], SFQ_EMPTY_SLOT); next_a = slot->next; if (a == next_a) { q->tail = NULL; /* no more active slots */ @@ -508,7 +510,7 @@ sfq_dequeue(struct Qdisc *sch) } q->tail->next = next_a; } else { - slot->allot -= qdisc_pkt_len(skb); + WRITE_ONCE(slot->allot, slot->allot - qdisc_pkt_len(skb)); } return skb; } @@ -549,9 +551,9 @@ static void sfq_rehash(struct Qdisc *sch) sfq_dec(q, i); __skb_queue_tail(&list, skb); } - slot->backlog = 0; + WRITE_ONCE(slot->backlog, 0); red_set_vars(&slot->vars); - q->ht[slot->hash] = SFQ_EMPTY_SLOT; + WRITE_ONCE(q->ht[slot->hash], SFQ_EMPTY_SLOT); } q->tail = NULL; @@ -570,7 +572,7 @@ static void sfq_rehash(struct Qdisc *sch) dropped++; continue; } - q->ht[hash] = x; + WRITE_ONCE(q->ht[hash], x); slot = &q->slots[x]; slot->hash = hash; } @@ -581,7 +583,7 @@ static void sfq_rehash(struct Qdisc *sch) slot->vars.qavg = red_calc_qavg(q->red_parms, &slot->vars, slot->backlog); - slot->backlog += qdisc_pkt_len(skb); + WRITE_ONCE(slot->backlog, slot->backlog + qdisc_pkt_len(skb)); sfq_inc(q, x); if (slot->qlen == 1) { /* The flow is new */ if (q->tail == NULL) { /* It is the first flow */ @@ -591,7 +593,7 @@ static void sfq_rehash(struct Qdisc *sch) q->tail->next = x; } q->tail = slot; - slot->allot = q->quantum; + WRITE_ONCE(slot->allot, q->quantum); } } sch->q.qlen -= dropped; @@ -905,16 +907,16 @@ static int sfq_dump_class_stats(struct Qdisc *sch, unsigned long cl, struct gnet_dump *d) { struct sfq_sched_data *q = qdisc_priv(sch); - sfq_index idx = q->ht[cl - 1]; + sfq_index idx = READ_ONCE(q->ht[cl - 1]); struct gnet_stats_queue qs = { 0 }; struct tc_sfq_xstats xstats = { 0 }; if (idx != SFQ_EMPTY_SLOT) { const struct sfq_slot *slot = &q->slots[idx]; - xstats.allot = slot->allot; - qs.qlen = slot->qlen; - qs.backlog = slot->backlog; + xstats.allot = READ_ONCE(slot->allot); + qs.qlen = READ_ONCE(slot->qlen); + qs.backlog = READ_ONCE(slot->backlog); } if (gnet_stats_copy_queue(d, NULL, &qs, qs.qlen) < 0) return -1; @@ -930,7 +932,7 @@ static void sfq_walk(struct Qdisc *sch, struct qdisc_walker *arg) return; for (i = 0; i < q->divisor; i++) { - if (q->ht[i] == SFQ_EMPTY_SLOT) { + if (READ_ONCE(q->ht[i]) == SFQ_EMPTY_SLOT) { arg->count++; continue; } diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 58d0d9747f0b39..1d2568bb6bc277 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -1986,6 +1986,15 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len) goto out_unlock; iov_iter_revert(&msg->msg_iter, err); + + /* sctp_sendmsg_to_asoc() may have released the socket + * lock (sctp_wait_for_sndbuf), during which other + * associations on ep->asocs could have been peeled + * off or freed. @asoc itself is revalidated by the + * base.dead and base.sk checks in sctp_wait_for_sndbuf, + * so re-derive the cached cursor from it. + */ + tmp = list_next_entry(asoc, asocs); } goto out_unlock; diff --git a/net/shaper/shaper.c b/net/shaper/shaper.c index 94bc9c7382ea62..b1c65110f04d37 100644 --- a/net/shaper/shaper.c +++ b/net/shaper/shaper.c @@ -21,6 +21,8 @@ #define NET_SHAPER_ID_UNSPEC NET_SHAPER_ID_MASK +static_assert(NET_SHAPER_ID_UNSPEC == NET_SHAPER_MAX_HANDLE_ID + 1); + struct net_shaper_hierarchy { struct xarray shapers; }; @@ -90,6 +92,12 @@ static int net_shaper_handle_size(void) nla_total_size(sizeof(u32))); } +static int net_shaper_group_reply_size(void) +{ + return nla_total_size(sizeof(u32)) + /* NET_SHAPER_A_IFINDEX */ + net_shaper_handle_size(); /* NET_SHAPER_A_HANDLE */ +} + static int net_shaper_fill_binding(struct sk_buff *msg, const struct net_shaper_binding *binding, u32 type) @@ -275,11 +283,13 @@ static void net_shaper_default_parent(const struct net_shaper_handle *handle, parent->id = 0; } -/* - * MARK_0 is already in use due to XA_FLAGS_ALLOC, can't reuse such flag as - * it's cleared by xa_store(). +/* MARK_0 is already in use due to XA_FLAGS_ALLOC. The VALID mark is set on + * an entry only after the device-side configuration has completed + * successfully (see net_shaper_commit()). Lookups and dumps must filter on + * this mark to avoid exposing tentative entries inserted by + * net_shaper_pre_insert() while the driver call is still in flight. */ -#define NET_SHAPER_NOT_VALID XA_MARK_1 +#define NET_SHAPER_VALID XA_MARK_1 static struct net_shaper * net_shaper_lookup(struct net_shaper_binding *binding, @@ -289,10 +299,14 @@ net_shaper_lookup(struct net_shaper_binding *binding, struct net_shaper_hierarchy *hierarchy; hierarchy = net_shaper_hierarchy_rcu(binding); - if (!hierarchy || xa_get_mark(&hierarchy->shapers, index, - NET_SHAPER_NOT_VALID)) + if (!hierarchy || !xa_get_mark(&hierarchy->shapers, index, + NET_SHAPER_VALID)) return NULL; + /* Pairs with smp_wmb() in net_shaper_commit(): if the entry is + * valid, its contents must be visible too. + */ + smp_rmb(); return xa_load(&hierarchy->shapers, index); } @@ -348,7 +362,7 @@ static int net_shaper_pre_insert(struct net_shaper_binding *binding, handle->id == NET_SHAPER_ID_UNSPEC) { u32 min, max; - handle->id = NET_SHAPER_ID_MASK - 1; + handle->id = NET_SHAPER_MAX_HANDLE_ID; max = net_shaper_handle_to_index(handle); handle->id = 0; min = net_shaper_handle_to_index(handle); @@ -370,13 +384,10 @@ static int net_shaper_pre_insert(struct net_shaper_binding *binding, goto free_id; } - /* Mark 'tentative' shaper inside the hierarchy container. - * xa_set_mark is a no-op if the previous store fails. + /* Insert as 'tentative' (no VALID mark). The mark will be set by + * net_shaper_commit() once the driver-side configuration succeeds. */ - xa_lock(&hierarchy->shapers); - prev = __xa_store(&hierarchy->shapers, index, cur, GFP_KERNEL); - __xa_set_mark(&hierarchy->shapers, index, NET_SHAPER_NOT_VALID); - xa_unlock(&hierarchy->shapers); + prev = xa_store(&hierarchy->shapers, index, cur, GFP_KERNEL); if (xa_err(prev)) { NL_SET_ERR_MSG(extack, "Can't insert shaper into device store"); kfree_rcu(cur, rcu); @@ -413,9 +424,9 @@ static void net_shaper_commit(struct net_shaper_binding *binding, /* Successful update: drop the tentative mark * and update the hierarchy container. */ - __xa_clear_mark(&hierarchy->shapers, index, - NET_SHAPER_NOT_VALID); *cur = shapers[i]; + smp_wmb(); + __xa_set_mark(&hierarchy->shapers, index, NET_SHAPER_VALID); } xa_unlock(&hierarchy->shapers); } @@ -431,8 +442,9 @@ static void net_shaper_rollback(struct net_shaper_binding *binding) return; xa_lock(&hierarchy->shapers); - xa_for_each_marked(&hierarchy->shapers, index, cur, - NET_SHAPER_NOT_VALID) { + xa_for_each(&hierarchy->shapers, index, cur) { + if (xa_get_mark(&hierarchy->shapers, index, NET_SHAPER_VALID)) + continue; __xa_erase(&hierarchy->shapers, index); kfree(cur); } @@ -465,10 +477,21 @@ static int net_shaper_parse_handle(const struct nlattr *attr, * shaper (any other value). */ id_attr = tb[NET_SHAPER_A_HANDLE_ID]; - if (id_attr) + if (id_attr) { id = nla_get_u32(id_attr); - else if (handle->scope == NET_SHAPER_SCOPE_NODE) + } else if (handle->scope == NET_SHAPER_SCOPE_NODE) { id = NET_SHAPER_ID_UNSPEC; + } else if (handle->scope == NET_SHAPER_SCOPE_QUEUE) { + NL_SET_ERR_ATTR_MISS(info->extack, attr, + NET_SHAPER_A_HANDLE_ID); + return -EINVAL; + } + + if (id && handle->scope == NET_SHAPER_SCOPE_NETDEV) { + NL_SET_ERR_MSG_ATTR(info->extack, id_attr, + "Netdev scope is a singleton, must use ID 0"); + return -EINVAL; + } handle->id = id; return 0; @@ -836,7 +859,12 @@ int net_shaper_nl_get_dumpit(struct sk_buff *skb, goto out_unlock; for (; (shaper = xa_find(&hierarchy->shapers, &ctx->start_index, - U32_MAX, XA_PRESENT)); ctx->start_index++) { + U32_MAX, NET_SHAPER_VALID)); + ctx->start_index++) { + /* Pairs with smp_wmb() in net_shaper_commit(): the entry + * is marked VALID, so its contents must be visible too. + */ + smp_rmb(); ret = net_shaper_fill_one(skb, binding, shaper, info); if (ret) break; @@ -932,6 +960,46 @@ static int net_shaper_handle_cmp(const struct net_shaper_handle *a, return memcmp(a, b, sizeof(*a)); } +static int net_shaper_parse_leaves(struct net_shaper_binding *binding, + struct genl_info *info, + const struct net_shaper *node, + struct net_shaper *leaves, + int leaves_count) +{ + struct nlattr *attr; + int i, j, ret, rem; + + i = 0; + nla_for_each_attr_type(attr, NET_SHAPER_A_LEAVES, + genlmsg_data(info->genlhdr), + genlmsg_len(info->genlhdr), rem) { + if (WARN_ON_ONCE(i >= leaves_count)) + return -EINVAL; + + ret = net_shaper_parse_leaf(binding, attr, info, + node, &leaves[i]); + if (ret) + return ret; + + /* Reject duplicates */ + for (j = 0; j < i; j++) { + if (net_shaper_handle_cmp(&leaves[i].handle, + &leaves[j].handle)) + continue; + + NL_SET_ERR_MSG_ATTR_FMT(info->extack, attr, + "Duplicate leaf shaper %d:%d", + leaves[i].handle.scope, + leaves[i].handle.id); + return -EINVAL; + } + + i++; + } + + return 0; +} + static int net_shaper_parent_from_leaves(int leaves_count, const struct net_shaper *leaves, struct net_shaper *node, @@ -964,15 +1032,22 @@ static int __net_shaper_group(struct net_shaper_binding *binding, int i, ret; if (node->handle.scope == NET_SHAPER_SCOPE_NODE) { + struct net_shaper *cur = NULL; + new_node = node->handle.id == NET_SHAPER_ID_UNSPEC; - if (!new_node && !net_shaper_lookup(binding, &node->handle)) { - /* The related attribute is not available when - * reaching here from the delete() op. - */ - NL_SET_ERR_MSG_FMT(extack, "Node shaper %d:%d does not exists", - node->handle.scope, node->handle.id); - return -ENOENT; + if (!new_node) { + cur = net_shaper_lookup(binding, &node->handle); + if (!cur) { + /* The related attribute is not available + * when reaching here from the delete() op. + */ + NL_SET_ERR_MSG_FMT(extack, + "Node shaper %d:%d does not exist", + node->handle.scope, + node->handle.id); + return -ENOENT; + } } /* When unspecified, the node parent scope is inherited from @@ -986,6 +1061,15 @@ static int __net_shaper_group(struct net_shaper_binding *binding, return ret; } + if (cur && net_shaper_handle_cmp(&cur->parent, + &node->parent)) { + NL_SET_ERR_MSG_FMT(extack, + "Cannot reparent node shaper %d:%d", + node->handle.scope, + node->handle.id); + return -EOPNOTSUPP; + } + } else { net_shaper_default_parent(&node->handle, &node->parent); } @@ -1162,7 +1246,7 @@ static int net_shaper_group_send_reply(struct net_shaper_binding *binding, free_msg: /* Should never happen as msg is pre-allocated with enough space. */ WARN_ONCE(true, "calculated message payload length (%d)", - net_shaper_handle_size()); + net_shaper_group_reply_size()); nlmsg_free(msg); return -EMSGSIZE; } @@ -1172,10 +1256,9 @@ int net_shaper_nl_group_doit(struct sk_buff *skb, struct genl_info *info) struct net_shaper **old_nodes, *leaves, node = {}; struct net_shaper_hierarchy *hierarchy; struct net_shaper_binding *binding; - int i, ret, rem, leaves_count; + int i, ret, leaves_count; int old_nodes_count = 0; struct sk_buff *msg; - struct nlattr *attr; if (GENL_REQ_ATTR_CHECK(info, NET_SHAPER_A_LEAVES)) return -EINVAL; @@ -1203,26 +1286,19 @@ int net_shaper_nl_group_doit(struct sk_buff *skb, struct genl_info *info) if (ret) goto free_leaves; - i = 0; - nla_for_each_attr_type(attr, NET_SHAPER_A_LEAVES, - genlmsg_data(info->genlhdr), - genlmsg_len(info->genlhdr), rem) { - if (WARN_ON_ONCE(i >= leaves_count)) - goto free_leaves; - - ret = net_shaper_parse_leaf(binding, attr, info, - &node, &leaves[i]); - if (ret) - goto free_leaves; - i++; - } + ret = net_shaper_parse_leaves(binding, info, &node, + leaves, leaves_count); + if (ret) + goto free_leaves; /* Prepare the msg reply in advance, to avoid device operation * rollback on allocation failure. */ - msg = genlmsg_new(net_shaper_handle_size(), GFP_KERNEL); - if (!msg) + msg = genlmsg_new(net_shaper_group_reply_size(), GFP_KERNEL); + if (!msg) { + ret = -ENOMEM; goto free_leaves; + } hierarchy = net_shaper_hierarchy_setup(binding); if (!hierarchy) { diff --git a/net/shaper/shaper_nl_gen.c b/net/shaper/shaper_nl_gen.c index 9b29be3ef19a85..76eff85ec66dfc 100644 --- a/net/shaper/shaper_nl_gen.c +++ b/net/shaper/shaper_nl_gen.c @@ -11,10 +11,15 @@ #include +/* Integer value ranges */ +static const struct netlink_range_validation net_shaper_a_handle_id_range = { + .max = NET_SHAPER_MAX_HANDLE_ID, +}; + /* Common nested types */ const struct nla_policy net_shaper_handle_nl_policy[NET_SHAPER_A_HANDLE_ID + 1] = { [NET_SHAPER_A_HANDLE_SCOPE] = NLA_POLICY_MAX(NLA_U32, 3), - [NET_SHAPER_A_HANDLE_ID] = { .type = NLA_U32, }, + [NET_SHAPER_A_HANDLE_ID] = NLA_POLICY_FULL_RANGE(NLA_U32, &net_shaper_a_handle_id_range), }; const struct nla_policy net_shaper_leaf_info_nl_policy[NET_SHAPER_A_WEIGHT + 1] = { diff --git a/net/shaper/shaper_nl_gen.h b/net/shaper/shaper_nl_gen.h index 42c46c52c77513..2406652a9014a6 100644 --- a/net/shaper/shaper_nl_gen.h +++ b/net/shaper/shaper_nl_gen.h @@ -12,6 +12,8 @@ #include +#define NET_SHAPER_MAX_HANDLE_ID 67108862 + /* Common nested types */ extern const struct nla_policy net_shaper_handle_nl_policy[NET_SHAPER_A_HANDLE_ID + 1]; extern const struct nla_policy net_shaper_leaf_info_nl_policy[NET_SHAPER_A_WEIGHT + 1]; diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 1a565095376aab..dffbd529762d6b 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -1400,7 +1400,8 @@ smc_v2_determine_accepted_chid(struct smc_clc_msg_accept_confirm *aclc, int i; for (i = 0; i < ini->ism_offered_cnt + 1; i++) { - if (ini->ism_chid[i] == ntohs(aclc->d1.chid)) { + if (ini->ism_dev[i] && + ini->ism_chid[i] == ntohs(aclc->d1.chid)) { ini->ism_selected = i; return 0; } @@ -1628,12 +1629,8 @@ static void smc_connect_work(struct work_struct *work) lock_sock(&smc->sk); if (rc != 0 || smc->sk.sk_err) { smc->sk.sk_state = SMC_CLOSED; - if (rc == -EPIPE || rc == -EAGAIN) - smc->sk.sk_err = EPIPE; - else if (rc == -ECONNREFUSED) - smc->sk.sk_err = ECONNREFUSED; - else if (signal_pending(current)) - smc->sk.sk_err = -sock_intr_errno(timeo); + if (!smc->sk.sk_err) + smc->sk.sk_err = (rc == -EAGAIN) ? EPIPE : -rc; sock_put(&smc->sk); /* passive closing */ goto out; } @@ -3058,18 +3055,17 @@ static int __smc_setsockopt(struct socket *sock, int level, int optname, smc = smc_sk(sk); + /* pre-fetch user data outside the lock */ + if (optname == SMC_LIMIT_HS) { + if (optlen < sizeof(int)) + return -EINVAL; + if (copy_from_sockptr(&val, optval, sizeof(int))) + return -EFAULT; + } + lock_sock(sk); switch (optname) { case SMC_LIMIT_HS: - if (optlen < sizeof(int)) { - rc = -EINVAL; - break; - } - if (copy_from_sockptr(&val, optval, sizeof(int))) { - rc = -EFAULT; - break; - } - smc->limit_smc_hs = !!val; rc = 0; break; diff --git a/net/smc/smc_tracepoint.h b/net/smc/smc_tracepoint.h index a9a6e3c1113aaa..53da84f57fd6f3 100644 --- a/net/smc/smc_tracepoint.h +++ b/net/smc/smc_tracepoint.h @@ -51,7 +51,7 @@ DECLARE_EVENT_CLASS(smc_msg_event, __field(const void *, smc) __field(u64, net_cookie) __field(size_t, len) - __string(name, smc->conn.lnk->ibname) + __string(name, smc->conn.lnk ? smc->conn.lnk->ibname : "") ), TP_fast_assign( diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c index 7081c1214e6c32..b5474ce534fb9d 100644 --- a/net/sunrpc/cache.c +++ b/net/sunrpc/cache.c @@ -403,7 +403,7 @@ void sunrpc_init_cache_detail(struct cache_detail *cd) INIT_LIST_HEAD(&cd->readers); spin_lock_init(&cd->queue_lock); init_waitqueue_head(&cd->queue_wait); - cd->next_seqno = 0; + cd->next_seqno = 1; spin_lock(&cache_list_lock); cd->nextcheck = 0; cd->entries = 0; diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 798243eabb1f87..3bfdaf5e64f501 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -789,23 +789,33 @@ static int tls_push_record(struct sock *sk, int flags, i = msg_pl->sg.end; sk_msg_iter_var_prev(i); + /* msg_pl->sg.data is a ring; data[MAX+1] is reserved for the wrap + * link (frags won't use it). 'i' is now the last filled entry: + * + * i end start + * v v v [ rsv ] + * [ d ][ d ][ ][ ]...[ ][ d ][ d ][ d ][chain] + * ^ END v + * `-----------------------------------------' + * + * Note that SGL does not allow chain-after-chain, so for TLS 1.3, + * we must make sure we don't create the wrap entry and then chain + * link to content_type immediately at index 0. + */ + if (i < msg_pl->sg.start) + sg_chain(msg_pl->sg.data, ARRAY_SIZE(msg_pl->sg.data), + msg_pl->sg.data); + rec->content_type = record_type; if (prot->version == TLS_1_3_VERSION) { /* Add content type to end of message. No padding added */ sg_set_buf(&rec->sg_content_type, &rec->content_type, 1); sg_mark_end(&rec->sg_content_type); - sg_chain(msg_pl->sg.data, msg_pl->sg.end + 1, - &rec->sg_content_type); + sg_chain(msg_pl->sg.data, i + 2, &rec->sg_content_type); } else { sg_mark_end(sk_msg_elem(msg_pl, i)); } - if (msg_pl->sg.end < msg_pl->sg.start) { - sg_chain(&msg_pl->sg.data[msg_pl->sg.start], - MAX_SKB_FRAGS - msg_pl->sg.start + 1, - msg_pl->sg.data); - } - i = msg_pl->sg.start; sg_chain(rec->sg_aead_in, 2, &msg_pl->sg.data[i]); @@ -2317,9 +2327,9 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos, if (copied < 0) goto splice_requeue; - if (chunk < rxm->full_len) { - rxm->offset += len; - rxm->full_len -= len; + if (copied < rxm->full_len) { + rxm->offset += copied; + rxm->full_len -= copied; goto splice_requeue; } diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index e2d787ca3e743d..1cbf36ea043bd5 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -3323,6 +3323,9 @@ static int unix_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg) struct sk_buff *skb; int answ = 0; + if (sk->sk_type != SOCK_STREAM) + return -EOPNOTSUPP; + mutex_lock(&u->iolock); skb = skb_peek(&sk->sk_receive_queue); diff --git a/net/unix/garbage.c b/net/unix/garbage.c index a7967a34582737..0783555e252660 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -607,6 +607,8 @@ static void unix_gc(struct work_struct *work) struct sk_buff_head hitlist; struct sk_buff *skb; + WRITE_ONCE(gc_in_progress, true); + spin_lock(&unix_gc_lock); if (unix_graph_state == UNIX_GRAPH_NOT_CYCLIC) { @@ -649,10 +651,8 @@ void unix_schedule_gc(struct user_struct *user) READ_ONCE(user->unix_inflight) < UNIX_INFLIGHT_SANE_USER) return; - if (!READ_ONCE(gc_in_progress)) { - WRITE_ONCE(gc_in_progress, true); + if (!READ_ONCE(gc_in_progress)) queue_work(system_dfl_wq, &unix_gc_work); - } if (user && READ_ONCE(unix_graph_cyclic_sccs)) flush_work(&unix_gc_work); diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 416d533f493d7b..989cc252d3d34c 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -136,27 +136,6 @@ static void virtio_transport_init_hdr(struct sk_buff *skb, hdr->fwd_cnt = cpu_to_le32(0); } -static void virtio_transport_copy_nonlinear_skb(const struct sk_buff *skb, - void *dst, - size_t len) -{ - struct iov_iter iov_iter = { 0 }; - struct kvec kvec; - size_t to_copy; - - kvec.iov_base = dst; - kvec.iov_len = len; - - iov_iter.iter_type = ITER_KVEC; - iov_iter.kvec = &kvec; - iov_iter.nr_segs = 1; - - to_copy = min_t(size_t, len, skb->len); - - skb_copy_datagram_iter(skb, VIRTIO_VSOCK_SKB_CB(skb)->offset, - &iov_iter, to_copy); -} - /* Packet capture */ static struct sk_buff *virtio_transport_build_skb(void *opaque) { @@ -166,12 +145,12 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque) struct sk_buff *skb; size_t payload_len; - /* A packet could be split to fit the RX buffer, so we can retrieve - * the payload length from the header and the buffer pointer taking - * care of the offset in the original packet. + /* A packet could be split to fit the RX buffer, so we use + * the payload length from the header, which has been updated + * by the sender to reflect the fragment size. */ pkt_hdr = virtio_vsock_hdr(pkt); - payload_len = pkt->len; + payload_len = le32_to_cpu(pkt_hdr->len); skb = alloc_skb(sizeof(*hdr) + sizeof(*pkt_hdr) + payload_len, GFP_ATOMIC); @@ -214,12 +193,18 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque) skb_put_data(skb, pkt_hdr, sizeof(*pkt_hdr)); if (payload_len) { - if (skb_is_nonlinear(pkt)) { - void *data = skb_put(skb, payload_len); - - virtio_transport_copy_nonlinear_skb(pkt, data, payload_len); - } else { - skb_put_data(skb, pkt->data, payload_len); + struct iov_iter iov_iter; + struct kvec kvec; + void *data = skb_put(skb, payload_len); + + kvec.iov_base = data; + kvec.iov_len = payload_len; + iov_iter_kvec(&iov_iter, ITER_DEST, &kvec, 1, payload_len); + + if (skb_copy_datagram_iter(pkt, VIRTIO_VSOCK_SKB_CB(pkt)->offset, + &iov_iter, payload_len)) { + kfree_skb(skb); + return NULL; } } @@ -447,7 +432,9 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk, static bool virtio_transport_inc_rx_pkt(struct virtio_vsock_sock *vvs, u32 len) { - if (vvs->buf_used + len > vvs->buf_alloc) + u64 skb_overhead = (skb_queue_len(&vvs->rx_queue) + 1) * SKB_TRUESIZE(0); + + if (skb_overhead + vvs->buf_used + len > vvs->buf_alloc) return false; vvs->rx_bytes += len; diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index f334cdef895870..7db9cd4338011a 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -1276,6 +1276,18 @@ static int nl80211_prepare_wdev_dump(struct netlink_callback *cb, rtnl_unlock(); return -ENODEV; } + + /* + * The first invocation validated the wdev's netns against + * the caller via __cfg80211_wdev_from_attrs(). The wiphy + * may have moved netns between dumpit invocations (via + * NL80211_CMD_SET_WIPHY_NETNS), so re-check here. + */ + if (!net_eq(wiphy_net(wiphy), sock_net(cb->skb->sk))) { + rtnl_unlock(); + return -ENODEV; + } + *rdev = wiphy_to_rdev(wiphy); *wdev = NULL; @@ -13867,6 +13879,19 @@ static int nl80211_wiphy_netns(struct sk_buff *skb, struct genl_info *info) if (IS_ERR(net)) return PTR_ERR(net); + /* + * The caller already has CAP_NET_ADMIN over the source netns + * (enforced by GENL_UNS_ADMIN_PERM on the genl op). Mirror the + * convention used by net/core/rtnetlink.c::rtnl_get_net_ns_capable() + * and require CAP_NET_ADMIN over the target netns as well, so that + * a caller that is privileged in their own user namespace cannot + * push a wiphy into a netns where they have no privilege. + */ + if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) { + put_net(net); + return -EPERM; + } + err = 0; /* check if anything to do */ @@ -19828,6 +19853,7 @@ static const struct genl_small_ops nl80211_small_ops[] = { .cmd = NL80211_CMD_SET_PMK, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl80211_set_pmk, + .flags = GENL_UNS_ADMIN_PERM, .internal_flags = IFLAGS(NL80211_FLAG_NEED_NETDEV_UP | NL80211_FLAG_CLEAR_SKB), }, @@ -19835,6 +19861,7 @@ static const struct genl_small_ops nl80211_small_ops[] = { .cmd = NL80211_CMD_DEL_PMK, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nl80211_del_pmk, + .flags = GENL_UNS_ADMIN_PERM, .internal_flags = IFLAGS(NL80211_FLAG_NEED_NETDEV_UP), }, { diff --git a/net/wireless/pmsr.c b/net/wireless/pmsr.c index 4c8ea0583f9403..d6cd0de64d1f84 100644 --- a/net/wireless/pmsr.c +++ b/net/wireless/pmsr.c @@ -88,7 +88,7 @@ static int pmsr_parse_ftm(struct cfg80211_registered_device *rdev, out->ftm.ftms_per_burst = 0; if (tb[NL80211_PMSR_FTM_REQ_ATTR_FTMS_PER_BURST]) out->ftm.ftms_per_burst = - nla_get_u32(tb[NL80211_PMSR_FTM_REQ_ATTR_FTMS_PER_BURST]); + nla_get_u8(tb[NL80211_PMSR_FTM_REQ_ATTR_FTMS_PER_BURST]); if (capa->ftm.max_ftms_per_burst && (out->ftm.ftms_per_burst > capa->ftm.max_ftms_per_burst || diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 887abed2546680..5e5786cd9af55a 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -646,9 +646,42 @@ static u64 xsk_skb_destructor_get_addr(struct sk_buff *skb) return (u64)((uintptr_t)skb_shinfo(skb)->destructor_arg & ~0x1UL); } -static void xsk_skb_destructor_set_addr(struct sk_buff *skb, u64 addr) +static struct xsk_addrs *__xsk_addrs_alloc(struct sk_buff *skb, u64 addr) { - skb_shinfo(skb)->destructor_arg = (void *)((uintptr_t)addr | 0x1UL); + struct xsk_addrs *xsk_addr; + + xsk_addr = kmem_cache_zalloc(xsk_tx_generic_cache, GFP_KERNEL); + if (unlikely(!xsk_addr)) + return NULL; + + xsk_addr->addrs[0] = addr; + skb_shinfo(skb)->destructor_arg = (void *)xsk_addr; + return xsk_addr; +} + +static struct xsk_addrs *xsk_addrs_alloc(struct sk_buff *skb) +{ + struct xsk_addrs *xsk_addr; + + if (!xsk_skb_destructor_is_addr(skb)) + return (struct xsk_addrs *)skb_shinfo(skb)->destructor_arg; + + xsk_addr = __xsk_addrs_alloc(skb, xsk_skb_destructor_get_addr(skb)); + if (likely(xsk_addr)) + xsk_addr->num_descs = 1; + return xsk_addr; +} + +static int xsk_skb_destructor_set_addr(struct sk_buff *skb, u64 addr) +{ + if (IS_ENABLED(CONFIG_64BIT)) { + skb_shinfo(skb)->destructor_arg = (void *)((uintptr_t)addr | 0x1UL); + return 0; + } + + if (unlikely(!__xsk_addrs_alloc(skb, addr))) + return -ENOMEM; + return 0; } static void xsk_inc_num_desc(struct sk_buff *skb) @@ -685,7 +718,7 @@ static void xsk_cq_submit_addr_locked(struct xsk_buff_pool *pool, spin_lock_irqsave(&pool->cq_prod_lock, flags); idx = xskq_get_prod(pool->cq); - if (unlikely(num_descs > 1)) { + if (unlikely(!xsk_skb_destructor_is_addr(skb))) { xsk_addr = (struct xsk_addrs *)skb_shinfo(skb)->destructor_arg; for (i = 0; i < num_descs; i++) { @@ -724,14 +757,20 @@ void xsk_destruct_skb(struct sk_buff *skb) sock_wfree(skb); } -static void xsk_skb_init_misc(struct sk_buff *skb, struct xdp_sock *xs, - u64 addr) +static int xsk_skb_init_misc(struct sk_buff *skb, struct xdp_sock *xs, + u64 addr) { + int err; + + err = xsk_skb_destructor_set_addr(skb, addr); + if (unlikely(err)) + return err; + skb->dev = xs->dev; skb->priority = READ_ONCE(xs->sk.sk_priority); skb->mark = READ_ONCE(xs->sk.sk_mark); skb->destructor = xsk_destruct_skb; - xsk_skb_destructor_set_addr(skb, addr); + return 0; } static void xsk_consume_skb(struct sk_buff *skb) @@ -740,7 +779,7 @@ static void xsk_consume_skb(struct sk_buff *skb) u32 num_descs = xsk_get_num_desc(skb); struct xsk_addrs *xsk_addr; - if (unlikely(num_descs > 1)) { + if (unlikely(!xsk_skb_destructor_is_addr(skb))) { xsk_addr = (struct xsk_addrs *)skb_shinfo(skb)->destructor_arg; kmem_cache_free(xsk_tx_generic_cache, xsk_addr); } @@ -819,28 +858,19 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs, return ERR_PTR(err); skb_reserve(skb, hr); - - xsk_skb_init_misc(skb, xs, desc->addr); if (desc->options & XDP_TX_METADATA) { err = xsk_skb_metadata(skb, buffer, desc, pool, hr); - if (unlikely(err)) + if (unlikely(err)) { + kfree_skb(skb); return ERR_PTR(err); + } } } else { struct xsk_addrs *xsk_addr; - if (xsk_skb_destructor_is_addr(skb)) { - xsk_addr = kmem_cache_zalloc(xsk_tx_generic_cache, - GFP_KERNEL); - if (!xsk_addr) - return ERR_PTR(-ENOMEM); - - xsk_addr->num_descs = 1; - xsk_addr->addrs[0] = xsk_skb_destructor_get_addr(skb); - skb_shinfo(skb)->destructor_arg = (void *)xsk_addr; - } else { - xsk_addr = (struct xsk_addrs *)skb_shinfo(skb)->destructor_arg; - } + xsk_addr = xsk_addrs_alloc(skb); + if (!xsk_addr) + return ERR_PTR(-ENOMEM); /* in case of -EOVERFLOW that could happen below, * xsk_consume_skb() will release this node as whole skb @@ -856,8 +886,11 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs, addr = buffer - pool->addrs; for (copied = 0, i = skb_shinfo(skb)->nr_frags; copied < len; i++) { - if (unlikely(i >= MAX_SKB_FRAGS)) + if (unlikely(i >= MAX_SKB_FRAGS)) { + if (!xs->skb) + kfree_skb(skb); return ERR_PTR(-EOVERFLOW); + } page = pool->umem->pgs[addr >> PAGE_SHIFT]; get_page(page); @@ -914,7 +947,6 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, if (unlikely(err)) goto free_err; - xsk_skb_init_misc(skb, xs, desc->addr); if (desc->options & XDP_TX_METADATA) { err = xsk_skb_metadata(skb, buffer, desc, xs->pool, hr); @@ -927,19 +959,10 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, struct page *page; u8 *vaddr; - if (xsk_skb_destructor_is_addr(skb)) { - xsk_addr = kmem_cache_zalloc(xsk_tx_generic_cache, - GFP_KERNEL); - if (!xsk_addr) { - err = -ENOMEM; - goto free_err; - } - - xsk_addr->num_descs = 1; - xsk_addr->addrs[0] = xsk_skb_destructor_get_addr(skb); - skb_shinfo(skb)->destructor_arg = (void *)xsk_addr; - } else { - xsk_addr = (struct xsk_addrs *)skb_shinfo(skb)->destructor_arg; + xsk_addr = xsk_addrs_alloc(skb); + if (!xsk_addr) { + err = -ENOMEM; + goto free_err; } if (unlikely(nr_frags == (MAX_SKB_FRAGS - 1) && xp_mb_desc(desc))) { @@ -964,18 +987,28 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, } } + if (!xs->skb) { + err = xsk_skb_init_misc(skb, xs, desc->addr); + if (unlikely(err)) + goto free_err; + } xsk_inc_num_desc(skb); return skb; free_err: - if (skb && !skb_shinfo(skb)->nr_frags) + if (skb && !xs->skb) kfree_skb(skb); if (err == -EOVERFLOW) { - /* Drop the packet */ - xsk_inc_num_desc(xs->skb); - xsk_drop_skb(xs->skb); + if (xs->skb) { + /* Drop the packet */ + xsk_inc_num_desc(xs->skb); + xsk_drop_skb(xs->skb); + } else { + xsk_cq_cancel_locked(xs->pool, 1); + xs->tx->invalid_descs++; + } xskq_cons_release(xs->tx); } else { /* Let application retry */ diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index cd7bc50872f6b5..d981cfdd853578 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -175,6 +175,9 @@ int xp_assign_dev(struct xsk_buff_pool *pool, if (force_zc && force_copy) return -EINVAL; + if (pool->tx_sw_csum && (netdev->priv_flags & IFF_TX_SKB_NO_LINEAR)) + return -EOPNOTSUPP; + if (xsk_get_pool_from_qid(netdev, queue_id)) return -EBUSY; diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c index afa457506274c1..3bff346308d0f1 100644 --- a/net/xdp/xskmap.c +++ b/net/xdp/xskmap.c @@ -184,6 +184,10 @@ static long xsk_map_update_elem(struct bpf_map *map, void *key, void *value, } xs = (struct xdp_sock *)sock->sk; + if (!READ_ONCE(xs->rx)) { + sockfd_put(sock); + return -ENOBUFS; + } map_entry = &m->xsk_map[i]; node = xsk_map_node_alloc(m, map_entry); diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c index a9652b422f5121..cc35c2fcbbe095 100644 --- a/net/xfrm/xfrm_output.c +++ b/net/xfrm/xfrm_output.c @@ -66,7 +66,9 @@ static int xfrm4_transport_output(struct xfrm_state *x, struct sk_buff *skb) struct iphdr *iph = ip_hdr(skb); int ihl = iph->ihl * 4; - skb_set_inner_transport_header(skb, skb_transport_offset(skb)); + if (!skb->inner_protocol) + skb_set_inner_transport_header(skb, + skb_transport_offset(skb)); skb_set_network_header(skb, -x->props.header_len); skb->mac_header = skb->network_header + @@ -167,7 +169,9 @@ static int xfrm6_transport_output(struct xfrm_state *x, struct sk_buff *skb) int hdr_len; iph = ipv6_hdr(skb); - skb_set_inner_transport_header(skb, skb_transport_offset(skb)); + if (!skb->inner_protocol) + skb_set_inner_transport_header(skb, + skb_transport_offset(skb)); hdr_len = xfrm6_hdr_offset(x, skb, &prevhdr); if (hdr_len < 0) @@ -276,8 +280,10 @@ static int xfrm4_tunnel_encap_add(struct xfrm_state *x, struct sk_buff *skb) struct iphdr *top_iph; int flags; - skb_set_inner_network_header(skb, skb_network_offset(skb)); - skb_set_inner_transport_header(skb, skb_transport_offset(skb)); + if (!skb->inner_protocol) { + skb_set_inner_network_header(skb, skb_network_offset(skb)); + skb_set_inner_transport_header(skb, skb_transport_offset(skb)); + } skb_set_network_header(skb, -x->props.header_len); skb->mac_header = skb->network_header + @@ -321,8 +327,10 @@ static int xfrm6_tunnel_encap_add(struct xfrm_state *x, struct sk_buff *skb) struct ipv6hdr *top_iph; int dsfield; - skb_set_inner_network_header(skb, skb_network_offset(skb)); - skb_set_inner_transport_header(skb, skb_transport_offset(skb)); + if (!skb->inner_protocol) { + skb_set_inner_network_header(skb, skb_network_offset(skb)); + skb_set_inner_transport_header(skb, skb_transport_offset(skb)); + } skb_set_network_header(skb, -x->props.header_len); skb->mac_header = skb->network_header + diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 1748d374abcab3..686014d394298c 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -818,17 +818,17 @@ int __xfrm_state_delete(struct xfrm_state *x) spin_lock(&net->xfrm.xfrm_state_lock); list_del(&x->km.all); - hlist_del_rcu(&x->bydst); - hlist_del_rcu(&x->bysrc); - if (x->km.seq) - hlist_del_rcu(&x->byseq); + hlist_del_init_rcu(&x->bydst); + hlist_del_init_rcu(&x->bysrc); + if (!hlist_unhashed(&x->byseq)) + hlist_del_init_rcu(&x->byseq); if (!hlist_unhashed(&x->state_cache)) hlist_del_rcu(&x->state_cache); if (!hlist_unhashed(&x->state_cache_input)) hlist_del_rcu(&x->state_cache_input); - if (x->id.spi) - hlist_del_rcu(&x->byspi); + if (!hlist_unhashed(&x->byspi)) + hlist_del_init_rcu(&x->byspi); net->xfrm.state_num--; xfrm_nat_keepalive_state_updated(x); spin_unlock(&net->xfrm.xfrm_state_lock); diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index d56450f6166912..38a90e5ee3d935 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -3323,6 +3323,7 @@ const int xfrm_msg_min[XFRM_NR_MSGTYPES] = { [XFRM_MSG_GETSADINFO - XFRM_MSG_BASE] = sizeof(u32), [XFRM_MSG_NEWSPDINFO - XFRM_MSG_BASE] = sizeof(u32), [XFRM_MSG_GETSPDINFO - XFRM_MSG_BASE] = sizeof(u32), + [XFRM_MSG_MAPPING - XFRM_MSG_BASE] = XMSGSIZE(xfrm_user_mapping), [XFRM_MSG_SETDEFAULT - XFRM_MSG_BASE] = XMSGSIZE(xfrm_userpolicy_default), [XFRM_MSG_GETDEFAULT - XFRM_MSG_BASE] = XMSGSIZE(xfrm_userpolicy_default), }; diff --git a/rust/Makefile b/rust/Makefile index b361bfedfdf07a..b9e9f512cec314 100644 --- a/rust/Makefile +++ b/rust/Makefile @@ -403,6 +403,8 @@ BINDGEN_TARGET_x86 := x86_64-linux-gnu BINDGEN_TARGET_arm64 := aarch64-linux-gnu BINDGEN_TARGET_arm := arm-linux-gnueabi BINDGEN_TARGET_loongarch := loongarch64-linux-gnusf +# This is only for i386 UM builds, which need the 32-bit target not -m32 +BINDGEN_TARGET_i386 := i386-linux-gnu BINDGEN_TARGET_um := $(BINDGEN_TARGET_$(SUBARCH)) BINDGEN_TARGET := $(BINDGEN_TARGET_$(SRCARCH)) diff --git a/rust/kernel/drm/device.rs b/rust/kernel/drm/device.rs index adbafe8db54d11..403fc35353c740 100644 --- a/rust/kernel/drm/device.rs +++ b/rust/kernel/drm/device.rs @@ -119,13 +119,20 @@ impl Device { // compatible `Layout`. let layout = Kmalloc::aligned_layout(Layout::new::()); + // Use a temporary vtable without a `release` callback until `data` is initialized, so + // init failure can release the DRM device without dropping uninitialized fields. + let alloc_vtable = bindings::drm_driver { + release: None, + ..Self::VTABLE + }; + // SAFETY: - // - `VTABLE`, as a `const` is pinned to the read-only section of the compilation, + // - `alloc_vtable` reference remains valid until no longer used, // - `dev` is valid by its type invarants, let raw_drm: *mut Self = unsafe { bindings::__drm_dev_alloc( dev.as_raw(), - &Self::VTABLE, + &alloc_vtable, layout.size(), mem::offset_of!(Self, dev), ) @@ -133,6 +140,10 @@ impl Device { .cast(); let raw_drm = NonNull::new(from_err_ptr(raw_drm)?).ok_or(ENOMEM)?; + // SAFETY: `raw_drm` is a valid pointer to `Self`, given that `__drm_dev_alloc` was + // successful. + let drm_dev = unsafe { Self::into_drm_device(raw_drm) }; + // SAFETY: `raw_drm` is a valid pointer to `Self`. let raw_data = unsafe { ptr::addr_of_mut!((*raw_drm.as_ptr()).data) }; @@ -140,15 +151,14 @@ impl Device { // - `raw_data` is a valid pointer to uninitialized memory. // - `raw_data` will not move until it is dropped. unsafe { data.__pinned_init(raw_data) }.inspect_err(|_| { - // SAFETY: `raw_drm` is a valid pointer to `Self`, given that `__drm_dev_alloc` was - // successful. - let drm_dev = unsafe { Self::into_drm_device(raw_drm) }; - // SAFETY: `__drm_dev_alloc()` was successful, hence `drm_dev` must be valid and the // refcount must be non-zero. unsafe { bindings::drm_dev_put(drm_dev) }; })?; + // SAFETY: `drm_dev` is still private to this function. + unsafe { (*drm_dev).driver = const { &Self::VTABLE } }; + // SAFETY: The reference count is one, and now we take ownership of that reference as a // `drm::Device`. Ok(unsafe { ARef::from_raw(raw_drm) }) diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs index 75acda7ba50016..01b5bd47a3332d 100644 --- a/rust/kernel/drm/gem/mod.rs +++ b/rust/kernel/drm/gem/mod.rs @@ -277,8 +277,17 @@ impl Object { // SAFETY: `obj.as_raw()` is guaranteed to be valid by the initialization above. unsafe { (*obj.as_raw()).funcs = &Self::OBJECT_FUNCS }; - // SAFETY: The arguments are all valid per the type invariants. - to_result(unsafe { bindings::drm_gem_object_init(dev.as_raw(), obj.obj.get(), size) })?; + if let Err(err) = + // SAFETY: The arguments are all valid per the type invariants. + to_result(unsafe { + bindings::drm_gem_object_init(dev.as_raw(), obj.obj.get(), size) + }) + { + // SAFETY: `drm_gem_object_init()` initializes the private GEM object state before + // failing, so `drm_gem_private_object_fini()` is the matching cleanup. + unsafe { bindings::drm_gem_private_object_fini(obj.obj.get()) }; + return Err(err); + } // SAFETY: We will never move out of `Self` as `ARef` is always treated as pinned. let ptr = KBox::into_raw(unsafe { Pin::into_inner_unchecked(obj) }); diff --git a/rust/kernel/drm/gem/shmem.rs b/rust/kernel/drm/gem/shmem.rs index d025fb03519545..e1b648920d2f67 100644 --- a/rust/kernel/drm/gem/shmem.rs +++ b/rust/kernel/drm/gem/shmem.rs @@ -19,10 +19,8 @@ use crate::{ }, error::to_result, prelude::*, - types::{ - ARef, - Opaque, // - }, // + sync::aref::ARef, + types::Opaque, // }; use core::{ ops::{ diff --git a/rust/pin-init/internal/src/init.rs b/rust/pin-init/internal/src/init.rs index daa3f1c6466eff..487ee0013fafee 100644 --- a/rust/pin-init/internal/src/init.rs +++ b/rust/pin-init/internal/src/init.rs @@ -249,22 +249,6 @@ fn init_fields( }); // Again span for better diagnostics let write = quote_spanned!(ident.span()=> ::core::ptr::write); - // NOTE: the field accessor ensures that the initialized field is properly aligned. - // Unaligned fields will cause the compiler to emit E0793. We do not support - // unaligned fields since `Init::__init` requires an aligned pointer; the call to - // `ptr::write` below has the same requirement. - let accessor = if pinned { - let project_ident = format_ident!("__project_{ident}"); - quote! { - // SAFETY: TODO - unsafe { #data.#project_ident(&mut (*#slot).#ident) } - } - } else { - quote! { - // SAFETY: TODO - unsafe { &mut (*#slot).#ident } - } - }; quote! { #(#attrs)* { @@ -272,51 +256,31 @@ fn init_fields( // SAFETY: TODO unsafe { #write(&raw mut (*#slot).#ident, #value_ident) }; } - #(#cfgs)* - #[allow(unused_variables)] - let #ident = #accessor; } } InitializerKind::Init { ident, value, .. } => { // Again span for better diagnostics let init = format_ident!("init", span = value.span()); - // NOTE: the field accessor ensures that the initialized field is properly aligned. - // Unaligned fields will cause the compiler to emit E0793. We do not support - // unaligned fields since `Init::__init` requires an aligned pointer; the call to - // `ptr::write` below has the same requirement. - let (value_init, accessor) = if pinned { - let project_ident = format_ident!("__project_{ident}"); - ( - quote! { - // SAFETY: - // - `slot` is valid, because we are inside of an initializer closure, we - // return when an error/panic occurs. - // - We also use `#data` to require the correct trait (`Init` or `PinInit`) - // for `#ident`. - unsafe { #data.#ident(&raw mut (*#slot).#ident, #init)? }; - }, - quote! { - // SAFETY: TODO - unsafe { #data.#project_ident(&mut (*#slot).#ident) } - }, - ) + let value_init = if pinned { + quote! { + // SAFETY: + // - `slot` is valid, because we are inside of an initializer closure, we + // return when an error/panic occurs. + // - We also use `#data` to require the correct trait (`Init` or `PinInit`) + // for `#ident`. + unsafe { #data.#ident(&raw mut (*#slot).#ident, #init)? }; + } } else { - ( - quote! { - // SAFETY: `slot` is valid, because we are inside of an initializer - // closure, we return when an error/panic occurs. - unsafe { - ::pin_init::Init::__init( - #init, - &raw mut (*#slot).#ident, - )? - }; - }, - quote! { - // SAFETY: TODO - unsafe { &mut (*#slot).#ident } - }, - ) + quote! { + // SAFETY: `slot` is valid, because we are inside of an initializer + // closure, we return when an error/panic occurs. + unsafe { + ::pin_init::Init::__init( + #init, + &raw mut (*#slot).#ident, + )? + }; + } }; quote! { #(#attrs)* @@ -324,9 +288,6 @@ fn init_fields( let #init = #value; #value_init } - #(#cfgs)* - #[allow(unused_variables)] - let #ident = #accessor; } } InitializerKind::Code { block: value, .. } => quote! { @@ -339,18 +300,41 @@ fn init_fields( if let Some(ident) = kind.ident() { // `mixed_site` ensures that the guard is not accessible to the user-controlled code. let guard = format_ident!("__{ident}_guard", span = Span::mixed_site()); + + // NOTE: The reference is derived from the guard so that it only lives as long as the + // guard does and cannot escape the scope. If it's created via `&mut (*#slot).#ident` + // like the unaligned field guard, it will become effectively `'static`. + let accessor = if pinned { + let project_ident = format_ident!("__project_{ident}"); + quote! { + // SAFETY: the initialization is pinned. + unsafe { #data.#project_ident(#guard.let_binding()) } + } + } else { + quote! { + #guard.let_binding() + } + }; + res.extend(quote! { #(#cfgs)* - // Create the drop guard: + // Create the drop guard. // - // We rely on macro hygiene to make it impossible for users to access this local - // variable. - // SAFETY: We forget the guard later when initialization has succeeded. - let #guard = unsafe { + // SAFETY: + // - `&raw mut (*slot).#ident` is valid. + // - `make_field_check` checks that `&raw mut (*slot).#ident` is properly aligned. + // - `(*slot).#ident` has been initialized above. + // - We only need the ownership to the pointee back when initialization has + // succeeded, where we `forget` the guard. + let mut #guard = unsafe { ::pin_init::__internal::DropGuard::new( &raw mut (*slot).#ident ) }; + + #(#cfgs)* + #[allow(unused_variables)] + let #ident = #accessor; }); guards.push(guard); guard_attrs.push(cfgs); @@ -367,49 +351,49 @@ fn init_fields( } } -/// Generate the check for ensuring that every field has been initialized. +/// Generate the check for ensuring that every field has been initialized and aligned. fn make_field_check( fields: &Punctuated, init_kind: InitKind, path: &Path, ) -> TokenStream { - let field_attrs = fields + let field_attrs: Vec<_> = fields .iter() - .filter_map(|f| f.kind.ident().map(|_| &f.attrs)); - let field_name = fields.iter().filter_map(|f| f.kind.ident()); - match init_kind { - InitKind::Normal => quote! { - // We use unreachable code to ensure that all fields have been mentioned exactly once, - // this struct initializer will still be type-checked and complain with a very natural - // error message if a field is forgotten/mentioned more than once. - #[allow(unreachable_code, clippy::diverging_sub_expression)] - // SAFETY: this code is never executed. - let _ = || unsafe { - ::core::ptr::write(slot, #path { - #( - #(#field_attrs)* - #field_name: ::core::panic!(), - )* - }) - }; - }, - InitKind::Zeroing => quote! { - // We use unreachable code to ensure that all fields have been mentioned at most once. - // Since the user specified `..Zeroable::zeroed()` at the end, all missing fields will - // be zeroed. This struct initializer will still be type-checked and complain with a - // very natural error message if a field is mentioned more than once, or doesn't exist. - #[allow(unreachable_code, clippy::diverging_sub_expression, unused_assignments)] - // SAFETY: this code is never executed. - let _ = || unsafe { - ::core::ptr::write(slot, #path { - #( - #(#field_attrs)* - #field_name: ::core::panic!(), - )* - ..::core::mem::zeroed() - }) - }; - }, + .filter_map(|f| f.kind.ident().map(|_| &f.attrs)) + .collect(); + let field_name: Vec<_> = fields.iter().filter_map(|f| f.kind.ident()).collect(); + let zeroing_trailer = match init_kind { + InitKind::Normal => None, + InitKind::Zeroing => Some(quote! { + ..::core::mem::zeroed() + }), + }; + quote! { + #[allow(unreachable_code, clippy::diverging_sub_expression)] + // We use unreachable code to perform field checks. They're still checked by the compiler. + // SAFETY: this code is never executed. + let _ = || unsafe { + // Create references to ensure that the initialized field is properly aligned. + // Unaligned fields will cause the compiler to emit E0793. We do not support + // unaligned fields since `Init::__init` requires an aligned pointer; the call to + // `ptr::write` for value-initialization case has the same requirement. + #( + #(#field_attrs)* + let _ = &(*slot).#field_name; + )* + + // If the zeroing trailer is not present, this checks that all fields have been + // mentioned exactly once. If the zeroing trailer is present, all missing fields will be + // zeroed, so this checks that all fields have been mentioned at most once. The use of + // struct initializer will still generate very natural error messages for any misuse. + ::core::ptr::write(slot, #path { + #( + #(#field_attrs)* + #field_name: ::core::panic!(), + )* + #zeroing_trailer + }) + }; } } diff --git a/rust/pin-init/src/__internal.rs b/rust/pin-init/src/__internal.rs index 90adbdc1893bbf..5720a621aed74b 100644 --- a/rust/pin-init/src/__internal.rs +++ b/rust/pin-init/src/__internal.rs @@ -238,32 +238,42 @@ fn stack_init_reuse() { /// When a value of this type is dropped, it drops a `T`. /// /// Can be forgotten to prevent the drop. +/// +/// # Invariants +/// +/// - `ptr` is valid and properly aligned. +/// - `*ptr` is initialized and owned by this guard. pub struct DropGuard { ptr: *mut T, } impl DropGuard { - /// Creates a new [`DropGuard`]. It will [`ptr::drop_in_place`] `ptr` when it gets dropped. + /// Creates a drop guard and transfer the ownership of the pointer content. /// - /// # Safety + /// The ownership is only relinguished if the guard is forgotten via [`core::mem::forget`]. /// - /// `ptr` must be a valid pointer. + /// # Safety /// - /// It is the callers responsibility that `self` will only get dropped if the pointee of `ptr`: - /// - has not been dropped, - /// - is not accessible by any other means, - /// - will not be dropped by any other means. + /// - `ptr` is valid and properly aligned. + /// - `*ptr` is initialized, and the ownership is transferred to this guard. #[inline] pub unsafe fn new(ptr: *mut T) -> Self { + // INVARIANT: By safety requirement. Self { ptr } } + + /// Create a let binding for accessor use. + #[inline] + pub fn let_binding(&mut self) -> &mut T { + // SAFETY: Per type invariant. + unsafe { &mut *self.ptr } + } } impl Drop for DropGuard { #[inline] fn drop(&mut self) { - // SAFETY: A `DropGuard` can only be constructed using the unsafe `new` function - // ensuring that this operation is safe. + // SAFETY: `self.ptr` is valid, properly aligned and `*self.ptr` is owned by this guard. unsafe { ptr::drop_in_place(self.ptr) } } } diff --git a/scripts/gcc-plugins/gcc-common.h b/scripts/gcc-plugins/gcc-common.h index 8f1b3500f8e2dc..abb1964c44d4ee 100644 --- a/scripts/gcc-plugins/gcc-common.h +++ b/scripts/gcc-plugins/gcc-common.h @@ -309,7 +309,9 @@ typedef const gimple *const_gimple_ptr; #define gimple gimple_ptr #define const_gimple const_gimple_ptr #undef CONST_CAST_GIMPLE -#define CONST_CAST_GIMPLE(X) CONST_CAST(gimple, (X)) +#define CONST_CAST_GIMPLE(X) const_cast((X)) +#undef CONST_CAST_TREE +#define CONST_CAST_TREE(X) const_cast((X)) /* gimple related */ static inline gimple gimple_build_assign_with_ops(enum tree_code subcode, tree lhs, tree op1, tree op2 MEM_STAT_DECL) diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h index d1f16d7f684de3..0babb89921816a 100644 --- a/security/selinux/include/security.h +++ b/security/selinux/include/security.h @@ -312,8 +312,6 @@ int security_context_to_sid_default(const char *scontext, u32 scontext_len, int security_context_to_sid_force(const char *scontext, u32 scontext_len, u32 *sid); -int security_get_user_sids(u32 fromsid, const char *username, u32 **sids, u32 *nel); - int security_port_sid(u8 protocol, u16 port, u32 *out_sid); int security_ib_pkey_sid(u64 subnet_prefix, u16 pkey_num, u32 *out_sid); diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c index 83aa765a09f98d..25ca7d71401440 100644 --- a/security/selinux/selinuxfs.c +++ b/security/selinux/selinuxfs.c @@ -76,7 +76,6 @@ struct selinux_fs_info { int *bool_pending_values; struct dentry *class_dir; unsigned long last_class_ino; - bool policy_opened; unsigned long last_ino; struct super_block *sb; }; @@ -272,35 +271,13 @@ static ssize_t sel_write_disable(struct file *file, const char __user *buf, size_t count, loff_t *ppos) { - char *page; - ssize_t length; - int new_value; - - if (count >= PAGE_SIZE) - return -ENOMEM; - - /* No partial writes. */ - if (*ppos != 0) - return -EINVAL; - - page = memdup_user_nul(buf, count); - if (IS_ERR(page)) - return PTR_ERR(page); - - if (sscanf(page, "%d", &new_value) != 1) { - length = -EINVAL; - goto out; - } - length = count; - - if (new_value) { - pr_err("SELinux: https://github.com/SELinuxProject/selinux-kernel/wiki/DEPRECATE-runtime-disable\n"); - pr_err("SELinux: Runtime disable is not supported, use selinux=0 on the kernel cmdline.\n"); - } - -out: - kfree(page); - return length; + /* + * Setting disable is no longer supported, see + * https://github.com/SELinuxProject/selinux-kernel/wiki/DEPRECATE-runtime-disable + */ + pr_err_once("SELinux: %s (%d) wrote to disable. This is no longer supported.\n", + current->comm, current->pid); + return count; } static const struct file_operations sel_disable_ops = { @@ -362,44 +339,31 @@ struct policy_load_memory { static int sel_open_policy(struct inode *inode, struct file *filp) { - struct selinux_fs_info *fsi = inode->i_sb->s_fs_info; struct policy_load_memory *plm = NULL; int rc; - BUG_ON(filp->private_data); - - mutex_lock(&selinux_state.policy_mutex); - rc = avc_has_perm(current_sid(), SECINITSID_SECURITY, SECCLASS_SECURITY, SECURITY__READ_POLICY, NULL); if (rc) - goto err; - - rc = -EBUSY; - if (fsi->policy_opened) - goto err; + return rc; - rc = -ENOMEM; plm = kzalloc_obj(*plm); if (!plm) - goto err; + return -ENOMEM; + mutex_lock(&selinux_state.policy_mutex); rc = security_read_policy(&plm->data, &plm->len); if (rc) goto err; - if ((size_t)i_size_read(inode) != plm->len) { inode_lock(inode); i_size_write(inode, plm->len); inode_unlock(inode); } - - fsi->policy_opened = 1; + mutex_unlock(&selinux_state.policy_mutex); filp->private_data = plm; - mutex_unlock(&selinux_state.policy_mutex); - return 0; err: mutex_unlock(&selinux_state.policy_mutex); @@ -412,13 +376,8 @@ static int sel_open_policy(struct inode *inode, struct file *filp) static int sel_release_policy(struct inode *inode, struct file *filp) { - struct selinux_fs_info *fsi = inode->i_sb->s_fs_info; struct policy_load_memory *plm = filp->private_data; - BUG_ON(!plm); - - fsi->policy_opened = 0; - vfree(plm->data); kfree(plm); @@ -594,34 +553,31 @@ static ssize_t sel_write_load(struct file *file, const char __user *buf, if (!count) return -EINVAL; - mutex_lock(&selinux_state.policy_mutex); - length = avc_has_perm(current_sid(), SECINITSID_SECURITY, SECCLASS_SECURITY, SECURITY__LOAD_POLICY, NULL); if (length) - goto out; + return length; data = vmalloc(count); - if (!data) { - length = -ENOMEM; - goto out; - } + if (!data) + return -ENOMEM; if (copy_from_user(data, buf, count) != 0) { length = -EFAULT; goto out; } + mutex_lock(&selinux_state.policy_mutex); length = security_load_policy(data, count, &load_state); if (length) { pr_warn_ratelimited("SELinux: failed to load policy\n"); - goto out; + goto out_unlock; } fsi = file_inode(file)->i_sb->s_fs_info; length = sel_make_policy_nodes(fsi, load_state.policy); if (length) { pr_warn_ratelimited("SELinux: failed to initialize selinuxfs\n"); selinux_policy_cancel(&load_state); - goto out; + goto out_unlock; } selinux_policy_commit(&load_state); @@ -631,8 +587,9 @@ static ssize_t sel_write_load(struct file *file, const char __user *buf, from_kuid(&init_user_ns, audit_get_loginuid(current)), audit_get_sessionid(current)); -out: +out_unlock: mutex_unlock(&selinux_state.policy_mutex); +out: vfree(data); return length; } @@ -689,46 +646,13 @@ static ssize_t sel_read_checkreqprot(struct file *filp, char __user *buf, static ssize_t sel_write_checkreqprot(struct file *file, const char __user *buf, size_t count, loff_t *ppos) { - char *page; - ssize_t length; - unsigned int new_value; - - length = avc_has_perm(current_sid(), SECINITSID_SECURITY, - SECCLASS_SECURITY, SECURITY__SETCHECKREQPROT, - NULL); - if (length) - return length; - - if (count >= PAGE_SIZE) - return -ENOMEM; - - /* No partial writes. */ - if (*ppos != 0) - return -EINVAL; - - page = memdup_user_nul(buf, count); - if (IS_ERR(page)) - return PTR_ERR(page); - - if (sscanf(page, "%u", &new_value) != 1) { - length = -EINVAL; - goto out; - } - length = count; - - if (new_value) { - char comm[sizeof(current->comm)]; - - strscpy(comm, current->comm); - pr_err("SELinux: %s (%d) set checkreqprot to 1. This is no longer supported.\n", - comm, current->pid); - } - - selinux_ima_measure_state(); - -out: - kfree(page); - return length; + /* + * Setting checkreqprot is no longer supported, see + * https://github.com/SELinuxProject/selinux-kernel/wiki/DEPRECATE-checkreqprot + */ + pr_err_once("SELinux: %s (%d) wrote to checkreqprot. This is no longer supported.\n", + current->comm, current->pid); + return count; } static const struct file_operations sel_checkreqprot_ops = { .read = sel_read_checkreqprot, @@ -1073,69 +997,11 @@ static ssize_t sel_write_relabel(struct file *file, char *buf, size_t size) static ssize_t sel_write_user(struct file *file, char *buf, size_t size) { - char *con = NULL, *user = NULL, *ptr; - u32 sid, *sids = NULL; - ssize_t length; - char *newcon; - int rc; - u32 i, len, nsids; - - pr_warn_ratelimited("SELinux: %s (%d) wrote to /sys/fs/selinux/user!" - " This will not be supported in the future; please update your" - " userspace.\n", current->comm, current->pid); - ssleep(5); - - length = avc_has_perm(current_sid(), SECINITSID_SECURITY, - SECCLASS_SECURITY, SECURITY__COMPUTE_USER, - NULL); - if (length) - goto out; - - length = -ENOMEM; - con = kzalloc(size + 1, GFP_KERNEL); - if (!con) - goto out; - - length = -ENOMEM; - user = kzalloc(size + 1, GFP_KERNEL); - if (!user) - goto out; - - length = -EINVAL; - if (sscanf(buf, "%s %s", con, user) != 2) - goto out; - - length = security_context_str_to_sid(con, &sid, GFP_KERNEL); - if (length) - goto out; - - length = security_get_user_sids(sid, user, &sids, &nsids); - if (length) - goto out; - - length = sprintf(buf, "%u", nsids) + 1; - ptr = buf + length; - for (i = 0; i < nsids; i++) { - rc = security_sid_to_context(sids[i], &newcon, &len); - if (rc) { - length = rc; - goto out; - } - if ((length + len) >= SIMPLE_TRANSACTION_LIMIT) { - kfree(newcon); - length = -ERANGE; - goto out; - } - memcpy(ptr, newcon, len); - kfree(newcon); - ptr += len; - length += len; - } -out: - kfree(sids); - kfree(user); - kfree(con); - return length; + pr_err_once("SELinux: %s (%d) wrote to user. This is no longer supported.\n", + current->comm, current->pid); + buf[0] = '0'; + buf[1] = 0; + return 2; } static ssize_t sel_write_member(struct file *file, char *buf, size_t size) diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index e8e7ccbd1e4485..143021c5e326d8 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -2746,131 +2746,6 @@ int security_node_sid(u16 domain, return rc; } -#define SIDS_NEL 25 - -/** - * security_get_user_sids - Obtain reachable SIDs for a user. - * @fromsid: starting SID - * @username: username - * @sids: array of reachable SIDs for user - * @nel: number of elements in @sids - * - * Generate the set of SIDs for legal security contexts - * for a given user that can be reached by @fromsid. - * Set *@sids to point to a dynamically allocated - * array containing the set of SIDs. Set *@nel to the - * number of elements in the array. - */ - -int security_get_user_sids(u32 fromsid, - const char *username, - u32 **sids, - u32 *nel) -{ - struct selinux_policy *policy; - struct policydb *policydb; - struct sidtab *sidtab; - struct context *fromcon, usercon; - u32 *mysids = NULL, *mysids2, sid; - u32 i, j, mynel, maxnel = SIDS_NEL; - struct user_datum *user; - struct role_datum *role; - struct ebitmap_node *rnode, *tnode; - int rc; - - *sids = NULL; - *nel = 0; - - if (!selinux_initialized()) - return 0; - - mysids = kcalloc(maxnel, sizeof(*mysids), GFP_KERNEL); - if (!mysids) - return -ENOMEM; - -retry: - mynel = 0; - rcu_read_lock(); - policy = rcu_dereference(selinux_state.policy); - policydb = &policy->policydb; - sidtab = policy->sidtab; - - context_init(&usercon); - - rc = -EINVAL; - fromcon = sidtab_search(sidtab, fromsid); - if (!fromcon) - goto out_unlock; - - rc = -EINVAL; - user = symtab_search(&policydb->p_users, username); - if (!user) - goto out_unlock; - - usercon.user = user->value; - - ebitmap_for_each_positive_bit(&user->roles, rnode, i) { - role = policydb->role_val_to_struct[i]; - usercon.role = i + 1; - ebitmap_for_each_positive_bit(&role->types, tnode, j) { - usercon.type = j + 1; - - if (mls_setup_user_range(policydb, fromcon, user, - &usercon)) - continue; - - rc = sidtab_context_to_sid(sidtab, &usercon, &sid); - if (rc == -ESTALE) { - rcu_read_unlock(); - goto retry; - } - if (rc) - goto out_unlock; - if (mynel < maxnel) { - mysids[mynel++] = sid; - } else { - rc = -ENOMEM; - maxnel += SIDS_NEL; - mysids2 = kcalloc(maxnel, sizeof(*mysids2), GFP_ATOMIC); - if (!mysids2) - goto out_unlock; - memcpy(mysids2, mysids, mynel * sizeof(*mysids2)); - kfree(mysids); - mysids = mysids2; - mysids[mynel++] = sid; - } - } - } - rc = 0; -out_unlock: - rcu_read_unlock(); - if (rc || !mynel) { - kfree(mysids); - return rc; - } - - rc = -ENOMEM; - mysids2 = kcalloc(mynel, sizeof(*mysids2), GFP_KERNEL); - if (!mysids2) { - kfree(mysids); - return rc; - } - for (i = 0, j = 0; i < mynel; i++) { - struct av_decision dummy_avd; - rc = avc_has_perm_noaudit(fromsid, mysids[i], - SECCLASS_PROCESS, /* kernel value */ - PROCESS__TRANSITION, AVC_STRICT, - &dummy_avd); - if (!rc) - mysids2[j++] = mysids[i]; - cond_resched(); - } - kfree(mysids); - *sids = mysids2; - *nel = j; - return 0; -} - /** * __security_genfs_sid - Helper to obtain a SID for a file in a filesystem * @policy: policy diff --git a/sound/core/misc.c b/sound/core/misc.c index 5aca09edf9718a..833124c8e4fa83 100644 --- a/sound/core/misc.c +++ b/sound/core/misc.c @@ -148,9 +148,11 @@ EXPORT_SYMBOL_GPL(snd_fasync_helper); void snd_kill_fasync(struct snd_fasync *fasync, int signal, int poll) { - if (!fasync || !fasync->on) + if (!fasync) return; guard(spinlock_irqsave)(&snd_fasync_lock); + if (!fasync->on) + return; fasync->signal = signal; fasync->poll = poll; list_move(&fasync->list, &snd_fasync_list); @@ -163,8 +165,10 @@ void snd_fasync_free(struct snd_fasync *fasync) if (!fasync) return; - scoped_guard(spinlock_irq, &snd_fasync_lock) + scoped_guard(spinlock_irq, &snd_fasync_lock) { + fasync->on = 0; list_del_init(&fasync->list); + } flush_work(&snd_fasync_work); kfree(fasync); diff --git a/sound/core/oss/pcm_plugin.c b/sound/core/oss/pcm_plugin.c index 14b4a390a2192f..5f4d6945a7df64 100644 --- a/sound/core/oss/pcm_plugin.c +++ b/sound/core/oss/pcm_plugin.c @@ -146,7 +146,7 @@ int snd_pcm_plugin_build(struct snd_pcm_substream *plug, return -ENXIO; if (snd_BUG_ON(!src_format || !dst_format)) return -ENXIO; - plugin = kzalloc(sizeof(*plugin) + extra, GFP_KERNEL); + plugin = kzalloc_flex(*plugin, extra_data, extra); if (plugin == NULL) return -ENOMEM; plugin->name = name; diff --git a/sound/core/seq/seq_clientmgr.c b/sound/core/seq/seq_clientmgr.c index 75a7a2af9d8c96..5719637575a911 100644 --- a/sound/core/seq/seq_clientmgr.c +++ b/sound/core/seq/seq_clientmgr.c @@ -1253,7 +1253,7 @@ static int snd_seq_ioctl_set_client_info(struct snd_seq_client *client, if (client->user_pversion >= SNDRV_PROTOCOL_VERSION(1, 0, 3)) client->midi_version = client_info->midi_version; memcpy(client->event_filter, client_info->event_filter, 32); - client->group_filter = client_info->group_filter; + client->group_filter = client_info->group_filter & SND_SEQ_GROUP_FILTER_MASK; /* notify the change */ snd_seq_system_client_ev_client_change(client->number); diff --git a/sound/core/seq/seq_clientmgr.h b/sound/core/seq/seq_clientmgr.h index ece02c58db702d..feea8bb7d9870a 100644 --- a/sound/core/seq/seq_clientmgr.h +++ b/sound/core/seq/seq_clientmgr.h @@ -14,6 +14,9 @@ /* client manager */ +#define SND_SEQ_GROUP_FILTER_MASK GENMASK(SNDRV_UMP_MAX_GROUPS, 0) +#define SND_SEQ_GROUP_FILTER_GROUPS GENMASK(SNDRV_UMP_MAX_GROUPS, 1) + struct snd_seq_user_client { struct file *file; /* file struct of client */ /* ... */ @@ -40,7 +43,7 @@ struct snd_seq_client { int number; /* client number */ unsigned int filter; /* filter flags */ DECLARE_BITMAP(event_filter, 256); - unsigned short group_filter; + unsigned int group_filter; snd_use_lock_t use_lock; int event_lost; /* ports */ diff --git a/sound/core/seq/seq_midi_emul.c b/sound/core/seq/seq_midi_emul.c index fd067c85c524fc..a90ebc7b3811b2 100644 --- a/sound/core/seq/seq_midi_emul.c +++ b/sound/core/seq/seq_midi_emul.c @@ -81,9 +81,6 @@ snd_midi_process_event(const struct snd_midi_op *ops, pr_debug("ALSA: seq_midi_emul: ev or chanbase NULL (snd_midi_process_event)\n"); return; } - if (chanset->channels == NULL) - return; - if (snd_seq_ev_is_channel_type(ev)) { dest_channel = ev->data.note.channel; if (dest_channel >= chanset->max_channels) { @@ -642,23 +639,6 @@ static void snd_midi_channel_init(struct snd_midi_channel *p, int n) p->drum_channel = 1; /* Default ch 10 as drums */ } -/* - * Allocate and initialise a set of midi channel control blocks. - */ -static struct snd_midi_channel *snd_midi_channel_init_set(int n) -{ - struct snd_midi_channel *chan; - int i; - - chan = kmalloc_objs(struct snd_midi_channel, n); - if (chan) { - for (i = 0; i < n; i++) - snd_midi_channel_init(chan+i, i); - } - - return chan; -} - /* * reset all midi channels */ @@ -687,13 +667,18 @@ reset_all_channels(struct snd_midi_channel_set *chset) struct snd_midi_channel_set *snd_midi_channel_alloc_set(int n) { struct snd_midi_channel_set *chset; + int i; + + chset = kmalloc_flex(*chset, channels, n); + if (!chset) + return NULL; + + chset->max_channels = n; + chset->private_data = NULL; + + for (i = 0; i < n; i++) + snd_midi_channel_init(&chset->channels[i], i); - chset = kmalloc_obj(*chset); - if (chset) { - chset->channels = snd_midi_channel_init_set(n); - chset->private_data = NULL; - chset->max_channels = n; - } return chset; } EXPORT_SYMBOL(snd_midi_channel_alloc_set); @@ -717,7 +702,6 @@ void snd_midi_channel_free_set(struct snd_midi_channel_set *chset) { if (chset == NULL) return; - kfree(chset->channels); kfree(chset); } EXPORT_SYMBOL(snd_midi_channel_free_set); diff --git a/sound/core/seq/seq_ump_client.c b/sound/core/seq/seq_ump_client.c index fdc76f23e03f48..9079ccfdc8666d 100644 --- a/sound/core/seq/seq_ump_client.c +++ b/sound/core/seq/seq_ump_client.c @@ -369,7 +369,7 @@ static void setup_client_group_filter(struct seq_ump_client *client) cptr = snd_seq_kernel_client_get(client->seq_client); if (!cptr) return; - filter = ~(1U << 0); /* always allow groupless messages */ + filter = SND_SEQ_GROUP_FILTER_GROUPS; /* always allow groupless messages */ for (p = 0; p < SNDRV_UMP_MAX_GROUPS; p++) { if (client->ump->groups[p].active) filter &= ~(1U << (p + 1)); diff --git a/sound/drivers/pcmtest.c b/sound/drivers/pcmtest.c index 5bfec4c7bf7147..7f93557b51eca2 100644 --- a/sound/drivers/pcmtest.c +++ b/sound/drivers/pcmtest.c @@ -679,9 +679,9 @@ static ssize_t pattern_read(struct file *file, char __user *u_buff, size_t len, return 0; if (copy_to_user(u_buff, patt_buf->buf + *off, to_read)) - to_read = 0; - else - *off += to_read; + return -EFAULT; + + *off += to_read; return to_read; } diff --git a/sound/firewire/tascam/tascam-hwdep.c b/sound/firewire/tascam/tascam-hwdep.c index 867b4ea1096e13..6270263e7bf48b 100644 --- a/sound/firewire/tascam/tascam-hwdep.c +++ b/sound/firewire/tascam/tascam-hwdep.c @@ -73,6 +73,7 @@ static long tscm_hwdep_read_queue(struct snd_tscm *tscm, char __user *buf, length = rounddown(remained, sizeof(*entries)); if (length == 0) break; + tail_pos = head_pos + length / sizeof(*entries); spin_unlock_irq(&tscm->lock); if (copy_to_user(pos, &entries[head_pos], length)) diff --git a/sound/hda/codecs/Kconfig b/sound/hda/codecs/Kconfig index addbc94243365b..dcf340e5a0c1ad 100644 --- a/sound/hda/codecs/Kconfig +++ b/sound/hda/codecs/Kconfig @@ -69,6 +69,7 @@ comment "Set to Y if you want auto-loading the codec driver" config SND_HDA_CODEC_CA0132 tristate "Build Creative CA0132 codec support" + select SND_HDA_GENERIC help Say Y or M here to include Creative CA0132 codec support in snd-hda-intel driver. diff --git a/sound/hda/codecs/ca0132.c b/sound/hda/codecs/ca0132.c index ad533b04ab29cf..3fe11983d6ca17 100644 --- a/sound/hda/codecs/ca0132.c +++ b/sound/hda/codecs/ca0132.c @@ -24,6 +24,7 @@ #include "hda_local.h" #include "hda_auto_parser.h" #include "hda_jack.h" +#include "generic.h" #include "ca0132_regs.h" @@ -1060,6 +1061,8 @@ enum dsp_download_state { */ struct ca0132_spec { + struct hda_gen_spec gen; + const struct snd_kcontrol_new *mixers[5]; unsigned int num_mixers; const struct hda_verb *base_init_verbs; @@ -1174,6 +1177,7 @@ enum { QUIRK_R3D, QUIRK_AE5, QUIRK_AE7, + QUIRK_GENERIC, QUIRK_NONE = HDA_FIXUP_ID_NOT_SET, }; @@ -1292,6 +1296,20 @@ static const struct hda_pintbl ae7_pincfgs[] = { {} }; +static const struct hda_pintbl ca0132_generic_pincfgs[] = { + { 0x0b, 0x41014111 }, + { 0x0c, 0x414520f0 }, /* SPDIF out */ + { 0x0d, 0x01014010 }, /* lineout */ + { 0x0e, 0x41c501f0 }, + { 0x0f, 0x411111f0 }, /* disabled */ + { 0x10, 0x411111f0 }, /* disabled */ + { 0x11, 0x41012014 }, + { 0x12, 0x37a790f0 }, /* mic */ + { 0x13, 0x77a701f0 }, + { 0x18, 0x500000f0 }, + {} +}; + static const struct hda_quirk ca0132_quirks[] = { SND_PCI_QUIRK(0x1028, 0x057b, "Alienware M17x R4", QUIRK_ALIENWARE_M17XR4), SND_PCI_QUIRK(0x1028, 0x0685, "Alienware 15 2015", QUIRK_ALIENWARE), @@ -1304,6 +1322,7 @@ static const struct hda_quirk ca0132_quirks[] = { SND_PCI_QUIRK(0x1458, 0xA016, "Recon3Di", QUIRK_R3DI), SND_PCI_QUIRK(0x1458, 0xA026, "Gigabyte G1.Sniper Z97", QUIRK_R3DI), SND_PCI_QUIRK(0x1458, 0xA036, "Gigabyte GA-Z170X-Gaming 7", QUIRK_R3DI), + SND_PCI_QUIRK(0x1458, 0xA046, "Gigabyte GA-Z170X-Gaming G1", QUIRK_GENERIC), SND_PCI_QUIRK(0x3842, 0x1038, "EVGA X99 Classified", QUIRK_R3DI), SND_PCI_QUIRK(0x3842, 0x104b, "EVGA X299 Dark", QUIRK_R3DI), SND_PCI_QUIRK(0x3842, 0x1055, "EVGA Z390 DARK", QUIRK_R3DI), @@ -1325,6 +1344,7 @@ static const struct hda_model_fixup ca0132_quirk_models[] = { { .id = QUIRK_R3D, .name = "r3d" }, { .id = QUIRK_AE5, .name = "ae5" }, { .id = QUIRK_AE7, .name = "ae7" }, + { .id = QUIRK_GENERIC, .name = "generic" }, {} }; @@ -5498,6 +5518,30 @@ static int zxr_headphone_gain_set(struct hda_codec *codec, long val) return 0; } +/* + * Manual output selection (HP/Speaker Playback Switch or alt Output Select) + * is meaningful only when HP/Speaker auto-detect is disabled, since the + * select_out path always prefers jack presence when auto-detect is on. When + * the user explicitly chooses an output, turn auto-detect off so the manual + * choice actually takes effect, and notify userspace so the auto-detect + * control reflects the new state. + */ +static void ca0132_disable_hp_auto_detect(struct hda_codec *codec) +{ + struct ca0132_spec *spec = codec->spec; + struct snd_kcontrol *kctl; + + if (!spec->vnode_lswitch[VNID_HP_ASEL - VNODE_START_NID]) + return; + + spec->vnode_lswitch[VNID_HP_ASEL - VNODE_START_NID] = 0; + kctl = snd_hda_find_mixer_ctl(codec, + "HP/Speaker Auto Detect Playback Switch"); + if (kctl) + snd_ctl_notify(codec->card, SNDRV_CTL_EVENT_MASK_VALUE, + &kctl->id); +} + static int ca0132_vnode_switch_set(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { @@ -5510,14 +5554,11 @@ static int ca0132_vnode_switch_set(struct snd_kcontrol *kcontrol, int auto_jack; if (nid == VNID_HP_SEL) { - auto_jack = - spec->vnode_lswitch[VNID_HP_ASEL - VNODE_START_NID]; - if (!auto_jack) { - if (ca0132_use_alt_functions(spec)) - ca0132_alt_select_out(codec); - else - ca0132_select_out(codec); - } + ca0132_disable_hp_auto_detect(codec); + if (ca0132_use_alt_functions(spec)) + ca0132_alt_select_out(codec); + else + ca0132_select_out(codec); return 1; } @@ -5978,7 +6019,6 @@ static int ca0132_alt_output_select_put(struct snd_kcontrol *kcontrol, struct ca0132_spec *spec = codec->spec; int sel = ucontrol->value.enumerated.item[0]; unsigned int items = NUM_OF_OUTPUTS; - unsigned int auto_jack; if (sel >= items) return 0; @@ -5988,10 +6028,8 @@ static int ca0132_alt_output_select_put(struct snd_kcontrol *kcontrol, spec->out_enum_val = sel; - auto_jack = spec->vnode_lswitch[VNID_HP_ASEL - VNODE_START_NID]; - - if (!auto_jack) - ca0132_alt_select_out(codec); + ca0132_disable_hp_auto_detect(codec); + ca0132_alt_select_out(codec); return 1; } @@ -9879,14 +9917,57 @@ static void sbz_detect_quirk(struct hda_codec *codec) } } +static void ca0132_generic_init_hook(struct hda_codec *codec) +{ + struct ca0132_spec *spec = codec->spec; + + snd_hda_sequence_write(codec, spec->spec_init_verbs); +} + +static int ca0132_generic_probe(struct hda_codec *codec) +{ + struct ca0132_spec *spec = codec->spec; + struct auto_pin_cfg *cfg = &spec->gen.autocfg; + int err; + + snd_hda_gen_spec_init(&spec->gen); + + snd_hda_apply_pincfgs(codec, ca0132_generic_pincfgs); + + ca0132_init_chip(codec); + + err = ca0132_prepare_verbs(codec); + if (err < 0) + return err; + + err = snd_hda_parse_pin_def_config(codec, cfg, NULL); + if (err < 0) + return err; + err = snd_hda_gen_parse_auto_config(codec, cfg); + if (err < 0) + return err; + + spec->gen.init_hook = ca0132_generic_init_hook; + spec->gen.automute_speaker = 0; + spec->gen.automute_lo = 0; + + snd_hda_sequence_write(codec, spec->spec_init_verbs); + return 0; +} + static void ca0132_codec_remove(struct hda_codec *codec) { struct ca0132_spec *spec = codec->spec; - if (ca0132_quirk(spec) == QUIRK_ZXR_DBPRO) + switch (ca0132_quirk(spec)) { + case QUIRK_GENERIC: + snd_hda_gen_remove(codec); + return; + case QUIRK_ZXR_DBPRO: return dbpro_free(codec); - else + default: return ca0132_free(codec); + } } static int ca0132_codec_probe(struct hda_codec *codec, @@ -9903,14 +9984,21 @@ static int ca0132_codec_probe(struct hda_codec *codec, codec->spec = spec; spec->codec = codec; - /* Detect codec quirk */ - snd_hda_pick_fixup(codec, ca0132_quirk_models, ca0132_quirks, NULL); - if (ca0132_quirk(spec) == QUIRK_SBZ) - sbz_detect_quirk(codec); - + /* These must be set before any path is taken */ codec->pcm_format_first = 1; codec->no_sticky_stream = 1; + /* Detect codec quirk */ + snd_hda_pick_fixup(codec, ca0132_quirk_models, ca0132_quirks, NULL); + switch (ca0132_quirk(spec)) { + case QUIRK_SBZ: + sbz_detect_quirk(codec); + break; + case QUIRK_GENERIC: + return ca0132_generic_probe(codec); + default: + break; + } spec->dsp_state = DSP_DOWNLOAD_INIT; spec->num_mixers = 1; @@ -10011,36 +10099,51 @@ static int ca0132_codec_build_controls(struct hda_codec *codec) { struct ca0132_spec *spec = codec->spec; - if (ca0132_quirk(spec) == QUIRK_ZXR_DBPRO) + switch (ca0132_quirk(spec)) { + case QUIRK_GENERIC: + return snd_hda_gen_build_controls(codec); + case QUIRK_ZXR_DBPRO: return dbpro_build_controls(codec); - else + default: return ca0132_build_controls(codec); + } } static int ca0132_codec_build_pcms(struct hda_codec *codec) { struct ca0132_spec *spec = codec->spec; - if (ca0132_quirk(spec) == QUIRK_ZXR_DBPRO) + switch (ca0132_quirk(spec)) { + case QUIRK_GENERIC: + return snd_hda_gen_build_pcms(codec); + case QUIRK_ZXR_DBPRO: return dbpro_build_pcms(codec); - else + default: return ca0132_build_pcms(codec); + } } static int ca0132_codec_init(struct hda_codec *codec) { struct ca0132_spec *spec = codec->spec; - if (ca0132_quirk(spec) == QUIRK_ZXR_DBPRO) + switch (ca0132_quirk(spec)) { + case QUIRK_GENERIC: + return snd_hda_gen_init(codec); + case QUIRK_ZXR_DBPRO: return dbpro_init(codec); - else + default: return ca0132_init(codec); + } } static int ca0132_codec_suspend(struct hda_codec *codec) { struct ca0132_spec *spec = codec->spec; + if (ca0132_quirk(spec) == QUIRK_GENERIC) + return 0; + cancel_delayed_work_sync(&spec->unsol_hp_work); return 0; } diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index c3316a6f7e9d59..97ae4cee87bf52 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -1669,6 +1669,21 @@ static void alc295_fixup_hp_mute_led_coefbit11(struct hda_codec *codec, } } +/* Override wrong pin to NID 0x1b (F.32 BIOS reports 0x18 via DMI OEM string) + * on HP pavilion 15-cs1xxx laptops + */ +static void alc295_fixup_hp_pavilion_mute_led_1b(struct hda_codec *codec, + const struct hda_fixup *fix, + int action) +{ + struct alc_spec *spec = codec->spec; + + alc269_fixup_hp_mute_led(codec, fix, action); + + if (action == HDA_FIXUP_ACT_PRE_PROBE) + spec->mute_led_nid = 0x1b; +} + static void alc233_fixup_lenovo_coef_micmute_led(struct hda_codec *codec, const struct hda_fixup *fix, int action) { @@ -3323,6 +3338,18 @@ static void alc256_fixup_acer_sfg16_micmute_led(struct hda_codec *codec, alc_fixup_hp_gpio_led(codec, action, 0, 0x04); } +static void alc287_fixup_acer_micmute_led(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + struct alc_spec *spec = codec->spec; + + alc_fixup_hp_gpio_led(codec, action, 0, 0x10); + if (action == HDA_FIXUP_ACT_PRE_PROBE) { + spec->micmute_led_polarity = 1; + spec->gpio_mask |= 0x10; + spec->gpio_dir |= 0x10; + } +} /* for alc295_fixup_hp_top_speakers */ #include "../helpers/hp_x360.c" @@ -3390,6 +3417,19 @@ static void alc256_fixup_mic_no_presence_and_resume(struct hda_codec *codec, } } +static void alc256_fixup_xiaomi_pro15_resume(struct hda_codec *codec, + const struct hda_fixup *fix, + int action) +{ + /* + * On the Xiaomi Mi Laptop Pro 15 (TM1905, SSID 1d72:1905) the ALC256 + * codec sets coefficient 0x10 bit 9 to 1 after S3 resume, silencing + * the internal speaker. Bluetooth and HDMI audio are unaffected. + * Clear the bit so the speaker keeps working across suspend cycles. + */ + alc_update_coef_idx(codec, 0x10, 1<<9, 0); +} + static void alc256_decrease_headphone_amp_val(struct hda_codec *codec, const struct hda_fixup *fix, int action) { @@ -3857,6 +3897,7 @@ enum { ALC290_FIXUP_SUBWOOFER, ALC290_FIXUP_SUBWOOFER_HSJACK, ALC295_FIXUP_HP_MUTE_LED_COEFBIT11, + ALC295_FIXUP_HP_PAVILION_MUTE_LED_1B, ALC269_FIXUP_THINKPAD_ACPI, ALC269_FIXUP_LENOVO_XPAD_ACPI, ALC269_FIXUP_DMIC_THINKPAD_ACPI, @@ -4052,6 +4093,7 @@ enum { ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE, ALC233_FIXUP_NO_AUDIO_JACK, ALC256_FIXUP_MIC_NO_PRESENCE_AND_RESUME, + ALC256_FIXUP_XIAOMI_PRO15_RESUME, ALC285_FIXUP_LEGION_Y9000X_SPEAKERS, ALC285_FIXUP_LEGION_Y9000X_AUTOMUTE, ALC287_FIXUP_LEGION_16ACHG6, @@ -4065,6 +4107,7 @@ enum { ALC245_FIXUP_CS35L41_SPI_4_HP_GPIO_LED, ALC285_FIXUP_HP_SPEAKERS_MICMUTE_LED, ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE, + ALC295_FIXUP_FRAMEWORK_LAPTOP_LIMIT_INT_MIC_BOOST, ALC287_FIXUP_LEGION_16ITHG6, ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK, ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN, @@ -4123,6 +4166,8 @@ enum { ALC245_FIXUP_CS35L41_I2C_2_MUTE_LED, ALC236_FIXUP_HP_DMIC, ALC256_FIXUP_HONOR_MRB_XXX_M1020_AUDIO, + ALC245_FIXUP_HP_ENVY_X360_15_FH0XXX, + ALC287_FIXUP_ACER_MICMUTE_LED, }; /* A special fixup for Lenovo C940 and Yoga Duet 7; @@ -5700,6 +5745,10 @@ static const struct hda_fixup alc269_fixups[] = { .type = HDA_FIXUP_FUNC, .v.func = alc295_fixup_hp_mute_led_coefbit11, }, + [ALC295_FIXUP_HP_PAVILION_MUTE_LED_1B] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc295_fixup_hp_pavilion_mute_led_1b, + }, [ALC298_FIXUP_SAMSUNG_AMP] = { .type = HDA_FIXUP_FUNC, .v.func = alc298_fixup_samsung_amp, @@ -6240,6 +6289,10 @@ static const struct hda_fixup alc269_fixups[] = { .chained = true, .chain_id = ALC269_FIXUP_HEADSET_MODE_NO_HP_MIC }, + [ALC256_FIXUP_XIAOMI_PRO15_RESUME] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc256_fixup_xiaomi_pro15_resume, + }, [ALC287_FIXUP_LEGION_16ACHG6] = { .type = HDA_FIXUP_FUNC, .v.func = alc287_fixup_legion_16achg6_speakers, @@ -6307,6 +6360,12 @@ static const struct hda_fixup alc269_fixups[] = { .chained = true, .chain_id = ALC269_FIXUP_HEADSET_MODE_NO_HP_MIC }, + [ALC295_FIXUP_FRAMEWORK_LAPTOP_LIMIT_INT_MIC_BOOST] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc269_fixup_limit_int_mic_boost, + .chained = true, + .chain_id = ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE, + }, [ALC287_FIXUP_LEGION_16ITHG6] = { .type = HDA_FIXUP_FUNC, .v.func = alc287_fixup_legion_16ithg6_speakers, @@ -6675,7 +6734,19 @@ static const struct hda_fixup alc269_fixups[] = { { 0x1b, 0x90170110 }, { } } - } + }, + [ALC245_FIXUP_HP_ENVY_X360_15_FH0XXX] = { + .type = HDA_FIXUP_FUNC, + .v.func = cs35l41_fixup_i2c_two, + .chained = true, + .chain_id = ALC245_FIXUP_HP_X360_MUTE_LEDS + }, + [ALC287_FIXUP_ACER_MICMUTE_LED] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc287_fixup_acer_micmute_led, + .chained = true, + .chain_id = ALC2XX_FIXUP_HEADSET_MIC, + }, }; static const struct hda_quirk alc269_fixup_tbl[] = { @@ -6725,7 +6796,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1025, 0x1466, "Acer Aspire A515-56", ALC255_FIXUP_ACER_HEADPHONE_AND_MIC), SND_PCI_QUIRK(0x1025, 0x1534, "Acer Predator PH315-54", ALC255_FIXUP_ACER_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1025, 0x1539, "Acer Nitro 5 AN515-57", ALC2XX_FIXUP_HEADSET_MIC), - SND_PCI_QUIRK(0x1025, 0x159c, "Acer Nitro 5 AN515-58", ALC2XX_FIXUP_HEADSET_MIC), + SND_PCI_QUIRK(0x1025, 0x159c, "Acer Nitro 5 AN515-58", ALC287_FIXUP_ACER_MICMUTE_LED), SND_PCI_QUIRK(0x1025, 0x1597, "Acer Nitro 5 AN517-55", ALC2XX_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x1025, 0x160e, "Acer PT316-51S", ALC2XX_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x1025, 0x1640, "Acer Aspire A315-44P", ALC256_FIXUP_ACER_SFG16_MICMUTE_LED), @@ -6906,6 +6977,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8537, "HP ProBook 440 G6", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF), SND_PCI_QUIRK(0x103c, 0x8548, "HP EliteBook x360 830 G6", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x854a, "HP EliteBook 830 G6", ALC285_FIXUP_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x856a, "HP Pavilion 15-cs1xxx", ALC295_FIXUP_HP_PAVILION_MUTE_LED_1B), SND_PCI_QUIRK(0x103c, 0x85c6, "HP Pavilion x360 Convertible 14-dy1xxx", ALC295_FIXUP_HP_MUTE_LED_COEFBIT11), SND_PCI_QUIRK(0x103c, 0x85de, "HP Envy x360 13-ar0xxx", ALC285_FIXUP_HP_ENVY_X360), SND_PCI_QUIRK(0x103c, 0x8603, "HP Omen 17-cb0xxx", ALC285_FIXUP_HP_MUTE_LED), @@ -7097,7 +7169,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8be6, "HP Envy 16", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8be7, "HP Envy 17", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8be8, "HP Envy 17", ALC287_FIXUP_CS35L41_I2C_2), - SND_PCI_QUIRK(0x103c, 0x8be9, "HP Envy 15", ALC287_FIXUP_CS35L41_I2C_2), + SND_PCI_QUIRK(0x103c, 0x8be9, "HP Envy x360 2-in-1 Laptop 15-fh0xxx", ALC245_FIXUP_HP_ENVY_X360_15_FH0XXX), SND_PCI_QUIRK(0x103c, 0x8bf0, "HP", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8c15, "HP Spectre x360 2-in-1 Laptop 14-eu0xxx", ALC245_FIXUP_HP_SPECTRE_X360_EU0XXX), SND_PCI_QUIRK(0x103c, 0x8c16, "HP Spectre x360 2-in-1 Laptop 16-aa0xxx", ALC245_FIXUP_HP_SPECTRE_X360_16_AA0XXX), @@ -7147,6 +7219,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8ca4, "HP ZBook Fury", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8ca7, "HP ZBook Fury", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8caf, "HP Elite mt645 G8 Mobile Thin Client", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF), + SND_PCI_QUIRK(0x103c, 0x8cbc, "HP Pavilion Laptop 16-ag0xxx", ALC245_FIXUP_HP_X360_MUTE_LEDS), SND_PCI_QUIRK(0x103c, 0x8cbd, "HP Pavilion Aero Laptop 13-bg0xxx", ALC245_FIXUP_HP_X360_MUTE_LEDS), SND_PCI_QUIRK(0x103c, 0x8cdd, "HP Spectre", ALC245_FIXUP_HP_SPECTRE_X360_EU0XXX), SND_PCI_QUIRK(0x103c, 0x8cde, "HP OmniBook Ultra Flip Laptop 14t", ALC245_FIXUP_HP_SPECTRE_X360_EU0XXX), @@ -7459,6 +7532,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x144d, 0xc870, "Samsung Galaxy Book2 Pro (NP950XED)", ALC298_FIXUP_SAMSUNG_AMP_V2_2_AMPS), SND_PCI_QUIRK(0x144d, 0xc872, "Samsung Galaxy Book2 Pro (NP950XEE)", ALC298_FIXUP_SAMSUNG_AMP_V2_2_AMPS), SND_PCI_QUIRK(0x144d, 0xc886, "Samsung Galaxy Book3 Pro (NP964XFG)", ALC298_FIXUP_SAMSUNG_AMP_V2_4_AMPS), + SND_PCI_QUIRK(0x144d, 0xc902, "Samsung Galaxy Book5 360 (NP750QHA)", ALC256_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET), SND_PCI_QUIRK(0x144d, 0xc1ca, "Samsung Galaxy Book3 Pro 360 (NP960QFG)", ALC298_FIXUP_SAMSUNG_AMP_V2_4_AMPS), SND_PCI_QUIRK(0x144d, 0xc1cb, "Samsung Galaxy Book3 Pro 360 (NP965QFG)", ALC298_FIXUP_SAMSUNG_AMP_V2_4_AMPS), SND_PCI_QUIRK(0x144d, 0xc1cc, "Samsung Galaxy Book3 Ultra (NT960XFH)", ALC298_FIXUP_SAMSUNG_AMP_V2_4_AMPS), @@ -7630,6 +7704,12 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x17aa, 0x3801, "Lenovo Yoga9 14IAP7", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN), HDA_CODEC_QUIRK(0x17aa, 0x3802, "DuetITL 2021", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS), SND_PCI_QUIRK(0x17aa, 0x3802, "Lenovo Yoga Pro 9 14IRP8", ALC287_FIXUP_TAS2781_I2C), + /* Yoga Pro 9 16IMH9 and Legion 7 16ITHG6 share PCI SSID 17aa:3811 + * with Legion S7 15IMH05; use codec SSID to distinguish them + */ + HDA_CODEC_QUIRK(0x17aa, 0x38d5, "Lenovo Yoga Pro 9 16IMH9", ALC287_FIXUP_TAS2781_I2C), + HDA_CODEC_QUIRK(0x17aa, 0x38d6, "Lenovo Yoga Pro 9 16IMH9", ALC287_FIXUP_TAS2781_I2C), + HDA_CODEC_QUIRK(0x17aa, 0x3855, "Legion 7 16ITHG6", ALC287_FIXUP_LEGION_16ITHG6), SND_PCI_QUIRK(0x17aa, 0x3811, "Legion S7 15IMH05", ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS), SND_PCI_QUIRK(0x17aa, 0x3813, "Legion 7i 15IMHG05", ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS), SND_PCI_QUIRK(0x17aa, 0x3818, "Lenovo C940 / Yoga Duet 7", ALC298_FIXUP_LENOVO_C940_DUET7), @@ -7703,6 +7783,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x17aa, 0x38df, "Y990 YG DUAL", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38f9, "Thinkbook 16P Gen5", ALC287_FIXUP_MG_RTKC_CSAMP_CS35L41_I2C_THINKPAD), SND_PCI_QUIRK(0x17aa, 0x38fa, "Thinkbook 16P Gen5", ALC287_FIXUP_MG_RTKC_CSAMP_CS35L41_I2C_THINKPAD), + SND_PCI_QUIRK(0x17aa, 0x38fc, "Lenovo Yoga Pro 7 15ASH11", ALC245_FIXUP_BASS_HP_DAC), SND_PCI_QUIRK(0x17aa, 0x38fd, "ThinkBook plus Gen5 Hybrid", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x3902, "Lenovo E50-80", ALC269_FIXUP_DMIC_THINKPAD_ACPI), SND_PCI_QUIRK(0x17aa, 0x390d, "Lenovo Yoga Pro 7 14ASP10", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN), @@ -7775,9 +7856,11 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1d72, 0x1602, "RedmiBook", ALC255_FIXUP_XIAOMI_HEADSET_MIC), SND_PCI_QUIRK(0x1d72, 0x1701, "XiaomiNotebook Pro", ALC298_FIXUP_DELL1_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1d72, 0x1901, "RedmiBook 14", ALC256_FIXUP_ASUS_HEADSET_MIC), + SND_PCI_QUIRK(0x1d72, 0x1905, "Xiaomi Mi Laptop Pro 15", ALC256_FIXUP_XIAOMI_PRO15_RESUME), SND_PCI_QUIRK(0x1d72, 0x1945, "Redmi G", ALC256_FIXUP_ASUS_HEADSET_MIC), SND_PCI_QUIRK(0x1d72, 0x1947, "RedmiBook Air", ALC255_FIXUP_XIAOMI_HEADSET_MIC), SND_PCI_QUIRK(0x1e39, 0xca14, "MEDION NM14LNL", ALC233_FIXUP_MEDION_MTL_SPK), + SND_PCI_QUIRK(0x1e50, 0x7007, "Positivo DN50E", ALC269_FIXUP_LIMIT_INT_MIC_BOOST), SND_PCI_QUIRK(0x1ee7, 0x2078, "HONOR BRB-X M1010", ALC2XX_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x1ee7, 0x2081, "HONOR MRB-XXX M1020", ALC256_FIXUP_HONOR_MRB_XXX_M1020_AUDIO), SND_PCI_QUIRK(0x1f4c, 0xe001, "Minisforum V3 (SE)", ALC245_FIXUP_BASS_HP_DAC), @@ -7803,7 +7886,8 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0xf111, 0x0009, "Framework Laptop", ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0xf111, 0x000b, "Framework Laptop", ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0xf111, 0x000c, "Framework Laptop", ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE), - SND_PCI_QUIRK(0xf111, 0x000f, "Framework Laptop", ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0xf111, 0x000f, "Framework Laptop 13 Pro PTL", ALC295_FIXUP_FRAMEWORK_LAPTOP_LIMIT_INT_MIC_BOOST), + SND_PCI_QUIRK(0xf111, 0x010f, "Framework Laptop 13 PTL", ALC295_FIXUP_FRAMEWORK_LAPTOP_LIMIT_INT_MIC_BOOST), #if 0 /* Below is a quirk table taken from the old code. diff --git a/sound/hda/codecs/side-codecs/Kconfig b/sound/hda/codecs/side-codecs/Kconfig index fc5651e555e3c2..e51964c0a09162 100644 --- a/sound/hda/codecs/side-codecs/Kconfig +++ b/sound/hda/codecs/side-codecs/Kconfig @@ -94,7 +94,6 @@ menu "CS35L56 driver options" config SND_HDA_SCODEC_CS35L56_CAL_DEBUGFS bool "CS35L56 create debugfs for factory calibration" - default N depends on DEBUG_FS select SND_SOC_CS35L56_CAL_DEBUGFS_COMMON help diff --git a/sound/hda/codecs/side-codecs/cs35l41_hda.c b/sound/hda/codecs/side-codecs/cs35l41_hda.c index b64890006bb701..64a5bd895fd1e6 100644 --- a/sound/hda/codecs/side-codecs/cs35l41_hda.c +++ b/sound/hda/codecs/side-codecs/cs35l41_hda.c @@ -1325,6 +1325,43 @@ static int cs35l41_fw_type_ctl_info(struct snd_kcontrol *kcontrol, struct snd_ct return snd_ctl_enum_info(uinfo, 1, ARRAY_SIZE(cs35l41_hda_fw_ids), cs35l41_hda_fw_ids); } +static void cs35l41_remove_controls(struct cs35l41_hda *cs35l41) +{ + if (!cs35l41->codec) + return; + + snd_ctl_remove(cs35l41->codec->card, cs35l41->mute_override_ctl); + cs35l41->mute_override_ctl = NULL; + + snd_ctl_remove(cs35l41->codec->card, cs35l41->fw_load_ctl); + cs35l41->fw_load_ctl = NULL; + + snd_ctl_remove(cs35l41->codec->card, cs35l41->fw_type_ctl); + cs35l41->fw_type_ctl = NULL; +} + +static int cs35l41_add_control(struct cs35l41_hda *cs35l41, + struct snd_kcontrol_new *ctl, + struct snd_kcontrol **kctl) +{ + int ret; + + *kctl = snd_ctl_new1(ctl, cs35l41); + if (!*kctl) + return -ENOMEM; + + ret = snd_ctl_add(cs35l41->codec->card, *kctl); + if (ret) { + dev_err(cs35l41->dev, "Failed to add KControl %s = %d\n", ctl->name, ret); + *kctl = NULL; + return ret; + } + + dev_dbg(cs35l41->dev, "Added Control %s\n", ctl->name); + + return 0; +} + static int cs35l41_create_controls(struct cs35l41_hda *cs35l41) { char fw_type_ctl_name[SNDRV_CTL_ELEM_ID_NAME_MAXLEN]; @@ -1360,32 +1397,23 @@ static int cs35l41_create_controls(struct cs35l41_hda *cs35l41) scnprintf(mute_override_ctl_name, SNDRV_CTL_ELEM_ID_NAME_MAXLEN, "%s Forced Mute Status", cs35l41->amp_name); - ret = snd_ctl_add(cs35l41->codec->card, snd_ctl_new1(&fw_type_ctl, cs35l41)); - if (ret) { - dev_err(cs35l41->dev, "Failed to add KControl %s = %d\n", fw_type_ctl.name, ret); - return ret; - } - - dev_dbg(cs35l41->dev, "Added Control %s\n", fw_type_ctl.name); - - ret = snd_ctl_add(cs35l41->codec->card, snd_ctl_new1(&fw_load_ctl, cs35l41)); - if (ret) { - dev_err(cs35l41->dev, "Failed to add KControl %s = %d\n", fw_load_ctl.name, ret); - return ret; - } - - dev_dbg(cs35l41->dev, "Added Control %s\n", fw_load_ctl.name); + ret = cs35l41_add_control(cs35l41, &fw_type_ctl, &cs35l41->fw_type_ctl); + if (ret) + goto err; - ret = snd_ctl_add(cs35l41->codec->card, snd_ctl_new1(&mute_override_ctl, cs35l41)); - if (ret) { - dev_err(cs35l41->dev, "Failed to add KControl %s = %d\n", mute_override_ctl.name, - ret); - return ret; - } + ret = cs35l41_add_control(cs35l41, &fw_load_ctl, &cs35l41->fw_load_ctl); + if (ret) + goto err; - dev_dbg(cs35l41->dev, "Added Control %s\n", mute_override_ctl.name); + ret = cs35l41_add_control(cs35l41, &mute_override_ctl, &cs35l41->mute_override_ctl); + if (ret) + goto err; return 0; + +err: + cs35l41_remove_controls(cs35l41); + return ret; } static bool cs35l41_dsm_supported(acpi_handle handle, unsigned int commands) @@ -1522,6 +1550,10 @@ static void cs35l41_hda_unbind(struct device *dev, struct device *master, void * device_link_remove(&cs35l41->codec->core.dev, cs35l41->dev); unlock_system_sleep(sleep_flags); memset(comp, 0, sizeof(*comp)); + + cs35l41_remove_controls(cs35l41); + cancel_work_sync(&cs35l41->fw_load_work); + cs35l41->codec = NULL; } } @@ -1896,8 +1928,10 @@ static int cs35l41_hda_read_acpi(struct cs35l41_hda *cs35l41, const char *hid, i cs35l41->dacpi = adev; physdev = get_device(acpi_get_first_physical_node(adev)); - if (!physdev) + if (!physdev) { + acpi_dev_put(adev); return -ENODEV; + } sub = acpi_get_subsystem_id(ACPI_HANDLE(physdev)); if (IS_ERR(sub)) @@ -2058,6 +2092,7 @@ void cs35l41_hda_remove(struct device *dev) struct cs35l41_hda *cs35l41 = dev_get_drvdata(dev); component_del(cs35l41->dev, &cs35l41_hda_comp_ops); + cancel_work_sync(&cs35l41->fw_load_work); pm_runtime_get_sync(cs35l41->dev); pm_runtime_dont_use_autosuspend(cs35l41->dev); diff --git a/sound/hda/codecs/side-codecs/cs35l41_hda.h b/sound/hda/codecs/side-codecs/cs35l41_hda.h index 7d003c598e93e5..56ec07c0bb7456 100644 --- a/sound/hda/codecs/side-codecs/cs35l41_hda.h +++ b/sound/hda/codecs/side-codecs/cs35l41_hda.h @@ -57,6 +57,8 @@ enum control_bus { SPI }; +struct snd_kcontrol; + struct cs35l41_hda { struct device *dev; struct regmap *regmap; @@ -75,6 +77,9 @@ struct cs35l41_hda { int speaker_id; struct mutex fw_mutex; struct work_struct fw_load_work; + struct snd_kcontrol *fw_type_ctl; + struct snd_kcontrol *fw_load_ctl; + struct snd_kcontrol *mute_override_ctl; struct regmap_irq_chip_data *irq_data; bool firmware_running; diff --git a/sound/hda/codecs/side-codecs/cs35l56_hda.c b/sound/hda/codecs/side-codecs/cs35l56_hda.c index 4c8d01799931c8..cdbc576569efee 100644 --- a/sound/hda/codecs/side-codecs/cs35l56_hda.c +++ b/sound/hda/codecs/side-codecs/cs35l56_hda.c @@ -1041,6 +1041,7 @@ static int cs35l56_hda_read_acpi(struct cs35l56_hda *cs35l56, int hid, int id) return -ENODEV; } ACPI_COMPANION_SET(cs35l56->base.dev, adev); + acpi_dev_put(adev); } /* Initialize things that could be overwritten by a fixup */ diff --git a/sound/hda/common/codec.c b/sound/hda/common/codec.c index c2af2511a83160..81f266b9b850f7 100644 --- a/sound/hda/common/codec.c +++ b/sound/hda/common/codec.c @@ -1699,6 +1699,9 @@ int snd_hda_ctl_add(struct hda_codec *codec, hda_nid_t nid, unsigned short flags = 0; struct hda_nid_item *item; + if (!kctl) + return -EINVAL; + if (kctl->id.subdevice & HDA_SUBDEV_AMP_FLAG) { flags |= HDA_NID_ITEM_AMP; if (nid == 0) diff --git a/sound/pci/asihpi/hpicmn.c b/sound/pci/asihpi/hpicmn.c index d846777e7462b3..0674fd854cc8bd 100644 --- a/sound/pci/asihpi/hpicmn.c +++ b/sound/pci/asihpi/hpicmn.c @@ -641,17 +641,12 @@ void hpi_cmn_control_cache_sync_to_msg(struct hpi_control_cache *p_cache, struct hpi_control_cache *hpi_alloc_control_cache(const u32 control_count, const u32 size_in_bytes, u8 *p_dsp_control_buffer) { - struct hpi_control_cache *p_cache = kmalloc_obj(*p_cache); + struct hpi_control_cache *p_cache; + + p_cache = kzalloc_flex(*p_cache, p_info, control_count); if (!p_cache) return NULL; - p_cache->p_info = - kzalloc_objs(*p_cache->p_info, control_count); - if (!p_cache->p_info) { - kfree(p_cache); - return NULL; - } - p_cache->cache_size_in_bytes = size_in_bytes; p_cache->control_count = control_count; p_cache->p_cache = p_dsp_control_buffer; @@ -661,10 +656,7 @@ struct hpi_control_cache *hpi_alloc_control_cache(const u32 control_count, void hpi_free_control_cache(struct hpi_control_cache *p_cache) { - if (p_cache) { - kfree(p_cache->p_info); - kfree(p_cache); - } + kfree(p_cache); } static void subsys_message(struct hpi_message *phm, struct hpi_response *phr) diff --git a/sound/pci/asihpi/hpicmn.h b/sound/pci/asihpi/hpicmn.h index 8ec656cf8848c7..40af7329587a12 100644 --- a/sound/pci/asihpi/hpicmn.h +++ b/sound/pci/asihpi/hpicmn.h @@ -37,10 +37,10 @@ struct hpi_control_cache { u16 adap_idx; u32 control_count; u32 cache_size_in_bytes; - /** pointer to allocated memory of lookup pointers. */ - struct hpi_control_cache_info **p_info; /** pointer to DSP's control cache. */ u8 *p_cache; + /** pointer to allocated memory of lookup pointers. */ + struct hpi_control_cache_info *p_info[] __counted_by(control_count); }; struct hpi_adapter_obj *hpi_find_adapter(u16 adapter_index); diff --git a/sound/pci/ctxfi/ctmixer.c b/sound/pci/ctxfi/ctmixer.c index 50ab69aca2fa31..9abe1d53c3c5e5 100644 --- a/sound/pci/ctxfi/ctmixer.c +++ b/sound/pci/ctxfi/ctmixer.c @@ -221,10 +221,6 @@ ct_mixer_recording_select(struct ct_mixer *mixer, enum CT_AMIXER_CTL type); static void ct_mixer_recording_unselect(struct ct_mixer *mixer, enum CT_AMIXER_CTL type); -/* FIXME: this static looks like it would fail if more than one card was */ -/* installed. */ -static struct snd_kcontrol *kctls[2] = {NULL}; - static enum CT_AMIXER_CTL get_amixer_index(enum CTALSA_MIXER_CTL alsa_index) { switch (alsa_index) { @@ -481,17 +477,18 @@ static struct snd_kcontrol_new mic_source_ctl = { static void do_line_mic_switch(struct ct_atc *atc, enum CTALSA_MIXER_CTL type) { + struct ct_mixer *mixer = atc->mixer; if (MIXER_LINEIN_C_S == type) { atc->select_line_in(atc); - set_switch_state(atc->mixer, MIXER_MIC_C_S, 0); + set_switch_state(mixer, MIXER_MIC_C_S, 0); snd_ctl_notify(atc->card, SNDRV_CTL_EVENT_MASK_VALUE, - &kctls[1]->id); + &mixer->line_mic_kctls[1]->id); } else if (MIXER_MIC_C_S == type) { atc->select_mic_in(atc); - set_switch_state(atc->mixer, MIXER_LINEIN_C_S, 0); + set_switch_state(mixer, MIXER_LINEIN_C_S, 0); snd_ctl_notify(atc->card, SNDRV_CTL_EVENT_MASK_VALUE, - &kctls[0]->id); + &mixer->line_mic_kctls[0]->id); } } @@ -778,9 +775,9 @@ ct_mixer_kcontrol_new(struct ct_mixer *mixer, struct snd_kcontrol_new *new) switch (new->private_value) { case MIXER_LINEIN_C_S: - kctls[0] = kctl; break; + mixer->line_mic_kctls[0] = kctl; break; case MIXER_MIC_C_S: - kctls[1] = kctl; break; + mixer->line_mic_kctls[1] = kctl; break; default: break; } diff --git a/sound/pci/ctxfi/ctmixer.h b/sound/pci/ctxfi/ctmixer.h index dd23d227aeb5aa..0c0963bba99c8a 100644 --- a/sound/pci/ctxfi/ctmixer.h +++ b/sound/pci/ctxfi/ctmixer.h @@ -17,6 +17,8 @@ #include "ctatc.h" #include "ctresource.h" +struct snd_kcontrol; + #define INIT_VOL 0x1c00 enum MIXER_PORT_T { @@ -42,6 +44,7 @@ struct ct_mixer { struct ct_atc *atc; struct sum **sums; /* sum resources for signal collection */ + struct snd_kcontrol *line_mic_kctls[2]; /* line/mic capture switch controls */ unsigned int switch_state; /* A bit-map to indicate state of switches */ int (*get_output_ports)(struct ct_mixer *mixer, enum MIXER_PORT_T type, diff --git a/sound/pci/ctxfi/ctsrc.c b/sound/pci/ctxfi/ctsrc.c index 326997bb885df4..46dc1f509234f0 100644 --- a/sound/pci/ctxfi/ctsrc.c +++ b/sound/pci/ctxfi/ctsrc.c @@ -668,13 +668,6 @@ static int srcimp_rsc_init(struct srcimp *srcimp, if (err) return err; - /* Reserve memory for imapper nodes */ - srcimp->imappers = kzalloc_objs(struct imapper, desc->msr); - if (!srcimp->imappers) { - err = -ENOMEM; - goto error1; - } - /* Set srcimp specific operations */ srcimp->rsc.ops = &srcimp_basic_rsc_ops; srcimp->ops = &srcimp_ops; @@ -683,16 +676,10 @@ static int srcimp_rsc_init(struct srcimp *srcimp, srcimp->rsc.ops->master(&srcimp->rsc); return 0; - -error1: - rsc_uninit(&srcimp->rsc); - return err; } static int srcimp_rsc_uninit(struct srcimp *srcimp) { - kfree(srcimp->imappers); - srcimp->imappers = NULL; srcimp->ops = NULL; srcimp->mgr = NULL; rsc_uninit(&srcimp->rsc); @@ -711,7 +698,7 @@ static int get_srcimp_rsc(struct srcimp_mgr *mgr, *rsrcimp = NULL; /* Allocate mem for SRCIMP resource */ - srcimp = kzalloc(sizeof(*srcimp), GFP_KERNEL); + srcimp = kzalloc_flex(*srcimp, imappers, desc->msr); if (!srcimp) return -ENOMEM; diff --git a/sound/pci/ctxfi/ctsrc.h b/sound/pci/ctxfi/ctsrc.h index e6366cc6a7ae6a..7ba94adde92085 100644 --- a/sound/pci/ctxfi/ctsrc.h +++ b/sound/pci/ctxfi/ctsrc.h @@ -103,10 +103,10 @@ struct srcimp_rsc_ops; struct srcimp { struct rsc rsc; unsigned char idx[8]; - struct imapper *imappers; unsigned int mapped; /* A bit-map indicating which conj rsc is mapped */ struct srcimp_mgr *mgr; const struct srcimp_rsc_ops *ops; + struct imapper imappers[]; }; struct srcimp_rsc_ops { diff --git a/sound/pci/ice1712/ice1724.c b/sound/pci/ice1712/ice1724.c index 2e64f9c020e5a1..79d57938a1c8cb 100644 --- a/sound/pci/ice1712/ice1724.c +++ b/sound/pci/ice1712/ice1724.c @@ -730,13 +730,22 @@ static int snd_vt1724_pcm_hw_params(struct snd_pcm_substream *substream, static int snd_vt1724_pcm_hw_free(struct snd_pcm_substream *substream) { struct snd_ice1712 *ice = snd_pcm_substream_chip(substream); + bool released = false; int i; - guard(mutex)(&ice->open_mutex); - /* unmark surround channels */ - for (i = 0; i < 3; i++) - if (ice->pcm_reserved[i] == substream) + scoped_guard(mutex, &ice->open_mutex) { + /* unmark surround channels */ + for (i = 0; i < 3; i++) { + if (ice->pcm_reserved[i] != substream) + continue; ice->pcm_reserved[i] = NULL; + released = true; + } + } + + if (released && ice->pcm_ds) + wake_up(&ice->pcm_ds->open_wait); + return 0; } @@ -1364,7 +1373,7 @@ static int snd_vt1724_playback_indep_open(struct snd_pcm_substream *substream) scoped_guard(mutex, &ice->open_mutex) { /* already used by PDMA0? */ if (ice->pcm_reserved[substream->number]) - return -EBUSY; /* FIXME: should handle blocking mode properly */ + return -EAGAIN; } runtime->private_data = (void *)&vt1724_playback_dma_regs[substream->number]; ice->playback_con_substream_ds[substream->number] = substream; diff --git a/sound/soc/amd/acp-config.c b/sound/soc/amd/acp-config.c index 1604ed679224ba..309dc9ed6e0d2b 100644 --- a/sound/soc/amd/acp-config.c +++ b/sound/soc/amd/acp-config.c @@ -30,6 +30,13 @@ static const struct dmi_system_id acp70_acpi_flag_override_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "HN7306EA"), }, }, + { + /* ASUS Zenbook S16 UM5606GA (Strix Point, ACP 7.0) */ + .matches = { + DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK COMPUTER INC."), + DMI_MATCH(DMI_PRODUCT_NAME, "Zenbook S16 UM5606GA"), + }, + }, {} }; diff --git a/sound/soc/amd/acp/acp-sdw-legacy-mach.c b/sound/soc/amd/acp/acp-sdw-legacy-mach.c index 0f21e5f64531a2..09b475c83c4966 100644 --- a/sound/soc/amd/acp/acp-sdw-legacy-mach.c +++ b/sound/soc/amd/acp/acp-sdw-legacy-mach.c @@ -260,9 +260,9 @@ static int create_sdw_dailink(struct snd_soc_card *card, cpus->dai_name = devm_kasprintf(dev, GFP_KERNEL, "SDW%d Pin%d", link_num, cpu_pin_id); - dev_dbg(dev, "cpu->dai_name:%s\n", cpus->dai_name); if (!cpus->dai_name) return -ENOMEM; + dev_dbg(dev, "cpu->dai_name:%s\n", cpus->dai_name); codec_maps[j].cpu = 0; codec_maps[j].codec = j; diff --git a/sound/soc/codecs/Kconfig b/sound/soc/codecs/Kconfig index 069ec05e4a6152..a7c61f7c7f4c5c 100644 --- a/sound/soc/codecs/Kconfig +++ b/sound/soc/codecs/Kconfig @@ -122,6 +122,7 @@ config SND_SOC_ALL_CODECS imply SND_SOC_ES8328_I2C imply SND_SOC_ES8375 imply SND_SOC_ES8389 + imply SND_SOC_ES9356 imply SND_SOC_ES7134 imply SND_SOC_ES7241 imply SND_SOC_FRAMER @@ -886,7 +887,7 @@ config SND_SOC_CS35L56_SPI config SND_SOC_CS35L56_SDW tristate "Cirrus Logic CS35L56 CODEC (SDW)" depends on SOUNDWIRE - select REGMAP + select REGMAP_SOUNDWIRE select SND_SOC_CS35L56 select SND_SOC_CS35L56_SHARED help @@ -900,7 +901,6 @@ menu "CS35L56 driver options" config SND_SOC_CS35L56_CAL_DEBUGFS bool "CS35L56 create debugfs for factory calibration" - default N depends on DEBUG_FS select SND_SOC_CS35L56_CAL_DEBUGFS_COMMON help @@ -911,7 +911,6 @@ config SND_SOC_CS35L56_CAL_DEBUGFS config SND_SOC_CS35L56_CAL_SET_CTRL bool "CS35L56 ALSA control to restore factory calibration" - default N select SND_SOC_CS35L56_CAL_DEBUGFS_COMMON help Allow restoring factory calibration data through an ALSA @@ -925,7 +924,6 @@ config SND_SOC_CS35L56_CAL_SET_CTRL config SND_SOC_CS35L56_CAL_PERFORM_CTRL bool "CS35L56 ALSA control to perform factory calibration" - default N select SND_SOC_CS35L56_CAL_DEBUGFS_COMMON help Allow performing factory calibration data through an ALSA @@ -1302,6 +1300,12 @@ config SND_SOC_ES8389 tristate "Everest Semi ES8389 CODEC" depends on I2C +config SND_SOC_ES9356 + tristate "Everest Semi ES9356 CODEC SDW" + depends on SOUNDWIRE + select REGMAP_SOUNDWIRE + select REGMAP_SOUNDWIRE_MBQ + config SND_SOC_FRAMER tristate "Framer codec" depends on GENERIC_FRAMER diff --git a/sound/soc/codecs/Makefile b/sound/soc/codecs/Makefile index 2c2d0553f35fbb..73315d017c57b4 100644 --- a/sound/soc/codecs/Makefile +++ b/sound/soc/codecs/Makefile @@ -138,6 +138,7 @@ snd-soc-es8328-i2c-y := es8328-i2c.o snd-soc-es8328-spi-y := es8328-spi.o snd-soc-es8375-y := es8375.o snd-soc-es8389-y := es8389.o +snd-soc-es9356-y := es9356.o snd-soc-framer-y := framer-codec.o snd-soc-fs-amp-lib-y := fs-amp-lib.o snd-soc-fs210x-y := fs210x.o @@ -577,6 +578,7 @@ obj-$(CONFIG_SND_SOC_ES8328_I2C)+= snd-soc-es8328-i2c.o obj-$(CONFIG_SND_SOC_ES8328_SPI)+= snd-soc-es8328-spi.o obj-$(CONFIG_SND_SOC_ES8375) += snd-soc-es8375.o obj-$(CONFIG_SND_SOC_ES8389) += snd-soc-es8389.o +obj-$(CONFIG_SND_SOC_ES9356) += snd-soc-es9356.o obj-$(CONFIG_SND_SOC_FRAMER) += snd-soc-framer.o obj-$(CONFIG_SND_SOC_FS_AMP_LIB)+= snd-soc-fs-amp-lib.o obj-$(CONFIG_SND_SOC_FS210X) += snd-soc-fs210x.o diff --git a/sound/soc/codecs/cs35l56-sdw.c b/sound/soc/codecs/cs35l56-sdw.c index 9dc47fec1ea048..d2b82a846ae8f8 100644 --- a/sound/soc/codecs/cs35l56-sdw.c +++ b/sound/soc/codecs/cs35l56-sdw.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include "cs35l56.h" @@ -59,8 +60,6 @@ static int cs35l56_sdw_slow_read(struct sdw_slave *peripheral, unsigned int reg, { int ret, i; - reg += CS35L56_SDW_ADDR_OFFSET; - for (i = 0; i < val_size; i += sizeof(u32)) { /* Poll for bus bridge idle */ ret = cs35l56_sdw_poll_mem_status(peripheral, @@ -97,57 +96,23 @@ static int cs35l56_sdw_slow_read(struct sdw_slave *peripheral, unsigned int reg, return 0; } -static int cs35l56_sdw_read_one(struct sdw_slave *peripheral, unsigned int reg, void *buf) -{ - int ret; - - ret = sdw_nread_no_pm(peripheral, reg, 4, (u8 *)buf); - if (ret != 0) { - dev_err(&peripheral->dev, "Read failed @%#x:%d\n", reg, ret); - return ret; - } - - swab32s((u32 *)buf); - - return 0; -} - static int cs35l56_sdw_read(void *context, const void *reg_buf, const size_t reg_size, void *val_buf, size_t val_size) { struct sdw_slave *peripheral = context; - u8 *buf8 = val_buf; - unsigned int reg, bytes; + struct cs35l56_private *cs35l56 = dev_get_drvdata(&peripheral->dev); + unsigned int reg_addr = get_unaligned_le32(reg_buf); int ret; - reg = le32_to_cpu(*(const __le32 *)reg_buf); - - if (cs35l56_is_otp_register(reg)) - return cs35l56_sdw_slow_read(peripheral, reg, buf8, val_size); + if (cs35l56_is_otp_register(reg_addr - CS35L56_SDW_ADDR_OFFSET)) + return cs35l56_sdw_slow_read(peripheral, reg_addr, (u8 *)val_buf, val_size); - reg += CS35L56_SDW_ADDR_OFFSET; - - if (val_size == 4) - return cs35l56_sdw_read_one(peripheral, reg, val_buf); - - while (val_size) { - bytes = SDW_REG_NO_PAGE - (reg & SDW_REGADDR); /* to end of page */ - if (bytes > val_size) - bytes = val_size; - - ret = sdw_nread_no_pm(peripheral, reg, bytes, buf8); - if (ret != 0) { - dev_err(&peripheral->dev, "Read failed @%#x..%#x:%d\n", - reg, reg + bytes - 1, ret); - return ret; - } + ret = regmap_raw_read(cs35l56->sdw_bus_regmap, reg_addr, val_buf, val_size); + if (ret) + return ret; - swab32_array((u32 *)buf8, bytes / 4); - val_size -= bytes; - reg += bytes; - buf8 += bytes; - } + swab32_array((u32 *)val_buf, val_size / sizeof(u32)); return 0; } @@ -161,58 +126,34 @@ static inline void cs35l56_swab_copy(void *dest, const void *src, size_t nbytes) *dest32++ = swab32(*src32++); } -static int cs35l56_sdw_write_one(struct sdw_slave *peripheral, unsigned int reg, const void *buf) -{ - u32 val_le = swab32(*(u32 *)buf); - int ret; - - ret = sdw_nwrite_no_pm(peripheral, reg, 4, (u8 *)&val_le); - if (ret != 0) { - dev_err(&peripheral->dev, "Write failed @%#x:%d\n", reg, ret); - return ret; - } - - return 0; -} - static int cs35l56_sdw_gather_write(void *context, const void *reg_buf, size_t reg_size, const void *val_buf, size_t val_size) { struct sdw_slave *peripheral = context; - const u8 *src_be = val_buf; - u32 val_le_buf[64]; /* Define u32 so it is 32-bit aligned */ - unsigned int reg, bytes; + struct cs35l56_private *cs35l56 = dev_get_drvdata(&peripheral->dev); + unsigned int reg_addr = get_unaligned_le32(reg_buf); + u32 swab_buf[64]; /* Define u32 so it is 32-bit aligned */ int ret; - reg = le32_to_cpu(*(const __le32 *)reg_buf); - reg += CS35L56_SDW_ADDR_OFFSET; - - if (val_size == 4) - return cs35l56_sdw_write_one(peripheral, reg, src_be); - - while (val_size) { - bytes = SDW_REG_NO_PAGE - (reg & SDW_REGADDR); /* to end of page */ - if (bytes > val_size) - bytes = val_size; - if (bytes > sizeof(val_le_buf)) - bytes = sizeof(val_le_buf); - - cs35l56_swab_copy(val_le_buf, src_be, bytes); - - ret = sdw_nwrite_no_pm(peripheral, reg, bytes, (u8 *)val_le_buf); - if (ret != 0) { - dev_err(&peripheral->dev, "Write failed @%#x..%#x:%d\n", - reg, reg + bytes - 1, ret); + while (val_size > sizeof(swab_buf)) { + cs35l56_swab_copy(swab_buf, val_buf, sizeof(swab_buf)); + ret = regmap_raw_write(cs35l56->sdw_bus_regmap, reg_addr, + swab_buf, sizeof(swab_buf)); + if (ret) return ret; - } - val_size -= bytes; - reg += bytes; - src_be += bytes; + val_size -= sizeof(swab_buf); + reg_addr += sizeof(swab_buf); + val_buf += sizeof(swab_buf); } - return 0; + if (val_size == 0) + return 0; + + cs35l56_swab_copy(swab_buf, val_buf, val_size); + + return regmap_raw_write(cs35l56->sdw_bus_regmap, reg_addr, swab_buf, val_size); } static int cs35l56_sdw_write(void *context, const void *val_buf, size_t val_size) @@ -231,7 +172,7 @@ static int cs35l56_sdw_write(void *context, const void *val_buf, size_t val_size * byte controls always have the same byte order, and firmware file blobs * can be written verbatim. */ -static const struct regmap_bus cs35l56_regmap_bus_sdw = { +static const struct regmap_bus cs35l56_regmap_swab_bus_sdw = { .read = cs35l56_sdw_read, .write = cs35l56_sdw_write, .gather_write = cs35l56_sdw_gather_write, @@ -239,6 +180,18 @@ static const struct regmap_bus cs35l56_regmap_bus_sdw = { .val_format_endian_default = REGMAP_ENDIAN_BIG, }; +/* Low-level SoundWire regmap to transfer the data over the bus */ +static const struct regmap_config cs35l56_sdw_bus_regmap = { + .name = "sdw-le32", + .reg_bits = 32, + .val_bits = 32, + .reg_stride = 4, + .reg_format_endian = REGMAP_ENDIAN_LITTLE, + .val_format_endian = REGMAP_ENDIAN_LITTLE, + .max_register = CS35L56_DSP1_PMEM_5114 + 0x8000, + .cache_type = REGCACHE_NONE, +}; + static int cs35l56_sdw_get_unique_id(struct cs35l56_private *cs35l56) { int ret; @@ -385,18 +338,19 @@ static int cs35l56_sdw_update_status(struct sdw_slave *peripheral, switch (status) { case SDW_SLAVE_ATTACHED: - dev_dbg(cs35l56->base.dev, "%s: ATTACHED\n", __func__); cs35l56->sdw_in_clock_stop_1 = false; if (cs35l56->sdw_attached) break; + dev_dbg(cs35l56->base.dev, "%s: ATTACHED\n", __func__); if (!cs35l56->base.init_done || cs35l56->soft_resetting) cs35l56_sdw_init(peripheral); cs35l56->sdw_attached = true; break; case SDW_SLAVE_UNATTACHED: - dev_dbg(cs35l56->base.dev, "%s: UNATTACHED\n", __func__); + if (cs35l56->sdw_attached) + dev_dbg(cs35l56->base.dev, "%s: UNATTACHED\n", __func__); cs35l56->sdw_attached = false; break; default: @@ -436,6 +390,7 @@ static const struct sdw_slave_ops cs35l56_sdw_ops = { static int __maybe_unused cs35l56_sdw_handle_unattach(struct cs35l56_private *cs35l56) { struct sdw_slave *peripheral = cs35l56->sdw_peripheral; + int ret; dev_dbg(cs35l56->base.dev, "attached:%u unattach_request:%u in_clock_stop_1:%u\n", cs35l56->sdw_attached, peripheral->unattach_request, cs35l56->sdw_in_clock_stop_1); @@ -443,13 +398,10 @@ static int __maybe_unused cs35l56_sdw_handle_unattach(struct cs35l56_private *cs if (cs35l56->sdw_in_clock_stop_1 || peripheral->unattach_request) { /* Cannot access registers until bus is re-initialized. */ dev_dbg(cs35l56->base.dev, "Wait for initialization_complete\n"); - if (!wait_for_completion_timeout(&peripheral->initialization_complete, - msecs_to_jiffies(5000))) { - dev_err(cs35l56->base.dev, "initialization_complete timed out\n"); - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(peripheral, 5000); + if (ret) + return ret; - peripheral->unattach_request = 0; cs35l56->sdw_in_clock_stop_1 = false; /* @@ -561,8 +513,16 @@ static int cs35l56_sdw_probe(struct sdw_slave *peripheral, const struct sdw_devi cs35l56->base.type = ((unsigned int)id->driver_data) & 0xff; - cs35l56->base.regmap = devm_regmap_init(dev, &cs35l56_regmap_bus_sdw, - peripheral, regmap_config); + /* Low-level regmap to transfer read/writes over SoundWire bus */ + cs35l56->sdw_bus_regmap = devm_regmap_init_sdw(peripheral, &cs35l56_sdw_bus_regmap); + if (IS_ERR(cs35l56->sdw_bus_regmap)) { + ret = PTR_ERR(cs35l56->sdw_bus_regmap); + return dev_err_probe(dev, ret, "Failed to allocate bus register map\n"); + } + + /* Wrapper regmap to simulate big-endian ordering */ + cs35l56->base.regmap = devm_regmap_init(dev, &cs35l56_regmap_swab_bus_sdw, + peripheral, regmap_config); if (IS_ERR(cs35l56->base.regmap)) { ret = PTR_ERR(cs35l56->base.regmap); return dev_err_probe(dev, ret, "Failed to allocate register map\n"); diff --git a/sound/soc/codecs/cs35l56-shared-test.c b/sound/soc/codecs/cs35l56-shared-test.c index cfe7938065f90e..8060a275d32bd4 100644 --- a/sound/soc/codecs/cs35l56-shared-test.c +++ b/sound/soc/codecs/cs35l56-shared-test.c @@ -29,6 +29,7 @@ struct cs35l56_shared_test_priv { struct faux_device *gpio_dev; struct cs35l56_shared_test_mock_gpio *gpio_priv; struct regmap *registers; + unsigned int reg_offset; struct cs35l56_base *cs35l56_base; u8 applied_pad_pull_state[CS35L56_MAX_GPIO]; }; @@ -194,6 +195,8 @@ static int cs35l56_shared_test_reg_read(void *context, unsigned int reg, unsigne { struct cs35l56_shared_test_priv *priv = context; + reg -= priv->reg_offset; + switch (reg) { case CS35L56_SYNC_GPIO1_CFG ... CS35L56_ASP2_DIO_GPIO13_CFG: case CS35L56_GPIO1_CTRL1 ... CS35L56_GPIO13_CTRL1: @@ -214,6 +217,8 @@ static int cs35l56_shared_test_reg_write(void *context, unsigned int reg, unsign { struct cs35l56_shared_test_priv *priv = context; + reg -= priv->reg_offset; + switch (reg) { case CS35L56_UPDATE_REGS: return cs35l56_shared_test_updt_gpio_pres(priv, reg, val); @@ -657,6 +662,7 @@ static int cs35l56_shared_test_case_base_init(struct kunit *test, u8 type, u8 re test->priv = priv; priv->test = test; + priv->reg_offset = regmap_config->reg_base; /* Create dummy amp driver dev */ priv->amp_dev = faux_device_create("cs35l56_shared_test_drv", NULL, NULL); diff --git a/sound/soc/codecs/cs35l56-shared.c b/sound/soc/codecs/cs35l56-shared.c index 795e2764d67ec8..8e3538e28fade3 100644 --- a/sound/soc/codecs/cs35l56-shared.c +++ b/sound/soc/codecs/cs35l56-shared.c @@ -1880,6 +1880,7 @@ EXPORT_SYMBOL_NS_GPL(cs35l56_regmap_spi, "SND_SOC_CS35L56_SHARED"); const struct regmap_config cs35l56_regmap_sdw = { .reg_bits = 32, + .reg_base = 0x8000, .val_bits = 32, .reg_stride = 4, .reg_format_endian = REGMAP_ENDIAN_LITTLE, @@ -1915,6 +1916,7 @@ const struct regmap_config cs35l63_regmap_sdw = { .reg_bits = 32, .val_bits = 32, .reg_stride = 4, + .reg_base = 0x8000, .reg_format_endian = REGMAP_ENDIAN_LITTLE, .val_format_endian = REGMAP_ENDIAN_BIG, .max_register = CS35L56_DSP1_PMEM_5114, diff --git a/sound/soc/codecs/cs35l56.c b/sound/soc/codecs/cs35l56.c index 849d70ca23d6f5..80158913a60e02 100644 --- a/sound/soc/codecs/cs35l56.c +++ b/sound/soc/codecs/cs35l56.c @@ -426,6 +426,8 @@ static unsigned int cs35l56_make_tdm_config_word(unsigned int reg_val, unsigned reg_val &= ~(0x3f << channel_shift); reg_val |= bit_num << channel_shift; channel_shift += 8; + if (channel_shift > 24) + break; } return reg_val; @@ -867,11 +869,16 @@ static void cs35l56_dsp_work(struct work_struct *work) if (!cs35l56->base.init_done) return; - pm_runtime_get_sync(cs35l56->base.dev); + PM_RUNTIME_ACQUIRE(cs35l56->base.dev, pm); + ret = PM_RUNTIME_ACQUIRE_ERR(&pm); + if (ret) { + dev_err(cs35l56->base.dev, "dsp_work failed to runtime-resume: %d\n", ret); + return; + } ret = cs35l56_read_prot_status(&cs35l56->base, &firmware_missing, &firmware_version); if (ret) - goto err; + return; /* Populate fw file qualifier with the revision and security state */ kfree(cs35l56->dsp.fwf_name); @@ -887,7 +894,7 @@ static void cs35l56_dsp_work(struct work_struct *work) } if (!cs35l56->dsp.fwf_name) - goto err; + return; dev_dbg(cs35l56->base.dev, "DSP fwf name: '%s' system name: '%s'\n", cs35l56->dsp.fwf_name, cs35l56->dsp.system_name); @@ -905,8 +912,6 @@ static void cs35l56_dsp_work(struct work_struct *work) cs35l56_patch(cs35l56, firmware_missing); cs35l56_log_tuning(&cs35l56->base, &cs35l56->dsp.cs_dsp); -err: - pm_runtime_put_autosuspend(cs35l56->base.dev); } static struct snd_soc_dapm_context *cs35l56_power_up_for_cal(struct cs35l56_private *cs35l56) diff --git a/sound/soc/codecs/cs35l56.h b/sound/soc/codecs/cs35l56.h index cd71b23b2a3ace..d029fa3f86565f 100644 --- a/sound/soc/codecs/cs35l56.h +++ b/sound/soc/codecs/cs35l56.h @@ -37,6 +37,7 @@ struct cs35l56_private { struct snd_soc_component *component; struct regulator_bulk_data supplies[CS35L56_NUM_BULK_SUPPLIES]; struct sdw_slave *sdw_peripheral; + struct regmap *sdw_bus_regmap; const char *fallback_fw_suffix; struct work_struct sdw_irq_work; bool sdw_irq_no_unmask; diff --git a/sound/soc/codecs/cs42l42-sdw.c b/sound/soc/codecs/cs42l42-sdw.c index d5999ad9ff9b48..b8256ce0b8fbee 100644 --- a/sound/soc/codecs/cs42l42-sdw.c +++ b/sound/soc/codecs/cs42l42-sdw.c @@ -433,19 +433,16 @@ static const struct reg_sequence cs42l42_soft_reboot_seq[] = { static int cs42l42_sdw_handle_unattach(struct cs42l42_private *cs42l42) { struct sdw_slave *peripheral = cs42l42->sdw_peripheral; + int ret; if (!peripheral->unattach_request) return 0; /* Cannot access registers until master re-attaches. */ dev_dbg(&peripheral->dev, "Wait for initialization_complete\n"); - if (!wait_for_completion_timeout(&peripheral->initialization_complete, - msecs_to_jiffies(5000))) { - dev_err(&peripheral->dev, "initialization_complete timed out\n"); - return -ETIMEDOUT; - } - - peripheral->unattach_request = 0; + ret = sdw_slave_wait_for_init(peripheral, 5000); + if (ret) + return ret; /* * After a bus reset there must be a reconfiguration reset to diff --git a/sound/soc/codecs/cs42l43-jack.c b/sound/soc/codecs/cs42l43-jack.c index 3e04e6897b1423..934666295ee307 100644 --- a/sound/soc/codecs/cs42l43-jack.c +++ b/sound/soc/codecs/cs42l43-jack.c @@ -805,7 +805,7 @@ irqreturn_t cs42l43_tip_sense(int irq, void *data) if (priv->suspend_jack_debounce) db_delay += priv->tip_fall_db_ms + priv->tip_rise_db_ms; - queue_delayed_work(system_long_wq, &priv->tip_sense_work, + queue_delayed_work(system_dfl_long_wq, &priv->tip_sense_work, msecs_to_jiffies(db_delay)); return IRQ_HANDLED; @@ -932,7 +932,8 @@ int cs42l43_jack_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *u snd_soc_jack_report(priv->jack_hp, 0, 0xFFFF); if (!override) { - queue_delayed_work(system_long_wq, &priv->tip_sense_work, 0); + queue_delayed_work(system_dfl_long_wq, &priv->tip_sense_work, + 0); } else { override--; diff --git a/sound/soc/codecs/es9356.c b/sound/soc/codecs/es9356.c new file mode 100644 index 00000000000000..78fddd9d01711d --- /dev/null +++ b/sound/soc/codecs/es9356.c @@ -0,0 +1,1151 @@ +// SPDX-License-Identifier: GPL-2.0-only +// +// es9356.c -- SoundWire codec driver +// +// Copyright(c) 2025 Everest Semiconductor Co., Ltd +// +// + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "es9356.h" + +struct es9356_sdw_priv { + struct sdw_slave *slave; + struct device *dev; + struct regmap *regmap; + struct snd_soc_component *component; + struct snd_soc_jack *hs_jack; + + /* lock for irq*/ + struct mutex disable_irq_lock; + + /* lock for pde*/ + struct mutex pde_lock; + + bool hw_init; + bool first_hw_init; + int jack_type; + bool disable_irq; + + struct delayed_work interrupt_handle_work; + struct delayed_work button_detect_work; + unsigned int sdca_status; +}; + +static int es9356_sdw_component_probe(struct snd_soc_component *component) +{ + struct es9356_sdw_priv *es9356 = snd_soc_component_get_drvdata(component); + + es9356->component = component; + + return 0; +} + +static const DECLARE_TLV_DB_SCALE(out_vol_tlv, -9600, 12, 0); +static const DECLARE_TLV_DB_SCALE(amic_gain_tlv, 0, 3, 0); +static const DECLARE_TLV_DB_SCALE(dmic_gain_tlv, 0, 6, 0); + +static const struct snd_kcontrol_new es9356_sdca_controls[] = { + SDCA_SINGLE_Q78_TLV("FU41 Left Playback Volume", + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU41, ES9356_SDCA_CTL_FU_VOLUME, CH_L), + ES9356_VOLUME_MIN, ES9356_VOLUME_MAX, ES9356_VOLUME_STEP, out_vol_tlv), + SDCA_SINGLE_Q78_TLV("FU41 Right Playback Volume", + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU41, ES9356_SDCA_CTL_FU_VOLUME, CH_R), + ES9356_VOLUME_MIN, ES9356_VOLUME_MAX, ES9356_VOLUME_STEP, out_vol_tlv), + SDCA_SINGLE_Q78_TLV("FU36 Left Capture Volume", + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU36, ES9356_SDCA_CTL_FU_VOLUME, CH_L), + ES9356_VOLUME_MIN, ES9356_VOLUME_MAX, ES9356_VOLUME_STEP, out_vol_tlv), + SDCA_SINGLE_Q78_TLV("FU36 Right Capture Volume", + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU36, ES9356_SDCA_CTL_FU_VOLUME, CH_R), + ES9356_VOLUME_MIN, ES9356_VOLUME_MAX, ES9356_VOLUME_STEP, out_vol_tlv), + SDCA_SINGLE_Q78_TLV("FU33 Capture Volume", + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU33, ES9356_SDCA_CTL_FU_CH_GAIN, 0), + ES9356_GAIN_MIN, ES9356_AMIC_GAIN_MAX, ES9356_AMIC_GAIN_STEP, amic_gain_tlv), + SDCA_SINGLE_Q78_TLV("FU21 Left Playback Volume", + SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT_FU21, ES9356_SDCA_CTL_FU_VOLUME, CH_L), + ES9356_VOLUME_MIN, ES9356_VOLUME_MAX, ES9356_VOLUME_STEP, out_vol_tlv), + SDCA_SINGLE_Q78_TLV("FU21 Right Playback Volume", + SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT_FU21, ES9356_SDCA_CTL_FU_VOLUME, CH_R), + ES9356_VOLUME_MIN, ES9356_VOLUME_MAX, ES9356_VOLUME_STEP, out_vol_tlv), + SDCA_SINGLE_Q78_TLV("FU113 Left Capture Volume", + SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_FU113, ES9356_SDCA_CTL_FU_VOLUME, CH_L), + ES9356_VOLUME_MIN, ES9356_VOLUME_MAX, ES9356_VOLUME_STEP, out_vol_tlv), + SDCA_SINGLE_Q78_TLV("FU113 Right Capture Volume", + SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_FU113, ES9356_SDCA_CTL_FU_VOLUME, CH_R), + ES9356_VOLUME_MIN, ES9356_VOLUME_MAX, ES9356_VOLUME_STEP, out_vol_tlv), + SDCA_SINGLE_Q78_TLV("FU11 Capture Volume", + SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_FU11, ES9356_SDCA_CTL_FU_CH_GAIN, 0), + ES9356_GAIN_MIN, ES9356_DMIC_GAIN_MAX, ES9356_DMIC_GAIN_STEP, dmic_gain_tlv), +}; + +static const char *const es9356_left_mux_txt[] = { + "Left", + "Right", +}; + +static const char *const es9356_right_mux_txt[] = { + "Right", + "Left", +}; + +static const struct soc_enum es9356_left_mux_enum = + SOC_ENUM_SINGLE(ES9356_DAC_SWAP, 1, + ARRAY_SIZE(es9356_left_mux_txt), es9356_left_mux_txt); +static const struct soc_enum es9356_right_mux_enum = + SOC_ENUM_SINGLE(ES9356_DAC_SWAP, 0, + ARRAY_SIZE(es9356_right_mux_txt), es9356_right_mux_txt); + +static const struct snd_kcontrol_new es9356_left_mux_controls = + SOC_DAPM_ENUM("Channel MUX", es9356_left_mux_enum); +static const struct snd_kcontrol_new es9356_right_mux_controls = + SOC_DAPM_ENUM("Channel MUX", es9356_right_mux_enum); + +static const struct snd_soc_dapm_widget es9356_dapm_widgets[] = { + SND_SOC_DAPM_OUTPUT("HP"), + SND_SOC_DAPM_OUTPUT("SPK"), + SND_SOC_DAPM_INPUT("MIC1"), + SND_SOC_DAPM_INPUT("PDM_DIN"), + + SND_SOC_DAPM_SUPPLY("DMIC Clock", ES9356_DMIC_GPIO, 1, 1, NULL, 0), + + SND_SOC_DAPM_AIF_IN("DP4RX", "DP4 Playback", 0, SND_SOC_NOPM, 0, 0), + SND_SOC_DAPM_AIF_IN("DP3RX", "DP3 Playback", 0, SND_SOC_NOPM, 0, 0), + SND_SOC_DAPM_AIF_OUT("DP1TX", "DP1 Capture", 0, SND_SOC_NOPM, 0, 0), + SND_SOC_DAPM_AIF_OUT("DP2TX", "DP2 Capture", 0, SND_SOC_NOPM, 0, 0), + + SND_SOC_DAPM_PGA("IF DP3RXL", SND_SOC_NOPM, 0, 0, NULL, 0), + SND_SOC_DAPM_PGA("IF DP3RXR", SND_SOC_NOPM, 0, 0, NULL, 0), + + SND_SOC_DAPM_MUX("Left Channel MUX", SND_SOC_NOPM, 0, 0, &es9356_left_mux_controls), + SND_SOC_DAPM_MUX("Right Channel MUX", SND_SOC_NOPM, 0, 0, &es9356_right_mux_controls), + + SND_SOC_DAPM_DAC("FU 21 Left", NULL, + SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT_FU21, ES9356_SDCA_CTL_FU_MUTE, CH_L), 0, 1), + SND_SOC_DAPM_DAC("FU 21 Right", NULL, + SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT_FU21, ES9356_SDCA_CTL_FU_MUTE, CH_R), 0, 1), + SND_SOC_DAPM_DAC("FU 41 Left", NULL, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU41, ES9356_SDCA_CTL_FU_MUTE, CH_L), 0, 1), + SND_SOC_DAPM_DAC("FU 41 Right", NULL, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU41, ES9356_SDCA_CTL_FU_MUTE, CH_R), 0, 1), + SND_SOC_DAPM_DAC("FU 113 Left", NULL, + SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_FU113, ES9356_SDCA_CTL_FU_MUTE, CH_L), 0, 1), + SND_SOC_DAPM_DAC("FU 113 Right", NULL, + SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_FU113, ES9356_SDCA_CTL_FU_MUTE, CH_R), 0, 1), + SND_SOC_DAPM_DAC("FU 36 Left", NULL, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU36, ES9356_SDCA_CTL_FU_MUTE, CH_L), 0, 1), + SND_SOC_DAPM_DAC("FU 36 Right", NULL, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU36, ES9356_SDCA_CTL_FU_MUTE, CH_R), 0, 1), +}; + +static const struct snd_soc_dapm_route es9356_audio_map[] = { + {"FU 36 Left", NULL, "MIC1"}, + {"FU 36 Right", NULL, "MIC1"}, + {"DP2TX", NULL, "FU 36 Left"}, + {"DP2TX", NULL, "FU 36 Right"}, + + {"PDM_DIN", NULL, "DMIC Clock"}, + {"FU 113 Left", NULL, "PDM_DIN"}, + {"FU 113 Right", NULL, "PDM_DIN"}, + {"DP1TX", NULL, "FU 113 Left"}, + {"DP1TX", NULL, "FU 113 Right"}, + + {"FU 41 Left", NULL, "DP4RX"}, + {"FU 41 Right", NULL, "DP4RX"}, + + {"IF DP3RXL", NULL, "DP3RX"}, + {"IF DP3RXR", NULL, "DP3RX"}, + + {"Left Channel MUX", "Left", "IF DP3RXL"}, + {"Left Channel MUX", "Right", "IF DP3RXR"}, + {"Right Channel MUX", "Left", "IF DP3RXL"}, + {"Right Channel MUX", "Right", "IF DP3RXR"}, + + {"FU 21 Left", NULL, "Left Channel MUX"}, + {"FU 21 Right", NULL, "Right Channel MUX"}, + + {"SPK", NULL, "FU 21 Left"}, + {"SPK", NULL, "FU 21 Right"}, + + {"HP", NULL, "FU 41 Left"}, + {"HP", NULL, "FU 41 Right"}, +}; + +static int es9356_set_jack_detect(struct snd_soc_component *component, + struct snd_soc_jack *hs_jack, void *data) +{ + struct es9356_sdw_priv *es9356 = snd_soc_component_get_drvdata(component); + int ret; + + es9356->hs_jack = hs_jack; + + /* we can only resume if the device was initialized at least once */ + if (!es9356->first_hw_init) + return 0; + + ret = pm_runtime_resume_and_get(component->dev); + if (ret < 0) { + if (ret != -EACCES) { + dev_err(component->dev, "%s: failed to resume %d\n", __func__, ret); + return ret; + } + /* pm_runtime not enabled yet */ + dev_info(component->dev, "%s: skipping jack init for now\n", __func__); + return 0; + } + + if (es9356->hs_jack) + sdw_write_no_pm(es9356->slave, SDW_SCP_SDCA_INTMASK1, + (SDW_SCP_SDCA_INTMASK_SDCA_7 | SDW_SCP_SDCA_INTMASK_SDCA_5 | SDW_SCP_SDCA_INTMASK_SDCA_1)); + + pm_runtime_mark_last_busy(component->dev); + pm_runtime_put_autosuspend(component->dev); + + return 0; +} +static const struct snd_soc_component_driver snd_soc_es9356_sdw_component = { + .probe = es9356_sdw_component_probe, + .controls = es9356_sdca_controls, + .num_controls = ARRAY_SIZE(es9356_sdca_controls), + .dapm_widgets = es9356_dapm_widgets, + .num_dapm_widgets = ARRAY_SIZE(es9356_dapm_widgets), + .dapm_routes = es9356_audio_map, + .num_dapm_routes = ARRAY_SIZE(es9356_audio_map), + .set_jack = es9356_set_jack_detect, + .endianness = 1, +}; + +static int es9356_sdw_set_sdw_stream(struct snd_soc_dai *dai, void *sdw_stream, + int direction) +{ + snd_soc_dai_dma_data_set(dai, direction, sdw_stream); + + return 0; +} + +static void es9356_sdw_shutdown(struct snd_pcm_substream *substream, + struct snd_soc_dai *dai) +{ + snd_soc_dai_set_dma_data(dai, substream, NULL); +} + +static int es9356_sdca_button(unsigned int *buffer) +{ + int cur_button = -1; + + if (*(buffer + 1) | *(buffer + 2)) + return -EINVAL; + switch (*buffer) { + case 0x00: + cur_button = 0; + break; + case 0x20: + cur_button = SND_JACK_BTN_4; + break; + case 0x10: + cur_button = SND_JACK_BTN_2; + break; + case 0x08: + cur_button = SND_JACK_BTN_1; + break; + case 0x02: + cur_button = SND_JACK_BTN_3; + break; + case 0x01: + cur_button = SND_JACK_BTN_0; + break; + default: + break; + } + + return cur_button; +} + +static int es9356_sdca_button_detect(struct es9356_sdw_priv *es9356) +{ + unsigned int btn_type = 0, offset, idx, val, owner; + unsigned int button[3]; + int ret; + + ret = regmap_read(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_HID, ES9356_SDCA_ENT_HID01, ES9356_SDCA_CTL_HIDTX_CURRENT_OWNER, 0), &owner); + if (ret < 0 || owner == 0x01) + return 0; + + ret = regmap_read(es9356->regmap, ES9356_BUF_ADDR_HID, &offset); + if (ret < 0) + goto button_det_end; + + for (idx = 0; idx < ARRAY_SIZE(button); idx++) { + ret = regmap_read(es9356->regmap, ES9356_BUF_ADDR_HID + offset + idx, &val); + if (ret < 0) + goto button_det_end; + button[idx] = val; + } + + btn_type = es9356_sdca_button(&button[0]); + +button_det_end: + if (owner == 0x00) + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_HID, ES9356_SDCA_ENT_HID01, ES9356_SDCA_CTL_HIDTX_CURRENT_OWNER, 0), 0x01); + + return btn_type; +} + +static int es9356_sdca_headset_detect(struct es9356_sdw_priv *es9356) +{ + unsigned int reg; + int ret; + + ret = regmap_read(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_GE35, ES9356_SDCA_CTL_DETECTED_MODE, 0), ®); + + if (ret < 0) + goto io_error; + + switch (reg) { + case 0x00: + es9356->jack_type = 0; + break; + case 0x03: + es9356->jack_type = SND_JACK_HEADPHONE; + break; + case 0x04: + es9356->jack_type = SND_JACK_HEADSET; + break; + default: + es9356->jack_type = 0; + return -1; + } + + if (reg) { + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_GE35, ES9356_SDCA_CTL_SELECTED_MODE, 0), reg); + regmap_write(es9356->regmap, ES9356_HP_DETECTTIME, 0x75); + } else { + regmap_write(es9356->regmap, ES9356_HP_DETECTTIME, 0xa4); + } + + return 0; + +io_error: + pr_err_ratelimited("IO error in %s, ret %d\n", __func__, ret); + return ret; +} + +static void es9356_interrupt_handler(struct work_struct *work) +{ + struct es9356_sdw_priv *es9356 = + container_of(work, struct es9356_sdw_priv, interrupt_handle_work.work); + int ret, btn_type = 0; + + if (!es9356->hs_jack) + return; + + if (!es9356->component->card || !es9356->component->card->instantiated) + return; + + /* Handling different types of interrupts based on the mask bit */ + if (es9356->sdca_status & SDW_SCP_SDCA_INT_SDCA_7) { + btn_type = es9356_sdca_button_detect(es9356); + if (btn_type < 0) + return; + } else { + ret = es9356_sdca_headset_detect(es9356); + if (ret < 0) + return; + } + + if (es9356->jack_type != SND_JACK_HEADSET) + btn_type = 0; + + snd_soc_jack_report(es9356->hs_jack, es9356->jack_type | btn_type, + SND_JACK_HEADSET | + SND_JACK_BTN_0 | SND_JACK_BTN_1 | + SND_JACK_BTN_2 | SND_JACK_BTN_3 | + SND_JACK_BTN_4); + + if (btn_type) { + snd_soc_jack_report(es9356->hs_jack, es9356->jack_type, + SND_JACK_HEADSET | + SND_JACK_BTN_0 | SND_JACK_BTN_1 | + SND_JACK_BTN_2 | SND_JACK_BTN_3 | + SND_JACK_BTN_4); + mod_delayed_work(system_power_efficient_wq, + &es9356->button_detect_work, msecs_to_jiffies(280)); + } +} + +static void es9356_button_detect_handler(struct work_struct *work) +{ + struct es9356_sdw_priv *es9356 = + container_of(work, struct es9356_sdw_priv, button_detect_work.work); + int ret, idx, btn_type = 0; + unsigned int reg, offset; + unsigned int button[3]; + + /* Check headset */ + ret = regmap_read(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_GE35, ES9356_SDCA_CTL_DETECTED_MODE, 0), ®); + + if (ret < 0) + goto io_error; + + if (reg == 0x04) { + ret = regmap_read(es9356->regmap, ES9356_BUF_ADDR_HID, &offset); + if (ret < 0) + goto io_error; + for (idx = 0; idx < ARRAY_SIZE(button); idx++) { + ret = regmap_read(es9356->regmap, ES9356_BUF_ADDR_HID + offset + idx, ®); + if (ret < 0) + goto io_error; + button[idx] = reg; + } + btn_type = es9356_sdca_button(&button[0]); + if (btn_type < 0) + return; + } + + snd_soc_jack_report(es9356->hs_jack, es9356->jack_type | btn_type, + SND_JACK_HEADSET | + SND_JACK_BTN_0 | SND_JACK_BTN_1 | + SND_JACK_BTN_2 | SND_JACK_BTN_3 | + SND_JACK_BTN_4); + + if (btn_type) { + snd_soc_jack_report(es9356->hs_jack, es9356->jack_type, + SND_JACK_HEADSET | + SND_JACK_BTN_0 | SND_JACK_BTN_1 | + SND_JACK_BTN_2 | SND_JACK_BTN_3 | + SND_JACK_BTN_4); + mod_delayed_work(system_power_efficient_wq, + &es9356->button_detect_work, msecs_to_jiffies(280)); + } + + return; +io_error: + pr_err_ratelimited("IO error in %s, ret %d\n", __func__, ret); +} + +static int es9356_pde_transition_delay(struct es9356_sdw_priv *es9356, unsigned char func, + unsigned char entity, unsigned char ps) +{ + unsigned int retries = 10, val; + + /* waiting for Actual PDE becomes to PS0/PS3 */ + while (retries) { + regmap_read(es9356->regmap, + SDW_SDCA_CTL(func, entity, ES9356_SDCA_CTL_ACTUAL_POWER_STATE, 0), &val); + if (val == ps) + return 1; + + usleep_range(1000, 1500); + retries--; + } + if (!retries) { + dev_dbg(&es9356->slave->dev, "%s PDE is NOT %s", __func__, ps?"PS3":"PS0"); + } + return 0; +} + +static int es9356_power_state(struct snd_soc_dai *dai, unsigned char ps, unsigned int *rate) +{ + struct snd_soc_component *component = dai->component; + struct es9356_sdw_priv *es9356 = snd_soc_component_get_drvdata(component); + unsigned char ps0 = 0x0, ps3 = 0x3; + unsigned char func, cs_entity, pde_entity; + int ret; + + switch (dai->id) { + case ES9356_DMIC: + func = FUNC_NUM_MIC; + cs_entity = ES9356_SDCA_ENT_CS113; + pde_entity = ES9356_SDCA_ENT_PDE11; + break; + case ES9356_AMP: + func = FUNC_NUM_AMP; + cs_entity = ES9356_SDCA_ENT_CS21; + pde_entity = ES9356_SDCA_ENT_PDE23; + break; + case ES9356_JACK_IN: + func = FUNC_NUM_UAJ; + cs_entity = ES9356_SDCA_ENT_CS36; + pde_entity = ES9356_SDCA_ENT_PDE34; + break; + case ES9356_JACK_OUT: + func = FUNC_NUM_UAJ; + cs_entity = ES9356_SDCA_ENT_CS41; + pde_entity = ES9356_SDCA_ENT_PDE47; + break; + default: + return -EINVAL; + } + + /* power state changes are not independent across functions */ + mutex_lock(&es9356->pde_lock); + ret = es9356_pde_transition_delay(es9356, func, pde_entity, ps?ps0:ps3); + if (ret) { + regmap_write(es9356->regmap, + SDW_SDCA_CTL(func, pde_entity, ES9356_SDCA_CTL_REQ_POWER_STATE, 0), ps?ps3:ps0); + es9356_pde_transition_delay(es9356, func, pde_entity, ps?ps3:ps0); + } else + dev_dbg(component->dev, "%s PDE is already %d\n", __func__, ps?ps0:ps3); + + mutex_unlock(&es9356->pde_lock); + + if (rate) + regmap_write(es9356->regmap, + SDW_SDCA_CTL(func, cs_entity, ES9356_SDCA_CTL_SAMPLE_FREQ_INDEX, 0), *rate); + + return 0; +} + +static int es9356_sdw_pcm_hw_params(struct snd_pcm_substream *substream, + struct snd_pcm_hw_params *params, + struct snd_soc_dai *dai) +{ + struct snd_soc_component *component = dai->component; + struct es9356_sdw_priv *es9356 = snd_soc_component_get_drvdata(component); + struct sdw_stream_config stream_config = {0}; + struct sdw_port_config port_config = {0}; + struct sdw_stream_runtime *sdw_stream = snd_soc_dai_get_dma_data(dai, substream); + unsigned char ps0 = 0x0; + unsigned int rate; + int ret; + + if (!sdw_stream) + return -EINVAL; + + if (!es9356->slave) + return -EINVAL; + + /* SoundWire specific configuration */ + snd_sdw_params_to_config(substream, params, &stream_config, &port_config); + + port_config.num = dai->id; + + ret = sdw_stream_add_slave(es9356->slave, &stream_config, + &port_config, 1, sdw_stream); + if (ret) { + dev_err(dai->dev, "Unable to configure port\n"); + return -EINVAL; + } + + switch (params_rate(params)) { + case 16000: + rate = ES9356_SDCA_RATE_16000HZ; + break; + case 44100: + rate = ES9356_SDCA_RATE_44100HZ; + break; + case 48000: + rate = ES9356_SDCA_RATE_48000HZ; + break; + case 96000: + rate = ES9356_SDCA_RATE_96000HZ; + break; + default: + dev_err(component->dev, "%s: Rate %d is not supported\n", + __func__, params_rate(params)); + return -EINVAL; + } + + ret = es9356_power_state(dai, ps0, &rate); + if (ret) { + dev_err(component->dev, "%s: Invalid dai id: %d\n", + __func__, dai->id); + return -EINVAL; + } + + return 0; +} + +static int es9356_sdw_pcm_hw_free(struct snd_pcm_substream *substream, + struct snd_soc_dai *dai) +{ + struct snd_soc_component *component = dai->component; + struct es9356_sdw_priv *es9356 = snd_soc_component_get_drvdata(component); + struct sdw_stream_runtime *sdw_stream = snd_soc_dai_get_dma_data(dai, substream); + unsigned char ps3 = 0x3; + int ret; + + if (!es9356->slave) + return -EINVAL; + + sdw_stream_remove_slave(es9356->slave, sdw_stream); + + ret = es9356_power_state(dai, ps3, NULL); + if (ret) { + dev_err(component->dev, "%s: Invalid dai id: %d\n", + __func__, dai->id); + return -EINVAL; + } + + return 0; +} + +static const struct snd_soc_dai_ops es9356_sdw_ops = { + .hw_params = es9356_sdw_pcm_hw_params, + .hw_free = es9356_sdw_pcm_hw_free, + .set_stream = es9356_sdw_set_sdw_stream, + .shutdown = es9356_sdw_shutdown, +}; + +static struct snd_soc_dai_driver es9356_sdw_dai[] = { + { + .name = "es9356-sdp-aif4", + .id = ES9356_DMIC, + .capture = { + .stream_name = "DP1 Capture", + .channels_min = 1, + .channels_max = 2, + }, + .ops = &es9356_sdw_ops, + }, + { + .name = "es9356-sdp-aif2", + .id = ES9356_JACK_IN, + .capture = { + .stream_name = "DP2 Capture", + .channels_min = 1, + .channels_max = 2, + }, + .ops = &es9356_sdw_ops, + }, + { + .name = "es9356-sdp-aif3", + .id = ES9356_AMP, + .playback = { + .stream_name = "DP3 Playback", + .channels_min = 1, + .channels_max = 2, + }, + .ops = &es9356_sdw_ops, + }, + { + .name = "es9356-sdp-aif1", + .id = ES9356_JACK_OUT, + .playback = { + .stream_name = "DP4 Playback", + .channels_min = 1, + .channels_max = 2, + }, + .ops = &es9356_sdw_ops, + }, +}; + +static int es9356_sdca_mbq_size(struct device *dev, unsigned int reg) +{ + switch (reg) { + case SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_FU113, ES9356_SDCA_CTL_FU_VOLUME, CH_L): + case SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_FU113, ES9356_SDCA_CTL_FU_VOLUME, CH_R): + case SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT_FU21, ES9356_SDCA_CTL_FU_VOLUME, CH_L): + case SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT_FU21, ES9356_SDCA_CTL_FU_VOLUME, CH_R): + case SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU41, ES9356_SDCA_CTL_FU_VOLUME, CH_L): + case SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU41, ES9356_SDCA_CTL_FU_VOLUME, CH_R): + case SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU36, ES9356_SDCA_CTL_FU_VOLUME, CH_L): + case SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU36, ES9356_SDCA_CTL_FU_VOLUME, CH_R): + case SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU33, ES9356_SDCA_CTL_FU_CH_GAIN, 0): + case SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_FU11, ES9356_SDCA_CTL_FU_CH_GAIN, 0): + return 2; + default: + return 1; + } +} + +static struct regmap_sdw_mbq_cfg es9356_mbq_config = { + .mbq_size = es9356_sdca_mbq_size, +}; + +static bool es9356_sdca_volatile_register(struct device *dev, unsigned int reg) +{ + switch (reg) { + case ES9356_BUF_ADDR_HID: + case ES9356_HID_BYTE2: + case ES9356_HID_BYTE3: + case ES9356_HID_BYTE4: + case SDW_SDCA_CTL(FUNC_NUM_HID, ES9356_SDCA_ENT_HID01, ES9356_SDCA_CTL_HIDTX_CURRENT_OWNER, 0): + case SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_GE35, ES9356_SDCA_CTL_DETECTED_MODE, 0): + case SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_GE35, ES9356_SDCA_CTL_SELECTED_MODE, 0): + case SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT_PDE23, ES9356_SDCA_CTL_REQ_POWER_STATE, 0): + case SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT_PDE23, ES9356_SDCA_CTL_ACTUAL_POWER_STATE, 0): + case SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_PDE11, ES9356_SDCA_CTL_REQ_POWER_STATE, 0): + case SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_PDE11, ES9356_SDCA_CTL_ACTUAL_POWER_STATE, 0): + case SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_PDE47, ES9356_SDCA_CTL_REQ_POWER_STATE, 0): + case SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_PDE47, ES9356_SDCA_CTL_ACTUAL_POWER_STATE, 0): + case SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_PDE34, ES9356_SDCA_CTL_REQ_POWER_STATE, 0): + case SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_PDE34, ES9356_SDCA_CTL_ACTUAL_POWER_STATE, 0): + case ES9356_FLAGS_HP: + return true; + default: + return false; + } +} + +static const struct regmap_config es9356_sdca_regmap = { + .reg_bits = 32, + .val_bits = 16, + .volatile_reg = es9356_sdca_volatile_register, + .max_register = 0x45ffffff, + .cache_type = REGCACHE_MAPLE, + .use_single_read = true, + .use_single_write = true, +}; + +static void es9356_register_init(struct es9356_sdw_priv *es9356) +{ + regmap_write(es9356->regmap, ES9356_STATE, 0x02); + regmap_write(es9356->regmap, ES9356_ENDPOINT_MODE, 0x24); + regmap_write(es9356->regmap, ES9356_PRE_DIV_CTL, 0x00); + regmap_write(es9356->regmap, ES9356_ADC_OSR, 0x18); + regmap_write(es9356->regmap, ES9356_ADC_OSRGAIN, 0x13); + regmap_write(es9356->regmap, ES9356_DAC_OSR, 0x16); + regmap_write(es9356->regmap, ES9356_CLK_CTL, 0x0f); + regmap_write(es9356->regmap, ES9356_CSM_RESET, 0x01); + regmap_write(es9356->regmap, ES9356_CLK_SEL, 0x30); + + regmap_write(es9356->regmap, ES9356_DETCLK_CTL, 0x51); + regmap_write(es9356->regmap, ES9356_HP_TYPE, 0x10); + regmap_write(es9356->regmap, ES9356_MICBIAS_CTL, 0x10); + regmap_write(es9356->regmap, ES9356_HPDETECT_CTL, 0x07); + regmap_write(es9356->regmap, ES9356_ADC_ANA, 0x30); + regmap_write(es9356->regmap, ES9356_PGA_CTL, 0xa8); + regmap_write(es9356->regmap, ES9356_ADC_INT, 0xaa); + regmap_write(es9356->regmap, ES9356_ADC_LP, 0x19); + regmap_write(es9356->regmap, ES9356_VMID1SEL, 0xbc); + regmap_write(es9356->regmap, ES9356_VMID_TIME, 0x0b); + regmap_write(es9356->regmap, ES9356_STATE_TIME, 0xbb); + regmap_write(es9356->regmap, ES9356_HP_SPK_TIME, 0x77); + regmap_write(es9356->regmap, ES9356_HP_DETECTTIME, 0xa4); + regmap_write(es9356->regmap, ES9356_MICBIAS_SEL, 0x15); + regmap_write(es9356->regmap, ES9356_KEY_PRESS_TIME, 0xff); + regmap_write(es9356->regmap, ES9356_KEY_RELEASE_TIME, 0xff); + regmap_write(es9356->regmap, ES9356_KEY_HOLD_TIME, 0x0f); + regmap_write(es9356->regmap, ES9356_BTSEL_REF, 0x00); + regmap_write(es9356->regmap, ES9356_KEYD_DETECT, 0x18); + regmap_write(es9356->regmap, ES9356_MICBIAS_RES, 0x03); + regmap_write(es9356->regmap, ES9356_BUTTON_CHARGE, 0x00); + regmap_write(es9356->regmap, ES9356_CALIBRATION_TIME, 0x13); + regmap_write(es9356->regmap, ES9356_CALIBRATION_SETTING, 0xf4); + + regmap_write(es9356->regmap, ES9356_SPK_VOLUME, 0x33); + regmap_write(es9356->regmap, ES9356_DAC_VROI, 0x01); + regmap_write(es9356->regmap, ES9356_DAC_LP, 0x00); + regmap_write(es9356->regmap, ES9356_HP_IBIAS, 0x04); + regmap_write(es9356->regmap, ES9356_HP_LP, 0x03); + regmap_write(es9356->regmap, ES9356_SPKLDO_CTL, 0x65); + regmap_write(es9356->regmap, ES9356_SPKBIAS_COMP, 0x09); + regmap_write(es9356->regmap, ES9356_VMID1STL, 0x00); + regmap_write(es9356->regmap, ES9356_VMID2STL, 0x00); + regmap_write(es9356->regmap, ES9356_VSEL, 0xfc); + + regmap_write(es9356->regmap, ES9356_IBIASGEN, 0x10); + regmap_write(es9356->regmap, ES9356_ADC_AMIC_CTL, 0x0d); + regmap_write(es9356->regmap, ES9356_STATE, 0x0e); + regmap_write(es9356->regmap, ES9356_CSM_RESET, 0x00); + regmap_write(es9356->regmap, ES9356_HP_TYPE, 0x08); + + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_FU113, ES9356_SDCA_CTL_FU_VOLUME, CH_L), ES9356_DEFAULT_VOLUME); + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_FU113, ES9356_SDCA_CTL_FU_VOLUME, CH_R), ES9356_DEFAULT_VOLUME); + + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT_FU21, ES9356_SDCA_CTL_FU_VOLUME, CH_L), ES9356_DEFAULT_VOLUME); + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT_FU21, ES9356_SDCA_CTL_FU_VOLUME, CH_R), ES9356_DEFAULT_VOLUME); + + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU41, ES9356_SDCA_CTL_FU_VOLUME, CH_L), ES9356_DEFAULT_VOLUME); + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU41, ES9356_SDCA_CTL_FU_VOLUME, CH_R), ES9356_DEFAULT_VOLUME); + + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU36, ES9356_SDCA_CTL_FU_VOLUME, CH_L), ES9356_DEFAULT_VOLUME); + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_FU36, ES9356_SDCA_CTL_FU_VOLUME, CH_R), ES9356_DEFAULT_VOLUME); +} + +static int es9356_sdca_io_init(struct device *dev, struct sdw_slave *slave) +{ + struct es9356_sdw_priv *es9356 = dev_get_drvdata(&slave->dev); + + if (es9356->hw_init) + return 0; + + es9356->disable_irq = false; + + regcache_cache_only(es9356->regmap, false); + + if (es9356->first_hw_init) { + regcache_cache_bypass(es9356->regmap, true); + } else { + /* update count of parent 'active' children */ + pm_runtime_set_active(&slave->dev); + + es9356_register_init(es9356); + } + pm_runtime_get_noresume(&slave->dev); + + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT_XU12, ES9356_SDCA_CTL_SELECTED_MODE, 0), 0x01); + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_MIC, ES9356_SDCA_ENT0, ES9356_SDCA_CTL_FUNC_STATUS, 0), 0x40); + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT_XU22, ES9356_SDCA_CTL_SELECTED_MODE, 0), 0x01); + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_AMP, ES9356_SDCA_ENT0, ES9356_SDCA_CTL_FUNC_STATUS, 0), 0x40); + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_XU42, ES9356_SDCA_CTL_SELECTED_MODE, 0), 0x01); + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT_XU36, ES9356_SDCA_CTL_SELECTED_MODE, 0), 0x01); + regmap_write(es9356->regmap, + SDW_SDCA_CTL(FUNC_NUM_UAJ, ES9356_SDCA_ENT0, ES9356_SDCA_CTL_FUNC_STATUS, 0), 0x40); + + if (es9356->first_hw_init) { + regcache_cache_bypass(es9356->regmap, false); + regcache_mark_dirty(es9356->regmap); + } else + es9356->first_hw_init = true; + + es9356->hw_init = true; + + pm_runtime_mark_last_busy(&slave->dev); + pm_runtime_put_autosuspend(&slave->dev); + + return 0; +} + +static int es9356_sdw_update_status(struct sdw_slave *slave, + enum sdw_slave_status status) +{ + struct es9356_sdw_priv *es9356 = dev_get_drvdata(&slave->dev); + + if (status == SDW_SLAVE_UNATTACHED) { + es9356->hw_init = false; + cancel_delayed_work_sync(&es9356->interrupt_handle_work); + cancel_delayed_work_sync(&es9356->button_detect_work); + regcache_cache_only(es9356->regmap, true); + } + + if (status == SDW_SLAVE_ATTACHED) { + if (es9356->hs_jack) + sdw_write_no_pm(es9356->slave, SDW_SCP_SDCA_INTMASK1, + (SDW_SCP_SDCA_INTMASK_SDCA_7 | SDW_SCP_SDCA_INTMASK_SDCA_5 | SDW_SCP_SDCA_INTMASK_SDCA_1)); + } + + if (es9356->hw_init || status != SDW_SLAVE_ATTACHED) + return 0; + + return es9356_sdca_io_init(&slave->dev, slave); +} + +static int es9356_sdw_read_prop(struct sdw_slave *slave) +{ + struct sdw_slave_prop *prop = &slave->prop; + int nval; + int i, j; + u32 bit; + unsigned long addr; + struct sdw_dpn_prop *dpn; + + prop->paging_support = true; + + /* + * first we need to allocate memory for set bits in port lists + * the port allocation is completely arbitrary: + * DP0 is not supported + * DP3 and DP4 is sink + * DP1 and DP2 is source + */ + prop->source_ports = BIT(1) | BIT(2); + prop->sink_ports = BIT(3) | BIT(4); + + nval = hweight32(prop->source_ports); + prop->src_dpn_prop = devm_kcalloc(&slave->dev, nval, + sizeof(*prop->src_dpn_prop), + GFP_KERNEL); + if (!prop->src_dpn_prop) + return -ENOMEM; + + i = 0; + dpn = prop->src_dpn_prop; + addr = prop->source_ports; + for_each_set_bit(bit, &addr, 32) { + dpn[i].num = bit; + dpn[i].type = SDW_DPN_FULL; + dpn[i].simple_ch_prep_sm = true; + i++; + } + + /* do this again for sink now */ + nval = hweight32(prop->sink_ports); + prop->sink_dpn_prop = devm_kcalloc(&slave->dev, nval, + sizeof(*prop->sink_dpn_prop), + GFP_KERNEL); + if (!prop->sink_dpn_prop) + return -ENOMEM; + + j = 0; + dpn = prop->sink_dpn_prop; + addr = prop->sink_ports; + for_each_set_bit(bit, &addr, 32) { + dpn[j].num = bit; + dpn[j].type = SDW_DPN_FULL; + dpn[j].simple_ch_prep_sm = true; + j++; + } + + /* wake-up event */ + prop->wake_capable = 1; + + return 0; +} + +static int es9356_sdw_interrupt_callback(struct sdw_slave *slave, + struct sdw_slave_intr_status *status) +{ + struct es9356_sdw_priv *es9356 = dev_get_drvdata(&slave->dev); + unsigned int sdca_cascade, scp_sdca_stat1 = 0; + int count = 0, retry = 3; + int ret, stat, reg; + + mutex_lock(&es9356->disable_irq_lock); + + ret = sdw_read_no_pm(es9356->slave, SDW_SCP_SDCA_INT1); + if (ret < 0) + goto io_error; + es9356->sdca_status = ret; + + do { + reg = sdw_read_no_pm(es9356->slave, SDW_SCP_SDCA_INT1); + if (reg < 0) + goto io_error; + if (reg & SDW_SCP_SDCA_INTMASK_SDCA_1) { + ret = sdw_update_no_pm(es9356->slave, SDW_SCP_SDCA_INT1, + SDW_SCP_SDCA_INT_SDCA_1, SDW_SCP_SDCA_INT_SDCA_1); + if (ret < 0) + goto io_error; + } + + if (reg & SDW_SCP_SDCA_INTMASK_SDCA_5) { + ret = sdw_update_no_pm(es9356->slave, SDW_SCP_SDCA_INT1, + SDW_SCP_SDCA_INT_SDCA_5, SDW_SCP_SDCA_INT_SDCA_5); + if (ret < 0) + goto io_error; + } + + if (reg & SDW_SCP_SDCA_INTMASK_SDCA_7) { + ret = sdw_update_no_pm(es9356->slave, SDW_SCP_SDCA_INT1, + SDW_SCP_SDCA_INT_SDCA_7, SDW_SCP_SDCA_INT_SDCA_7); + if (ret < 0) + goto io_error; + } + + ret = sdw_read_no_pm(es9356->slave, SDW_DP0_INT); + if (ret < 0) + goto io_error; + sdca_cascade = ret & SDW_DP0_SDCA_CASCADE; + + ret = sdw_read_no_pm(es9356->slave, SDW_SCP_SDCA_INT1); + if (ret < 0) + goto io_error; + scp_sdca_stat1 = ret & + (SDW_SCP_SDCA_INTMASK_SDCA_1 | SDW_SCP_SDCA_INTMASK_SDCA_5 | SDW_SCP_SDCA_INTMASK_SDCA_7); + + stat = scp_sdca_stat1 || sdca_cascade; + + count++; + } while (stat != 0 && count < retry); + + /* The 280 ms figure was determined through testing */ + if (status->sdca_cascade && !es9356->disable_irq) + mod_delayed_work(system_power_efficient_wq, + &es9356->interrupt_handle_work, msecs_to_jiffies(280)); + + mutex_unlock(&es9356->disable_irq_lock); + return 0; + +io_error: + mutex_unlock(&es9356->disable_irq_lock); + pr_err_ratelimited("IO error in %s, ret %d\n", __func__, ret); + return ret; +} + +static const struct sdw_slave_ops es9356_sdw_slave_ops = { + .read_prop = es9356_sdw_read_prop, + .interrupt_callback = es9356_sdw_interrupt_callback, + .update_status = es9356_sdw_update_status, +}; + +static int es9356_sdca_init(struct device *dev, struct regmap *regmap, struct sdw_slave *slave) +{ + struct es9356_sdw_priv *es9356; + int ret; + + es9356 = devm_kzalloc(dev, sizeof(*es9356), GFP_KERNEL); + if (!es9356) + return -ENOMEM; + + dev_set_drvdata(dev, es9356); + + es9356->slave = slave; + es9356->regmap = regmap; + mutex_init(&es9356->disable_irq_lock); + mutex_init(&es9356->pde_lock); + + regcache_cache_only(es9356->regmap, true); + + es9356->hw_init = false; + es9356->first_hw_init = false; + + INIT_DELAYED_WORK(&es9356->interrupt_handle_work, + es9356_interrupt_handler); + INIT_DELAYED_WORK(&es9356->button_detect_work, + es9356_button_detect_handler); + + ret = devm_snd_soc_register_component(dev, + &snd_soc_es9356_sdw_component, + es9356_sdw_dai, + ARRAY_SIZE(es9356_sdw_dai)); + if (ret) { + dev_err_probe(dev, ret, "Failed to register component\n"); + return ret; + } + /* set autosuspend parameters */ + pm_runtime_set_autosuspend_delay(dev, 3000); + pm_runtime_use_autosuspend(dev); + /* make sure the device does not suspend immediately */ + pm_runtime_mark_last_busy(dev); + pm_runtime_enable(dev); + + return 0; +} + +static int es9356_sdw_probe(struct sdw_slave *slave, + const struct sdw_device_id *id) +{ + struct regmap *regmap; + + regmap = devm_regmap_init_sdw_mbq_cfg(&slave->dev, slave, &es9356_sdca_regmap, &es9356_mbq_config); + if (IS_ERR(regmap)) + return PTR_ERR(regmap); + + return es9356_sdca_init(&slave->dev, regmap, slave); +} + +static void es9356_sdw_remove(struct sdw_slave *slave) +{ + struct es9356_sdw_priv *es9356 = dev_get_drvdata(&slave->dev); + + if (es9356->hw_init) { + cancel_delayed_work_sync(&es9356->interrupt_handle_work); + cancel_delayed_work_sync(&es9356->button_detect_work); + } + + if (es9356->first_hw_init) + pm_runtime_disable(&slave->dev); + + mutex_destroy(&es9356->disable_irq_lock); + mutex_destroy(&es9356->pde_lock); +} + +static const struct sdw_device_id es9356_sdw_id[] = { + SDW_SLAVE_ENTRY_EXT(0x04b3, 0x9356, 0x02, 0, 0), + SDW_SLAVE_ENTRY_EXT(0x04b3, 0x9356, 0x03, 0, 0), + {}, +}; +MODULE_DEVICE_TABLE(sdw, es9356_sdw_id); + +static int es9356_sdca_dev_suspend(struct device *dev) +{ + struct es9356_sdw_priv *es9356 = dev_get_drvdata(dev); + + cancel_delayed_work_sync(&es9356->interrupt_handle_work); + cancel_delayed_work_sync(&es9356->button_detect_work); + + regcache_cache_only(es9356->regmap, true); + + return 0; +} + +static int es9356_sdca_dev_system_suspend(struct device *dev) +{ + struct es9356_sdw_priv *es9356 = dev_get_drvdata(dev); + + mutex_lock(&es9356->disable_irq_lock); + es9356->disable_irq = true; + mutex_unlock(&es9356->disable_irq_lock); + + return es9356_sdca_dev_suspend(dev); +} + +#define es9356_PROBE_TIMEOUT 2000 + +static int es9356_sdca_dev_resume(struct device *dev) +{ + struct sdw_slave *slave = dev_to_sdw_dev(dev); + struct es9356_sdw_priv *es9356 = dev_get_drvdata(dev); + unsigned long time; + + if (!slave->unattach_request) { + es9356->disable_irq = false; + goto regmap_sync; + } + + time = wait_for_completion_timeout(&slave->initialization_complete, + msecs_to_jiffies(es9356_PROBE_TIMEOUT)); + if (!time) { + dev_err(&slave->dev, "Initialization not complete, timed out\n"); + sdw_show_ping_status(slave->bus, true); + + return -ETIMEDOUT; + } + +regmap_sync: + slave->unattach_request = 0; + regcache_cache_only(es9356->regmap, false); + regcache_sync(es9356->regmap); + return 0; +} + +static const struct dev_pm_ops es9356_sdca_pm = { + SYSTEM_SLEEP_PM_OPS(es9356_sdca_dev_system_suspend, es9356_sdca_dev_resume) + RUNTIME_PM_OPS(es9356_sdca_dev_suspend, es9356_sdca_dev_resume, NULL) +}; + +static struct sdw_driver es9356_sdw_driver = { + .driver = { + .name = "es9356", + .pm = pm_ptr(&es9356_sdca_pm), + }, + .probe = es9356_sdw_probe, + .remove = es9356_sdw_remove, + .ops = &es9356_sdw_slave_ops, + .id_table = es9356_sdw_id, +}; +module_sdw_driver(es9356_sdw_driver); + +MODULE_IMPORT_NS("SND_SOC_SDCA"); +MODULE_DESCRIPTION("ASoC ES9356 SDCA SDW codec driver"); +MODULE_AUTHOR("Michael Zhang "); +MODULE_LICENSE("GPL"); diff --git a/sound/soc/codecs/es9356.h b/sound/soc/codecs/es9356.h new file mode 100644 index 00000000000000..2a676f36590f6d --- /dev/null +++ b/sound/soc/codecs/es9356.h @@ -0,0 +1,208 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#ifndef __ES9356_H__ +#define __ES9356_H__ + +/*ES9356 Implementation-define*/ +#define ES9356_FLAGS_HP 0x2003 +#define ES9356_CSM_RESET 0x2020 +#define ES9356_FUC_RESET 0x2021 +#define ES9356_STATE 0x2022 +#define ES9356_VMID_TIME 0x2023 +#define ES9356_STATE_TIME 0x2024 +#define ES9356_HP_SPK_TIME 0x2025 +#define ES9356_WP_ENABLE 0x2026 +#define ES9356_DMIC_GPIO 0x2027 +#define ES9356_ENDPOINT_MODE 0x2028 + +/*HP DETECT*/ +#define ES9356_HP_TYPE 0x2029 +#define ES9356_HP_DETECTTIME 0x202A +#define ES9356_MICBIAS_SEL 0x202B +#define ES9356_KEY_PRESS_TIME 0x202C +#define ES9356_KEY_RELEASE_TIME 0x202D +#define ES9356_KEY_HOLD_TIME 0x202E +#define ES9356_BTSEL_REF 0x202F +#define ES9356_BUTTON_CHARGE 0x2030 + +#define ES9356_KEYD_DETECT 0x2031 +#define ES9356_DPEN_TIME 0x2032 +#define ES9356_TIMER_CHECK 0x2033 +#define ES9356_IBIASGEN 0x2041 +#define ES9356_VMID1SEL 0x2042 +#define ES9356_VMID1STL 0x2043 +#define ES9356_VMID2SEL 0x2044 +#define ES9356_VMID2STL 0x2045 +#define ES9356_VSEL 0x2046 +#define ES9356_MICBIAS_CTL 0x2047 +#define ES9356_HPDETECT_CTL 0x2048 +#define ES9356_MICBIAS_RES 0x2049 + +/*CLK*/ +#define ES9356_CLK_SEL 0x2050 +#define ES9356_CLK_CTL 0x2051 +#define ES9356_DETCLK_CTL 0x2052 +#define ES9356_CPCLK_CTL 0x2053 +#define ES9356_SPKCLK_CTL 0x2054 +#define ES9356_PRE_DIV_CTL 0x2055 +#define ES9356_DLL_MODE 0x2056 +#define ES9356_ANACLK_SEL 0x2057 +#define ES9356_OSRCLK_SEL 0x2058 +#define ES9356_DSPCLK_SEL 0x2059 +#define ES9356_SPK9M_MODE 0x205a + +/*ADC DIG CTL*/ +#define ES9356_DMIC_POL 0x2061 +#define ES9356_ADC_SWAP 0x2062 +#define ES9356_ADC_OSR 0x2063 +#define ES9356_ADC_OSRGAIN 0x2064 +#define ES9356_ADC_CLEARRAM 0x2065 +#define ES9356_ADC_RAMP 0x2066 +#define ES9356_ADC_HPF1 0x2067 +#define ES9356_ADC_HPF2 0x2068 +#define ES9356_ADC_ALC 0x206C +#define ES9356_ALC_LEVEL 0x206D +#define ES9356_ALC_RAMP_WINSIZE 0x206E + +/*ADC ANA CTL*/ +#define ES9356_ADC_REF_EN 0x2080 +#define ES9356_ADC_AMIC_CTL 0x2081 +#define ES9356_ADC_ANA 0x2082 +#define ES9356_PGA_CTL 0x2083 +#define ES9356_ADC_INT 0x2084 +#define ES9356_ADC_VCM 0x2085 +#define ES9356_ADC_VRPBIAS 0x2086 +#define ES9356_ADC_LP 0x2087 + +/*DAC DIG CTL*/ +#define ES9356_DAC_FSMODE 0x2090 +#define ES9356_DAC_OSR 0x2091 +#define ES9356_DAC_INV 0x2092 +#define ES9356_DAC_RAMP 0x2093 +#define ES9356_DAC_VPPSCALE 0x2094 +#define ES9356_DAC_SWAP 0x2097 +#define ES9356_SPKCMP_VPPSC 0x20A0 +#define ES9356_CALIBRATION_TIME 0x20A1 +#define ES9356_CALIBRATION_SETTING 0x20A2 +#define ES9356_DAC_OFFSET_LH 0x20A3 +#define ES9356_DAC_OFFSET_LL 0x20A4 +#define ES9356_DAC_OFFSET_RH 0x20A5 +#define ES9356_DAC_OFFSET_RL 0x20A6 + +/*DAC ANA CTL*/ +#define ES9356_DAC_REF_EN 0x20B0 +#define ES9356_DAC_ENABLE 0x20B1 +#define ES9356_DAC_VROI 0x20B2 +#define ES9356_DAC_LP 0x20B3 + +/*HP CTL*/ +#define ES9356_CHARGEPUMP_CTL 0x20C0 +#define ES9356_CPLDO_CTL 0x20C1 +#define ES9356_HP_REF_CTL 0x20C2 +#define ES9356_HP_IBIAS 0x20C3 +#define ES9356_HP_EN 0x20C4 +#define ES9356_HP_VOLUME 0x20C5 +#define ES9356_HP_LP 0x20C6 + +/*SPK CTL*/ +#define ES9356_SPKLDO_CTL 0x20D0 +#define ES9356_CLASSD_CTL 0x20D1 +#define ES9356_SPK_HBDG 0x20D5 +#define ES9356_SPK_VOLUME 0x20D7 +#define ES9356_SPK_SCP 0x20D8 +#define ES9356_SPK_DT 0x20D9 +#define ES9356_SPK_OTP 0x20DA +#define ES9356_SPKBIAS_COMP 0x20DB + +/* ES9356 SDCA Control - function number */ +#define FUNC_NUM_UAJ 0x01 +#define FUNC_NUM_MIC 0x02 +#define FUNC_NUM_AMP 0x03 +#define FUNC_NUM_HID 0x04 + +/* ES9356 SDCA entity */ +#define ES9356_SDCA_ENT0 0x00 +#define ES9356_SDCA_ENT_PDE11 0x03 +#define ES9356_SDCA_ENT_FU11 0x04 +#define ES9356_SDCA_ENT_XU12 0x05 +#define ES9356_SDCA_ENT_FU113 0x07 +#define ES9356_SDCA_ENT_CS113 0x09 +#define ES9356_SDCA_ENT_PPU11 0x0C + +#define ES9356_SDCA_ENT_CS21 0x02 +#define ES9356_SDCA_ENT_PPU21 0x03 +#define ES9356_SDCA_ENT_FU21 0X04 +#define ES9356_SDCA_ENT_XU22 0x06 +#define ES9356_SDCA_ENT_SAPU29 0x03 +#define ES9356_SDCA_ENT_PDE23 0x0B +#define ES9356_SDCA_ENT_HID01 0x01 + +#define ES9356_SDCA_ENT_CS41 0x02 +#define ES9356_SDCA_ENT_FU35 0x04 +#define ES9356_SDCA_ENT_XU42 0x06 +#define ES9356_SDCA_ENT_FU41 0x07 +#define ES9356_SDCA_ENT_PDE47 0x0E +#define ES9356_SDCA_ENT_IT33 0x0F +#define ES9356_SDCA_ENT_PDE34 0x10 +#define ES9356_SDCA_ENT_FU33 0x11 +#define ES9356_SDCA_ENT_XU36 0x13 +#define ES9356_SDCA_ENT_FU36 0x15 +#define ES9356_SDCA_ENT_CS36 0x17 +#define ES9356_SDCA_ENT_GE35 0x18 + +/* ES9356 SDCA control */ +#define ES9356_SDCA_CTL_SAMPLE_FREQ_INDEX 0x10 +#define ES9356_SDCA_CTL_FU_MUTE 0x01 +#define ES9356_SDCA_CTL_FU_VOLUME 0x02 +#define ES9356_SDCA_CTL_HIDTX_CURRENT_OWNER 0x10 +#define ES9356_SDCA_CTL_SELECTED_MODE 0x01 +#define ES9356_SDCA_CTL_DETECTED_MODE 0x02 +#define ES9356_SDCA_CTL_REQ_POWER_STATE 0x01 +#define ES9356_SDCA_CTL_FU_CH_GAIN 0x0b +#define ES9356_SDCA_CTL_FUNC_STATUS 0x10 +#define ES9356_SDCA_CTL_ACTUAL_POWER_STATE 0x10 +#define ES9356_SDCA_CTL_POSTURE_NUMBER 0x00 + +/* ES9356 SDCA channel */ +#define CH_L 0x01 +#define CH_R 0x02 +#define MBQ 0x2000 + +/* ES9356 HID*/ +#define ES9356_BUF_ADDR_HID 0x44000000 +#define ES9356_HID_BYTE2 0x44000001 +#define ES9356_HID_BYTE3 0x44000002 +#define ES9356_HID_BYTE4 0x44000003 + +/* ES9356 Volume Setting*/ +#define ES9356_VU_BASE 768 +#define ES9356_OFFSET_HIGH 0x07F8 +#define ES9356_OFFSET_LOW 0x0007 +#define ES9356_DEFAULT_VOLUME 0x00 +#define ES9356_VOLUME_STEP 32 +#define ES9356_VOLUME_MIN -768 +#define ES9356_VOLUME_MAX 285 +#define ES9356_AMIC_GAIN_STEP 768 +#define ES9356_DMIC_GAIN_STEP 1536 +#define ES9356_GAIN_MIN 0 +#define ES9356_AMIC_GAIN_MAX 10 +#define ES9356_DMIC_GAIN_MAX 3 + +enum { + ES9356_DMIC = 1, /* For dmic */ + ES9356_JACK_IN, /* For headset mic */ + ES9356_AMP, /* For speaker */ + ES9356_JACK_OUT, /* For headphone */ +}; + +enum { + ES9356_SDCA_RATE_16000HZ, + ES9356_SDCA_RATE_24000HZ, + ES9356_SDCA_RATE_32000HZ, + ES9356_SDCA_RATE_44100HZ, + ES9356_SDCA_RATE_48000HZ, + ES9356_SDCA_RATE_88200HZ, + ES9356_SDCA_RATE_96000HZ, +}; + +#endif diff --git a/sound/soc/codecs/fs210x.c b/sound/soc/codecs/fs210x.c index e6195b71adadcc..eda716f817b58c 100644 --- a/sound/soc/codecs/fs210x.c +++ b/sound/soc/codecs/fs210x.c @@ -968,7 +968,7 @@ static int fs210x_effect_scene_info(struct snd_kcontrol *kcontrol, if (scene->name) name = scene->name; - strscpy(uinfo->value.enumerated.name, name, strlen(name) + 1); + strscpy(uinfo->value.enumerated.name, name); return 0; } diff --git a/sound/soc/codecs/max98363.c b/sound/soc/codecs/max98363.c index 25af78ab30d5c5..099dc5bf6195f1 100644 --- a/sound/soc/codecs/max98363.c +++ b/sound/soc/codecs/max98363.c @@ -90,24 +90,15 @@ static int max98363_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct max98363_priv *max98363 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!max98363->first_hw_init) return 0; - if (!slave->unattach_request) - goto regmap_sync; - - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(MAX98363_PROBE_TIMEOUT)); - if (!time) { - dev_err(dev, "Initialization not complete, timed out\n"); - return -ETIMEDOUT; - } - -regmap_sync: + ret = sdw_slave_wait_for_init(slave, MAX98363_PROBE_TIMEOUT); + if (ret) + return ret; - slave->unattach_request = 0; regcache_cache_only(max98363->regmap, false); regcache_sync(max98363->regmap); diff --git a/sound/soc/codecs/max98373-sdw.c b/sound/soc/codecs/max98373-sdw.c index 16673440218cb5..6829fa07c9ecbe 100644 --- a/sound/soc/codecs/max98373-sdw.c +++ b/sound/soc/codecs/max98373-sdw.c @@ -266,25 +266,15 @@ static int max98373_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct max98373_priv *max98373 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!max98373->first_hw_init) return 0; - if (!slave->unattach_request) - goto regmap_sync; - - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(MAX98373_PROBE_TIMEOUT)); - if (!time) { - dev_err(dev, "Initialization not complete, timed out\n"); - sdw_show_ping_status(slave->bus, true); - - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(slave, MAX98373_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(max98373->regmap, false); regcache_sync(max98373->regmap); diff --git a/sound/soc/codecs/pcm6240.c b/sound/soc/codecs/pcm6240.c index 78b21fbfad50e0..667596fc1781b6 100644 --- a/sound/soc/codecs/pcm6240.c +++ b/sound/soc/codecs/pcm6240.c @@ -1230,15 +1230,11 @@ static struct pcmdevice_config_info *pcmdevice_add_config(void *ctxt, int *status) { struct pcmdevice_priv *pcm_dev = (struct pcmdevice_priv *)ctxt; - struct pcmdevice_config_info *cfg_info; + struct pcmdevice_config_info *cfg_info = NULL; struct pcmdevice_block_data **bk_da; + char cfg_name[64] = {}; unsigned int config_offset = 0, i; - - cfg_info = kzalloc_obj(struct pcmdevice_config_info); - if (!cfg_info) { - *status = -ENOMEM; - goto out; - } + unsigned int nblocks; if (pcm_dev->regbin.fw_hdr.binary_version_num >= 0x105) { if (config_offset + 64 > (int)config_size) { @@ -1247,7 +1243,7 @@ static struct pcmdevice_config_info *pcmdevice_add_config(void *ctxt, "%s: cfg_name out of boundary\n", __func__); goto out; } - memcpy(cfg_info->cfg_name, &config_data[config_offset], 64); + memcpy(cfg_name, &config_data[config_offset], 64); config_offset += 64; } @@ -1257,16 +1253,17 @@ static struct pcmdevice_config_info *pcmdevice_add_config(void *ctxt, __func__); goto out; } - cfg_info->nblocks = - get_unaligned_be32(&config_data[config_offset]); + nblocks = get_unaligned_be32(&config_data[config_offset]); config_offset += 4; - bk_da = cfg_info->blk_data = kzalloc_objs(struct pcmdevice_block_data *, - cfg_info->nblocks); - if (!bk_da) { + cfg_info = kzalloc_flex(*cfg_info, blk_data, nblocks); + if (!cfg_info) { *status = -ENOMEM; goto out; } + cfg_info->nblocks = nblocks; + memcpy(cfg_info->cfg_name, cfg_name, sizeof(cfg_info->cfg_name)); + bk_da = cfg_info->blk_data; cfg_info->real_nblocks = 0; for (i = 0; i < cfg_info->nblocks; i++) { if (config_offset + 12 > config_size) { @@ -1449,14 +1446,11 @@ static void pcmdevice_config_info_remove(void *ctxt) for (i = 0; i < regbin->ncfgs; i++) { if (!cfg_info[i]) continue; - if (cfg_info[i]->blk_data) { - for (j = 0; j < (int)cfg_info[i]->real_nblocks; j++) { - if (!cfg_info[i]->blk_data[j]) - continue; - kfree(cfg_info[i]->blk_data[j]->regdata); - kfree(cfg_info[i]->blk_data[j]); - } - kfree(cfg_info[i]->blk_data); + for (j = 0; j < (int)cfg_info[i]->real_nblocks; j++) { + if (!cfg_info[i]->blk_data[j]) + continue; + kfree(cfg_info[i]->blk_data[j]->regdata); + kfree(cfg_info[i]->blk_data[j]); } kfree(cfg_info[i]); } diff --git a/sound/soc/codecs/pcm6240.h b/sound/soc/codecs/pcm6240.h index 2d8f9e798139ac..e27afd33910c4b 100644 --- a/sound/soc/codecs/pcm6240.h +++ b/sound/soc/codecs/pcm6240.h @@ -199,7 +199,7 @@ struct pcmdevice_config_info { unsigned int nblocks; unsigned int real_nblocks; unsigned char active_dev; - struct pcmdevice_block_data **blk_data; + struct pcmdevice_block_data *blk_data[] __counted_by(nblocks); }; struct pcmdevice_regbin { diff --git a/sound/soc/codecs/rt1017-sdca-sdw.c b/sound/soc/codecs/rt1017-sdca-sdw.c index 148b36173a2573..d62e8a25367679 100644 --- a/sound/soc/codecs/rt1017-sdca-sdw.c +++ b/sound/soc/codecs/rt1017-sdca-sdw.c @@ -773,25 +773,15 @@ static int rt1017_sdca_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt1017_sdca_priv *rt1017 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt1017->first_hw_init) return 0; - if (!slave->unattach_request) - goto regmap_sync; - - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT1017_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "Initialization not complete, timed out\n"); - sdw_show_ping_status(slave->bus, true); - - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(slave, RT1017_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt1017->regmap, false); regcache_sync(rt1017->regmap); diff --git a/sound/soc/codecs/rt1308-sdw.c b/sound/soc/codecs/rt1308-sdw.c index e077d096bc239c..39e06a3a75609a 100644 --- a/sound/soc/codecs/rt1308-sdw.c +++ b/sound/soc/codecs/rt1308-sdw.c @@ -768,25 +768,15 @@ static int rt1308_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt1308_sdw_priv *rt1308 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt1308->first_hw_init) return 0; - if (!slave->unattach_request) - goto regmap_sync; - - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT1308_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "Initialization not complete, timed out\n"); - sdw_show_ping_status(slave->bus, true); - - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(slave, RT1308_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt1308->regmap, false); regcache_sync_region(rt1308->regmap, 0xc000, 0xcfff); diff --git a/sound/soc/codecs/rt1316-sdw.c b/sound/soc/codecs/rt1316-sdw.c index 20fc1579eb9cf2..1828fd9d5af6ab 100644 --- a/sound/soc/codecs/rt1316-sdw.c +++ b/sound/soc/codecs/rt1316-sdw.c @@ -745,25 +745,15 @@ static int rt1316_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt1316_sdw_priv *rt1316 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt1316->first_hw_init) return 0; - if (!slave->unattach_request) - goto regmap_sync; - - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT1316_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__); - sdw_show_ping_status(slave->bus, true); - - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(slave, RT1316_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt1316->regmap, false); regcache_sync(rt1316->regmap); diff --git a/sound/soc/codecs/rt1318-sdw.c b/sound/soc/codecs/rt1318-sdw.c index d28f1afe68f184..51bd11b92a554c 100644 --- a/sound/soc/codecs/rt1318-sdw.c +++ b/sound/soc/codecs/rt1318-sdw.c @@ -821,23 +821,15 @@ static int rt1318_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt1318_sdw_priv *rt1318 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt1318->first_hw_init) return 0; - if (!slave->unattach_request) - goto regmap_sync; - - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT1318_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__); - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(slave, RT1318_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt1318->regmap, false); regcache_sync(rt1318->regmap); diff --git a/sound/soc/codecs/rt1320-sdw.c b/sound/soc/codecs/rt1320-sdw.c index 192faa431b5e9c..13493b85f3c95a 100644 --- a/sound/soc/codecs/rt1320-sdw.c +++ b/sound/soc/codecs/rt1320-sdw.c @@ -3053,23 +3053,15 @@ static int rt1320_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt1320_sdw_priv *rt1320 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt1320->first_hw_init) return 0; - if (!slave->unattach_request) - goto regmap_sync; - - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT1320_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__); - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(slave, RT1320_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt1320->regmap, false); regcache_sync(rt1320->regmap); regcache_cache_only(rt1320->mbq_regmap, false); diff --git a/sound/soc/codecs/rt5640.c b/sound/soc/codecs/rt5640.c index f6c6294e158805..f9ee8cc64ee5e2 100644 --- a/sound/soc/codecs/rt5640.c +++ b/sound/soc/codecs/rt5640.c @@ -2397,7 +2397,7 @@ static void rt5640_jack_work(struct work_struct *work) * disabled the OVCD IRQ, the IRQ pin will stay high and as * we react to edges, we miss the unplug event -> recheck. */ - queue_delayed_work(system_long_wq, &rt5640->jack_work, 0); + queue_delayed_work(system_dfl_long_wq, &rt5640->jack_work, 0); } } @@ -2410,7 +2410,8 @@ static irqreturn_t rt5640_irq(int irq, void *data) delay = 100; if (rt5640->jack) - mod_delayed_work(system_long_wq, &rt5640->jack_work, delay); + mod_delayed_work(system_dfl_long_wq, &rt5640->jack_work, + delay); return IRQ_HANDLED; } @@ -2419,7 +2420,7 @@ static irqreturn_t rt5640_jd_gpio_irq(int irq, void *data) { struct rt5640_priv *rt5640 = data; - queue_delayed_work(system_long_wq, &rt5640->jack_work, + queue_delayed_work(system_dfl_long_wq, &rt5640->jack_work, msecs_to_jiffies(JACK_SETTLE_TIME)); return IRQ_HANDLED; @@ -2553,10 +2554,12 @@ static void rt5640_enable_jack_detect(struct snd_soc_component *component, rt5640->jd_gpio = jack_data->jd_gpio; rt5640->jd_gpio_irq = gpiod_to_irq(rt5640->jd_gpio); - ret = request_irq(rt5640->jd_gpio_irq, rt5640_jd_gpio_irq, - IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING, - "rt5640-jd-gpio", rt5640); - if (ret) { + ret = request_any_context_irq(rt5640->jd_gpio_irq, + rt5640_jd_gpio_irq, + IRQF_TRIGGER_RISING | + IRQF_TRIGGER_FALLING, + "rt5640-jd-gpio", rt5640); + if (ret < 0) { dev_warn(component->dev, "Failed to request jd GPIO IRQ %d: %d\n", rt5640->jd_gpio_irq, ret); rt5640_disable_jack_detect(component); @@ -2568,10 +2571,10 @@ static void rt5640_enable_jack_detect(struct snd_soc_component *component, if (jack_data && jack_data->use_platform_clock) rt5640->use_platform_clock = jack_data->use_platform_clock; - ret = request_irq(rt5640->irq, rt5640_irq, - IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING, - "rt5640", rt5640); - if (ret) { + ret = request_any_context_irq(rt5640->irq, rt5640_irq, + IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING, + "rt5640", rt5640); + if (ret < 0) { dev_warn(component->dev, "Failed to request IRQ %d: %d\n", rt5640->irq, ret); rt5640_disable_jack_detect(component); return; @@ -2579,7 +2582,7 @@ static void rt5640_enable_jack_detect(struct snd_soc_component *component, rt5640->irq_requested = true; /* sync initial jack state */ - queue_delayed_work(system_long_wq, &rt5640->jack_work, 0); + queue_delayed_work(system_dfl_long_wq, &rt5640->jack_work, 0); } static const struct snd_soc_dapm_route rt5640_hda_jack_dapm_routes[] = { @@ -2622,9 +2625,9 @@ static void rt5640_enable_hda_jack_detect( rt5640->jack = jack; - ret = request_irq(rt5640->irq, rt5640_irq, - IRQF_TRIGGER_RISING, "rt5640", rt5640); - if (ret) { + ret = request_any_context_irq(rt5640->irq, rt5640_irq, + IRQF_TRIGGER_RISING, "rt5640", rt5640); + if (ret < 0) { dev_warn(component->dev, "Failed to request IRQ %d: %d\n", rt5640->irq, ret); rt5640->jack = NULL; return; @@ -2632,7 +2635,7 @@ static void rt5640_enable_hda_jack_detect( rt5640->irq_requested = true; /* sync initial jack state */ - queue_delayed_work(system_long_wq, &rt5640->jack_work, 0); + queue_delayed_work(system_dfl_long_wq, &rt5640->jack_work, 0); snd_soc_dapm_add_routes(dapm, rt5640_hda_jack_dapm_routes, ARRAY_SIZE(rt5640_hda_jack_dapm_routes)); @@ -2860,7 +2863,7 @@ static int rt5640_resume(struct snd_soc_component *component) } enable_irq(rt5640->irq); - queue_delayed_work(system_long_wq, &rt5640->jack_work, 0); + queue_delayed_work(system_dfl_long_wq, &rt5640->jack_work, 0); } return 0; diff --git a/sound/soc/codecs/rt5682-sdw.c b/sound/soc/codecs/rt5682-sdw.c index fc464538ceffb0..ec2a35a0cacdea 100644 --- a/sound/soc/codecs/rt5682-sdw.c +++ b/sound/soc/codecs/rt5682-sdw.c @@ -754,7 +754,7 @@ static int rt5682_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt5682_priv *rt5682 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt5682->first_hw_init) return 0; @@ -766,20 +766,12 @@ static int rt5682_dev_resume(struct device *dev) rt5682->disable_irq = false; } mutex_unlock(&rt5682->disable_irq_lock); - goto regmap_sync; } - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT5682_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__); - sdw_show_ping_status(slave->bus, true); - - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(slave, RT5682_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt5682->sdw_regmap, false); regcache_cache_only(rt5682->regmap, false); regcache_sync(rt5682->regmap); diff --git a/sound/soc/codecs/rt700-sdw.c b/sound/soc/codecs/rt700-sdw.c index 9ce36a66fae1d1..30fcca210f0518 100644 --- a/sound/soc/codecs/rt700-sdw.c +++ b/sound/soc/codecs/rt700-sdw.c @@ -522,25 +522,15 @@ static int rt700_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt700_priv *rt700 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt700->first_hw_init) return 0; - if (!slave->unattach_request) - goto regmap_sync; - - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT700_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "Initialization not complete, timed out\n"); - sdw_show_ping_status(slave->bus, true); - - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(slave, RT700_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt700->regmap, false); regcache_sync_region(rt700->regmap, 0x3000, 0x8fff); regcache_sync_region(rt700->regmap, 0x752010, 0x75206b); diff --git a/sound/soc/codecs/rt711-sdca-sdw.c b/sound/soc/codecs/rt711-sdca-sdw.c index 49dacceddf8150..a8164fc3979ab8 100644 --- a/sound/soc/codecs/rt711-sdca-sdw.c +++ b/sound/soc/codecs/rt711-sdca-sdw.c @@ -438,7 +438,7 @@ static int rt711_sdca_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt711_sdca_priv *rt711 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt711->first_hw_init) return 0; @@ -451,20 +451,12 @@ static int rt711_sdca_dev_resume(struct device *dev) rt711->disable_irq = false; } mutex_unlock(&rt711->disable_irq_lock); - goto regmap_sync; } - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT711_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__); - sdw_show_ping_status(slave->bus, true); + ret = sdw_slave_wait_for_init(slave, RT711_PROBE_TIMEOUT); + if (ret) + return ret; - return -ETIMEDOUT; - } - -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt711->regmap, false); regcache_sync(rt711->regmap); regcache_cache_only(rt711->mbq_regmap, false); diff --git a/sound/soc/codecs/rt711-sdw.c b/sound/soc/codecs/rt711-sdw.c index 72ddf4cebdf366..df3c43f2ab6b83 100644 --- a/sound/soc/codecs/rt711-sdw.c +++ b/sound/soc/codecs/rt711-sdw.c @@ -530,7 +530,7 @@ static int rt711_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt711_priv *rt711 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt711->first_hw_init) return 0; @@ -542,18 +542,12 @@ static int rt711_dev_resume(struct device *dev) rt711->disable_irq = false; } mutex_unlock(&rt711->disable_irq_lock); - goto regmap_sync; } - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT711_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__); - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(slave, RT711_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt711->regmap, false); regcache_sync_region(rt711->regmap, 0x3000, 0x8fff); regcache_sync_region(rt711->regmap, 0x752009, 0x752091); diff --git a/sound/soc/codecs/rt712-sdca-dmic.c b/sound/soc/codecs/rt712-sdca-dmic.c index 4d83544ef2049e..4c5c2f5ba5edff 100644 --- a/sound/soc/codecs/rt712-sdca-dmic.c +++ b/sound/soc/codecs/rt712-sdca-dmic.c @@ -905,26 +905,15 @@ static int rt712_sdca_dmic_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt712_sdca_dmic_priv *rt712 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt712->first_hw_init) return 0; - if (!slave->unattach_request) - goto regmap_sync; - - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT712_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", - __func__); - sdw_show_ping_status(slave->bus, true); - - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(slave, RT712_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt712->regmap, false); regcache_sync(rt712->regmap); regcache_cache_only(rt712->mbq_regmap, false); diff --git a/sound/soc/codecs/rt712-sdca-sdw.c b/sound/soc/codecs/rt712-sdca-sdw.c index 8c82887174db29..5817321804736d 100644 --- a/sound/soc/codecs/rt712-sdca-sdw.c +++ b/sound/soc/codecs/rt712-sdca-sdw.c @@ -450,7 +450,7 @@ static int rt712_sdca_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt712_sdca_priv *rt712 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt712->first_hw_init) return 0; @@ -464,20 +464,12 @@ static int rt712_sdca_dev_resume(struct device *dev) rt712->disable_irq = false; } mutex_unlock(&rt712->disable_irq_lock); - goto regmap_sync; } - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT712_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__); - sdw_show_ping_status(slave->bus, true); + ret = sdw_slave_wait_for_init(slave, RT712_PROBE_TIMEOUT); + if (ret) + return ret; - return -ETIMEDOUT; - } - -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt712->regmap, false); regcache_sync(rt712->regmap); regcache_cache_only(rt712->mbq_regmap, false); diff --git a/sound/soc/codecs/rt715-sdca-sdw.c b/sound/soc/codecs/rt715-sdca-sdw.c index 968bc183b8d8ce..4b9815b5628dbe 100644 --- a/sound/soc/codecs/rt715-sdca-sdw.c +++ b/sound/soc/codecs/rt715-sdca-sdw.c @@ -224,25 +224,15 @@ static int rt715_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt715_sdca_priv *rt715 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt715->first_hw_init) return 0; - if (!slave->unattach_request) - goto regmap_sync; + ret = sdw_slave_wait_for_init(slave, RT715_PROBE_TIMEOUT); + if (ret) + return ret; - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT715_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__); - sdw_show_ping_status(slave->bus, true); - - return -ETIMEDOUT; - } - -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt715->regmap, false); regcache_sync_region(rt715->regmap, SDW_SDCA_CTL(FUN_JACK_CODEC, RT715_SDCA_ST_EN, RT715_SDCA_ST_CTRL, diff --git a/sound/soc/codecs/rt715-sdw.c b/sound/soc/codecs/rt715-sdw.c index 49c91d015be4d3..7f83a8f1a06e98 100644 --- a/sound/soc/codecs/rt715-sdw.c +++ b/sound/soc/codecs/rt715-sdw.c @@ -501,25 +501,15 @@ static int rt715_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt715_priv *rt715 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt715->first_hw_init) return 0; - if (!slave->unattach_request) - goto regmap_sync; - - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT715_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "%s: Initialization not complete, timed out\n", __func__); - sdw_show_ping_status(slave->bus, true); - - return -ETIMEDOUT; - } + ret = sdw_slave_wait_for_init(slave, RT715_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt715->regmap, false); regcache_sync_region(rt715->regmap, 0x3000, 0x8fff); regcache_sync_region(rt715->regmap, 0x752039, 0x752039); diff --git a/sound/soc/codecs/rt721-sdca-sdw.c b/sound/soc/codecs/rt721-sdca-sdw.c index 6eb8512975b85b..58606209316a4a 100644 --- a/sound/soc/codecs/rt721-sdca-sdw.c +++ b/sound/soc/codecs/rt721-sdca-sdw.c @@ -489,7 +489,7 @@ static int rt721_sdca_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt721_sdca_priv *rt721 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt721->first_hw_init) return 0; @@ -502,20 +502,12 @@ static int rt721_sdca_dev_resume(struct device *dev) rt721->disable_irq = false; } mutex_unlock(&rt721->disable_irq_lock); - goto regmap_sync; } - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT721_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "Initialization not complete, timed out\n"); - sdw_show_ping_status(slave->bus, true); + ret = sdw_slave_wait_for_init(slave, RT721_PROBE_TIMEOUT); + if (ret) + return ret; - return -ETIMEDOUT; - } - -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt721->regmap, false); regcache_sync(rt721->regmap); regcache_cache_only(rt721->mbq_regmap, false); diff --git a/sound/soc/codecs/rt722-sdca-sdw.c b/sound/soc/codecs/rt722-sdca-sdw.c index 0a5b3ffa90daf2..a5feba3d0c182d 100644 --- a/sound/soc/codecs/rt722-sdca-sdw.c +++ b/sound/soc/codecs/rt722-sdca-sdw.c @@ -501,7 +501,7 @@ static int rt722_sdca_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct rt722_sdca_priv *rt722 = dev_get_drvdata(dev); - unsigned long time; + int ret; if (!rt722->first_hw_init) return 0; @@ -514,20 +514,12 @@ static int rt722_sdca_dev_resume(struct device *dev) rt722->disable_irq = false; } mutex_unlock(&rt722->disable_irq_lock); - goto regmap_sync; } - time = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(RT722_PROBE_TIMEOUT)); - if (!time) { - dev_err(&slave->dev, "Initialization not complete, timed out\n"); - sdw_show_ping_status(slave->bus, true); + ret = sdw_slave_wait_for_init(slave, RT722_PROBE_TIMEOUT); + if (ret) + return ret; - return -ETIMEDOUT; - } - -regmap_sync: - slave->unattach_request = 0; regcache_cache_only(rt722->regmap, false); regcache_sync(rt722->regmap); return 0; diff --git a/sound/soc/codecs/sigmadsp.c b/sound/soc/codecs/sigmadsp.c index 3343dbb63d2b70..2e08fde3989c2f 100644 --- a/sound/soc/codecs/sigmadsp.c +++ b/sound/soc/codecs/sigmadsp.c @@ -35,7 +35,7 @@ struct sigmadsp_control { struct snd_kcontrol *kcontrol; bool is_readback; bool cached; - uint8_t cache[]; + u8 cache[] __counted_by(num_bytes); }; struct sigmadsp_data { @@ -223,7 +223,7 @@ static int sigma_fw_load_control(struct sigmadsp *sigmadsp, return -EINVAL; num_bytes = le16_to_cpu(ctrl_chunk->num_bytes); - ctrl = kzalloc(sizeof(*ctrl) + num_bytes, GFP_KERNEL); + ctrl = kzalloc_flex(*ctrl, cache, num_bytes); if (!ctrl) return -ENOMEM; diff --git a/sound/soc/codecs/simple-amplifier.c b/sound/soc/codecs/simple-amplifier.c index d306c585b52b09..ca0e6ce8cc379b 100644 --- a/sound/soc/codecs/simple-amplifier.c +++ b/sound/soc/codecs/simple-amplifier.c @@ -1,25 +1,101 @@ // SPDX-License-Identifier: GPL-2.0-only /* - * Copyright (c) 2017 BayLibre, SAS. - * Author: Jerome Brunet + * Support for gpio amplifier + * Copyright 2026 CS GROUP France + * Author: Herve Codina + * + * Basic simple amplifier driver + * Copyright (c) 2017 BayLibre, SAS. + * Author: Jerome Brunet */ +#include +#include #include +#include +#include +#include #include +#include #include +#include #include +#include +#include -#define DRV_NAME "simple-amplifier" +struct simple_amp_single { + struct gpio_desc *gpio; + bool is_inverted; + int kctrl_val; + const char *control_name; +}; + +struct simple_amp_point { + u32 gpio_val; + int gain_db; +}; + +struct simple_amp_range { + unsigned int nb_points; + struct simple_amp_point min; + struct simple_amp_point max; +}; + +struct simple_amp_ranges { + unsigned int nb_ranges; + struct simple_amp_range *tab_ranges; +}; + +struct simple_amp_labels { + unsigned int nb_labels; + const char **tab_labels; +}; + +enum simple_amp_mode { + SIMPLE_AMP_MODE_NONE, + SIMPLE_AMP_MODE_RANGES, + SIMPLE_AMP_MODE_LABELS, +}; + +struct simple_amp_multi { + struct gpio_descs *gpios; + u32 kctrl_val; + u32 kctrl_max; + const char *control_name; + unsigned int *tlv_array; + enum simple_amp_mode mode; + union { + struct simple_amp_ranges ranges; + struct simple_amp_labels labels; + }; +}; + +struct simple_amp_data { + unsigned int supports; +#define SIMPLE_AUDIO_SUPPORT_PGA BIT(0) +#define SIMPLE_AUDIO_SUPPORT_POWER_SUPPLIES BIT(1) +#define SIMPLE_AUDIO_SUPPORT_MUTE BIT(2) +#define SIMPLE_AUDIO_SUPPORT_BYPASS BIT(3) + + const struct snd_soc_dapm_widget *dapm_widgets; + unsigned int num_dapm_widgets; + const struct snd_soc_dapm_route *dapm_routes; + unsigned int num_dapm_routes; +}; struct simple_amp { + const struct simple_amp_data *data; struct gpio_desc *gpiod_enable; + struct simple_amp_single mute; + struct simple_amp_single bypass; + struct simple_amp_multi gain; }; -static int drv_event(struct snd_soc_dapm_widget *w, - struct snd_kcontrol *control, int event) +static int simple_amp_power_event(struct snd_soc_dapm_widget *w, + struct snd_kcontrol *control, int event) { struct snd_soc_component *c = snd_soc_dapm_to_component(w->dapm); - struct simple_amp *priv = snd_soc_component_get_drvdata(c); + struct simple_amp *simple_amp = snd_soc_component_get_drvdata(c); int val; switch (event) { @@ -34,7 +110,7 @@ static int drv_event(struct snd_soc_dapm_widget *w, return -EINVAL; } - gpiod_set_value_cansleep(priv->gpiod_enable, val); + gpiod_set_value_cansleep(simple_amp->gpiod_enable, val); return 0; } @@ -42,7 +118,7 @@ static int drv_event(struct snd_soc_dapm_widget *w, static const struct snd_soc_dapm_widget simple_amp_dapm_widgets[] = { SND_SOC_DAPM_INPUT("INL"), SND_SOC_DAPM_INPUT("INR"), - SND_SOC_DAPM_OUT_DRV_E("DRV", SND_SOC_NOPM, 0, 0, NULL, 0, drv_event, + SND_SOC_DAPM_OUT_DRV_E("DRV", SND_SOC_NOPM, 0, 0, NULL, 0, simple_amp_power_event, (SND_SOC_DAPM_POST_PMU | SND_SOC_DAPM_PRE_PMD)), SND_SOC_DAPM_OUTPUT("OUTL"), SND_SOC_DAPM_OUTPUT("OUTR"), @@ -58,47 +134,836 @@ static const struct snd_soc_dapm_route simple_amp_dapm_routes[] = { { "OUTR", NULL, "DRV" }, }; +static const struct snd_soc_dapm_widget simple_amp_mono_pga_dapm_widgets[] = { + SND_SOC_DAPM_INPUT("IN"), + SND_SOC_DAPM_OUTPUT("OUT"), + SND_SOC_DAPM_PGA_E("PGA", SND_SOC_NOPM, 0, 0, NULL, 0, simple_amp_power_event, + (SND_SOC_DAPM_POST_PMU | SND_SOC_DAPM_PRE_PMD)), + SND_SOC_DAPM_REGULATOR_SUPPLY("vdd", 0, 0), +}; + +static const struct snd_soc_dapm_route simple_amp_mono_pga_dapm_routes[] = { + { "PGA", NULL, "IN" }, + { "PGA", NULL, "vdd" }, + { "OUT", NULL, "PGA" }, +}; + +static const struct snd_soc_dapm_widget simple_amp_stereo_pga_dapm_widgets[] = { + SND_SOC_DAPM_INPUT("INL"), + SND_SOC_DAPM_INPUT("INR"), + SND_SOC_DAPM_OUTPUT("OUTL"), + SND_SOC_DAPM_OUTPUT("OUTR"), + SND_SOC_DAPM_PGA_E("PGA", SND_SOC_NOPM, 0, 0, NULL, 0, simple_amp_power_event, + (SND_SOC_DAPM_POST_PMU | SND_SOC_DAPM_PRE_PMD)), + SND_SOC_DAPM_REGULATOR_SUPPLY("vdd", 0, 0), +}; + +static const struct snd_soc_dapm_route simple_amp_stereo_pga_dapm_routes[] = { + { "PGA", NULL, "INL" }, + { "PGA", NULL, "INR" }, + { "PGA", NULL, "vdd" }, + { "OUTL", NULL, "PGA" }, + { "OUTR", NULL, "PGA" }, +}; + +static int simple_amp_single_kctrl_write_gpio(struct simple_amp_single *single, + int kctrl_val) +{ + int gpio_val; + + gpio_val = single->is_inverted ? !kctrl_val : kctrl_val; + + return gpiod_set_value_cansleep(single->gpio, gpio_val); +} + +static int simple_amp_single_kctrl_info(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_info *uinfo) +{ + uinfo->count = 1; + uinfo->value.integer.min = 0; + uinfo->value.integer.max = 1; + uinfo->type = SNDRV_CTL_ELEM_TYPE_BOOLEAN; + return 0; +} + +static int simple_amp_single_kctrl_get(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_value *ucontrol) +{ + struct simple_amp_single *single = (struct simple_amp_single *)kcontrol->private_value; + + ucontrol->value.integer.value[0] = single->kctrl_val; + + return 0; +} + +static int simple_amp_single_kctrl_put(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_value *ucontrol) +{ + struct simple_amp_single *single = (struct simple_amp_single *)kcontrol->private_value; + int kctrl_val; + int err; + + kctrl_val = ucontrol->value.integer.value[0] ? 1 : 0; + + if (kctrl_val == single->kctrl_val) + return 0; + + err = simple_amp_single_kctrl_write_gpio(single, kctrl_val); + if (err) + return err; + + single->kctrl_val = kctrl_val; + + return 1; /* The value changed */ +} + +static int simple_amp_single_add_kcontrol(struct snd_soc_component *component, + struct simple_amp_single *single) +{ + struct snd_kcontrol_new control = { + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .name = single->control_name, + .info = simple_amp_single_kctrl_info, + .get = simple_amp_single_kctrl_get, + .put = simple_amp_single_kctrl_put, + .private_value = (unsigned long)single, + }; + int ret; + + /* Be consistent between single->kctrl_val value and the GPIO value */ + ret = simple_amp_single_kctrl_write_gpio(single, single->kctrl_val); + if (ret) + return ret; + + return snd_soc_add_component_controls(component, &control, 1); +} + +static u32 simple_amp_multi_ranges_kctrl_to_gpio(u32 kctrl_val, + struct simple_amp_ranges *ranges) +{ + struct simple_amp_range *range; + u32 index = kctrl_val; + unsigned int i; + + for (i = 0; i < ranges->nb_ranges; i++) { + range = &ranges->tab_ranges[i]; + + if (index < range->nb_points) + return (range->max.gpio_val >= range->min.gpio_val) ? + range->min.gpio_val + index : + range->min.gpio_val - index; + + index -= range->nb_points; + } + + /* + * Given index out of possible ranges. This is shouldn't happen. + * Signal the issue and return the maximum value + */ + WARN(1, "kctrl_val %u out of ranges\n", kctrl_val); + return ranges->tab_ranges[ranges->nb_ranges - 1].max.gpio_val; +} + +static int simple_amp_multi_kctrl_write_gpios(struct simple_amp_multi *multi, + u32 kctrl_val) +{ + DECLARE_BITMAP(bm, 32); + u32 gpio_val; + + if (kctrl_val > multi->kctrl_max) + return -EINVAL; + + if (multi->mode == SIMPLE_AMP_MODE_RANGES) + gpio_val = simple_amp_multi_ranges_kctrl_to_gpio(kctrl_val, + &multi->ranges); + else + gpio_val = kctrl_val; + + bitmap_from_arr32(bm, &gpio_val, multi->gpios->ndescs); + + return gpiod_multi_set_value_cansleep(multi->gpios, bm); +} + +static int simple_amp_multi_kctrl_int_info(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_info *uinfo) +{ + struct simple_amp_multi *multi = (struct simple_amp_multi *)kcontrol->private_value; + + uinfo->count = 1; + uinfo->value.integer.min = 0; + uinfo->value.integer.max = multi->kctrl_max; + uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER; + return 0; +} + +static int simple_amp_multi_kctrl_int_get(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_value *ucontrol) +{ + struct simple_amp_multi *multi = (struct simple_amp_multi *)kcontrol->private_value; + + ucontrol->value.integer.value[0] = multi->kctrl_val; + return 0; +} + +static int simple_amp_multi_kctrl_int_put(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_value *ucontrol) +{ + struct simple_amp_multi *multi = (struct simple_amp_multi *)kcontrol->private_value; + u32 kctrl_val; + int ret; + + kctrl_val = ucontrol->value.integer.value[0]; + + if (kctrl_val == multi->kctrl_val) + return 0; + + ret = simple_amp_multi_kctrl_write_gpios(multi, kctrl_val); + if (ret) + return ret; + + multi->kctrl_val = kctrl_val; + + return 1; /* The value changed */ +} + +static int simple_amp_multi_kctrl_enum_info(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_info *uinfo) +{ + struct simple_amp_multi *multi = (struct simple_amp_multi *)kcontrol->private_value; + + return snd_ctl_enum_info(uinfo, 1, multi->labels.nb_labels, + multi->labels.tab_labels); +} + +static int simple_amp_multi_kctrl_enum_get(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_value *ucontrol) +{ + struct simple_amp_multi *multi = (struct simple_amp_multi *)kcontrol->private_value; + + ucontrol->value.enumerated.item[0] = multi->kctrl_val; + return 0; +} + +static int simple_amp_multi_kctrl_enum_put(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_value *ucontrol) +{ + struct simple_amp_multi *multi = (struct simple_amp_multi *)kcontrol->private_value; + u32 kctrl_val; + int ret; + + kctrl_val = ucontrol->value.enumerated.item[0]; + + if (kctrl_val == multi->kctrl_val) + return 0; + + ret = simple_amp_multi_kctrl_write_gpios(multi, kctrl_val); + if (ret) + return ret; + + multi->kctrl_val = kctrl_val; + + return 1; /* The value changed */ +} + +static unsigned int *simple_amp_alloc_tlv_ranges(const struct simple_amp_ranges *ranges) +{ + unsigned int index; + unsigned int *tlv; + unsigned int *t; + unsigned int i; + + tlv = kzalloc_objs(*tlv, 2 + ranges->nb_ranges * 6, GFP_KERNEL); + if (!tlv) + return NULL; + + t = tlv; + + /* Fill first TLV */ + *t++ = SNDRV_CTL_TLVT_DB_RANGE; /* Tag */ + *t++ = ranges->nb_ranges * 6 * sizeof(*tlv); /* Len */ + /* Ranges are sorted from lower to higher value */ + index = 0; + for (i = 0; i < ranges->nb_ranges; i++) { + /* Fill range item i */ + *t++ = index; /* min */ + index += ranges->tab_ranges[i].nb_points; + *t++ = index - 1; /* max */ + *t++ = SNDRV_CTL_TLVT_DB_MINMAX; /* Tag */ + *t++ = 2 * sizeof(*tlv); /* Len */ + *t++ = ranges->tab_ranges[i].min.gain_db; /* min_dB */ + *t++ = ranges->tab_ranges[i].max.gain_db; /* max_dB */ + } + + return tlv; +} + +static int simple_amp_multi_add_kcontrol(struct snd_soc_component *component, + struct simple_amp_multi *multi) +{ + struct snd_kcontrol_new control = { + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .name = multi->control_name, + .info = simple_amp_multi_kctrl_int_info, + .get = simple_amp_multi_kctrl_int_get, + .put = simple_amp_multi_kctrl_int_put, + .private_value = (unsigned long)multi, + }; + int ret; + + switch (multi->mode) { + case SIMPLE_AMP_MODE_RANGES: + multi->tlv_array = simple_amp_alloc_tlv_ranges(&multi->ranges); + if (!multi->tlv_array) + return -ENOMEM; + + control.access = SNDRV_CTL_ELEM_ACCESS_TLV_READ | + SNDRV_CTL_ELEM_ACCESS_READWRITE; + control.tlv.p = multi->tlv_array; + break; + + case SIMPLE_AMP_MODE_LABELS: + /* Use enumerated values */ + control.info = simple_amp_multi_kctrl_enum_info; + control.get = simple_amp_multi_kctrl_enum_get; + control.put = simple_amp_multi_kctrl_enum_put; + break; + + case SIMPLE_AMP_MODE_NONE: + /* Already set control configuration is enough */ + break; + + default: + return -EINVAL; + } + + /* Be consistent between multi->kctrl_val value and the GPIOs value */ + ret = simple_amp_multi_kctrl_write_gpios(multi, multi->kctrl_val); + if (ret) + goto err_free_tlv_array; + + ret = snd_soc_add_component_controls(component, &control, 1); + if (ret) + goto err_free_tlv_array; + + return 0; + +err_free_tlv_array: + kfree(multi->tlv_array); + return ret; +} + +static int simple_amp_add_basic_dapm(struct snd_soc_component *component) +{ + struct snd_soc_dapm_context *dapm = snd_soc_component_to_dapm(component); + struct simple_amp *simple_amp = snd_soc_component_get_drvdata(component); + struct device *dev = component->dev; + int ret; + + /* Add basic dapm widgets and routes */ + ret = snd_soc_dapm_new_controls(dapm, simple_amp->data->dapm_widgets, + simple_amp->data->num_dapm_widgets); + if (ret) { + dev_err(dev, "Failed to add basic dapm widgets (%d)\n", ret); + return ret; + } + + ret = snd_soc_dapm_add_routes(dapm, simple_amp->data->dapm_routes, + simple_amp->data->num_dapm_routes); + if (ret) { + dev_err(dev, "Failed to add basic dapm routes (%d)\n", ret); + return ret; + } + + return 0; +} + +struct simple_amp_supply { + const char *prop_name; + const struct snd_soc_dapm_widget dapm_widget; + const struct snd_soc_dapm_route dapm_route; +}; + +static const struct simple_amp_supply simple_amp_supplies[] = { + { + .prop_name = "vddio-supply", + .dapm_widget = SND_SOC_DAPM_REGULATOR_SUPPLY("vddio", 0, 0), + .dapm_route = { "PGA", NULL, "vddio" }, + }, { + .prop_name = "vdda1-supply", + .dapm_widget = SND_SOC_DAPM_REGULATOR_SUPPLY("vdda1", 0, 0), + .dapm_route = { "PGA", NULL, "vdda1" }, + }, { + .prop_name = "vdda2-supply", + .dapm_widget = SND_SOC_DAPM_REGULATOR_SUPPLY("vdda2", 0, 0), + .dapm_route = { "PGA", NULL, "vdda2" }, + }, + { /* End of list */} +}; + +static int simple_amp_add_power_supplies(struct snd_soc_component *component) +{ + struct snd_soc_dapm_context *dapm = snd_soc_component_to_dapm(component); + struct simple_amp *simple_amp = snd_soc_component_get_drvdata(component); + const struct simple_amp_supply *supply; + struct device *dev = component->dev; + int ret; + + /* + * Those additional power supplies are attached to the PGA. + * If PGA is not supported, simply skipped them. + */ + if (!(simple_amp->data->supports & SIMPLE_AUDIO_SUPPORT_PGA)) { + dev_err(dev, "Extra power supplied need PGA\n"); + return -EINVAL; + } + + supply = simple_amp_supplies; + do { + if (!of_property_present(dev->of_node, supply->prop_name)) + continue; + + ret = snd_soc_dapm_new_controls(dapm, &supply->dapm_widget, 1); + if (ret) { + dev_err(dev, "Failed to add control for '%s' (%d)\n", + supply->prop_name, ret); + return ret; + } + ret = snd_soc_dapm_add_routes(dapm, &supply->dapm_route, 1); + if (ret) { + dev_err(dev, "Failed to add route for '%s' (%d)\n", + supply->prop_name, ret); + return ret; + } + } while ((++supply)->prop_name); + + return 0; +} + +static int simple_amp_component_probe(struct snd_soc_component *component) +{ + struct simple_amp *simple_amp = snd_soc_component_get_drvdata(component); + int ret; + + /* Add basic dapm widgets and routes */ + ret = simple_amp_add_basic_dapm(component); + if (ret) + return ret; + + /* Add additional power supplies */ + if (simple_amp->data->supports & SIMPLE_AUDIO_SUPPORT_POWER_SUPPLIES) { + ret = simple_amp_add_power_supplies(component); + if (ret) + return ret; + } + + if (simple_amp->mute.gpio) { + /* + * The name of the GPIO used is mute. According to this name, 1 + * means muted and 0 means un-muted. + * + * An inversion is expected by ALSA. Indeed from ALSA point of + * view, 1 means 'on' (un-muted) and 0 means 'off' (muted). + */ + simple_amp->mute.is_inverted = true; + simple_amp->mute.kctrl_val = 1; /* Un-muted */ + ret = simple_amp_single_add_kcontrol(component, &simple_amp->mute); + if (ret) + return ret; + } + + if (simple_amp->bypass.gpio) { + ret = simple_amp_single_add_kcontrol(component, &simple_amp->bypass); + if (ret) + return ret; + } + + if (simple_amp->gain.gpios) { + ret = simple_amp_multi_add_kcontrol(component, &simple_amp->gain); + if (ret) + return ret; + } + + return 0; +} + +static void simple_amp_component_remove(struct snd_soc_component *component) +{ + struct simple_amp *simple_amp = snd_soc_component_get_drvdata(component); + + kfree(simple_amp->gain.tlv_array); + simple_amp->gain.tlv_array = NULL; +} + static const struct snd_soc_component_driver simple_amp_component_driver = { - .dapm_widgets = simple_amp_dapm_widgets, - .num_dapm_widgets = ARRAY_SIZE(simple_amp_dapm_widgets), - .dapm_routes = simple_amp_dapm_routes, - .num_dapm_routes = ARRAY_SIZE(simple_amp_dapm_routes), + .probe = simple_amp_component_probe, + .remove = simple_amp_component_remove, }; +static int simple_amp_parse_single_gpio(struct device *dev, + struct simple_amp_single *single, + const char *gpio_property) +{ + /* Start with the inactive value */ + single->is_inverted = false; + single->kctrl_val = 0; + single->gpio = devm_gpiod_get_optional(dev, gpio_property, GPIOD_OUT_LOW); + if (IS_ERR(single->gpio)) + return dev_err_probe(dev, PTR_ERR(single->gpio), + "Failed to get '%s' gpio\n", + gpio_property); + return 0; +} + +static int simple_amp_cmp_ranges(const void *a, const void *b) +{ + const struct simple_amp_range *a_range = a; + const struct simple_amp_range *b_range = b; + + /* Ranges a and b don't overlap. This has been already checked */ + + return a_range->min.gain_db - b_range->max.gain_db; +} + +static int simple_amp_check_new_range(const struct simple_amp_range *new_range, + const struct simple_amp_range *tab_ranges, + unsigned int nb_ranges) +{ + unsigned int i; + + for (i = 0; i < nb_ranges; i++) { + /* Check for range overlaps */ + if (new_range->min.gain_db >= tab_ranges[i].min.gain_db && + new_range->min.gain_db <= tab_ranges[i].max.gain_db) + return -EINVAL; + + if (new_range->max.gain_db >= tab_ranges[i].min.gain_db && + new_range->max.gain_db <= tab_ranges[i].max.gain_db) + return -EINVAL; + + if (new_range->min.gain_db <= tab_ranges[i].min.gain_db && + new_range->max.gain_db >= tab_ranges[i].max.gain_db) + return -EINVAL; + } + return 0; +} + +static int simple_amp_parse_ranges(struct device *dev, + struct simple_amp_multi *multi, + const char *ranges_property) +{ + struct simple_amp_ranges *ranges = &multi->ranges; + struct simple_amp_range *range; + struct device_node *np = dev->of_node; + struct simple_amp_point first_point; + unsigned int max_gpio_val; + unsigned int i; + int ret; + u32 u; + s32 s; + + max_gpio_val = (1 << multi->gpios->ndescs) - 1; + + ret = of_property_count_u32_elems(np, ranges_property); + if (ret < 0) + return ret; + + /* The ranges array cannot be empty */ + if (ret == 0) + return -EINVAL; + /* + * One range item is composed of 2 points and each point is composed of + * 2 values. + */ + if (ret % 4) + return -EINVAL; + + ranges->nb_ranges = ret / 4; + + /* The worst case is one range per possible gpio value */ + if (ranges->nb_ranges > max_gpio_val + 1) + return -EINVAL; + + ranges->tab_ranges = devm_kcalloc(dev, ranges->nb_ranges, + sizeof(*ranges->tab_ranges), + GFP_KERNEL); + if (!ranges->tab_ranges) + return -ENOMEM; + + multi->kctrl_max = 0; + for (i = 0; i < ranges->nb_ranges; i++) { + range = &ranges->tab_ranges[i]; + + /* First gpios value */ + ret = of_property_read_u32_index(np, ranges_property, i * 4, &u); + if (ret) + return ret; + if (u > max_gpio_val) + return -EINVAL; + + range->min.gpio_val = u; + + /* First Gain value */ + ret = of_property_read_s32_index(np, ranges_property, i * 4 + 1, &s); + if (ret) + return ret; + + range->min.gain_db = s; + + /* Second gpios value */ + ret = of_property_read_u32_index(np, ranges_property, i * 4 + 2, &u); + if (ret) + return ret; + if (u > max_gpio_val) + return -EINVAL; + + range->max.gpio_val = u; + + /* Second Gain value */ + ret = of_property_read_s32_index(np, ranges_property, i * 4 + 3, &s); + if (ret) + return ret; + + range->max.gain_db = s; + + /* Save the first point for later usage */ + if (i == 0) + first_point = range->min; + + /* Fix min and max if needed */ + if (range->min.gain_db > range->max.gain_db) + swap(range->min, range->max); + + ret = simple_amp_check_new_range(range, ranges->tab_ranges, i); + if (ret) + return ret; + + range->nb_points = abs_diff(range->min.gpio_val, + range->max.gpio_val) + 1; + + multi->kctrl_max += range->nb_points; + } + + multi->kctrl_max -= 1; + + /* Sort the tab_range array by gain_db value */ + sort(ranges->tab_ranges, ranges->nb_ranges, sizeof(*ranges->tab_ranges), + simple_amp_cmp_ranges, NULL); + + /* + * multi->kctrl_val is the index in tab_ranges. + * + * Choose to have the initial amplification value set to the first point + * available in the first range available in the tab_ranges array before + * sorting. + * + * This first point has been identified before sorting. Search for it in + * the sorted array in order to set the multi->kctrl_val initial value. + */ + multi->kctrl_val = 0; + for (i = 0; i < ranges->nb_ranges; i++) { + range = &ranges->tab_ranges[i]; + + if (range->min.gpio_val == first_point.gpio_val && + range->min.gain_db == first_point.gain_db) + break; + + multi->kctrl_val += range->nb_points; + + if (range->max.gpio_val == first_point.gpio_val && + range->max.gain_db == first_point.gain_db) { + multi->kctrl_val--; + break; + } + } + + return 0; +} + +static int simple_amp_parse_labels(struct device *dev, + struct simple_amp_multi *multi, + const char *labels_property) +{ + struct simple_amp_labels *labels = &multi->labels; + struct device_node *np = dev->of_node; + int ret; + + ret = of_property_count_strings(np, labels_property); + if (ret < 0) + return ret; + + /* The labels array cannot be empty */ + if (ret == 0) + return -EINVAL; + + labels->nb_labels = ret; + if (labels->nb_labels > (1 << multi->gpios->ndescs)) + return -EINVAL; + + labels->tab_labels = devm_kcalloc(dev, labels->nb_labels, + sizeof(*labels->tab_labels), + GFP_KERNEL); + if (!labels->tab_labels) + return -ENOMEM; + + multi->kctrl_max = labels->nb_labels - 1; + multi->kctrl_val = 0; + + return of_property_read_string_array(np, labels_property, labels->tab_labels, + labels->nb_labels); +} + +static int simple_amp_parse_multi_gpio(struct device *dev, + struct simple_amp_multi *multi, + const char *gpios_property, + const char *ranges_property, + const char *labels_property) +{ + struct device_node *np = dev->of_node; + int ret; + + /* Start with the value 0 (GPIO inactive). Can be changed later */ + multi->kctrl_val = 0; + multi->gpios = devm_gpiod_get_array_optional(dev, gpios_property, GPIOD_OUT_LOW); + if (IS_ERR(multi->gpios)) + return dev_err_probe(dev, PTR_ERR(multi->gpios), + "Failed to get '%s' gpios\n", + gpios_property); + if (!multi->gpios) + return 0; + + if (multi->gpios->ndescs > 16) + return dev_err_probe(dev, -EINVAL, + "Number of '%s' gpios limited to 16\n", + gpios_property); + + /* Set default value for the kctrl_max. Can be changed later */ + multi->kctrl_max = (1 << multi->gpios->ndescs) - 1; + + multi->mode = SIMPLE_AMP_MODE_NONE; + if (of_property_present(np, ranges_property)) { + ret = simple_amp_parse_ranges(dev, multi, ranges_property); + if (ret < 0) + return dev_err_probe(dev, ret, "Failed to parse '%s'\n", + ranges_property); + multi->mode = SIMPLE_AMP_MODE_RANGES; + } else if (of_property_present(np, labels_property)) { + ret = simple_amp_parse_labels(dev, multi, labels_property); + if (ret < 0) + return dev_err_probe(dev, ret, "Failed to parse '%s'\n", + labels_property); + + multi->mode = SIMPLE_AMP_MODE_LABELS; + } + + return 0; +} + static int simple_amp_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; - struct simple_amp *priv; + struct simple_amp *simple_amp; + int ret; - priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL); - if (priv == NULL) + simple_amp = devm_kzalloc(dev, sizeof(*simple_amp), GFP_KERNEL); + if (!simple_amp) return -ENOMEM; - platform_set_drvdata(pdev, priv); + platform_set_drvdata(pdev, simple_amp); - priv->gpiod_enable = devm_gpiod_get_optional(dev, "enable", - GPIOD_OUT_LOW); - if (IS_ERR(priv->gpiod_enable)) - return dev_err_probe(dev, PTR_ERR(priv->gpiod_enable), + simple_amp->data = of_device_get_match_data(dev); + if (!simple_amp->data) + return -EINVAL; + + simple_amp->gpiod_enable = devm_gpiod_get_optional(dev, "enable", + GPIOD_OUT_LOW); + if (IS_ERR(simple_amp->gpiod_enable)) + return dev_err_probe(dev, PTR_ERR(simple_amp->gpiod_enable), "Failed to get 'enable' gpio"); + if (simple_amp->data->supports & SIMPLE_AUDIO_SUPPORT_MUTE) { + ret = simple_amp_parse_single_gpio(dev, &simple_amp->mute, "mute"); + if (ret) + return ret; + } + + if (simple_amp->data->supports & SIMPLE_AUDIO_SUPPORT_BYPASS) { + ret = simple_amp_parse_single_gpio(dev, &simple_amp->bypass, "bypass"); + if (ret) + return ret; + } + + if (simple_amp->data->supports & SIMPLE_AUDIO_SUPPORT_PGA) { + ret = simple_amp_parse_multi_gpio(dev, &simple_amp->gain, "gain", + "gain-ranges", "gain-labels"); + if (ret) + return ret; + } + + /* Set controls name */ + simple_amp->gain.control_name = "Volume"; + simple_amp->mute.control_name = "Switch"; + simple_amp->bypass.control_name = "Bypass Switch"; + + if (simple_amp->gain.mode == SIMPLE_AMP_MODE_LABELS) { + /* + * The gain widget control will use enumerated values. + * + * Having just "Voltage" and "Switch" widget names with + * enumerated values and boolean value can confuse ALSA in terms + * of possible values (strings). + * + * Make things clear and avoid the just "Switch" name in that + * case. + */ + simple_amp->mute.control_name = "Out Switch"; + } + return devm_snd_soc_register_component(dev, &simple_amp_component_driver, NULL, 0); } -#ifdef CONFIG_OF +static const struct simple_amp_data simple_audio_amplifier_data = { + .dapm_widgets = simple_amp_dapm_widgets, + .num_dapm_widgets = ARRAY_SIZE(simple_amp_dapm_widgets), + .dapm_routes = simple_amp_dapm_routes, + .num_dapm_routes = ARRAY_SIZE(simple_amp_dapm_routes), +}; + +static const struct simple_amp_data simple_audio_mono_pga_data = { + .supports = SIMPLE_AUDIO_SUPPORT_PGA | + SIMPLE_AUDIO_SUPPORT_POWER_SUPPLIES | + SIMPLE_AUDIO_SUPPORT_MUTE | + SIMPLE_AUDIO_SUPPORT_BYPASS, + .dapm_widgets = simple_amp_mono_pga_dapm_widgets, + .num_dapm_widgets = ARRAY_SIZE(simple_amp_mono_pga_dapm_widgets), + .dapm_routes = simple_amp_mono_pga_dapm_routes, + .num_dapm_routes = ARRAY_SIZE(simple_amp_mono_pga_dapm_routes), +}; + +static const struct simple_amp_data simple_audio_stereo_pga_data = { + .supports = SIMPLE_AUDIO_SUPPORT_PGA | + SIMPLE_AUDIO_SUPPORT_POWER_SUPPLIES | + SIMPLE_AUDIO_SUPPORT_MUTE | + SIMPLE_AUDIO_SUPPORT_BYPASS, + .dapm_widgets = simple_amp_stereo_pga_dapm_widgets, + .num_dapm_widgets = ARRAY_SIZE(simple_amp_stereo_pga_dapm_widgets), + .dapm_routes = simple_amp_stereo_pga_dapm_routes, + .num_dapm_routes = ARRAY_SIZE(simple_amp_stereo_pga_dapm_routes), +}; + static const struct of_device_id simple_amp_ids[] = { - { .compatible = "dioo,dio2125", }, - { .compatible = "simple-audio-amplifier", }, + { .compatible = "dioo,dio2125", .data = &simple_audio_amplifier_data}, + { .compatible = "simple-audio-amplifier", .data = &simple_audio_amplifier_data}, + { .compatible = "gpio-audio-amp-mono", .data = &simple_audio_mono_pga_data}, + { .compatible = "gpio-audio-amp-stereo", .data = &simple_audio_stereo_pga_data}, { } }; MODULE_DEVICE_TABLE(of, simple_amp_ids); -#endif static struct platform_driver simple_amp_driver = { .driver = { - .name = DRV_NAME, - .of_match_table = of_match_ptr(simple_amp_ids), + .name = "simple-amplifier", + .of_match_table = simple_amp_ids, }, .probe = simple_amp_probe, }; @@ -107,4 +972,5 @@ module_platform_driver(simple_amp_driver); MODULE_DESCRIPTION("ASoC Simple Audio Amplifier driver"); MODULE_AUTHOR("Jerome Brunet "); +MODULE_AUTHOR("Herve Codina "); MODULE_LICENSE("GPL"); diff --git a/sound/soc/codecs/tac5xx2-sdw.c b/sound/soc/codecs/tac5xx2-sdw.c index 917b36ac1cd3b4..bb12cfb6da12bc 100644 --- a/sound/soc/codecs/tac5xx2-sdw.c +++ b/sound/soc/codecs/tac5xx2-sdw.c @@ -1437,7 +1437,6 @@ static s32 tac5xx2_sdca_dev_resume(struct device *dev) { struct tac5xx2_prv *tac_dev = dev_get_drvdata(dev); struct sdw_slave *slave = dev_to_sdw_dev(dev); - unsigned long t; int ret; if (!tac_dev->first_hw_init_done) { @@ -1445,19 +1444,10 @@ static s32 tac5xx2_sdca_dev_resume(struct device *dev) return 0; } - if (!slave->unattach_request) - goto regmap_sync; - - t = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(TAC5XX2_PROBE_TIMEOUT_MS)); - if (!t) { - dev_err(&slave->dev, "resume: initialization timed out\n"); - sdw_show_ping_status(slave->bus, true); - return -ETIMEDOUT; - } - slave->unattach_request = 0; + ret = sdw_slave_wait_for_init(slave, TAC5XX2_PROBE_TIMEOUT_MS); + if (ret) + return ret; -regmap_sync: regcache_cache_only(tac_dev->regmap, false); regcache_mark_dirty(tac_dev->regmap); ret = regcache_sync(tac_dev->regmap); diff --git a/sound/soc/codecs/tas2783-sdw.c b/sound/soc/codecs/tas2783-sdw.c index 38009168c5a114..69d03ddc7a0baf 100644 --- a/sound/soc/codecs/tas2783-sdw.c +++ b/sound/soc/codecs/tas2783-sdw.c @@ -1082,22 +1082,12 @@ static s32 tas2783_sdca_dev_resume(struct device *dev) { struct sdw_slave *slave = dev_to_sdw_dev(dev); struct tas2783_prv *tas_dev = dev_get_drvdata(dev); - unsigned long t; - - if (!slave->unattach_request) - goto regmap_sync; - - t = wait_for_completion_timeout(&slave->initialization_complete, - msecs_to_jiffies(TAS2783_PROBE_TIMEOUT)); - if (!t) { - dev_err(&slave->dev, "resume: initialization timed out\n"); - sdw_show_ping_status(slave->bus, true); - return -ETIMEDOUT; - } + int ret; - slave->unattach_request = 0; + ret = sdw_slave_wait_for_init(slave, TAS2783_PROBE_TIMEOUT); + if (ret) + return ret; -regmap_sync: regcache_cache_only(tas_dev->regmap, false); regcache_sync(tas_dev->regmap); return 0; diff --git a/sound/soc/fsl/eukrea-tlv320.c b/sound/soc/fsl/eukrea-tlv320.c index 6be074ea0b3f04..5bb31a5cdf23c1 100644 --- a/sound/soc/fsl/eukrea-tlv320.c +++ b/sound/soc/fsl/eukrea-tlv320.c @@ -19,7 +19,6 @@ #include #include #include -#include #include "../codecs/tlv320aic23.h" #include "imx-ssi.h" @@ -142,7 +141,7 @@ static int eukrea_tlv320_probe(struct platform_device *pdev) eukrea_tlv320.name = "cpuimx-audio"; } - if (machine_is_eukrea_cpuimx27() || + if (of_machine_is_compatible("eukrea,cpuimx27") || (tmp_np = of_find_compatible_node(NULL, NULL, "fsl,imx21-audmux"))) { imx_audmux_v1_configure_port(MX27_AUDMUX_HPCR1_SSI0, IMX_AUDMUX_V1_PCR_SYN | @@ -159,12 +158,12 @@ static int eukrea_tlv320_probe(struct platform_device *pdev) IMX_AUDMUX_V1_PCR_RXDSEL(MX27_AUDMUX_HPCR1_SSI0) ); of_node_put(tmp_np); - } else if (machine_is_eukrea_cpuimx25sd() || - machine_is_eukrea_cpuimx35sd() || - machine_is_eukrea_cpuimx51sd() || + } else if (of_machine_is_compatible("eukrea,cpuimx25") || + of_machine_is_compatible("eukrea,cpuimx35") || + of_machine_is_compatible("eukrea,cpuimx51") || (tmp_np = of_find_compatible_node(NULL, NULL, "fsl,imx31-audmux"))) { if (!np) - ext_port = machine_is_eukrea_cpuimx25sd() ? + ext_port = of_machine_is_compatible("eukrea,cpuimx25") ? 4 : 3; imx_audmux_v2_configure_port(int_port, diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c index 87a40e2b9fdf77..d6dd95680892b0 100644 --- a/sound/soc/fsl/fsl_sai.c +++ b/sound/soc/fsl/fsl_sai.c @@ -1374,6 +1374,31 @@ static int fsl_sai_check_version(struct device *dev) return 0; } +static int fsl_sai_reset_hw(struct device *dev) +{ + struct fsl_sai *sai = dev_get_drvdata(dev); + unsigned char ofs = sai->soc_data->reg_offset; + int ret; + + /* + * Clear TCSR/RCSR to reset SAI and disable all interrupts. + * Bootloader may leave SAI running causing interrupt storm. + */ + ret = regmap_write(sai->regmap, FSL_SAI_TCSR(ofs), 0); + if (ret) { + dev_err(dev, "Failed to clear TCSR: %d\n", ret); + return ret; + } + + ret = regmap_write(sai->regmap, FSL_SAI_RCSR(ofs), 0); + if (ret) { + dev_err(dev, "Failed to clear RCSR: %d\n", ret); + return ret; + } + + return 0; +} + /* * Calculate the offset between first two datalines, don't * different offset in one case. @@ -1580,13 +1605,6 @@ static int fsl_sai_probe(struct platform_device *pdev) if (irq < 0) return irq; - ret = devm_request_irq(dev, irq, fsl_sai_isr, IRQF_SHARED, - np->name, sai); - if (ret) { - dev_err(dev, "failed to claim irq %u\n", irq); - return ret; - } - memcpy(&sai->cpu_dai_drv, fsl_sai_dai_template, sizeof(*fsl_sai_dai_template) * ARRAY_SIZE(fsl_sai_dai_template)); @@ -1661,6 +1679,10 @@ static int fsl_sai_probe(struct platform_device *pdev) if (ret < 0) dev_warn(dev, "Error reading SAI version: %d\n", ret); + ret = fsl_sai_reset_hw(dev); + if (ret < 0) + dev_warn(dev, "Failed to reset hardware: %d\n", ret); + /* Select MCLK direction */ if (sai->mclk_direction_output && sai->soc_data->max_register >= FSL_SAI_MCTL) { @@ -1672,6 +1694,13 @@ static int fsl_sai_probe(struct platform_device *pdev) if (ret < 0 && ret != -ENOSYS) goto err_pm_get_sync; + ret = devm_request_irq(dev, irq, fsl_sai_isr, IRQF_SHARED, + np->name, sai); + if (ret) { + dev_err(dev, "failed to claim irq %u\n", irq); + goto err_pm_get_sync; + } + if (of_device_is_compatible(np, "fsl,imx952-sai") && !of_property_read_string(np, "fsl,sai-amix-mode", &str)) { if (!strcmp(str, "bypass")) diff --git a/sound/soc/intel/boards/Kconfig b/sound/soc/intel/boards/Kconfig index d53af8f7e55b80..cddbd2aa424ebb 100644 --- a/sound/soc/intel/boards/Kconfig +++ b/sound/soc/intel/boards/Kconfig @@ -532,6 +532,7 @@ config SND_SOC_INTEL_SOUNDWIRE_SOF_MACH select MFD_CS42L43_SDW select SND_SOC_CS35L56_SPI select SND_SOC_CS35L56_SDW + select SND_SOC_ES9356 select SND_SOC_DMIC select SND_SOC_INTEL_HDA_DSP_COMMON imply SND_SOC_SDW_MOCKUP diff --git a/sound/soc/intel/common/soc-acpi-intel-arl-match.c b/sound/soc/intel/common/soc-acpi-intel-arl-match.c index 52c5b5719f51c0..59bfd5248819f4 100644 --- a/sound/soc/intel/common/soc-acpi-intel-arl-match.c +++ b/sound/soc/intel/common/soc-acpi-intel-arl-match.c @@ -193,6 +193,42 @@ static const struct snd_soc_acpi_endpoint cs42l43_endpoints[] = { }, }; +static const struct snd_soc_acpi_endpoint es9356_endpoints[] = { + { /* Jack Playback Endpoint */ + .num = 0, + .aggregated = 0, + .group_position = 0, + .group_id = 0, + }, + { /* DMIC Capture Endpoint */ + .num = 1, + .aggregated = 0, + .group_position = 0, + .group_id = 0, + }, + { /* Jack Capture Endpoint */ + .num = 2, + .aggregated = 0, + .group_position = 0, + .group_id = 0, + }, + { /* Speaker Playback Endpoint */ + .num = 3, + .aggregated = 0, + .group_position = 0, + .group_id = 0, + }, +}; + +static const struct snd_soc_acpi_adr_device es9356_adr[] = { + { + .adr = 0x00013004b3935601ull, + .num_endpoints = ARRAY_SIZE(es9356_endpoints), + .endpoints = es9356_endpoints, + .name_prefix = "es9356" + } +}; + static const struct snd_soc_acpi_adr_device cs42l43_0_adr[] = { { .adr = 0x00003001FA424301ull, @@ -265,6 +301,15 @@ static const struct snd_soc_acpi_adr_device rt1320_2_single_adr[] = { } }; +static const struct snd_soc_acpi_link_adr arl_n_mrd_es9356_link1[] = { + { + .mask = BIT(1), + .num_adr = ARRAY_SIZE(es9356_adr), + .adr_d = es9356_adr, + }, + {} +}; + static const struct snd_soc_acpi_adr_device rt1320_3_group1_adr[] = { { .adr = 0x000330025D132001ull, @@ -569,6 +614,13 @@ struct snd_soc_acpi_mach snd_soc_acpi_intel_arl_sdw_machines[] = { .sof_tplg_filename = "sof-arl-cs42l43-l2.tplg", .get_function_tplg_files = sof_sdw_get_tplg_files, }, + { + .link_mask = BIT(1), + .links = arl_n_mrd_es9356_link1, + .drv_name = "sof_sdw", + .sof_tplg_filename = "sof-arl-es9356.tplg", + .get_function_tplg_files = sof_sdw_get_tplg_files, + }, {}, }; EXPORT_SYMBOL_GPL(snd_soc_acpi_intel_arl_sdw_machines); diff --git a/sound/soc/intel/common/soc-acpi-intel-ptl-match.c b/sound/soc/intel/common/soc-acpi-intel-ptl-match.c index 3b7818355ff645..ad3af8834e431a 100644 --- a/sound/soc/intel/common/soc-acpi-intel-ptl-match.c +++ b/sound/soc/intel/common/soc-acpi-intel-ptl-match.c @@ -493,6 +493,20 @@ static const struct snd_soc_acpi_link_adr ptl_sdw_rt713_vb_l3_rt1320_l12[] = { {} }; +static const struct snd_soc_acpi_link_adr ptl_sdw_rt713_vb_l3_rt1320_l1[] = { + { + .mask = BIT(3), + .num_adr = ARRAY_SIZE(rt713_vb_3_adr), + .adr_d = rt713_vb_3_adr, + }, + { + .mask = BIT(1), + .num_adr = ARRAY_SIZE(rt1320_1_group2_adr), + .adr_d = rt1320_1_group2_adr, + }, + {} +}; + static const struct snd_soc_acpi_link_adr ptl_sdw_rt712_vb_l2_rt1320_l1[] = { { .mask = BIT(2), @@ -578,6 +592,13 @@ struct snd_soc_acpi_mach snd_soc_acpi_intel_ptl_sdw_machines[] = { .sof_tplg_filename = "sof-ptl-rt713-l3-rt1320-l12.tplg", .get_function_tplg_files = sof_sdw_get_tplg_files, }, + { + .link_mask = BIT(1) | BIT(3), + .links = ptl_sdw_rt713_vb_l3_rt1320_l1, + .drv_name = "sof_sdw", + .sof_tplg_filename = "sof-ptl-rt713-l3-rt1320-l1.tplg", + .get_function_tplg_files = sof_sdw_get_tplg_files, + }, { .link_mask = BIT(1) | BIT(2) | BIT(3), .links = ptl_cs42l43_l2_cs35l56x6_l13, diff --git a/sound/soc/mediatek/mt8196/mt8196-afe-pcm.c b/sound/soc/mediatek/mt8196/mt8196-afe-pcm.c index 511e888567be87..a1ae8322d8b6cf 100644 --- a/sound/soc/mediatek/mt8196/mt8196-afe-pcm.c +++ b/sound/soc/mediatek/mt8196/mt8196-afe-pcm.c @@ -2242,13 +2242,16 @@ static int mt8196_afe_runtime_resume(struct device *dev) static int mt8196_afe_component_probe(struct snd_soc_component *component) { struct mtk_base_afe *afe = snd_soc_component_get_drvdata(component); + int ret; + + /* enable clock for regcache get default value from hw */ + ret = pm_runtime_resume_and_get(afe->dev); + if (ret) + return dev_err_probe(afe->dev, ret, "failed to resume device\n"); + + mtk_afe_add_sub_dai_control(component); + pm_runtime_put_sync(afe->dev); - if (component) { - /* enable clock for regcache get default value from hw */ - pm_runtime_get_sync(afe->dev); - mtk_afe_add_sub_dai_control(component); - pm_runtime_put_sync(afe->dev); - } return 0; } @@ -2306,6 +2309,11 @@ static const struct reg_sequence mt8196_cg_patch[] = { { AUDIO_TOP_CON4, 0x361c }, }; +static void mt8196_afe_release_reserved_mem(void *data) +{ + of_reserved_mem_device_release(data); +} + static int mt8196_afe_pcm_dev_probe(struct platform_device *pdev) { int ret, i; @@ -2320,8 +2328,13 @@ static int mt8196_afe_pcm_dev_probe(struct platform_device *pdev) return ret; ret = of_reserved_mem_device_init(dev); - if (ret) + if (ret) { dev_err(dev, "failed to assign memory region: %d\n", ret); + } else { + ret = devm_add_action_or_reset(dev, mt8196_afe_release_reserved_mem, dev); + if (ret) + return ret; + } afe = devm_kzalloc(dev, sizeof(*afe), GFP_KERNEL); if (!afe) @@ -2422,18 +2435,22 @@ static int mt8196_afe_pcm_dev_probe(struct platform_device *pdev) dev_pm_syscore_device(dev, true); /* enable clock for regcache get default value from hw */ - pm_runtime_get_sync(dev); + ret = pm_runtime_resume_and_get(dev); + if (ret) + return dev_err_probe(dev, ret, "failed to resume device\n"); afe->regmap = devm_regmap_init_mmio(dev, afe->base_addr, &mt8196_afe_regmap_config); - if (IS_ERR(afe->regmap)) - return PTR_ERR(afe->regmap); + if (IS_ERR(afe->regmap)) { + ret = PTR_ERR(afe->regmap); + goto err_pm_put; + } ret = regmap_register_patch(afe->regmap, mt8196_cg_patch, ARRAY_SIZE(mt8196_cg_patch)); if (ret < 0) { dev_err(dev, "Failed to apply cg patch\n"); - goto err_pm_disable; + goto err_pm_put; } regmap_read(afe->regmap, AFE_IRQ_MCU_EN, &tmp_reg); @@ -2452,12 +2469,12 @@ static int mt8196_afe_pcm_dev_probe(struct platform_device *pdev) afe->num_dai_drivers); if (ret) { dev_err(dev, "afe component err\n"); - goto err_pm_disable; + return ret; } return 0; -err_pm_disable: +err_pm_put: pm_runtime_put_sync(dev); return ret; } @@ -2467,7 +2484,6 @@ static void mt8196_afe_pcm_dev_remove(struct platform_device *pdev) struct mtk_base_afe *afe = platform_get_drvdata(pdev); struct device *dev = &pdev->dev; - pm_runtime_put_sync(dev); if (!pm_runtime_status_suspended(dev)) mt8196_afe_runtime_suspend(dev); diff --git a/sound/soc/qcom/qdsp6/q6apm-dai.c b/sound/soc/qcom/qdsp6/q6apm-dai.c index ede19fdea6e9ee..3a1be41df096cb 100644 --- a/sound/soc/qcom/qdsp6/q6apm-dai.c +++ b/sound/soc/qcom/qdsp6/q6apm-dai.c @@ -497,7 +497,12 @@ static int q6apm_dai_pcm_new(struct snd_soc_component *component, struct snd_soc { struct snd_soc_dai *cpu_dai = snd_soc_rtd_to_cpu(rtd, 0); struct snd_pcm *pcm = rtd->pcm; - int size = BUFFER_BYTES_MAX; + /* + * Allocate one extra page as a workaround for a DSP bug where 32-bit + * address arithmetic can overflow when the buffer is placed near the + * end of the addressable range. + */ + int size = BUFFER_BYTES_MAX + PAGE_SIZE; int graph_id, ret; struct snd_pcm_substream *substream; diff --git a/sound/soc/sdw_utils/Makefile b/sound/soc/sdw_utils/Makefile index a8d091fd374b6e..5ae8c69b8b98b7 100644 --- a/sound/soc/sdw_utils/Makefile +++ b/sound/soc/sdw_utils/Makefile @@ -8,6 +8,7 @@ snd-soc-sdw-utils-y := soc_sdw_utils.o soc_sdw_dmic.o soc_sdw_rt_dmic.o \ soc_sdw_cs42l45.o \ soc_sdw_cs47l47.o \ soc_sdw_cs_amp.o \ + soc_sdw_es9356.o \ soc_sdw_maxim.o \ soc_sdw_ti_amp.o obj-$(CONFIG_SND_SOC_SDW_UTILS) += snd-soc-sdw-utils.o diff --git a/sound/soc/sdw_utils/soc_sdw_es9356.c b/sound/soc/sdw_utils/soc_sdw_es9356.c new file mode 100644 index 00000000000000..aa405e7dad9942 --- /dev/null +++ b/sound/soc/sdw_utils/soc_sdw_es9356.c @@ -0,0 +1,230 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Based on sof_sdw_rt5682.c +// This file incorporates work covered by the following copyright notice: +// Copyright (c) 2023 Intel Corporation +// Copyright (c) 2024 Advanced Micro Devices, Inc. +// Copyright (c) 2025 Everest Semiconductor Co., Ltd + +/* + * soc_sdw_es9356 - Helpers to handle ES9356 from generic machine driver + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * Note this MUST be called before snd_soc_register_card(), so that the props + * are in place before the codec component driver's probe function parses them. + */ +static int es9356_add_codec_device_props(struct device *sdw_dev, unsigned long quirk) +{ + struct property_entry props[SOC_SDW_MAX_NO_PROPS] = {}; + struct fwnode_handle *fwnode; + int ret; + + if (!SOC_SDW_JACK_JDSRC(quirk)) + return 0; + props[0] = PROPERTY_ENTRY_U32("everest,jd-src", SOC_SDW_JACK_JDSRC(quirk)); + + fwnode = fwnode_create_software_node(props, NULL); + if (IS_ERR(fwnode)) + return PTR_ERR(fwnode); + + ret = device_add_software_node(sdw_dev, to_software_node(fwnode)); + + fwnode_handle_put(fwnode); + + return ret; +} + +static const struct snd_soc_dapm_route es9356_map[] = { + /* Headphones */ + { "Headphone", NULL, "es9356 HP" }, + { "es9356 MIC1", NULL, "Headset Mic" }, +}; + +static const struct snd_soc_dapm_route es9356_spk_map[] = { + /* Speaker */ + { "Speaker", NULL, "es9356 SPK" }, +}; + +static const struct snd_soc_dapm_route es9356_dmic_map[] = { + /* DMIC */ + { "es9356 PDM_DIN", NULL, "DMIC" }, +}; + +static struct snd_soc_jack_pin es9356_jack_pins[] = { + { + .pin = "Headphone", + .mask = SND_JACK_HEADPHONE, + }, + { + .pin = "Headset Mic", + .mask = SND_JACK_MICROPHONE, + }, +}; + +int asoc_sdw_es9356_spk_rtd_init(struct snd_soc_pcm_runtime *rtd, struct snd_soc_dai *dai) +{ + struct snd_soc_card *card = rtd->card; + struct snd_soc_dapm_context *dapm = snd_soc_card_to_dapm(card); + int ret; + + card->components = devm_kasprintf(card->dev, GFP_KERNEL, + "%s spk:es9356-spk", + card->components); + if (!card->components) + return -ENOMEM; + + ret = snd_soc_dapm_add_routes(dapm, es9356_spk_map, + ARRAY_SIZE(es9356_spk_map)); + if (ret) + dev_err(card->dev, "es9356 map addition failed: %d\n", ret); + + return ret; +} +EXPORT_SYMBOL_NS(asoc_sdw_es9356_spk_rtd_init, "SND_SOC_SDW_UTILS"); + +int asoc_sdw_es9356_dmic_rtd_init(struct snd_soc_pcm_runtime *rtd, struct snd_soc_dai *dai) +{ + struct snd_soc_card *card = rtd->card; + struct snd_soc_dapm_context *dapm = snd_soc_card_to_dapm(card); + int ret; + + card->components = devm_kasprintf(card->dev, GFP_KERNEL, + "%s mic:es9356-dmic", + card->components); + if (!card->components) + return -ENOMEM; + + ret = snd_soc_dapm_add_routes(dapm, es9356_dmic_map, + ARRAY_SIZE(es9356_dmic_map)); + if (ret) + dev_err(card->dev, "es9356 map addition failed: %d\n", ret); + + return ret; +} +EXPORT_SYMBOL_NS(asoc_sdw_es9356_dmic_rtd_init, "SND_SOC_SDW_UTILS"); + +int asoc_sdw_es9356_rtd_init(struct snd_soc_pcm_runtime *rtd, struct snd_soc_dai *dai) +{ + struct snd_soc_card *card = rtd->card; + struct snd_soc_dapm_context *dapm = snd_soc_card_to_dapm(card); + struct asoc_sdw_mc_private *ctx = snd_soc_card_get_drvdata(card); + struct snd_soc_component *component; + struct snd_soc_jack *jack; + int ret; + + component = dai->component; + card->components = devm_kasprintf(card->dev, GFP_KERNEL, + "%s hs:es9356", + card->components); + if (!card->components) + return -ENOMEM; + + ret = snd_soc_dapm_add_routes(dapm, es9356_map, + ARRAY_SIZE(es9356_map)); + + if (ret) { + dev_err(card->dev, "es9356 map addition failed: %d\n", ret); + return ret; + } + + ret = snd_soc_card_jack_new_pins(rtd->card, "Headset Jack", + SND_JACK_HEADSET | SND_JACK_BTN_0 | + SND_JACK_BTN_1 | SND_JACK_BTN_2 | + SND_JACK_BTN_3 | SND_JACK_BTN_4, + &ctx->sdw_headset, + es9356_jack_pins, + ARRAY_SIZE(es9356_jack_pins)); + if (ret) { + dev_err(rtd->card->dev, "Headset Jack creation failed: %d\n", + ret); + return ret; + } + + jack = &ctx->sdw_headset; + + snd_jack_set_key(jack->jack, SND_JACK_BTN_0, KEY_PLAYPAUSE); + snd_jack_set_key(jack->jack, SND_JACK_BTN_1, KEY_VOLUMEUP); + snd_jack_set_key(jack->jack, SND_JACK_BTN_2, KEY_VOLUMEDOWN); + snd_jack_set_key(jack->jack, SND_JACK_BTN_3, KEY_NEXTSONG); + snd_jack_set_key(jack->jack, SND_JACK_BTN_4, KEY_PREVIOUSSONG); + + ret = snd_soc_component_set_jack(component, jack, NULL); + + if (ret) + dev_err(rtd->card->dev, "Headset Jack call-back failed: %d\n", + ret); + + return ret; +} +EXPORT_SYMBOL_NS(asoc_sdw_es9356_rtd_init, "SND_SOC_SDW_UTILS"); + +int asoc_sdw_es9356_exit(struct snd_soc_card *card, struct snd_soc_dai_link *dai_link) +{ + struct asoc_sdw_mc_private *ctx = snd_soc_card_get_drvdata(card); + + if (!ctx->headset_codec_dev) + return 0; + + device_remove_software_node(ctx->headset_codec_dev); + put_device(ctx->headset_codec_dev); + + return 0; +} +EXPORT_SYMBOL_NS(asoc_sdw_es9356_exit, "SND_SOC_SDW_UTILS"); + +int asoc_sdw_es9356_init(struct snd_soc_card *card, + struct snd_soc_dai_link *dai_links, + struct asoc_sdw_codec_info *info, + bool playback) +{ + struct asoc_sdw_mc_private *ctx = snd_soc_card_get_drvdata(card); + struct device *sdw_dev; + int ret; + + /* + * headset should be initialized once. + * Do it with dai link for playback. + */ + if (!playback) + return 0; + + sdw_dev = bus_find_device_by_name(&sdw_bus_type, NULL, dai_links->codecs[0].name); + if (!sdw_dev) + return -EPROBE_DEFER; + + ret = es9356_add_codec_device_props(sdw_dev, ctx->mc_quirk); + if (ret < 0) { + put_device(sdw_dev); + return ret; + } + ctx->headset_codec_dev = sdw_dev; + + return 0; +} +EXPORT_SYMBOL_NS(asoc_sdw_es9356_init, "SND_SOC_SDW_UTILS"); + +int asoc_sdw_es9356_amp_init(struct snd_soc_card *card, + struct snd_soc_dai_link *dai_links, + struct asoc_sdw_codec_info *info, + bool playback) +{ + if (!playback) + return 0; + + info->amp_num++; + + return 0; +} +EXPORT_SYMBOL_NS(asoc_sdw_es9356_amp_init, "SND_SOC_SDW_UTILS"); diff --git a/sound/soc/sdw_utils/soc_sdw_ti_amp.c b/sound/soc/sdw_utils/soc_sdw_ti_amp.c index f156116fbeb652..d0ae5d7efe8f71 100644 --- a/sound/soc/sdw_utils/soc_sdw_ti_amp.c +++ b/sound/soc/sdw_utils/soc_sdw_ti_amp.c @@ -105,18 +105,12 @@ static int asoc_sdw_ti_add_tac5xx2_routes(struct snd_soc_dapm_context *dapm, struct snd_soc_dapm_route routes[2]; char left_widget[TAC5XX2_WIDGET_NAME_MAX]; char right_widget[TAC5XX2_WIDGET_NAME_MAX]; - int ret; if (strlen(name_prefix) > (TAC5XX2_WIDGET_NAME_MAX - 7)) return -ENAMETOOLONG; - ret = scnprintf(left_widget, sizeof(left_widget), "%s SPK_L", name_prefix); - if (ret <= 0) - return -EINVAL; - - ret = scnprintf(right_widget, sizeof(right_widget), "%s SPK_R", name_prefix); - if (ret <= 0) - return -EINVAL; + scnprintf(left_widget, sizeof(left_widget), "%s SPK_L", name_prefix); + scnprintf(right_widget, sizeof(right_widget), "%s SPK_R", name_prefix); routes[0] = (struct snd_soc_dapm_route){"Left Spk", NULL, left_widget}; routes[1] = (struct snd_soc_dapm_route){"Right Spk", NULL, right_widget}; diff --git a/sound/soc/sdw_utils/soc_sdw_utils.c b/sound/soc/sdw_utils/soc_sdw_utils.c index d8378d1b435e0f..9d0768f21ba4c7 100644 --- a/sound/soc/sdw_utils/soc_sdw_utils.c +++ b/sound/soc/sdw_utils/soc_sdw_utils.c @@ -1094,6 +1094,56 @@ struct asoc_sdw_codec_info codec_info_list[] = { }, .aux_num = 1, }, + { + .vendor_id = 0x04b3, + .part_id = 0x9356, + .name_prefix = "es9356", + .version_id = 3, + .dais = { + { + .direction = {true, false}, + .dai_name = "es9356-sdp-aif1", + .dai_type = SOC_SDW_DAI_TYPE_JACK, + .dailink = {SOC_SDW_JACK_OUT_DAI_ID, SOC_SDW_UNUSED_DAI_ID}, + .init = asoc_sdw_es9356_init, + .exit = asoc_sdw_es9356_exit, + .rtd_init = asoc_sdw_es9356_rtd_init, + .controls = generic_jack_controls, + .num_controls = ARRAY_SIZE(generic_jack_controls), + .widgets = generic_jack_widgets, + .num_widgets = ARRAY_SIZE(generic_jack_widgets), + }, + { + .direction = {false, true}, + .dai_name = "es9356-sdp-aif4", + .dai_type = SOC_SDW_DAI_TYPE_MIC, + .dailink = {SOC_SDW_UNUSED_DAI_ID, SOC_SDW_DMIC_DAI_ID}, + .rtd_init = asoc_sdw_es9356_dmic_rtd_init, + .widgets = generic_dmic_widgets, + .num_widgets = ARRAY_SIZE(generic_dmic_widgets), + }, + { + .direction = {false, true}, + .dai_name = "es9356-sdp-aif2", + .dai_type = SOC_SDW_DAI_TYPE_JACK, + .dailink = {SOC_SDW_UNUSED_DAI_ID, SOC_SDW_JACK_IN_DAI_ID}, + }, + { + .direction = {true, false}, + .dai_name = "es9356-sdp-aif3", + .component_name = "es9356", + .dai_type = SOC_SDW_DAI_TYPE_AMP, + .dailink = {SOC_SDW_AMP_OUT_DAI_ID, SOC_SDW_UNUSED_DAI_ID}, + .init = asoc_sdw_es9356_amp_init, + .rtd_init = asoc_sdw_es9356_spk_rtd_init, + .controls = generic_spk_controls, + .num_controls = ARRAY_SIZE(generic_spk_controls), + .widgets = generic_spk_widgets, + .num_widgets = ARRAY_SIZE(generic_spk_widgets), + }, + }, + .dai_num = 4, + }, { .vendor_id = 0x0105, .part_id = 0xaaaa, /* generic codec mockup */ @@ -1265,7 +1315,7 @@ int asoc_sdw_rtd_init(struct snd_soc_pcm_runtime *rtd) struct asoc_sdw_codec_info *codec_info; struct snd_soc_dai *dai; struct sdw_slave *sdw_peripheral; - const char *spk_components=""; + const char *spk_components = NULL; int dai_index; int ret; int i; @@ -1348,7 +1398,7 @@ int asoc_sdw_rtd_init(struct snd_soc_pcm_runtime *rtd) else component = codec_info->dais[dai_index].component_name; - if (strlen (spk_components) == 0) + if (!spk_components) spk_components = devm_kasprintf(card->dev, GFP_KERNEL, "%s", component); else @@ -1356,13 +1406,15 @@ int asoc_sdw_rtd_init(struct snd_soc_pcm_runtime *rtd) spk_components = devm_kasprintf(card->dev, GFP_KERNEL, "%s+%s", spk_components, component); + + if (!spk_components) + return -ENOMEM; } codec_info->dais[dai_index].rtd_init_done = true; - } - if (strlen (spk_components) > 0) { + if (spk_components) { /* Update card components for speaker components */ card->components = devm_kasprintf(card->dev, GFP_KERNEL, "%s spk:%s", card->components, spk_components); diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c index fb5090e4288c83..8c8e904b4bc253 100644 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -2159,6 +2159,7 @@ static int snd_soc_bind_card(struct snd_soc_card *card) snd_soc_fill_dummy_dai(card); snd_soc_dapm_init(dapm, card, NULL); + list_del_init(&card->list); /* check whether any platform is ignore machine FE and using topology */ soc_check_tplg_fes(card); @@ -2302,6 +2303,10 @@ static int snd_soc_bind_card(struct snd_soc_card *card) probe_end: if (ret < 0) soc_cleanup_card_resources(card); + if (ret == -EPROBE_DEFER) { + list_add(&card->list, &unbind_card_list); + ret = 0; + } snd_soc_card_mutex_unlock(card); return ret; @@ -2317,12 +2322,15 @@ static int devm_snd_soc_bind_card(struct device *dev, struct snd_soc_card *card) struct snd_soc_card **ptr; int ret; + /* The procedure may be called many times during the lifetime of the card. */ + devres_destroy(dev, devm_card_bind_release, NULL, NULL); + ptr = devres_alloc(devm_card_bind_release, sizeof(*ptr), GFP_KERNEL); if (!ptr) return -ENOMEM; ret = snd_soc_bind_card(card); - if (ret == 0 || ret == -EPROBE_DEFER) { + if (ret == 0) { *ptr = card; devres_add(dev, ptr); } else { @@ -2332,21 +2340,11 @@ static int devm_snd_soc_bind_card(struct device *dev, struct snd_soc_card *card) return ret; } -static int snd_soc_rebind_card(struct snd_soc_card *card) +static int call_soc_bind_card(struct snd_soc_card *card) { - int ret; - - if (card->devres_dev) { - devres_destroy(card->devres_dev, devm_card_bind_release, NULL, NULL); - ret = devm_snd_soc_bind_card(card->devres_dev, card); - } else { - ret = snd_soc_bind_card(card); - } - - if (ret != -EPROBE_DEFER) - list_del_init(&card->list); - - return ret; + if (card->devres_dev) + return devm_snd_soc_bind_card(card->devres_dev, card); + return snd_soc_bind_card(card); } /* probes a new socdev */ @@ -2544,8 +2542,6 @@ EXPORT_SYMBOL_GPL(snd_soc_add_dai_controls); */ int snd_soc_register_card(struct snd_soc_card *card) { - int ret; - if (!card->name || !card->dev) return -EINVAL; @@ -2571,17 +2567,7 @@ int snd_soc_register_card(struct snd_soc_card *card) guard(mutex)(&client_mutex); - if (card->devres_dev) { - ret = devm_snd_soc_bind_card(card->devres_dev, card); - if (ret == -EPROBE_DEFER) { - list_add(&card->list, &unbind_card_list); - ret = 0; - } - } else { - ret = snd_soc_bind_card(card); - } - - return ret; + return call_soc_bind_card(card); } EXPORT_SYMBOL_GPL(snd_soc_register_card); @@ -2901,7 +2887,7 @@ int snd_soc_add_component(struct snd_soc_component *component, list_add(&component->list, &component_list); list_for_each_entry_safe(card, c, &unbind_card_list, list) - snd_soc_rebind_card(card); + call_soc_bind_card(card); err_cleanup: if (ret < 0) diff --git a/sound/soc/soc-devres.c b/sound/soc/soc-devres.c index d33f83ec24f275..718165ba84ac87 100644 --- a/sound/soc/soc-devres.c +++ b/sound/soc/soc-devres.c @@ -49,11 +49,6 @@ int devm_snd_soc_register_component(struct device *dev, } EXPORT_SYMBOL_GPL(devm_snd_soc_register_component); -static void devm_card_release(struct device *dev, void *res) -{ - snd_soc_unregister_card(*(struct snd_soc_card **)res); -} - /** * devm_snd_soc_register_card - resource managed card registration * @dev: Device used to manage card @@ -63,32 +58,11 @@ static void devm_card_release(struct device *dev, void *res) * unregistered. */ int devm_snd_soc_register_card(struct device *dev, struct snd_soc_card *card) -{ - struct snd_soc_card **ptr; - int ret; - - ptr = devres_alloc(devm_card_release, sizeof(*ptr), GFP_KERNEL); - if (!ptr) - return -ENOMEM; - - ret = snd_soc_register_card(card); - if (ret == 0) { - *ptr = card; - devres_add(dev, ptr); - } else { - devres_free(ptr); - } - - return ret; -} -EXPORT_SYMBOL_GPL(devm_snd_soc_register_card); - -int devm_snd_soc_register_deferrable_card(struct device *dev, struct snd_soc_card *card) { card->devres_dev = dev; return snd_soc_register_card(card); } -EXPORT_SYMBOL_GPL(devm_snd_soc_register_deferrable_card); +EXPORT_SYMBOL_GPL(devm_snd_soc_register_card); #ifdef CONFIG_SND_SOC_GENERIC_DMAENGINE_PCM diff --git a/sound/soc/sof/amd/acp.c b/sound/soc/sof/amd/acp.c index 71a18f156de23b..f615b8d1c80208 100644 --- a/sound/soc/sof/amd/acp.c +++ b/sound/soc/sof/amd/acp.c @@ -223,7 +223,7 @@ static int psp_send_cmd(struct acp_dev_data *adata, int cmd) { struct snd_sof_dev *sdev = adata->dev; int ret; - u32 data; + int data; if (!cmd) return -EINVAL; diff --git a/sound/soc/tegra/tegra210_mixer.c b/sound/soc/tegra/tegra210_mixer.c index f05617b5f43356..c237ba7531de03 100644 --- a/sound/soc/tegra/tegra210_mixer.c +++ b/sound/soc/tegra/tegra210_mixer.c @@ -7,6 +7,7 @@ #include #include #include +#include #include #include #include @@ -149,7 +150,7 @@ static int tegra210_mixer_configure_gain(struct snd_soc_component *cmpnt, /* Write duration parameters */ for (i = 0; i < NUM_DURATION_PARMS; i++) { - int val; + u32 val; if (instant_gain) { val = 1; @@ -157,8 +158,8 @@ static int tegra210_mixer_configure_gain(struct snd_soc_component *cmpnt, if (i == DURATION_N3_ID) val = mixer->duration[id]; else if (i == DURATION_INV_N3_ID) - val = (u32)(BIT_ULL(31 + TEGRA210_MIXER_PRESCALAR) / - mixer->duration[id]); + val = div_u64(BIT_ULL(31 + TEGRA210_MIXER_PRESCALAR), + mixer->duration[id]); else val = gain_params.duration[i]; } @@ -180,6 +181,17 @@ static int tegra210_mixer_configure_gain(struct snd_soc_component *cmpnt, return err; } +static int tegra210_mixer_fade_duration_info(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_info *uinfo) +{ + uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER; + uinfo->count = 1; + uinfo->value.integer.min = TEGRA210_MIXER_FADE_DURATION_MIN; + uinfo->value.integer.max = TEGRA210_MIXER_FADE_DURATION_MAX; + + return 0; +} + static int tegra210_mixer_get_fade_duration(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { @@ -201,9 +213,10 @@ static int tegra210_mixer_put_fade_duration(struct snd_kcontrol *kcontrol, struct snd_soc_component *cmpnt = snd_kcontrol_chip(kcontrol); struct tegra210_mixer *mixer = snd_soc_component_get_drvdata(cmpnt); unsigned int id = mc->reg; - u32 duration = ucontrol->value.integer.value[0]; + long duration = ucontrol->value.integer.value[0]; - if (duration == 0 || duration > TEGRA210_MIXER_FADE_DURATION_MAX) + if (duration < (long)TEGRA210_MIXER_FADE_DURATION_MIN || + duration > TEGRA210_MIXER_FADE_DURATION_MAX) return -EINVAL; if (mixer->duration[id] == duration) @@ -613,10 +626,17 @@ static int tegra210_mixer_fade_status_info(struct snd_kcontrol *kcontrol, } #define FADE_CTRL(id) \ - SOC_SINGLE_EXT("RX" #id " Fade Duration", (id) - 1, 0, \ - TEGRA210_MIXER_FADE_DURATION_MAX, 0, \ - tegra210_mixer_get_fade_duration, \ - tegra210_mixer_put_fade_duration), \ + { \ + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, \ + .name = "RX" #id " Fade Duration", \ + .info = tegra210_mixer_fade_duration_info, \ + .get = tegra210_mixer_get_fade_duration, \ + .put = tegra210_mixer_put_fade_duration, \ + .private_value = SOC_SINGLE_VALUE((id) - 1, 0, \ + TEGRA210_MIXER_FADE_DURATION_MIN, \ + TEGRA210_MIXER_FADE_DURATION_MAX, \ + 0, 0), \ + }, \ SOC_SINGLE_EXT("RX" #id " Fade Gain", (id) - 1, 0, \ TEGRA210_MIXER_GAIN_MAX, 0, \ tegra210_mixer_get_fade_gain, \ diff --git a/sound/soc/tegra/tegra210_mixer.h b/sound/soc/tegra/tegra210_mixer.h index bcbad08cbb9d32..d9c8fa6124ded4 100644 --- a/sound/soc/tegra/tegra210_mixer.h +++ b/sound/soc/tegra/tegra210_mixer.h @@ -87,9 +87,11 @@ #define NUM_GAIN_POLY_COEFFS 9 #define TEGRA210_MIXER_GAIN_MAX 0x20000 -#define TEGRA210_MIXER_FADE_DURATION_MAX 0x7fffffff #define TEGRA210_MIXER_PRESCALAR 6 +#define TEGRA210_MIXER_FADE_DURATION_MIN \ + (BIT(TEGRA210_MIXER_PRESCALAR - 1) + 1) +#define TEGRA210_MIXER_FADE_DURATION_MAX 0x7fffffff #define TEGRA210_MIXER_FADE_IDLE 0 #define TEGRA210_MIXER_FADE_ACTIVE 1 diff --git a/sound/soc/ti/ams-delta.c b/sound/soc/ti/ams-delta.c index ba173d9fcba91d..6dd0d051a757e6 100644 --- a/sound/soc/ti/ams-delta.c +++ b/sound/soc/ti/ams-delta.c @@ -264,10 +264,10 @@ static void cx81801_timeout(struct timer_list *unused) { int muted; - spin_lock(&ams_delta_lock); - cx81801_cmd_pending = 0; - muted = ams_delta_muted; - spin_unlock(&ams_delta_lock); + scoped_guard(spinlock, &ams_delta_lock) { + cx81801_cmd_pending = 0; + muted = ams_delta_muted; + } /* Reconnect the codec DAI back from the modem to the CPU DAI * only if digital mute still off */ @@ -373,11 +373,11 @@ static void cx81801_receive(struct tty_struct *tty, const u8 *cp, const u8 *fp, continue; /* Complete modem response received, apply config to codec */ - spin_lock_bh(&ams_delta_lock); - mod_timer(&cx81801_timer, jiffies + msecs_to_jiffies(150)); - apply = !ams_delta_muted && !cx81801_cmd_pending; - cx81801_cmd_pending = 1; - spin_unlock_bh(&ams_delta_lock); + scoped_guard(spinlock_bh, &ams_delta_lock) { + mod_timer(&cx81801_timer, jiffies + msecs_to_jiffies(150)); + apply = !ams_delta_muted && !cx81801_cmd_pending; + cx81801_cmd_pending = 1; + } /* Apply config pulse by connecting the codec to the modem * if not already done */ @@ -426,10 +426,10 @@ static int ams_delta_mute(struct snd_soc_dai *dai, int mute, int direction) if (ams_delta_muted == mute) return 0; - spin_lock_bh(&ams_delta_lock); - ams_delta_muted = mute; - apply = !cx81801_cmd_pending; - spin_unlock_bh(&ams_delta_lock); + scoped_guard(spinlock_bh, &ams_delta_lock) { + ams_delta_muted = mute; + apply = !cx81801_cmd_pending; + } if (apply) gpiod_set_value(gpiod_modem_codec, !!mute); diff --git a/sound/soc/ti/j721e-evm.c b/sound/soc/ti/j721e-evm.c index be9c363df361bc..c214ae0d7b95e3 100644 --- a/sound/soc/ti/j721e-evm.c +++ b/sound/soc/ti/j721e-evm.c @@ -263,7 +263,7 @@ static int j721e_audio_startup(struct snd_pcm_substream *substream) int ret = 0; int i; - mutex_lock(&priv->mutex); + guard(mutex)(&priv->mutex); domain->active++; @@ -303,7 +303,6 @@ static int j721e_audio_startup(struct snd_pcm_substream *substream) out: if (ret) domain->active--; - mutex_unlock(&priv->mutex); return ret; } @@ -323,30 +322,28 @@ static int j721e_audio_hw_params(struct snd_pcm_substream *substream, int ret; int i; - mutex_lock(&priv->mutex); + guard(mutex)(&priv->mutex); - if (domain->rate && domain->rate != params_rate(params)) { - ret = -EINVAL; - goto out; - } + if (domain->rate && domain->rate != params_rate(params)) + return -EINVAL; if (params_width(params) == 16) slot_width = 16; ret = snd_soc_dai_set_tdm_slot(cpu_dai, 0x3, 0x3, 2, slot_width); if (ret && ret != -ENOTSUPP) - goto out; + return ret; for_each_rtd_codec_dais(rtd, i, codec_dai) { ret = snd_soc_dai_set_tdm_slot(codec_dai, 0x3, 0x3, 2, slot_width); if (ret && ret != -ENOTSUPP) - goto out; + return ret; } ret = j721e_configure_refclk(priv, domain_id, params_rate(params)); if (ret) - goto out; + return ret; sysclk_rate = priv->hsdiv_rates[domain->parent_clk_id]; for_each_rtd_codec_dais(rtd, i, codec_dai) { @@ -356,7 +353,7 @@ static int j721e_audio_hw_params(struct snd_pcm_substream *substream, dev_err(priv->dev, "codec set_sysclk failed for %u Hz\n", sysclk_rate); - goto out; + return ret; } } @@ -371,8 +368,6 @@ static int j721e_audio_hw_params(struct snd_pcm_substream *substream, ret = 0; } -out: - mutex_unlock(&priv->mutex); return ret; } @@ -383,15 +378,13 @@ static void j721e_audio_shutdown(struct snd_pcm_substream *substream) unsigned int domain_id = rtd->dai_link->id; struct j721e_audio_domain *domain = &priv->audio_domains[domain_id]; - mutex_lock(&priv->mutex); + guard(mutex)(&priv->mutex); domain->active--; if (!domain->active) { domain->rate = 0; domain->active_link = 0; } - - mutex_unlock(&priv->mutex); } static const struct snd_soc_ops j721e_audio_ops = { diff --git a/sound/soc/ti/omap-dmic.c b/sound/soc/ti/omap-dmic.c index fb92bb88eb5c2b..7ca46c57566d9e 100644 --- a/sound/soc/ti/omap-dmic.c +++ b/sound/soc/ti/omap-dmic.c @@ -91,18 +91,14 @@ static int omap_dmic_dai_startup(struct snd_pcm_substream *substream, struct snd_soc_dai *dai) { struct omap_dmic *dmic = snd_soc_dai_get_drvdata(dai); - int ret = 0; - - mutex_lock(&dmic->mutex); - if (!snd_soc_dai_active(dai)) - dmic->active = 1; - else - ret = -EBUSY; + guard(mutex)(&dmic->mutex); - mutex_unlock(&dmic->mutex); + if (snd_soc_dai_active(dai)) + return -EBUSY; - return ret; + dmic->active = 1; + return 0; } static void omap_dmic_dai_shutdown(struct snd_pcm_substream *substream, @@ -110,14 +106,12 @@ static void omap_dmic_dai_shutdown(struct snd_pcm_substream *substream, { struct omap_dmic *dmic = snd_soc_dai_get_drvdata(dai); - mutex_lock(&dmic->mutex); + guard(mutex)(&dmic->mutex); cpu_latency_qos_remove_request(&dmic->pm_qos_req); if (!snd_soc_dai_active(dai)) dmic->active = 0; - - mutex_unlock(&dmic->mutex); } static int omap_dmic_select_divider(struct omap_dmic *dmic, int sample_rate) @@ -328,32 +322,30 @@ static int omap_dmic_select_fclk(struct omap_dmic *dmic, int clk_id, } mux = clk_get_parent(dmic->fclk); - if (IS_ERR(mux)) { + if (!mux) { dev_err(dmic->dev, "can't get fck mux parent\n"); clk_put(parent_clk); return -ENODEV; } - mutex_lock(&dmic->mutex); - if (dmic->active) { - /* disable clock while reparenting */ - pm_runtime_put_sync(dmic->dev); - ret = clk_set_parent(mux, parent_clk); - pm_runtime_get_sync(dmic->dev); - } else { - ret = clk_set_parent(mux, parent_clk); + scoped_guard(mutex, &dmic->mutex) { + if (dmic->active) { + /* disable clock while reparenting */ + pm_runtime_put_sync(dmic->dev); + ret = clk_set_parent(mux, parent_clk); + pm_runtime_get_sync(dmic->dev); + } else { + ret = clk_set_parent(mux, parent_clk); + } } - mutex_unlock(&dmic->mutex); if (ret < 0) { dev_err(dmic->dev, "re-parent failed\n"); - goto err_busy; + } else { + dmic->sysclk = clk_id; + dmic->fclk_freq = freq; } - dmic->sysclk = clk_id; - dmic->fclk_freq = freq; - -err_busy: clk_put(mux); clk_put(parent_clk); diff --git a/sound/soc/ti/omap-hdmi.c b/sound/soc/ti/omap-hdmi.c index 55e7cb96858fca..e60f5b483fc574 100644 --- a/sound/soc/ti/omap-hdmi.c +++ b/sound/soc/ti/omap-hdmi.c @@ -49,7 +49,7 @@ static void hdmi_dai_abort(struct device *dev) { struct hdmi_audio_data *ad = dev_get_drvdata(dev); - mutex_lock(&ad->current_stream_lock); + guard(mutex)(&ad->current_stream_lock); if (ad->current_stream && ad->current_stream->runtime && snd_pcm_running(ad->current_stream)) { dev_err(dev, "HDMI display disabled, aborting playback\n"); @@ -57,7 +57,6 @@ static void hdmi_dai_abort(struct device *dev) snd_pcm_stop(ad->current_stream, SNDRV_PCM_STATE_DISCONNECTED); snd_pcm_stream_unlock_irq(ad->current_stream); } - mutex_unlock(&ad->current_stream_lock); } static int hdmi_dai_startup(struct snd_pcm_substream *substream, @@ -86,16 +85,14 @@ static int hdmi_dai_startup(struct snd_pcm_substream *substream, snd_soc_dai_set_dma_data(dai, substream, &ad->dma_data); - mutex_lock(&ad->current_stream_lock); - ad->current_stream = substream; - mutex_unlock(&ad->current_stream_lock); + scoped_guard(mutex, &ad->current_stream_lock) + ad->current_stream = substream; ret = ad->ops->audio_startup(ad->dssdev, hdmi_dai_abort); if (ret) { - mutex_lock(&ad->current_stream_lock); - ad->current_stream = NULL; - mutex_unlock(&ad->current_stream_lock); + scoped_guard(mutex, &ad->current_stream_lock) + ad->current_stream = NULL; } return ret; @@ -261,9 +258,8 @@ static void hdmi_dai_shutdown(struct snd_pcm_substream *substream, ad->ops->audio_shutdown(ad->dssdev); - mutex_lock(&ad->current_stream_lock); - ad->current_stream = NULL; - mutex_unlock(&ad->current_stream_lock); + scoped_guard(mutex, &ad->current_stream_lock) + ad->current_stream = NULL; } static const struct snd_soc_dai_ops hdmi_dai_ops = { diff --git a/sound/soc/ti/omap-mcbsp-st.c b/sound/soc/ti/omap-mcbsp-st.c index 901578896ef324..b762d5d3e33b6e 100644 --- a/sound/soc/ti/omap-mcbsp-st.c +++ b/sound/soc/ti/omap-mcbsp-st.c @@ -156,7 +156,7 @@ static int omap_mcbsp_st_set_chgain(struct omap_mcbsp *mcbsp, int channel, if (!st_data) return -ENOENT; - spin_lock_irq(&mcbsp->lock); + guard(spinlock_irq)(&mcbsp->lock); if (channel == 0) st_data->ch0gain = chgain; else if (channel == 1) @@ -166,7 +166,6 @@ static int omap_mcbsp_st_set_chgain(struct omap_mcbsp *mcbsp, int channel, if (st_data->enabled) omap_mcbsp_st_chgain(mcbsp); - spin_unlock_irq(&mcbsp->lock); return ret; } @@ -180,14 +179,13 @@ static int omap_mcbsp_st_get_chgain(struct omap_mcbsp *mcbsp, int channel, if (!st_data) return -ENOENT; - spin_lock_irq(&mcbsp->lock); + guard(spinlock_irq)(&mcbsp->lock); if (channel == 0) *chgain = st_data->ch0gain; else if (channel == 1) *chgain = st_data->ch1gain; else ret = -EINVAL; - spin_unlock_irq(&mcbsp->lock); return ret; } @@ -199,10 +197,9 @@ static int omap_mcbsp_st_enable(struct omap_mcbsp *mcbsp) if (!st_data) return -ENODEV; - spin_lock_irq(&mcbsp->lock); + guard(spinlock_irq)(&mcbsp->lock); st_data->enabled = 1; omap_mcbsp_st_start(mcbsp); - spin_unlock_irq(&mcbsp->lock); return 0; } @@ -215,10 +212,9 @@ static int omap_mcbsp_st_disable(struct omap_mcbsp *mcbsp) if (!st_data) return -ENODEV; - spin_lock_irq(&mcbsp->lock); + guard(spinlock_irq)(&mcbsp->lock); omap_mcbsp_st_stop(mcbsp); st_data->enabled = 0; - spin_unlock_irq(&mcbsp->lock); return ret; } @@ -241,13 +237,12 @@ static ssize_t st_taps_show(struct device *dev, ssize_t status = 0; int i; - spin_lock_irq(&mcbsp->lock); + guard(spinlock_irq)(&mcbsp->lock); for (i = 0; i < st_data->nr_taps; i++) status += sysfs_emit_at(buf, status, (i ? ", %d" : "%d"), st_data->taps[i]); if (i) status += sysfs_emit_at(buf, status, "\n"); - spin_unlock_irq(&mcbsp->lock); return status; } @@ -260,19 +255,17 @@ static ssize_t st_taps_store(struct device *dev, struct omap_mcbsp_st_data *st_data = mcbsp->st_data; int val, tmp, status, i = 0; - spin_lock_irq(&mcbsp->lock); + guard(spinlock_irq)(&mcbsp->lock); memset(st_data->taps, 0, sizeof(st_data->taps)); st_data->nr_taps = 0; do { status = sscanf(buf, "%d%n", &val, &tmp); if (status < 0 || status == 0) { - size = -EINVAL; - goto out; + return -EINVAL; } if (val < -32768 || val > 32767) { - size = -EINVAL; - goto out; + return -EINVAL; } st_data->taps[i++] = val; buf += tmp; @@ -283,9 +276,6 @@ static ssize_t st_taps_store(struct device *dev, st_data->nr_taps = i; -out: - spin_unlock_irq(&mcbsp->lock); - return size; } diff --git a/sound/soc/ti/omap-mcbsp.c b/sound/soc/ti/omap-mcbsp.c index 411970399271a2..26af616c33f510 100644 --- a/sound/soc/ti/omap-mcbsp.c +++ b/sound/soc/ti/omap-mcbsp.c @@ -290,25 +290,24 @@ static u16 omap_mcbsp_get_rx_delay(struct omap_mcbsp *mcbsp) static int omap_mcbsp_request(struct omap_mcbsp *mcbsp) { - void *reg_cache; + void *reg_cache __free(kfree) = kzalloc(mcbsp->reg_cache_size, GFP_KERNEL); int err; - reg_cache = kzalloc(mcbsp->reg_cache_size, GFP_KERNEL); if (!reg_cache) return -ENOMEM; - spin_lock(&mcbsp->lock); - if (!mcbsp->free) { - dev_err(mcbsp->dev, "McBSP%d is currently in use\n", mcbsp->id); - err = -EBUSY; - goto err_kfree; - } + scoped_guard(spinlock, &mcbsp->lock) { + if (!mcbsp->free) { + dev_err(mcbsp->dev, "McBSP%d is currently in use\n", mcbsp->id); + return -EBUSY; + } - mcbsp->free = false; - mcbsp->reg_cache = reg_cache; - spin_unlock(&mcbsp->lock); + mcbsp->free = false; + mcbsp->reg_cache = reg_cache; + reg_cache = NULL; + } - if(mcbsp->pdata->ops && mcbsp->pdata->ops->request) + if (mcbsp->pdata->ops && mcbsp->pdata->ops->request) mcbsp->pdata->ops->request(mcbsp->id - 1); /* @@ -323,43 +322,40 @@ static int omap_mcbsp_request(struct omap_mcbsp *mcbsp) "McBSP", (void *)mcbsp); if (err != 0) { dev_err(mcbsp->dev, "Unable to request IRQ\n"); - goto err_clk_disable; } } else { err = request_irq(mcbsp->tx_irq, omap_mcbsp_tx_irq_handler, 0, "McBSP TX", (void *)mcbsp); if (err != 0) { dev_err(mcbsp->dev, "Unable to request TX IRQ\n"); - goto err_clk_disable; - } - - err = request_irq(mcbsp->rx_irq, omap_mcbsp_rx_irq_handler, 0, - "McBSP RX", (void *)mcbsp); - if (err != 0) { - dev_err(mcbsp->dev, "Unable to request RX IRQ\n"); - goto err_free_irq; + } else { + err = request_irq(mcbsp->rx_irq, omap_mcbsp_rx_irq_handler, 0, + "McBSP RX", (void *)mcbsp); + if (err != 0) { + dev_err(mcbsp->dev, "Unable to request RX IRQ\n"); + free_irq(mcbsp->tx_irq, (void *)mcbsp); + } } } - return 0; -err_free_irq: - free_irq(mcbsp->tx_irq, (void *)mcbsp); -err_clk_disable: - if(mcbsp->pdata->ops && mcbsp->pdata->ops->free) - mcbsp->pdata->ops->free(mcbsp->id - 1); + if (err != 0) { + if (mcbsp->pdata->ops && mcbsp->pdata->ops->free) + mcbsp->pdata->ops->free(mcbsp->id - 1); - /* Disable wakeup behavior */ - if (mcbsp->pdata->has_wakeup) - MCBSP_WRITE(mcbsp, WAKEUPEN, 0); + /* Disable wakeup behavior */ + if (mcbsp->pdata->has_wakeup) + MCBSP_WRITE(mcbsp, WAKEUPEN, 0); - spin_lock(&mcbsp->lock); - mcbsp->free = true; - mcbsp->reg_cache = NULL; -err_kfree: - spin_unlock(&mcbsp->lock); - kfree(reg_cache); + scoped_guard(spinlock, &mcbsp->lock) { + reg_cache = mcbsp->reg_cache; + mcbsp->free = true; + mcbsp->reg_cache = NULL; + } - return err; + return err; + } + + return 0; } static void omap_mcbsp_free(struct omap_mcbsp *mcbsp) @@ -395,13 +391,13 @@ static void omap_mcbsp_free(struct omap_mcbsp *mcbsp) if (!mcbsp_omap1()) omap2_mcbsp_set_clks_src(mcbsp, MCBSP_CLKS_PRCM_SRC); - spin_lock(&mcbsp->lock); - if (mcbsp->free) - dev_err(mcbsp->dev, "McBSP%d was not reserved\n", mcbsp->id); - else - mcbsp->free = true; - mcbsp->reg_cache = NULL; - spin_unlock(&mcbsp->lock); + scoped_guard(spinlock, &mcbsp->lock) { + if (mcbsp->free) + dev_err(mcbsp->dev, "McBSP%d was not reserved\n", mcbsp->id); + else + mcbsp->free = true; + mcbsp->reg_cache = NULL; + } kfree(reg_cache); } @@ -581,16 +577,12 @@ static ssize_t dma_op_mode_store(struct device *dev, if (i < 0) return i; - spin_lock_irq(&mcbsp->lock); + guard(spinlock_irq)(&mcbsp->lock); if (!mcbsp->free) { - size = -EBUSY; - goto unlock; + return -EBUSY; } mcbsp->dma_op_mode = i; -unlock: - spin_unlock_irq(&mcbsp->lock); - return size; } diff --git a/sound/soc/ti/omap-mcpdm.c b/sound/soc/ti/omap-mcpdm.c index 1a5d19937c6424..c7d7b502f120fe 100644 --- a/sound/soc/ti/omap-mcpdm.c +++ b/sound/soc/ti/omap-mcpdm.c @@ -251,13 +251,11 @@ static int omap_mcpdm_dai_startup(struct snd_pcm_substream *substream, { struct omap_mcpdm *mcpdm = snd_soc_dai_get_drvdata(dai); - mutex_lock(&mcpdm->mutex); + guard(mutex)(&mcpdm->mutex); if (!snd_soc_dai_active(dai)) omap_mcpdm_open_streams(mcpdm); - mutex_unlock(&mcpdm->mutex); - return 0; } @@ -269,7 +267,7 @@ static void omap_mcpdm_dai_shutdown(struct snd_pcm_substream *substream, int stream1 = tx ? SNDRV_PCM_STREAM_PLAYBACK : SNDRV_PCM_STREAM_CAPTURE; int stream2 = tx ? SNDRV_PCM_STREAM_CAPTURE : SNDRV_PCM_STREAM_PLAYBACK; - mutex_lock(&mcpdm->mutex); + guard(mutex)(&mcpdm->mutex); if (!snd_soc_dai_active(dai)) { if (omap_mcpdm_active(mcpdm)) { @@ -287,8 +285,6 @@ static void omap_mcpdm_dai_shutdown(struct snd_pcm_substream *substream, cpu_latency_qos_remove_request(&mcpdm->pm_qos_req); mcpdm->latency[stream1] = 0; - - mutex_unlock(&mcpdm->mutex); } static int omap_mcpdm_dai_hw_params(struct snd_pcm_substream *substream, diff --git a/sound/soc/ti/omap3pandora.c b/sound/soc/ti/omap3pandora.c index f11b1d8a1306c0..6c9c184cd9d6f6 100644 --- a/sound/soc/ti/omap3pandora.c +++ b/sound/soc/ti/omap3pandora.c @@ -11,12 +11,12 @@ #include #include #include +#include #include #include #include -#include #include #include "omap-mcbsp.h" @@ -225,7 +225,8 @@ static int __init omap3pandora_soc_init(void) { int ret; - if (!machine_is_omap3_pandora()) + if (!of_machine_is_compatible("openpandora,omap3-pandora-600mhz") && + !of_machine_is_compatible("openpandora,omap3-pandora-1ghz")) return -ENODEV; pr_info("OMAP3 Pandora SoC init\n"); diff --git a/sound/soc/ti/rx51.c b/sound/soc/ti/rx51.c index 7eeb12e5066c4a..cfc23e0838c25e 100644 --- a/sound/soc/ti/rx51.c +++ b/sound/soc/ti/rx51.c @@ -19,8 +19,6 @@ #include #include -#include - #include "omap-mcbsp.h" enum { @@ -364,7 +362,7 @@ static int rx51_soc_probe(struct platform_device *pdev) struct snd_soc_card *card = &rx51_sound_card; int err; - if (!machine_is_nokia_rx51() && !of_machine_is_compatible("nokia,omap3-n900")) + if (!of_machine_is_compatible("nokia,omap3-n900")) return -ENODEV; card->dev = &pdev->dev; diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 75f82a92ff44fd..2f5f62079fa4a5 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -592,6 +592,7 @@ static __u32 reverse_bytes(__u32 b, int len) fallthrough; case 2: b = ((b & 0xaaaaaaaa) >> 1) | ((b & 0x55555555) << 1); + fallthrough; case 1: case 0: break; diff --git a/sound/synth/emux/emux_seq.c b/sound/synth/emux/emux_seq.c index 2ed01e9d79bb90..01ad5b12b680f5 100644 --- a/sound/synth/emux/emux_seq.c +++ b/sound/synth/emux/emux_seq.c @@ -132,19 +132,15 @@ snd_emux_create_port(struct snd_emux *emu, char *name, int i, type, cap; /* Allocate structures for this channel */ - p = kzalloc_obj(*p); + p = kzalloc_flex(*p, chset.channels, max_channels); if (!p) return NULL; - p->chset.channels = kzalloc_objs(*p->chset.channels, max_channels); - if (!p->chset.channels) { - kfree(p); - return NULL; - } + p->chset.max_channels = max_channels; + for (i = 0; i < max_channels; i++) p->chset.channels[i].number = i; p->chset.private_data = p; - p->chset.max_channels = max_channels; p->emu = emu; p->chset.client = emu->client; #ifdef SNDRV_EMUX_USE_RAW_EFFECT @@ -182,7 +178,6 @@ free_port(void *private_data) #ifdef SNDRV_EMUX_USE_RAW_EFFECT snd_emux_delete_effect(p); #endif - kfree(p->chset.channels); kfree(p); } } diff --git a/sound/usb/clock.c b/sound/usb/clock.c index 842ba5b801eae8..2e0c18e3528123 100644 --- a/sound/usb/clock.c +++ b/sound/usb/clock.c @@ -208,11 +208,18 @@ static bool uac_clock_source_is_valid_quirk(struct snd_usb_audio *chip, } /* - * MOTU MicroBook IIc - * Sample rate changes takes more than 2 seconds for this device. Clock - * validity request returns false during that period. + * Quirk for older MOTU AVB / hybrid interfaces + * + * These devices take more than 2 seconds to switch sample rate or + * clock source. During this period the clock validity request + * returns false, causing ALSA to fail prematurely. + * + * Affected models (all use vendor 0x07fd): + * - MicroBook IIc → 0x0004 + * - 1248, 624, 8A, UltraLite AVB, 8M, 16A, ... → 0x0005 */ - if (chip->usb_id == USB_ID(0x07fd, 0x0004)) { + if (chip->usb_id == USB_ID(0x07fd, 0x0004) || /* MicroBook IIc */ + chip->usb_id == USB_ID(0x07fd, 0x0005)) { /* 1248 / 624 / 8A / UltraLite AVB / ... */ count = 0; while ((!ret) && (count < 50)) { diff --git a/sound/usb/fcp.c b/sound/usb/fcp.c index 0fc4d063c48a58..ea746bdb36ffcf 100644 --- a/sound/usb/fcp.c +++ b/sound/usb/fcp.c @@ -191,13 +191,13 @@ static int fcp_usb(struct usb_mixer_interface *mixer, u32 opcode, struct fcp_usb_packet *req __free(kfree) = NULL; size_t req_buf_size = struct_size(req, data, req_size); - req = kmalloc(req_buf_size, GFP_KERNEL); + req = kmalloc_flex(*req, data, req_size); if (!req) return -ENOMEM; struct fcp_usb_packet *resp __free(kfree) = NULL; size_t resp_buf_size = struct_size(resp, data, resp_size); - resp = kmalloc(resp_buf_size, GFP_KERNEL); + resp = kmalloc_flex(*resp, data, resp_size); if (!resp) return -ENOMEM; diff --git a/sound/usb/midi.c b/sound/usb/midi.c index 0a5b8941ebdaaf..d87e3f357cf71b 100644 --- a/sound/usb/midi.c +++ b/sound/usb/midi.c @@ -1951,15 +1951,17 @@ static struct usb_ms_endpoint_descriptor *find_usb_ms_endpoint_descriptor( while (extralen > 3) { struct usb_ms_endpoint_descriptor *ms_ep = (struct usb_ms_endpoint_descriptor *)extra; + int length = ms_ep->bLength; - if (ms_ep->bLength > 3 && + if (!length || length > extralen) + break; + + if (length > 3 && ms_ep->bDescriptorType == USB_DT_CS_ENDPOINT && ms_ep->bDescriptorSubtype == UAC_MS_GENERAL) return ms_ep; - if (!extra[0]) - break; - extralen -= extra[0]; - extra += extra[0]; + extralen -= length; + extra += length; } return NULL; } diff --git a/sound/usb/midi2.c b/sound/usb/midi2.c index 3546ba926cb310..04aeb9052f139e 100644 --- a/sound/usb/midi2.c +++ b/sound/usb/midi2.c @@ -227,7 +227,7 @@ static void kill_midi_urbs(struct snd_usb_midi2_endpoint *ep, bool suspending) if (!ep) return; if (suspending) - ep->suspended = ep->running; + atomic_set(&ep->suspended, atomic_read(&ep->running)); atomic_set(&ep->running, 0); for (i = 0; i < ep->num_urbs; i++) { if (!ep->urbs[i].urb) @@ -496,15 +496,17 @@ static void *find_usb_ms_endpoint_descriptor(struct usb_host_endpoint *hostep, while (extralen > 3) { struct usb_ms_endpoint_descriptor *ms_ep = (struct usb_ms_endpoint_descriptor *)extra; + int length = ms_ep->bLength; - if (ms_ep->bLength > 3 && + if (!length || length > extralen) + break; + + if (length > 3 && ms_ep->bDescriptorType == USB_DT_CS_ENDPOINT && ms_ep->bDescriptorSubtype == subtype) return ms_ep; - if (!extra[0]) - break; - extralen -= extra[0]; - extra += extra[0]; + extralen -= length; + extra += length; } return NULL; } @@ -1188,10 +1190,11 @@ void snd_usb_midi_v2_suspend_all(struct snd_usb_audio *chip) static void resume_midi2_endpoint(struct snd_usb_midi2_endpoint *ep) { - ep->running = ep->suspended; - if (ep->direction == STR_IN) + atomic_set(&ep->running, atomic_read(&ep->suspended)); + atomic_set(&ep->suspended, 0); + + if (ep->direction == STR_IN || atomic_read(&ep->running)) submit_io_urbs(ep); - /* FIXME: does it all? */ } void snd_usb_midi_v2_resume_all(struct snd_usb_audio *chip) diff --git a/sound/usb/mixer_quirks.c b/sound/usb/mixer_quirks.c index 99975c3240a551..628f841b04aaed 100644 --- a/sound/usb/mixer_quirks.c +++ b/sound/usb/mixer_quirks.c @@ -1301,7 +1301,7 @@ static int snd_ftu_eff_switch_init(struct usb_mixer_interface *mixer, err = snd_usb_ctl_msg(dev, usb_rcvctrlpipe(dev, 0), UAC_GET_CUR, USB_RECIP_INTERFACE | USB_TYPE_CLASS | USB_DIR_IN, - pval & 0xff00, + (pval & 0xff00) | ((pval & 0xff0000) >> 16), snd_usb_ctrl_intf(mixer->hostif) | ((pval & 0xff) << 8), value, 2); if (err < 0) @@ -1334,7 +1334,7 @@ static int snd_ftu_eff_switch_update(struct usb_mixer_elem_list *list) usb_sndctrlpipe(chip->dev, 0), UAC_SET_CUR, USB_RECIP_INTERFACE | USB_TYPE_CLASS | USB_DIR_OUT, - pval & 0xff00, + (pval & 0xff00) | ((pval & 0xff0000) >> 16), snd_usb_ctrl_intf(list->mixer->hostif) | ((pval & 0xff) << 8), value, 2); } @@ -1754,6 +1754,44 @@ static int snd_c400_create_effect_ret_vol_ctls(struct usb_mixer_interface *mixer return 0; } +/* output gain knob selectively adjusts outputs as stereo pairs */ +/* reuses functions from FTU effect switch */ +static int snd_c400_knob_switch_info(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_info *uinfo) +{ + static const char *const texts[8] = { + "None", "1/2", "3/4", "1/2 3/4", + "5/6", "1/2 5/6", "3/4 5/6", "1/2 3/4 5/6" + }; + + return snd_ctl_enum_info(uinfo, 1, ARRAY_SIZE(texts), texts); +} + +static int snd_c400_create_knob_switch(struct usb_mixer_interface *mixer, + int validx, int bUnitID) +{ + static struct snd_kcontrol_new template = { + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .name = "Output Gain Knob", + .index = 0, + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE, + .info = snd_c400_knob_switch_info, + .get = snd_ftu_eff_switch_get, + .put = snd_ftu_eff_switch_put + }; + struct usb_mixer_elem_list *list; + int err; + + err = add_single_ctl_with_resume(mixer, bUnitID, + snd_ftu_eff_switch_update, + &template, &list); + if (err < 0) + return err; + list->kctl->private_value = (validx << 8) | bUnitID; + snd_ftu_eff_switch_init(mixer, list->kctl); + return 0; +} + static int snd_c400_create_mixer(struct usb_mixer_interface *mixer) { int err; @@ -1786,6 +1824,10 @@ static int snd_c400_create_mixer(struct usb_mixer_interface *mixer) if (err < 0) return err; + err = snd_c400_create_knob_switch(mixer, 0x0900, 0x20); + if (err < 0) + return err; + return 0; } diff --git a/sound/usb/mixer_scarlett2.c b/sound/usb/mixer_scarlett2.c index 6f66149baea6a1..7c43ca51938e30 100644 --- a/sound/usb/mixer_scarlett2.c +++ b/sound/usb/mixer_scarlett2.c @@ -2614,13 +2614,13 @@ static int scarlett2_usb( struct scarlett2_usb_packet *req __free(kfree) = NULL; size_t req_buf_size = struct_size(req, data, req_size); - req = kmalloc(req_buf_size, GFP_KERNEL); + req = kmalloc_flex(*req, data, req_size); if (!req) return -ENOMEM; struct scarlett2_usb_packet *resp __free(kfree) = NULL; size_t resp_buf_size = struct_size(resp, data, resp_size); - resp = kmalloc(resp_buf_size, GFP_KERNEL); + resp = kmalloc_flex(*resp, data, resp_size); if (!resp) return -ENOMEM; @@ -2830,9 +2830,9 @@ static int scarlett2_usb_set_data_buf( u8 data[]; } __packed *req; int err; - int buf_size = struct_size(req, data, bytes); + size_t buf_size = struct_size(req, data, bytes); - req = kmalloc(buf_size, GFP_KERNEL); + req = kmalloc_flex(*req, data, bytes); if (!req) return -ENOMEM; @@ -6941,6 +6941,8 @@ static int scarlett2_add_line_in_ctls(struct usb_mixer_interface *mixer) err = scarlett2_add_new_ctl( mixer, &scarlett2_autogain_status_ctl, i, 1, s, &private->autogain_status_ctls[i]); + if (err < 0) + return err; } /* Add autogain target controls */ @@ -9644,7 +9646,7 @@ static long scarlett2_hwdep_write(struct snd_hwdep *hw, /* Create and send the request */ len = struct_size(req, data, count); - req = kzalloc(len, GFP_KERNEL); + req = kzalloc_flex(*req, data, count); if (!req) return -ENOMEM; diff --git a/sound/usb/qcom/qc_audio_offload.c b/sound/usb/qcom/qc_audio_offload.c index 5f993b88448c7e..a0009503b2c592 100644 --- a/sound/usb/qcom/qc_audio_offload.c +++ b/sound/usb/qcom/qc_audio_offload.c @@ -565,6 +565,7 @@ static unsigned long uaudio_iommu_map_pa(enum mem_type mtype, bool dma_coherent, unsigned long iova = 0; bool map = true; int prot = uaudio_iommu_map_prot(dma_coherent); + int ret; switch (mtype) { case MEM_EVENT_RING: @@ -582,10 +583,24 @@ static unsigned long uaudio_iommu_map_pa(enum mem_type mtype, bool dma_coherent, dev_err(uaudio_qdev->data->dev, "unknown mem type %d\n", mtype); } - if (!iova || !map) + if (!iova) return 0; - iommu_map(uaudio_qdev->data->domain, iova, pa, size, prot, GFP_KERNEL); + if (!map) + return iova; + + ret = iommu_map(uaudio_qdev->data->domain, iova, pa, size, prot, + GFP_KERNEL); + if (ret) { + dev_err(uaudio_qdev->data->dev, + "failed to map %zu bytes at iova 0x%08lx: %d\n", + size, iova, ret); + if (mtype == MEM_XFER_RING) + uaudio_put_iova(iova, size, + &uaudio_qdev->xfer_ring_list, + &uaudio_qdev->xfer_ring_iova_size); + return 0; + } return iova; } @@ -1054,15 +1069,17 @@ static int uaudio_transfer_buffer_setup(struct snd_usb_substream *subs, if (!xfer_buf) return -ENOMEM; - dma_get_sgtable(subs->dev->bus->sysdev, &xfer_buf_sgt, xfer_buf, - xfer_buf_dma, len); + ret = dma_get_sgtable(subs->dev->bus->sysdev, &xfer_buf_sgt, xfer_buf, + xfer_buf_dma, len); + if (ret) + goto free_xfer_buf; /* map the physical buffer into sysdev as well */ xfer_buf_dma_sysdev = uaudio_iommu_map_xfer_buf(dma_coherent, len, &xfer_buf_sgt); if (!xfer_buf_dma_sysdev) { ret = -ENOMEM; - goto unmap_sync; + goto free_sgt; } mem_info->dma = xfer_buf_dma; @@ -1073,7 +1090,9 @@ static int uaudio_transfer_buffer_setup(struct snd_usb_substream *subs, return 0; -unmap_sync: +free_sgt: + sg_free_table(&xfer_buf_sgt); +free_xfer_buf: usb_free_coherent(subs->dev, len, xfer_buf, xfer_buf_dma); return ret; diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 0b4ecc2c6bcc40..31cbe383ae658d 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -2277,6 +2277,9 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_ALIGN_TRANSFER), DEVICE_FLG(0x05e1, 0x0480, /* Hauppauge Woodbury */ QUIRK_FLAG_SHARE_MEDIA_DEVICE | QUIRK_FLAG_ALIGN_TRANSFER), + DEVICE_FLG(0x05fc, 0x0231, /* JBL Pebbles */ + QUIRK_FLAG_MIXER_PLAYBACK_LINEAR_VOL | QUIRK_FLAG_MIXER_CAPTURE_LINEAR_VOL | + QUIRK_FLAG_GET_SAMPLE_RATE), DEVICE_FLG(0x0624, 0x3d3f, /* AB13X USB Audio */ QUIRK_FLAG_FORCE_IFACE_RESET | QUIRK_FLAG_IFACE_DELAY), DEVICE_FLG(0x0644, 0x8043, /* TEAC UD-501/UD-501V2/UD-503/NT-503 */ @@ -2366,6 +2369,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_IGNORE_CTL_ERROR), DEVICE_FLG(0x152a, 0x880a, /* NeuralDSP Quad Cortex */ 0), /* Doesn't have the vendor quirk which would otherwise apply */ + DEVICE_FLG(0x1532, 0x055e, /* Razer Nommo V2 X */ + QUIRK_FLAG_MIXER_PLAYBACK_MIN_MUTE), DEVICE_FLG(0x154e, 0x1002, /* Denon DCD-1500RE */ QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY), DEVICE_FLG(0x154e, 0x1003, /* Denon DA-300USB */ @@ -2458,6 +2463,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_GENERIC_IMPLICIT_FB), DEVICE_FLG(0x2b53, 0x0031, /* Fiero SC-01 (firmware v1.1.0) */ QUIRK_FLAG_GENERIC_IMPLICIT_FB), + DEVICE_FLG(0x2b73, 0x0047, /* AlphaTheta EUPHONIA */ + QUIRK_FLAG_PLAYBACK_FIRST | QUIRK_FLAG_GENERIC_IMPLICIT_FB), DEVICE_FLG(0x2d95, 0x8011, /* VIVO USB-C HEADSET */ QUIRK_FLAG_CTL_MSG_DELAY_1M), DEVICE_FLG(0x2d95, 0x8021, /* VIVO USB-C-XE710 HEADSET */ @@ -2472,6 +2479,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_IGNORE_CTL_ERROR), DEVICE_FLG(0x3255, 0x0000, /* Luxman D-10X */ QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY), + DEVICE_FLG(0x3302, 0x17c2, /* TTGK Technology USB-C Audio */ + QUIRK_FLAG_FORCE_IFACE_RESET | QUIRK_FLAG_IFACE_DELAY), DEVICE_FLG(0x339b, 0x3a07, /* Synaptics HONOR USB-C HEADSET */ QUIRK_FLAG_MIXER_PLAYBACK_MIN_MUTE), DEVICE_FLG(0x3443, 0x930d, /* NexiGo N930W 60fps Webcam */ diff --git a/sound/usb/usx2y/usbusx2y.c b/sound/usb/usx2y/usbusx2y.c index f34e78910200aa..4190227c5a2a58 100644 --- a/sound/usb/usx2y/usbusx2y.c +++ b/sound/usb/usx2y/usbusx2y.c @@ -180,7 +180,7 @@ static void i_usx2y_in04_int(struct urb *urb) struct usx2ydev *usx2y = urb->context; struct us428ctls_sharedmem *us428ctls = usx2y->us428ctls_sharedmem; struct us428_p4out *p4out; - int i, j, n, diff, send; + int i, j, n, diff, send, len; usx2y->in04_int_calls++; @@ -222,24 +222,31 @@ static void i_usx2y_in04_int(struct urb *urb) } while (!err && usx2y->us04->submitted < usx2y->us04->len); } } else { - if (us428ctls && us428ctls->p4out_last >= 0 && us428ctls->p4out_last < N_US428_P4OUT_BUFS) { - if (us428ctls->p4out_last != us428ctls->p4out_sent) { - send = us428ctls->p4out_sent + 1; - if (send >= N_US428_P4OUT_BUFS) - send = 0; - for (j = 0; j < URBS_ASYNC_SEQ && !err; ++j) { - if (!usx2y->as04.urb[j]->status) { - p4out = us428ctls->p4out + send; // FIXME if more than 1 p4out is new, 1 gets lost. - usb_fill_bulk_urb(usx2y->as04.urb[j], usx2y->dev, - usb_sndbulkpipe(usx2y->dev, 0x04), &p4out->val.vol, - p4out->type == ELT_LIGHT ? sizeof(struct us428_lights) : 5, - i_usx2y_out04_int, usx2y); - err = usb_submit_urb(usx2y->as04.urb[j], GFP_ATOMIC); + while (us428ctls && + us428ctls->p4out_last >= 0 && + us428ctls->p4out_last < N_US428_P4OUT_BUFS && + us428ctls->p4out_last != us428ctls->p4out_sent) { + for (j = 0; j < URBS_ASYNC_SEQ && !err; ++j) { + if (!usx2y->as04.urb[j]->status) { + send = us428ctls->p4out_sent + 1; + if (send >= N_US428_P4OUT_BUFS) + send = 0; + + p4out = us428ctls->p4out + send; + len = p4out->type == ELT_LIGHT ? + sizeof(struct us428_lights) : 5; + memcpy(usx2y->as04.urb[j]->transfer_buffer, + &p4out->val.vol, len); + usx2y->as04.urb[j]->transfer_buffer_length = len; + err = usb_submit_urb(usx2y->as04.urb[j], GFP_ATOMIC); + if (!err) us428ctls->p4out_sent = send; - break; - } + + break; } } + if (j >= URBS_ASYNC_SEQ || err) + break; } } diff --git a/sound/virtio/virtio_kctl.c b/sound/virtio/virtio_kctl.c index ffb903d56297a7..45f7b6a5b3080a 100644 --- a/sound/virtio/virtio_kctl.c +++ b/sound/virtio/virtio_kctl.c @@ -18,6 +18,21 @@ static const snd_ctl_elem_type_t g_v2a_type_map[] = { [VIRTIO_SND_CTL_TYPE_IEC958] = SNDRV_CTL_ELEM_TYPE_IEC958 }; +/* Map for converting VirtIO types to maximum value counts. */ +static const unsigned int g_v2a_count_map[] = { + [VIRTIO_SND_CTL_TYPE_BOOLEAN] = + ARRAY_SIZE(((struct virtio_snd_ctl_value *)0)->value.integer), + [VIRTIO_SND_CTL_TYPE_INTEGER] = + ARRAY_SIZE(((struct virtio_snd_ctl_value *)0)->value.integer), + [VIRTIO_SND_CTL_TYPE_INTEGER64] = + ARRAY_SIZE(((struct virtio_snd_ctl_value *)0)->value.integer64), + [VIRTIO_SND_CTL_TYPE_ENUMERATED] = + ARRAY_SIZE(((struct virtio_snd_ctl_value *)0)->value.enumerated), + [VIRTIO_SND_CTL_TYPE_BYTES] = + ARRAY_SIZE(((struct virtio_snd_ctl_value *)0)->value.bytes), + [VIRTIO_SND_CTL_TYPE_IEC958] = 1 +}; + /* Map for converting VirtIO access rights to ALSA access rights. */ static const unsigned int g_v2a_access_map[] = { [VIRTIO_SND_CTL_ACCESS_READ] = SNDRV_CTL_ELEM_ACCESS_READ, @@ -36,6 +51,37 @@ static const unsigned int g_v2a_mask_map[] = { [VIRTIO_SND_CTL_EVT_MASK_TLV] = SNDRV_CTL_EVENT_MASK_TLV }; +static int virtsnd_kctl_validate_info(struct virtio_snd *snd, u32 cid, + struct virtio_snd_ctl_info *kinfo) +{ + struct virtio_device *vdev = snd->vdev; + unsigned int type = le32_to_cpu(kinfo->type); + unsigned int count = le32_to_cpu(kinfo->count); + + if (type >= ARRAY_SIZE(g_v2a_type_map)) { + dev_err(&vdev->dev, "control #%u: unknown type %u\n", + cid, type); + return -EINVAL; + } + + if (count > g_v2a_count_map[type] || + (type == VIRTIO_SND_CTL_TYPE_IEC958 && count != 1)) { + dev_err(&vdev->dev, "control #%u: invalid count %u for type %u\n", + cid, count, type); + return -EINVAL; + } + + if (type == VIRTIO_SND_CTL_TYPE_ENUMERATED && + !le32_to_cpu(kinfo->value.enumerated.items)) { + dev_err(&vdev->dev, + "control #%u: no items for enumerated control\n", + cid); + return -EINVAL; + } + + return 0; +} + /** * virtsnd_kctl_info() - Returns information about the control. * @kcontrol: ALSA control element. @@ -385,6 +431,10 @@ int virtsnd_kctl_parse_cfg(struct virtio_snd *snd) struct virtio_snd_ctl_info *kinfo = &snd->kctl_infos[i]; unsigned int type = le32_to_cpu(kinfo->type); + rc = virtsnd_kctl_validate_info(snd, i, kinfo); + if (rc) + return rc; + if (type == VIRTIO_SND_CTL_TYPE_ENUMERATED) { rc = virtsnd_kctl_get_enum_items(snd, i); if (rc) diff --git a/sound/virtio/virtio_pcm.c b/sound/virtio/virtio_pcm.c index eb9cc8131905da..be3893de40a559 100644 --- a/sound/virtio/virtio_pcm.c +++ b/sound/virtio/virtio_pcm.c @@ -77,7 +77,8 @@ static const struct virtsnd_v2a_rate g_v2a_rate_map[] = { [VIRTIO_SND_PCM_RATE_88200] = { SNDRV_PCM_RATE_88200, 88200 }, [VIRTIO_SND_PCM_RATE_96000] = { SNDRV_PCM_RATE_96000, 96000 }, [VIRTIO_SND_PCM_RATE_176400] = { SNDRV_PCM_RATE_176400, 176400 }, - [VIRTIO_SND_PCM_RATE_192000] = { SNDRV_PCM_RATE_192000, 192000 } + [VIRTIO_SND_PCM_RATE_192000] = { SNDRV_PCM_RATE_192000, 192000 }, + [VIRTIO_SND_PCM_RATE_384000] = { SNDRV_PCM_RATE_384000, 384000 } }; /** diff --git a/sound/virtio/virtio_pcm_ops.c b/sound/virtio/virtio_pcm_ops.c index 6297a9c61e70b0..1105e7ff3523a0 100644 --- a/sound/virtio/virtio_pcm_ops.c +++ b/sound/virtio/virtio_pcm_ops.c @@ -90,7 +90,8 @@ static const struct virtsnd_a2v_rate g_a2v_rate_map[] = { { 88200, VIRTIO_SND_PCM_RATE_88200 }, { 96000, VIRTIO_SND_PCM_RATE_96000 }, { 176400, VIRTIO_SND_PCM_RATE_176400 }, - { 192000, VIRTIO_SND_PCM_RATE_192000 } + { 192000, VIRTIO_SND_PCM_RATE_192000 }, + { 384000, VIRTIO_SND_PCM_RATE_384000 } }; static int virtsnd_pcm_sync_stop(struct snd_pcm_substream *substream); diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index 6673601246b382..eff29645719bc7 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -793,9 +793,10 @@ #define MSR_AMD64_LBR_SELECT 0xc000010e /* Zen4 */ -#define MSR_ZEN4_BP_CFG 0xc001102e +#define MSR_ZEN4_BP_CFG 0xc001102e #define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4 #define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5 +#define MSR_ZEN2_BP_CFG_BUG_FIX_BIT 33 /* Fam 19h MSRs */ #define MSR_F19H_UMC_PERF_CTL 0xc0010800 diff --git a/tools/include/uapi/linux/stddef.h b/tools/include/uapi/linux/stddef.h index c53cde425406b7..45749825949464 100644 --- a/tools/include/uapi/linux/stddef.h +++ b/tools/include/uapi/linux/stddef.h @@ -3,7 +3,6 @@ #define _LINUX_STDDEF_H - #ifndef __always_inline #define __always_inline __inline__ #endif @@ -36,6 +35,11 @@ struct __struct_group_tag(TAG) { MEMBERS } ATTRS NAME; \ } ATTRS +#ifdef __cplusplus +/* sizeof(struct{}) is 1 in C++, not 0, can't use C version of the macro. */ +#define __DECLARE_FLEX_ARRAY(T, member) \ + T member[0] +#else /** * __DECLARE_FLEX_ARRAY() - Declare a flexible array usable in a union * @@ -52,3 +56,23 @@ TYPE NAME[]; \ } #endif + +#ifndef __counted_by +#define __counted_by(m) +#endif + +#ifndef __counted_by_le +#define __counted_by_le(m) +#endif + +#ifndef __counted_by_be +#define __counted_by_be(m) +#endif + +#ifndef __counted_by_ptr +#define __counted_by_ptr(m) +#endif + +#define __kernel_nonstring + +#endif /* _LINUX_STDDEF_H */ diff --git a/tools/net/ynl/Makefile.deps b/tools/net/ynl/Makefile.deps index 08205f9fc52575..cc53b2f21c4446 100644 --- a/tools/net/ynl/Makefile.deps +++ b/tools/net/ynl/Makefile.deps @@ -15,9 +15,11 @@ UAPI_PATH:=../../../../include/uapi/ get_hdr_inc=-D$(1) -include $(UAPI_PATH)/linux/$(2) get_hdr_inc2=-D$(1) -D$(2) -include $(UAPI_PATH)/linux/$(3) +CFLAGS_dev-energymodel:=$(call get_hdr_inc,_LINUX_DEV_ENERGYMODEL_H,dev_energymodel.h) CFLAGS_devlink:=$(call get_hdr_inc,_LINUX_DEVLINK_H_,devlink.h) CFLAGS_dpll:=$(call get_hdr_inc,_LINUX_DPLL_H,dpll.h) -CFLAGS_ethtool:=$(call get_hdr_inc,_LINUX_ETHTOOL_H,ethtool.h) \ +CFLAGS_ethtool:=$(call get_hdr_inc,_LINUX_TYPELIMITS_H,typelimits.h) \ + $(call get_hdr_inc,_LINUX_ETHTOOL_H,ethtool.h) \ $(call get_hdr_inc,_LINUX_ETHTOOL_NETLINK_H_,ethtool_netlink.h) \ $(call get_hdr_inc,_LINUX_ETHTOOL_NETLINK_GENERATED_H,ethtool_netlink_generated.h) CFLAGS_handshake:=$(call get_hdr_inc,_LINUX_HANDSHAKE_H,handshake.h) diff --git a/tools/net/ynl/pyynl/ynl_gen_c.py b/tools/net/ynl/pyynl/ynl_gen_c.py index 0e1e486c1185a4..cdc3646f2642c4 100755 --- a/tools/net/ynl/pyynl/ynl_gen_c.py +++ b/tools/net/ynl/pyynl/ynl_gen_c.py @@ -3212,6 +3212,8 @@ def render_uapi(family, cw): for const in family['definitions']: if const.get('header'): continue + if const.get('scope', 'uapi') != 'uapi': + continue if const['type'] != 'const': cw.writes_defines(defines) @@ -3339,6 +3341,25 @@ def render_uapi(family, cw): cw.p(f'#endif /* {hdr_prot} */') +def render_scoped_consts(family, cw, scope): + defines = [] + for const in family['definitions']: + if const['type'] != 'const': + continue + if const.get('header'): + continue + if const.get('scope') != scope: + continue + name_pfx = const.get('name-prefix', f"{family.ident_name}-") + defines.append([ + c_upper(family.get('c-define-name', + f"{name_pfx}{const['name']}")), + const['value']]) + if defines: + cw.writes_defines(defines) + cw.nl() + + def _render_user_ntf_entry(ri, op): if not ri.family.is_classic(): ri.cw.block_start(line=f"[{op.enum_name}] = ") @@ -3504,8 +3525,12 @@ def main(): cw.p('#include "ynl.h"') headers = [] for definition in parsed['definitions'] + parsed['attribute-sets']: - if 'header' in definition: - headers.append(definition['header']) + if 'header' not in definition: + continue + scope = definition.get('scope', 'uapi') + if scope != 'uapi' and scope != args.mode: + continue + headers.append(definition['header']) if args.mode == 'user': headers.append(parsed.uapi_header) seen_header = [] @@ -3522,6 +3547,7 @@ def main(): for one in args.user_header: cw.p(f'#include "{one}"') else: + render_scoped_consts(parsed, cw, 'user') cw.p('struct ynl_sock;') cw.nl() render_user_family(parsed, cw, True) @@ -3529,6 +3555,7 @@ def main(): if args.mode == "kernel": if args.header: + render_scoped_consts(parsed, cw, 'kernel') for _, struct in sorted(parsed.pure_nested_structs.items()): if struct.request: cw.p('/* Common nested types */') diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c index f829b6f09bc9d3..fe30181e63367d 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c @@ -112,6 +112,10 @@ static void test_cubic(void) ASSERT_EQ(cubic_skel->bss->bpf_cubic_acked_called, 1, "pkts_acked called"); + ASSERT_TRUE(cubic_skel->bss->nodelay_init_reject, "init reject nodelay option"); + ASSERT_TRUE(cubic_skel->bss->nodelay_cwnd_event_tx_start_reject, + "cwnd_event_tx_start reject nodelay option"); + bpf_link__destroy(link); bpf_cubic__destroy(cubic_skel); } diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c b/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c index 53637431ec5de3..3a41c517b9182b 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c +++ b/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c @@ -190,7 +190,7 @@ static int getsetsockopt(void) fd = socket(AF_NETLINK, SOCK_RAW, 0); if (fd < 0) { log_err("Failed to create AF_NETLINK socket"); - return -1; + goto err; } buf.u32 = 1; @@ -211,6 +211,21 @@ static int getsetsockopt(void) } ASSERT_EQ(optlen, 8, "Unexpected NETLINK_LIST_MEMBERSHIPS value"); + /* Trick bpf_tcp_sock() with IPPROTO_TCP */ + close(fd); + fd = socket(AF_INET, SOCK_RAW, IPPROTO_TCP); + if (!ASSERT_OK_FD(fd, "socket")) + goto err; + + /* The BPF prog intercepts this before the kernel sees it, any + * optlen works. Go with 4 bytes for simplicity. + */ + buf.u32 = 1; + optlen = sizeof(buf.u32); + err = setsockopt(fd, SOL_TCP, TCP_SAVED_SYN, &buf, optlen); + if (!ASSERT_ERR(err, "setsockopt(TCP_SAVED_SYN)")) + goto err; + free(big_buf); close(fd); return 0; diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c b/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c index 56685fc03c7e9e..80e6315da2a515 100644 --- a/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c +++ b/tools/testing/selftests/bpf/prog_tests/tcp_hdr_options.c @@ -507,6 +507,10 @@ static void misc(void) ASSERT_EQ(misc_skel->bss->nr_hwtstamp, 0, "nr_hwtstamp"); + ASSERT_TRUE(misc_skel->bss->nodelay_est_ok, "nodelay_est_ok"); + ASSERT_TRUE(misc_skel->bss->nodelay_hdr_len_reject, "nodelay_hdr_len_reject"); + ASSERT_TRUE(misc_skel->bss->nodelay_write_hdr_reject, "nodelay_write_hdr_reject"); + check_linum: ASSERT_FALSE(check_error_linum(&sk_fds), "check_error_linum"); sk_fds_close(&sk_fds); diff --git a/tools/testing/selftests/bpf/progs/bpf_cubic.c b/tools/testing/selftests/bpf/progs/bpf_cubic.c index ce18a4db813fab..ebd5a1e69f5606 100644 --- a/tools/testing/selftests/bpf/progs/bpf_cubic.c +++ b/tools/testing/selftests/bpf/progs/bpf_cubic.c @@ -16,6 +16,7 @@ #include "bpf_tracing_net.h" #include +#include char _license[] SEC("license") = "GPL"; @@ -170,10 +171,18 @@ static void bictcp_hystart_reset(struct sock *sk) ca->sample_cnt = 0; } +bool nodelay_init_reject = false; +bool nodelay_cwnd_event_tx_start_reject = false; + SEC("struct_ops") void BPF_PROG(bpf_cubic_init, struct sock *sk) { struct bpf_bictcp *ca = inet_csk_ca(sk); + int true_val = 1, ret; + + ret = bpf_setsockopt(sk, SOL_TCP, TCP_NODELAY, &true_val, sizeof(true_val)); + if (ret == -EOPNOTSUPP) + nodelay_init_reject = true; bictcp_reset(ca); @@ -189,8 +198,13 @@ void BPF_PROG(bpf_cubic_cwnd_event_tx_start, struct sock *sk) { struct bpf_bictcp *ca = inet_csk_ca(sk); __u32 now = tcp_jiffies32; + int true_val = 1, ret; __s32 delta; + ret = bpf_setsockopt(sk, SOL_TCP, TCP_NODELAY, &true_val, sizeof(true_val)); + if (ret == -EOPNOTSUPP) + nodelay_cwnd_event_tx_start_reject = true; + delta = now - tcp_sk(sk)->lsndtime; /* We were application limited (idle) for a while. diff --git a/tools/testing/selftests/bpf/progs/sockopt_sk.c b/tools/testing/selftests/bpf/progs/sockopt_sk.c index cb990a7d3d4586..5e0b27e7855cb8 100644 --- a/tools/testing/selftests/bpf/progs/sockopt_sk.c +++ b/tools/testing/selftests/bpf/progs/sockopt_sk.c @@ -149,6 +149,20 @@ int _setsockopt(struct bpf_sockopt *ctx) if (sk && sk->family == AF_NETLINK) goto out; + if (sk && sk->family == AF_INET && sk->type == SOCK_RAW) { + struct bpf_tcp_sock *tp = bpf_tcp_sock(sk); + + if (tp) { + char saved_syn[60]; + + bpf_getsockopt(sk, SOL_TCP, TCP_SAVED_SYN, + &saved_syn, sizeof(saved_syn)); + goto consumed; + } + + goto out; + } + /* Make sure bpf_get_netns_cookie is callable. */ if (bpf_get_netns_cookie(NULL) == 0) @@ -224,6 +238,8 @@ int _setsockopt(struct bpf_sockopt *ctx) return 0; /* couldn't get sk storage */ storage->val = optval[0]; + +consumed: ctx->optlen = -1; /* BPF has consumed this option, don't call kernel * setsockopt handler. */ diff --git a/tools/testing/selftests/bpf/progs/test_misc_tcp_hdr_options.c b/tools/testing/selftests/bpf/progs/test_misc_tcp_hdr_options.c index d487153a839d71..ed5a0011b8639f 100644 --- a/tools/testing/selftests/bpf/progs/test_misc_tcp_hdr_options.c +++ b/tools/testing/selftests/bpf/progs/test_misc_tcp_hdr_options.c @@ -29,6 +29,10 @@ unsigned int nr_syn = 0; unsigned int nr_fin = 0; unsigned int nr_hwtstamp = 0; +bool nodelay_est_ok = false; +bool nodelay_hdr_len_reject = false; +bool nodelay_write_hdr_reject = false; + /* Check the header received from the active side */ static int __check_active_hdr_in(struct bpf_sock_ops *skops, bool check_syn) { @@ -300,7 +304,7 @@ static int handle_passive_estab(struct bpf_sock_ops *skops) SEC("sockops") int misc_estab(struct bpf_sock_ops *skops) { - int true_val = 1; + int true_val = 1, false_val = 0, ret; switch (skops->op) { case BPF_SOCK_OPS_TCP_LISTEN_CB: @@ -316,10 +320,19 @@ int misc_estab(struct bpf_sock_ops *skops) case BPF_SOCK_OPS_PARSE_HDR_OPT_CB: return handle_parse_hdr(skops); case BPF_SOCK_OPS_HDR_OPT_LEN_CB: + ret = bpf_setsockopt(skops, SOL_TCP, TCP_NODELAY, &true_val, sizeof(true_val)); + if (ret == -EOPNOTSUPP) + nodelay_hdr_len_reject = true; return handle_hdr_opt_len(skops); case BPF_SOCK_OPS_WRITE_HDR_OPT_CB: + ret = bpf_setsockopt(skops, SOL_TCP, TCP_NODELAY, &true_val, sizeof(true_val)); + if (ret == -EOPNOTSUPP) + nodelay_write_hdr_reject = true; return handle_write_hdr_opt(skops); case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB: + ret = bpf_setsockopt(skops, SOL_TCP, TCP_NODELAY, &false_val, sizeof(false_val)); + if (!ret) + nodelay_est_ok = true; return handle_passive_estab(skops); } diff --git a/tools/testing/selftests/cgroup/lib/cgroup_util.c b/tools/testing/selftests/cgroup/lib/cgroup_util.c index 6a7295347e90b1..42f54936f4bbd3 100644 --- a/tools/testing/selftests/cgroup/lib/cgroup_util.c +++ b/tools/testing/selftests/cgroup/lib/cgroup_util.c @@ -106,8 +106,9 @@ int cg_read_strcmp(const char *cgroup, const char *control, /* Handle the case of comparing against empty string */ if (!expected) return -1; - else - size = strlen(expected) + 1; + + /* needs size > 1, otherwise cg_read() reads 0 bytes */ + size = (expected[0] == '\0') ? 2 : strlen(expected) + 1; buf = malloc(size); if (!buf) diff --git a/tools/testing/selftests/cgroup/test_cpuset_v1_base.sh b/tools/testing/selftests/cgroup/test_cpuset_v1_base.sh index 42a6628fb8bc3e..1c0444729e707d 100755 --- a/tools/testing/selftests/cgroup/test_cpuset_v1_base.sh +++ b/tools/testing/selftests/cgroup/test_cpuset_v1_base.sh @@ -18,7 +18,7 @@ write_test() { echo "testing $interface $value" echo $value > $dir/$interface new=$(cat $dir/$interface) - [[ $value -ne $(cat $dir/$interface) ]] && { + [[ "$value" != "$new" ]] && { echo "$interface write $value failed: new:$new" exit 1 } diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c index eeabd34bf08370..12f59925500bd4 100644 --- a/tools/testing/selftests/cgroup/test_kmem.c +++ b/tools/testing/selftests/cgroup/test_kmem.c @@ -368,11 +368,15 @@ static int test_percpu_basic(const char *root) for (i = 0; i < 1000; i++) { child = cg_name_indexed(parent, "child", i); - if (!child) - return -1; + if (!child) { + ret = -1; + goto cleanup_children; + } - if (cg_create(child)) + if (cg_create(child)) { + free(child); goto cleanup_children; + } free(child); } diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile index 85ca4d1ecf9ecb..82809d5b24780c 100644 --- a/tools/testing/selftests/drivers/net/hw/Makefile +++ b/tools/testing/selftests/drivers/net/hw/Makefile @@ -31,6 +31,7 @@ TEST_PROGS = \ hw_stats_l3.sh \ hw_stats_l3_gre.sh \ iou-zcrx.py \ + ipsec_vxlan.py \ irq.py \ loopback.sh \ nic_timestamp.py \ diff --git a/tools/testing/selftests/drivers/net/hw/config b/tools/testing/selftests/drivers/net/hw/config index dd50cb8a79110c..8c132ace2b8de1 100644 --- a/tools/testing/selftests/drivers/net/hw/config +++ b/tools/testing/selftests/drivers/net/hw/config @@ -3,6 +3,10 @@ CONFIG_FAIL_FUNCTION=y CONFIG_FAULT_INJECTION=y CONFIG_FAULT_INJECTION_DEBUG_FS=y CONFIG_FUNCTION_ERROR_INJECTION=y +CONFIG_INET6_ESP=y +CONFIG_INET6_ESP_OFFLOAD=y +CONFIG_INET_ESP=y +CONFIG_INET_ESP_OFFLOAD=y CONFIG_IO_URING=y CONFIG_IPV6=y CONFIG_IPV6_GRE=y @@ -14,3 +18,4 @@ CONFIG_NETKIT=y CONFIG_NET_SCH_INGRESS=y CONFIG_UDMABUF=y CONFIG_VXLAN=y +CONFIG_XFRM_USER=y diff --git a/tools/testing/selftests/drivers/net/hw/ipsec_vxlan.py b/tools/testing/selftests/drivers/net/hw/ipsec_vxlan.py new file mode 100755 index 00000000000000..0740a4d852407e --- /dev/null +++ b/tools/testing/selftests/drivers/net/hw/ipsec_vxlan.py @@ -0,0 +1,204 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +"""Traffic test for VXLAN + IPsec crypto-offload.""" + +import os + +from lib.py import ksft_run, ksft_exit, ksft_eq, ksft_ge +from lib.py import ksft_variants, KsftNamedVariant, KsftSkipEx +from lib.py import CmdExitFailure, NetDrvEpEnv, cmd, defer, ethtool, ip +from lib.py import Iperf3Runner + +# Inner tunnel addresses - TEST-NET-2 (RFC 5737) / doc prefix (RFC 3849) +INNER_V4_LOCAL = "198.51.100.1" +INNER_V4_REMOTE = "198.51.100.2" +INNER_V6_LOCAL = "2001:db8:100::1" +INNER_V6_REMOTE = "2001:db8:100::2" + +# ESP parameters +SPI_OUT = "0x1000" +SPI_IN = "0x1001" +# 128-bit key + 32-bit salt = 20 bytes hex, 128-bit ICV +ESP_AEAD = "aead 'rfc4106(gcm(aes))' 0x" + "01" * 20 + " 128" + + +def xfrm(args, host=None): + """Runs 'ip xfrm' via shell to preserve parentheses in algo names.""" + cmd(f"ip xfrm {args}", shell=True, host=host) + + +def check_xfrm_offload_support(): + """Skips if iproute2 lacks xfrm offload support.""" + out = cmd("ip xfrm state help", fail=False) + if "offload" not in out.stdout + out.stderr: + raise KsftSkipEx("iproute2 too old, missing xfrm offload") + + +def check_esp_hw_offload(cfg): + """Skips if device lacks esp-hw-offload support.""" + check_xfrm_offload_support() + try: + feat = ethtool(f"-k {cfg.ifname}", json=True)[0] + except (CmdExitFailure, IndexError) as e: + raise KsftSkipEx(f"can't query features: {e}") from e + if not feat.get("esp-hw-offload", {}).get("active"): + raise KsftSkipEx("Device does not support esp-hw-offload") + + +def get_tx_drops(cfg): + """Returns TX dropped counter from the physical device.""" + stats = ip("-s -s link show dev " + cfg.ifname, json=True)[0] + return stats["stats64"]["tx"]["dropped"] + + +def setup_vxlan_ipsec(cfg, outer_ipver, inner_ipver): + """Sets up VXLAN tunnel with IPsec transport-mode crypto-offload.""" + vxlan_name = f"vx{os.getpid()}" + local_addr = cfg.addr_v[outer_ipver] + remote_addr = cfg.remote_addr_v[outer_ipver] + + if inner_ipver == "4": + inner_local = f"{INNER_V4_LOCAL}/24" + inner_remote = f"{INNER_V4_REMOTE}/24" + addr_extra = "" + else: + inner_local = f"{INNER_V6_LOCAL}/64" + inner_remote = f"{INNER_V6_REMOTE}/64" + addr_extra = " nodad" + + if outer_ipver == "6": + vxlan_opts = "udp6zerocsumtx udp6zerocsumrx" + else: + vxlan_opts = "noudpcsum" + + # VXLAN tunnel - local side + ip(f"link add {vxlan_name} type vxlan id 100 dstport 4789 {vxlan_opts} " + f"local {local_addr} remote {remote_addr} dev {cfg.ifname}") + defer(ip, f"link del {vxlan_name}") + ip(f"addr add {inner_local} dev {vxlan_name}{addr_extra}") + ip(f"link set {vxlan_name} up") + + # VXLAN tunnel - remote side + ip(f"link add {vxlan_name} type vxlan id 100 dstport 4789 {vxlan_opts} " + f"local {remote_addr} remote {local_addr} dev {cfg.remote_ifname}", + host=cfg.remote) + defer(ip, f"link del {vxlan_name}", host=cfg.remote) + ip(f"addr add {inner_remote} dev {vxlan_name}{addr_extra}", + host=cfg.remote) + ip(f"link set {vxlan_name} up", host=cfg.remote) + + # xfrm state - local outbound SA + xfrm(f"state add src {local_addr} dst {remote_addr} " + f"proto esp spi {SPI_OUT} " + f"{ESP_AEAD} " + f"mode transport offload crypto dev {cfg.ifname} dir out") + defer(xfrm, f"state del src {local_addr} dst {remote_addr} " + f"proto esp spi {SPI_OUT}") + + # xfrm state - local inbound SA + xfrm(f"state add src {remote_addr} dst {local_addr} " + f"proto esp spi {SPI_IN} " + f"{ESP_AEAD} " + f"mode transport offload crypto dev {cfg.ifname} dir in") + defer(xfrm, f"state del src {remote_addr} dst {local_addr} " + f"proto esp spi {SPI_IN}") + + # xfrm state - remote outbound SA (mirror, software crypto) + xfrm(f"state add src {remote_addr} dst {local_addr} " + f"proto esp spi {SPI_IN} " + f"{ESP_AEAD} " + f"mode transport", + host=cfg.remote) + defer(xfrm, f"state del src {remote_addr} dst {local_addr} " + f"proto esp spi {SPI_IN}", host=cfg.remote) + + # xfrm state - remote inbound SA (mirror, software crypto) + xfrm(f"state add src {local_addr} dst {remote_addr} " + f"proto esp spi {SPI_OUT} " + f"{ESP_AEAD} " + f"mode transport", + host=cfg.remote) + defer(xfrm, f"state del src {local_addr} dst {remote_addr} " + f"proto esp spi {SPI_OUT}", host=cfg.remote) + + # xfrm policy - local out + xfrm(f"policy add src {local_addr} dst {remote_addr} " + f"proto udp dport 4789 dir out " + f"tmpl src {local_addr} dst {remote_addr} proto esp mode transport") + defer(xfrm, f"policy del src {local_addr} dst {remote_addr} " + f"proto udp dport 4789 dir out") + + # xfrm policy - local in + xfrm(f"policy add src {remote_addr} dst {local_addr} " + f"proto udp dport 4789 dir in " + f"tmpl src {remote_addr} dst {local_addr} proto esp mode transport") + defer(xfrm, f"policy del src {remote_addr} dst {local_addr} " + f"proto udp dport 4789 dir in") + + # xfrm policy - remote out + xfrm(f"policy add src {remote_addr} dst {local_addr} " + f"proto udp dport 4789 dir out " + f"tmpl src {remote_addr} dst {local_addr} proto esp mode transport", + host=cfg.remote) + defer(xfrm, f"policy del src {remote_addr} dst {local_addr} " + f"proto udp dport 4789 dir out", host=cfg.remote) + + # xfrm policy - remote in + xfrm(f"policy add src {local_addr} dst {remote_addr} " + f"proto udp dport 4789 dir in " + f"tmpl src {local_addr} dst {remote_addr} proto esp mode transport", + host=cfg.remote) + defer(xfrm, f"policy del src {local_addr} dst {remote_addr} " + f"proto udp dport 4789 dir in", host=cfg.remote) + + +def _vxlan_ipsec_variants(): + """Generates outer/inner IP version variants.""" + for outer in ["4", "6"]: + for inner in ["4", "6"]: + yield KsftNamedVariant(f"outer_v{outer}_inner_v{inner}", outer, inner) + + +@ksft_variants(_vxlan_ipsec_variants()) +def test_vxlan_ipsec_crypto_offload(cfg, outer_ipver, inner_ipver): + """Tests VXLAN+IPsec crypto-offload has no TX drops.""" + cfg.require_ipver(outer_ipver) + check_esp_hw_offload(cfg) + + setup_vxlan_ipsec(cfg, outer_ipver, inner_ipver) + + if inner_ipver == "4": + inner_local = INNER_V4_LOCAL + inner_remote = INNER_V4_REMOTE + ping = "ping" + else: + inner_local = INNER_V6_LOCAL + inner_remote = INNER_V6_REMOTE + ping = "ping -6" + + cmd(f"{ping} -c 1 -W 2 {inner_remote}") + + drops_before = get_tx_drops(cfg) + + runner = Iperf3Runner(cfg, server_ip=inner_local, + client_ip=inner_remote) + bw_gbps = runner.measure_bandwidth(reverse=True) + + cfg.wait_hw_stats_settle() + drops_after = get_tx_drops(cfg) + + ksft_eq(drops_after - drops_before, 0, + comment="TX drops during VXLAN+IPsec") + ksft_ge(bw_gbps, 0.1, + comment="Minimum 100Mbps over VXLAN+IPsec") + + +def main(): + """Runs VXLAN+IPsec crypto-offload GSO selftest.""" + with NetDrvEpEnv(__file__, nsim_test=False) as cfg: + ksft_run([test_vxlan_ipsec_crypto_offload], args=(cfg,)) + ksft_exit() + + +if __name__ == "__main__": + main() diff --git a/tools/testing/selftests/drivers/net/lib/py/load.py b/tools/testing/selftests/drivers/net/lib/py/load.py index f181fa2d38fca0..e24660e5c27f3b 100644 --- a/tools/testing/selftests/drivers/net/lib/py/load.py +++ b/tools/testing/selftests/drivers/net/lib/py/load.py @@ -48,7 +48,10 @@ def start_client(self, background=False, streams=1, duration=10, reverse=False): Starts the iperf3 client with the configured options. """ cmdline = self._build_client(streams, duration, reverse) - return cmd(cmdline, background=background, host=self.env.remote) + kwargs = {"background": background, "host": self.env.remote} + if not background: + kwargs["timeout"] = duration + 5 + return cmd(cmdline, **kwargs) def measure_bandwidth(self, reverse=False): """ diff --git a/tools/testing/selftests/drivers/net/shaper.py b/tools/testing/selftests/drivers/net/shaper.py index 11310f19bfa028..e39d270e688df2 100755 --- a/tools/testing/selftests/drivers/net/shaper.py +++ b/tools/testing/selftests/drivers/net/shaper.py @@ -1,7 +1,10 @@ #!/usr/bin/env python3 # SPDX-License-Identifier: GPL-2.0 -from lib.py import ksft_run, ksft_exit, ksft_eq, ksft_true, KsftSkipEx +import errno + +from lib.py import ksft_run, ksft_exit +from lib.py import ksft_eq, ksft_raises, ksft_true, KsftSkipEx from lib.py import EthtoolFamily, NetshaperFamily from lib.py import NetDrvEnv from lib.py import NlError @@ -438,6 +441,21 @@ def queue_update(cfg, nl_shaper) -> None: nl_shaper.delete({'ifindex': cfg.ifindex, 'handle': {'scope': 'queue', 'id': i}}) +def dup_leaves(cfg, nl_shaper) -> None: + """ Ensure that the kernel rejects duplicate leaves. """ + if not cfg.groups: + raise KsftSkipEx("device does not support node scope") + + with ksft_raises(NlError) as cm: + nl_shaper.group({ + 'ifindex': cfg.ifindex, + 'leaves':[{'handle': {'scope': 'queue', 'id': 0}}, + {'handle': {'scope': 'queue', 'id': 0}}], + 'handle': {'scope':'node'}, + 'metric': 'bps', + 'bw-max': 10000}) + ksft_eq(cm.exception.error, errno.EINVAL) + def main() -> None: with NetDrvEnv(__file__, queue_count=4) as cfg: cfg.queues = False @@ -453,7 +471,9 @@ def main() -> None: basic_groups, qgroups, delegation, - queue_update], args=(cfg, NetshaperFamily())) + dup_leaves, + queue_update], + args=(cfg, NetshaperFamily())) ksft_exit() diff --git a/tools/testing/selftests/kselftest.h b/tools/testing/selftests/kselftest.h index 6d809f08ab7b1b..60838b61a2da52 100644 --- a/tools/testing/selftests/kselftest.h +++ b/tools/testing/selftests/kselftest.h @@ -450,7 +450,7 @@ static inline __noreturn __printf(1, 2) void ksft_exit_skip(const char *msg, ... */ if (ksft_plan || ksft_test_num()) { ksft_cnt.ksft_xskip++; - printf("ok %u # SKIP ", 1 + ksft_test_num()); + printf("ok %u # SKIP ", ksft_test_num()); } else { printf("1..0 # SKIP "); } diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h index 75fb016cd190bb..cfdce9cd252e75 100644 --- a/tools/testing/selftests/kselftest_harness.h +++ b/tools/testing/selftests/kselftest_harness.h @@ -76,7 +76,7 @@ static inline void __kselftest_memset_safe(void *s, int c, size_t n) memset(s, c, n); } -#define KSELFTEST_PRIO_TEST_F 20000 +#define KSELFTEST_PRIO_TEST 20000 #define KSELFTEST_PRIO_XFAIL 20001 #define TEST_TIMEOUT_DEFAULT 30 @@ -194,7 +194,7 @@ static inline void __kselftest_memset_safe(void *s, int c, size_t n) .fixture = &_fixture_global, \ .termsig = _signal, \ .timeout = TEST_TIMEOUT_DEFAULT, }; \ - static void __attribute__((constructor)) _register_##test_name(void) \ + static void __attribute__((constructor(KSELFTEST_PRIO_TEST))) _register_##test_name(void) \ { \ __register_test(&_##test_name##_object); \ } \ @@ -238,7 +238,7 @@ static inline void __kselftest_memset_safe(void *s, int c, size_t n) FIXTURE_VARIANT(fixture_name); \ static struct __fixture_metadata _##fixture_name##_fixture_object = \ { .name = #fixture_name, }; \ - static void __attribute__((constructor)) \ + static void __attribute__((constructor(KSELFTEST_PRIO_TEST))) \ _register_##fixture_name##_data(void) \ { \ __register_fixture(&_##fixture_name##_fixture_object); \ @@ -364,7 +364,7 @@ static inline void __kselftest_memset_safe(void *s, int c, size_t n) _##fixture_name##_##variant_name##_object = \ { .name = #variant_name, \ .data = &_##fixture_name##_##variant_name##_variant}; \ - static void __attribute__((constructor)) \ + static void __attribute__((constructor(KSELFTEST_PRIO_TEST))) \ _register_##fixture_name##_##variant_name(void) \ { \ __register_fixture_variant(&_##fixture_name##_fixture_object, \ @@ -468,7 +468,7 @@ static inline void __kselftest_memset_safe(void *s, int c, size_t n) fixture_name##_teardown(_metadata, self, variant); \ } \ static struct __test_metadata *_##fixture_name##_##test_name##_object; \ - static void __attribute__((constructor(KSELFTEST_PRIO_TEST_F))) \ + static void __attribute__((constructor(KSELFTEST_PRIO_TEST))) \ _register_##fixture_name##_##test_name(void) \ { \ struct __test_metadata *object = mmap(NULL, sizeof(*object), \ @@ -1323,7 +1323,7 @@ static int test_harness_run(int argc, char **argv) return KSFT_FAIL; } -static void __attribute__((constructor)) __constructor_order_first(void) +static void __attribute__((constructor(KSELFTEST_PRIO_TEST))) __constructor_order_first(void) { __constructor_order_forward = true; } diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c index d6528c6f5e031d..253e748c1d4aa8 100644 --- a/tools/testing/selftests/kvm/guest_memfd_test.c +++ b/tools/testing/selftests/kvm/guest_memfd_test.c @@ -510,7 +510,12 @@ static void test_guest_memfd_guest(void) "Default VM type should support INIT_SHARED, supported flags = 0x%x", vm_check_cap(vm, KVM_CAP_GUEST_MEMFD_FLAGS)); - size = vm->page_size; + /* + * Use the max of the host or guest page size for all operations, as + * KVM requires guest_memfd files and memslots to be sized to multiples + * of the host page size. + */ + size = max_t(size_t, vm->page_size, page_size); fd = vm_create_guest_memfd(vm, size, GUEST_MEMFD_FLAG_MMAP | GUEST_MEMFD_FLAG_INIT_SHARED); vm_set_user_memory_region2(vm, slot, KVM_MEM_GUEST_MEMFD, gpa, size, NULL, fd, 0); @@ -519,7 +524,7 @@ static void test_guest_memfd_guest(void) memset(mem, 0xaa, size); kvm_munmap(mem, size); - virt_pg_map(vm, gpa, gpa); + virt_map(vm, gpa, gpa, size / vm->page_size); vcpu_args_set(vcpu, 2, gpa, size); vcpu_run(vcpu); diff --git a/tools/testing/selftests/kvm/steal_time.c b/tools/testing/selftests/kvm/steal_time.c index 7df2bc8eec0221..76fcdd1fd3cb48 100644 --- a/tools/testing/selftests/kvm/steal_time.c +++ b/tools/testing/selftests/kvm/steal_time.c @@ -220,6 +220,8 @@ static void check_steal_time_uapi(void) }; vcpu_ioctl(vcpu, KVM_HAS_DEVICE_ATTR, &dev); + vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, ST_GPA_BASE, 1, 1, 0); + virt_map(vm, ST_GPA_BASE, ST_GPA_BASE, 1); st_ipa = (ulong)ST_GPA_BASE | 1; ret = __vcpu_ioctl(vcpu, KVM_SET_DEVICE_ATTR, &dev); diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index a275ed5840265c..f3da38c54d276d 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -96,6 +96,7 @@ TEST_PROGS := \ srv6_hl2encap_red_l2vpn_test.sh \ srv6_iptunnel_cache.sh \ stress_reuseport_listen.sh \ + tcp_ecmp_failover.sh \ tcp_fastopen_backup_key.sh \ test_bpf.sh \ test_bridge_backup_port.sh \ diff --git a/tools/testing/selftests/net/mptcp/mptcp_lib.sh b/tools/testing/selftests/net/mptcp/mptcp_lib.sh index 5fea7e7df628c8..989a5975dcea62 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh @@ -474,20 +474,24 @@ mptcp_lib_wait_local_port_listen() { wait_local_port_listen "${@}" "tcp" } +# $1: error file, $2: cmd, $3: expected msg, [$4: expected error] mptcp_lib_check_output() { local err="${1}" local cmd="${2}" local expected="${3}" + local exp_error="${4:-0}" local cmd_ret=0 local out - if ! out=$(${cmd} 2>"${err}"); then - cmd_ret=${?} - fi + out=$(${cmd} 2>"${err}") || cmd_ret=1 - if [ ${cmd_ret} -ne 0 ]; then - mptcp_lib_pr_fail "command execution '${cmd}' stderr" - cat "${err}" + if [ "${cmd_ret}" != "${exp_error}" ]; then + mptcp_lib_pr_fail "unexpected returned code for '${cmd}', info:" + if [ "${exp_error}" = 0 ]; then + cat "${err}" + else + echo "${out}" + fi return 2 elif [ "${out}" = "${expected}" ]; then return 0 diff --git a/tools/testing/selftests/net/mptcp/pm_netlink.sh b/tools/testing/selftests/net/mptcp/pm_netlink.sh index 123d9d7a0278cd..04594dfc22b134 100755 --- a/tools/testing/selftests/net/mptcp/pm_netlink.sh +++ b/tools/testing/selftests/net/mptcp/pm_netlink.sh @@ -122,10 +122,12 @@ check() local cmd="$1" local expected="$2" local msg="$3" + local exp_error="$4" local rc=0 mptcp_lib_print_title "$msg" - mptcp_lib_check_output "${err}" "${cmd}" "${expected}" || rc=${?} + mptcp_lib_check_output "${err}" "${cmd}" "${expected}" "${exp_error}" || + rc=${?} if [ ${rc} -eq 2 ]; then mptcp_lib_result_fail "${msg} # error ${rc}" ret=${KSFT_FAIL} @@ -158,13 +160,13 @@ check "show_endpoints" \ "3,10.0.1.3,signal backup")" "dump addrs" del_endpoint 2 -check "get_endpoint 2" "" "simple del addr" +check "get_endpoint 2" "" "simple del addr" 1 check "show_endpoints" \ "$(format_endpoints "1,10.0.1.1" \ "3,10.0.1.3,signal backup")" "dump addrs after del" add_endpoint 10.0.1.3 2>/dev/null -check "get_endpoint 4" "" "duplicate addr" +check "get_endpoint 4" "" "duplicate addr" 1 add_endpoint 10.0.1.4 flags signal check "get_endpoint 4" "$(format_endpoints "4,10.0.1.4,signal")" "id addr increment" @@ -173,7 +175,7 @@ for i in $(seq 5 9); do add_endpoint "10.0.1.${i}" flags signal >/dev/null 2>&1 done check "get_endpoint 9" "$(format_endpoints "9,10.0.1.9,signal")" "hard addr limit" -check "get_endpoint 10" "" "above hard addr limit" +check "get_endpoint 10" "" "above hard addr limit" 1 del_endpoint 9 for i in $(seq 10 255); do @@ -192,9 +194,13 @@ check "show_endpoints" \ flush_endpoint check "show_endpoints" "" "flush addrs" -add_endpoint 10.0.1.1 flags unknown -check "show_endpoints" "$(format_endpoints "1,10.0.1.1")" "ignore unknown flags" -flush_endpoint +# "unknown" flag is only supported by pm_nl_ctl +if ! mptcp_lib_is_ip_mptcp; then + add_endpoint 10.0.1.1 flags unknown + check "show_endpoints" "$(format_endpoints "1,10.0.1.1")" \ + "ignore unknown flags" + flush_endpoint +fi set_limits 9 1 2>/dev/null check "get_limits" "${default_limits}" "rcv addrs above hard limit" diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh index b327d3061ed53a..3cdd953f681324 100755 --- a/tools/testing/selftests/net/openvswitch/openvswitch.sh +++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh @@ -26,6 +26,7 @@ tests=" netlink_checks ovsnl: validate netlink attrs and settings upcall_interfaces ovs: test the upcall interfaces tunnel_metadata ovs: test extraction of tunnel metadata + tunnel_refcount ovs: test tunnel vport reference cleanup drop_reason drop: test drop reasons are emitted psample psample: Sampling packets with psample" @@ -830,6 +831,42 @@ test_tunnel_metadata() { return 0 } +test_tunnel_refcount() { + sbxname="test_tunnel_refcount" + sbx_add "${sbxname}" || return 1 + + ovs_sbx "${sbxname}" ip netns add trefns || return 1 + on_exit "ovs_sbx ${sbxname} ip netns del trefns" + + for tun_type in gre vxlan geneve; do + info "testing ${tun_type} tunnel vport refcount" + + ovs_sbx "${sbxname}" ip netns exec trefns \ + python3 $ovs_base/ovs-dpctl.py \ + add-dp dp-${tun_type} || return 1 + + ovs_sbx "${sbxname}" ip netns exec trefns \ + python3 $ovs_base/ovs-dpctl.py \ + add-if --no-lwt -t ${tun_type} \ + dp-${tun_type} ovs-${tun_type}0 || return 1 + + ovs_wait ip -netns trefns link show \ + ovs-${tun_type}0 >/dev/null 2>&1 || return 1 + + info "deleting dp - may hang if reference counting is broken" + ovs_sbx "${sbxname}" ip netns exec trefns \ + python3 $ovs_base/ovs-dpctl.py \ + del-dp dp-${tun_type} & + + dev_removed() { + ! ip -netns trefns link show "$1" >/dev/null 2>&1 + } + ovs_wait dev_removed dp-${tun_type} || return 1 + ovs_wait dev_removed ovs-${tun_type}0 || return 1 + done + return 0 +} + run_test() { ( tname="$1" diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py index 848f61fdcee09a..bbe35e2718d269 100644 --- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py +++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py @@ -11,7 +11,6 @@ import math import multiprocessing import re -import socket import struct import sys import time @@ -2069,7 +2068,7 @@ def str_to_type(vport_type): elif vport_type == "internal": return OvsVport.OVS_VPORT_TYPE_INTERNAL elif vport_type == "gre": - return OvsVport.OVS_VPORT_TYPE_INTERNAL + return OvsVport.OVS_VPORT_TYPE_GRE elif vport_type == "vxlan": return OvsVport.OVS_VPORT_TYPE_VXLAN elif vport_type == "geneve": @@ -2121,6 +2120,7 @@ def attach(self, dpindex, vport_ifname, ptype, dport, lwt): ) TUNNEL_DEFAULTS = [("geneve", 6081), + ("gre", 0), ("vxlan", 4789)] for tnl in TUNNEL_DEFAULTS: @@ -2129,9 +2129,13 @@ def attach(self, dpindex, vport_ifname, ptype, dport, lwt): dport = tnl[1] if not lwt: + if tnl[0] == "gre": + # GRE tunnels have no options. + break + vportopt = OvsVport.ovs_vport_msg.vportopts() vportopt["attrs"].append( - ["OVS_TUNNEL_ATTR_DST_PORT", socket.htons(dport)] + ["OVS_TUNNEL_ATTR_DST_PORT", dport] ) msg["attrs"].append( ["OVS_VPORT_ATTR_OPTIONS", vportopt] @@ -2145,6 +2149,9 @@ def attach(self, dpindex, vport_ifname, ptype, dport, lwt): geneve_port=dport, geneve_collect_metadata=True, geneve_udp_zero_csum6_rx=1) + elif tnl[0] == "gre": + ipr.link("add", ifname=vport_ifname, kind="gretap", + gre_collect_metadata=True) elif tnl[0] == "vxlan": ipr.link("add", ifname=vport_ifname, kind=tnl[0], vxlan_learning=0, vxlan_collect_metadata=1, @@ -2563,7 +2570,7 @@ def print_ovsdp_full(dp_lookup_rep, ifindex, ndb=NDB(), vpl=OvsVport()): if vpo: dpo = vpo.get_attr("OVS_TUNNEL_ATTR_DST_PORT") if dpo: - opts += " tnl-dport:%s" % socket.ntohs(dpo) + opts += " tnl-dport:%s" % dpo print( " port %d: %s (%s%s)" % ( @@ -2632,7 +2639,7 @@ def main(argv): "--ptype", type=str, default="netdev", - choices=["netdev", "internal", "geneve", "vxlan"], + choices=["netdev", "internal", "gre", "geneve", "vxlan"], help="Interface type (default netdev)", ) addifcmd.add_argument( @@ -2645,7 +2652,7 @@ def main(argv): addifcmd.add_argument( "-l", "--lwt", - type=bool, + action=argparse.BooleanOptionalAction, default=True, help="Use LWT infrastructure instead of vport (default true)." ) diff --git a/tools/testing/selftests/net/ovpn/test.sh b/tools/testing/selftests/net/ovpn/test.sh index b50dbe45a4d00a..c06e3135fbef8e 100755 --- a/tools/testing/selftests/net/ovpn/test.sh +++ b/tools/testing/selftests/net/ovpn/test.sh @@ -98,10 +98,10 @@ ovpn_run_basic_traffic() { sleep 0.3 ovpn_cmd_ok "send baseline traffic to peer ${p}" \ ip netns exec ovpn_peer0 \ - ping -qfc 500 -w 3 5.5.5.$((p + 1)) + ping -qfc 100 -w 3 5.5.5.$((p + 1)) ovpn_cmd_ok "send large-payload traffic to peer ${p}" \ ip netns exec ovpn_peer0 \ - ping -qfc 500 -s 3000 -w 3 5.5.5.$((p + 1)) + ping -qfc 100 -s 3000 -w 3 5.5.5.$((p + 1)) wait "${tcpdump_pid1}" || return 1 wait "${tcpdump_pid2}" || return 1 diff --git a/tools/testing/selftests/net/tcp_ecmp_failover.sh b/tools/testing/selftests/net/tcp_ecmp_failover.sh new file mode 100755 index 00000000000000..5768aa8bff6a61 --- /dev/null +++ b/tools/testing/selftests/net/tcp_ecmp_failover.sh @@ -0,0 +1,216 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright 2026 Google LLC. +# +# This test verifies TCP flow failover between ECMP routes +# upon carrier loss on the active device. +# +# socat -----------------------------> socat +# | +# .-- veth-c1 -|- veth-s1 --. +# dummy0 -| | |-- dummy0 +# '-- veth-c2 -|- veth-s2 --' +# | +# + +REQUIRE_JQ=no +REQUIRE_MZ=no +NUM_NETIFS=0 + +source forwarding/lib.sh + +CLIENT_IP="10.0.59.1" +SERVER_IP="10.0.92.1" +CLIENT_IP6="2001:db8:5a9a::1" +SERVER_IP6="2001:db8:9292::1" + +setup_server() +{ + IP="ip -n $server" + NS_EXEC="ip netns exec $server" + + $IP link add dummy0 type dummy + $IP link set dummy0 up + + $IP -4 addr add $SERVER_IP/32 dev dummy0 + $IP -6 addr add $SERVER_IP6/128 dev dummy0 nodad + + $IP link set veth-s1 up + $IP link set veth-s2 up + + $IP -4 addr add 192.168.1.2/24 dev veth-s1 + $IP -4 addr add 192.168.2.2/24 dev veth-s2 + + $IP -4 route add $CLIENT_IP/32 \ + nexthop via 192.168.1.1 dev veth-s1 weight 1 \ + nexthop via 192.168.2.1 dev veth-s2 weight 1 + + $IP -6 addr add 2001:db8:1::2/64 dev veth-s1 nodad + $IP -6 addr add 2001:db8:2::2/64 dev veth-s2 nodad + + $IP -6 route add $CLIENT_IP6/128 \ + nexthop via 2001:db8:1::1 dev veth-s1 weight 1 \ + nexthop via 2001:db8:2::1 dev veth-s2 weight 1 +} + +setup_client() +{ + IP="ip -n $client" + NS_EXEC="ip netns exec $client" + + $IP link add dummy0 type dummy + $IP link set dummy0 up + + $IP -4 addr add $CLIENT_IP/32 dev dummy0 + $IP -6 addr add $CLIENT_IP6/128 dev dummy0 nodad + + $IP link set veth-c1 up + $IP link set veth-c2 up + + $IP -4 addr add 192.168.1.1/24 dev veth-c1 + $IP -4 addr add 192.168.2.1/24 dev veth-c2 + + $IP -4 route add $SERVER_IP/32 \ + nexthop via 192.168.1.2 dev veth-c1 weight 1 \ + nexthop via 192.168.2.2 dev veth-c2 weight 1 + + $IP -6 addr add 2001:db8:1::1/64 dev veth-c1 nodad + $IP -6 addr add 2001:db8:2::1/64 dev veth-c2 nodad + + $IP -6 route add $SERVER_IP6/128 \ + nexthop via 2001:db8:1::2 dev veth-c1 weight 1 \ + nexthop via 2001:db8:2::2 dev veth-c2 weight 1 + + # By default, tcp_retries1=3 triggers a route refresh + # after 3 retransmits (~5s). Ensure this never occurs + # for test stability. + $NS_EXEC sysctl -qw net.ipv4.tcp_retries1=100 + + # When NETDEV_CHANGE is issued for a dev tied to an ECMP + # route, RTNH_F_LINKDOWN is flagged and the sernum is + # bumped to invalidate the route via sk_dst_check(). + # + # Without ignore_routes_with_linkdown=1, subsequent + # lookups may still select the same RTNH_F_LINKDOWN route. + $NS_EXEC sysctl -qw net.ipv4.conf.veth-c1.ignore_routes_with_linkdown=1 + $NS_EXEC sysctl -qw net.ipv4.conf.veth-c2.ignore_routes_with_linkdown=1 + + $NS_EXEC sysctl -qw net.ipv6.conf.veth-c1.ignore_routes_with_linkdown=1 + $NS_EXEC sysctl -qw net.ipv6.conf.veth-c2.ignore_routes_with_linkdown=1 +} + +setup() +{ + setup_ns client server + + ip -n "$client" link add veth-c1 type veth peer veth-s1 netns "$server" + ip -n "$client" link add veth-c2 type veth peer veth-s2 netns "$server" + + setup_server + setup_client +} + +cleanup() +{ + cleanup_all_ns > /dev/null 2>&1 +} + +tcp_ecmp_failover() +{ + local pf=$1; shift + local server_ip=$1; shift + local client_ip=$1; shift + + RET=0 + + tcpdump_start veth-s1 "$server" + tcpdump_start veth-s2 "$server" + + ip netns exec "$server" \ + socat -u TCP-LISTEN:8080,pf="$pf",bind="$server_ip",reuseaddr /dev/null & + server_pid=$! + + # Wait for server to start listening. + # Sometimes client fails without this sleep. + sleep 1 + + ip netns exec "$client" \ + socat -u /dev/zero TCP:"$server_ip":8080,pf="$pf",bind="$client_ip" & + client_pid=$! + + # To capture enough packets. + sleep 3 + + tcpdump_stop veth-s1 + tcpdump_stop veth-s2 + + pkts_s1=$(tcpdump_show veth-s1 | wc -l) + pkts_s2=$(tcpdump_show veth-s2 | wc -l) + + tcpdump_cleanup veth-s1 + tcpdump_cleanup veth-s2 + + # Detect the device chosen by the client + if [ "$pkts_s1" -gt "$pkts_s2" ]; then + veth_down=veth-s1 + veth_up=veth-s2 + else + veth_down=veth-s2 + veth_up=veth-s1 + fi + + # Taking down $veth_down causes its peer to lose carrier, + # triggering NETDEV_CHANGE. This flags RTNH_F_LINKDOWN + # and bumps the sernum for the route associated with that + # peer, invalidating the cached dst in the TCP socket. + # + # Consequently, sk_dst_check() fails, forcing the subsequent + # lookup to select the remaining healthy route via $veth_up. + ip -n "$server" link set "$veth_down" down + + tcpdump_start "$veth_up" "$server" + + # To capture enough packets. + sleep 3 + + tcpdump_stop "$veth_up" + + kill -9 "$client_pid" > /dev/null 2>&1 + kill -9 "$server_pid" > /dev/null 2>&1 + wait 2> /dev/null + + pkts=$(tcpdump_show $veth_up | wc -l) + + tcpdump_cleanup "$veth_up" + + if [ "$pkts" -lt 1000 ]; then + RET=$ksft_fail + fi +} + +test_ipv4() +{ + setup + tcp_ecmp_failover IPv4 $SERVER_IP $CLIENT_IP + log_test "TCP IPv4 failover" + cleanup +} + +test_ipv6() +{ + setup + tcp_ecmp_failover IPv6 "[$SERVER_IP6]" "[$CLIENT_IP6]" + log_test "TCP IPv6 failover" + cleanup +} + +require_command socat +require_command tcpdump + +trap cleanup EXIT + +test_ipv4 +test_ipv6 + +exit "$EXIT_STATUS" diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c index 9e2ccea13d7023..30a236b8e9f733 100644 --- a/tools/testing/selftests/net/tls.c +++ b/tools/testing/selftests/net/tls.c @@ -946,6 +946,49 @@ TEST_F(tls, peek_and_splice) EXPECT_EQ(memcmp(mem_send, mem_recv, send_len), 0); } +TEST_F(tls, splice_to_pipe_small) +{ + int send_len = TLS_PAYLOAD_MAX_LEN; + char mem_send[TLS_PAYLOAD_MAX_LEN]; + char mem_recv[TLS_PAYLOAD_MAX_LEN]; + size_t total = 0; + int p[2]; + + memrnd(mem_send, sizeof(mem_send)); + + ASSERT_GE(pipe(p), 0); + + /* Shrink pipe to 1 page (typically 4096 bytes) to force multiple + * splice iterations for a 16384-byte TLS record. + */ + EXPECT_GE(fcntl(p[1], F_SETPIPE_SZ, 4096), 4096); + + EXPECT_EQ(send(self->fd, mem_send, send_len, 0), send_len); + + while (total < (size_t)send_len) { + ssize_t spliced, drained; + + spliced = splice(self->cfd, NULL, p[1], NULL, + send_len - total, 0); + EXPECT_GT(spliced, 0); + if (spliced <= 0) + break; + + drained = read(p[0], mem_recv + total, spliced); + EXPECT_EQ(drained, spliced); + if (drained <= 0) + break; + + total += drained; + } + + EXPECT_EQ(total, (size_t)send_len); + EXPECT_EQ(memcmp(mem_send, mem_recv, send_len), 0); + + close(p[0]); + close(p[1]); +} + #define MAX_FRAGS 48 TEST_F(tls, splice_short) { diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile index 4ef90823b6526f..50d69e22ee7a67 100644 --- a/tools/testing/selftests/rseq/Makefile +++ b/tools/testing/selftests/rseq/Makefile @@ -14,14 +14,20 @@ LDLIBS += -lpthread -ldl # still track changes to header files and depend on shared object. OVERRIDE_TARGETS = 1 -TEST_GEN_PROGS = basic_test basic_percpu_ops_test basic_percpu_ops_mm_cid_test param_test \ - param_test_benchmark param_test_compare_twice param_test_mm_cid \ - param_test_mm_cid_benchmark param_test_mm_cid_compare_twice \ - syscall_errors_test slice_test +TEST_GEN_PROGS = basic_test basic_percpu_ops_test basic_percpu_ops_mm_cid_test \ + param_test_benchmark param_test_mm_cid_benchmark -TEST_GEN_PROGS_EXTENDED = librseq.so +TEST_GEN_PROGS_EXTENDED = librseq.so \ + param_test \ + param_test_compare_twice \ + param_test_mm_cid \ + param_test_mm_cid_compare_twice \ + syscall_errors_test \ + legacy_check \ + slice_test \ + check_optimized -TEST_PROGS = run_param_test.sh run_syscall_errors_test.sh +TEST_PROGS = run_param_test.sh run_syscall_errors_test.sh run_legacy_check.sh run_timeslice_test.sh TEST_FILES := settings @@ -62,3 +68,6 @@ $(OUTPUT)/syscall_errors_test: syscall_errors_test.c $(TEST_GEN_PROGS_EXTENDED) $(OUTPUT)/slice_test: slice_test.c $(TEST_GEN_PROGS_EXTENDED) rseq.h rseq-*.h $(CC) $(CFLAGS) $< $(LDLIBS) -lrseq -o $@ + +$(OUTPUT)/check_optimized: check_optimized.c $(TEST_GEN_PROGS_EXTENDED) rseq.h rseq-*.h + $(CC) $(CFLAGS) $< $(LDLIBS) -lrseq -o $@ diff --git a/tools/testing/selftests/rseq/check_optimized.c b/tools/testing/selftests/rseq/check_optimized.c new file mode 100644 index 00000000000000..a13e3f2c8fc62f --- /dev/null +++ b/tools/testing/selftests/rseq/check_optimized.c @@ -0,0 +1,17 @@ +// SPDX-License-Identifier: LGPL-2.1 +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include + +#include "rseq.h" + +int main(int argc, char **argv) +{ + if (__rseq_register_current_thread(true, false)) + return -1; + return 0; +} diff --git a/tools/testing/selftests/rseq/legacy_check.c b/tools/testing/selftests/rseq/legacy_check.c new file mode 100644 index 00000000000000..3f7de4e2830330 --- /dev/null +++ b/tools/testing/selftests/rseq/legacy_check.c @@ -0,0 +1,65 @@ +// SPDX-License-Identifier: GPL-2.0 +#ifndef _GNU_SOURCE +#define _GNU_SOURCE +#endif + +#include +#include +#include +#include + +#include "rseq.h" + +#include "../kselftest_harness.h" + +FIXTURE(legacy) +{ +}; + +static int cpu_id_in_sigfn = -1; + +static void sigfn(int sig) +{ + struct rseq_abi *rs = rseq_get_abi(); + + cpu_id_in_sigfn = rs->cpu_id_start; +} + +FIXTURE_SETUP(legacy) +{ + int res = __rseq_register_current_thread(true, true); + + switch (res) { + case -ENOSYS: + SKIP(return, "RSEQ not enabled\n"); + case -EBUSY: + SKIP(return, "GLIBC owns RSEQ. Disable GLIBC RSEQ registration\n"); + default: + ASSERT_EQ(res, 0); + } + + ASSERT_NE(signal(SIGUSR1, sigfn), SIG_ERR); +} + +FIXTURE_TEARDOWN(legacy) +{ +} + +TEST_F(legacy, legacy_test) +{ + struct rseq_abi *rs = rseq_get_abi(); + + ASSERT_NE(rs, NULL); + + /* Overwrite rs::cpu_id_start */ + rs->cpu_id_start = -1; + sleep(1); + ASSERT_NE(rs->cpu_id_start, -1); + + rs->cpu_id_start = -1; + ASSERT_EQ(raise(SIGUSR1), 0); + ASSERT_NE(rs->cpu_id_start, -1); + ASSERT_NE(cpu_id_in_sigfn, -1); +} + +TEST_HARNESS_MAIN diff --git a/tools/testing/selftests/rseq/param_test.c b/tools/testing/selftests/rseq/param_test.c index 05d03e679e0608..e1e98dbabe4bcf 100644 --- a/tools/testing/selftests/rseq/param_test.c +++ b/tools/testing/selftests/rseq/param_test.c @@ -38,7 +38,7 @@ static int opt_modulo, verbose; static int opt_yield, opt_signal, opt_sleep, opt_disable_rseq, opt_threads = 200, opt_disable_mod = 0, opt_test = 's'; - +static bool opt_rseq_legacy; static long long opt_reps = 5000; static __thread __attribute__((tls_model("initial-exec"))) @@ -281,9 +281,12 @@ unsigned int yield_mod_cnt, nr_abort; } \ } +#define rseq_no_glibc true + #else #define printf_verbose(fmt, ...) +#define rseq_no_glibc false #endif /* BENCHMARK */ @@ -481,7 +484,7 @@ void *test_percpu_spinlock_thread(void *arg) long long i, reps; if (!opt_disable_rseq && thread_data->reg && - rseq_register_current_thread()) + __rseq_register_current_thread(rseq_no_glibc, opt_rseq_legacy)) abort(); reps = thread_data->reps; for (i = 0; i < reps; i++) { @@ -558,7 +561,7 @@ void *test_percpu_inc_thread(void *arg) long long i, reps; if (!opt_disable_rseq && thread_data->reg && - rseq_register_current_thread()) + __rseq_register_current_thread(rseq_no_glibc, opt_rseq_legacy)) abort(); reps = thread_data->reps; for (i = 0; i < reps; i++) { @@ -712,7 +715,7 @@ void *test_percpu_list_thread(void *arg) long long i, reps; struct percpu_list *list = (struct percpu_list *)arg; - if (!opt_disable_rseq && rseq_register_current_thread()) + if (!opt_disable_rseq && __rseq_register_current_thread(rseq_no_glibc, opt_rseq_legacy)) abort(); reps = opt_reps; @@ -895,7 +898,7 @@ void *test_percpu_buffer_thread(void *arg) long long i, reps; struct percpu_buffer *buffer = (struct percpu_buffer *)arg; - if (!opt_disable_rseq && rseq_register_current_thread()) + if (!opt_disable_rseq && __rseq_register_current_thread(rseq_no_glibc, opt_rseq_legacy)) abort(); reps = opt_reps; @@ -1105,7 +1108,7 @@ void *test_percpu_memcpy_buffer_thread(void *arg) long long i, reps; struct percpu_memcpy_buffer *buffer = (struct percpu_memcpy_buffer *)arg; - if (!opt_disable_rseq && rseq_register_current_thread()) + if (!opt_disable_rseq && __rseq_register_current_thread(rseq_no_glibc, opt_rseq_legacy)) abort(); reps = opt_reps; @@ -1258,7 +1261,7 @@ void *test_membarrier_worker_thread(void *arg) const int iters = opt_reps; int i; - if (rseq_register_current_thread()) { + if (__rseq_register_current_thread(rseq_no_glibc, opt_rseq_legacy)) { fprintf(stderr, "Error: rseq_register_current_thread(...) failed(%d): %s\n", errno, strerror(errno)); abort(); @@ -1323,7 +1326,7 @@ void *test_membarrier_manager_thread(void *arg) intptr_t expect_a = 0, expect_b = 0; int cpu_a = 0, cpu_b = 0; - if (rseq_register_current_thread()) { + if (__rseq_register_current_thread(rseq_no_glibc, opt_rseq_legacy)) { fprintf(stderr, "Error: rseq_register_current_thread(...) failed(%d): %s\n", errno, strerror(errno)); abort(); @@ -1475,6 +1478,7 @@ static void show_usage(int argc, char **argv) printf(" [-D M] Disable rseq for each M threads\n"); printf(" [-T test] Choose test: (s)pinlock, (l)ist, (b)uffer, (m)emcpy, (i)ncrement, membarrie(r)\n"); printf(" [-M] Push into buffer and memcpy buffer with memory barriers.\n"); + printf(" [-O] Test with optimized RSEQ\n"); printf(" [-v] Verbose output.\n"); printf(" [-h] Show this help.\n"); printf("\n"); @@ -1602,6 +1606,9 @@ int main(int argc, char **argv) case 'M': opt_mo = RSEQ_MO_RELEASE; break; + case 'L': + opt_rseq_legacy = true; + break; default: show_usage(argc, argv); goto error; @@ -1618,7 +1625,7 @@ int main(int argc, char **argv) if (set_signal_handler()) goto error; - if (!opt_disable_rseq && rseq_register_current_thread()) + if (!opt_disable_rseq && __rseq_register_current_thread(rseq_no_glibc, opt_rseq_legacy)) goto error; if (!opt_disable_rseq && !rseq_validate_cpu_id()) { fprintf(stderr, "Error: cpu id getter unavailable\n"); diff --git a/tools/testing/selftests/rseq/rseq-abi.h b/tools/testing/selftests/rseq/rseq-abi.h index ecef315204b271..5f4ea2152c2fd8 100644 --- a/tools/testing/selftests/rseq/rseq-abi.h +++ b/tools/testing/selftests/rseq/rseq-abi.h @@ -191,10 +191,15 @@ struct rseq_abi { */ struct rseq_abi_slice_ctrl slice_ctrl; + /* + * Place holder to push the size above 32 bytes. + */ + __u8 __reserved; + /* * Flexible array member at end of structure, after last feature field. */ char end[]; -} __attribute__((aligned(4 * sizeof(__u64)))); +} __attribute__((aligned(256))); #endif /* _RSEQ_ABI_H */ diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c index a736727b83c1eb..be0d0a97031ef2 100644 --- a/tools/testing/selftests/rseq/rseq.c +++ b/tools/testing/selftests/rseq/rseq.c @@ -56,6 +56,7 @@ ptrdiff_t rseq_offset; * unsuccessful. */ unsigned int rseq_size = -1U; +static unsigned int rseq_alloc_size; /* Flags used during rseq registration. */ unsigned int rseq_flags; @@ -115,29 +116,17 @@ bool rseq_available(void) } } -/* The rseq areas need to be at least 32 bytes. */ -static -unsigned int get_rseq_min_alloc_size(void) -{ - unsigned int alloc_size = rseq_size; - - if (alloc_size < ORIG_RSEQ_ALLOC_SIZE) - alloc_size = ORIG_RSEQ_ALLOC_SIZE; - return alloc_size; -} - /* * Return the feature size supported by the kernel. * * Depending on the value returned by getauxval(AT_RSEQ_FEATURE_SIZE): * - * 0: Return ORIG_RSEQ_FEATURE_SIZE (20) + * 0: Return ORIG_RSEQ_FEATURE_SIZE (20) * > 0: Return the value from getauxval(AT_RSEQ_FEATURE_SIZE). * * It should never return a value below ORIG_RSEQ_FEATURE_SIZE. */ -static -unsigned int get_rseq_kernel_feature_size(void) +static unsigned int get_rseq_kernel_feature_size(void) { unsigned long auxv_rseq_feature_size, auxv_rseq_align; @@ -152,15 +141,24 @@ unsigned int get_rseq_kernel_feature_size(void) return ORIG_RSEQ_FEATURE_SIZE; } -int rseq_register_current_thread(void) +int __rseq_register_current_thread(bool nolibc, bool legacy) { + unsigned int size; int rc; if (!rseq_ownership) { /* Treat libc's ownership as a successful registration. */ - return 0; + return nolibc ? -EBUSY : 0; } - rc = sys_rseq(&__rseq.abi, get_rseq_min_alloc_size(), 0, RSEQ_SIG); + + /* The minimal allocation size is 32, which is the legacy allocation size */ + size = get_rseq_kernel_feature_size(); + if (legacy || size < ORIG_RSEQ_ALLOC_SIZE) + rseq_alloc_size = ORIG_RSEQ_ALLOC_SIZE; + else + rseq_alloc_size = size; + + rc = sys_rseq(&__rseq.abi, rseq_alloc_size, 0, RSEQ_SIG); if (rc) { /* * After at least one thread has registered successfully @@ -179,9 +177,8 @@ int rseq_register_current_thread(void) * The first thread to register sets the rseq_size to mimic the libc * behavior. */ - if (RSEQ_READ_ONCE(rseq_size) == 0) { - RSEQ_WRITE_ONCE(rseq_size, get_rseq_kernel_feature_size()); - } + if (RSEQ_READ_ONCE(rseq_size) == 0) + RSEQ_WRITE_ONCE(rseq_size, size); return 0; } @@ -194,7 +191,7 @@ int rseq_unregister_current_thread(void) /* Treat libc's ownership as a successful unregistration. */ return 0; } - rc = sys_rseq(&__rseq.abi, get_rseq_min_alloc_size(), RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG); + rc = sys_rseq(&__rseq.abi, rseq_alloc_size, RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG); if (rc) return -1; return 0; diff --git a/tools/testing/selftests/rseq/rseq.h b/tools/testing/selftests/rseq/rseq.h index f51a5fdb044431..c62ebb9290c010 100644 --- a/tools/testing/selftests/rseq/rseq.h +++ b/tools/testing/selftests/rseq/rseq.h @@ -8,6 +8,7 @@ #ifndef RSEQ_H #define RSEQ_H +#include #include #include #include @@ -142,7 +143,12 @@ static inline struct rseq_abi *rseq_get_abi(void) * succeed. A restartable sequence executed from a non-registered * thread will always fail. */ -int rseq_register_current_thread(void); +int __rseq_register_current_thread(bool nolibc, bool legacy); + +static inline int rseq_register_current_thread(void) +{ + return __rseq_register_current_thread(false, false); +} /* * Unregister rseq for current thread. diff --git a/tools/testing/selftests/rseq/run_legacy_check.sh b/tools/testing/selftests/rseq/run_legacy_check.sh new file mode 100755 index 00000000000000..5577b46ea09272 --- /dev/null +++ b/tools/testing/selftests/rseq/run_legacy_check.sh @@ -0,0 +1,4 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +GLIBC_TUNABLES="${GLIBC_TUNABLES:-}:glibc.pthread.rseq=0" ./legacy_check diff --git a/tools/testing/selftests/rseq/run_param_test.sh b/tools/testing/selftests/rseq/run_param_test.sh index 8d31426ab41f22..69a3fa049929f3 100755 --- a/tools/testing/selftests/rseq/run_param_test.sh +++ b/tools/testing/selftests/rseq/run_param_test.sh @@ -34,6 +34,11 @@ REPS=1000 SLOW_REPS=100 NR_THREADS=$((6*${NR_CPUS})) +# Prevent GLIBC from registering RSEQ so the selftest can run in legacy and +# performance optimized mode. +GLIBC_TUNABLES="${GLIBC_TUNABLES:-}:glibc.pthread.rseq=0" +export GLIBC_TUNABLES + function do_tests() { local i=0 @@ -103,6 +108,40 @@ function inject_blocking() NR_LOOPS= } +echo "Testing in legacy RSEQ mode" +echo "Yield injection (25%)" +inject_blocking -m 4 -y -L + +echo "Yield injection (50%)" +inject_blocking -m 2 -y -L + +echo "Yield injection (100%)" +inject_blocking -m 1 -y -L + +echo "Kill injection (25%)" +inject_blocking -m 4 -k -L + +echo "Kill injection (50%)" +inject_blocking -m 2 -k -L + +echo "Kill injection (100%)" +inject_blocking -m 1 -k -L + +echo "Sleep injection (1ms, 25%)" +inject_blocking -m 4 -s 1 -L + +echo "Sleep injection (1ms, 50%)" +inject_blocking -m 2 -s 1 -L + +echo "Sleep injection (1ms, 100%)" +inject_blocking -m 1 -s 1 -L + +./check_optimized || { + echo "Skipping optimized RSEQ mode test. Not supported"; + exit 0 +} + +echo "Testing in optimized RSEQ mode" echo "Yield injection (25%)" inject_blocking -m 4 -y diff --git a/tools/testing/selftests/rseq/run_timeslice_test.sh b/tools/testing/selftests/rseq/run_timeslice_test.sh new file mode 100755 index 00000000000000..551ebed71ec61c --- /dev/null +++ b/tools/testing/selftests/rseq/run_timeslice_test.sh @@ -0,0 +1,14 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0+ + +# Prevent GLIBC from registering RSEQ so the selftest can run in legacy +# and performance optimized mode. +GLIBC_TUNABLES="${GLIBC_TUNABLES:-}:glibc.pthread.rseq=0" +export GLIBC_TUNABLES + +./check_optimized || { + echo "Skipping optimized RSEQ mode test. Not supported"; + exit 0 +} + +./slice_test diff --git a/tools/testing/selftests/rseq/slice_test.c b/tools/testing/selftests/rseq/slice_test.c index 357122dcb48703..e402d4440bc27f 100644 --- a/tools/testing/selftests/rseq/slice_test.c +++ b/tools/testing/selftests/rseq/slice_test.c @@ -124,6 +124,13 @@ FIXTURE_SETUP(slice_ext) { cpu_set_t affinity; + if (__rseq_register_current_thread(true, false)) + SKIP(return, "RSEQ not supported\n"); + + if (prctl(PR_RSEQ_SLICE_EXTENSION, PR_RSEQ_SLICE_EXTENSION_SET, + PR_RSEQ_SLICE_EXT_ENABLE, 0, 0)) + SKIP(return, "Time slice extension not supported\n"); + ASSERT_EQ(sched_getaffinity(0, sizeof(affinity), &affinity), 0); /* Pin it on a single CPU. Avoid CPU 0 */ @@ -137,11 +144,6 @@ FIXTURE_SETUP(slice_ext) break; } - ASSERT_EQ(rseq_register_current_thread(), 0); - - ASSERT_EQ(prctl(PR_RSEQ_SLICE_EXTENSION, PR_RSEQ_SLICE_EXTENSION_SET, - PR_RSEQ_SLICE_EXT_ENABLE, 0, 0), 0); - self->noise_params.noise_nsecs = variant->noise_nsecs; self->noise_params.sleep_nsecs = variant->sleep_nsecs; self->noise_params.run = 1; diff --git a/tools/testing/selftests/sched_ext/dequeue.c b/tools/testing/selftests/sched_ext/dequeue.c index 4e93262703ca84..383d06e972a463 100644 --- a/tools/testing/selftests/sched_ext/dequeue.c +++ b/tools/testing/selftests/sched_ext/dequeue.c @@ -33,6 +33,7 @@ static void worker_fn(int id) /* Do some work to trigger scheduling events */ for (j = 0; j < 10000; j++) sum += j; + asm volatile("" : : "r"(sum)); /* Sleep to trigger dequeue */ usleep(1000 + (id * 100)); diff --git a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json index eefadd0546d3b6..848696c373fca1 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json +++ b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json @@ -1136,5 +1136,194 @@ "teardown": [ "$TC qdisc del dev $DUMMY handle 1: root" ] + }, + { + "id": "7a5f", + "name": "Force red to dequeue from its child's gso_skb with qfq leaf", + "category": [ + "qdisc", + "tbf", + "red", + "qfq" + ], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.11.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY root handle 1: tbf rate 88bit burst 1661b peakrate 2257333 minburst 1024 limit 7b", + "$TC qdisc add dev $DUMMY parent 1: handle 2: red limit 757 min 16 max 24 avpkt 16", + "$TC qdisc add dev $DUMMY parent 2: handle 3: qfq", + "$TC class add dev $DUMMY classid 3:1 parent 3: qfq maxpkt 512 weight 1", + "$TC filter add dev $DUMMY parent 3: protocol ip prio 1 matchall classid 3:1 action ok" + ], + "cmdUnderTest": "ping -c 1 10.10.10.1 -W0.01 -I$DUMMY || true", + "expExitCode": "0", + "verifyCmd": "$TC -s -j qdisc ls dev $DUMMY parent 1:", + "matchJSON": [ + { + "kind": "red", + "handle": "2:", + "bytes": 98, + "packets": 1, + "backlog": 0, + "qlen": 0 + } + ], + "teardown": [ + "$TC qdisc del dev $DUMMY handle 1: root" + ] + }, + { + "id": "cdae", + "name": "Force sfb to dequeue from its child's gso_skb with qfq leaf", + "category": [ + "qdisc", + "tbf", + "sfb", + "qfq" + ], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.11.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY root handle 1: tbf rate 88bit burst 1661b peakrate 2257333 minburst 1024 limit 7b", + "$TC qdisc add dev $DUMMY parent 1: handle 2: sfb", + "$TC qdisc add dev $DUMMY parent 2: handle 3: qfq", + "$TC class add dev $DUMMY classid 3:1 parent 3: qfq maxpkt 512 weight 1", + "$TC filter add dev $DUMMY parent 3: protocol ip prio 1 matchall classid 3:1 action ok" + ], + "cmdUnderTest": "ping -c 1 10.10.10.1 -W0.01 -I$DUMMY || true", + "expExitCode": "0", + "verifyCmd": "$TC -s -j qdisc ls dev $DUMMY parent 1:", + "matchJSON": [ + { + "kind": "sfb", + "handle": "2:", + "bytes": 98, + "packets": 1, + "backlog": 0, + "qlen": 0 + } + ], + "teardown": [ + "$TC qdisc del dev $DUMMY handle 1: root" + ] + }, + { + "id": "291d", + "name": "Force red to dequeue from its child's gso_skb with dualpi2 leaf", + "category": [ + "qdisc", + "tbf", + "red", + "dualpi2" + ], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.11.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY root handle 1: tbf rate 88bit burst 1661b peakrate 2257333 minburst 1024 limit 7b", + "$TC qdisc add dev $DUMMY parent 1: handle 2: red limit 757 min 16 max 24 avpkt 16", + "$TC qdisc add dev $DUMMY parent 2: handle 3: dualpi2" + ], + "cmdUnderTest": "ping -c 1 10.10.10.1 -W0.01 -I$DUMMY || true", + "expExitCode": "0", + "verifyCmd": "$TC -s -j qdisc ls dev $DUMMY parent 1:", + "matchJSON": [ + { + "kind": "red", + "handle": "2:", + "bytes": 98, + "packets": 1, + "backlog": 0, + "qlen": 0 + } + ], + "teardown": [ + "$TC qdisc del dev $DUMMY handle 1: root" + ] + }, + { + "id": "9c6d", + "name": "Force sfb to dequeue from its child's gso_skb with dualpi2 leaf", + "category": [ + "qdisc", + "tbf", + "sfb", + "dualpi2" + ], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.11.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY root handle 1: tbf rate 88bit burst 1661b peakrate 2257333 minburst 1024 limit 7b", + "$TC qdisc add dev $DUMMY parent 1: handle 2: sfb", + "$TC qdisc add dev $DUMMY parent 2: handle 3: dualpi2" + ], + "cmdUnderTest": "ping -c 1 10.10.10.1 -W0.01 -I$DUMMY || true", + "expExitCode": "0", + "verifyCmd": "$TC -s -j qdisc ls dev $DUMMY parent 1:", + "matchJSON": [ + { + "kind": "sfb", + "handle": "2:", + "bytes": 98, + "packets": 1, + "backlog": 0, + "qlen": 0 + } + ], + "teardown": [ + "$TC qdisc del dev $DUMMY handle 1: root" + ] + }, + { + "id": "3a62", + "name": "Try to create a qlen underflow with QFQ/CBS", + "category": [ + "qdisc", + "qfq", + "cbs" + ], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$IP link set dev $DUMMY up || true", + "$IP addr add 10.10.10.10/24 dev $DUMMY || true", + "$TC qdisc add dev $DUMMY root handle 1: qfq", + "$TC class add dev $DUMMY classid 1:1 parent 1: qfq", + "$TC class add dev $DUMMY classid 1:2 parent 1: qfq", + "$TC qdisc add dev $DUMMY handle 2: parent 1:1 cbs", + "$TC qdisc add dev $DUMMY handle 3: parent 2: netem delay 5000000000", + "$TC filter add dev $DUMMY parent 1: prio 1 u32 match ip dst 10.10.10.1 classid 1:1 action ok", + "$TC filter add dev $DUMMY parent 1: prio 2 u32 match ip dst 10.10.10.2 classid 1:2 action ok", + "ping -c 1 10.10.10.1 -W0.01 -I$DUMMY || true", + "$IP l set $DUMMY down", + "$IP l set $DUMMY up", + "$TC qdisc replace dev $DUMMY handle 4: parent 2: pfifo" + ], + "cmdUnderTest": "ping -c 1 10.10.10.2 -W0.01 -I$DUMMY", + "expExitCode": "1", + "verifyCmd": "$TC -s -j qdisc ls dev $DUMMY parent 1:1", + "matchJSON": [ + { + "kind": "cbs", + "handle": "2:", + "bytes": 0, + "packets": 0 + } + ], + "teardown": [ + "$TC qdisc del dev $DUMMY handle 1: root" + ] } ] diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c index fbd9b1e7342a35..0b23c09daea5a2 100644 --- a/tools/testing/selftests/ublk/kublk.c +++ b/tools/testing/selftests/ublk/kublk.c @@ -1735,6 +1735,17 @@ static int __cmd_dev_add(const struct dev_ctx *ctx) goto fail; } + /* + * The kernel may reduce nr_hw_queues (e.g. capped to nr_cpu_ids). + * Cap nthreads to the actual queue count to avoid creating extra + * handler threads that will hang during device removal. + * + * per_io_tasks mode is excluded: threads interleave across all + * queues so nthreads > nr_hw_queues is valid and intentional. + */ + if (!ctx->per_io_tasks && dev->nthreads > info->nr_hw_queues) + dev->nthreads = info->nr_hw_queues; + ret = ublk_start_daemon(ctx, dev); ublk_dbg(UBLK_DBG_DEV, "%s: daemon exit %d\n", __func__, ret); if (ret < 0) diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c index 02bc6b00d76cbd..572b854edf740d 100644 --- a/virt/kvm/dirty_ring.c +++ b/virt/kvm/dirty_ring.c @@ -63,7 +63,8 @@ static void kvm_reset_dirty_gfn(struct kvm *kvm, u32 slot, u64 offset, u64 mask) memslot = id_to_memslot(__kvm_memslots(kvm, as_id), id); - if (!memslot || (offset + __fls(mask)) >= memslot->npages) + if (!memslot || offset >= memslot->npages || + offset + __fls(mask) >= memslot->npages) return; KVM_MMU_LOCK(kvm);