12 changes: 12 additions & 0 deletions docs/history/decisions.md
Expand Up @@ -803,3 +803,15 @@ The docs overhaul first tried to *shrink* the per-module `.md` files, then inver
- **A generator that's *present but failing* must fail loud, not return empty.** `gen_api.generate()` first returned `{}` on any error — so a transient `npx` registry hiccup would ship a docs site with **zero** API pages and no red X. Split the two cases: toolchain *absent* → graceful `{}` (a contributor without Doxygen still builds the rest); toolchain *present but failing* (or under-producing vs a floor) → raise. Silent degradation in a generated artifact is worse than a hard failure, because nobody notices until a reader hits a 404.
- **Enrichment by parallel agents needs an adversarial read, not just a build.** Worker agents enriched 19 headers; all compiled and generated. But the pre-merge reviewer + CodeRabbit still found real content bugs a build can't catch: a future-tense roadmap sentence (violates present-tense), a core-affinity claim describing plumbing that doesn't exist (inherited verbatim from the old `.md` — the inaccuracy pre-dated the enrichment), and an intentional default-value change (`TextEffect hue 0→128`) riding in unremarked and untested. "Compiles + generates" is table stakes; prose accuracy needs a human/reviewer pass, and an *intentional* behaviour change (even a good one) needs a test pinning it.
- **A migration is only net-subtractive once the old thing is deleted — stage the deletion, but don't call it done early.** Moving the old `.md` to `archive/` (instead of deleting) kept them for cross-check but left the branch net-positive and shipped a temporary migration banner on every generated page. That's fine as an explicit *stage*, but the banner and archive are debt with a name and a removal plan, not a resting state. Mark such scaffolding "temporary / removed at Stage N" at every site so the cleanup is mechanical.

## ESP32-S31 RGMII Ethernet bring-up: four bugs between "code compiles" and "link up"

Bringing up the S31's on-chip 1 Gb RGMII Ethernet took four fixes, and none of them was the C++ — each was a layer *below* the feature code that a build can't catch. The general lesson: **for a hardware bring-up, "it compiles" tells you almost nothing; the truth is only in the boot log on the actual board.**

- **A `MM_NO_ETH`-style capability gate keyed on a *filename* silently omits a new board.** `build_esp32.py` decided "does this firmware have Ethernet?" by pattern-matching the sdkconfig fragment *filename* for `.eth`. The S31 enables its EMAC in `sdkconfig.defaults.esp32s31` (no `.eth` in the name), so the gate said "no Ethernet" → compiled the `ethInit()` stub → the board booted with **zero** eth log and fell back to WiFi. The symptom (silent WiFi fallback) was maximally far from the cause (a filename heuristic in a build script). Rule: a gate that asks "does X enable feature Y?" must read what X *contains* (`CONFIG_ETH_USE_*=y`), never what X is *named*. A name falls into the heuristic's blind spot the moment a new case doesn't follow the naming convention.

- **A schematic/reference pin table is a hypothesis until the boot log confirms it — and ours was systematically off by one.** The reference-doc pin table (hand-transcribed from the schematic) had MDC/MDIO/TXD all one GPIO low. Wrong data pins failed loudly (`invalid TX_CTL GPIO number`), but wrong *MDIO* failed as "No PHY device detected" — three inference-steps from the typo. The chip's own IDF IO_MUX table (`esp32s31/emac_periph.c`) is the authority for RGMII data pins (they're fixed pads, not arbitrary GPIOs); trust it over any transcribed table, and verify the SMI pins against IDF's `ETH_ESP32_EMAC_DEFAULT_CONFIG` for that chip. Use PHY-addr auto-detect (`-1`) so a wrong strap assumption can't mask a working bus.

- **A shared clock is a contended resource — the RGMII 125 MHz Tx clock can't come from a PLL that's already spoken for.** The default (AUTO) sourced the Tx clock from the MPLL, but PSRAM already ran the MPLL at 400 MHz (no integer path to 125), and CPLL couldn't synthesise 125 MHz on the 40 MHz XTAL grid either. Only the *fractional* APLL (built for exact frequencies) works. The lesson beyond Ethernet: on an SoC where one PLL feeds multiple peripherals, "pick a clock source" is a *conflict-resolution* decision, not a default — check what already owns each PLL before claiming one.

- **Changing a Kconfig *choice* in a defaults fragment needs a clean build.** Editing `CONFIG_ETH_EMAC_RGMII_TX_CLK_SRC_*` in the fragment and doing an incremental build silently kept the *old* choice (the build dir's `sdkconfig` already had a value; defaults don't override an existing one). Two flash cycles spent "testing APLL" were actually still running CPLL. When an sdkconfig *choice* (not a plain `=y`) changes, `rm -rf` the build dir or the test is a lie.
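
The first rule in this list (read what the fragment *contains*, not what it is *named*) could look roughly like the sketch below. This is not the actual `build_esp32.py` code: the function name `fragment_enables_ethernet` and both sample fragments are hypothetical, and it assumes the usual one-symbol-per-line `CONFIG_NAME=y` sdkconfig format.

```python
import re

# Matches an enabled Ethernet driver symbol such as CONFIG_ETH_USE_ESP32_EMAC=y.
# Comment lines and "# CONFIG_... is not set" lines do not match, because
# re.match anchors the pattern at the start of the stripped line.
_ETH_ENABLED = re.compile(r"^CONFIG_ETH_USE_\w+=y$")


def fragment_enables_ethernet(fragment_text: str) -> bool:
    """Return True if any line of an sdkconfig fragment enables an Ethernet driver."""
    return any(_ETH_ENABLED.match(line.strip()) for line in fragment_text.splitlines())


# Hypothetical fragment: nothing in a file name would say "eth",
# but the contents enable the on-chip EMAC.
s31_fragment = """\
CONFIG_SPIRAM=y
# Ethernet
CONFIG_ETH_USE_ESP32_EMAC=y
CONFIG_ETH_DMA_BUFFER_SIZE=512
"""

# Hypothetical fragment with the driver explicitly disabled.
wifi_only_fragment = """\
CONFIG_SPIRAM=y
# CONFIG_ETH_USE_ESP32_EMAC is not set
"""

print(fragment_enables_ethernet(s31_fragment))        # True
print(fragment_enables_ethernet(wifi_only_fragment))  # False
```

Because the check only looks at contents, a new board whose fragment breaks the naming convention is still classified correctly.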
115 changes: 115 additions & 0 deletions docs/history/plans/Plan-20260703 - S31 RGMII Ethernet.md
@@ -0,0 +1,115 @@
# Plan — ESP32-S31 RGMII Ethernet (1 Gb) with Ethernet-preferred cascade

## Context

The bench ESP32-S31 board (Espressif Function-CoreBoard-1) has an on-chip 1 Gb EMAC wired
through an **RGMII** interface to a **YT8531** PHY → RJ45. The product owner connected an
Ethernet cable to it, but the port is unused so far: the S31 firmware only brings up WiFi. The
goal is Ethernet-preferred networking (use Ethernet when the cable is up at boot, fall back to
WiFi otherwise), matching how the classic ESP32 (Olimex) and P4 boards already behave.

**Why S31-only:** among projectMM's targets, the S31 is the *only* chip whose EMAC advertises
`SOC_EMAC_SUPPORT_1000M` + RGMII. The classic ESP32 and P4 EMACs are RMII (100 Mb); S2/S3/C3/C6
have no EMAC at all (Ethernet only via an external W5500 SPI chip). RGMII/1000M is intrinsic to
the SoC — no extension board can add it to a non-S31 — so the RGMII path is S31-only by nature,
not a selectable per-board option. (Product owner confirmed: "otherwise S31 only".)

**Failover is already built:** `NetworkModule` runs `ethInit()` first and only starts WiFi if it
returns false (no PHY/cable). So "use Ethernet when available, WiFi otherwise" needs **no new
failover code** — only a new RGMII init path that the existing cascade calls. (Product owner
confirmed: Ethernet-preferred cascade at boot, not live hot-swap.)
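
As a rough model of that boot-time cascade (the firmware itself is C++; `bring_up_network` and its two callbacks are illustrative stand-ins, not the real `NetworkModule` API):

```python
from typing import Callable, List


def bring_up_network(eth_init: Callable[[], bool], wifi_start: Callable[[], None]) -> str:
    """Boot-time cascade: try Ethernet first, start WiFi only if Ethernet init fails."""
    if eth_init():
        return "ethernet"
    wifi_start()
    return "wifi"


wifi_calls: List[str] = []

# Cable plugged in: Ethernet init succeeds, WiFi is never started.
print(bring_up_network(lambda: True, lambda: wifi_calls.append("start")))   # ethernet
# No PHY or cable: Ethernet init reports failure, WiFi takes over.
print(bring_up_network(lambda: False, lambda: wifi_calls.append("start")))  # wifi
print(wifi_calls)  # ['start']
```

The decision is taken once per boot, which matches the chosen model (no live hot-swap): a new interface type only has to plug into the `eth_init` side.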

## Design (mirrors the existing RMII path, adds an RGMII sibling)

The Ethernet layer already dispatches on `ethConfig_.phyType`: `ethInitRmii()` (on-chip EMAC,
RMII) and `ethInitSpi()` (W5500). This adds a third sibling, `ethInitRgmii()`, selected by a new
`ethYt8531` phyType — same shape, same cascade, same on-chip-EMAC compile guard.

**RGMII data pins are hardwired, not runtime config.** Exactly like RMII (whose TX/RX data lines
live in the IDF EMAC macro, not `EthPinConfig` — see the comment at
[platform_config.h:196-198](src/platform/esp32/platform_config.h#L196)), the S31 CoreBoard's RGMII
data pins are fixed by the board schematic. So they go straight into `ethInitRgmii()` as literals
from the schematic — **no new `EthPinConfig` fields, no NetworkModule controls, no deviceModels.json
eth block.** This keeps the struct and the UI untouched; the whole feature is one board's wiring.

**Pins (from `docs/reference/esp32-s31-coreboard.md`, sourced from the official schematic):**
MDC 4, MDIO 5, PHY reset 6, PHY int 2; TX_CTL 11, TXD0-3 = 7/8/9/10; RX_CTL 15, RXD0-3 = 19/18/17/16;
clock_tx 13, clock_rx 14. PHY = YT8531 via `esp_eth_phy_new_generic` (IEEE-standard registers).

## Files to change

1. **`src/platform/esp32/platform_config.h`**
- Add `isEsp32S31` constexpr flag (keyed on `CONFIG_IDF_TARGET_ESP32S31`), following the
`isEsp32P4`/`isEsp32S3` pattern at [L33-46](src/platform/esp32/platform_config.h#L33).
- Add `ethYt8531 = 4` to the `EthPhyType` enum ([L175](src/platform/esp32/platform_config.h#L175)),
with a one-line "RGMII, YT8531 PHY, S31 on-chip 1 Gb EMAC" comment.
- Add an `isEsp32S31` branch to the `ethConfigDefault` ternary
([L216](src/platform/esp32/platform_config.h#L216)): `phyType ethYt8531`, `phyAddr` (from the
YT8531 strap — default 0, confirm on bench), `rstGpio 6`, MDC/MDIO 4/5. RGMII data + clock
pins are NOT struct fields (hardwired in `ethInitRgmii`); pass -1 for the unused RMII/SPI
fields. **No struct change.**

2. **`src/platform/esp32/platform_esp32.cpp`**
- Add `static bool ethInitRgmii()` mirroring `ethInitRmii()`
([L457-543](src/platform/esp32/platform_esp32.cpp#L457)) under the same
`#ifdef CONFIG_ETH_USE_ESP32_EMAC` guard. Differences from the RMII version:
- `emac_config.interface = EMAC_DATA_INTERFACE_RGMII`
- set `emac_config.clock_config.rgmii.clock_tx_gpio/clock_rx_gpio` (13/14) and the
`emac_config.emac_dataif_gpio.rgmii` struct (tx_ctl/txd0-3/rx_ctl/rxd0-3 = the schematic
pins) — the RGMII fields IDF exposes in `esp_eth_mac_esp.h`.
- PHY: `esp_eth_phy_new_generic(&phy_config)` (YT8531 is standard-register; same generic ctor
LAN8720 uses — no managed component needed).
- reuse the identical `fail()` cleanup lambda, driver-install, netif-attach, event-register,
non-blocking `esp_eth_start`, and the link-up hostname handling. Log "Ethernet init done
(RGMII, S31)".
- Add the dispatch case to `ethInit()`
([the switch, ~L661](src/platform/esp32/platform_esp32.cpp#L661)):
`#ifdef CONFIG_ETH_USE_ESP32_EMAC` → `case ethYt8531: return ethInitRgmii();` (alongside the
existing `ethLan8720`/`ethIp101` RMII cases).

3. **`esp32/sdkconfig.defaults.esp32s31`** — `CONFIG_ETH_USE_ESP32_EMAC=y` + DMA buffers are
already present ([L26-32](esp32/sdkconfig.defaults.esp32s31)). RGMII is selected at runtime via
the struct `interface` field (not a sdkconfig symbol), so likely **no change** — but verify at
configure time that no `CONFIG_ETH_*RGMII*`/1000M symbol is required; add it only if the build
demands it.

4. **`web-installer/deviceModels.json`** — move `"Ethernet"` from the S31's `planned` list to
`supported` (the S31 entry). No eth `NetworkModule` control block needed (pins are the
compile-time default). `check_devices.py` allows `Ethernet` in `supported` (it's in
`SUPPORTED_VOCAB`).

5. **`docs/reference/esp32-s31-coreboard.md`** — update the Ethernet section's "wiring the S31 eth
needs an RGMII branch — not a drop-in" note to present-tense "driven by `ethInitRgmii`", since
it now ships. (Small doc sync, per the present-tense rule.)

## Not doing (deliberately, keeps it minimal)

- No `EthPinConfig` struct fields for RGMII data pins (hardwired, like RMII).
- No NetworkModule UI controls, no `syncEthConfig` change (nothing new to sync).
- No deviceModels.json eth-pin block for the S31 (compile-time default covers it).
- No failover/route-switching code (the eth→WiFi cascade already exists).
- No new managed component (generic PHY driver covers YT8531).

## Verification

- **Build:** `esp32s31` builds clean on 6.1 with `-Werror` (the RGMII branch is behind the
already-set `CONFIG_ETH_USE_ESP32_EMAC`; other targets unaffected — the case is chip-guarded).
- **Non-regression:** classic/P4/S3 eth paths untouched (RMII/SPI code unchanged); a quick
`esp32p4-eth` + `esp32` build stays green. `check_devices.py` green with `Ethernet` under S31
`supported`. `ctest` + scenarios unaffected (platform-only change).
- **Bench (the real test), on the connected S31 (`/dev/cu.usbserial-20213420`), cable plugged in:**
1. Flash `esp32s31`, capture boot log → expect `Ethernet init done (RGMII, S31)` then an
**Ethernet DHCP lease** (an `MM_IP=` from the wired subnet, like the P4 eth test showed
`192.168.1.133`), and mDNS `MM-S31.local`.
2. Confirm the render loop still runs (FPS line present) and heap is healthy.
3. **Failover check:** unplug the cable, reboot → it should fall back to WiFi (the existing
cascade). Plug back in, reboot → Ethernet again. (Boot-time cascade, per the chosen model.)
- Save the approved plan to `docs/history/plans/Plan-20260703 - S31 RGMII Ethernet.md` as the first
implementation step.
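
The bench step can be partly automated by scanning a captured boot log for the two markers named above. The sketch below is an assumption-laden helper, not part of the repo: `check_boot_log` is a hypothetical name, the sample log lines are made up, and the `MM_IP=` value is assumed to be a dotted-quad IPv4 address.

```python
import re
from typing import Optional, TypedDict

EXPECTED_INIT = "Ethernet init done (RGMII, S31)"
# Assumed format: "MM_IP=" immediately followed by a dotted-quad IPv4 address.
IP_PATTERN = re.compile(r"MM_IP=(\d{1,3}(?:\.\d{1,3}){3})")


class BootLogResult(TypedDict):
    eth_init: bool
    ip: Optional[str]


def check_boot_log(log_text: str) -> BootLogResult:
    """Report whether the RGMII init line appeared and which IP (if any) was logged."""
    ip_match = IP_PATTERN.search(log_text)
    return {
        "eth_init": EXPECTED_INIT in log_text,
        "ip": ip_match.group(1) if ip_match else None,
    }


# Hypothetical capture; real log prefixes and tags will differ.
sample_log = """\
I (1203) eth: Ethernet init done (RGMII, S31)
I (3410) net: MM_IP=192.168.1.133
"""

print(check_boot_log(sample_log))
# {'eth_init': True, 'ip': '192.168.1.133'}
print(check_boot_log("I (900) wifi: connected"))
# {'eth_init': False, 'ip': None}
```

The script only confirms that the lines exist; whether the lease came from the wired subnet still has to be judged against the bench network.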

## Open items to confirm on the bench during implementation

- **YT8531 PHY address** — default strap is usually 0; if `esp_eth` can't find the PHY at addr 0,
scan/try 1 (the reference doc doesn't pin the strap). One-line fix in `ethConfigDefault`.
- **RGMII clock direction / delay** — YT8531 boards sometimes need RX/TX clock delay config; if the
link comes up but no packets flow, revisit the RGMII clock config. (Bench will show.)
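
For the PHY-address item, an address scan is a simple loop over the SMI address space. The sketch below models it against a simulated bus. It assumes IEEE 802.3 Clause 22 conventions (5-bit PHY address, PHY Identifier 1 in register 2, an absent PHY reading back as `0xFFFF` or `0x0000`); `scan_phy_address`, `fake_bus`, and the ID value are hypothetical and are not IDF calls.

```python
from typing import Callable, Optional

PHY_ID1_REG = 2       # Clause 22 register 2: PHY Identifier 1
MAX_PHY_ADDRS = 32    # 5-bit SMI address space: 0..31


def scan_phy_address(mdio_read: Callable[[int, int], int]) -> Optional[int]:
    """Return the first address whose ID register looks like a real PHY, else None."""
    for addr in range(MAX_PHY_ADDRS):
        phy_id = mdio_read(addr, PHY_ID1_REG)
        # All-ones (floating bus) or all-zeros means nothing answered here.
        if phy_id not in (0x0000, 0xFFFF):
            return addr
    return None


def fake_bus(present_addr: Optional[int], id1: int = 0x1234) -> Callable[[int, int], int]:
    """Simulated MDIO bus with at most one PHY; the ID value is made up."""
    def read(addr: int, reg: int) -> int:
        if addr == present_addr and reg == PHY_ID1_REG:
            return id1
        return 0xFFFF
    return read


print(scan_phy_address(fake_bus(0)))     # 0
print(scan_phy_address(fake_bus(1)))     # 1
print(scan_phy_address(fake_bus(None)))  # None
```

A scan that finds nothing at any address points at the bus itself (MDC/MDIO pins, reset, power) rather than at the strap assumption.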
4 changes: 2 additions & 2 deletions docs/moonmodules/core/archive/NetworkModule.md
Expand Up @@ -28,8 +28,8 @@ When a higher-priority connection becomes available, lower ones are torn down to
- When Static: `ip`, `gateway`, `subnet`, `dns` (ipv4 controls — 4 bytes of storage each, not 16-char strings; the wire shape is still a dotted-quad string). Shown dynamically via onBuildControls.
- `mDNS` (bool) — enable/disable mDNS responder

-**Ethernet PHY/pin controls** (only on builds with an Ethernet driver — `platform::hasEthernet`). The PHY *driver* is compiled into the firmware per chip (internal-EMAC RMII on classic/P4, W5500 SPI on the S3); these controls pick *which* PHY a board uses and *on which pins* — runtime config, set per board in [`deviceModels.json`](../../../../web-installer/deviceModels.json) (→ `setEthConfig` → `ethInit`), seeded from the per-chip default in `platform_config.h`. `ethType` is the switch: with it at 0 no pin rows show; choosing a type reveals only that type's pins (RMII rows for LAN8720/IP101, SPI rows for W5500). A W5500 change applies **live** (the SPI driver tears down + re-inits, no reboot); an RMII change saves and applies on the next boot (status hints "restart to apply"). See [architecture.md § Config provenance](../../../architecture.md#config-provenance-mcu-devicemodel).
-- `ethType` (select) — PHY type dropdown, options `None` / `LAN8720` / `IP101` / `W5500` (stored as the index 0..3, matching the `EthPhyType` enum: 0 = none, 1 = LAN8720 RMII, 2 = IP101 RMII, 3 = W5500 SPI).
+**Ethernet PHY/pin controls** (only on builds with an Ethernet driver — `platform::hasEthernet`). The PHY *driver* is compiled into the firmware per chip (internal-EMAC RMII on classic/P4, internal-EMAC RGMII on the S31, W5500 SPI on the S3); these controls pick *which* PHY a board uses and *on which pins* — runtime config, set per board in [`deviceModels.json`](../../../../web-installer/deviceModels.json) (→ `setEthConfig` → `ethInit`), seeded from the per-chip default in `platform_config.h`. `ethType` is the switch: with it at 0 no pin rows show; choosing a type reveals only that type's editable pins (RMII rows for LAN8720/IP101, SPI rows for W5500; the S31's YT8531 RGMII shows just PHY-address + reset, since its data/clock pins are the chip's fixed IO_MUX pads). A W5500 change applies **live** (the SPI driver tears down + re-inits, no reboot); an RMII/RGMII change saves and applies on the next boot (status hints "restart to apply"). See [architecture.md § Config provenance](../../../architecture.md#config-provenance-mcu-devicemodel).
+- `ethType` (select) — PHY type dropdown, options `None` / `LAN8720` / `IP101` / `W5500` / `YT8531` (stored as the index 0..4, matching the `EthPhyType` enum: 0 = none, 1 = LAN8720 RMII, 2 = IP101 RMII, 3 = W5500 SPI, 4 = YT8531 RGMII — the S31's on-chip 1 Gb EMAC).
- `ethPhyAddr` (pin) — SMI/PHY address (typically 0 or 1).
- `ethRstGpio` (pin) — PHY reset GPIO (−1 = none / module self-resets).
- `ethMdcGpio`, `ethMdioGpio` (pin) — RMII SMI clock / data GPIOs (−1 = IDF default). RMII only.
34 changes: 23 additions & 11 deletions docs/reference/esp32-s31-coreboard.md
Expand Up @@ -48,22 +48,34 @@ The onboard electret mic (J6) and speaker connect through an **ES8311 mono codec

On-chip EMAC → **YT8531** (Motorcomm) PHY (U8) → RJ45, **RGMII** with a 25 MHz crystal (Y2).

Pin map — **bench-verified** (link + DHCP confirmed on the CoreBoard). The RGMII data + clock
GPIOs are the chip's fixed IO_MUX pads (the only ones the EMAC accepts; from IDF's
`esp32s31/emac_periph.c`); MDC/MDIO are IDF's S31 SMI defaults (`ETH_ESP32_EMAC_DEFAULT_CONFIG`):

| Signal | GPIO | | Signal | GPIO |
|---|---|---|---|---|
-| ETH_INTN | 2 | | ETH_TXD3 | 10 |
-| PHY_MDC | 4 | | ETH_TX_CTL | 11 |
-| PHY_MDIO | 5 | | ETH_TXCLK | 13 |
-| ETH_PHY_RST | 6 | | ETH_RX_CLK | 14 |
-| ETH_TXD0 | 7 | | ETH_RX_CTL | 15 |
-| ETH_TXD1 | 8 | | ETH_RXD3 | 16 |
-| ETH_TXD2 | 9 | | ETH_RXD2 | 17 |
+| ETH_INTN | 2 | | ETH_TXD3 | 11 |
+| PHY_MDC | 5 | | ETH_TX_CTL | 12 |
+| PHY_MDIO | 6 | | ETH_TXCLK | 13 |
+| ETH_PHY_RST | 7 | | ETH_RX_CLK | 14 |
+| ETH_TXD0 | 8 | | ETH_RX_CTL | 15 |
+| ETH_TXD1 | 9 | | ETH_RXD3 | 16 |
+| ETH_TXD2 | 10 | | ETH_RXD2 | 17 |
| | | | ETH_RXD1 | 18 |
| | | | ETH_RXD0 | 19 |
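
The earlier hand-transcribed GPIO numbers and the bench-verified ones both appear in this table's diff, which makes the transcription error easy to quantify. The sketch below compares the two sets; the values are copied from the table rows, while the helper name `pin_deltas` is illustrative only.

```python
from typing import Dict

# Earlier, hand-transcribed values (the replaced table rows).
transcribed: Dict[str, int] = {
    "PHY_MDC": 4, "PHY_MDIO": 5, "ETH_PHY_RST": 6,
    "ETH_TXD0": 7, "ETH_TXD1": 8, "ETH_TXD2": 9, "ETH_TXD3": 10,
    "ETH_TX_CTL": 11,
}

# Bench-verified values (the current table rows).
verified: Dict[str, int] = {
    "PHY_MDC": 5, "PHY_MDIO": 6, "ETH_PHY_RST": 7,
    "ETH_TXD0": 8, "ETH_TXD1": 9, "ETH_TXD2": 10, "ETH_TXD3": 11,
    "ETH_TX_CTL": 12,
}


def pin_deltas(old: Dict[str, int], new: Dict[str, int]) -> Dict[str, int]:
    """Per-signal GPIO difference (new - old) for signals whose pin changed."""
    return {sig: new[sig] - old[sig] for sig in old if sig in new and new[sig] != old[sig]}


deltas = pin_deltas(transcribed, verified)
print(len(deltas))           # 8
print(set(deltas.values()))  # {1}
```

A single shared delta across every changed signal points at a systematic transcription slip rather than eight independent typos, which is why checking one pin against the chip's own tables is enough to raise suspicion about the rest.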

-> **RGMII, not RMII.** projectMM's classic/P4 Ethernet path (`ethInit` in
-> `src/platform/esp32/platform_esp32.cpp`) is RMII (fewer data lines, 50 MHz ref clock). The S31's
-> 1 Gbps EMAC is RGMII (4-bit data each way + TX/RX clocks). Wiring the S31 eth needs an RGMII MAC
-> config branch — it is not a drop-in of the RMII pin struct.
+> **RGMII, not RMII.** projectMM's classic/P4 Ethernet is RMII (fewer data lines, 50 MHz ref clock);
+> the S31's 1 Gbps EMAC is RGMII (4-bit data each way + TX/RX clocks). The shared `ethInitEmac()` in
+> `src/platform/esp32/platform_esp32.cpp` drives both: an `#ifdef CONFIG_IDF_TARGET_ESP32S31` block
+> selects the RGMII interface and sets the CoreBoard's data/clock pins from the table above, then
+> shares the driver-install / netif / DHCP tail with the RMII path. The board's default eth config
+> (`ethConfigDefault` in `platform_config.h`) uses PHY type `ethYt8531` (PHY address auto-detected),
+> driven by the generic IEEE-802.3 PHY ctor since the YT8531 is a standard-register PHY.
+>
+> **RGMII 125 MHz Tx clock = APLL.** The EMAC needs a clean 125 MHz Tx clock; the default MPLL is
+> owned by PSRAM (400 MHz, no integer path to 125) and CPLL can't hit it on the 40 MHz XTAL grid, so
+> `sdkconfig.defaults.esp32s31` sets `CONFIG_ETH_EMAC_RGMII_TX_CLK_SRC_APLL` — the fractional PLL
+> synthesises 125 MHz exactly.
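
The clock arithmetic behind that note can be checked in a few lines. The sketch below assumes a simplified model of an integer PLL (output = XTAL times an integer, then divided by an integer); the real clock tree has more constraints, and the helper name `exact_divider` is illustrative.

```python
from math import lcm
from typing import Optional

TARGET_MHZ = 125   # RGMII Tx clock at 1 Gb
XTAL_MHZ = 40      # crystal grid the integer PLLs sit on
MPLL_MHZ = 400     # MPLL rate already fixed by PSRAM


def exact_divider(source_mhz: int, target_mhz: int) -> Optional[int]:
    """Integer divider that turns source into target exactly, or None if there is none."""
    quotient, remainder = divmod(source_mhz, target_mhz)
    return quotient if remainder == 0 and quotient > 0 else None


# 400 / 125 = 3.2, so no integer divider exists.
print(exact_divider(MPLL_MHZ, TARGET_MHZ))  # None
# The nearest integer divider (3) lands near 133 MHz instead.
print(round(MPLL_MHZ / 3, 1))               # 133.3
# Smallest frequency on the 40 MHz grid that 125 MHz divides evenly.
print(lcm(XTAL_MHZ, TARGET_MHZ))            # 1000
print(exact_divider(1000, TARGET_MHZ))      # 8
```

Under this model an integer PLL on the 40 MHz grid would have to run at a multiple of 1000 MHz to reach 125 MHz exactly, which is consistent with the note that only the fractional APLL hits the target.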

## Other onboard features

16 changes: 13 additions & 3 deletions esp32/sdkconfig.defaults.esp32s31
Expand Up @@ -23,10 +23,20 @@ CONFIG_ESPTOOLPY_FLASHSIZE_16MB=y
# not relied on at boot.
CONFIG_SPIRAM=y

-# Ethernet — on-chip EMAC + RMII (SOC_EMAC_SUPPORTED, SOC_EMAC_SUPPORT_1000M). Pins
-# are runtime config (ethConfig_ in C), not sdkconfig. RMII is the EMAC's default
-# interface, so no CONFIG_ETH_PHY_INTERFACE_RMII (a non-existent symbol in IDF v6).
+# Ethernet — on-chip EMAC (SOC_EMAC_SUPPORTED, SOC_EMAC_SUPPORT_1000M → RGMII 1 Gb).
+# The interface (RGMII on the S31) and PHY pins (YT8531 map) are set in C — the RGMII
+# branch of ethInitEmac() in src/platform/esp32/platform_esp32.cpp, keyed off the
+# ETH_ESP32_EMAC_DEFAULT_CONFIG() macro which IDF fixes to RGMII on the S31 — not via a
+# sdkconfig symbol, so the pin map stays in one place.
CONFIG_ETH_USE_ESP32_EMAC=y

# RGMII Tx clock source = APLL. The 1 Gb RGMII EMAC needs a clean 125 MHz Tx clock. The
# default (AUTO) picks the MPLL, but PSRAM owns the MPLL at 400 MHz (SPIRAM_SPEED_200M) —
# 400 MHz has no integer path to 125 MHz ("unusable frequency 133 MHz"). CPLL can't hit it
# either ("No CPLL on 40 MHz grid divides 125 MHz"): both are integer PLLs off the 40 MHz
# XTAL. The APLL is a *fractional* PLL built for exact frequencies (its reason to exist), so
# it synthesises 125 MHz precisely. Bench-proven on the S31 CoreBoard.
CONFIG_ETH_EMAC_RGMII_TX_CLK_SRC_APLL=y
CONFIG_ETH_DMA_BUFFER_SIZE=512
CONFIG_ETH_DMA_RX_BUFFER_NUM=10
CONFIG_ETH_DMA_TX_BUFFER_NUM=10