Draft
94 commits
ab02e85
Replace Claude imports with symlinks
Nucs Apr 14, 2026
b1d1731
feat(NpyIter): Implement 8 NumPy parity fixes for NpyIter
Nucs Apr 15, 2026
8335532
refactor(NpyIter): Support unlimited dimensions (NumSharp divergence)
Nucs Apr 15, 2026
b71a5e8
feat(NpyIter): Add NpyAxisIter and logical reduction infrastructure
Nucs Apr 15, 2026
5c2d6fd
fix(tests): Convert TUnit [Test] to MSTest [TestMethod]
Nucs Apr 15, 2026
372a8e7
feat(NpyIter): Implement axis reordering before coalescing for full 1…
Nucs Apr 15, 2026
932a836
feat(NpyIter): Add NumPy parity features and comprehensive test coverage
Nucs Apr 15, 2026
9e6ebfd
feat(NpyIter): Implement full F-order and K-order support with MULTI_…
Nucs Apr 15, 2026
4534093
feat(NpyIter): Implement GotoIndex for flat C/F index jumping
Nucs Apr 15, 2026
3a383df
feat(NpyIter): Implement Copy() for independent iterator copies
Nucs Apr 15, 2026
a620349
docs(NpyIter): Update remaining features list
Nucs Apr 15, 2026
6b883b3
feat(NpyIter): Implement negative stride flipping for memory-order it…
Nucs Apr 15, 2026
f140b4b
feat(NpyIter): Implement GetIterView for operand view with iterator axes
Nucs Apr 15, 2026
d00df9e
feat(NpyIter): Implement cast support for type conversion during buff…
Nucs Apr 15, 2026
719d668
feat(NpyIter): Implement reduction support via op_axes
Nucs Apr 15, 2026
cfde429
feat(NpyIter): Improve reduction NumPy parity
Nucs Apr 16, 2026
f542426
feat(NpyIter): Implement buffered reduction double-loop with full Num…
Nucs Apr 16, 2026
8da97a2
fix(NpyIter): Fix buffered reduction for small buffers (bufferSize < …
Nucs Apr 16, 2026
0943e04
docs(NpyIter): Add comprehensive implementation audit
Nucs Apr 16, 2026
2ffc73a
feat(NpyIter): Implement unlimited operands (NumPy NPY_MAXARGS parity)
Nucs Apr 16, 2026
2f42caf
docs(NpyIter): Add deep audit with 4 comparison techniques
Nucs Apr 16, 2026
fc4790a
fix(tests): Mark NpyIter iteration order differences as [Misaligned]
Nucs Apr 16, 2026
12e3629
fix(NpyIter): Fix F-order iteration to match NumPy behavior
Nucs Apr 16, 2026
0d5c2ef
fix(NpyIter): Fix K-order iteration for broadcast and non-contiguous …
Nucs Apr 16, 2026
3d47d17
fix(NpyIter): Achieve 100% NumPy 2.4.2 parity - 7 bugs fixed via TDD
Nucs Apr 16, 2026
b823f81
feat(Shape): Minimal multi-order memory layout support (C/F/A/K)
Nucs Apr 19, 2026
0376003
refactor(Shape): Align contiguity computation with NumPy conventions
Nucs Apr 19, 2026
9457bc7
feat(NpyIter): Implement 10 missing NumPy APIs + battletest to 566 sc…
Nucs Apr 20, 2026
528a1da
test(order): Add TDD coverage for C/F/A/K order support across API su…
Nucs Apr 20, 2026
3b55e9e
test(order): Expand coverage to every np.* function accepting order
Nucs Apr 20, 2026
41d65f7
test(order): Add coverage for ops, statistics, manipulation, matmul
Nucs Apr 20, 2026
c10b4a6
test(order): Round 4 — unary math, division, in-place, NaN-aware, bro…
Nucs Apr 20, 2026
47b6400
fix(order): Wire F-order support through copy/conversion and _like/as…
Nucs Apr 20, 2026
2ba101f
feat(NpyIter): Three-tier custom-op API + expanded NpyExpr DSL
Nucs Apr 20, 2026
4b2f7a9
fix(order): Wire F-order support through flatten/ravel/reshape/eye (G…
Nucs Apr 20, 2026
50de6c9
fix(order): Add asarray/asanyarray/asfortranarray/ascontiguousarray +…
Nucs Apr 20, 2026
23806cd
fix(order): NDArray.argsort copies non-C-contig input to C-contig fir…
Nucs Apr 20, 2026
42381d5
docs(NDIter): Max-effort amend — gotchas, validation, 4 new bugs, 4 n…
Nucs Apr 20, 2026
39ef08c
fix(order): Post-hoc F-contig preservation across ILKernel dispatch +…
Nucs Apr 20, 2026
ee8c65b
perf(flatten): Drop redundant ArraySlice clone on F-order path
Nucs Apr 20, 2026
74a92e9
feat(NpyExpr): Add Call() for arbitrary delegate/MethodInfo invocation
Nucs Apr 20, 2026
53a506f
fix(order): Review cleanups — dim aliasing, modf Type overload, resha…
Nucs Apr 20, 2026
e7ec2fd
docs(NDIter): Promote Call to dedicated subsection + memory-model sec…
Nucs Apr 20, 2026
25b058a
docs(NDIter): Add 7-technique quick reference + decision tree at top
Nucs Apr 20, 2026
387c4e6
refactor(NpyIter): Rename Tier A/B/C to Tier 3A/3B/3C
Nucs Apr 20, 2026
c1f6e84
fix(NPTypeCode): Char SizeOf returned 1 (real=2); GetPriority Decimal…
Nucs Apr 20, 2026
3d1a529
feat(examples): 2-layer MLP on MNIST with single-NpyIter bias+ReLU fu…
Nucs Apr 20, 2026
b5ede36
test(order): Section 41 — Reductions keepdims=True on F-contig (17 te…
Nucs Apr 20, 2026
cfe2a77
test(order): Section 42 — np.sort API gap (1 test, 1 [OpenBugs])
Nucs Apr 20, 2026
f90fe45
test(order): Section 43 — matmul/dot/outer/convolve layout (11 tests,…
Nucs Apr 20, 2026
779f6fc
test(order): Section 44 — Broadcasting from F-contig (5 tests, 0 [Ope…
Nucs Apr 20, 2026
e18caef
feat(examples): Trainable MNIST MLP -- fused forward + backward + Ada…
Nucs Apr 20, 2026
76b9c4e
test(order): Section 45 — Manipulation ops layout (20 tests, 2 [OpenB…
Nucs Apr 20, 2026
2e48d2c
test(order): Section 46 — File I/O fortran_order flag (4 tests, 3 [Op…
Nucs Apr 20, 2026
3f7172e
test(order): Section 47 — around / round_ (6 tests, 3 [OpenBugs])
Nucs Apr 20, 2026
b02a304
test(order): Section 49 — Decimal scalar-full path (10 tests, 1 [Open…
Nucs Apr 20, 2026
61db29e
test(order): Section 50 — Edge cases (12 tests, 1 [OpenBugs])
Nucs Apr 20, 2026
eda98fb
test(order): Section 51 — Fancy-write isolation repros (5 tests, 3 [O…
Nucs Apr 20, 2026
cd38eb1
perf(examples/mlp): 31x faster training -- copy transposed views befo…
Nucs Apr 21, 2026
7e46030
feat(Char8): 1-byte Char8 type with 100% NumPy/Python bytes parity + …
Nucs Apr 21, 2026
038d1ca
feat(mlp): periodic test eval + 100-epoch demo config
Nucs Apr 21, 2026
1783b48
fix(examples): complete all stubbed/broken NN scaffolding classes
Nucs Apr 21, 2026
edcf866
Add NDArray documentation
Nucs Apr 21, 2026
6e1da5d
perf(matmul): stride-native GEMM for all 12 dtypes — no copies
Nucs Apr 21, 2026
ef0c0b8
feat(tile): 1-to-1 parity with NumPy 2.x — battletest + edge-case cov…
Nucs Apr 22, 2026
259e893
docs(tile): update CLAUDE.md inventory + unmark Tile_ApiGap
Nucs Apr 22, 2026
572f6b6
refactor(iterators): migrate all production callers from MultiIterato…
Nucs Apr 22, 2026
d12d7ba
feat(npyiter): promote Iterators/ to full public API + NDArray overloads
Nucs Apr 22, 2026
9b2749b
refactor(iterators): Phase 2 migration — NaN reductions, BooleanMask,…
Nucs Apr 22, 2026
8af86b2
refactor(iterators): Phase 2 cont. — random sampling, casting, GetEnu…
Nucs Apr 22, 2026
7264173
refactor(iterators): rewrite NDIterator as NpyIter wrapper, delete le…
Nucs Apr 22, 2026
51ad43c
fix(npyiter): ForEach/ExecuteGeneric/ExecuteReducing read past end wi…
Nucs Apr 22, 2026
bb205d3
docs(examples): CLAUDE.md for the NeuralNetwork.NumSharp project
Nucs Apr 22, 2026
fb4b7dc
refactor(iterators): NDIterator now iterates lazily — no materialized…
Nucs Apr 22, 2026
b86b348
refactor(iterators): NDIterator fully backed by NpyIter state
Nucs Apr 22, 2026
e2318d4
fix(storage): DTypeSize reports in-memory stride, not Marshal.SizeOf
Nucs Apr 22, 2026
d364e7f
refactor(iterators+docs): cleanup from NpyIter migration
Nucs Apr 23, 2026
574a0d8
refactor(npfunc): replace ~400 NPTypeCode switch cases with NpFunc ge…
Nucs Apr 23, 2026
c3bbe9a
fix(clip): Complex IComparable constraint + Half NaN propagation
Nucs Apr 23, 2026
a96c9d9
fix(power): NumPy-aligned np.power — crash fix, neg-exp ValueError, f…
Nucs May 13, 2026
ff1149f
docs(audit): nditer branch quality audit V1+V2 with chapter findings
Nucs May 13, 2026
7a1d44e
test(audit-v2): OpenBugs reproductions for V2 audit Tier 1 findings
Nucs May 13, 2026
65ef76f
docs(claude): refresh project doc for F-order support and NpyIter
Nucs May 13, 2026
01d57a0
fix(unmanaged): correct CopyTo direction + bounds in ArraySlice / Hel…
Nucs May 13, 2026
1a9646f
fix(shape+convert): preserve scalar offset on Clone; fix ArrayConvert…
Nucs May 13, 2026
f21ea30
fix(storage+ndarray): keep TensorEngine in sync; correct cast for F-c…
Nucs May 13, 2026
414b35f
fix(default-engine): propagate TensorEngine through Cast and Transpose
Nucs May 13, 2026
9c9a396
fix(creation+manipulation): wire TensorEngine through copy / reshape …
Nucs May 13, 2026
443b7e0
feat(np.where): NumPy-aligned C/F output layout selection
Nucs May 13, 2026
50737cd
feat(np.tile): preserve input order on all-ones / no-reps path; refre…
Nucs May 13, 2026
fa60569
chore: deep-clone Options.RemainingArgs; drop trailing newline in Npy…
Nucs May 13, 2026
40371e4
fix(npyiter): deep-copy buffered Clone buffers; preserve stride width…
Nucs May 13, 2026
d4b2af3
test(clone): regression suite for unmanaged copy + storage + iterator…
Nucs May 13, 2026
786d705
feat(dtype): full 15-dtype parity for SByte/Half/Complex across hot p…
Nucs May 13, 2026
1 change: 1 addition & 0 deletions .agents/skills/np-function/SKILL.md
1 change: 1 addition & 0 deletions .agents/skills/np-tests/SKILL.md
53 changes: 32 additions & 21 deletions .claude/CLAUDE.md
@@ -27,8 +27,8 @@ Every np.* function and DefaultEngine operation MUST satisfy these criteria:
- **Sliced views**: Correctly handles Shape.offset for base address calculation

### Dtype Support
-All 12 NumSharp types must be handled (or explicitly documented as unsupported):
-Boolean, Byte, Int16, UInt16, Int32, UInt32, Int64, UInt64, Char, Single, Double, Decimal
+All 15 NumSharp types must be handled (or explicitly documented as unsupported):
+Boolean, Byte, SByte, Int16, UInt16, Int32, UInt32, Int64, UInt64, Char, Half, Single, Double, Decimal, Complex

### NumPy API Parity
- Function signature matches NumPy (parameter names, order, defaults)
@@ -43,18 +43,26 @@ Boolean, Byte, Int16, UInt16, Int32, UInt32, Int64, UInt64, Char, Single, Double

**Full audit tracking:** See `docs/KERNEL_API_AUDIT.md`

-## Supported Types (12)
+## Supported Types (15)

| NPTypeCode | C# Type | NPTypeCode | C# Type |
|------------|---------|------------|---------|
-| Boolean | bool | Int64 | long |
-| Byte | byte | UInt64 | ulong |
-| Int16 | short | Char | char |
-| UInt16 | ushort | Single | float |
-| Int32 | int | Double | double |
-| UInt32 | uint | Decimal | decimal |
-
-All operations must handle all 12 types via type switch pattern.
+| Boolean | bool | UInt64 | ulong |
+| Byte | byte | Char | char |
+| SByte | sbyte | Half | System.Half |
+| Int16 | short | Single | float |
+| UInt16 | ushort | Double | double |
+| Int32 | int | Decimal | decimal |
+| UInt32 | uint | Complex | System.Numerics.Complex |
+| Int64 | long | | |
+
+All operations must handle all 15 types via type switch pattern.
+
+**Perf notes:**
+- SByte / Byte / Int*/UInt* / Single / Double — full SIMD via `MixedTypeKernel.SimdFull` (V128/V256/V512 detected at startup).
+- Half — scalar path (no `Vector<Half>` arithmetic in .NET BCL). Routes through `Half→double→Math.Pow→Half` for `np.power`; ~2× slower than NumPy.
+- Complex — scalar path via `System.Numerics.Complex` operators / `Complex.Pow`. ~2× slower than NumPy.
+- Decimal — scalar path via `DecimalMath.Pow`. Highest precision, slowest.
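
The Half note above describes a widen-compute-narrow route (`Half→double→Math.Pow→Half`). A minimal sketch of that idea, using only BCL conversions, might look like the following. `PowViaDouble` is a hypothetical name for illustration and is not taken from the NumSharp source; the actual kernel may handle edge cases differently.

```csharp
using System;

// Hypothetical helper (not NumSharp's kernel): widen both operands to double,
// compute with Math.Pow, then narrow the result back to Half.
Half PowViaDouble(Half x, Half y) => (Half)Math.Pow((double)x, (double)y);

Half result = PowViaDouble((Half)2.0, (Half)10.0);
Console.WriteLine(result); // 1024 (exactly representable in Half)
```

The two explicit casts per element are part of why a scalar path like this cannot match a vectorized one.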

## Architecture

@@ -74,7 +82,7 @@ np Static API class (like `import numpy as np`)
| Decision | Rationale |
|----------|-----------|
| Unmanaged memory | Benchmarked fastest; optimized for performance |
-| C-order only | Only row-major (C-order) memory layout. Uses `ArrayFlags.C_CONTIGUOUS` flag. No F-order/column-major support. The `order` parameter on `ravel`, `flatten`, `copy`, `reshape` is accepted but ignored. |
+| Order-aware layout | Row-major (C-order) remains the default. `Shape` also tracks F-contiguity, and APIs with an `order` parameter resolve NumPy `C`/`F`/`A`/`K` modes through `OrderResolver`. |
| Regen templating | Type-specific code generation (legacy, mostly replaced by ILKernel) |
| TensorEngine abstract | Future GPU/SIMD backends possible |
| View semantics | Slicing returns views (shared memory), not copies |
@@ -146,14 +154,15 @@ public readonly partial struct Shape
| Flag | Value | Meaning |
|------|-------|---------|
| `C_CONTIGUOUS` | 0x0001 | Data is row-major contiguous |
-| `F_CONTIGUOUS` | 0x0002 | Reserved (always false for NumSharp) |
+| `F_CONTIGUOUS` | 0x0002 | Data is column-major contiguous |
| `OWNDATA` | 0x0004 | Array owns its data buffer |
| `ALIGNED` | 0x0100 | Always true for managed allocations |
| `WRITEABLE` | 0x0400 | False for broadcast views |
| `BROADCASTED` | 0x1000 | Has stride=0 with dim > 1 |
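
To make the two contiguity flags in the table concrete, here is a hedged sketch of how they could be derived from dimensions and strides. It follows NumPy's convention that length-1 axes are ignored and that an array with a zero-length axis counts as both C- and F-contiguous. `ContiguityFlags` is a hypothetical helper, and it assumes strides are counted in elements rather than bytes; it is not the `Shape` implementation.

```csharp
using System;

// Flag values copied from the table above.
const int C_CONTIGUOUS = 0x0001;
const int F_CONTIGUOUS = 0x0002;

// Hypothetical helper: strides are assumed to be in elements, not bytes.
int ContiguityFlags(int[] dims, int[] strides)
{
    if (dims.Length != strides.Length)
        throw new ArgumentException("dims and strides must have the same length.");

    // An array with a zero-length axis has no elements; treat it as both layouts.
    if (Array.IndexOf(dims, 0) >= 0)
        return C_CONTIGUOUS | F_CONTIGUOUS;

    bool isC = true, isF = true;

    // C-order: the expected stride grows from the last axis toward the first.
    long expected = 1;
    for (int i = dims.Length - 1; i >= 0; i--)
    {
        if (dims[i] == 1) continue; // the stride of a length-1 axis never matters
        if (strides[i] != expected) isC = false;
        expected *= dims[i];
    }

    // F-order: the expected stride grows from the first axis toward the last.
    expected = 1;
    for (int i = 0; i < dims.Length; i++)
    {
        if (dims[i] == 1) continue;
        if (strides[i] != expected) isF = false;
        expected *= dims[i];
    }

    return (isC ? C_CONTIGUOUS : 0) | (isF ? F_CONTIGUOUS : 0);
}

Console.WriteLine(ContiguityFlags(new[] { 2, 3 }, new[] { 3, 1 })); // 1 (C only)
Console.WriteLine(ContiguityFlags(new[] { 2, 3 }, new[] { 1, 2 })); // 2 (F only)
```

A 1-D contiguous array sets both bits, which is why a check for "F but not C" is the usual way to decide that a layout is genuinely column-major.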

**Key Shape properties:**
- `IsContiguous` — O(1) check via `C_CONTIGUOUS` flag
+- `IsFContiguous` — O(1) check via `F_CONTIGUOUS` flag
- `IsBroadcasted` — O(1) check via `BROADCASTED` flag
- `IsWriteable` — False for broadcast views (prevents corruption)
- `IsSliced` — True if offset != 0, different size, or non-contiguous
@@ -182,15 +191,14 @@ nd["..., -1"] // Ellipsis fills dimensions

---

-## Missing Functions (20)
+## Missing Functions (18)

These NumPy functions are **not implemented**:

| Category | Functions |
|----------|-----------|
| Sorting | `np.sort` |
-| Selection | `np.where` |
-| Manipulation | `np.flip`, `np.fliplr`, `np.flipud`, `np.rot90`, `np.tile`, `np.pad` |
+| Manipulation | `np.flip`, `np.fliplr`, `np.flipud`, `np.rot90`, `np.pad` |
| Splitting | `np.split`, `np.array_split`, `np.hsplit`, `np.vsplit`, `np.dsplit` |
| Diagonal | `np.diag`, `np.diagonal`, `np.trace` |
| Cumulative | `np.diff`, `np.gradient`, `np.ediff1d` |
@@ -206,7 +214,7 @@ Tested against NumPy 2.x.
`arange`, `array`, `asanyarray`, `asarray`, `copy`, `empty`, `empty_like`, `eye`, `frombuffer`, `full`, `full_like`, `identity`, `linspace`, `meshgrid`, `mgrid`, `ones`, `ones_like`, `zeros`, `zeros_like`

### Shape Manipulation
-`atleast_1d`, `atleast_2d`, `atleast_3d`, `concatenate`, `dstack`, `expand_dims`, `flatten`, `hstack`, `moveaxis`, `ravel`, `repeat`, `reshape`, `roll`, `rollaxis`, `squeeze`, `stack`, `swapaxes`, `transpose`, `unique`, `vstack`
+`atleast_1d`, `atleast_2d`, `atleast_3d`, `concatenate`, `dstack`, `expand_dims`, `flatten`, `hstack`, `moveaxis`, `ravel`, `repeat`, `reshape`, `roll`, `rollaxis`, `squeeze`, `stack`, `swapaxes`, `tile`, `transpose`, `unique`, `vstack`

### Broadcasting
`are_broadcastable`, `broadcast`, `broadcast_arrays`, `broadcast_to`
@@ -226,6 +234,9 @@ Tested against NumPy 2.x.
### Comparison & Logic
`all`, `allclose`, `any`, `array_equal`, `find_common_type`, `isclose`, `isfinite`, `isinf`, `isnan`, `isscalar`, `maximum`, `minimum`

+### Selection
+`where`
+
### Sorting & Searching
`argmax`, `argmin`, `argsort`, `nonzero`, `searchsorted`

@@ -264,7 +275,7 @@ Tested against NumPy 2.x.
| TensorEngine | `Backends/TensorEngine.cs` |
| DefaultEngine | `Backends/Default/DefaultEngine.*.cs` |
| np API | `APIs/np.cs` |
-| Iterators | `Backends/Iterators/NDIterator.cs`, `MultiIterator.cs` |
+| Iterators | `Backends/Iterators/NDIterator.cs`, `NpyIter.cs`, `NpyExpr.cs` |
| Type info | `Utilities/InfoOf.cs` |
| Generic NDArray | `Generics/NDArray\`1.cs` |

Expand Down Expand Up @@ -574,10 +585,10 @@ A: The `Slice` class parses Python notation (e.g., "1:5:2") into `Start`, `Stop`
A: `Slice.All` (`:` - all elements), `Slice.Ellipsis` (`...` - fill dimensions), `Slice.NewAxis` (insert dimension), `Slice.Index(n)` (single element, reduces dimensionality).

**Q: What is NDIterator used for?**
-A: Traversing arrays with different memory layouts. Handles contiguous (fast pointer increment) and sliced (uses GetOffset) arrays. Has `MoveNext()`, `HasNext()`, `Reset()`. AutoReset mode for broadcasting smaller arrays.
+A: Legacy typed traversal surface over `NpyIter`. It keeps the existing `MoveNext()`, `HasNext()`, and `Reset()` API while delegating stride, broadcast, and view traversal to the NpyIter state machinery.

-**Q: What is MultiIterator?**
-A: Handles paired iteration for broadcasting. `MultiIterator.Assign(lhs, rhs)` copies with broadcasting. `GetIterators(lhs, rhs, broadcast)` creates synchronized iterators.
+**Q: What is NpyIter?**
+A: The NumPy-aligned multi-operand iterator. It handles C/F/A/K order, broadcasting, external loops, buffering, casting, masks, reductions, and synchronized traversal for copy and elementwise kernels. `MultiIterator` was removed in favor of `NpyIter.Copy` and multi-operand iterator execution.

**Q: How does broadcasting work?**
A: Shapes align from the right. Dimensions must be equal OR one must be 1. Dimension of 1 "stretches" to match. Implemented via `DefaultEngine.Broadcast()` which resolves compatible shapes.
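
The right-aligned rule in the answer above can be sketched in a few lines. This is an illustrative stand-alone function, not `DefaultEngine.Broadcast()`; the name `BroadcastShapes` and its signature are assumptions made for the example.

```csharp
using System;

// Hypothetical helper: resolves the broadcast shape of two operands.
int[] BroadcastShapes(int[] a, int[] b)
{
    int ndim = Math.Max(a.Length, b.Length);
    var result = new int[ndim];

    for (int i = 1; i <= ndim; i++)
    {
        // Walk both shapes from the right; missing leading axes behave like 1.
        int da = i <= a.Length ? a[a.Length - i] : 1;
        int db = i <= b.Length ? b[b.Length - i] : 1;

        if (da != db && da != 1 && db != 1)
            throw new ArgumentException(
                $"Shapes ({string.Join(", ", a)}) and ({string.Join(", ", b)}) are not broadcastable.");

        // A dimension of 1 stretches to match the other operand.
        result[ndim - i] = da == 1 ? db : da;
    }

    return result;
}

Console.WriteLine(string.Join(", ", BroadcastShapes(new[] { 3, 1 }, new[] { 4 }))); // 3, 4
```

Shapes such as `(2, 3)` and `(4)` fail the check on the last axis, so the sketch throws for them.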
1 change: 1 addition & 0 deletions AGENTS.md
2 changes: 2 additions & 0 deletions ARCHITECTURE.md
@@ -480,6 +480,8 @@ MoveNext = () => *((T*)Address + shape.GetOffset(index++));

## Code Generation

+For the practical implementation rules used by `DefaultEngine` and `ILKernelGenerator`, see `docs/DEFAULTENGINE_ILKERNEL_PLAYBOOK.md`. That guide captures the recurring engine patterns, optimization conventions, and test expectations that are only implicit in the source code.
+
### Regen Templating

NumSharp uses Regen (a custom templating engine) to generate type-specific code. This results in approximately **200,000 lines of generated code**.
2 changes: 1 addition & 1 deletion benchmark/NumSharp.Benchmark.Exploration/Program.cs
@@ -393,7 +393,7 @@ private class Options
Dtypes = Dtypes,
Sizes = Sizes,
OutputPath = OutputPath,
-RemainingArgs = RemainingArgs
+RemainingArgs = (string[])RemainingArgs.Clone()
};
}
}
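
The `RemainingArgs` change swaps a reference assignment for `Array.Clone()`. The stand-alone sketch below, with made-up argument values, shows the aliasing that the change appears intended to avoid. Note that `Array.Clone()` is a shallow copy; for a `string[]` that is enough, since strings are immutable.

```csharp
using System;

string[] original = { "--dtype", "int32" };

string[] alias = original;                  // same array object as original
string[] copy = (string[])original.Clone(); // new array holding the same strings

alias[1] = "float64"; // writes through to original
copy[0] = "--size";   // only touches the clone

Console.WriteLine(original[1]); // float64
Console.WriteLine(original[0]); // --dtype
```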