Add approximate parameter to GELU activation function #1548

Merged: alinpahontu2912 merged 4 commits into dotnet:main from alinpahontu2912:feature/gelu-approximate-parameter on May 4, 2026

@alinpahontu2912 (Member)

Fixes #1368

Add support for the 'approximate' parameter in GELU, matching PyTorch's torch.nn.GELU(approximate='tanh') functionality.

Changes:

  • Add GELU.Approximate enum with 'none' and 'tanh' values
  • Thread approximate parameter through all layers: native C++, PInvoke, Tensor methods, functional API, and module factory
  • Add new overloads (no breaking changes to existing API)
  • Add test for tanh approximation mode
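
For context, the two modes compute slightly different functions. The sketch below is not the TorchSharp or LibTorch implementation; it is a standalone C++ illustration of the formulas documented for torch.nn.GELU, and the names gelu_exact, gelu_tanh, and gelu are hypothetical, chosen only for this example.

```cpp
#include <cmath>
#include <stdexcept>
#include <string>

// Exact GELU: x * Phi(x), where Phi is the standard normal CDF,
// written here in terms of the error function.
double gelu_exact(double x) {
    return 0.5 * x * (1.0 + std::erf(x / std::sqrt(2.0)));
}

// Tanh approximation:
// 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))).
double gelu_tanh(double x) {
    const double kPi = 3.14159265358979323846;
    const double k = std::sqrt(2.0 / kPi);
    return 0.5 * x * (1.0 + std::tanh(k * (x + 0.044715 * x * x * x)));
}

// Hypothetical dispatcher keyed on the same strings PyTorch accepts
// for the approximate argument ("none" or "tanh").
double gelu(double x, const std::string& approximate = "none") {
    if (approximate == "none") return gelu_exact(x);
    if (approximate == "tanh") return gelu_tanh(x);
    throw std::invalid_argument("approximate must be \"none\" or \"tanh\"");
}
```

At x = 1.0 the two modes differ by an amount on the order of 1e-4, which is small but large enough for a test to tell the tanh path apart from the exact one.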

Copilot AI (Contributor) left a comment

Pull request overview

Adds support for PyTorch’s approximate mode to GELU (notably "tanh"), threading the option through the native (C++), P/Invoke, Tensor, functional, and module APIs, and adding a regression test.

Changes:

  • Introduces Modules.GELU.Approximate (none / tanh) and plumbs it through nn.GELU and nn.functional.gelu.
  • Extends Tensor gelu/gelu_ to accept an approximation mode and updates the corresponding native/PInvoke signatures.
  • Adds a unit test validating the tanh approximation path and that it differs from the exact mode.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| test/TorchSharpTest/NN.cs | Adds a test covering GELU tanh approximation behavior. |
| src/TorchSharp/Tensor/Tensor.cs | Adds gelu/gelu_ overloads that pass approximation through to native. |
| src/TorchSharp/PInvoke/LibTorchSharp.THSTensor.cs | Updates P/Invoke signatures to accept the approximation string. |
| src/TorchSharp/NN/Activation/GELU.cs | Adds approximation enum + overloads in module factory and functional API. |
| src/Native/LibTorchSharp/THSTensor.h | Updates native exports for GELU to accept an approximation parameter. |
| src/Native/LibTorchSharp/THSTensor.cpp | Passes approximation through to torch::gelu / torch::gelu_. |

Comment thread src/TorchSharp/PInvoke/LibTorchSharp.THSTensor.cs Outdated
Comment thread src/TorchSharp/Tensor/Tensor.cs Outdated
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Comment thread src/TorchSharp/Tensor/Tensor.cs Outdated
Comment thread src/TorchSharp/Tensor/Tensor.cs Outdated
Comment thread test/TorchSharpTest/NN.cs
Comment thread src/TorchSharp/Tensor/Enums/GELUApproximate.cs Outdated
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Comment thread src/Native/LibTorchSharp/THSTensor.h Outdated
Comment thread src/Native/LibTorchSharp/THSTensor.cpp Outdated
Comment thread src/TorchSharp/Tensor/Enums/GELUApproximate.cs Outdated
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Comment thread src/TorchSharp/Tensor/Enums/GELUApproximate.cs Outdated
alinpahontu2912 and others added 4 commits May 4, 2026 10:21
Add support for the 'approximate' parameter in GELU, matching PyTorch's
torch.nn.GELU(approximate='tanh') functionality.

Changes:
- Add GELU.Approximate enum with 'none' and 'tanh' values
- Thread approximate parameter through all layers: native C++, PInvoke,
  Tensor methods, functional API, and module factory
- Add new overloads (no breaking changes to existing API)
- Add test for tanh approximation mode

Fixes dotnet#1368

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Move Approximate enum from GELU module class to neutral
  TorchSharp namespace as GELUApproximate, removing Tensor/functional
  layer dependency on Modules layer
- Add CharSet, BestFitMapping, ThrowOnUnmappableChar attributes to
  THSTensor_gelu/gelu_ DllImport declarations to match existing
  LPStr-based imports pattern
- Update all references in Tensor.cs, GELU.cs, and tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Keep original THSTensor_gelu/gelu_ exports unchanged for ABI
  compatibility
- Add new THSTensor_gelu_with_approximate/gelu_with_approximate_
  exports that accept the approximate string parameter
- Add null guard in native code, treating null as 'none'
- Update P/Invoke declarations and managed callers accordingly

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Move the standalone GELUApproximate enum into the GELU class as a
nested Approximate enum, making the public API surface GELU.Approximate
instead of GELUApproximate. This matches the intended API design and
scopes the enum to its semantically relevant module.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
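
The third commit message above mentions a null guard in native code that treats a null approximate string as 'none'. A minimal sketch of that idea, assuming the string arrives from the managed side as a const char*; normalize_approximate is a hypothetical helper name for illustration, not a function taken from LibTorchSharp:

```cpp
#include <string>

// Hypothetical helper: convert a possibly-null C string received across
// the P/Invoke boundary into a mode name, falling back to "none" when
// the pointer is null.
std::string normalize_approximate(const char* approximate) {
    return approximate != nullptr ? std::string(approximate)
                                  : std::string("none");
}
```

Guarding at the native boundary means a managed caller that passes null still gets the exact (non-approximate) behavior rather than a null-pointer dereference.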
@alinpahontu2912 force-pushed the feature/gelu-approximate-parameter branch from 6d3eac0 to 64db9b2 on May 4, 2026 08:22
@alinpahontu2912 merged commit 835e4d8 into dotnet:main on May 4, 2026
2 checks passed

Development

Successfully merging this pull request may close these issues.

GELU does not appear to support approximate tanh

3 participants