Open
68 commits
af2146d
feat: Gemma 4 support
giladgd Apr 6, 2026
2471df8
fix: Gemma 4 resource requirements estimation
giladgd Apr 12, 2026
67b60c6
feat: more precise resource usage estimation, auto flash attention, r…
giladgd Apr 28, 2026
2fe0dd9
fix: Vulkan backend successful load detection even when no devices ar…
giladgd Apr 28, 2026
d5c4c2c
feat: optimize grammar sampling performance
giladgd Apr 28, 2026
2f01d10
fix: resolve Gemma 4 chat wrapper for relevant models
giladgd Apr 28, 2026
4ce206b
test: gemma 4 function calling
giladgd Apr 28, 2026
dacca3b
feat: `useMmap: "auto"`, bug fixes, fix tests
giladgd Apr 28, 2026
3142124
feat: support `Q1_0` quant, fix `MXFP4_MOE` quant name
giladgd Apr 28, 2026
b48681f
fix: apply `llama.cpp` patches if pending PRs aren't merged yet
giladgd May 5, 2026
74fef2f
fix: adapt to breaking `llama.cpp` changes
giladgd May 5, 2026
c772709
test: fix tests
giladgd May 5, 2026
40d204d
fix: bug
giladgd May 5, 2026
0acdc31
test: fix tests
giladgd May 5, 2026
ea7fce0
fix: type
giladgd May 5, 2026
3fc0363
test: fix tests
giladgd May 6, 2026
aa50af2
fix: don't crash on unsupported model architecture
giladgd May 6, 2026
ca607fd
feat: improve stability on unified memory systems
giladgd May 20, 2026
79543d9
fix: bugs
giladgd May 20, 2026
547c692
fix: correct wired memory calculation
giladgd May 20, 2026
fe284fe
fix: improve measure safety
giladgd May 20, 2026
5ef1c2b
fix: bug
giladgd May 20, 2026
1c62b87
fix: bug
giladgd May 20, 2026
b099ead
fix: bugs
giladgd May 20, 2026
6b387e6
fix: remove patch for merged PR
giladgd May 25, 2026
720a2d2
Merge remote-tracking branch 'origin/master' into gilad/gemma4
giladgd May 26, 2026
7f91df0
feat: try using github token to fetch latest llama.cpp release on rat…
giladgd May 26, 2026
cb6f8c1
feat: disabled residency sets on macOS by default for better OS respo…
giladgd May 26, 2026
6977bcd
fix: bug
giladgd May 26, 2026
9d9cccb
fix: Windows LLVM toolchain
giladgd May 27, 2026
0cf657e
feat: more optimized local build
giladgd May 27, 2026
68386d6
fix: consider paddings in resource usage calculations
giladgd Jun 3, 2026
d3f88c9
feat: skip specific patches, respect progress logs config
giladgd Jun 3, 2026
e15fbbd
fix: add missing change
giladgd Jun 3, 2026
4bf6958
fix: properly use HF token when needed and present
giladgd Jun 4, 2026
7879fb2
fix: improve thread safety
giladgd Jun 4, 2026
98977a1
fix: update pending PR patch
giladgd Jun 4, 2026
35b0184
fix: bugs
giladgd Jun 4, 2026
5201176
feat(`inspect estimate` command): auto resolve flash attention
giladgd Jun 4, 2026
18ebb8b
fix: circular imports from `config.ts`
giladgd Jun 5, 2026
876bb0b
feat: default to `progressLogs: "stderr"`
giladgd Jun 5, 2026
115ef68
feat(`inspect measure` command): support embedding models
giladgd Jun 5, 2026
6f592d4
fix: native types in sampler
giladgd Jun 5, 2026
0ebe562
fix: update pending PR patch
giladgd Jun 5, 2026
c41b189
test: fix tests
giladgd Jun 6, 2026
dab905e
fix: model memory estimation
giladgd Jun 6, 2026
94ad63a
fix(`inspect measure` command): align `useDirectIo` with the rest of …
giladgd Jun 6, 2026
15e1c38
fix: Vulkan thread safety
giladgd Jun 6, 2026
243ed2e
fix: Vulkan thread safety
giladgd Jun 6, 2026
ecac77a
fix: load deadlock
giladgd Jun 6, 2026
a1b5a27
fix: add missing change
giladgd Jun 6, 2026
f6b8b5e
test: fix test
giladgd Jun 6, 2026
5f2094a
fix: update pending PR patch
giladgd Jun 7, 2026
3f46e0a
feat: faster resource usage estimation
giladgd Jun 7, 2026
89e3e01
docs: inform about RAM cap behavior on unified memory systems
giladgd Jun 7, 2026
9cbb5cd
fix: bugs
giladgd Jun 7, 2026
6598cfd
fix(CLI): avoid redownloading existing model that consists of multipl…
giladgd Jun 14, 2026
db92aac
fix: join metadata from multi-file models for resource usage estimation
giladgd Jun 14, 2026
0b2348e
fix: update pending PR patch
giladgd Jun 14, 2026
86eb3ba
fix: optimize checkpoints management when using grammar
giladgd Jun 14, 2026
2ad108a
fix: don't crash when loading huge models
giladgd Jun 14, 2026
922475f
fix: attribute external JS memory to the relevant objects
giladgd Jun 14, 2026
9f31ca0
fix: github client ratelimit workaround
giladgd Jun 14, 2026
ce952ac
chore: update workflow versions
giladgd Jun 14, 2026
e0cfc89
test: fix tests
giladgd Jun 14, 2026
70cfccf
docs: update Electron CI example
giladgd Jun 14, 2026
4a5ee76
fix: build
giladgd Jun 14, 2026
a98d5b9
test: fix test
giladgd Jun 14, 2026
117 changes: 64 additions & 53 deletions .github/workflows/build.yml
@@ -13,32 +13,34 @@ jobs:
name: Build
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
with:
node-version: "20"
node-version: "22"
package-manager-cache: false
- name: Install modules
run: npm ci
- name: Build
run: npm run build
- name: Download latest llama.cpp release
env:
CI: true
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: node ./dist/cli/cli.js source download --release latest --skipBuild --noBundle --noUsageExample --updateBinariesReleaseMetadataAndSaveGitBundle
- name: Upload build artifact
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@v7
with:
include-hidden-files: true
name: "build"
path: "dist"
- name: Upload packed templates artifact
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@v7
with:
include-hidden-files: true
name: "build-templates"
path: "templates/packed"
- name: Upload llama.cpp artifact
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@v7
with:
include-hidden-files: true
name: "llama.cpp"
@@ -77,19 +79,20 @@ jobs:
artifact: "mac-arm64"

steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
with:
node-version: "20"
node-version: "22"
package-manager-cache: false

- name: Download build artifact
uses: actions/download-artifact@v4
uses: actions/download-artifact@v8
with:
name: build
path: dist

- name: Download llama.cpp artifact
uses: actions/download-artifact@v4
uses: actions/download-artifact@v8
with:
name: llama.cpp
path: llama
@@ -314,7 +317,7 @@ jobs:

# - name: Cache UPX
# id: cache-upx
# uses: actions/cache@v4
# uses: actions/cache@v5
# with:
# path: "upxInstallations/**"
# key: cache-upx-${{ runner.os }}-${{ github.workflow }}
@@ -361,7 +364,7 @@ jobs:
# chmod -x ./bins/linux-x64-cuda/llama-addon.node

- name: Publish artifact
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@v7
with:
include-hidden-files: true
name: "bins-${{ matrix.config.artifact }}"
@@ -383,19 +386,20 @@ jobs:
outputs:
next-version: ${{ steps.save-next-version.outputs.next-version }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
with:
node-version: "22"
package-manager-cache: false
- name: Install modules
run: npm ci
- name: Download build artifact
uses: actions/download-artifact@v4
uses: actions/download-artifact@v8
with:
name: build
path: dist
- name: Download llama.cpp artifact
uses: actions/download-artifact@v4
uses: actions/download-artifact@v8
with:
name: llama.cpp
path: llama
@@ -418,7 +422,7 @@ jobs:
echo "Next release version: \`$(cat ./resolvedNextVersion.txt)\`" >> $GITHUB_STEP_SUMMARY
fi
- name: Upload resolved release artifact
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@v7
with:
include-hidden-files: true
name: "resolved-next-release"
@@ -430,19 +434,20 @@ jobs:
needs:
- build
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
with:
node-version: "20"
node-version: "22"
package-manager-cache: false

- name: Download build artifact
uses: actions/download-artifact@v4
uses: actions/download-artifact@v8
with:
name: build
path: dist

- name: Download llama.cpp artifact
uses: actions/download-artifact@v4
uses: actions/download-artifact@v8
with:
name: llama.cpp
path: llama
@@ -469,19 +474,20 @@ jobs:
needs:
- build
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
with:
node-version: "20"
node-version: "22"
package-manager-cache: false

- name: Download build artifact
uses: actions/download-artifact@v4
uses: actions/download-artifact@v8
with:
name: build
path: dist

- name: Download llama.cpp artifact
uses: actions/download-artifact@v4
uses: actions/download-artifact@v8
with:
name: llama.cpp
path: llama
@@ -507,7 +513,7 @@ jobs:

- name: Cache models
id: cache-restore-test-models
uses: actions/cache/restore@v4
uses: actions/cache/restore@v5
with:
path: "test/.models/**.gguf"
key: cache-test-models-${{ runner.os }}-${{ github.workflow }}
@@ -524,7 +530,7 @@ jobs:
- name: Save cached models
id: cache-save-test-models
if: steps.download-all-test-models.outcome == 'success' && always()
uses: actions/cache/save@v4
uses: actions/cache/save@v5
with:
path: "test/.models/**.gguf"
key: cache-test-models-${{ runner.os }}-${{ github.workflow }}
@@ -550,17 +556,18 @@ jobs:
outputs:
package-version: ${{ steps.set-package-version.outputs.package-version }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v6
with:
lfs: true
- uses: actions/setup-node@v4
- uses: actions/setup-node@v6
with:
node-version: "22"
package-manager-cache: false
- name: Update npm
run: npm install -g npm@latest
- name: Install modules
run: npm ci
- uses: actions/download-artifact@v4
- uses: actions/download-artifact@v8
with:
path: artifacts
- name: Move artifacts
@@ -684,12 +691,13 @@ jobs:
os: macos-15-intel

steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v6
with:
lfs: true
- uses: actions/setup-node@v4
- uses: actions/setup-node@v6
with:
node-version: "20"
node-version: "22"
package-manager-cache: false

- name: Install dependencies on Ubuntu
if: matrix.config.name == 'Ubuntu'
@@ -721,7 +729,7 @@ jobs:
ls ./release

- name: Upload artifacts
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@v7
with:
include-hidden-files: true
name: "electron-app-example-${{ matrix.config.name }}"
@@ -775,17 +783,18 @@ jobs:
# Can be replaced with YAML anchors when this will be supported by GitHub Actions:
# https://github.com/actions/runner/issues/1182#issuecomment-2317953582
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v6
with:
lfs: true
fetch-depth: 0
fetch-tags: true
- uses: actions/setup-node@v4
- uses: actions/setup-node@v6
with:
node-version: "20"
node-version: "22"
package-manager-cache: false
- name: Install modules
run: npm ci
- uses: actions/download-artifact@v4
- uses: actions/download-artifact@v8
with:
path: artifacts
- name: Move artifacts
@@ -817,12 +826,12 @@ jobs:

npm run docs:build
- name: Upload docs to GitHub Pages
uses: actions/upload-pages-artifact@v3
uses: actions/upload-pages-artifact@v5
with:
name: pages-docs
path: docs-site
- name: Deploy docs to GitHub Pages
uses: actions/deploy-pages@v4
uses: actions/deploy-pages@v5
with:
artifact_name: pages-docs
- name: Update feed
@@ -855,17 +864,18 @@ jobs:
# Can be replaced with YAML anchors when this will be supported by GitHub Actions:
# https://github.com/actions/runner/issues/1182#issuecomment-2317953582
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v6
with:
lfs: true
fetch-depth: 0
fetch-tags: true
- uses: actions/setup-node@v4
- uses: actions/setup-node@v6
with:
node-version: "20"
node-version: "22"
package-manager-cache: false
- name: Install modules
run: npm ci
- uses: actions/download-artifact@v4
- uses: actions/download-artifact@v8
with:
path: artifacts
- name: Move artifacts
@@ -897,12 +907,12 @@ jobs:

npm run docs:build
- name: Upload docs to GitHub Pages
uses: actions/upload-pages-artifact@v3
uses: actions/upload-pages-artifact@v5
with:
name: pages-docs
path: docs-site
- name: Deploy docs to GitHub Pages
uses: actions/deploy-pages@v4
uses: actions/deploy-pages@v5
with:
artifact_name: pages-docs
- name: Update feed
@@ -921,15 +931,16 @@ jobs:
# pull-requests: write
# discussions: write
# steps:
# - uses: actions/checkout@v4
# - uses: actions/setup-node@v4
# - uses: actions/checkout@v6
# - uses: actions/setup-node@v6
# with:
# node-version: "20"
# node-version: "22"
# package-manager-cache: false
# - name: Install modules
# run: npm ci
#
# - name: Pull artifact from broken release
# uses: actions/download-artifact@v4
# uses: actions/download-artifact@v8
# with:
# name: resolved-next-release
# github-token: ${{ secrets.GITHUB_TOKEN }}
12 changes: 6 additions & 6 deletions .github/workflows/test.yml
@@ -12,10 +12,10 @@ jobs:
name: Test
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
with:
node-version: "20"
node-version: "22"
- name: Install modules
run: npm ci
- name: ESLint
@@ -27,14 +27,14 @@ jobs:
name: Test docs compilation
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v6
with:
lfs: true
fetch-depth: 0
fetch-tags: true
- uses: actions/setup-node@v4
- uses: actions/setup-node@v6
with:
node-version: "20"
node-version: "22"
- name: Install modules
run: npm ci
- name: Build
1 change: 1 addition & 0 deletions .vitepress/config/apiReferenceSidebar.ts
@@ -53,6 +53,7 @@ const chatWrappersOrder = [
"Llama3ChatWrapper",
"Llama2ChatWrapper",
"MistralChatWrapper",
"Gemma4ChatWrapper",
"GemmaChatWrapper",
"ChatMLChatWrapper",
"FalconChatWrapper",
9 changes: 5 additions & 4 deletions docs/guide/electron.md
@@ -65,10 +65,11 @@ jobs:
os: macos-13

steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
with:
node-version: "20"
node-version: "22"
package-manager-cache: false

- name: Install dependencies on Ubuntu
if: matrix.config.name == 'Ubuntu'
@@ -87,7 +88,7 @@ jobs:
run: npm run build

- name: Upload artifacts
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@v7
with:
include-hidden-files: true
name: "electron-app-${{ matrix.config.name }}"