Tensorbit Quant

Third stage of the Tensorbit Labs P-D-Q pipeline (Prune → Distill → Quant → Run).

Reads an FP32 .tbm model container from tensorbit-core, quantizes weight tensors to INT4 or INT8, and writes a new .tbm with compressed weights + per-group scale metadata. Zero external dependencies — C++20 standard library only.

.tbm Container Format

┌─────────────────────────────────────────────────────────────────┐
│  Tensor 0 Blob                                                  │
│  ┌──────────┬───────────────────┬──────────┬───────┐            │
│  │ TBHeader │ Quantized weights │  Scales  │ Masks │            │
│  │ (4096 B) │ (INT4: N/2 bytes) │ (FP32)   │       │            │
│  └──────────┴───────────────────┴──────────┴───────┘            │
├─────────────────────────────────────────────────────────────────┤
│  ... more tensors ...                                           │
├─────────────────────────────────────────────────────────────────┤
│  JSON Index (UTF-8)                                             │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ {"name":"...","offset":...,"dtype":"int4",                │   │
│  │  "num_weights":...,"num_mask_bytes":...,                  │   │
│  │  "scale_count":8192,"group_size":128}                     │   │
│  └──────────────────────────────────────────────────────────┘   │
├─────────────────────────────────────────────────────────────────┤
│  4-byte LE uint32 = JSON byte length                            │
└─────────────────────────────────────────────────────────────────┘

New fields in the JSON index per tensor:

| Field | Type | Description |
| --- | --- | --- |
| `dtype` | string | `"int4"` or `"int8"` |
| `scale_count` | int | Number of FP32 scale values (one per group) |
| `group_size` | int | Elements per quant group (`0` = per-tensor) |
| `zp_count` | int | Number of zero-point bytes (asymmetric only) |
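The trailing length field makes the index easy to locate without scanning the tensor blobs. A minimal sketch of recovering the JSON index from a container already loaded into memory, assuming only the layout shown in the diagram above (`read_tbm_index` is an illustrative name, not part of the repo's API):

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Recover the JSON index from an in-memory .tbm container. Per the
// layout above, the last 4 bytes are a little-endian uint32 giving
// the byte length of the UTF-8 JSON index immediately preceding them.
std::string read_tbm_index(const std::vector<uint8_t>& file) {
    if (file.size() < 4)
        throw std::runtime_error("truncated .tbm container");
    // Assemble the length byte-by-byte so the code is also correct
    // on big-endian hosts.
    const uint8_t* p = file.data() + file.size() - 4;
    uint32_t json_len = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
                        (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    if ((size_t)json_len + 4 > file.size())
        throw std::runtime_error("index length exceeds file size");
    const char* start =
        reinterpret_cast<const char*>(file.data() + file.size() - 4 - json_len);
    return std::string(start, json_len);
}
```

Reading back to front this way means a loader can seek to the index first and then map only the tensor blobs it needs.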

Usage

mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build . --target tb-quant --parallel 4

# INT4 symmetric, 128-element groups (default)
./bin/tb-quant --model model.tbm --output model.int4.tbm

# INT8 per-channel symmetric
./bin/tb-quant --model model.tbm --output model.int8.tbm --dtype int8 --group-size 0

# INT4 asymmetric
./bin/tb-quant --model model.tbm --output model.int4.tbm --scheme asymmetric

Options

| Flag | Default | Description |
| --- | --- | --- |
| `--model PATH` | (required) | Input FP32 `.tbm` file |
| `--output PATH` | (required) | Output quantized `.tbm` file |
| `--dtype TYPE` | `int4` | Quantization type: `int4` or `int8` |
| `--scheme SCHEME` | `symmetric` | `symmetric` or `asymmetric` |
| `--group-size N` | `128` | Elements per quant group (`0` = per-tensor) |
| `--help`, `-h` | | Print help |
| `--version` | | Print version |

Quantization Methods

Symmetric INT4 (default):

  • Group of 128 weights → find max_abs = max(|w|)
  • Scale = max_abs / 7. Each weight: q = round(w / scale), clamped to [-8, 7]
  • Two 4-bit values packed per byte (low nibble first)
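The steps above can be sketched as follows (a standalone illustration, not the repo's actual code; `Int4Group` and `quantize_group_int4` are hypothetical names):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Int4Group {
    float scale;                 // one FP32 scale for the whole group
    std::vector<uint8_t> packed; // ceil(n / 2) bytes, low nibble first
};

Int4Group quantize_group_int4(const std::vector<float>& w) {
    float max_abs = 0.0f;
    for (float x : w) max_abs = std::max(max_abs, std::fabs(x));
    // Guard against an all-zero group; how the real tool handles this
    // case is an assumption here.
    float scale = max_abs > 0.0f ? max_abs / 7.0f : 1.0f;
    Int4Group g{scale, std::vector<uint8_t>((w.size() + 1) / 2, 0)};
    for (size_t i = 0; i < w.size(); ++i) {
        // q = round(w / scale), clamped to [-8, 7]
        int q = std::clamp((int)std::lround(w[i] / scale), -8, 7);
        uint8_t nib = (uint8_t)(q & 0xF); // two's-complement nibble
        g.packed[i / 2] |= (i % 2 == 0) ? nib : (uint8_t)(nib << 4);
    }
    return g;
}
```

Note the clamp range is asymmetric: scaling by `max_abs / 7` means the largest-magnitude weight maps to ±7, so the extra code point −8 is only reached through rounding at the boundary.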

Symmetric INT8 per-row:

  • One scale per matrix row. Scale = max_abs / 127
  • Each weight stored as signed 8-bit integer
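A corresponding per-row sketch (hypothetical helper; the clamp to [-127, 127] is a common symmetric-range convention and an assumption here, not confirmed from the repo):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Quantize one matrix row to INT8 with a single scale = max_abs / 127.
std::vector<int8_t> quantize_row_int8(const std::vector<float>& row,
                                      float& scale_out) {
    float max_abs = 0.0f;
    for (float x : row) max_abs = std::max(max_abs, std::fabs(x));
    scale_out = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
    std::vector<int8_t> q(row.size());
    for (size_t i = 0; i < row.size(); ++i)
        q[i] = (int8_t)std::clamp((int)std::lround(row[i] / scale_out),
                                  -127, 127);
    return q;
}
```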

Asymmetric INT4:

  • Per-group min/max. Scale = (max - min) / 15, zero-point = round(-min / scale)
  • Each weight: q = round(w / scale) + zp, clamped to [0, 15]
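The asymmetric scheme is an affine map of the group's `[min, max]` range onto `[0, 15]`. A hedged sketch (names are illustrative; clamping the zero-point into `[0, 15]` is an assumption, not confirmed from the repo):

```cpp
#include <algorithm>
#include <cmath>

struct AsymParams {
    float scale;
    int zp; // zero-point
};

// scale = (max - min) / 15, zero-point = round(-min / scale)
AsymParams asym_params(float mn, float mx) {
    float scale = (mx - mn) / 15.0f;
    if (scale == 0.0f) scale = 1.0f; // degenerate constant group
    int zp = (int)std::lround(-mn / scale);
    return {scale, std::clamp(zp, 0, 15)};
}

// q = round(w / scale) + zp, clamped to [0, 15]
int quantize_asym(float w, const AsymParams& p) {
    return std::clamp((int)std::lround(w / p.scale) + p.zp, 0, 15);
}

// Dequantization recovers w ~= (q - zp) * scale.
float dequantize_asym(int q, const AsymParams& p) {
    return (float)(q - p.zp) * p.scale;
}
```

Unlike the symmetric scheme, zero is represented exactly at `q = zp`, which matters for sparse or pruned weight tensors.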

License

This project is dual-licensed.

  • Open source use: Licensed under the GNU AGPLv3. You may use, modify, and distribute the code under the terms of the AGPL, which requires all modifications and larger works to be licensed under the same license and requires making source code available to network users.

  • Commercial use: If you wish to use this library in a proprietary product without the copyleft obligations of the AGPL, a separate commercial license is available. Please contact us for details.
