2 changes: 2 additions & 0 deletions Dockerfile
@@ -37,6 +37,8 @@ COPY --from=builder /usr/local/bin/warmup /usr/local/bin/warmup
ARG EMBEDDING_MODELS=nomic,bge-small
RUN EMBEDDING_MODELS="${EMBEDDING_MODELS}" EMBEDDING_POOL_SIZE=1 /usr/local/bin/warmup

# EXPOSE is build-time metadata only; the actual port is controlled by the
# EMBEDDING_PORT env var at runtime (default 3000).
EXPOSE 3000

ENTRYPOINT ["/usr/local/bin/embedding"]
11 changes: 11 additions & 0 deletions README.md
@@ -16,6 +16,17 @@ curl -X POST http://localhost:3000/embed \
-d '{"texts":["hello world","another piece of text"]}'
```

## Configuration

Configured via environment variables (set them in `.env`):


Review comment (P2): Possible default mismatch for `EMBEDDING_MODELS`

The table documents the default as `nomic`, but the Dockerfile's `ARG EMBEDDING_MODELS=nomic,bge-small` suggests both models are baked into the image during the warmup step. Could you confirm what the actual runtime default is in the Rust library? If the service loads both models by default, the table entry should be updated to `nomic,bge-small` to match.


| Variable | Default | Description |
| --- | --- | --- |
| `EMBEDDING_PORT` | `3000` | Port the service listens on. |
| `EMBEDDING_MODELS` | `nomic` | Comma-separated list of models to load. |
| `EMBEDDING_CACHE_DIR` | _(default cache)_ | Directory for downloaded model files. |
| `EMBEDDING_POOL_SIZE` | _(memory-derived)_ | Number of model instances per pool. |
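
The defaults in the table above could be resolved roughly as follows. This is a minimal sketch, not the service's actual code: `Settings` and `settings_from` are hypothetical names, the lookup closure stands in for `std::env::var(name).ok()`, and the `nomic` default is taken from the table as written (the review comment above questions whether it matches the runtime default).

```rust
use std::path::PathBuf;

/// Hypothetical settings struct mirroring the four documented variables.
#[derive(Debug, PartialEq)]
struct Settings {
    port: u16,
    models: Vec<String>,
    /// `None` means "use the default cache location".
    cache_dir: Option<PathBuf>,
    /// `None` means "derive the pool size from available memory".
    pool_size: Option<usize>,
}

/// Build settings from any lookup function.
/// In a real binary the lookup would be `|name| std::env::var(name).ok()`.
fn settings_from<F>(lookup: F) -> Result<Settings, String>
where
    F: Fn(&str) -> Option<String>,
{
    // EMBEDDING_PORT: documented default is 3000.
    let port = match lookup("EMBEDDING_PORT") {
        Some(raw) => raw
            .parse::<u16>()
            .map_err(|_| format!("EMBEDDING_PORT '{}' is not a valid port number", raw))?,
        None => 3000,
    };

    // EMBEDDING_MODELS: comma-separated list; default assumed from the table.
    let models: Vec<String> = lookup("EMBEDDING_MODELS")
        .unwrap_or_else(|| "nomic".to_string())
        .split(',')
        .map(|m| m.trim().to_string())
        .filter(|m| !m.is_empty())
        .collect();

    // EMBEDDING_CACHE_DIR: optional, no explicit default in the table.
    let cache_dir = lookup("EMBEDDING_CACHE_DIR").map(PathBuf::from);

    // EMBEDDING_POOL_SIZE: optional, memory-derived when unset.
    let pool_size = match lookup("EMBEDDING_POOL_SIZE") {
        Some(raw) => Some(
            raw.parse::<usize>()
                .map_err(|_| format!("EMBEDDING_POOL_SIZE '{}' is not a valid count", raw))?,
        ),
        None => None,
    };

    Ok(Settings { port, models, cache_dir, pool_size })
}
```

Taking the lookup as a parameter keeps the parsing testable without mutating the process environment; calling `settings_from(|name| std::env::var(name).ok())` would read the real variables.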

## API

### `POST /embed`
2 changes: 1 addition & 1 deletion docker-compose.yml
@@ -8,7 +8,7 @@ services:
     image: embedding:latest
     container_name: embedding
     ports:
-      - "3000:3000"
+      - "${EMBEDDING_PORT:-3000}:${EMBEDDING_PORT:-3000}"
     env_file:
       - .env
     volumes:
9 changes: 8 additions & 1 deletion src/main.rs
@@ -82,7 +82,14 @@ async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {

     let app = Router::new().route("/embed", post(embed)).with_state(state);

-    let addr = std::env::var("BIND_ADDR").unwrap_or_else(|_| "0.0.0.0:3000".to_string());
+    let port_str = std::env::var("EMBEDDING_PORT").unwrap_or_else(|_| "3000".to_string());
+    let port: u16 = port_str.parse().map_err(|_| {
+        format!(
+            "EMBEDDING_PORT '{}' is not a valid port number (1-65535)",
+            port_str
+        )
+    })?;
+    let addr = format!("0.0.0.0:{}", port);
     let listener = tokio::net::TcpListener::bind(&addr).await?;
     tracing::info!("listening on {}", addr);
     axum::serve(listener, app)
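
One detail in the `src/main.rs` hunk: parsing into `u16` accepts `0`, while the error message advertises the range 1-65535. Below is a minimal sketch of one way to make the check match the message, assuming that rejecting port 0 is the intended behavior (the PR does not say so). `parse_port` is a hypothetical helper and is not part of the PR.

```rust
use std::num::NonZeroU16;

/// Parse a port string, rejecting 0 so that the accepted range is 1-65535.
/// `NonZeroU16` implements `FromStr` and fails on "0", on values above
/// 65535, and on non-numeric input.
fn parse_port(raw: &str) -> Result<u16, String> {
    raw.parse::<NonZeroU16>()
        .map(NonZeroU16::get)
        .map_err(|_| {
            format!(
                "EMBEDDING_PORT '{}' is not a valid port number (1-65535)",
                raw
            )
        })
}
```

With this helper, the hunk's parsing step could become `let port = parse_port(&port_str)?;`. If binding to port 0 (an OS-assigned ephemeral port) is actually desired, the plain `u16` parse is the right choice and only the message's stated range would need adjusting.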