95 changes: 95 additions & 0 deletions DEPLOY_MODEL.md
@@ -0,0 +1,95 @@
# Deploying a Trained Model to ClimateVision

End-to-end checklist for taking a model from training (Colab) to production on
**climatevision.green**. Do **flooding** first; deforestation and ice melting follow the
same steps with their own datasets.

---

## A. Train (Google Colab, GPU runtime)

Use `notebooks/flood_training_colab.ipynb` (Run all), or run the steps manually:

1. **Setup** — clone the repo, `pip install -r requirements.txt`, `pip install -e .`, confirm `torch.cuda.is_available()`.
2. **Download data** — `gcloud storage cp --recursive gs://sen1floods11/v1.1/data/flood_events/HandLabeled/{S2Hand,LabelHand}` and `.../splits` into `data/sen1floods11/`.
3. **Convert** —
```bash
python scripts/prepare_sen1floods11.py \
--s2-dir data/sen1floods11/S2Hand \
--label-dir data/sen1floods11/LabelHand \
--splits-dir data/sen1floods11/splits/flood_handlabeled \
--out-dir data/datasets/flooding
```
4. **Train** —
```bash
python scripts/train_real.py --analysis-type flooding \
--data-dir data/datasets/flooding \
--epochs 50 --batch-size 8 --image-size 256 --out models
```
Watch `val_iou` rise. Output: `models/flooding_<date>/best_model.pth`.
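
The `<run>` placeholder in the later steps refers to the run directory that training just wrote. The sketch below is one way to find it from Python. `latest_checkpoint` is a hypothetical helper, not part of the repo, and it assumes the `<date>` suffix sorts lexicographically in chronological order (the Colab notebook relies on the same assumption when it sorts its glob results).

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(models_dir: str, analysis_type: str = "flooding") -> Optional[Path]:
    """Return the newest `<type>_<date>/best_model.pth` under models_dir, or None.

    Assumption: the <date> suffix sorts lexicographically in chronological
    order (for example YYYYMMDD), so the last sorted match is the newest run.
    """
    pattern = f"{analysis_type}_*/best_model.pth"
    matches = sorted(Path(models_dir).glob(pattern))
    return matches[-1] if matches else None
```

Calling `latest_checkpoint("models")` after training returns the checkpoint path to pass as `--checkpoint`, or `None` if no run has finished yet.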

## B. Validate before promoting

5. **Evaluate** — `python scripts/evaluate.py --checkpoint <run>/best_model.pth --data-dir data/datasets/flooding`
6. **Governance gate** — `python scripts/governance_ci_gate.py`. **Only promote a model that passes.** This is the line between a "preview" and something an agency can act on.
7. **Model card** — `python scripts/generate_model_card.py --checkpoint <run>/best_model.pth` (records metrics + provenance for the audit trail).
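
The three validation steps above can be chained so that a failing step blocks the ones after it. This is a minimal sketch and not an existing script in the repo: `validation_commands` and `validate` are hypothetical names, and it assumes each script signals failure with a non-zero exit code, which the source does not confirm.

```python
import subprocess
import sys
from typing import Callable, List, Sequence


def validation_commands(checkpoint: str, data_dir: str) -> List[List[str]]:
    """Build the three section-B commands in the order they should run."""
    py = sys.executable
    return [
        [py, "scripts/evaluate.py", "--checkpoint", checkpoint, "--data-dir", data_dir],
        [py, "scripts/governance_ci_gate.py"],
        [py, "scripts/generate_model_card.py", "--checkpoint", checkpoint],
    ]


def _run(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def validate(checkpoint: str, data_dir: str,
             run: Callable[[Sequence[str]], int] = _run) -> bool:
    """Run each step in order and stop at the first non-zero exit code.

    Assumption: the scripts exit non-zero on failure.
    """
    for cmd in validation_commands(checkpoint, data_dir):
        if run(cmd) != 0:
            print("FAILED:", " ".join(cmd))
            return False
    return True
```

The `run` parameter exists only so the sequence can be dry-run or tested without invoking the real scripts. With this ordering, no model card is generated for a model that fails the governance gate.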

## C. Export

8. **ONNX** — `python scripts/export_model.py --checkpoint <run>/best_model.pth`
produces `<run>/model.onnx`, a quantized variant, and `export_info.json`.
The API serves the exported model automatically: `inference/pipeline.py` loads a `.pth`
checkpoint if one is present and otherwise falls back to `models/<type>_*/model.onnx` via onnxruntime.
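
The lookup order described above (a `.pth` first, ONNX as the fallback) can be illustrated as follows. This is a hypothetical sketch, not the code in `inference/pipeline.py`, which may differ. It assumes the checkpoint is named `best_model.pth` and sits in the same `<type>_*` run directory as `model.onnx`.

```python
from pathlib import Path
from typing import Optional


def resolve_model_path(models_dir: str, analysis_type: str) -> Optional[Path]:
    """Pick the artifact to serve: a .pth checkpoint if present, else ONNX.

    Hypothetical sketch; the real logic lives in inference/pipeline.py.
    Assumes checkpoints are named best_model.pth inside the same
    `<type>_*` run directories that hold model.onnx.
    """
    root = Path(models_dir)
    # Try the preferred format first, then fall back to the next one.
    for filename in ("best_model.pth", "model.onnx"):
        matches = sorted(root.glob(f"{analysis_type}_*/{filename}"))
        if matches:
            return matches[-1]  # newest run, if names sort chronologically
    return None
```

One consequence of this order is that committing only `model.onnx` makes the API use onnxruntime, while committing `best_model.pth` as well switches it to the PyTorch checkpoint.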

## D. Get the model into the repo / image

Weights are **not** kept in git history by default. The ONNX run directories are the
exception: they are small (a few MB) and **are** committed so that Render, which builds
from GitHub, can ship them.

9. Download `best_model.pth` and `model.onnx` from Colab.
10. Place them on your laptop under `models/flooding_<date>/`.
11. Commit + push:
```bash
git add models/flooding_<date>/
git commit -m "feat(models): trained Sen1Floods11 flood model (val_iou=<X>)"
git push origin main
```

> Alternative (large weights): instead of committing, put the files on the Fly
> volume at `/app/outputs` and point `config.yaml` `weights:` at them.
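
Before committing, it can help to record each artifact's size and checksum, both to confirm the run directory really is only a few MB and to have a hash for the audit trail. The helper below is a hypothetical sketch, not something the repo ships. The 100 MiB constant reflects GitHub's per-file push limit.

```python
import hashlib
from pathlib import Path
from typing import Dict, List

# GitHub rejects pushes that contain a file larger than 100 MiB.
GITHUB_FILE_LIMIT_BYTES = 100 * 1024 * 1024


def artifact_manifest(run_dir: str) -> List[Dict[str, object]]:
    """List each file in run_dir with its size and SHA-256 digest.

    Files are read fully into memory, which is fine for artifacts of a
    few MB; hash in chunks instead for very large weights.
    """
    manifest = []
    for path in sorted(Path(run_dir).iterdir()):
        if not path.is_file():
            continue
        size = path.stat().st_size
        manifest.append({
            "name": path.name,
            "bytes": size,
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            "too_large_for_github": size > GITHUB_FILE_LIMIT_BYTES,
        })
    return manifest
```

Any entry flagged `too_large_for_github` is a candidate for the volume-based alternative instead of a commit.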

## E. Deploy

12. The push triggers a Render rebuild (or click **Sync** on the blueprint).
13. Confirm the secrets are set (this only needs doing once) with `render env list climatevision-green`:
`GEE_SERVICE_ACCOUNT_KEY_JSON`, `GEE_PROJECT_ID`, `CLIMATEVISION_ALLOW_DEV_KEY=0`, and
`CLIMATEVISION_CORS_ORIGINS`, which must include `https://climatevision.green`.
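
The settings listed in step 13 can also be checked programmatically, for example in a startup hook. The sketch below is hypothetical: `check_deploy_env` is not an existing function, and it assumes `CLIMATEVISION_CORS_ORIGINS` holds a comma-separated list, which the source does not specify.

```python
from typing import List, Mapping

REQUIRED_ORIGIN = "https://climatevision.green"


def check_deploy_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the settings look right."""
    problems = []
    for name in ("GEE_SERVICE_ACCOUNT_KEY_JSON", "GEE_PROJECT_ID"):
        if not env.get(name):
            problems.append(f"{name} is missing or empty")
    if env.get("CLIMATEVISION_ALLOW_DEV_KEY") != "0":
        problems.append("CLIMATEVISION_ALLOW_DEV_KEY must be 0 in production")
    # Assumption: origins are stored as a comma-separated list.
    origins = [o.strip() for o in env.get("CLIMATEVISION_CORS_ORIGINS", "").split(",")]
    if REQUIRED_ORIGIN not in origins:
        problems.append(f"CLIMATEVISION_CORS_ORIGINS must include {REQUIRED_ORIGIN}")
    return problems
```

Passing `os.environ` checks the current process. An unset `CLIMATEVISION_ALLOW_DEV_KEY` is reported as a problem here on purpose, so that the dev key is disabled explicitly and not by default.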

## F. Verify it's real

14. Health + model check:
```bash
curl -s https://climatevision.green/api/health | jq
curl -s https://climatevision.green/api/health/models | jq
```
15. Confirm auth is locked down (cv_dev must be rejected):
```bash
curl -s -o /dev/null -w "%{http_code}\n" -H "X-API-Key: cv_dev" https://climatevision.green/api/runs # expect 401
```
16. When `/api/health/models` reports a real loaded model (not demo/untrained),
**remove the "technical preview" label** from the UI/API — you are now
serving genuine predictions.
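
For an automated post-deploy check, the same requests can be made from Python using only the standard library. This is a minimal sketch with hypothetical helper names (`http_status`, `dev_key_rejected`). It checks status codes only, because the JSON schema of `/api/health/models` is not documented here.

```python
import urllib.error
import urllib.request
from typing import Dict, Optional


def http_status(url: str, headers: Optional[Dict[str, str]] = None,
                timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request, including 4xx and 5xx."""
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return err.code


def dev_key_rejected(base_url: str, timeout: float = 10.0) -> bool:
    """True if the API answers 401 to the development key `cv_dev`."""
    status = http_status(f"{base_url}/api/runs", {"X-API-Key": "cv_dev"}, timeout)
    return status == 401
```

For example, `http_status("https://climatevision.green/api/health")` should return 200 on a healthy deployment, and `dev_key_rejected("https://climatevision.green")` should be `True` once the dev key is disabled.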

---

## Per-type status

| Analysis type | Dataset | Status |
|---------------|---------|--------|
| flooding | Sen1Floods11 | first target — follow this doc |
| deforestation | MultiEarth Amazon | same steps, `--analysis-type deforestation` |
| ice_melting | AI4Arctic v2 | same steps, `--analysis-type ice_melting` |

Keep each model's `provenance.json`, model card, and metrics — that audit trail
is what makes the platform credible for NGO and government use.
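
Because the other analysis types reuse the same steps, the training command can be built from the analysis type alone. The sketch below is illustrative. The flags mirror the flooding command in section A, while the `data/datasets/<analysis_type>` layout for deforestation and ice melting is an assumption extrapolated from the flooding path. `train_command` is a hypothetical helper.

```python
from typing import Dict, List

# Analysis types and datasets as listed in the status table.
DATASETS: Dict[str, str] = {
    "flooding": "Sen1Floods11",
    "deforestation": "MultiEarth Amazon",
    "ice_melting": "AI4Arctic v2",
}


def train_command(analysis_type: str, epochs: int = 50, batch_size: int = 8,
                  image_size: int = 256, out_dir: str = "models") -> List[str]:
    """Build the train_real.py invocation for one analysis type.

    Assumption: every type keeps its converted data under
    data/datasets/<analysis_type>, as flooding does.
    """
    if analysis_type not in DATASETS:
        raise ValueError(f"unknown analysis type: {analysis_type!r}")
    return [
        "python", "scripts/train_real.py",
        "--analysis-type", analysis_type,
        "--data-dir", f"data/datasets/{analysis_type}",
        "--epochs", str(epochs),
        "--batch-size", str(batch_size),
        "--image-size", str(image_size),
        "--out", out_dir,
    ]
```

`" ".join(train_command("flooding"))` reproduces the command from step 4. Unknown types raise a `ValueError` so that a typo cannot start a training run against a missing dataset.
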
252 changes: 252 additions & 0 deletions notebooks/flood_training_colab.ipynb
@@ -0,0 +1,252 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ClimateVision \u2014 Flood Model Training (Sen1Floods11)\n",
"\n",
"Run top to bottom on a **GPU runtime** (Runtime \u2192 Change runtime type \u2192 GPU).\n",
"This trains a real flood-detection U-Net and exports it for deployment.\n",
"\n",
"**Prerequisite:** the latest scripts must be on `main` (push from your laptop first):\n",
"`download_datasets.py`, `prepare_sen1floods11.py`, `train_real.py`, and the `dataset.py` change.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Setup \u2014 clone repo, install, check GPU\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"!git clone https://github.com/Climate-Vision/ClimateVision.git\n",
"%cd ClimateVision\n",
"!git pull origin main # make sure newest scripts are present\n",
"!pip install -q -r requirements.txt\n",
"!pip install -q -e .\n",
"import torch; print('CUDA:', torch.cuda.is_available(), '| GPU:', torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'NONE')\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# Sanity check the scripts exist (fail early if not pushed yet)\n",
"import os\n",
"need = ['scripts/prepare_sen1floods11.py','scripts/train_real.py','scripts/export_model.py']\n",
"missing = [p for p in need if not os.path.exists(p)]\n",
"assert not missing, f'Missing (push these first): {missing}'\n",
"print('All scripts present.')\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Persist to Google Drive (so checkpoints survive a runtime reset)\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"from google.colab import drive\n",
"drive.mount('/content/drive')\n",
"!mkdir -p /content/drive/MyDrive/climatevision/models\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Authenticate to Google Cloud (for the Sen1Floods11 bucket)\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"from google.colab import auth\n",
"auth.authenticate_user()\n",
"PROJECT = 'kinos-473422' # your GCP/GEE project\n",
"!gcloud config set project {PROJECT}\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Download Sen1Floods11 hand-labeled data (~hundreds of chips)\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"!mkdir -p data/sen1floods11\n",
"!gcloud storage cp --recursive gs://sen1floods11/v1.1/data/flood_events/HandLabeled/S2Hand data/sen1floods11/\n",
"!gcloud storage cp --recursive gs://sen1floods11/v1.1/data/flood_events/HandLabeled/LabelHand data/sen1floods11/\n",
"!gcloud storage cp --recursive gs://sen1floods11/v1.1/splits data/sen1floods11/\n",
"print('S2:', len(os.listdir('data/sen1floods11/S2Hand')), '| Labels:', len(os.listdir('data/sen1floods11/LabelHand')))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Convert to the ClimateVision training layout\n",
"Extracts Sentinel-2 bands B03, B08, and B11 and pairs each chip with its label mask. Add `--jrc-dir` only for the 3-class variant.\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"!python scripts/prepare_sen1floods11.py \\\n",
" --s2-dir data/sen1floods11/S2Hand \\\n",
" --label-dir data/sen1floods11/LabelHand \\\n",
" --splits-dir data/sen1floods11/splits \\\n",
" --out-dir data/datasets/flooding\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Train\n",
"Watch `val_iou` climb. Early stopping is automatic. ~1\u20133h depending on GPU.\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"!python scripts/train_real.py --analysis-type flooding \\\n",
" --data-dir data/datasets/flooding \\\n",
" --epochs 50 --batch-size 8 --image-size 256 \\\n",
" --out /content/drive/MyDrive/climatevision/models\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Locate the best checkpoint\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"import glob\n",
"runs = sorted(glob.glob('/content/drive/MyDrive/climatevision/models/flooding_*/best_model.pth'))\n",
"assert runs, 'No checkpoint found \u2014 did training finish?'\n",
"CKPT = runs[-1]; print('Best checkpoint:', CKPT)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 8. Evaluate + governance gate (promote only if it passes)\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"!python scripts/evaluate.py --checkpoint \"$CKPT\" --data-dir data/datasets/flooding || true\n",
"!python scripts/governance_ci_gate.py || true\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 9. Export to ONNX (what the API serves)\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"!python scripts/export_model.py --checkpoint \"$CKPT\"\n",
"import os; d=os.path.dirname(CKPT); print('Artifacts in', d, '->', os.listdir(d))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 10. Download the model, then deploy\n",
"\n",
"Download `best_model.pth` and `model.onnx` from the run folder above, then on your laptop:\n",
"```bash\n",
"cp <downloaded>/* models/flooding_<date>/\n",
"git add models/flooding_<date>/\n",
"git commit -m 'feat(models): trained Sen1Floods11 flood model'\n",
"git push origin main # triggers Render rebuild\n",
"```\n",
"Verify it's live, then drop the preview label:\n",
"```bash\n",
"curl -s https://climatevision.green/api/health/models | jq\n",
"```\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"from google.colab import files\n",
"import os\n",
"d = os.path.dirname(CKPT)\n",
"for f in ['best_model.pth', 'model.onnx']:\n",
"    p = os.path.join(d, f)\n",
"    if os.path.exists(p):\n",
"        files.download(p)\n"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}