feat(plugins): add Replicator plugin to mirror external data #250

Open
ThaiTrevor wants to merge 2 commits into outerbase:main from ThaiTrevor:fix/issue-72-starbasedb-replicate-data-from-external

ThaiTrevor commented May 24, 2026

Purpose

Closes #72.

Adds a new ReplicatorPlugin under plugins/replicator/ that pulls rows from an external database (Postgres, MySQL, Cloudflare D1, Turso, or another StarbaseDB) into the StarbaseDB internal SQLite store. Each replication pass is driven by a per-table watermark column (e.g. updated_at or a monotonic id) so only rows that changed since the previous run are transferred.

What changed

  • plugins/replicator/index.ts — new ReplicatorPlugin extending StarbasePlugin. Creates tmp_replication_state(table_name, last_value, last_synced_at) on registration, exposes sync() and POST /replicator/sync (admin-only), and upserts rows via INSERT ... ON CONFLICT(primaryKey) DO UPDATE.
  • Watermark tracking compares values numerically when both sides parse as numbers, so a monotonic integer id column avoids the lexicographic trap (as strings, "99" sorts after "100"). Otherwise it falls back to string comparison, which orders ISO 8601 timestamps correctly as long as they share one format and timezone.
  • Identifiers (name, watermarkColumn, primaryKey, destTable) are validated against [A-Za-z_][A-Za-z0-9_]* at construction time and quoted in the external SELECT using dialect-appropriate quoting (backticks for MySQL, double quotes elsewhere).
  • plugins/replicator/index.test.ts — vitest suite covering constructor validation, identifier validation, state-table creation, initial pull, watermark-bounded pulls, dest-table override, MySQL dialect quoting, and numeric-watermark ordering.
  • plugins/replicator/README.md — usage, configuration, Cron-plugin scheduling snippet, destination-table DDL template.
  • plugins/replicator/meta.json — registry metadata to match the other plugins.
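
The identifier handling described above can be sketched roughly as follows. This is an illustration only, not the code in plugins/replicator/index.ts: the names Dialect, assertIdentifier, and quoteIdentifier are hypothetical, and only the regular expression and the backtick/double-quote rule come from the description.

```typescript
// Hypothetical helper names; only the pattern and the quoting rule are taken
// from the PR description.
type Dialect = "postgres" | "mysql" | "d1" | "turso" | "starbasedb";

// Identifiers must start with a letter or underscore, followed by
// letters, digits, or underscores.
const IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Throws if the value is not a plain identifier; returns it unchanged otherwise.
function assertIdentifier(value: string, field: string): string {
  if (!IDENTIFIER.test(value)) {
    throw new Error("Invalid identifier for " + field + ": " + JSON.stringify(value));
  }
  return value;
}

// Quotes a validated identifier: backticks for MySQL, double quotes elsewhere.
function quoteIdentifier(value: string, dialect: Dialect): string {
  const safe = assertIdentifier(value, "identifier");
  return dialect === "mysql" ? "`" + safe + "`" : '"' + safe + '"';
}
```

Validating before quoting means the quote character itself can never appear inside an identifier, so no escaping step is needed in this sketch.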

How it works

  1. On registration the plugin creates tmp_replication_state(table_name, last_value, last_synced_at).
  2. Each sync() call reads the stored watermark per table, runs SELECT * FROM "<table>" WHERE "<watermarkColumn>" > ? ORDER BY "<watermarkColumn>" ASC LIMIT <batchSize> against the external source, and upserts each row into the internal store using ON CONFLICT(<primaryKey>) DO UPDATE.
  3. After the batch, the highest watermark seen (numeric or lexicographic depending on the value type) becomes the new stored last_value.
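
The three steps above can be sketched as a set of small helpers. This is a minimal illustration under stated assumptions, not the plugin's actual implementation: the function names (compareWatermarks, maxWatermark, buildSelectSql, buildUpsertSql) are hypothetical, identifiers are assumed to have been validated already, and the use of SQLite's `excluded.` alias in the upsert is an assumption about how the update clause might be written.

```typescript
// All names here are hypothetical; the SQL shapes follow the PR description.
type Watermark = string | number;
type Row = Record<string, unknown>;

// Numeric comparison when both sides parse as finite numbers, string
// comparison otherwise. The empty string is treated as non-numeric.
function compareWatermarks(a: Watermark, b: Watermark): number {
  const sa = String(a);
  const sb = String(b);
  const na = Number(sa);
  const nb = Number(sb);
  const bothNumeric =
    sa.trim() !== "" && sb.trim() !== "" && Number.isFinite(na) && Number.isFinite(nb);
  if (bothNumeric) return na - nb;
  return sa < sb ? -1 : sa > sb ? 1 : 0;
}

// Step 3: the highest watermark seen in the batch becomes the new last_value.
function maxWatermark(rows: Row[], column: string, current: Watermark): Watermark {
  let max = current;
  for (const row of rows) {
    const value = row[column];
    if (typeof value !== "string" && typeof value !== "number") continue;
    if (compareWatermarks(value, max) > 0) max = value;
  }
  return max;
}

// Step 2a: watermark-bounded SELECT against the external source.
// `quote` applies the dialect's identifier quoting.
function buildSelectSql(
  table: string,
  watermarkColumn: string,
  batchSize: number,
  quote: (id: string) => string
): string {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new Error("batchSize must be a positive integer");
  }
  const t = quote(table);
  const w = quote(watermarkColumn);
  return `SELECT * FROM ${t} WHERE ${w} > ? ORDER BY ${w} ASC LIMIT ${batchSize}`;
}

// Step 2b: upsert into the internal SQLite store (double-quoted identifiers).
function buildUpsertSql(destTable: string, primaryKey: string, columns: string[]): string {
  const q = (id: string) => `"${id}"`;
  const cols = columns.map(q).join(", ");
  const placeholders = columns.map(() => "?").join(", ");
  const updates = columns
    .filter((c) => c !== primaryKey)
    .map((c) => `${q(c)} = excluded.${q(c)}`)
    .join(", ");
  const action = updates ? `DO UPDATE SET ${updates}` : "DO NOTHING";
  return `INSERT INTO ${q(destTable)} (${cols}) VALUES (${placeholders}) ON CONFLICT(${q(primaryKey)}) ${action}`;
}
```

In this sketch the watermark only advances after a batch has been processed, so a pass that fails midway would re-read the same rows on the next run; the upsert makes that replay harmless.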

Scheduling is delegated to the existing Cron plugin — the README shows the snippet.

Demo

✓ plugins/replicator/index.test.ts (11 tests)
  ✓ throws if tables is empty
  ✓ throws if watermarkColumn is missing
  ✓ throws on invalid identifier characters in name
  ✓ throws on invalid identifier characters in watermarkColumn
  ✓ creates state table on register
  ✓ pulls all rows on initial sync (watermark = "")
  ✓ pulls only new rows after watermark update
  ✓ respects destTable override
  ✓ uses backtick quoting for mysql dialect
  ✓ uses double-quote quoting for non-mysql dialects
  ✓ compares numeric watermarks numerically, not lexicographically

Test Files  1 passed (1)
Tests       11 passed (11)

A full test run also reports 4 failures in src/rls/index.test.ts. These already exist on main (confirmed by running that suite against main directly) and are unrelated to this change.

Tasks

  • Implement the plugin
  • Add unit tests (11/11 passing)
  • Document usage, scheduling, and destination-table bootstrapping
  • Validate identifiers and quote them in the external SELECT
  • Compare numeric watermarks numerically

Before

  • Branch contains exactly one commit, scoped to plugins/replicator/*.
  • No edits to unrelated files.

/claim #72

…nal store

Adds a new StarbasePlugin under plugins/replicator that pulls rows from a
configured external database (Postgres, MySQL, D1, Turso, StarbaseDB) into
StarbaseDB's internal SQLite store using a per-table watermark column.
Exposes POST /replicator/sync (admin-only) and a public sync() method that
can be wired into the Cron plugin or any external scheduler.

Closes outerbase#72

@ThaiTrevor (Author)

Quick check-in — anything I can adjust to make this easier to review? Happy to break it down, add tests, or revise the approach. Thanks!

PTHAICAP commented Jun 6, 2026

Quick check-in — anything I can adjust to make this easier to review? Happy to break it down, add tests, or revise the approach. Thanks!

PTHAICAP commented Jun 9, 2026

Quick check-in — anything I can adjust to make this easier to review? Happy to break it down, add tests, or revise the approach. Thanks!

@PTHAICAP

Quick check-in — anything I can adjust to make this easier to review? Happy to break it down, add tests, or revise the approach. Thanks!

Successfully merging this pull request may close these issues.

Replicate data from external source to internal source with a Plugin
