Connector activation pending

The Fivetran MongoDB → BigQuery connector is not yet provisioned in this environment. Until it is, every live tile below renders "—". No hardcoded numbers are shown; the tiles populate from real data as soon as the connector goes live.

Once activated, sync metrics, schema progress, and freshness appear here automatically.

Vendor integration

Fivetran + BigQuery

Managed Mongo→BQ sync · analytical warehouse for the trade-decision corpus

Status

● Live

production · real data flowing

— · — rows in BigQuery · last sync: — · BQML: —


Capability matrix

What we use, what's gated, what's planned

Fully managed replication of the live MongoDB corpus into BigQuery on a six-hour cadence: trade decisions, sentiment summaries, and agent memory land in the warehouse with full Fivetran lineage. The operator gets SQL-grade analytics over the trading record without writing or maintaining a single line of ETL.

Coming next: Scheduled queries materialise the daily win-rate dashboards and per-strategy attribution cuts directly in BigQuery; dbt models layer in for governed transforms.

Capability	Status	Phase	Why / How / Note
BigQuery SELECT/WITH safety whitelist	LIVE	Phase 1.5	bigquery_reader.py rejects any non-read statement; no DML, no DDL, ever, from agent or web surface. ✓ 26 tests
100MB scan cap per query	LIVE	Phase 1.5	dryRun gate aborts queries above 100MB before execution — prevents accidental $5/TB blowouts.
/data-pipeline operator surface	LIVE	Phase 1.5	F4 page: connector inventory + curated SQL examples + safe query box. Live React surface today.
Fivetran REST client wrapper	LIVE	Phase 1.5	5 ADK tools on the data_pipeline_manager agent (list/inspect/sync/schema/log). LIVE 2026-05-18 — System Key authenticates; list_connectors round-trip verified.
Local Fivetran MCP shim	LIVE	Phase 1.5	fivetran_mcp.py exposes read-only Fivetran introspection over MCP — no public npm package as of 2026-05-08. LIVE 2026-05-18 — McpToolset mounted on data_pipeline_manager; agent.py:432 spawns the local stdio shim.
MongoDB → BigQuery managed replication	LIVE	Phase 1.5.X	Fivetran MongoDB connector replicates 9 collections to mongo_sentinelhub BigQuery dataset. LIVE 2026-05-18 — chosen_stung connector synced 4,801 rows initial load (trade_decisions 4,372 · agent_memory 162 · social_sentiment_raw 126 · market_snapshots 96 · sentiment_summaries 31 · others).
Schema evolution auto-handling	LIVE	Phase 1.5.X	Fivetran auto-evolves the BQ schema when Mongo collection shape changes — no hand-rolled migration scripts. LIVE 2026-05-18 — Fivetran auto-created the 9-table schema in mongo_sentinelhub on first sync.
BQ ML logistic regression on outcomes	PLANNED	Phase 1.5.S (S5)	In-warehouse model on trade_decisions JOIN trade_outcomes — replaces Python calibration job for Phase 1. Gated on connector activation; one-day spike once data lands in BQ.
Connector groups + authorized views per tenant APP-012	PLANNED	Phase 2.5	One Fivetran group per tenant → one BQ dataset per tenant → row-level security inherited. Multi-tenant primitive. Lowest-effort multi-tenant warehouse pattern; required before tenant #2.
Materialized views on /insights aggregates	PLANNED	Phase 2.5	Pre-aggregate rolling win-rate + per-strategy P&L. BQ auto-refreshes; sub-second tenant dashboards at zero query cost.
dbt Cloud trigger on sync-complete	PLANNED	Phase 2.5	Fivetran sync-complete webhook fires the dbt project — retires Cloud Scheduler for analytical aggregations.
Time-travel `FOR SYSTEM_TIME AS OF`	PLANNED	Phase 3	Point-in-time backtests + FCA Phase-3 audit lineage; BQ keeps up to 7 days of history natively (the time-travel window is configurable from 2 to 7 days), so longer lookbacks need table snapshots. FCA evidence requirement: 'what did we know on date X'.
Column blocking + hashing for PII compliance APP-015	PLANNED	Phase 2.5	Per-column directives at the Fivetran layer (block / hash / encrypt) — required for tenant emails landing in BQ.
BigQuery Vector Search (Atlas alternative)	PLANNED	Phase 3	VECTOR_SEARCH() GA from 2024 — keep embeddings + decisions in one warehouse, no Mongo round-trip. Strategic alternative to Atlas Vector Search; would decommission Atlas index in trade for warehouse consolidation.
Custom Connector SDK (broker statement PDFs)	UNDERUSED	Phase 2	First-party SentinelHub→Alpaca connector lands fills + positions directly in BQ — no Mongo stop-over. Hackathon $10K stretch track explicitly rewards novel connector use.
BigLake external tables	UNDERUSED	Phase 3	Federate raw bar archives in GCS as Parquet — BQ queries them in place, no load cost.
BI Engine in-memory acceleration	UNDERUSED	Phase 2.5	Sub-second response on repeated dashboard queries; free up to 1GB reservation.
Capacity-based pricing decision	UNDERUSED	Phase 3	On-demand $5/TB is fine pre-revenue; capacity slots cheaper past ~10TB/month. Cost-optimisation thread; not a code change.
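
To illustrate the SELECT/WITH whitelist row in the matrix above, here is a minimal sketch of a read-only statement check. It is not the contents of bigquery_reader.py: the function name is_read_only, the keyword lists, and the rule of rejecting any embedded semicolon or unterminated quote are all assumptions made for the example. The check is deliberately conservative, so it may reject some legitimate reads (for example an unquoted column named update).

```python
import re

# Hypothetical names throughout; this is a sketch, not the real bigquery_reader.py.
_ALLOWED_FIRST = {"SELECT", "WITH"}
_FORBIDDEN = {
    "INSERT", "UPDATE", "DELETE", "MERGE", "TRUNCATE",
    "CREATE", "ALTER", "DROP", "GRANT", "REVOKE", "CALL", "EXPORT", "LOAD",
}

# Strings, quoted identifiers and comments are matched in ONE left-to-right pass,
# so a "--" inside a string literal is not mistaken for a comment.
_NOISE = re.compile(
    "|".join([
        r"'(?:[^'\\]|\\.)*'",   # single-quoted string
        r'"(?:[^"\\]|\\.)*"',   # double-quoted string
        r"`[^`]*`",             # backtick-quoted identifier
        r"--[^\n]*",            # line comment
        r"#[^\n]*",             # line comment (# style)
        r"/\*.*?\*/",           # block comment
    ]),
    re.DOTALL,
)


def is_read_only(sql: str) -> bool:
    """Return True only for a single statement that starts with SELECT or WITH."""
    stripped = _NOISE.sub(" ", sql).strip()
    if stripped.endswith(";"):
        stripped = stripped[:-1].rstrip()   # allow one trailing semicolon
    # Leftover quotes mean an unterminated literal; a leftover ';' means a script.
    if not stripped or any(ch in stripped for ch in "'\"`;"):
        return False
    words = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", stripped.upper())
    if not words or words[0] not in _ALLOWED_FIRST:
        return False
    return not (_FORBIDDEN & set(words))
```

Failing closed on anything ambiguous keeps the guard simple: a query that trips it can be rewritten, whereas a false accept cannot be undone.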

Live data

Source: fivetranApi.capabilities (loading) · As of 17:20:59

Real-time from the Fivetran + BigQuery backend

Every tile below is a live read from the vendor backend via the FastAPI BFF. If a tile shows "—" the backend is unreachable or the metric is not yet wired (no hardcoded numbers — see anti-pattern #2).
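
The placeholder rule above can be sketched as a small formatting helper. This is an assumption about how a tile value might be rendered, not the BFF's actual code; render_tile and PLACEHOLDER are hypothetical names. The point it illustrates is that a missing metric becomes the placeholder, while a genuine zero is still shown as a number.

```python
from typing import Optional, Union

PLACEHOLDER = "—"  # the dash the page shows when a metric is unavailable


def render_tile(value: Optional[Union[int, float, str]]) -> str:
    """Format a metric for display; never substitute a made-up number."""
    if value is None:
        return PLACEHOLDER              # backend unreachable or metric not wired
    if isinstance(value, bool):
        raise TypeError("booleans are not tile metrics")
    if isinstance(value, int):
        return f"{value:,}"             # thousands separators for row counts
    if isinstance(value, float):
        return f"{value:,.1f}"
    return value if value.strip() else PLACEHOLDER
```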

Connectors (active/total)

—

Rows in trading_warehouse

—

Worst sync age

—

BQ ML model status

—

Roadmap commitments

Roadmap dependencies

Capabilities enabled by this integration — what is built, what is gated, and why.

APP-012 · Phase 2.5.1

Multi-tenant data partitioning — BQ shared dataset, tenant_id partitioned + clustered, authorized views

Connector groups per tenant + authorized views inherit row-level security; cheaper than per-tenant datasets at small scale.

APP-015 · Phase 2.5

GDPR DSR cascade (Article 17 right-to-erasure) — BQ is one of 5 backends

Fivetran column blocking + BQ tenant_id-filtered DELETE on DSR; cascades from operator console.

2.0.1.6 · Phase 2.0

bigquery_reader.py — make tenant_id parameter required, fail-closed

Today the reader is single-tenant implicit; Phase 2 retrofit makes tenant_id a required argument so a missing scope rejects rather than reads global.
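
A fail-closed tenant scope might look like the sketch below. It assumes the inner query exposes a tenant_id column and that scoping is done by wrapping the query with a named query parameter; build_scoped_query and MissingTenantScope are hypothetical names, and the planned retrofit may take a different shape.

```python
from typing import Dict, Tuple


class MissingTenantScope(ValueError):
    """Raised when a read is attempted without a usable tenant scope."""


def build_scoped_query(sql: str, *, tenant_id: str) -> Tuple[str, Dict[str, str]]:
    """Wrap a read query so it can only return rows for one tenant.

    tenant_id is keyword-only with no default, so omitting it is a TypeError
    at the call site rather than a silent global read.
    """
    if not isinstance(tenant_id, str) or not tenant_id.strip():
        raise MissingTenantScope("tenant_id is required; refusing to read unscoped")
    inner = sql.strip().rstrip(";").rstrip()
    scoped = (
        "SELECT * FROM (\n"
        f"{inner}\n"
        ") AS scoped WHERE scoped.tenant_id = @tenant_id"
    )
    # The value travels as a named query parameter, never by string interpolation.
    return scoped, {"tenant_id": tenant_id.strip()}
```

Passing the tenant as a parameter rather than formatting it into the SQL keeps the scope out of reach of injection through the tenant identifier itself.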

Demo flow

End-to-end showcase journey

Five steps a judge or investor can replay live. Each step links to the page that demonstrates it.

1. Open /data-pipeline. Connector list is empty; the surface clearly states 'Fivetran not configured' and points to operator next-action 1.5.X.3 (activate trial connector). → Open /data-pipeline
2. Type into the BigQuery query box: SELECT COUNT(*) FROM trading_warehouse.trade_decisions. Server-side allowlist accepts (SELECT-only); 100MB dry-run gate confirms scan size below cap. → Open /data-pipeline
3. (Phase 1.5.S S5) Run BQ ML logistic regression on trade_decisions JOIN trade_outcomes: the in-warehouse model trains in SQL and replaces the Python calibration job.
4. (Phase 2.5) Materialized view on /insights aggregates auto-refreshed by BQ, giving sub-second tenant dashboards at zero query cost.
5. (Phase 3) Time-travel point-in-time backtest via SELECT * FROM trade_decisions FOR SYSTEM_TIME AS OF '2026-04-01', for FCA-grade audit lineage without snapshot tables.
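
The 100MB dry-run gate from step 2 can be sketched as follows. The byte estimator is injected as a callable so the example runs without credentials; the names check_scan_cap, ScanCapExceeded and on_demand_cost_usd are hypothetical, "100MB" is assumed to mean 100 × 10^6 bytes, and the $5/TB rate is simply the figure quoted in the capability matrix (check current pricing before relying on it).

```python
from typing import Callable

MAX_SCAN_BYTES = 100 * 10**6      # assumption: "100MB" read as decimal megabytes
USD_PER_TB = 5.0                  # rate quoted in the capability matrix


class ScanCapExceeded(RuntimeError):
    """Raised when the dry-run estimate is above the cap; the query is not run."""


def check_scan_cap(
    sql: str,
    estimate_bytes: Callable[[str], int],
    cap: int = MAX_SCAN_BYTES,
) -> int:
    """Return the estimated scan size, or raise before anything executes."""
    estimated = estimate_bytes(sql)
    if estimated > cap:
        raise ScanCapExceeded(f"estimated {estimated} bytes exceeds cap of {cap}")
    return estimated


def on_demand_cost_usd(num_bytes: int, usd_per_tb: float = USD_PER_TB) -> float:
    """Worst-case on-demand cost for a scan of num_bytes."""
    return num_bytes / 10**12 * usd_per_tb


# With the google-cloud-bigquery client, an estimator could be written as:
#   cfg = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
#   estimate = lambda q: client.query(q, job_config=cfg).total_bytes_processed
```

Under these assumptions a query sitting exactly at the cap costs at most 100 × 10^6 / 10^12 × $5 = $0.0005, which is the sense in which the gate bounds per-query spend.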

What's next

Top-3 vendor-enabled capabilities coming soon

Sourced from the vendor's playbook. Each entry is mapped to its delivery phase and the value it unlocks.

BQ ML logistic regression on win/loss outcomes

Phase 1.5.S (S5)

Replaces Python calibration job; live BQ ML model is a strong hackathon demo artefact (gated on connector activation).
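
As a rough sketch of what the in-warehouse model could involve, the helper below builds a BigQuery ML CREATE MODEL statement for logistic regression over the two joined tables. Every identifier passed in (model name, join key, label and feature columns) is hypothetical, since the page does not give the schema. Note that CREATE MODEL is not a read statement, so under the whitelist described earlier it would have to be run by the operator outside the agent and web query surface.

```python
import re
from typing import Sequence

_COLUMN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_TABLE = re.compile(r"^[A-Za-z0-9_\-]+(\.[A-Za-z0-9_\-]+){1,2}$")


def build_logistic_model_sql(
    model: str,
    decisions_table: str,
    outcomes_table: str,
    join_key: str,
    label: str,
    features: Sequence[str],
) -> str:
    """Build a BQML logistic-regression training statement (illustrative only)."""
    if not features:
        raise ValueError("at least one feature column is required")
    for table in (model, decisions_table, outcomes_table):
        if not _TABLE.match(table):
            raise ValueError(f"unexpected table or model path: {table!r}")
    for column in (join_key, label, *features):
        if not _COLUMN.match(column):
            raise ValueError(f"unexpected column name: {column!r}")
    feature_list = ", ".join(f"d.{name}" for name in features)
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'LOGISTIC_REG', input_label_cols = ['{label}']) AS\n"
        f"SELECT {feature_list}, o.{label}\n"
        f"FROM `{decisions_table}` AS d\n"
        f"JOIN `{outcomes_table}` AS o USING ({join_key})"
    )
```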

Connector groups + authorized views per tenant

Phase 2.5

Multi-tenant infrastructure primitive; lowest-effort path to per-tenant warehouse isolation before tenant #2.

dbt Cloud trigger on sync-complete