
MongoDB Connector

Sync collections from MongoDB Atlas or self-managed MongoDB into your warehouse with cursor-based incremental sync and optional document packing.

Source · Community

Evaluating against Fivetran? See how Supaflow handles MongoDB pricing, connector quality, and Snowflake deployment side-by-side.

Supaflow vs Fivetran

Why Supaflow

All connectors included

Every connector is available on every plan. Pricing does not increase with connector count.

Pay for compute, not rows

Credit-based pricing. Usage scales with your pipelines, not with row counts.

One platform

Ingestion, dbt Core transformation, reverse ETL, and orchestration in a single workspace.

Capabilities

Connection string or split credentials

Paste a MongoDB Atlas connection string with embedded credentials, or keep the URI and the credentials in separate fields with a configurable Authentication Database. Passwords are stored encrypted, and the connector refuses to silently merge two sets of credentials.
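As a sketch of that rule, the helper below (a hypothetical function, not the connector's actual code) accepts either an embedded-credentials URI or split fields, and raises rather than merging both:

```python
from urllib.parse import urlsplit

def resolve_credentials(uri, username=None, password=None, auth_db="admin"):
    """Choose exactly one credential set for a MongoDB URI.

    Illustrative sketch: if the URI already embeds user:password,
    separately supplied fields are rejected instead of silently merged.
    """
    embedded = urlsplit(uri).username is not None
    if embedded and (username or password):
        raise ValueError("Credentials supplied both in the URI and in fields")
    if embedded:
        return {"uri": uri, "source": "uri"}
    return {
        "uri": uri,
        "source": "fields",
        "username": username,
        "password": password,
        "authSource": auth_db,  # the Authentication Database field
    }
```

The same check works for `mongodb://` and `mongodb+srv://` URIs, since the userinfo part of the netloc is scheme-independent.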

Per-collection cursor selection

Schema discovery samples documents per collection and surfaces every datetime root field as a cursor candidate. You pick the cursor in the pipeline wizard, so different collections can use different cursor fields and run independently.
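A minimal sketch of the candidate rule (field and document names are illustrative): a root field qualifies only if every sampled document that contains it holds a datetime value.

```python
from datetime import datetime

def cursor_candidates(sampled_docs):
    """Surface root fields that are datetimes in every sampled
    document that contains them (illustrative discovery sketch)."""
    seen, disqualified = set(), set()
    for doc in sampled_docs:
        for key, value in doc.items():
            if isinstance(value, datetime):
                seen.add(key)
            else:
                disqualified.add(key)
    return sorted(seen - disqualified)
```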

Document Packing Mode

Choose how nested document structures land in your destination. By default the connector keeps embedded documents and arrays as JSON values on the parent row (Packed) so warehouse rows mirror Mongo documents one-to-one. Switch to Unpacked to expand embedded documents into parent__leaf columns and surface arrays of objects as related child outputs.
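The difference between the two modes can be sketched in a few lines (field names and the exact column/child naming are illustrative, not the connector's internals):

```python
import json

def land_document(doc, mode="packed"):
    """Sketch of Document Packing Mode.

    Packed: embedded documents and arrays stay as JSON on the row.
    Unpacked: embedded documents flatten to parent__leaf columns, and
    arrays of objects split off as related child outputs.
    """
    if mode == "packed":
        return ({k: json.dumps(v) if isinstance(v, (dict, list)) else v
                 for k, v in doc.items()}, {})
    row, children = {}, {}
    for key, value in doc.items():
        if isinstance(value, dict):
            for leaf, leaf_value in value.items():
                row[f"{key}__{leaf}"] = leaf_value
        elif isinstance(value, list) and value and all(isinstance(i, dict) for i in value):
            children[key] = value  # becomes a related child output
        else:
            row[key] = value
    return row, children
```

For an order document with a `customer` sub-document and a `line_items` array, Packed keeps both as JSON columns; Unpacked produces a `customer__name` column and a separate `line_items` child output.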

Configurable sample size for schemaless collections

MongoDB documents do not declare a schema, so the connector samples up to a configurable number of documents per collection during discovery. Increase the sample size to improve type coverage on collections with sparse fields.
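A simplified sketch of why sample size matters (the type mapping and widening rule here are illustrative assumptions, not the connector's exact behavior): a sparse field is only typed if at least one sampled document carries it.

```python
PY_TO_WAREHOUSE = {bool: "BOOLEAN", int: "BIGINT", float: "DOUBLE", str: "STRING"}

def infer_schema(sampled_docs):
    """Infer one warehouse type per root field from a sample (sketch)."""
    observed = {}
    for doc in sampled_docs:
        for key, value in doc.items():
            observed.setdefault(key, set()).add(
                PY_TO_WAREHOUSE.get(type(value), "JSON"))
    # a field seen with mixed types widens to STRING in this sketch
    return {k: types.pop() if len(types) == 1 else "STRING"
            for k, types in observed.items()}
```

With a sample of two documents a rare `coupon` field may never appear; widen the sample and it shows up with a type.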

Skips system databases by default

Implicit discovery automatically skips MongoDB system databases (admin, local, config) and system collections (e.g., system.profile) so they never need to be filtered out manually. Explicit Database Filters can opt a system database back in; system collections inside any discovered database remain skipped unless you name them explicitly via Collection Filters.
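The precedence between implicit skipping and explicit filters can be sketched over `database.collection` names (a simplified model of the filter semantics described above, not the connector's actual code):

```python
SYSTEM_DATABASES = {"admin", "local", "config"}

def discover(namespaces, database_filters=(), collection_filters=()):
    """Sketch of implicit vs explicit discovery over db.collection names."""
    selected = []
    for ns in namespaces:
        db, coll = ns.split(".", 1)
        if collection_filters:           # fully qualified, explicit opt-in
            if ns in collection_filters:
                selected.append(ns)
            continue
        if database_filters:
            if db not in database_filters:
                continue                 # explicit filters can re-admit system DBs
        elif db in SYSTEM_DATABASES:
            continue                     # implicit discovery skips these
        if coll.startswith("system."):
            continue                     # skipped unless named explicitly
        selected.append(ns)
    return selected
```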

Supported Objects

Discovered Objects

Collection

Every readable non-system collection becomes a root object. In your destination it lands as `database.collection` (e.g., `sales.orders`), so collections sharing a name in different databases stay distinct.

Nested array children (Unpacked mode only)

When Document Packing Mode is set to Unpacked, arrays of objects inside a document are surfaced as related child outputs (e.g., `sales.orders__line_items`). The default Packed mode keeps these arrays as JSON columns on the parent instead.

Skipped by Default

admin, local, config

MongoDB system databases are skipped during implicit discovery. They can still be selected explicitly via Database Filters if needed.

system.* collections

MongoDB system collections (e.g., system.profile) are skipped during implicit discovery; they remain selectable through Collection Filters when required.

How It Works

1

Create or pick a read-only MongoDB user

Use a dedicated service user with read access on every database you want to sync. Avoid an individual person's credentials, which break the pipeline when that account is rotated or disabled.

2

Copy the connection string

In MongoDB Atlas, click Connect on the cluster and copy the standard or SRV connection string. For self-managed deployments, build the URI with your hosts, replica set name, and TLS options.

3

Enter the connection string in Supaflow

Paste the URI into MongoDB Connection String. Either embed credentials in the URI, or keep the URI clean and use the Username, Password, and Authentication Database fields. Credentials are stored encrypted.

4

Optionally narrow the scope

Leave Database Filters and Collection Filters blank to discover everything readable, or list specific databases (e.g., `sales, crm`) and fully qualified collections (e.g., `sales.orders, crm.contacts`).

5

Test and save

Click Test & Save. Supaflow validates the URI with a lightweight ping, then runs schema discovery for each selected collection.

Use Cases

Operational analytics on multi-tenant MongoDB

When each tenant lives in its own MongoDB database, a single Supaflow source discovers all of them and lands tenants side-by-side in the warehouse with database-qualified table names ready for SQL joins.

Order, customer, and event reporting from product MongoDB

Land core collections like orders, customers, and events into your warehouse, then build dbt models on top. Optionally switch Document Packing Mode to Unpacked so nested line items, addresses, and other arrays of objects surface as related child tables.

Replace ad-hoc Mongo exports

Replace one-off `mongoexport` scripts and Lambda jobs with a managed pipeline that handles incremental cursors, schema drift, and credentials rotation in one place.

Frequently Asked Questions

Does Supaflow support MongoDB Atlas, self-managed MongoDB, or both?
Both. The connector accepts any standard MongoDB connection string, including Atlas SRV URIs, replica-set URIs, and direct-connection URIs. As long as Supaflow can reach the cluster on the network, the same connector works.
How does Supaflow handle multiple databases on the same cluster?
A single source discovers every readable non-system database on the cluster, or only the databases you list in Database Filters. Collections land in the destination as `database.collection`, so identical collection names in different databases (e.g., `sales.orders` and `crm.orders`) stay distinct.
How does the connector decide what becomes a child object?
Document Packing Mode controls this. The default is Packed: embedded documents and arrays stay as JSON values on the parent row -- no child objects, no flattened columns. Set Document Packing Mode to Unpacked to expand embedded documents into parent__leaf columns and surface arrays of objects as separate child objects associated with the parent root. Whether a particular structure becomes a child also depends on its shape in the documents the connector samples.
How is incremental sync set up?
During schema discovery, every datetime root field on each collection is offered as a cursor candidate. You pick one cursor per collection in the pipeline wizard. Different collections can use different cursor fields, and each runs incrementally only after a cursor is chosen.
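The per-collection read this produces can be sketched as a MongoDB find filter (the cursor values in real syncs would be datetimes; the strings below are illustrative):

```python
def incremental_read(cursor_field, last_seen=None):
    """Build one collection's incremental find (illustrative sketch).

    Each collection keeps its own cursor_field and high-water mark;
    sorting ascending lets the last fetched value become the next mark.
    """
    query = {cursor_field: {"$gt": last_seen}} if last_seen is not None else {}
    sort = [(cursor_field, 1)]
    return query, sort
```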
What permissions does the MongoDB user need?
Read on every database you want to sync, plus enough listing permission to see those databases and their collections. Cluster-wide listDatabases is the simplest setup; alternatively, list the databases explicitly via Database Filters and grant per-database listCollections.
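The two setups map to standard MongoDB built-in roles (the user and database names below are illustrative):

```python
# Simplest: cluster-wide read, which also covers listDatabases.
cluster_wide_roles = [{"role": "readAnyDatabase", "db": "admin"}]

# Narrower: per-database read (covers listCollections on each database);
# pair this with the same databases listed in Database Filters.
synced_dbs = ("sales", "crm")
per_database_roles = [{"role": "read", "db": db} for db in synced_dbs]
```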
Does Supaflow support MongoDB change streams or oplog tailing?
Not in the current connector. Incremental sync uses cursor-field comparisons against selected datetime fields per collection. If real-time CDC is important to your evaluation, email support@supa-flow.io with your use case.
Will my schemas remain stable across runs?
MongoDB collections are schemaless, so types are inferred from sampled documents. Supaflow stabilizes inferred types between runs so a column does not flip between, say, BIGINT and STRING because of sample variance. Increase Schema Sample Size when you need broader coverage of sparse fields.
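One way to picture the stabilization (a simplified widening model, not Supaflow's exact rules): once a column has widened, a later narrower sample cannot flip it back.

```python
WIDENING = {
    ("BIGINT", "DOUBLE"): "DOUBLE",
    ("BIGINT", "STRING"): "STRING",
    ("DOUBLE", "STRING"): "STRING",
}

def stabilize(previous, inferred):
    """Merge a newly inferred schema into the previous run's schema (sketch)."""
    merged = dict(previous)
    for col, new_type in inferred.items():
        old = merged.get(col)
        if old is None or old == new_type:
            merged[col] = new_type
        else:
            # widen, never narrow; unknown pairs fall back to STRING
            merged[col] = WIDENING.get(tuple(sorted((old, new_type))), "STRING")
    return merged
```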

Need a connector we don't support yet?

Build one with AI-powered Connector Dev Skills.

Learn More About the Connector SDK