Rivet is a framework that revolutionizes data pipelines by strictly separating concerns. It lets you define your pipeline once and run it on DuckDB, Polars, PySpark, Postgres, or any other supported engine without changing your logic.
Rivet pipelines are built on three foundational pillars:
| Concept | Rivet Abstraction | Description |
|---|---|---|
| What to compute | Joints | Named, declarative units of computation (SQL, Python, Source, Sink). |
| How to compute | Engines | Deterministic compute engines that execute the logic. |
| Where data lives | Catalogs | Named references to data locations like filesystems, databases, or object stores. |
This architecture lets you build portable pipelines. Adjacent SQL joints assigned to the same engine are automatically fused into a single query to reduce memory pressure and avoid unnecessary data movement.
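The three pillars typically come together in project configuration. As a minimal sketch of how that might look — the file name (`rivet.yaml`), keys, and values below are illustrative assumptions, not documented Rivet schema:

```yaml
# Hypothetical project config — key names are assumptions for illustration.
engines:
  local_duck:
    type: duckdb                      # engine that executes SQL joints

catalogs:
  local:                              # a catalog a source could reference by name
    type: duckdb
    path: ./data/dev.duckdb
  warehouse:                          # a catalog a sink could reference by name
    type: postgres
    dsn: postgresql://localhost:5432/analytics
```

Joints reference engines and catalogs by name, which is what makes swapping an engine a configuration change rather than a logic change.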
- 🔄 Multi-Engine Execution: Swap compute engines without rewriting pipelines.
- 🛠️ Declarative Flexibility: Define joints using SQL, YAML, or Python.
- 🛡️ Ironclad Data Quality:
  - Assertions run pre-write on computed data to catch errors before they hit your target.
  - Audits run post-write by reading back from the target catalog to verify state.
- 🧪 Built-in Offline Testing: Validate your transformation logic using offline fixture data without needing a live database.
- 💻 Interactive REPL: Use `rivet repl` for a full-screen terminal UI to explore data, run ad-hoc queries, and iterate on pipeline logic.
- 🔀 Advanced Write Strategies: Supports 7 write modes including `append`, `replace`, `merge`, and `scd2` (Slowly Changing Dimensions).
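To make the write strategies concrete, here is a sketch of a `merge` sink using the same header-comment convention as the quick-start files below. Note that `merge_keys` is a hypothetical annotation name introduced for illustration; Rivet's actual key for merge columns may differ.

```sql
-- sinks/orders_merge.sql  (illustrative sketch; merge_keys is a hypothetical annotation)
-- rivet:name: orders_merge
-- rivet:type: sink
-- rivet:upstream: [orders_dedup]
-- rivet:catalog: warehouse
-- rivet:table: orders
-- rivet:write_strategy: merge
-- rivet:merge_keys: [order_id]
```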
Install Rivet with all plugins:

```shell
pip install 'rivetsql[all]'
```

Or install only what you need:

```shell
pip install 'rivetsql[duckdb]'  # recommended for local dev
```

Scaffold a new project with the required directory structure:

```shell
rivet init my_pipeline
cd my_pipeline
```

Compile and execute your DAG:

```shell
rivet run
```

Three files. Source → Transform → Sink. That's it.
1. Read raw data from a catalog:

```sql
-- sources/raw_orders.sql
-- rivet:name: raw_orders
-- rivet:type: source
-- rivet:catalog: local
-- rivet:table: raw_orders
select * from raw_orders
```

2. Transform with plain SQL:
```sql
-- joints/daily_revenue.sql
-- rivet:name: daily_revenue
-- rivet:type: sql
SELECT
    order_date,
    SUM(amount) AS revenue
FROM raw_orders
WHERE status = 'completed'
GROUP BY order_date
```

3. Write results with quality checks:
```sql
-- sinks/daily_revenue_out.sql
-- rivet:name: daily_revenue_out
-- rivet:type: sink
-- rivet:upstream: [daily_revenue]
-- rivet:catalog: warehouse
-- rivet:table: daily_revenue
-- rivet:write_strategy: replace
-- rivet:assert: not_null(revenue)
-- rivet:assert: row_count(min=1)
```

Run it:

```
$ rivet run
✓ compiled 3 joints in 38ms
  raw_orders         ✓ OK (1200 rows)
  daily_revenue      ✓ OK (90 rows)
  daily_revenue_out  ✓ OK (90 rows)
38ms | 3 joints | 1 group | 0 failures
```

If an assertion like `not_null` fails, the write is completely aborted, keeping your target clean.
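The feature list notes that joints can also be defined in YAML. As a hedged sketch of what the transform step above might look like in that form — the field names here are assumptions, not confirmed Rivet syntax:

```yaml
# joints/daily_revenue.yaml — illustrative only; field names are assumptions
name: daily_revenue
type: sql
query: |
  SELECT order_date, SUM(amount) AS revenue
  FROM raw_orders
  WHERE status = 'completed'
  GROUP BY order_date
```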
Rivet is fully extensible through plugins.
| Package | Engine Type | Catalog Type | Best For |
|---|---|---|---|
| `rivet-duckdb` | `duckdb` | `duckdb` | Local analytics and fast SQL on files. |
| `rivet-polars` | `polars` | — | In-process DataFrame transforms. |
| `rivet-pyspark` | `pyspark` | — | Large-scale distributed processing. |
| `rivet-postgres` | `postgres` | `postgres` | PostgreSQL databases as sources and sinks. |
| `rivet-aws` | — | `s3`, `glue` | AWS S3 object storage and Glue Data Catalog. |
| `rivet-databricks` | `databricks` | `unity`, `databricks` | Databricks SQL warehouses and Unity Catalog. |
Pull requests are welcome! Check out our Contribution Guidelines. Start here:

```shell
git clone https://github.com/rivetsql/rivetsql
```

Built for data engineers who love SQL, demand quality, and value flexibility.