Launching Q1 2026 — Join Early Access

Query Your Entire Data Stack.
Zero ETL Required.

Federated SQL across PostgreSQL, Snowflake, and files—without copying a single byte.
Built on Apache Arrow and DataFusion.

strake_demo.py
import strake

# Connect to a local Strake server (gRPC endpoint)
conn = strake.StrakeConnection("grpc://localhost:50051")

df = conn.sql("""
    SELECT
        c.customer_name,
        c.region,
        o.order_total,
        p.product_name
    FROM postgres.prod.customers c
    JOIN snowflake.analytics.orders o
      ON c.customer_id = o.customer_id
    JOIN json.catalog.products p
      ON o.product_id = p.product_id
    WHERE o.order_date >= '2024-01-01'
      AND c.region IN ('US-West', 'US-East')
    ORDER BY o.order_total DESC
    LIMIT 5
""")

Developer First

Built for Data Teams Who Ship Fast

Stop waiting for data pipelines. Strake lets you query any data source with standard SQL—locally in development, or at scale in production.

5-Minute Setup

From zero to querying PostgreSQL + S3 in 5 minutes. No infrastructure required.

GitOps Native

Manage 100 data sources as easily as editing a YAML file. Validate offline. Deploy with confidence.

Code-First Python

10M rows → Pandas DataFrame in <1 second. Zero-copy via PyArrow. No serialization overhead.
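The zero-copy idea can be illustrated with a stdlib-only sketch. Strake itself hands results over as Arrow buffers via PyArrow; here a `memoryview` plays the same role as an analogy: the consumer reads the producer's buffer directly, so nothing is serialized or copied.

```python
import array

# Stdlib analogy for zero-copy handoff (Strake uses Arrow buffers via PyArrow):
# a memoryview exposes the same bytes the producer wrote, so "handing over"
# a million values is O(1) -- no copy, no serialization step.
values = array.array("d", range(1_000_000))  # one contiguous buffer of doubles
view = memoryview(values)                    # zero-copy view over that buffer

# The view and the array share memory: a write to one is visible in the other.
values[0] = 42.0
assert view[0] == 42.0
```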

Modern Hybrid Architecture

Move Compute, Not Data.

The default data architecture assumes you must centralize everything to analyze anything. This creates lag, cost, and complexity.

Strake flips the default. Treat your distributed data as a single logical warehouse. With intelligent caching and push-down execution, you can run production workloads directly on operational stores.

Keep data where it lives. Don't materialize it until physics demands it.
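Push-down execution can be sketched with a toy stdlib example. SQLite stands in for a remote source here; this illustrates the idea, not Strake's optimizer itself.

```python
import sqlite3

# Toy illustration of predicate push-down: instead of fetching an entire
# table and filtering in the client, the filter travels to the source,
# so only matching rows cross the wire.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "US-West", 120.0), (2, "EU", 80.0), (3, "US-West", 45.0)],
)

# Naive plan: pull everything, then filter locally (3 rows transferred).
all_rows = conn.execute("SELECT * FROM orders").fetchall()
local = [r for r in all_rows if r[1] == "US-West"]

# Pushed-down plan: the predicate runs at the source (2 rows transferred).
pushed = conn.execute(
    "SELECT * FROM orders WHERE region = 'US-West'"
).fetchall()

assert local == pushed  # same answer, less data moved
```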

The New Standard for Data Access

DEFAULT STRATEGY

High-Performance Federation

Use Strake for 90% of workloads.

  • Live Dashboards: Sub-second response via caching
  • Customer APIs: Real-time data serving
  • Cross-Silo Joins: User data + Product events
  • Operational Analytics: "What is happening right now?"

Powered by push-down optimization & Arrow

EXCEPTION ONLY

Physical Materialization (ETL)

Reserve materialization for the remaining 10% of heavy-lift tasks.

  • ⚠️ Massive History: Aggregating 10 years of logs
  • ⚠️ Slow Sources: Protecting fragile legacy APIs
  • ⚠️ Complex Snapshots: Slowly Changing Dimensions (Type 2)

How It Works

Traditional tools copy your data. Strake queries it where it lives.

No ETL pipelines. Just SQL.
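As a rough stdlib analogy for querying data where it lives, SQLite's `ATTACH` lets one SQL statement join two independent databases without copying either one, much as Strake namespaces sources under prefixes like `postgres.*` and `snowflake.*`:

```python
import sqlite3

# Two independent databases stand in for two data silos.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])

# Attach a second database under its own namespace -- analogous to how a
# federated engine addresses each source by prefix.
conn.execute("ATTACH DATABASE ':memory:' AS analytics")
conn.execute("CREATE TABLE analytics.orders (customer_id INTEGER, total REAL)")
conn.executemany("INSERT INTO analytics.orders VALUES (?, ?)",
                 [(1, 99.5), (1, 10.0), (2, 42.0)])

# One SQL statement spans both databases; no rows are copied between them.
rows = conn.execute("""
    SELECT c.name, SUM(o.total)
      FROM customers c
      JOIN analytics.orders o ON c.customer_id = o.customer_id
     GROUP BY c.name
     ORDER BY c.name
""").fetchall()
# rows == [('Acme', 109.5), ('Globex', 42.0)]
```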

Core Features

Powerful Open-Source Federation

Everything you need to unify your data landscape, available for everyone.

Sub-Second Federation

Query PostgreSQL + Snowflake + Parquet together, with filtered scans and aggregations completing in under 0.5s. Validated on TPC-H benchmarks.

Manage Data Mesh as Code

Version control your sources. Validate configuration offline. Deploy your data mesh with full confidence.
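Offline validation might look something like the sketch below; the field names and rules here are illustrative assumptions, not Strake's actual configuration schema:

```python
# Hypothetical sketch of offline source-config validation (the REQUIRED
# fields and SUPPORTED types are assumptions for illustration only):
# catch mistakes before anything touches production.
REQUIRED = {"name", "type", "url"}
SUPPORTED = {"postgres", "mysql", "sqlite", "snowflake", "parquet", "csv", "json"}

def validate_source(source: dict) -> list[str]:
    """Return a list of problems; an empty list means the config is valid."""
    errors = []
    missing = REQUIRED - source.keys()
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    if source.get("type") not in SUPPORTED:
        errors.append(f"unsupported type: {source.get('type')!r}")
    return errors

ok = validate_source(
    {"name": "prod", "type": "postgres", "url": "postgres://localhost:5432/prod"}
)
bad = validate_source({"name": "legacy", "type": "oracle"})
assert ok == []   # valid config: no problems reported
assert bad != []  # invalid config: problems reported, nothing deployed
```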

Built-In Governance

Row-level security, column masking, and SSO out of the box. Security is not an afterthought.

Universal Sources

Connect to PostgreSQL, MySQL, SQLite, Snowflake, BigQuery, Parquet, CSV, JSON, REST APIs, and gRPC services—all through one SQL interface.

Zero-Copy Python

PyO3 bindings with direct conversion to PyArrow, Pandas, and Polars. No serialization overhead.

Flight SQL Native

Standard-compliant Arrow Flight SQL interface with full prepared statement support for maximum compatibility.

Performance Proof

TPC-H SF2.0 (12M Rows) — PostgreSQL + Parquet Federation

1.18s: Complex Join (Q3)
  • PostgreSQL + Parquet
  • 12M rows total
  • Join + Filter + Sort

0.19s: Scan & Filter (Q6)
  • Direct Parquet scan
  • Pushdown enabled
  • Sub-second latency

0.28s: Aggregation (Q1)
  • Complex GROUP BY
  • Multi-threaded execution
  • Zero-copy results

View Full Methodology →

OSS Edition

Free Forever • Apache 2.0

Free
  • PostgreSQL, MySQL, SQLite, and 10+ more databases
  • Parquet, CSV, JSON file support (local and S3)
  • REST API/gRPC Services connectors
  • Flight SQL server
  • Python bindings (PyArrow/Pandas)
  • GitOps CLI with offline validation
  • Connection pooling & circuit breakers
Download Beta