Federated SQL across PostgreSQL, Snowflake, and files—without copying a single byte.
Built on Apache Arrow and DataFusion.
import strake

conn = strake.StrakeConnection("grpc://localhost:50051")
df = conn.sql("""
    SELECT
        c.customer_name,
        c.region,
        o.order_total,
        p.product_name
    FROM postgres.prod.customers c
    JOIN snowflake.analytics.orders o
        ON c.customer_id = o.customer_id
    JOIN json.catalog.products p
        ON o.product_id = p.product_id
    WHERE o.order_date >= '2024-01-01'
        AND c.region IN ('US-West', 'US-East')
    ORDER BY o.order_total DESC
    LIMIT 5
""")
Stop waiting for data pipelines. Strake lets you query any data source with standard SQL—locally in development, or at scale in production.
From zero to querying PostgreSQL + S3 in 5 minutes. No infrastructure required.
Manage 100 data sources as easily as editing a YAML file. Validate offline. Deploy with confidence.
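A source catalog managed this way might look like the following. The schema shown here (keys like `sources`, `type`, `dsn`) is illustrative only, not Strake's actual configuration format:

```yaml
# sources.yaml — hypothetical schema, for illustration only
sources:
  - name: postgres.prod
    type: postgres
    dsn: postgres://readonly@prod-db:5432/prod
  - name: snowflake.analytics
    type: snowflake
    account: my_account
    warehouse: ANALYTICS_WH
  - name: json.catalog
    type: json
    path: s3://data-lake/catalog/
```

Because the catalog is plain text, it can be code-reviewed, diffed, and validated in CI before anything touches production.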
10M rows → Pandas DataFrame in <1 second. Zero-copy via PyArrow. No serialization overhead.
The default data architecture assumes you must centralize everything to analyze anything. This creates lag, cost, and complexity.
Strake flips the default. Treat your distributed data as a single logical warehouse. With intelligent caching and push-down execution, you can run production workloads directly on operational stores.
Keep data where it lives. Don't materialize data until physics demands it.
Use Strake for 90% of workloads.
Reserve centralized warehousing for the heaviest 10%.
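Push-down execution means filters travel to each source so only matching rows cross the network, and the join happens over the reduced sets. A minimal sketch of the idea, using two in-memory SQLite databases as stand-ins for independent sources:

```python
import sqlite3

# Two independent "sources" (stand-ins for PostgreSQL and Snowflake)
pg = sqlite3.connect(":memory:")
pg.execute("CREATE TABLE customers (customer_id INT, region TEXT)")
pg.executemany("INSERT INTO customers VALUES (?, ?)",
               [(1, "US-West"), (2, "EU"), (3, "US-East")])

sf = sqlite3.connect(":memory:")
sf.execute("CREATE TABLE orders (customer_id INT, order_total REAL)")
sf.executemany("INSERT INTO orders VALUES (?, ?)",
               [(1, 100.0), (2, 50.0), (3, 75.0)])

# Push the predicate down: each source evaluates its own WHERE clause,
# so only matching customer_ids ever leave the first database.
customers = pg.execute(
    "SELECT customer_id FROM customers"
    " WHERE region IN ('US-West', 'US-East')"
).fetchall()
wanted = {cid for (cid,) in customers}

# Join locally over the reduced row sets
orders = sf.execute("SELECT customer_id, order_total FROM orders").fetchall()
result = [(cid, total) for cid, total in orders if cid in wanted]
```

A real engine plans this automatically, but the principle is the same: move predicates to the data, not data to the predicates.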
Traditional tools copy your data. Strake queries it where it lives.
No ETL pipelines. Just SQL.
Everything you need to unify your data landscape, available to everyone.
Join PostgreSQL + Snowflake + Parquet in <0.5s. Validated on TPC-H benchmarks with billions of rows.
Version control your sources. Validate configuration offline. Deploy your data mesh with full confidence.
Row-level security, column masking, and SSO out-of-the-box. Security is not an afterthought.
Connect to PostgreSQL, MySQL, SQLite, Snowflake, BigQuery, Parquet, CSV, JSON, REST APIs, and gRPC services—all through one SQL interface.
PyO3 bindings with direct conversion to PyArrow, Pandas, and Polars. No serialization overhead.
Standard-compliant Arrow Flight SQL interface with full prepared statement support for maximum compatibility.
TPC-H SF2.0 (12M Rows) — PostgreSQL + Parquet Federation
Free Forever • Apache 2.0
Everything in OSS, plus: