Local Data Platform · Open Source

Manage your data your way

The local, open-source data platform. Every pipeline, whether you build it visually or in code, plugs into a data catalog, Delta Lake storage, streaming ingestion, scheduling, and a Polars-compatible Python API. That is what makes it a platform rather than just an ETL tool: it runs on your machine, your infrastructure, or your cloud.


One local platform, every workflow

Everything is a pipeline — but every pipeline plugs into a data catalog, Delta Lake storage, streaming ingestion, scheduling, and a Polars-compatible Python API. A full platform, running locally on your infrastructure.

Visual Editor

Drag-and-drop nodes to build complex data pipelines without writing a single line of code. Perfect for data analysts and anyone who prefers a visual approach.

  • Intuitive drag-and-drop interface
  • Real-time data preview at each step
  • 30+ transformation nodes
  • Write to Delta Lake via the catalog

Python API

Write pipelines in Python with a familiar, Polars-like syntax. Full programmatic control with the same powerful engine under the hood.

import flowfile_frame as ff

df = ff.read_csv("sales.csv")
result = (
    df.filter(ff.col("sales") > 1000)
      .group_by("category")
      .agg(ff.sum("sales"))
)

Data Catalog & Delta Lake

Every table is stored as Delta Lake with version history, time travel, and merge/upsert support. Track lineage, runs, and artifacts in one place.
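The semantics described above (versioned writes, time travel, merge/upsert) can be illustrated conceptually. The sketch below is neither Flowfile's nor Delta Lake's actual API; it is a minimal plain-Python model of the core idea:

```python
# Conceptual sketch of Delta-style versioning: every write creates a new
# version, and "time travel" reads any historical version by number.
# NOT Flowfile's or Delta Lake's API, just the underlying idea.

class VersionedTable:
    def __init__(self):
        self._versions = []  # list of snapshots (each a list of row dicts)

    def write(self, rows):
        """Append a full new snapshot; return the new version number."""
        self._versions.append([dict(r) for r in rows])
        return len(self._versions) - 1

    def read(self, version=None):
        """Read the latest snapshot, or an older one ("time travel")."""
        if version is None:
            version = len(self._versions) - 1
        return self._versions[version]

    def merge(self, updates, key):
        """Upsert: update rows matching on key, insert the rest."""
        current = {r[key]: dict(r) for r in self.read()}
        for row in updates:
            current[row[key]] = dict(row)
        return self.write(list(current.values()))

t = VersionedTable()
t.write([{"id": 1, "sales": 100}])
t.merge([{"id": 1, "sales": 150}, {"id": 2, "sales": 90}], key="id")
print(t.read(version=0))  # time travel: [{'id': 1, 'sales': 100}]
```

A real Delta table stores deltas plus a transaction log rather than full snapshots, but the reader-facing contract is the same: immutable versions you can query by number.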

Kafka & Streaming

Ingest from Kafka or Redpanda as a canvas node or with the Python API. Bridge batch and streaming workloads in one pipeline.

Multi-Cloud & Databases

Read and write to S3, Azure Data Lake, and GCS. Connect to PostgreSQL, MySQL, and files — CSV, Excel, Parquet. Your data, wherever it lives.

Sandboxed Python Kernels

Run arbitrary Python in isolated Docker containers. Use matplotlib, scikit-learn, or your own libraries — results flow back into the pipeline.
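The pattern of running user code in isolation and feeding structured results back can be sketched with a bare subprocess. Flowfile uses isolated Docker containers for this; the snippet below is only a stdlib illustration of the pattern, not its actual mechanism:

```python
import json
import subprocess
import sys

# Run "user code" in a separate interpreter process; exchange data as JSON.
# A sandboxed kernel does the same through an isolated Docker container.
user_code = """
import json, sys
data = json.load(sys.stdin)
print(json.dumps({"total": sum(data["sales"])}))
"""

proc = subprocess.run(
    [sys.executable, "-c", user_code],
    input=json.dumps({"sales": [1200, 890, 2100]}),
    capture_output=True, text=True, check=True,
)
print(json.loads(proc.stdout))  # {'total': 4190}
```

The key property is the same in both cases: the host pipeline only sees serialized results, so a crash or rogue import in the user code cannot corrupt the pipeline process.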

Scheduling & Triggers

Run flows on intervals or trigger them when catalog tables update. Built-in orchestration — no external scheduler needed.
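Without built-in orchestration, interval scheduling means hand-rolling a loop like the stdlib sketch below (purely illustrative; this is not Flowfile's scheduler API):

```python
import sched
import time

# Minimal interval scheduler: run a job every `interval_s` seconds,
# `runs` times. Flowfile's built-in scheduler replaces loops like this.
def run_every(interval_s, job, runs):
    s = sched.scheduler(time.monotonic, time.sleep)

    def step(remaining):
        job()
        if remaining > 1:
            s.enter(interval_s, 1, step, (remaining - 1,))

    s.enter(0, 1, step, (runs,))
    s.run()

results = []
run_every(0.01, lambda: results.append("flow ran"), runs=3)
print(results)  # ['flow ran', 'flow ran', 'flow ran']
```

Event-based triggers go one step further: instead of a fixed interval, the flow fires when a catalog table it depends on is updated.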

Polars Performance

Built on Polars, not Pandas. Enjoy 10-100x faster execution with lazy evaluation and query optimization. Export flows as clean Python code — no vendor lock-in.

Visual pipeline building

Connect nodes to build your data pipeline. Each node transforms the data as it flows through — from raw input to final output.

flowfile — Visual Editor: CSV Input → Filter → Group By → Output

Raw Data: sales data loaded from a CSV file (8 rows × 6 cols; first rows and columns shown)

product    category     sales
Widget A   Electronics  1,200
Widget B   Electronics    890
Gadget X   Home         2,100
Gadget Y   Home           760
Tool Pro   Tools        3,200

Try it yourself

Live Demo Preview

Live Demo Lite

First load may take a moment while Pyodide initializes

This is a lightweight browser version. Install the full version for database connections, larger datasets, and more.

Same pipeline, in code

Prefer coding? Build the exact same pipeline using the Flowfile Python API. Export visual flows as code, or write pipelines programmatically.

pipeline.py
import flowfile_frame as ff

# Read and filter data
df = ff.read_csv("sales_data.csv")
filtered = df.filter(ff.col("sales") > 1000)

# Group by category and aggregate
result = (
    filtered
    .group_by("category")
    .agg(
        ff.sum("sales").alias("total_sales"),
        ff.sum("quantity").alias("total_quantity"),
        ff.count().alias("count")
    )
)

result.write_parquet("output.parquet")
  • Polars-like syntax
  • Export visual flows as code
  • Lazy evaluation

Up and running in seconds

Install Flowfile with pip and launch the visual editor with a single command.

  • Install: pip install flowfile
  • Launch: flowfile run ui
  • Build: drag and drop nodes to create your data pipeline

What makes it unique

A full local data platform — visual and code, catalog and connections — running on your infrastructure, not a vendor's SaaS.

Visual meets code

Build pipelines visually, then export as clean Python code. Switch between both anytime — no vendor lock-in.

Local-first, deploy anywhere

Runs on your machine with a single pip install, your own Docker, or as a desktop app. Your data never leaves your infrastructure.

Catalog as the source of truth

Delta Lake storage, lineage, run history, and event-based triggers. Your data assets live in one place — not scattered across notebooks.

Built on Polars

Under the hood, Flowfile uses Polars for fast, memory-efficient data processing. Same performance you'd get in code.

Local & Open Source

Ready to manage your data your way?

Join the community building the local, open-source data platform. Free, self-hosted, and ready for production — on your infrastructure.