<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Flowfile Blog</title><description>Practical guides on local data platforms, Polars, Delta Lake, visual ETL, and modernizing Excel and Alteryx workflows with open-source tools.</description><link>https://flowfile.io/</link><language>en-us</language><item><title>Big Data Is Dead — Why Your Laptop Is Probably Big Enough</title><link>https://flowfile.io/blog/big-data-is-dead/</link><guid isPermaLink="true">https://flowfile.io/blog/big-data-is-dead/</guid><description>The &apos;big data&apos; era was real, but it ended quietly. Hardware caught up, working sets shrank, and single-node engines like Polars and DuckDB beat the cluster on most workloads.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Big Data</category><category>Local-first</category><category>Polars</category><category>DuckDB</category><category>Architecture</category><author>Edward van Eechoud</author></item><item><title>Automate Your Excel Workflows Without Writing Code</title><link>https://flowfile.io/blog/automate-excel-workflows-no-code/</link><guid isPermaLink="true">https://flowfile.io/blog/automate-excel-workflows-no-code/</guid><description>If you rebuild the same Excel report every week, you don&apos;t need Python. Here&apos;s how visual data pipelines turn that work into a repeatable, one-click process.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Excel</category><category>Beginner</category><category>Automation</category><category>No-code</category><author>Edward van Eechoud</author></item><item><title>The Best Way to Learn Python Is to Build Something You&apos;d Actually Use</title><link>https://flowfile.io/blog/learn-python-by-building/</link><guid isPermaLink="true">https://flowfile.io/blog/learn-python-by-building/</guid><description>Tutorials teach syntax. Building real things teaches everything else. Reflections on learning Python the hard way — through a year-long project called Flowfile.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Essay</category><category>Learning</category><category>Python</category><category>Building Flowfile</category><author>Edward van Eechoud</author></item><item><title>From Alteryx to Flowfile: A Practical Migration Walkthrough</title><link>https://flowfile.io/blog/alteryx-to-flowfile-migration/</link><guid isPermaLink="true">https://flowfile.io/blog/alteryx-to-flowfile-migration/</guid><description>How to rebuild a real Alteryx workflow in Flowfile — a weekly sales report with lookups, a pivot, and an Excel output — node by node, in under an hour.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Alteryx</category><category>Migration</category><category>Visual ETL</category><category>Practical</category><author>Edward van Eechoud</author></item><item><title>Demystifying Delta Lake: What It Is and Why It Matters</title><link>https://flowfile.io/blog/demystifying-delta-lake/</link><guid isPermaLink="true">https://flowfile.io/blog/demystifying-delta-lake/</guid><description>Delta Lake is not a database. It&apos;s a transaction log over Parquet that gives you ACID, time travel, and schema evolution — without a server. Here&apos;s what it does, in plain English.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Delta Lake</category><category>Data Engineering</category><category>Lakehouse</category><category>Parquet</category><author>Edward van Eechoud</author></item><item><title>76× Faster Fuzzy Joins: How pl-fuzzy-frame-match Works</title><link>https://flowfile.io/blog/fuzzy-match-at-scale/</link><guid isPermaLink="true">https://flowfile.io/blog/fuzzy-match-at-scale/</guid><description>Brute-force fuzzy matching is O(N×M) — at 1.2 billion comparisons it falls over. Here&apos;s how a two-stage hybrid (ANN + exact scoring) reduces that to seconds while preserving accuracy.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Fuzzy Match</category><category>Polars</category><category>Performance</category><category>Algorithms</category><category>Data Engineering</category><author>Edward van Eechoud</author></item><item><title>Flowfile&apos;s Kafka Source: How Micro-Batching Actually Works</title><link>https://flowfile.io/blog/flowfile-kafka-source-deep-dive/</link><guid isPermaLink="true">https://flowfile.io/blog/flowfile-kafka-source-deep-dive/</guid><description>A code-level walkthrough of Flowfile&apos;s Kafka source: the 500-message poll, the 100k-row spill to Arrow, Polars LazyFrames, and consumer-group offsets.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Kafka</category><category>Flowfile</category><category>Polars</category><category>Streaming</category><category>ETL</category><category>Apache Arrow</category><author>Edward van Eechoud</author></item><item><title>Fuzzy Match in Polars: Joining on Dirty Data with Flowfile</title><link>https://flowfile.io/blog/fuzzy-match-polars-flowfile/</link><guid isPermaLink="true">https://flowfile.io/blog/fuzzy-match-polars-flowfile/</guid><description>Every real dataset has &apos;Acme Corp&apos; vs &apos;ACME Corporation&apos; somewhere. Here&apos;s how Flowfile&apos;s fuzzy_join — built on Polars and Levenshtein — handles it without a regex in sight.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Polars</category><category>Fuzzy Match</category><category>Data Cleaning</category><category>Join</category><category>Python</category><author>Edward van Eechoud</author></item><item><title>Connections, Secrets, and the Catalog in Flowfile&apos;s Python API</title><link>https://flowfile.io/blog/connect-without-env-files/</link><guid isPermaLink="true">https://flowfile.io/blog/connect-without-env-files/</guid><description>How Flowfile registers database and cloud-storage connections once — in Python or the UI — and references them everywhere by name, with encryption handled for you.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Python</category><category>DevEx</category><category>Secrets</category><category>Database</category><category>Data Catalog</category><author>Edward van Eechoud</author></item><item><title>Kafka for Analysts: A Practical Guide to Streaming as Micro-Batches</title><link>https://flowfile.io/blog/kafka-streaming-analytics-micro-batches/</link><guid isPermaLink="true">https://flowfile.io/blog/kafka-streaming-analytics-micro-batches/</guid><description>Most analytics &apos;streaming&apos; is really a sequence of micro-batches. How to think about cleaning, combining, and enriching Kafka data without a streaming engine.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Kafka</category><category>Streaming</category><category>Analytics</category><category>ETL</category><category>Micro-Batching</category><category>Data Engineering</category><author>Edward van Eechoud</author></item><item><title>Why Your Data Should Stay on Your Laptop</title><link>https://flowfile.io/blog/local-compute-data-catalog/</link><guid isPermaLink="true">https://flowfile.io/blog/local-compute-data-catalog/</guid><description>Local compute plus a built-in data catalog gives you the speed of a desktop tool and the structure of a warehouse — without sending a single row to the cloud.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Local-first</category><category>Data Catalog</category><category>Privacy</category><category>Delta Lake</category><author>Edward van Eechoud</author></item><item><title>Open-Source Alternatives to Alteryx in 2026</title><link>https://flowfile.io/blog/open-source-alteryx-alternatives-2026/</link><guid isPermaLink="true">https://flowfile.io/blog/open-source-alteryx-alternatives-2026/</guid><description>Alteryx is powerful, but the licensing has gotten brutal. An honest comparison of the open-source visual ETL tools worth evaluating in 2026.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>ETL</category><category>Open Source</category><category>Comparison</category><author>Edward van Eechoud</author></item><item><title>Polars vs Pandas in 2026: A Practical Guide</title><link>https://flowfile.io/blog/polars-vs-pandas-2026/</link><guid isPermaLink="true">https://flowfile.io/blog/polars-vs-pandas-2026/</guid><description>Polars is faster, lazier, and stricter than Pandas. Pandas has 15 years of ecosystem. A practical, honest take on when to use which in 2026.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Polars</category><category>Pandas</category><category>Performance</category><category>Python</category><author>Edward van Eechoud</author></item><item><title>Virtual Flow Tables: When a Catalog Entry Is a Pipeline</title><link>https://flowfile.io/blog/virtual-flow-tables/</link><guid isPermaLink="true">https://flowfile.io/blog/virtual-flow-tables/</guid><description>Most data catalogs know about materialized tables and SQL views. Flowfile adds a third option: a catalog entry that points at a pipeline and resolves lazily. Here&apos;s how and why.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Data Catalog</category><category>Delta Lake</category><category>Architecture</category><category>Lazy Evaluation</category><author>Edward van Eechoud</author></item><item><title>What Is a Data Pipeline? A Plain-English Guide for Analysts</title><link>https://flowfile.io/blog/what-is-a-data-pipeline/</link><guid isPermaLink="true">https://flowfile.io/blog/what-is-a-data-pipeline/</guid><description>A data pipeline is a saved recipe that turns raw data into something useful. Here&apos;s what one is, what the parts are called, and how to build your first one without a data engineering degree.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Beginner</category><category>Data Pipelines</category><category>ETL</category><category>Education</category><author>Edward van Eechoud</author></item><item><title>Bubble + Stripe + Mailchimp: A Non-Technical Founder&apos;s Playbook</title><link>https://flowfile.io/blog/bubble-stripe-mailchimp-playbook/</link><guid isPermaLink="true">https://flowfile.io/blog/bubble-stripe-mailchimp-playbook/</guid><description>You built a product on Bubble, you charge with Stripe, you email with Mailchimp. Here&apos;s how to connect all three into one view of who signs up, who pays, and who sticks around.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Small Business</category><category>Bubble</category><category>No-code</category><category>Stripe</category><category>Mailchimp</category><category>SaaS</category><category>Beginner</category><author>Edward van Eechoud</author></item><item><title>CRM, ERP, ETL: Which Three-Letter Acronyms a Small Business Actually Needs</title><link>https://flowfile.io/blog/crm-erp-etl-acronyms-for-small-business/</link><guid isPermaLink="true">https://flowfile.io/blog/crm-erp-etl-acronyms-for-small-business/</guid><description>Software vendors love three-letter acronyms. Here&apos;s a plain-English guide to which ones matter for a small business, which ones you can ignore, and what to buy when.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Small Business</category><category>Beginner</category><category>CRM</category><category>ERP</category><category>ETL</category><category>Glossary</category><author>Edward van Eechoud</author></item><item><title>Meta vs. Google Ads: How to Actually Tell Which One Is Selling</title><link>https://flowfile.io/blog/meta-vs-google-ads-attribution/</link><guid isPermaLink="true">https://flowfile.io/blog/meta-vs-google-ads-attribution/</guid><description>Meta says 40 sales. Google says 32. Shopify says 58. Here&apos;s why all three are &apos;right&apos; and how to build a single number you can trust.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Small Business</category><category>Marketing</category><category>Google Ads</category><category>Meta Ads</category><category>Attribution</category><category>Beginner</category><author>Edward van Eechoud</author></item><item><title>RFM: The 50-Year-Old Customer Segmentation Every Small Business Should Steal</title><link>https://flowfile.io/blog/rfm-customer-segmentation-small-business/</link><guid isPermaLink="true">https://flowfile.io/blog/rfm-customer-segmentation-small-business/</guid><description>Your VIPs, your churn risks, and your dead weight are hiding in the same customer list. RFM is the simple scoring model that separates them in under an hour.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Small Business</category><category>Customer Segmentation</category><category>RFM</category><category>Marketing</category><category>E-commerce</category><category>Beginner</category><author>Edward van Eechoud</author></item><item><title>Stop Copy-Pasting Between Spreadsheets</title><link>https://flowfile.io/blog/stop-copy-pasting-between-spreadsheets/</link><guid isPermaLink="true">https://flowfile.io/blog/stop-copy-pasting-between-spreadsheets/</guid><description>If your weekly routine involves downloading three exports and stitching them into one master sheet, you&apos;re doing a robot&apos;s job. Here&apos;s the non-technical way to hand it off.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Small Business</category><category>Spreadsheets</category><category>Excel</category><category>Automation</category><category>Beginner</category><author>Edward van Eechoud</author></item><item><title>The 10 Numbers Every Small Business Should Track Each Week</title><link>https://flowfile.io/blog/ten-numbers-every-small-business-weekly/</link><guid isPermaLink="true">https://flowfile.io/blog/ten-numbers-every-small-business-weekly/</guid><description>A flagship weekly scorecard for small business owners: what to track, where to find each number, and how to stitch them into one report in under an hour.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Small Business</category><category>Metrics</category><category>KPIs</category><category>Beginner</category><category>Weekly Report</category><author>Edward van Eechoud</author></item><item><title>What Is a Data Pipeline? A Small Business Owner&apos;s Guide</title><link>https://flowfile.io/blog/what-is-a-data-pipeline-for-small-business/</link><guid isPermaLink="true">https://flowfile.io/blog/what-is-a-data-pipeline-for-small-business/</guid><description>If you&apos;ve heard the term &apos;data pipeline&apos; and assumed it wasn&apos;t for you, here&apos;s the plain-English version for small business owners — no engineering background required.</description><pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate><category>Small Business</category><category>Beginner</category><category>Data Pipeline</category><category>Glossary</category><category>No-code</category><author>Edward van Eechoud</author></item></channel></rss>