<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://duckdb.org/feed.xml" rel="self" type="application/atom+xml" /><link href="https://duckdb.org/" rel="alternate" type="text/html" /><updated>2026-04-23T10:02:05+00:00</updated><id>https://duckdb.org/feed.xml</id><title type="html">DuckDB</title><subtitle>DuckDB is an in-process SQL database management system focused on analytical query processing. It is designed to be easy to install and easy to use. DuckDB has no external dependencies. DuckDB has bindings for C/C++, Python, R, Java, Node.js, Go and other languages.</subtitle><author><name>GitHub User</name><email>your-email@domain.com</email></author><entry><title type="html">Announcing DuckDB 1.5.2</title><link href="https://duckdb.org/2026/04/13/announcing-duckdb-152.html" rel="alternate" type="text/html" title="Announcing DuckDB 1.5.2" /><published>2026-04-13T00:00:00+00:00</published><updated>2026-04-13T00:00:00+00:00</updated><id>https://duckdb.org/2026/04/13/announcing-duckdb-152</id><content type="html" xml:base="https://duckdb.org/2026/04/13/announcing-duckdb-152.html"><![CDATA[<p>In this blog post, we highlight a few important fixes in DuckDB v1.5.2, the second patch release in <a href="/2026/03/09/announcing-duckdb-150.html">DuckDB's v1.5 line</a>.
You can find the complete <a href="https://github.com/duckdb/duckdb/releases/tag/v1.5.2">release notes on GitHub</a>.</p>

<p>To install the new version, please visit the <a href="/install/">installation page</a>.</p>

<h2 id="data-lake-and-lakehouse-formats">Data Lake and Lakehouse Formats</h2>

<h3 id="ducklake">DuckLake</h3>

<p>We are proud to release a stable, production-ready lakehouse specification and its reference implementation in DuckDB.</p>

<p>We published a <a href="https://ducklake.select/2026/04/13/ducklake-10/">detailed blog post on the DuckLake site</a> but here's a quick summary: DuckLake v1.0 ships dozens of bugfixes and guarantees backward compatibility. It also adds a number of new features: <a href="https://ducklake.select/2026/04/02/data-inlining-in-ducklake/">data inlining</a>, sorted tables, bucket partitioning, and deletion buffers stored as Iceberg-compatible Puffin files. More on these in the <a href="https://ducklake.select/2026/04/13/ducklake-10/">announcement blog post</a>.</p>

<h3 id="iceberg">Iceberg</h3>

<p>The <a href="/docs/current/core_extensions/iceberg/overview.html">Iceberg extension</a> ships a number of new features. It now supports the following:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">GEOMETRY</code> type</li>
  <li><code class="language-plaintext highlighter-rouge">ALTER TABLE</code> statement</li>
  <li>Updates and deletes from <a href="https://iceberg.apache.org/docs/latest/partitioning/">partitioned tables</a></li>
  <li>Truncate and bucket partition transforms</li>
</ul>

<p>Last week, DuckDB Labs engineer Tom Ebergen gave a talk at the <a href="https://www.icebergsummit.org/">Iceberg Summit</a> titled <a href="/library/building-duckdb-iceberg-exploring-the-iceberg-ecosystem/">“Building DuckDB-Iceberg: Exploring the Iceberg Ecosystem”</a>, where he shared his experiences with Iceberg.</p>

<h2 id="preliminary-jepsen-test-results">Preliminary Jepsen Test Results</h2>

<p>To make DuckDB as robust as possible, we started a collaboration with <a href="https://jepsen.io/">Jepsen</a>. The preliminary test suite is available at <a href="https://github.com/duckdb/duckdb-jepsen">https://github.com/duckdb/duckdb-jepsen</a>.</p>

<p>The test suite uncovered a bug triggered by <code class="language-plaintext highlighter-rouge">INSERT INTO</code> statements that perform conflict resolution on a primary key; a <a href="https://github.com/duckdb/duckdb/pull/21489">fix shipped</a> in this release.</p>
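<p>For illustration, the affected class of statements looks like the following upsert, which resolves a primary-key conflict at insert time (the table and column names are made up for this sketch):</p>

```sql
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);

INSERT INTO accounts VALUES (1, 100);

-- Conflict resolution on the primary key: statements of this
-- shape exercised the code path that was fixed.
INSERT INTO accounts VALUES (1, 150)
ON CONFLICT (id) DO UPDATE SET balance = excluded.balance;
```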

<h2 id="new-online-shell">New Online Shell</h2>

<p>The online <a href="/docs/current/clients/wasm/overview.html">WebAssembly</a> shell at <a href="https://shell.duckdb.org/"><code class="language-plaintext highlighter-rouge">shell.duckdb.org</code></a> received a complete overhaul.
A highlight of the new shell is the ability to store and list files using the <code class="language-plaintext highlighter-rouge">.files</code> dot command and its variants.</p>

<p>Using the file storage feature, you can turn your browser session into a workbench: you can drag and drop files from your local file system to upload them, create new files using DuckDB's <a href="/docs/current/sql/statements/copy.html#copy--to"><code class="language-plaintext highlighter-rouge">COPY ... TO</code> statement</a>, and download the results. For more information on this feature, use the <code class="language-plaintext highlighter-rouge">.help</code> command.</p>
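<p>As a quick sketch (the file names are illustrative), a typical round trip queries a file you dropped into the session and writes a result file that you can then download from the shell:</p>

```sql
-- Read an uploaded file into a table:
CREATE TABLE trips AS FROM 'trips.csv';

-- Write an aggregation result to a new file, ready for download:
COPY (
    SELECT pickup_zone, count(*) AS num_trips
    FROM trips
    GROUP BY pickup_zone
) TO 'trips_per_zone.csv' (FORMAT csv, HEADER);
```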

<p><img src="/images/blog/online-shell-example.png" alt="Example use of the new online shell at shell.duckdb.org" width="800" /></p>

<p>The new shell comes with a few built-in datasets: you're welcome to try them out and experiment.
Your old links to <code class="language-plaintext highlighter-rouge">shell.duckdb.org</code> should still work, but if you experience any problems, please submit an issue in the <a href="https://github.com/duckdb/duckdb-wasm"><code class="language-plaintext highlighter-rouge">duckdb-wasm</code> repository</a>.</p>

<h2 id="benchmarks">Benchmarks</h2>

<p>We benchmarked DuckDB using the Linux v7 kernel on an <a href="https://instances.vantage.sh/aws/ec2/r8gd.8xlarge?currency=USD">r8gd.8xlarge</a> instance with 32 vCPUs, 256 GiB RAM, and an NVMe SSD.
We first ran the scale factor 300 test on Ubuntu 24.04 LTS, then upgraded to Ubuntu 26.04 beta.
We observed a <strong>~10% improvement</strong> in the composite TPC-H metric (QphH), which jumped from 778,041 to 854,676.</p>

<h2 id="coming-up">Coming Up</h2>

<p>This quarter, we have quite a few exciting events lined up.</p>

<p><strong>DuckCon #7.</strong> On June 24, we'll host our next user conference, <a href="/events/2026/06/24/duckcon7/">DuckCon #7</a>, in Amsterdam's beautiful <a href="https://www.kit.nl/about-us/">Royal Tropical Institute</a>. If you have been building cool things with DuckDB, consider submitting a talk by April 22. Registrations are also open – and free!</p>

<p><strong>AI Council Talk.</strong> On May 12, DuckDB co-creator Hannes Mühleisen will give a talk at AI Council 2026 titled <a href="/library/super-secret-next-big-thing-for-duckdb/">“Super-Secret Next Big Thing for DuckDB”</a>. Well, at this point, we cannot tell you more than that he will present the super-secret next big thing for DuckDB. But if you cannot make it, don't worry: we'll publish the presentation afterwards.</p>

<p><strong>Ubuntu Summit Talk.</strong> We already talked about performance on Ubuntu. In late May, Gábor Szárnyas of DuckDB Labs will give a talk titled <a href="/library/duckdb-not-quack-science/">“DuckDB: Not Quack Science”</a> at the <a href="https://ubuntu.com/summit">Ubuntu Summit</a>.</p>

<h2 id="conclusion">Conclusion</h2>

<p>This post is a short summary of the changes in v1.5.2. As usual, you can find the <a href="https://github.com/duckdb/duckdb/releases/tag/v1.5.2">full release notes on GitHub</a>.</p>]]></content><author><name>The DuckDB team</name></author><category term="release" /><summary type="html"><![CDATA[We are releasing DuckDB version v1.5.2, a patch release with bugfixes and performance improvements, and support for the DuckLake v1.0 lakehouse format.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://duckdb.org/images/blog/thumbs/duckdb-release-1-5-2.png" /><media:content medium="image" url="https://duckdb.org/images/blog/thumbs/duckdb-release-1-5-2.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">DuckLake v1.0: The Lakehouse Format Built on SQL Reaches Production-Readiness</title><link href="https://duckdb.org/2026/04/13/ducklake-10.html" rel="alternate" type="text/html" title="DuckLake v1.0: The Lakehouse Format Built on SQL Reaches Production-Readiness" /><published>2026-04-13T00:00:00+00:00</published><updated>2026-04-13T00:00:00+00:00</updated><id>https://duckdb.org/2026/04/13/ducklake-10</id><content type="html" xml:base="https://duckdb.org/2026/04/13/ducklake-10.html"><![CDATA[]]></content><author><name>The DuckDB team</name></author><category term="extensions" /><summary type="html"><![CDATA[We released the DuckLake v1.0 standard!]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://duckdb.org/images/blog/thumbs/ducklake-1-0.png" /><media:content medium="image" url="https://duckdb.org/images/blog/thumbs/ducklake-1-0.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Data Inlining in DuckLake: Unlocking Streaming for Data Lakes</title><link href="https://duckdb.org/2026/04/02/data-inlining-in-ducklake.html" rel="alternate" type="text/html" title="Data Inlining in DuckLake: Unlocking Streaming for Data Lakes" 
/><published>2026-04-02T00:00:00+00:00</published><updated>2026-04-02T00:00:00+00:00</updated><id>https://duckdb.org/2026/04/02/data-inlining-in-ducklake</id><content type="html" xml:base="https://duckdb.org/2026/04/02/data-inlining-in-ducklake.html"><![CDATA[]]></content><author><name>Pedro Holanda</name></author><category term="deep dive" /><summary type="html"><![CDATA[DuckLake’s data inlining stores small updates directly in the catalog, eliminating the “small files problem” and making continuous streaming into data lakes practical. Our benchmark shows 926× faster queries and 105× faster ingestion when compared to Iceberg.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://duckdb.org/images/blog/thumbs/ducklake-inlining.png" /><media:content medium="image" url="https://duckdb.org/images/blog/thumbs/ducklake-inlining.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">DuckDB Now Speaks Dutch!</title><link href="https://duckdb.org/2026/04/01/duckdb-now-speaks-dutch.html" rel="alternate" type="text/html" title="DuckDB Now Speaks Dutch!" /><published>2026-04-01T00:00:00+00:00</published><updated>2026-04-01T00:00:00+00:00</updated><id>https://duckdb.org/2026/04/01/duckdb-now-speaks-dutch</id><content type="html" xml:base="https://duckdb.org/2026/04/01/duckdb-now-speaks-dutch.html"><![CDATA[<p>Historically speaking, SQL queries have always been formulated in English. The initial name of the language was even Structured <strong>English</strong> Query Language (SEQUEL), before it became SQL. Now, what if the Dutch hadn't traded away New Amsterdam (present-day New York)? Would we all have been writing SQL in Dutch instead?</p>

<p>Well, wonder no longer. Today we're releasing <a href="/community_extensions/extensions/eenddb.html"><strong>EendDB</strong></a>: a DuckDB extension that brings you the <strong>Gestructureerde Zoektaal</strong>, or GZT for short.</p>

<p>Want joins? We've got <code class="language-plaintext highlighter-rouge">SAMENVOEGEN</code>. Aggregates? <code class="language-plaintext highlighter-rouge">GROEP PER</code>. Window functions? Those work too — though you'll have to look up the Dutch keywords in the repository yourself.</p>

<p>You can try it out right now in <a href="/2026/03/23/announcing-duckdb-151.html">DuckDB v1.5.1</a>:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">INSTALL</span><span class="n"> eenddb</span> <span class="k">FROM</span> <span class="n">community</span><span class="p">;</span>
<span class="k">LOAD</span><span class="n"> eenddb</span><span class="p">;</span>
<span class="k">CALL</span> <span class="nf">enable_dutch_parser</span><span class="p">();</span>

<span class="k">MAAK</span> <span class="k">TABEL</span> <span class="n">eend</span> <span class="p">(</span>
    <span class="n">id</span>        <span class="nb">GEHEEL_GETAL</span><span class="p">,</span>
    <span class="n">naam</span>      <span class="nb">TEKST</span><span class="p">,</span>
    <span class="n">leeftijd</span>  <span class="nb">GEHEEL_GETAL</span><span class="p">,</span>
    <span class="n">gewicht</span>   <span class="nb">KOMMAGETAL</span><span class="p">,</span>
    <span class="n">soort</span>     <span class="nb">TEKST</span>
<span class="p">);</span>

<span class="k">TOEVOEGEN</span> <span class="k">AAN</span> <span class="n">eend</span> <span class="k">WAARDEN</span>
    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="s1">'Donald'</span><span class="p">,</span>  <span class="mi">29</span><span class="p">,</span> <span class="mf">1.2</span><span class="p">,</span> <span class="s1">'Wilde eend'</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="s1">'Daffy'</span><span class="p">,</span>   <span class="mi">35</span><span class="p">,</span> <span class="mf">1.5</span><span class="p">,</span> <span class="s1">'Zwarte eend'</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="s1">'Daisy'</span><span class="p">,</span>   <span class="mi">27</span><span class="p">,</span> <span class="mf">1.1</span><span class="p">,</span> <span class="s1">'Wilde eend'</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="s1">'Scrooge'</span><span class="p">,</span> <span class="mi">75</span><span class="p">,</span> <span class="mf">1.8</span><span class="p">,</span> <span class="s1">'Wilde eend'</span><span class="p">);</span>

<span class="k">SELECTEER</span> <span class="o">*</span>
<span class="k">VAN</span> <span class="n">eend</span>
<span class="k">WAARBIJ</span> <span class="n">gewicht</span> <span class="o">&gt;</span> <span class="mf">1.2</span> <span class="k">EN</span> <span class="n">naam</span> <span class="k">ZOALS</span> <span class="s1">'%D%'</span>
<span class="k">VOLGORDE</span> <span class="nb">PER</span> <span class="n">leeftijd</span><span class="p">;</span>
</code></pre></div></div>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌───────┬─────────┬──────────┬─────────┬─────────────┐
│  id   │  naam   │ leeftijd │ gewicht │    soort    │
│ int32 │ varchar │  int32   │  float  │   varchar   │
├───────┼─────────┼──────────┼─────────┼─────────────┤
│     2 │ Daffy   │       35 │     1.5 │ Zwarte eend │
└───────┴─────────┴──────────┴─────────┴─────────────┘
</code></pre></div></div>

<p>Of course, no query language is complete without joins and aggregates. Let's create a second table and count the ducks per <em>soort:</em></p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">MAAK</span> <span class="k">TABEL</span> <span class="n">soorten</span> <span class="p">(</span><span class="n">soort</span> <span class="nb">TEKST</span><span class="p">,</span> <span class="n">leefgebied</span> <span class="nb">TEKST</span><span class="p">);</span>

<span class="k">TOEVOEGEN</span> <span class="k">AAN</span> <span class="n">soorten</span> <span class="k">WAARDEN</span>
    <span class="p">(</span><span class="s1">'Wilde eend'</span><span class="p">,</span>  <span class="s1">'Meren en rivieren'</span><span class="p">),</span>
    <span class="p">(</span><span class="s1">'Zwarte eend'</span><span class="p">,</span> <span class="s1">'Kustgebieden'</span><span class="p">);</span>

<span class="k">SELECTEER</span> <span class="n">s.leefgebied</span><span class="p">,</span> <span class="nf">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">ALS</span> <span class="n">aantal_eenden</span>
<span class="k">VAN</span> <span class="n">eend</span> <span class="k">ALS</span> <span class="n">e</span>
<span class="k">LINKS</span> <span class="k">SAMENVOEGEN</span> <span class="n">soorten</span> <span class="k">ALS</span> <span class="n">s</span> <span class="k">OP</span> <span class="n">e.soort</span> <span class="o">=</span> <span class="n">s.soort</span>
<span class="k">GROEP</span> <span class="nb">PER</span> <span class="n">s.leefgebied</span>
<span class="k">VOLGORDE</span> <span class="nb">PER</span> <span class="n">aantal_eenden</span> <span class="k">AFLOPEND</span><span class="p">;</span>
</code></pre></div></div>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌───────────────────┬───────────────┐
│    leefgebied     │ aantal_eenden │
│      varchar      │     int64     │
├───────────────────┼───────────────┤
│ Meren en rivieren │             3 │
│ Kustgebieden      │             1 │
└───────────────────┴───────────────┘
</code></pre></div></div>

<p>After we are done playing around, we obviously have to clean up after ourselves. Rather than <code class="language-plaintext highlighter-rouge">DROP</code> a table, in Dutch we like to throw it away (“weggooien”):</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">GOOI_WEG</span> <span class="k">TABEL</span> <span class="n">eend</span><span class="p">;</span>
<span class="k">GOOI_WEG</span> <span class="k">TABEL</span> <span class="n">soorten</span><span class="p">;</span>
</code></pre></div></div>

<p>Under the hood, the parser uses DuckDB's <a href="/2026/03/09/announcing-duckdb-150.html#peg-parser">new experimental parser</a>, based on a <a href="/2024/11/22/runtime-extensible-parsers.html">Parsing Expression Grammar</a>.</p>

<p>For more examples, check out the <a href="https://github.com/Dtenwolde/eenddb/">repository on GitHub</a>.</p>]]></content><author><name>Daniël ten Wolde</name></author><category term="extensions" /><summary type="html"><![CDATA[DuckDB now speaks Dutch! Load the EendDB community extension and start writing your queries in het Nederlands.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://duckdb.org/images/blog/thumbs/duckdb-now-speaks-dutch.png" /><media:content medium="image" url="https://duckdb.org/images/blog/thumbs/duckdb-now-speaks-dutch.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Announcing DuckDB 1.5.1</title><link href="https://duckdb.org/2026/03/23/announcing-duckdb-151.html" rel="alternate" type="text/html" title="Announcing DuckDB 1.5.1" /><published>2026-03-23T00:00:00+00:00</published><updated>2026-03-23T00:00:00+00:00</updated><id>https://duckdb.org/2026/03/23/announcing-duckdb-151</id><content type="html" xml:base="https://duckdb.org/2026/03/23/announcing-duckdb-151.html"><![CDATA[<p>In this blog post, we highlight a few important fixes in DuckDB v1.5.1, the first patch release in <a href="/2026/03/09/announcing-duckdb-150.html">DuckDB's v1.5 line</a>.
You can find the complete <a href="https://github.com/duckdb/duckdb/releases/tag/v1.5.1">release notes on GitHub</a>.</p>

<p>To install the new version, please visit the <a href="/install/">installation page</a>.</p>

<h2 id="data-lake-and-lakehouse-formats">Data Lake and Lakehouse Formats</h2>

<h3 id="lance-support">Lance Support</h3>

<p>Thanks to a collaboration with LanceDB, DuckDB now supports reading and writing the <a href="https://github.com/lance-format/lance/">Lance lakehouse format</a> through the <a href="/docs/current/core_extensions/lance.html"><code class="language-plaintext highlighter-rouge">lance</code> core extension</a>.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">INSTALL</span><span class="n"> lance</span><span class="p">;</span>
<span class="k">LOAD</span><span class="n"> lance</span><span class="p">;</span>
</code></pre></div></div>

<p>You can write to Lance as follows:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">COPY</span> <span class="p">(</span>
    <span class="k">SELECT</span> <span class="mi">1</span><span class="p">::</span><span class="nb">BIGINT</span> <span class="k">AS</span> <span class="n">id</span><span class="p">,</span> <span class="s1">'a'</span><span class="p">::</span><span class="nb">VARCHAR</span> <span class="k">AS</span> <span class="n">s</span>
    <span class="nb">UNION</span> <span class="k">ALL</span>
    <span class="k">SELECT</span> <span class="mi">2</span><span class="p">::</span><span class="nb">BIGINT</span> <span class="k">AS</span> <span class="n">id</span><span class="p">,</span> <span class="s1">'b'</span><span class="p">::</span><span class="nb">VARCHAR</span> <span class="k">AS</span> <span class="n">s</span>
<span class="p">)</span> <span class="k">TO</span> <span class="s1">'example.lance'</span> <span class="p">(</span><span class="k">FORMAT</span> <span class="k">lance</span><span class="p">);</span>
</code></pre></div></div>

<p>And read it like so:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="nf">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">FROM</span> <span class="s1">'example.lance'</span><span class="p">;</span>
</code></pre></div></div>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌──────────────┐
│ count_star() │
│    int64     │
├──────────────┤
│            2 │
└──────────────┘
</code></pre></div></div>

<blockquote>
  <p>Lance support is also available for DuckDB v1.4.4 LTS and v1.5.0.</p>
</blockquote>

<h3 id="iceberg-support">Iceberg Support</h3>

<p>We extended support for <a href="https://iceberg.apache.org/spec/#version-3">Iceberg v3</a> tables, including:</p>

<ul>
  <li>the <a href="https://github.com/duckdb/duckdb-iceberg/pull/474"><code class="language-plaintext highlighter-rouge">VARIANT</code></a> and <a href="https://github.com/duckdb/duckdb-iceberg/pull/765"><code class="language-plaintext highlighter-rouge">TIMESTAMP_NS</code></a> types</li>
  <li><a href="https://iceberg.apache.org/spec/#default-values">default values</a></li>
  <li><a href="https://github.com/duckdb/duckdb-iceberg/pull/728">deletion vectors</a> (enabling deletes and updates on v3 tables)</li>
  <li><a href="https://github.com/duckdb/duckdb-iceberg/pull/744">inserting into a partitioned table</a></li>
  <li><a href="https://github.com/duckdb/duckdb-iceberg/pull/744">creating a partitioned table</a></li>
  <li><a href="https://github.com/duckdb/duckdb-iceberg/pull/765">support for Parquet <code class="language-plaintext highlighter-rouge">COPY</code> options</a></li>
</ul>

<h2 id="configuration-options">Configuration Options</h2>

<p>The <a href="/docs/current/core_extensions/httpfs/overview.html"><code class="language-plaintext highlighter-rouge">httpfs</code> extension</a> has a <a href="https://github.com/duckdb/duckdb-httpfs/pull/285">new setting</a>:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SET</span> <span class="n">force_download_threshold</span> <span class="o">=</span> <span class="mi">2_000_000</span><span class="p">;</span>
</code></pre></div></div>

<p>This forces a full download of any file smaller than the threshold, here 2 MB.
The default value is 0, but we may revisit this default in a future release.</p>

<h2 id="fixes">Fixes</h2>

<h3 id="globbing-performance">Globbing Performance</h3>

<p>Users reported (thanks!) that S3 globbing performance had degraded in certain cases; this has now been <a href="https://github.com/duckdb/duckdb-httpfs/pull/284">addressed</a>.</p>

<h3 id="non-interactive-shell">Non-Interactive Shell</h3>

<p>On Linux and macOS, DuckDB's new CLI had an issue executing the input received through a <a href="https://github.com/duckdb/duckdb/issues/21243">non-interactive shell</a>.
In practice, this meant that scripts piped into DuckDB were not executed.
For v1.5.0, there was a <a href="/docs/current/guides/troubleshooting/command_line.html">simple workaround available</a>.
We fixed the issue in v1.5.1, so there is no need for a workaround.</p>

<h3 id="indexes">Indexes</h3>

<p>This release ships <a href="https://github.com/duckdb/duckdb/pull/21270">two</a> <a href="https://github.com/duckdb/duckdb/pull/21427">fixes</a> for <a href="/docs/current/sql/indexes.html">ART indexes</a>.
If you are using indexes in your workload (directly or through primary key or unique constraints), we recommend updating to v1.5.1 as soon as possible.</p>
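<p>Note that you may be relying on ART indexes even without an explicit <code class="language-plaintext highlighter-rouge">CREATE INDEX</code> statement, since key and unique constraints are backed by them as well (the schema below is illustrative):</p>

```sql
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,  -- backed by an ART index
    email VARCHAR UNIQUE,       -- backed by an ART index
    name  VARCHAR
);

-- An explicitly created index is also an ART index:
CREATE INDEX users_name_idx ON users (name);
```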

<h2 id="landing-page-improvements">Landing Page Improvements</h2>

<p>We are shipping a new section of the landing page that showcases all the technologies DuckDB can run on… or in! <a href="/#ecosystem">Check it out!</a></p>

<h2 id="conclusion">Conclusion</h2>

<p>This post is a short summary of the changes in v1.5.1. As usual, you can find the <a href="https://github.com/duckdb/duckdb/releases/tag/v1.5.1">full release notes on GitHub</a>.</p>]]></content><author><name>The DuckDB team</name></author><category term="release" /><summary type="html"><![CDATA[We are releasing DuckDB version 1.5.1, a patch release with bugfixes, performance improvements and support for the Lance lakehouse format.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://duckdb.org/images/blog/thumbs/duckdb-release-1-5-1.png" /><media:content medium="image" url="https://duckdb.org/images/blog/thumbs/duckdb-release-1-5-1.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">DuckDB.ExtensionKit: Building DuckDB Extensions in C#</title><link href="https://duckdb.org/2026/03/20/duckdb-extensionkit-csharp.html" rel="alternate" type="text/html" title="DuckDB.ExtensionKit: Building DuckDB Extensions in C#" /><published>2026-03-20T00:00:00+00:00</published><updated>2026-03-20T00:00:00+00:00</updated><id>https://duckdb.org/2026/03/20/duckdb-extensionkit-csharp</id><content type="html" xml:base="https://duckdb.org/2026/03/20/duckdb-extensionkit-csharp.html"><![CDATA[<h2 id="introduction">Introduction</h2>

<p>DuckDB has a flexible extension mechanism that allows extensions to be loaded dynamically at runtime. This makes it easy to extend DuckDB’s main feature set without adding everything to the main binary. Extensions can add support for new file formats, introduce custom types, or provide new scalar and table functions. A significant part of DuckDB’s functionality is actually implemented using this extension mechanism in the form of core extensions, which are developed alongside the engine itself by the DuckDB team. For example, DuckDB can read and write JSON files via the <code class="language-plaintext highlighter-rouge">json</code> extension and integrate with PostgreSQL using the <code class="language-plaintext highlighter-rouge">postgres</code> extension.</p>

<p>DuckDB also has a thriving ecosystem of <a href="/community_extensions/">community extensions</a>, i.e., third-party extensions, maintained by community members, covering a wide range of use cases and integrations. For example, you can expose additional cryptographic functionality through the <code class="language-plaintext highlighter-rouge">crypto</code> community extension.</p>

<h2 id="how-extensions-are-built-today">How Extensions Are Built Today</h2>

<p>Today, developers can use the same C++ API that the core extensions use for developing extensions. A template for creating extensions is available in the <a href="https://github.com/duckdb/extension-template/"><code class="language-plaintext highlighter-rouge">extension-template</code> repository</a>. While powerful, the C++ extension API is tightly coupled to DuckDB’s internal APIs, so it can (and often will) change between DuckDB versions. Additionally, using it requires building the whole DuckDB engine and its documentation is not as complete as that of the C API.</p>

<p>To solve these issues, DuckDB also provides an <a href="https://github.com/duckdb/extension-template-c">experimental template</a> for C/C++ based extensions that link with the <strong>C Extension API</strong> of DuckDB. This API provides a stable, backwards-compatible interface for developing extensions and is designed to allow extensions to work across different DuckDB versions. Because it is a C-based API, it can also be used from other programming languages such as Rust.</p>

<p>Even with the C API, writing extensions still means working at a low level, performing manual memory management, and writing a lot of boilerplate code. While the C API solves stability and compatibility, it doesn’t solve <em>developer experience</em> for higher-level ecosystems. This is where DuckDB.ExtensionKit comes in, aiming to make extension development more accessible to developers working in the .NET ecosystem. By building on top of the DuckDB C Extension API and compiling extensions using the <a href="https://learn.microsoft.com/en-us/dotnet/core/deploying/native-aot/">.NET Native AOT (Ahead-of-Time) compilation</a>, DuckDB.ExtensionKit offers the best of both worlds: native DuckDB extensions that integrate like any other extension, combined with the productivity and rich library ecosystem of C# and .NET.</p>

<h2 id="duckdbextensionkit">DuckDB.ExtensionKit</h2>

<p>DuckDB.ExtensionKit provides a set of C# APIs and build tooling for implementing DuckDB extensions. It exposes the low-level DuckDB C Extension API as C# methods, and also provides type-safe, higher-level APIs for defining scalar and table functions, while still producing native DuckDB extensions. The toolkit also includes a source generator that automatically generates the required boilerplate code, including the native entry point and API initialization.</p>

<p>With DuckDB.ExtensionKit, building an extension closely resembles building a regular C# library. Extension authors create a C# project that references the ExtensionKit runtime and implements functions using the provided, type-safe APIs that expose DuckDB concepts.</p>

<p>At build time, the source generator emits the required boilerplate, including the native entry point and extension initialization. The project is then compiled using .NET Native AOT, producing a native DuckDB extension binary that can be loaded and used by DuckDB like any other extension, without requiring a .NET runtime.</p>

<p>To give a concrete example of this process, the following snippet shows a small DuckDB extension implemented using DuckDB.ExtensionKit that exposes both a scalar function and a table function for working with JWTs (JSON Web Tokens). At a high level, writing an extension with DuckDB.ExtensionKit involves defining a C# type that represents the extension and registering functions explicitly. In the example below, this is done by creating a <code class="language-plaintext highlighter-rouge">partial</code> class annotated with the <code class="language-plaintext highlighter-rouge">[DuckDBExtension]</code> attribute and implementing the <code class="language-plaintext highlighter-rouge">RegisterFunctions</code> method. The implementation makes use of the <code class="language-plaintext highlighter-rouge">System.IdentityModel.Tokens.Jwt</code> NuGet package, illustrating how extensions can easily take advantage of existing .NET libraries.</p>

<p>We'll add two functions: a scalar function for extracting <em>a single claim</em> from a JWT and a table function for extracting <em>multiple claims</em>.</p>

<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">static</span> <span class="k">partial</span> <span class="k">class</span> <span class="nc">JwtExtension</span>
<span class="p">{</span>
  <span class="k">private</span> <span class="k">static</span> <span class="k">void</span> <span class="nf">RegisterFunctions</span><span class="p">(</span><span class="n">DuckDBConnection</span> <span class="n">connection</span><span class="p">)</span>
  <span class="p">{</span>
    <span class="n">connection</span><span class="p">.</span><span class="n">RegisterScalarFunction</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="kt">string</span><span class="p">,</span> <span class="kt">string</span><span class="p">?&gt;(</span><span class="s">"extract_claim_from_jwt"</span><span class="p">,</span> <span class="n">ExtractClaimFromJwt</span><span class="p">);</span>

    <span class="n">connection</span><span class="p">.</span><span class="nf">RegisterTableFunction</span><span class="p">(</span><span class="s">"extract_claims_from_jwt"</span><span class="p">,</span> <span class="p">(</span><span class="kt">string</span> <span class="n">jwt</span><span class="p">)</span> <span class="p">=&gt;</span> <span class="nf">ExtractClaimsFromJwt</span><span class="p">(</span><span class="n">jwt</span><span class="p">),</span>
                                     <span class="n">c</span> <span class="p">=&gt;</span> <span class="k">new</span> <span class="p">{</span> <span class="n">claim_name</span> <span class="p">=</span> <span class="n">c</span><span class="p">.</span><span class="n">Key</span><span class="p">,</span> <span class="n">claim_value</span> <span class="p">=</span> <span class="n">c</span><span class="p">.</span><span class="n">Value</span> <span class="p">});</span>
  <span class="p">}</span>

  <span class="k">private</span> <span class="k">static</span> <span class="kt">string</span><span class="p">?</span> <span class="nf">ExtractClaimFromJwt</span><span class="p">(</span><span class="kt">string</span> <span class="n">jwt</span><span class="p">,</span> <span class="kt">string</span> <span class="n">claim</span><span class="p">)</span>
  <span class="p">{</span>
    <span class="kt">var</span> <span class="n">jwtHandler</span> <span class="p">=</span> <span class="k">new</span> <span class="nf">JwtSecurityTokenHandler</span><span class="p">();</span>
    <span class="kt">var</span> <span class="n">token</span> <span class="p">=</span> <span class="n">jwtHandler</span><span class="p">.</span><span class="nf">ReadJwtToken</span><span class="p">(</span><span class="n">jwt</span><span class="p">);</span>
    <span class="k">return</span> <span class="n">token</span><span class="p">.</span><span class="n">Claims</span><span class="p">.</span><span class="nf">FirstOrDefault</span><span class="p">(</span><span class="n">c</span> <span class="p">=&gt;</span> <span class="n">c</span><span class="p">.</span><span class="n">Type</span> <span class="p">==</span> <span class="n">claim</span><span class="p">)?.</span><span class="n">Value</span><span class="p">;</span>
  <span class="p">}</span>

  <span class="k">private</span> <span class="k">static</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="kt">string</span><span class="p">&gt;</span> <span class="nf">ExtractClaimsFromJwt</span><span class="p">(</span><span class="kt">string</span> <span class="n">jwt</span><span class="p">)</span>
  <span class="p">{</span>
    <span class="kt">var</span> <span class="n">jwtHandler</span> <span class="p">=</span> <span class="k">new</span> <span class="nf">JwtSecurityTokenHandler</span><span class="p">();</span>
    <span class="kt">var</span> <span class="n">token</span> <span class="p">=</span> <span class="n">jwtHandler</span><span class="p">.</span><span class="nf">ReadJwtToken</span><span class="p">(</span><span class="n">jwt</span><span class="p">);</span>
    <span class="k">return</span> <span class="n">token</span><span class="p">.</span><span class="n">Claims</span><span class="p">.</span><span class="nf">ToDictionary</span><span class="p">(</span><span class="n">c</span> <span class="p">=&gt;</span> <span class="n">c</span><span class="p">.</span><span class="n">Type</span><span class="p">,</span> <span class="n">c</span> <span class="p">=&gt;</span> <span class="n">c</span><span class="p">.</span><span class="n">Value</span><span class="p">);</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>In just 25 lines, we have built an extension that adds <code class="language-plaintext highlighter-rouge">extract_claim_from_jwt</code> and <code class="language-plaintext highlighter-rouge">extract_claims_from_jwt</code> functions to DuckDB. We can call these functions just like any other function. For example, to extract the <code class="language-plaintext highlighter-rouge">name</code> claim from a JWT, we can run:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="nf">extract_claim_from_jwt</span><span class="p">(</span>
    <span class="s1">'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsImtpZCI6ImExZmIyY2NjN2FiMjBiMDYyNzJmNGUxMjIwZDEwZmZlIn0.eyJpc3MiOiJodHRwczovL2lkcC5sb2NhbCIsImF1ZCI6Im15X2NsaWVudF9hcHAiLCJuYW1lIjoiR2lvcmdpIERhbGFraXNodmlsaSIsInN1YiI6IjViZTg2MzU5MDczYzQzNGJhZDJkYTM5MzIyMjJkYWJlIiwiYWRtaW4iOnRydWUsImV4cCI6MTc2NjU5MTI2NywiaWF0IjoxNzY2NTkwOTY3fQ.N7h2xc4rgS4oPo8IO9wyG1lnr2wqTUC80YudWTXp7rXmU2JdsUiweKmuYVVbygdJAR4PJmbQtak4_VuZg2fZFILVpzDyLvGITfUW_18XuDQ_SIm3VlfAuHOVHfruuvvSAfjUkTW2Jlrv3ihFYgusV58vjhcVFHssOGMEbtMNo10Jf62dczVVGNZXh_OOLS0nTLffhY94sZddqQIE56W8xhLK5YMO4gO8voMzhUwDwucnVvyNfui38MPDNdTSKjn3Ab0hG8jzOVhbYSCHf0eQsbxPzGtXUCJobScWDb78IphFWec6W4ugIYp5CMh3C_noQi94NYjQg2P-AJ5FLCKzKA'</span><span class="p">,</span>
    <span class="s1">'name'</span>
<span class="p">);</span>
</code></pre></div></div>

<p>This returns <code class="language-plaintext highlighter-rouge">Giorgi Dalakishvili</code>. Let's test the table function:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="nf">extract_claims_from_jwt</span><span class="p">(</span>
    <span class="s1">'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsImtpZCI6ImExZmIyY2NjN2FiMjBiMDYyNzJmNGUxMjIwZDEwZmZlIn0.eyJpc3MiOiJodHRwczovL2lkcC5sb2NhbCIsImF1ZCI6Im15X2NsaWVudF9hcHAiLCJuYW1lIjoiR2lvcmdpIERhbGFraXNodmlsaSIsInN1YiI6IjViZTg2MzU5MDczYzQzNGJhZDJkYTM5MzIyMjJkYWJlIiwiYWRtaW4iOnRydWUsImV4cCI6MTc2NjU5MTI2NywiaWF0IjoxNzY2NTkwOTY3fQ.N7h2xc4rgS4oPo8IO9wyG1lnr2wqTUC80YudWTXp7rXmU2JdsUiweKmuYVVbygdJAR4PJmbQtak4_VuZg2fZFILVpzDyLvGITfUW_18XuDQ_SIm3VlfAuHOVHfruuvvSAfjUkTW2Jlrv3ihFYgusV58vjhcVFHssOGMEbtMNo10Jf62dczVVGNZXh_OOLS0nTLffhY94sZddqQIE56W8xhLK5YMO4gO8voMzhUwDwucnVvyNfui38MPDNdTSKjn3Ab0hG8jzOVhbYSCHf0eQsbxPzGtXUCJobScWDb78IphFWec6W4ugIYp5CMh3C_noQi94NYjQg2P-AJ5FLCKzKA'</span>
<span class="p">);</span>
</code></pre></div></div>

<p>This returns:</p>

<div class="monospace_table"></div>

<!-- markdownlint-disable MD034 -->

<table>
  <thead>
    <tr>
      <th>claim_name</th>
      <th>claim_value</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>iss</td>
      <td>https://idp.local</td>
    </tr>
    <tr>
      <td>aud</td>
      <td>my_client_app</td>
    </tr>
    <tr>
      <td>name</td>
      <td>Giorgi Dalakishvili</td>
    </tr>
    <tr>
      <td>sub</td>
      <td>5be86359073c434bad2da3932222dabe</td>
    </tr>
    <tr>
      <td>admin</td>
      <td>true</td>
    </tr>
    <tr>
      <td>exp</td>
      <td>1766591267</td>
    </tr>
    <tr>
      <td>iat</td>
      <td>1766590967</td>
    </tr>
  </tbody>
</table>

<!-- markdownlint-enable MD034 -->

<h2 id="how-duckdbextensionkit-works">How DuckDB.ExtensionKit Works</h2>

<p>DuckDB.ExtensionKit relies on several modern C# language and runtime features to efficiently bridge DuckDB’s C extension API to managed code. These features make it possible to build native extensions in C# without introducing a managed runtime dependency at load time.</p>

<h3 id="function-pointers">Function Pointers</h3>

<p>DuckDB’s C extension API is exposed as a <strong>versioned function table</strong>: a large struct (<a href="https://github.com/duckdb/extension-template-c/blob/152f7fba8df6ef2d3c48caf344fead63aa1e0501/duckdb_capi/duckdb_extension.h#L70-L545">duckdb_ext_api_v1</a>) whose fields are C function pointers (e.g., <code class="language-plaintext highlighter-rouge">duckdb_open</code>, <code class="language-plaintext highlighter-rouge">duckdb_register_scalar_function</code>, <code class="language-plaintext highlighter-rouge">duckdb_vector_get_data</code>, and so on). DuckDB.ExtensionKit mirrors this mechanism in C#. It defines a <a href="https://github.com/Giorgi/DuckDB.ExtensionKit/blob/99e4b91d50c5c840a3c4f69ea92d4fd4e49e7b76/DuckDB.ExtensionKit/DuckDBExtApiV1.cs#L7-L551">C# representation of the struct</a> (<code class="language-plaintext highlighter-rouge">DuckDBExtApiV1</code>), where each field is declared as a <a href="https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/unsafe-code#function-pointers">C# function pointer</a> (<code class="language-plaintext highlighter-rouge">delegate* unmanaged[Cdecl]&lt;...&gt;</code>). This maps the C ABI directly: calling into DuckDB becomes a simple indirect call through a function pointer field, rather than a delegate invocation with runtime marshaling.</p>

<h3 id="entrypoint">Entrypoint</h3>

<p>A DuckDB extension needs to expose an <strong>entrypoint function</strong> following the C calling convention (the symbol exported from the binary is the extension’s name followed by <code class="language-plaintext highlighter-rouge">_init_c_api</code>) so that DuckDB can locate it when the extension is loaded. In the C extension template, this is handled with macros that generate the exported function and the surrounding boilerplate.</p>

<p>DuckDB.ExtensionKit follows the same model, but generates the boilerplate from C# instead of C macros. The source generator emits a native-compatible entrypoint that retrieves the API table (via the <code class="language-plaintext highlighter-rouge">access</code> object) and performs the required initialization, just like the C template does. The generated method is annotated with <code class="language-plaintext highlighter-rouge">[UnmanagedCallersOnly(EntryPoint = "...")]</code>, which instructs the .NET toolchain to <a href="https://learn.microsoft.com/en-us/dotnet/core/deploying/native-aot/interop#native-exports">export a real native symbol</a> with that name and make it callable from C. With .NET Native AOT, this becomes an actual exported function in the produced binary – allowing DuckDB to load and call into the extension exactly as it would for a C implementation.</p>

<h3 id="native-aot">Native AOT</h3>

<p>Finally, Native AOT is what makes this approach practical for DuckDB extensions. Once the extension code and generated sources are compiled, the project is published using .NET Native AOT. This step produces a native binary with no dependency on a managed runtime at load time. The resulting artifact is a native DuckDB extension that can be loaded and executed in the same way as extensions written in C or C++. From DuckDB’s perspective, there is no difference between an extension built with DuckDB.ExtensionKit and one implemented in a traditional native language.</p>

<h2 id="current-status-and-limitations">Current Status and Limitations</h2>

<p>DuckDB.ExtensionKit, just like the C extension template, is currently experimental. The APIs are still evolving, and not all extension features supported by DuckDB are exposed yet.</p>

<p>The toolkit relies on .NET Native AOT, which means extensions need to be built for specific target platforms (for example, <code class="language-plaintext highlighter-rouge">linux-x64</code>, <code class="language-plaintext highlighter-rouge">osx-arm64</code>, or <code class="language-plaintext highlighter-rouge">win-x64</code>). As with other native extensions, binaries are platform-specific and need to be built accordingly.</p>

<h2 id="build-your-own-extension-in-c">Build Your Own Extension in C#</h2>

<p><a href="https://github.com/Giorgi/DuckDB.ExtensionKit">DuckDB.ExtensionKit</a> is available as an open-source project on GitHub under the MIT license. The project includes example extensions that demonstrate how to define and build DuckDB extensions in C#. The repository contains a JWT-based example extension that showcases both scalar functions and table functions, as well as the full build and publishing workflow using .NET Native AOT.</p>

<p>Feedback, bug reports, and contributions are welcome through <a href="https://github.com/Giorgi/DuckDB.ExtensionKit/issues">GitHub issues</a>.</p>

<h2 id="closing-thoughts">Closing Thoughts</h2>

<p>DuckDB’s extension mechanism has proven to be a flexible foundation for extending the system without complicating the core engine. DuckDB.ExtensionKit explores how this mechanism can be made accessible to a broader audience by leveraging the .NET ecosystem, while still producing native extensions that integrate directly with DuckDB.</p>

<p>Although C# is typically viewed as a high-level language, this project demonstrates that it can also be used to implement low-level, ABI-compatible components when needed. By combining modern C# features with DuckDB’s existing extension interface, it is possible to write extensions in a high-level language without giving up control over native boundaries.</p>]]></content><author><name>Giorgi Dalakishvili</name></author><category term="extensions" /><summary type="html"><![CDATA[DuckDB.ExtensionKit brings DuckDB extension development to the .NET ecosystem. By building on DuckDB's stable C Extension API and leveraging .NET Native AOT compilation, it lets C# developers define scalar and table functions, which can be shipped as native DuckDB extensions.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://duckdb.org/images/blog/thumbs/vortex.svg" /><media:content medium="image" url="https://duckdb.org/images/blog/thumbs/vortex.svg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Big Data on the Cheapest MacBook</title><link href="https://duckdb.org/2026/03/11/big-data-on-the-cheapest-macbook.html" rel="alternate" type="text/html" title="Big Data on the Cheapest MacBook" /><published>2026-03-11T00:00:00+00:00</published><updated>2026-03-11T00:00:00+00:00</updated><id>https://duckdb.org/2026/03/11/big-data-on-the-cheapest-macbook</id><content type="html" xml:base="https://duckdb.org/2026/03/11/big-data-on-the-cheapest-macbook.html"><![CDATA[<p>Apple released the <a href="https://en.wikipedia.org/wiki/MacBook_Neo">MacBook Neo</a> today and there is no shortage of tech reviews explaining whether it's the right device for you if you are a student, a photographer or a writer.
What they <em>don't</em> tell you is whether it fits into our <a href="https://blobs.duckdb.org/merch/duckdb-2024-big-data-on-your-laptop-poster.pdf">Big Data on Your Laptop</a> ethos.
We wanted to answer this <em>using a data-driven approach,</em> so we went to the nearest Apple Store, picked one up and took it for a spin.</p>

<h2 id="whats-in-the-box">What's in the Box?</h2>

<p>Well, not much! If you buy this machine in the EU, there isn't even a charging brick included. All you get is the laptop and a braided USB-C cable. But you likely already have a few USB-C bricks lying around – let's move on to the laptop itself!</p>

<p><img src="/images/blog/macbook-neo/box.jpg" width="600" /></p>

<p>The only part of the hardware specification that you can select is the disk: you can pick either 256 or 512 GB.
As our mission is to deal with alleged “Big Data”, we picked the larger option, which brings the price to $700 in the US or €800 in the EU.
The amount of memory is fixed at 8 GB.
And while there is only a single CPU option, it is quite an interesting one:
this laptop is powered by the 6-core <a href="https://en.wikipedia.org/wiki/Apple_A18#CPU">Apple A18 Pro</a>, originally built for the iPhone 16 Pro.</p>

<p>It turns out that we have already <a href="/2024/12/06/duckdb-tpch-sf100-on-mobile.html#a-song-of-dry-ice-and-fire">tested this phone</a> under some unusual circumstances. Back in 2024, with DuckDB v1.2-dev, we found that the iPhone 16 Pro could complete all <a href="/docs/current/core_extensions/tpch.html">TPC-H</a> queries at scale factor 100 in about 10 minutes when air-cooled and in less than 8 minutes while lying in a box of dry ice. The MacBook Neo should definitely be able to handle this workload – but maybe it can even handle a bit more. Cue the inevitable benchmarks!</p>

<h2 id="clickbench">ClickBench</h2>

<p>For our first experiment, we used <a href="https://benchmark.clickhouse.com/">ClickBench</a>, an analytical database benchmark. ClickBench has 43 queries that focus on aggregation and filtering operations. The queries run on a single wide table with 100M rows, which takes about 14 GB when serialized to Parquet and 75 GB when stored as CSV.</p>

<h3 id="benchmark-environment">Benchmark Environment</h3>

<p>We ported <a href="https://github.com/szarnyasg/ClickBench/tree/duckdb-macos-compatible">ClickBench's DuckDB implementation to macOS</a> and ran it on the MacBook Neo using the freshly minted <a href="/2026/03/09/announcing-duckdb-150.html">v1.5.0 release</a>.
We only applied a small tweak: as suggested in <a href="/docs/current/guides/performance/my_workload_is_slow.html">our performance guide</a>, we slightly lowered the memory limit to 5 GB to reduce reliance on OS swapping and to let DuckDB handle memory management for <a href="/docs/current/guides/performance/how_to_tune_workloads.html#larger-than-memory-workloads-out-of-core-processing">larger-than-memory workloads</a>. This is a common trick in memory-constrained environments where other processes are likely to use more than 20% of the total system memory.</p>

<p><img src="/images/blog/macbook-neo/laptop.jpg" width="600" /></p>

<p>We also re-ran ClickBench with DuckDB v1.5.0 on two cloud instances, yielding the following lineup:</p>

<ul>
  <li>The star of our show, the MacBook Neo with 2 performance cores, 4 efficiency cores and 8 GB RAM</li>
  <li><a href="https://instances.vantage.sh/aws/ec2/c6a.4xlarge">c6a.4xlarge</a> with 16 AMD EPYC vCPU cores and 32 GB RAM. This instance is <a href="https://benchmark.clickhouse.com/#system=-&amp;type=-&amp;machine=+ca4e&amp;cluster_size=-&amp;opensource=-&amp;hardware=+c&amp;tuned=+n&amp;metric=combined&amp;queries=-">popular in ClickBench</a> with about 80 individual results reported.</li>
  <li><a href="https://instances.vantage.sh/aws/ec2/c8g.metal-48xl">c8g.metal-48xl</a> with a whopping 192 Graviton4 vCPU cores and 384 GB RAM. This instance is often at the top of the <a href="https://benchmark.clickhouse.com/">overall ClickBench leaderboard</a>.</li>
</ul>

<p>The benchmark script first loaded the Parquet file into the database. Then, as per <a href="https://github.com/ClickHouse/ClickBench/blob/main/README.md#rules-and-contribution">ClickBench's rules</a>, it ran each query three times to capture both cold runs (the first run when caches are cold) and hot runs (when the system has a chance to exploit e.g. file system caching).</p>

<h3 id="results-and-analysis">Results and Analysis</h3>

<p>Our experiments produced the following aggregate runtimes, in seconds:</p>

<table>
  <thead>
    <tr>
      <th>Machine</th>
      <th style="text-align: right">Cold run (median)</th>
      <th style="text-align: right">Cold run (total)</th>
      <th style="text-align: right">Hot run (median)</th>
      <th style="text-align: right">Hot run (total)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>MacBook Neo</td>
      <td style="text-align: right">0.57</td>
      <td style="text-align: right">59.73</td>
      <td style="text-align: right">0.41</td>
      <td style="text-align: right">54.27</td>
    </tr>
    <tr>
      <td>c6a.4xlarge</td>
      <td style="text-align: right">1.34</td>
      <td style="text-align: right">145.08</td>
      <td style="text-align: right">0.50</td>
      <td style="text-align: right">47.86</td>
    </tr>
    <tr>
      <td>c8g.metal-48xl</td>
      <td style="text-align: right">1.54</td>
      <td style="text-align: right">169.67</td>
      <td style="text-align: right">0.05</td>
      <td style="text-align: right">4.35</td>
    </tr>
  </tbody>
</table>

<p><strong>Cold run.</strong> The results start with a big surprise: in the cold run, the MacBook Neo is the clear winner with a sub-second median runtime, <em>completing all queries in under a minute!</em> Of course, if we dig deeper into the setups, there is an explanation for this. The cloud instances have network-attached disks, and accessing the database on these dominates the overall query runtimes. The MacBook Neo has a local NVMe SSD, which is far from best-in-class, but still provides relatively quick access on the first read.</p>

<p><strong>Hot run.</strong> In the hot runs, the MacBook's <em>total runtime</em> only improves by approximately 10%, while the cloud machines come into their own, with the c8g.metal-48xl winning by an order of magnitude. However, it's worth noting that on <em>median query runtimes</em> the MacBook Neo can still beat the c6a.4xlarge, a mid-sized cloud instance. And the laptop's <em>total runtime</em> is only about 13% slower despite the cloud box having 10 more CPU threads and 4 times as much RAM.</p>

<h2 id="tpc-ds">TPC-DS</h2>

<p>For our second experiment, we picked the queries of the TPC-DS benchmark. Compared to the ubiquitous TPC-H benchmark, which has 8 tables and 22 queries, TPC-DS has 24 tables and 99 queries, many of which are more complex and include features such as <a href="/docs/current/sql/functions/window_functions.html">window functions</a>. And while TPC-H has been <a href="https://homepages.cwi.nl/~boncz/snb-challenge/chokepoints-tpctc.pdf">optimized to death</a>, there is still some semblance of value in TPC-DS results. Let's see whether the cheapest MacBook can handle these queries!</p>

<p>For this round, we used DuckDB's <a href="/install/?version=lts">LTS version</a>, v1.4.4. We generated the datasets using DuckDB's <a href="/docs/current/core_extensions/tpcds.html"><code class="language-plaintext highlighter-rouge">tpcds</code> extension</a> and set the memory limit to 6 GB.</p>

<p>At SF100, the laptop breezed through most queries with a median query runtime of 1.63 seconds and a total runtime of 15.5 minutes.</p>

<p>At SF300, the memory constraint started to show. While the median query runtime was still quite good at 6.90 seconds, DuckDB occasionally used up to 80 GB of space for <a href="/docs/current/guides/performance/how_to_tune_workloads.html">spilling to disk</a> and it was clear that some queries were going to take a long time. Most notably, <a href="https://github.com/duckdb/duckdb/blob/main/extension/tpcds/dsdgen/queries/67.sql">query 67</a> took 51 minutes to complete. But hardware and software continued to work together tirelessly, and they ultimately passed the test, completing all queries in 79 minutes.</p>

<h2 id="should-you-buy-one">Should You Buy One?</h2>

<p>Here's the thing: if you are running Big Data workloads on your laptop every day, you probably shouldn't get the MacBook Neo. Yes, DuckDB runs on it, and can handle a lot of data by leveraging <a href="/docs/current/guides/performance/how_to_tune_workloads.html#larger-than-memory-workloads-out-of-core-processing">out-of-core processing</a>. But the MacBook Neo's disk I/O is lackluster compared to the Air and Pro models (about 1.5 GB/s compared to 3–6 GB/s), and the 8 GB memory will be limiting in the long run. If you need to process <a href="/2025/09/08/duckdb-on-the-framework-laptop-13.html">Big Data on the move</a> and can pay up a bit, the other MacBook models will serve your needs better and there are also good options for Linux and Windows.</p>

<p>All that said, if you run <a href="/library/duckdb-in-the-cloud/">DuckDB in the cloud</a> and primarily use your laptop as a client, this is a great device. And you can rest assured that if you <em>occasionally</em> need to crunch some data locally, DuckDB on the MacBook Neo will be up to the challenge.</p>]]></content><author><name>{&quot;twitter&quot; =&gt; &quot;none&quot;, &quot;picture&quot; =&gt; &quot;/images/blog/authors/gabor_szarnyas.png&quot;}</name></author><category term="benchmark" /><summary type="html"><![CDATA[How does the latest entry-level MacBook perform on database workloads? We benchmarked it using ClickBench and TPC-DS SF300. We found that it could complete both workloads, sometimes with surprisingly good results.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://duckdb.org/images/blog/thumbs/macbook-neo.jpg" /><media:content medium="image" url="https://duckdb.org/images/blog/thumbs/macbook-neo.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Announcing DuckDB 1.5.0</title><link href="https://duckdb.org/2026/03/09/announcing-duckdb-150.html" rel="alternate" type="text/html" title="Announcing DuckDB 1.5.0" /><published>2026-03-09T00:00:00+00:00</published><updated>2026-03-09T00:00:00+00:00</updated><id>https://duckdb.org/2026/03/09/announcing-duckdb-150</id><content type="html" xml:base="https://duckdb.org/2026/03/09/announcing-duckdb-150.html"><![CDATA[<p>We are proud to release DuckDB v1.5.0, codenamed “Variegata” after the <em>Paradise shelduck</em> (Tadorna variegata) endemic to New Zealand.</p>

<p>In this blog post, we cover the most important updates for this release around support, features and extensions. As always, there is more: for the complete release notes, see the <a href="https://github.com/duckdb/duckdb/releases/tag/v1.5.0">release page on GitHub</a>.</p>

<blockquote>
  <p>To install the new version, please visit the <a href="/install/">installation page</a>. Note that it can take a few days to release some extensions (e.g., the <a href="/docs/current/core_extensions/ui.html">UI</a>) and client libraries (e.g., Go, R, Java) due to the extra changes and review rounds required.</p>
</blockquote>

<p>With this release, we will have two DuckDB releases available: v1.4 (LTS) and v1.5 (current).
The next release – planned for September – will ship a major version, DuckDB v2.0.</p>

<h2 id="new-features">New Features</h2>

<h3 id="command-line-client">Command Line Client</h3>

<p>For users who use DuckDB through the terminal, the highlight of the new release is a rework of the CLI client with a new color scheme, dynamic prompts, a pager and many other convenience features.</p>

<h4 id="color-scheme">Color Scheme</h4>

<p>We shipped a <a href="/docs/current/clients/cli/friendly_cli.html">new color palette</a> and harmonized it with the documentation. The color palette is available in both dark mode and light mode. Both use two shades of gray, and five colors for keywords, strings, errors, functions and numbers. You can find the color palette in the <a href="/design/manual/#color-palette">Design Manual</a>.</p>

<p>You can customize the color scheme using the <code class="language-plaintext highlighter-rouge">.highlight_colors</code> dot command:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">.highlight_colors</span> <span class="n">column_name</span> <span class="n">darkgreen</span> <span class="n">bold_underline</span>
<span class="k">.highlight_colors</span> <span class="n">numeric_value</span> <span class="n">red</span> <span class="n">bold</span>
<span class="k">.highlight_colors</span> <span class="n">string_value</span> <span class="n">purple2</span>
<span class="k">FROM</span> <span class="n">ducks</span><span class="p">;</span>
</code></pre></div></div>

<p><img src="/images/blog/v150/cli-colors-example-light.png" alt="DuckDB CLI light mode" class="lightmode-img" />
<img src="/images/blog/v150/cli-colors-example-dark.png" alt="DuckDB CLI dark mode" class="darkmode-img" /></p>

<h4 id="dynamic-prompts-in-the-cli">Dynamic Prompts in the CLI</h4>

<p>DuckDB v1.5.0 introduces dynamic prompts for the CLI (<a href="https://github.com/duckdb/duckdb/pull/19579">PR #19579</a>). By default, these show the database and schema that you are currently connected to:</p>

<div class="language-batch highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">duckdb</span>
</code></pre></div></div>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">memory</span> <span class="n">D</span> <span class="k">ATTACH</span> <span class="s1">'my_database.duckdb'</span><span class="p">;</span>
<span class="gp">memory</span> <span class="n">D</span> <span class="k">USE</span> <span class="n">my_database</span><span class="p">;</span>
<span class="gp">my_database</span> <span class="n">D</span> <span class="k">CREATE</span> <span class="k">SCHEMA</span> <span class="n">my_schema</span><span class="p">;</span>
<span class="gp">my_database</span> <span class="n">D</span> <span class="k">USE</span> <span class="n">my_schema</span><span class="p">;</span>
<span class="gp">my_database.my_schema</span> <span class="n">D</span> <span class="p">...</span>
</code></pre></div></div>

<p>These prompts can be configured using bracket codes to have a maximum length, run a custom query, use different colors, etc. (<a href="https://github.com/duckdb/duckdb/pull/19579">#19579</a>).</p>

<h4 id="tables-and-describe"><code class="language-plaintext highlighter-rouge">.tables</code> and <code class="language-plaintext highlighter-rouge">DESCRIBE</code></h4>

<p>To show the columns of an individual table, use the <a href="/docs/current/sql/statements/describe.html"><code class="language-plaintext highlighter-rouge">DESCRIBE</code> statement</a>:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">memory</span> <span class="n">D</span> <span class="k">ATTACH</span> <span class="s1">'https://blobs.duckdb.org/data/animals.db'</span> <span class="k">AS</span> <span class="n">animals_db</span><span class="p">;</span>
<span class="gp">memory</span> <span class="n">D</span> <span class="k">USE</span> <span class="n">animals_db</span><span class="p">;</span>
<span class="gp">animals_db</span> <span class="n">D</span> <span class="k">DESCRIBE</span> <span class="n">ducks</span><span class="p">;</span>
</code></pre></div></div>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌──────────────────────┐
│        ducks         │
│                      │
│ id           integer │
│ name         varchar │
│ extinct_year integer │
└──────────────────────┘
</code></pre></div></div>

<p>The <a href="/docs/current/clients/cli/dot_commands.html"><code class="language-plaintext highlighter-rouge">.tables</code> dot command</a> lists the attached catalogs, the schemas and tables in them, and the columns in each table.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">memory</span> <span class="n">D</span> <span class="k">ATTACH</span> <span class="s1">'https://blobs.duckdb.org/data/animals.db'</span> <span class="k">AS</span> <span class="n">animals_db</span><span class="p">;</span>
<span class="gp">memory</span> <span class="n">D</span> <span class="k">ATTACH</span> <span class="s1">'https://blobs.duckdb.org/data/numbers1.db'</span><span class="p">;</span>
<span class="gp">memory</span> <span class="n">D</span> <span class="k">.tables</span>
</code></pre></div></div>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ────────────── animals_db ───────────────
 ───────────────── main ──────────────────
┌─────────────────┐┌──────────────────────┐
│      swans      ││        ducks         │
│                 ││                      │
│ id      integer ││ id           integer │
│ name    varchar ││ name         varchar │
│ species varchar ││ extinct_year integer │
│ color   varchar ││                      │
│ habitat varchar ││        5 rows        │
│                 │└──────────────────────┘
│     3 rows      │
└─────────────────┘
  numbers1
 ── main ──
┌──────────┐
│   tbl    │
│          │
│ i bigint │
│          │
│  2 rows  │
└──────────┘
</code></pre></div></div>

<h4 id="accessing-the-last-result-using-_">Accessing the Last Result Using <code class="language-plaintext highlighter-rouge">_</code></h4>

<p>You can access the last result of a query inline using the underscore character <code class="language-plaintext highlighter-rouge">_</code>. This is not only convenient but also makes it unnecessary to re-run potentially long-running queries:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">memory</span> <span class="n">D</span> <span class="k">ATTACH</span> <span class="s1">'https://blobs.duckdb.org/data/animals.db'</span> <span class="k">AS</span> <span class="n">animals_db</span><span class="p">;</span>
<span class="gp">memory</span> <span class="n">D</span> <span class="k">USE</span> <span class="n">animals_db</span><span class="p">;</span>
<span class="gp">animals_db</span> <span class="n">D</span> <span class="k">FROM</span> <span class="n">ducks</span> <span class="k">WHERE</span> <span class="n">extinct_year</span> <span class="k">IS</span> <span class="k">NOT</span> <span class="nb">NULL</span><span class="p">;</span>
<span class="err">┌───────┬──────────────────┬──────────────┐</span>
<span class="err">│</span>  <span class="n">id</span>   <span class="err">│</span>       <span class="n">name</span>       <span class="err">│</span> <span class="n">extinct_year</span> <span class="err">│</span>
<span class="err">│</span> <span class="n">int32</span> <span class="err">│</span>     <span class="n">varchar</span>      <span class="err">│</span>    <span class="n">int32</span>     <span class="err">│</span>
<span class="err">├───────┼──────────────────┼──────────────┤</span>
<span class="err">│</span>     <span class="mi">1</span> <span class="err">│</span> <span class="n">Labrador</span> <span class="n">Duck</span>    <span class="err">│</span>         <span class="mi">1878</span> <span class="err">│</span>
<span class="err">│</span>     <span class="mi">3</span> <span class="err">│</span> <span class="n">Crested</span> <span class="n">Shelduck</span> <span class="err">│</span>         <span class="mi">1964</span> <span class="err">│</span>
<span class="err">│</span>     <span class="mi">5</span> <span class="err">│</span> <span class="n">Pink</span><span class="o">-</span><span class="n">headed</span> <span class="n">Duck</span> <span class="err">│</span>         <span class="mi">1949</span> <span class="err">│</span>
<span class="err">└───────┴──────────────────┴──────────────┘</span>
<span class="gp">animals_db</span> <span class="n">D</span> <span class="k">FROM</span> <span class="n">_</span><span class="p">;</span>
<span class="err">┌───────┬──────────────────┬──────────────┐</span>
<span class="err">│</span>  <span class="n">id</span>   <span class="err">│</span>       <span class="n">name</span>       <span class="err">│</span> <span class="n">extinct_year</span> <span class="err">│</span>
<span class="err">│</span> <span class="n">int32</span> <span class="err">│</span>     <span class="n">varchar</span>      <span class="err">│</span>    <span class="n">int32</span>     <span class="err">│</span>
<span class="err">├───────┼──────────────────┼──────────────┤</span>
<span class="err">│</span>     <span class="mi">1</span> <span class="err">│</span> <span class="n">Labrador</span> <span class="n">Duck</span>    <span class="err">│</span>         <span class="mi">1878</span> <span class="err">│</span>
<span class="err">│</span>     <span class="mi">3</span> <span class="err">│</span> <span class="n">Crested</span> <span class="n">Shelduck</span> <span class="err">│</span>         <span class="mi">1964</span> <span class="err">│</span>
<span class="err">│</span>     <span class="mi">5</span> <span class="err">│</span> <span class="n">Pink</span><span class="o">-</span><span class="n">headed</span> <span class="n">Duck</span> <span class="err">│</span>         <span class="mi">1949</span> <span class="err">│</span>
<span class="err">└───────┴──────────────────┴──────────────┘</span>
</code></pre></div></div>

<h4 id="pager">Pager</h4>

<p>Last but not least, the CLI now has a pager! It is triggered automatically when a result contains more than 50 rows.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">memory</span> <span class="n">D</span> <span class="k">.maxrows</span> <span class="mi">100</span>
<span class="gp">memory</span> <span class="n">D</span> <span class="k">FROM</span> <span class="nf">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">100</span><span class="p">);</span>
</code></pre></div></div>

<p>You can navigate on Linux and Windows using <code class="language-plaintext highlighter-rouge">Page Up</code> / <code class="language-plaintext highlighter-rouge">Page Down</code>. On macOS, use <code class="language-plaintext highlighter-rouge">Fn</code> + <code class="language-plaintext highlighter-rouge">Up</code> / <code class="language-plaintext highlighter-rouge">Down</code>. To exit the pager, press <code class="language-plaintext highlighter-rouge">Q</code>.</p>

<p>The initial implementation of the pager was provided by <a href="https://github.com/tobwen"><code class="language-plaintext highlighter-rouge">tobwen</code></a> in <a href="https://github.com/duckdb/duckdb/pull/19004">#19004</a>.</p>

<h3 id="peg-parser">PEG Parser</h3>

<p>DuckDB v1.5 ships an experimental parser based on PEG (parsing expression grammars). The new parser enables better suggestions and improved error messages, and allows extensions to add their own grammar rules. The PEG parser is currently disabled by default, but you can opt in using:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CALL</span> <span class="nf">enable_peg_parser</span><span class="p">();</span>
</code></pre></div></div>

<p>The PEG parser is already used for generating suggestions. You can cycle through the options using <code class="language-plaintext highlighter-rouge">TAB</code>.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">animals_db</span> <span class="n">D</span> <span class="k">FROM</span> <span class="n">ducks</span> <span class="k">WHERE</span> <span class="n">habitat</span> <span class="k">IS</span> 
<span class="k">IS</span>           <span class="k">ISNULL</span>       <span class="k">ILIKE</span>        <span class="gs">IN</span>           <span class="k">INTERSECT</span>    <span class="k">LIKE</span>
</code></pre></div></div>

<p>We are planning to make the switch to the new parser in the upcoming DuckDB release.</p>

<blockquote>
  <p>As a tradeoff, the parser has a slight performance overhead; however, this is in the range of milliseconds and thus negligible for analytical queries. For more details on the rationale for using a PEG parser and benchmark results, please refer to the <a href="/library/runtime-extensible-parsers/">CIDR 2026 paper</a> by Hannes and Mark, or their <a href="/2024/11/22/runtime-extensible-parsers.html">blog post</a> summarizing the paper.</p>
</blockquote>

<h3 id="variant-type"><code class="language-plaintext highlighter-rouge">VARIANT</code> Type</h3>

<p>DuckDB now natively supports the <a href="https://github.com/duckdb/duckdb/pull/18609"><code class="language-plaintext highlighter-rouge">VARIANT</code> type</a>, inspired by <a href="https://docs.snowflake.com/en/sql-reference/data-types-semistructured">Snowflake's semi-structured <code class="language-plaintext highlighter-rouge">VARIANT</code> data type</a> and available <a href="https://github.com/apache/parquet-format/blob/master/VariantEncoding.md">in Parquet since 2025</a>. Unlike the <a href="/docs/current/data/json/json_type.html">JSON type</a>, which is physically stored as text, <code class="language-plaintext highlighter-rouge">VARIANT</code> stores typed binary data. Each row in a <code class="language-plaintext highlighter-rouge">VARIANT</code> column is self-contained with its own type information. This leads to better compression and query performance. Here are a few examples of using <code class="language-plaintext highlighter-rouge">VARIANT</code>.</p>

<p>Store different types in the same column:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">events</span> <span class="p">(</span><span class="n">id</span> <span class="nb">INTEGER</span><span class="p">,</span> <span class="n">data</span> <span class="nb">VARIANT</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">events</span> <span class="k">VALUES</span>
    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">42</span><span class="p">::</span><span class="nb">VARIANT</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="s1">'hello world'</span><span class="p">::</span><span class="nb">VARIANT</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]::</span><span class="nb">VARIANT</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="p">{</span><span class="s1">'name'</span><span class="p">:</span> <span class="s1">'Alice'</span><span class="p">,</span> <span class="s1">'age'</span><span class="p">:</span> <span class="mi">30</span><span class="p">}::</span><span class="nb">VARIANT</span><span class="p">);</span>

<span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">events</span><span class="p">;</span>
</code></pre></div></div>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌───────┬────────────────────────────┐
│  id   │            data            │
│ int32 │          variant           │
├───────┼────────────────────────────┤
│     1 │ 42                         │
│     2 │ hello world                │
│     3 │ [1, 2, 3]                  │
│     4 │ {'name': Alice, 'age': 30} │
└───────┴────────────────────────────┘
</code></pre></div></div>
<p>Check the underlying type of each row:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="n">id</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">variant_typeof</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="k">AS</span> <span class="n">vtype</span>
<span class="k">FROM</span> <span class="n">events</span><span class="p">;</span>
</code></pre></div></div>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌───────┬────────────────────────────┬───────────────────┐
│  id   │            data            │       vtype       │
│ int32 │          variant           │      varchar      │
├───────┼────────────────────────────┼───────────────────┤
│     1 │ 42                         │ INT32             │
│     2 │ hello world                │ VARCHAR           │
│     3 │ [1, 2, 3]                  │ ARRAY(3)          │
│     4 │ {'name': Alice, 'age': 30} │ OBJECT(name, age) │
└───────┴────────────────────────────┴───────────────────┘
</code></pre></div></div>

<p>You can extract fields from nested variants using the dot notation or the <code class="language-plaintext highlighter-rouge">variant_extract</code> function:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="n">data.name</span> <span class="k">FROM</span> <span class="n">events</span> <span class="k">WHERE</span> <span class="n">id</span> <span class="o">=</span> <span class="mi">4</span><span class="p">;</span>
<span class="c1">-- or </span>
<span class="k">SELECT</span> <span class="n">variant_extract</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="s1">'name'</span><span class="p">)</span> <span class="k">AS</span> <span class="n">name</span> <span class="k">FROM</span> <span class="n">events</span> <span class="k">WHERE</span> <span class="n">id</span> <span class="o">=</span> <span class="mi">4</span><span class="p">;</span>
</code></pre></div></div>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌─────────┐
│  name   │
│ variant │
├─────────┤
│ Alice   │
└─────────┘
</code></pre></div></div>

<p>DuckDB also supports reading <code class="language-plaintext highlighter-rouge">VARIANT</code> types from Parquet files, including <em>shredding</em> (storing nested data as flat values).</p>
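<p>For example, assuming a Parquet file <code class="language-plaintext highlighter-rouge">events.parquet</code> (a hypothetical name) containing a Variant-encoded column named <code class="language-plaintext highlighter-rouge">data</code>, a sketch of reading it back directly would look like:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- 'events.parquet' is a hypothetical file with a Variant-encoded column
SELECT variant_typeof(data) AS vtype
FROM read_parquet('events.parquet');
</code></pre></div></div>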

<h3 id="read_duckdb-function"><code class="language-plaintext highlighter-rouge">read_duckdb</code> Function</h3>

<p>The <code class="language-plaintext highlighter-rouge">read_duckdb</code> table function can read DuckDB databases without first attaching them. This can make reading from DuckDB databases more ergonomic – for example, you can use globbing. You can read the <a href="#appendix-example-dataset">example</a> <code class="language-plaintext highlighter-rouge">numbers</code> databases as follows:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="nf">min</span><span class="p">(</span><span class="n">i</span><span class="p">),</span> <span class="nf">max</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
<span class="k">FROM</span> <span class="nf">read_duckdb</span><span class="p">(</span><span class="s1">'numbers*.db'</span><span class="p">);</span>
</code></pre></div></div>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌────────┬────────┐
│ min(i) │ max(i) │
│ int64  │ int64  │
├────────┼────────┤
│      1 │      5 │
└────────┴────────┘
</code></pre></div></div>

<h3 id="azure-writes">Azure Writes</h3>

<p>You can now <a href="/docs/current/core_extensions/azure.html#writing-to-azure-blob-storage">write to Azure Blob Storage or ADLSv2 storage</a> using the <code class="language-plaintext highlighter-rouge">COPY</code> statement:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- Write query results to a Parquet file on Blob Storage</span>
<span class="k">COPY</span> <span class="p">(</span><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">my_table</span><span class="p">)</span>
<span class="k">TO</span> <span class="s1">'az://my_container/path/output.parquet'</span><span class="p">;</span>

<span class="c1">-- Write a table to a CSV file on ADLSv2 Storage</span>
<span class="k">COPY</span> <span class="n">my_table</span>
<span class="k">TO</span> <span class="s1">'abfss://my_container/path/output.csv'</span><span class="p">;</span>
</code></pre></div></div>

<h3 id="odbc-scanner">ODBC Scanner</h3>

<p>We are now shipping an ODBC scanner extension. This allows you to query a remote endpoint as follows:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">LOAD</span><span class="n"> odbc_scanner</span><span class="p">;</span>
<span class="k">SET</span> <span class="k">VARIABLE</span> <span class="n">conn</span> <span class="o">=</span> <span class="nf">odbc_connect</span><span class="p">(</span><span class="s1">'Driver={Oracle Driver};DBQ=//127.0.0.1:1521/XE;UID=scott;PWD=tiger;'</span><span class="p">);</span>
<span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="nf">odbc_query</span><span class="p">(</span><span class="nf">getvariable</span><span class="p">(</span><span class="s1">'conn'</span><span class="p">),</span> <span class="s1">'SELECT SYSTIMESTAMP FROM dual;'</span><span class="p">);</span>
</code></pre></div></div>

<p>In the coming weeks, we'll publish the documentation page and release a follow-up post on the ODBC scanner.
In the meantime, please refer to the <a href="https://github.com/duckdb/odbc-scanner/blob/main/README.md">project's README</a>.</p>

<h2 id="major-changes">Major Changes</h2>

<h3 id="breaking-change-for-datetime-function">Breaking Change for a Datetime Function</h3>

<p>The <a href="/docs/current/sql/functions/timestamptz.html#date_truncpart-timestamptz"><code class="language-plaintext highlighter-rouge">date_trunc</code></a> function, when applied to a <code class="language-plaintext highlighter-rouge">DATE</code>, now returns a <code class="language-plaintext highlighter-rouge">TIMESTAMP</code> instead of a <code class="language-plaintext highlighter-rouge">DATE</code>.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- v1.4.4:</span>
<span class="k">SELECT</span> <span class="nf">typeof</span><span class="p">(</span><span class="nf">date_trunc</span><span class="p">(</span><span class="s1">'month'</span><span class="p">,</span> <span class="nb">DATE</span><span class="p">(</span><span class="s1">'2026-03-27'</span><span class="p">)));</span>
<span class="c1">-- returns DATE</span>

<span class="c1">-- v1.5.x:</span>
<span class="k">SELECT</span> <span class="nf">typeof</span><span class="p">(</span><span class="nf">date_trunc</span><span class="p">(</span><span class="s1">'month'</span><span class="p">,</span> <span class="nb">DATE</span><span class="p">(</span><span class="s1">'2026-03-27'</span><span class="p">)));</span>
<span class="c1">-- returns TIMESTAMP</span>
</code></pre></div></div>
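<p>If downstream code depends on the old behavior, casting the result back is a simple workaround (not an official migration path):</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT typeof(date_trunc('month', DATE '2026-03-27')::DATE);
-- returns DATE again
</code></pre></div></div>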

<h3 id="lakehouse-updates">Lakehouse Updates</h3>

<p>All of DuckDB’s supported Lakehouse formats have received some updates in DuckDB v1.5.</p>

<h4 id="ducklake">DuckLake</h4>

<p>The main <a href="https://ducklake.select/">DuckLake</a> change for DuckDB v1.5 is updating the DuckLake specification to v0.4.
We are aiming for this to be the same specification that ships with DuckLake v1.0, which will be released in April.
Its main highlights include:</p>

<ul>
  <li>Macro support.</li>
  <li>Sorted tables.</li>
  <li>Deletion inlining and addition of partial delete files.</li>
  <li>Internal rework of DuckLake options.</li>
</ul>

<p>We'll announce more details about these features in the blog post for DuckLake v1.0.</p>

<h4 id="delta-lake">Delta Lake</h4>

<p>For the <a href="/docs/current/core_extensions/delta.html">Delta Lake extension</a>, the team has focused on improving support for writes via <a href="/docs/current/core_extensions/unity_catalog.html">Unity Catalog</a>, idempotent Delta writes, and table <code class="language-plaintext highlighter-rouge">CHECKPOINT</code>s.</p>

<h4 id="iceberg">Iceberg</h4>

<p>For the <a href="/docs/current/core_extensions/iceberg/overview.html">Iceberg extension</a>, the team is working on a larger release for v1.5.1. For v1.5.0, the main feature is the addition of table properties in the <code class="language-plaintext highlighter-rouge">CREATE TABLE</code> statement:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">test_create_table</span> <span class="p">(</span><span class="n">a</span> <span class="nb">INTEGER</span><span class="p">)</span>
<span class="k">WITH</span> <span class="p">(</span>
    <span class="s1">'format-version'</span> <span class="o">=</span> <span class="s1">'2'</span><span class="p">,</span> <span class="c1">-- elevated to the table's format-version field when the table is created</span>
    <span class="s1">'location'</span> <span class="o">=</span> <span class="s1">'s3://path/to/data'</span><span class="p">,</span> <span class="c1">-- elevated to the table's location field when the table is created</span>
    <span class="s1">'property1'</span> <span class="o">=</span> <span class="s1">'value1'</span><span class="p">,</span>
    <span class="s1">'property2'</span> <span class="o">=</span> <span class="s1">'value2'</span>
<span class="p">);</span>
</code></pre></div></div>

<p>Other minor additions have been made to enable passing <code class="language-plaintext highlighter-rouge">EXTRA_HTTP_HEADERS</code> when attaching to an Iceberg catalog, which has unlocked support for <a href="https://cloud.google.com/biglake">Google’s BigLake</a>.</p>

<blockquote>
  <p>Both Delta and DuckLake have implemented the <a href="#variant-type"><code class="language-plaintext highlighter-rouge">VARIANT</code> type</a>. Iceberg’s <code class="language-plaintext highlighter-rouge">VARIANT</code> type will ship in the v1.5.1 release with some other features that are specific to the Iceberg v3 specification.</p>
</blockquote>

<h3 id="network-stack">Network Stack</h3>

<p>The default backend for the <a href="/docs/current/core_extensions/httpfs/overview.html">httpfs extension</a> has changed from <a href="https://github.com/yhirose/cpp-httplib"><code class="language-plaintext highlighter-rouge">httplib</code></a> to <a href="https://curl.se/"><code class="language-plaintext highlighter-rouge">curl</code></a>. As <code class="language-plaintext highlighter-rouge">curl</code> is one of the most popular and well-tested open-source projects, we expect it to provide long-term stability and security for DuckDB. Regardless of the HTTP library used, <code class="language-plaintext highlighter-rouge">openssl</code> remains the backing SSL library, and options such as <code class="language-plaintext highlighter-rouge">http_timeout</code>, <code class="language-plaintext highlighter-rouge">http_retries</code>, etc. work the same as before.</p>
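<p>For instance, the existing options can still be set exactly as before (the value here is purely illustrative):</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Same option names as with the httplib backend; 3 is an illustrative value
SET http_retries = 3;
</code></pre></div></div>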

<p>Our community has been <a href="https://github.com/duckdb/duckdb/issues/20977">testing the new network stack</a> for the last few weeks. Still, if you encounter any issues, please submit them to the <a href="https://github.com/duckdb/duckdb-httpfs"><code class="language-plaintext highlighter-rouge">duckdb-httpfs</code> repository</a>.</p>

<details>
  <summary>
If you are interested in more details, click here.
</summary>
  <p>Due to technical reasons, <code class="language-plaintext highlighter-rouge">httplib</code> is still the library we use for downloading the <code class="language-plaintext highlighter-rouge">httpfs</code> extension. When <code class="language-plaintext highlighter-rouge">httpfs</code> is loaded with the (now default) <code class="language-plaintext highlighter-rouge">curl</code> backend, subsequent extension installations go through <code class="language-plaintext highlighter-rouge">https://</code>, with the default endpoint for core extensions pointing to <a href="https://extensions.duckdb.org"><code class="language-plaintext highlighter-rouge">https://extensions.duckdb.org</code></a>.</p>

  <p>All core and community extensions are cryptographically signed, so installing them through <code class="language-plaintext highlighter-rouge">http://</code> does not pose a security risk. However, some users reported issues with <code class="language-plaintext highlighter-rouge">http://</code> extension installs in environments with firewalls.</p>
</details>

<h3 id="lambda-syntax">Lambda Syntax</h3>

<p>Up to DuckDB v1.2, the syntax for defining lambda expressions used the arrow notation <code class="language-plaintext highlighter-rouge">x -&gt; x + 1</code>. While this was a nice syntax, it clashed with the JSON extract operator (<code class="language-plaintext highlighter-rouge">-&gt;</code>) due to operator precedence and led to error messages that some users found difficult to troubleshoot. To work around this, we introduced a new, Python-style <a href="/2025/05/21/announcing-duckdb-130.html#lambda-function-syntax">lambda syntax in v1.3</a>, <code class="language-plaintext highlighter-rouge">lambda x: x + 1</code>.</p>

<p>While DuckDB v1.5 supports both styles of writing lambda expressions, using the deprecated arrow syntax will now throw a warning:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="nf">list_transform</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="n">x</span> <span class="o">-&gt;</span> <span class="n">x</span> <span class="o">+</span> <span class="mi">1</span><span class="p">);</span>
</code></pre></div></div>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code>WARNING:
Deprecated lambda arrow (-&gt;) detected. Please transition to the new lambda syntax, i.e., lambda x, i: x + i, before DuckDB's next release.
</code></pre></div></div>
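<p>The same query in the new syntax runs without the warning:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT list_transform([1, 2, 3], lambda x: x + 1);
-- [2, 3, 4]
</code></pre></div></div>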

<p>You can use the <code class="language-plaintext highlighter-rouge">lambda_syntax</code> configuration option to change this behavior, either suppressing the warning or turning it into an error:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- Suppress the warning</span>
<span class="k">SET</span> <span class="py">lambda_syntax</span> <span class="o">=</span> <span class="s1">'ENABLE_SINGLE_ARROW'</span><span class="p">;</span>
<span class="c1">-- Turn the deprecation warning into an error</span>
<span class="k">SET</span> <span class="py">lambda_syntax</span> <span class="o">=</span> <span class="s1">'DISABLE_SINGLE_ARROW'</span><span class="p">;</span>
</code></pre></div></div>

<p>DuckDB 2.0 will disable the single arrow syntax by default; it will only be available if you opt in explicitly.</p>

<h3 id="spatial-extension">Spatial Extension</h3>

<p>The <a href="/docs/current/core_extensions/spatial/overview.html">spatial extension</a> ships several important changes.</p>

<h4 id="breaking-change-flipping-of-axis-order">Breaking Change: Flipping of Axis Order</h4>

<p>Most functions in <code class="language-plaintext highlighter-rouge">spatial</code> operate in Cartesian space and are unaffected by axis order, e.g., whether the <code class="language-plaintext highlighter-rouge">X</code> and <code class="language-plaintext highlighter-rouge">Y</code> axes represent “longitude” and “latitude” or the other way around. But there are some functions where this matters, and where the assumption, counterintuitively, is that all input geometries use (x = latitude, y = longitude). These are:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">ST_Distance_Spheroid</code></li>
  <li><code class="language-plaintext highlighter-rouge">ST_Perimeter_Spheroid</code></li>
  <li><code class="language-plaintext highlighter-rouge">ST_Area_Spheroid</code></li>
  <li><code class="language-plaintext highlighter-rouge">ST_Distance_Sphere</code></li>
  <li><code class="language-plaintext highlighter-rouge">ST_DWithin_Spheroid</code></li>
</ul>

<p>Additionally, <code class="language-plaintext highlighter-rouge">ST_Transform</code> also expects that the input geometries are in the same axis order as defined by the source coordinate reference system, which in the case of e.g., <code class="language-plaintext highlighter-rouge">EPSG:4326</code> is also (x = latitude, y = longitude).</p>

<p>This has been a long-standing source of confusion and numerous issues, as other databases, formats and GIS systems tend to always treat <code class="language-plaintext highlighter-rouge">X</code> as “easting”, “left-right” or “longitude”, and <code class="language-plaintext highlighter-rouge">Y</code> as “northing”, “up-down” or “latitude”.</p>

<p>We are changing how this currently works in DuckDB to be consistent with how other systems operate, hopefully causing less confusion for new users in the future. However, to avoid silently breaking existing workflows that have adapted to this quirk (e.g., by using <code class="language-plaintext highlighter-rouge">ST_FlipCoordinates</code>), we are rolling out this change gradually via a new <code class="language-plaintext highlighter-rouge">geometry_always_xy</code> setting:</p>

<ul>
  <li>In DuckDB v1.5, setting <code class="language-plaintext highlighter-rouge">geometry_always_xy = true</code> enables the new behavior (x = longitude, y = latitude). Without it, affected functions emit a warning.</li>
  <li>In DuckDB v2.0, the warning will become an error. Set <code class="language-plaintext highlighter-rouge">geometry_always_xy = false</code> to preserve the old behavior.</li>
  <li>In DuckDB v2.1, <code class="language-plaintext highlighter-rouge">geometry_always_xy = true</code> will become the default.</li>
</ul>

<p>So to summarize, nothing is changing by default in this release, but to avoid being affected by this change in the future, set <code class="language-plaintext highlighter-rouge">geometry_always_xy</code> explicitly now. Set it to <code class="language-plaintext highlighter-rouge">true</code> to opt into the new behavior, or <code class="language-plaintext highlighter-rouge">false</code> to keep the existing one.</p>
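<p>As a sketch, opting in at the start of a session could look like this (requires the <code class="language-plaintext highlighter-rouge">spatial</code> extension; the coordinates are illustrative and given as longitude, latitude per the new convention):</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LOAD spatial;
SET geometry_always_xy = true;
-- Points given as (longitude, latitude), e.g., roughly Amsterdam and Berlin
SELECT ST_Distance_Sphere(ST_Point(4.9, 52.37), ST_Point(13.4, 52.52));
</code></pre></div></div>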

<h3 id="geometry-rework">Geometry Rework</h3>

<h4 id="geometry-becomes-a-built-in-type"><code class="language-plaintext highlighter-rouge">GEOMETRY</code> Becomes a Built-In Type</h4>

<p>The <code class="language-plaintext highlighter-rouge">GEOMETRY</code> type has been moved from the <code class="language-plaintext highlighter-rouge">spatial</code> extension into core DuckDB!</p>

<p>Geospatial data is no longer niche. The Parquet standard now treats <code class="language-plaintext highlighter-rouge">GEOMETRY</code> as a first-class column type, and open table formats like Apache Iceberg and DuckLake are moving in the same direction. Many widely used data formats and systems also have geospatial counterparts—GeoJSON, PostGIS, GeoPandas, GeoPackage/Spatialite, and more.</p>

<p>DuckDB already offers extensions that integrate with many of these formats and systems. But there’s a structural problem: as long as <code class="language-plaintext highlighter-rouge">GEOMETRY</code> lives inside the <code class="language-plaintext highlighter-rouge">spatial</code> extension, other extensions that want to read or write geospatial data must either depend on <code class="language-plaintext highlighter-rouge">spatial</code>, implement their own incompatible geometry representation, or force users to handle the conversions themselves.</p>

<p>By moving <code class="language-plaintext highlighter-rouge">GEOMETRY</code> into DuckDB’s core, extensions can now produce and consume geometry values natively, without depending on <code class="language-plaintext highlighter-rouge">spatial</code>. While the <code class="language-plaintext highlighter-rouge">spatial</code> extension still provides most of the functions for working with geometries, the type itself becomes a shared foundation that the entire ecosystem can build on. We’ve already added <code class="language-plaintext highlighter-rouge">GEOMETRY</code> support to the Postgres scanner and GeoArrow conversion for Arrow import and export. Geometry support in additional extensions is coming soon.</p>

<p>This change also enables deeper integration with DuckDB’s storage engine and query optimizer, unlocking new compression techniques, query optimizations, and CRS awareness capabilities that were not possible when <code class="language-plaintext highlighter-rouge">GEOMETRY</code> only existed as an extension type. This is all documented in the new <a href="/docs/current/sql/data_types/geometry.html">geometry page</a> in the documentation, but we will highlight some below.</p>

<h4 id="improved-storage-wkb-and-shredding">Improved Storage: WKB and Shredding</h4>

<p>Geometry values are now stored using the industry-standard little-endian <a href="https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry#Well-known_binary">Well-Known Binary (WKB)</a> encoding, replacing the custom format used by the <code class="language-plaintext highlighter-rouge">spatial</code> extension. However, we are still experimenting with the in-memory representation we want to use in the execution engine, so you should still use the conversion functions (e.g., <code class="language-plaintext highlighter-rouge">ST_AsWKT</code>, <code class="language-plaintext highlighter-rouge">ST_AsWKB</code>, <code class="language-plaintext highlighter-rouge">ST_GeomFromText</code>, <code class="language-plaintext highlighter-rouge">ST_GeomFromWKB</code>) when moving data in and out of DuckDB.</p>
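<p>For example, a round trip through the conversion functions listed above might look like this (a minimal sketch; the constructor and export functions are provided by the <code class="language-plaintext highlighter-rouge">spatial</code> extension):</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- parse a WKT literal into a GEOMETRY value
SELECT ST_GeomFromText('POINT (1 2)') AS geom;

-- export the same value as WKT and as WKB
SELECT ST_AsWKT(geom), ST_AsWKB(geom)
FROM (SELECT ST_GeomFromText('POINT (1 2)') AS geom);
</code></pre></div></div>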

<p>We’ve also implemented a new storage technique specialized for <code class="language-plaintext highlighter-rouge">GEOMETRY</code>. When a geometry column contains values that all share the same type and vertex dimensions, DuckDB can additionally apply "shredding": rather than storing opaque blobs, the column is decomposed into primitive <code class="language-plaintext highlighter-rouge">STRUCT</code>, <code class="language-plaintext highlighter-rouge">LIST</code>, and <code class="language-plaintext highlighter-rouge">DOUBLE</code> segments that compress far more efficiently. This can reduce on-disk size by roughly 3x for uniform geometry columns such as point clouds. Shredding is applied automatically for uniform row groups of a certain size, but can be configured via the <code class="language-plaintext highlighter-rouge">geometry_minimum_shredding_size</code> configuration option.</p>
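<p>As a sketch of how this looks in practice (the option name comes from this post; the threshold value and table are purely illustrative):</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- illustrative: lower the row group size threshold at which shredding kicks in
SET geometry_minimum_shredding_size = 1000;

-- a uniform column of 2D points is an ideal candidate for shredding
CREATE TABLE points AS
    SELECT ST_GeomFromText('POINT (' || i || ' ' || i || ')') AS geom
    FROM range(100000) t(i);
</code></pre></div></div>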

<h4 id="geometry-statistics-and-query-optimization">Geometry Statistics and Query Optimization</h4>

<p>Geometry columns now track per-row-group statistics, including the bounding box and the set of geometry types and vertex dimensions present. The query optimizer can use these to skip row groups that cannot match a query's spatial predicates, similar to min/max pruning for numeric columns. The <code class="language-plaintext highlighter-rouge">&amp;&amp;</code> (bounding box intersection) operator is the first to benefit; broader support across <code class="language-plaintext highlighter-rouge">spatial</code> functions is in progress.</p>
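<p>A query of the following shape can now prune entire row groups whose bounding boxes cannot intersect the search window (a sketch; the <code class="language-plaintext highlighter-rouge">buildings</code> table and its <code class="language-plaintext highlighter-rouge">geom</code> column are hypothetical):</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- row groups whose bounding box statistics fall outside the window are skipped
SELECT count(*)
FROM buildings
WHERE geom &amp;&amp; ST_GeomFromText('POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))');
</code></pre></div></div>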

<h4 id="coordinate-reference-system-support">Coordinate Reference System Support</h4>

<p>The <code class="language-plaintext highlighter-rouge">GEOMETRY</code> type now accepts an optional CRS parameter (e.g., <code class="language-plaintext highlighter-rouge">GEOMETRY('OGC:CRS84')</code>), making CRS part of the type system rather than implicit metadata. Spatial functions enforce CRS consistency across their inputs, catching a common class of silent errors that arises when mixing geometries from different coordinate systems. Only a couple of CRSs are built in by default, but loading the <code class="language-plaintext highlighter-rouge">spatial</code> extension registers over 7,000 CRSs from the EPSG dataset. While CRS support is still experimental, we are planning to develop it further to support, e.g., custom CRS definitions.</p>
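<p>For instance, a CRS can be declared as part of a column type (a sketch; the table is hypothetical, and <code class="language-plaintext highlighter-rouge">EPSG:3857</code> is just an example of a second CRS):</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE TABLE cities (
    name VARCHAR,
    location GEOMETRY('OGC:CRS84')
);

-- spatial functions enforce CRS consistency across their inputs:
-- combining a GEOMETRY('OGC:CRS84') value with a GEOMETRY('EPSG:3857')
-- value raises an error instead of silently producing garbage
</code></pre></div></div>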

<h3 id="optimizations">Optimizations</h3>

<h4 id="non-blocking-checkpointing">Non-Blocking Checkpointing</h4>

<p>During checkpointing, it's now possible to run concurrent reads (<a href="https://github.com/duckdb/duckdb/pull/19867">#19867</a>), writes (<a href="https://github.com/duckdb/duckdb/pull/20052">#20052</a>), insertions with indexes (<a href="https://github.com/duckdb/duckdb/pull/20160">#20160</a>) and deletes (<a href="https://github.com/duckdb/duckdb/pull/20286">#20286</a>). The rework of checkpointing benefits concurrent RW workloads and increases the TPC-H throughput score on SF100 from 246,115.60 to 287,122.97, a <strong>17% improvement</strong>.</p>
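<p>The quoted improvement follows directly from the two throughput scores; you can check it in DuckDB itself:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT round(100 * (287122.97 / 246115.60 - 1), 1) AS improvement_pct;
-- ≈ 16.7, i.e., the 17% quoted above
</code></pre></div></div>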

<h4 id="aggregates">Aggregates</h4>

<p>Aggregate functions received several optimizations. For example, the <code class="language-plaintext highlighter-rouge">last</code> aggregate function was optimized by community member <a href="https://github.com/xe-nvdk"><code class="language-plaintext highlighter-rouge">xe-nvdk</code></a> to iterate from the end of each vector batch instead of the beginning. In synthetic benchmarks, this results in a <a href="https://github.com/duckdb/duckdb/pull/20567">40% speedup</a>.</p>

<!-- markdownlint-disable MD001 -->

<h2 id="distribution">Distribution</h2>

<h4 id="python-pip">Python Pip</h4>

<p>You can install the DuckDB CLI on any platform where pip is available:</p>

<div class="language-batch highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">pip </span>install duckdb-cli
</code></pre></div></div>

<p>You can then launch DuckDB in your virtual environment using:</p>

<div class="language-batch highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">duckdb</span>
</code></pre></div></div>

<p>Both DuckDB v1.4 and v1.5 are supported. We are working on shipping extensions as extras using the <code class="language-plaintext highlighter-rouge">duckdb[extension_name]</code> syntax – stay tuned!</p>

<h4 id="windows-install-script-beta">Windows Install Script (Beta)</h4>

<p>On Windows, you can now use an install script:</p>

<div class="language-batch highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">powershell</span> <span class="nt">-NoExit</span> iex <span class="o">(</span>iwr <span class="s2">"https://install.duckdb.org/install.ps1"</span><span class="o">)</span>.Content
</code></pre></div></div>

<p>Please note that this is currently in the beta stage. If you have any feedback, please <a href="https://github.com/duckdb/duckdb/issues">let us know</a>.</p>

<h4 id="cli-for-linux-with-musl-libc">CLI for Linux with musl libc</h4>

<p>We are distributing CLI clients that work with <a href="/docs/lts/dev/building/linux.html">musl libc</a> (e.g., for Alpine Linux, commonly used in Docker images). The archives are available <a href="https://github.com/duckdb/duckdb/releases/tag/v1.5.0">on GitHub</a>.</p>

<p>Note that the musl libc CLI client requires <code class="language-plaintext highlighter-rouge">libstdc++</code>. To install this package, run:</p>

<div class="language-batch highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">apk </span>add libstdc++
</code></pre></div></div>

<h4 id="extension-sizes">Extension Sizes</h4>

<p>We reworked our build system to make the extension binaries smaller! The DuckLake extension's size was reduced by ~30%, from 17 MB to 12 MB. For smaller extensions such as Excel, the reduction is more than 60%, from 9 MB to 3 MB.</p>

<!-- markdownlint-enable MD001 -->

<h2 id="summary">Summary</h2>

<p>These were a few highlights – but there are many more features and improvements in this release.
There have been over 6500 commits by close to 100 contributors since v1.4. The full <a href="https://github.com/duckdb/duckdb/releases/tag/v1.5.0">release notes can be found on GitHub</a>. We would like to thank our community for providing detailed issue reports and feedback. And again, our special thanks go to external contributors!</p>

<p>PS: If you visited this blog post through a direct link – we also rolled out a new <a href="/">landing page</a>!</p>

<!-- markdownlint-disable MD040 -->

<h2 id="appendix-example-dataset">Appendix: Example Dataset</h2>

<details>
  <summary>
See the code that creates the example databases.
</summary>
  <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">ATTACH</span> <span class="s1">'numbers1.db'</span><span class="p">;</span>
<span class="k">ATTACH</span> <span class="s1">'numbers2.db'</span><span class="p">;</span>
<span class="k">ATTACH</span> <span class="s1">'animals.db'</span><span class="p">;</span>

<span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">numbers1.tbl</span> <span class="k">AS</span> <span class="k">FROM</span> <span class="nf">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span> <span class="n">t</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>

<span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">numbers2.tbl</span> <span class="k">AS</span> <span class="k">FROM</span> <span class="nf">range</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">6</span><span class="p">)</span> <span class="n">t</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>

<span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">animals.ducks</span> <span class="k">AS</span>
<span class="k">FROM</span> <span class="p">(</span><span class="k">VALUES</span>
    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="s1">'Labrador Duck'</span><span class="p">,</span> <span class="mi">1878</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="s1">'Mallard'</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="s1">'Crested Shelduck'</span><span class="p">,</span> <span class="mi">1964</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="s1">'Wood Duck'</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="s1">'Pink-headed Duck'</span><span class="p">,</span> <span class="mi">1949</span><span class="p">)</span>
<span class="p">)</span> <span class="n">t</span><span class="p">(</span><span class="n">id</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">extinct_year</span><span class="p">);</span>

<span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">animals.swans</span> <span class="k">AS</span>
<span class="k">FROM</span> <span class="p">(</span><span class="k">VALUES</span>
    <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="s1">'Aurora'</span><span class="p">,</span> <span class="s1">'Mute Swan'</span><span class="p">,</span> <span class="s1">'White'</span><span class="p">,</span> <span class="s1">'European lakes and rivers'</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="s1">'Midnight'</span><span class="p">,</span> <span class="s1">'Black Swan'</span><span class="p">,</span> <span class="s1">'Black'</span><span class="p">,</span> <span class="s1">'Australian wetlands'</span><span class="p">),</span>
    <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="s1">'Tundra'</span><span class="p">,</span> <span class="s1">'Tundra Swan'</span><span class="p">,</span> <span class="s1">'White'</span><span class="p">,</span> <span class="s1">'Arctic and subarctic regions'</span><span class="p">)</span>
<span class="p">)</span> <span class="n">t</span><span class="p">(</span><span class="n">id</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">species</span><span class="p">,</span> <span class="n">color</span><span class="p">,</span> <span class="n">habitat</span><span class="p">);</span>

<span class="k">DETACH</span> <span class="n">numbers1</span><span class="p">;</span>
<span class="k">DETACH</span> <span class="n">numbers2</span><span class="p">;</span>
<span class="k">DETACH</span> <span class="n">animals</span><span class="p">;</span>
</code></pre></div>  </div>
</details>]]></content><author><name>The DuckDB team</name></author><category term="release" /><summary type="html"><![CDATA[We are releasing DuckDB version 1.5.0, codenamed “Variegata”. This release comes with a friendly CLI (a new, more ergonomic command line client), support for the `VARIANT` type, a built-in `GEOMETRY` type, along with many other features and optimizations. The v1.4.0 LTS line (“Andium”) will keep receiving updates until its end-of-life in September 2026.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://duckdb.org/images/blog/thumbs/duckdb-release-1-5-0.png" /><media:content medium="image" url="https://duckdb.org/images/blog/thumbs/duckdb-release-1-5-0.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Announcing DuckDB 1.4.4 LTS</title><link href="https://duckdb.org/2026/01/26/announcing-duckdb-144.html" rel="alternate" type="text/html" title="Announcing DuckDB 1.4.4 LTS" /><published>2026-01-26T00:00:00+00:00</published><updated>2026-01-26T00:00:00+00:00</updated><id>https://duckdb.org/2026/01/26/announcing-duckdb-144</id><content type="html" xml:base="https://duckdb.org/2026/01/26/announcing-duckdb-144.html"><![CDATA[<p>In this blog post, we highlight a few important fixes in DuckDB v1.4.4, the fourth patch release in <a href="/2025/09/16/announcing-duckdb-140.html">DuckDB's 1.4 LTS line</a>.
The release ships bugfixes, performance improvements and security patches. You can find the complete <a href="https://github.com/duckdb/duckdb/releases/tag/v1.4.4">release notes on GitHub</a>.</p>

<p>To install the new version, please visit the <a href="/install/">installation page</a>.</p>

<h2 id="fixes">Fixes</h2>

<p>This version ships a number of performance improvements and bugfixes.</p>


<h3 id="correctness">Correctness</h3>

<ul>
  <li><a href="https://github.com/duckdb/duckdb/issues/20008"><code class="language-plaintext highlighter-rouge">#20008</code> Unexpected result when using utility function ALIAS</a></li>
  <li><a href="https://github.com/duckdb/duckdb/issues/20410"><code class="language-plaintext highlighter-rouge">#20410</code> ANTI JOIN produces wrong results with materialized CTEs</a></li>
  <li><a href="https://github.com/duckdb/duckdb/issues/20156"><code class="language-plaintext highlighter-rouge">#20156</code> Streaming window unions produce incorrect results</a></li>
  <li><a href="https://github.com/duckdb/duckdb/issues/20413"><code class="language-plaintext highlighter-rouge">#20413</code> ASOF joins with <code class="language-plaintext highlighter-rouge">predicate</code> fail with different errors for FULL, RIGHT, SEMI, and ANTI join types</a></li>
  <li><a href="https://github.com/duckdb/duckdb/issues/20090"><code class="language-plaintext highlighter-rouge">#20090</code> mode() produces corrupted UTF-8 strings in parallel execution</a></li>
</ul>

<h3 id="crashes-and-internal-errors">Crashes and Internal Errors</h3>

<ul>
  <li><a href="https://github.com/duckdb/duckdb-python/issues/127"><code class="language-plaintext highlighter-rouge">#20468</code> Segfault in Hive partitioning with NULL values</a></li>
  <li><a href="https://github.com/duckdb/duckdb/issues/20086"><code class="language-plaintext highlighter-rouge">#20086</code> Incorrect results when using positional joins and indexes</a></li>
  <li><a href="https://github.com/duckdb/duckdb/issues/20415"><code class="language-plaintext highlighter-rouge">#20415</code> C API data creation causes segfault</a></li>
</ul>

<h3 id="performance">Performance</h3>

<ul>
  <li><a href="https://github.com/duckdb/duckdb/pull/20252"><code class="language-plaintext highlighter-rouge">#20252</code> Optimize prepared statement parameter lookups</a></li>
  <li><a href="https://github.com/duckdb/duckdb/pull/20284"><code class="language-plaintext highlighter-rouge">#20284</code> dbgen: use TaskExecutor framework to respect the <code class="language-plaintext highlighter-rouge">threads</code> setting</a></li>
</ul>

<h3 id="miscellaneous">Miscellaneous</h3>

<ul>
  <li><a href="https://github.com/duckdb/duckdb/issues/20233"><code class="language-plaintext highlighter-rouge">#20233</code> Function chaining not allowed in QUALIFY</a></li>
  <li><a href="https://github.com/duckdb/duckdb/pull/20339"><code class="language-plaintext highlighter-rouge">#20339</code> Use UTF-16 console output in Windows shell</a></li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>This post was a short summary of the changes in v1.4.4. As usual, you can find the <a href="https://github.com/duckdb/duckdb/releases/tag/v1.4.4">full release notes on GitHub</a>.
We would like to thank our contributors for providing detailed issue reports and patches.
In the coming month, we'll release DuckDB v1.5.0.
We'll also keep v1.4 LTS updated until mid-September. We'll announce the release date of v1.4.5 in the <a href="/release_calendar.html">release calendar</a> in the coming months.</p>

<blockquote>
  <p>Earlier today, we pushed an incorrect tag that was visible for a few minutes.
No binaries or extensions were available under this tag and we replaced it as soon as we noticed the issue.
Our apologies for the erroneous release.</p>
</blockquote>]]></content><author><name>The DuckDB team</name></author><category term="release" /><summary type="html"><![CDATA[Today we are releasing DuckDB 1.4.4 with bugfixes and performance improvements.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://duckdb.org/images/blog/thumbs/duckdb-release-1-4-4-lts.jpg" /><media:content medium="image" url="https://duckdb.org/images/blog/thumbs/duckdb-release-1-4-4-lts.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Announcing Vortex Support in DuckDB</title><link href="https://duckdb.org/2026/01/23/duckdb-vortex-extension.html" rel="alternate" type="text/html" title="Announcing Vortex Support in DuckDB" /><published>2026-01-23T00:00:00+00:00</published><updated>2026-01-23T00:00:00+00:00</updated><id>https://duckdb.org/2026/01/23/duckdb-vortex-extension</id><content type="html" xml:base="https://duckdb.org/2026/01/23/duckdb-vortex-extension.html"><![CDATA[<p>I think it is worth starting this intro by talking a little bit about the established format for columnar data. Parquet has done some amazing things for analytics. If you go back to the times where CSV was the better alternative, then you know how important Parquet is. However, even if the  specification has evolved over time, Parquet has some design constraints. A particular limitation is that it is block-compressed and engines need to decompress pages in order to do further operations like filtering, decoding values, etc. For a while, <a href="https://www.cs.cmu.edu/~pavlo/blog/2026/01/2025-databases-retrospective.html?#fileformats">researchers and private companies</a> have been working on alternatives to Parquet that could improve on some of Parquet’s shortcomings. Vortex, from the SpiralDB team, is one of them.</p>

<h2 id="what-is-vortex">What is Vortex?</h2>

<p><a href="https://vortex.dev/">Vortex</a> is an extensible, open source format for columnar data. It was created to handle heterogeneous compute patterns and different data modalities. But what does this mean?</p>

<blockquote>
  <p>The project was donated to the Linux Foundation by the <a href="https://spiraldb.com/post/vortex-a-linux-foundation-project">SpiralDB</a> team in August 2025.</p>
</blockquote>

<p>Vortex provides different layouts and encodings for different data types. Some of the most notable are <a href="/library/alp/">ALP</a> for floating-point data and <a href="/2022/10/28/lightweight-compression.html">FSST</a> for strings. This lightweight compression strategy keeps data sizes down while enabling one of Vortex’s most important features: compute functions. By knowing the encoded layout of the data, Vortex is able to run arbitrary expressions on compressed data. This allows a Vortex reader to execute, for example, filter expressions within storage segments without decompressing data.</p>

<p>We mentioned heterogeneous compute to emphasize that Vortex was designed with the idea of having optimized layouts for different data types, including vectors, large text, or even image and audio data, but also to maximize CPU or GPU saturation. The idea is that decompression is deferred all the way to the GPU or CPU, enabling what Vortex calls “late materialization”. The <a href="/library/fastlanes/">FastLanes</a> encoding, a project originating at CWI (like DuckDB), is one of the main drivers behind this feature.</p>

<p>Vortex also supports dynamically loaded libraries (similar to DuckDB extensions) that provide new encodings for specific types as well as specific compute functions, e.g., for geospatial data. Another very interesting feature is the ability to embed WebAssembly compute kernels into the file itself, which the reader can then apply when processing that file.</p>

<p>Besides DuckDB, other engines such as DataFusion, Spark and Arrow already offer integration with Vortex.</p>

<blockquote>
  <p>For more information, check out the <a href="https://spiraldb.com/post/vortex-a-linux-foundation-project">Vortex documentation</a>.</p>
</blockquote>

<h2 id="the-duckdb-vortex-extension">The DuckDB Vortex Extension</h2>

<p>DuckDB is, as the name says, a database, but it is also widely used as an engine to query many different data sources. Through core or community extensions, DuckDB can integrate with:</p>

<ul>
  <li>Databases like Snowflake, BigQuery or PostgreSQL.</li>
  <li>Lakehouse formats like Delta, Iceberg or DuckLake.</li>
  <li>File formats, most notably JSON, CSV, Parquet and most recently Vortex.</li>
</ul>

<blockquote>
  <p>The community has gotten very creative, though, so these days you can even read YAML and Markdown with DuckDB using <a href="/community_extensions/">community extensions</a>.</p>
</blockquote>

<p>All this is possible due to the DuckDB <a href="/docs/lts/extensions/overview.html">extension system</a>, which makes it relatively easy to implement logic to interact with different file formats or external systems.</p>

<p>The SpiralDB team built a <a href="https://github.com/vortex-data/duckdb-vortex">DuckDB extension</a>. Together with the <a href="https://duckdblabs.com/">DuckDB Labs</a> team, we have made the extension available as a <a href="/docs/lts/core_extensions/overview.html">core DuckDB extension</a>, so that the community can enjoy Vortex as a first-class citizen in DuckDB.</p>

<h3 id="example-usage">Example Usage</h3>

<p>Installing and using the Vortex extension is very simple:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">INSTALL</span><span class="n"> vortex</span><span class="p">;</span>
<span class="k">LOAD</span><span class="n"> vortex</span><span class="p">;</span>
</code></pre></div></div>

<p>Then, you can easily use it to read and write, similar to other extensions such as Parquet.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="nf">read_vortex</span><span class="p">(</span><span class="s1">'my.vortex'</span><span class="p">);</span>

<span class="k">COPY</span> <span class="p">(</span><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="nf">generate_series</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span> <span class="n">t</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
<span class="k">TO</span> <span class="s1">'my.vortex'</span> <span class="p">(</span><span class="k">FORMAT</span> <span class="k">vortex</span><span class="p">);</span>
</code></pre></div></div>
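<p>Predicates can be pushed into the scan as usual, which is where Vortex's ability to evaluate expressions on compressed data pays off (a sketch; the filename and columns follow the TPC-H schema and are illustrative):</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT l_orderkey, l_extendedprice
FROM read_vortex('lineitem.vortex')
WHERE l_shipdate &lt;= DATE '1998-09-02';
</code></pre></div></div>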

<h3 id="why-vortex-and-duckdb">Why Vortex and DuckDB?</h3>

<p>Vortex claims to do well primarily at three use cases:</p>

<ul>
  <li>Traditional SQL analytics: Through late decompression and compute expressions on compressed data, Vortex can filter down data within the storage segment, reducing IO and memory consumption.</li>
  <li>Machine learning pre-processing pipelines: By supporting a wide variety of encodings for different data types, Vortex claims to be effective at reading and writing data, whether it is audio, text, images or vectors.</li>
  <li>AI model training: Encodings such as FastLanes allow for a very efficient copy of data to the GPU. Vortex is aiming at being able to copy data directly from S3 object storage to the GPU.</li>
</ul>

<p>The promise of more efficient IO and memory use through late decompression is a good reason to try DuckDB and Vortex for SQL analytics. On another note, if you are looking at running analytics on unified datasets that are used for multiple use cases, including pre-processing pipelines and AI training, then Vortex may be a good candidate since it is designed to fit all of these use cases well.</p>

<h3 id="performance-experiment">Performance Experiment</h3>

<p>For those who are number-hungry, we ran a TPC-H benchmark at scale factor 100 with DuckDB to understand how Vortex performs as a storage format compared to Parquet. We tried to make the benchmark as fair as possible. These are the parameters:</p>

<ul>
  <li>Run on Mac M1 with 10 cores &amp; 32 GB of memory.</li>
  <li>The benchmark runs each query 5 times and the average is used for the final report.</li>
  <li>The DuckDB connection is closed after each query to make runs “colder” and to prevent DuckDB's caching (particularly with Parquet) from influencing the results. OS page caching does influence subsequent runs, but we decided to acknowledge this factor and still keep the first run.</li>
  <li>Each TPC-H table is a single file, which means that lineitem files for Parquet and Vortex are quite large (both around 20 GB). This allows us to ignore the effect of globbing and having many small files.</li>
  <li>Data files used for the benchmark are generated with <a href="https://github.com/clflushopt/tpchgen-rs">tpchgen-rs</a> and are copied out using DuckDB’s Parquet and Vortex extensions.</li>
  <li>We compared Vortex against Parquet v1 and v2. The v2 specification allows for considerably faster reading than the v1 specification but many writers do not support this, so we thought it was worth including both.</li>
</ul>

<p><strong>The results are very good.</strong> With Vortex, the TPC-H benchmark runs 18% faster than with Parquet v2 and 35% faster than with Parquet v1 (comparing geometric means, which is the recommended way to aggregate benchmark timings).</p>

<p>Another interesting result is the standard deviation across runs. There was a considerable difference between the first (and coldest) run of each query and subsequent runs in Parquet, while Vortex performed very well across all runs with a much smaller standard deviation.</p>

<p><img src="/images/blog/duckdb-vortex/tpch_summary.png" alt="summary" /></p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Format</th>
      <th style="text-align: right">Geometric Mean (s)</th>
      <th style="text-align: right">Arithmetic Mean (s)</th>
      <th style="text-align: right">Avg Std Dev (s)</th>
      <th style="text-align: right">Total Time (s)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left">parquet_v1</td>
      <td style="text-align: right">2.324712</td>
      <td style="text-align: right">2.875722</td>
      <td style="text-align: right">0.145914</td>
      <td style="text-align: right">63.265881</td>
    </tr>
    <tr>
      <td style="text-align: left">parquet_v2</td>
      <td style="text-align: right">1.839171</td>
      <td style="text-align: right">2.288013</td>
      <td style="text-align: right">0.182962</td>
      <td style="text-align: right">50.336281</td>
    </tr>
    <tr>
      <td style="text-align: left">vortex</td>
      <td style="text-align: right">1.507675</td>
      <td style="text-align: right">1.991289</td>
      <td style="text-align: right">0.078893</td>
      <td style="text-align: right">43.808349</td>
    </tr>
  </tbody>
</table>
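<p>The quoted speedups can be recomputed from the geometric means in the table:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT
    round(100 * (1 - 1.507675 / 2.324712), 1) AS vs_parquet_v1_pct,  -- ≈ 35.1
    round(100 * (1 - 1.507675 / 1.839171), 1) AS vs_parquet_v2_pct;  -- ≈ 18.0
</code></pre></div></div>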

<blockquote>
  <p>The times did vary across different runs of the same benchmark, and subsequent runs have yielded similar results but with slight variations. The differences between Parquet v2 and Vortex have always been around 12-18% in geometric means and around 8-14% in total times. Benchmarking is very hard!</p>
</blockquote>

<!-- markdownlint-disable MD040 MD046 -->

<details>
  <summary>
Click here to see a more detailed breakdown of the benchmark results.
</summary>

  <p>This figure shows the results per query, including the standard deviation error bar.<br />
<img src="/images/blog/duckdb-vortex/tpch_rowgram.png" alt="mean_per_query" /><br />
The following is a summary of the dataset sizes in GB. Note that both Parquet v1 and v2 use the default compression of the DuckDB Parquet writer, which is Snappy. In this case, Vortex does not use any general-purpose compression but still keeps the data sizes competitive.</p>

  <table>
    <thead>
      <tr>
        <th style="text-align: left">Table</th>
        <th style="text-align: left">parquet_v1</th>
        <th style="text-align: left">parquet_v2</th>
        <th style="text-align: left">vortex</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td style="text-align: left">customer</td>
        <td style="text-align: left">1.15</td>
        <td style="text-align: left">0.99</td>
        <td style="text-align: left">1.06</td>
      </tr>
      <tr>
        <td style="text-align: left">lineitem</td>
        <td style="text-align: left">21.15</td>
        <td style="text-align: left">16.02</td>
        <td style="text-align: left">18.14</td>
      </tr>
      <tr>
        <td style="text-align: left">nation</td>
        <td style="text-align: left">0.00</td>
        <td style="text-align: left">0.00</td>
        <td style="text-align: left">0.00</td>
      </tr>
      <tr>
        <td style="text-align: left">orders</td>
        <td style="text-align: left">6.02</td>
        <td style="text-align: left">4.54</td>
        <td style="text-align: left">5.03</td>
      </tr>
      <tr>
        <td style="text-align: left">part</td>
        <td style="text-align: left">0.59</td>
        <td style="text-align: left">0.47</td>
        <td style="text-align: left">0.54</td>
      </tr>
      <tr>
        <td style="text-align: left">partsupp</td>
        <td style="text-align: left">4.07</td>
        <td style="text-align: left">3.33</td>
        <td style="text-align: left">3.72</td>
      </tr>
      <tr>
        <td style="text-align: left">region</td>
        <td style="text-align: left">0.00</td>
        <td style="text-align: left">0.00</td>
        <td style="text-align: left">0.00</td>
      </tr>
      <tr>
        <td style="text-align: left">supplier</td>
        <td style="text-align: left">0.07</td>
        <td style="text-align: left">0.06</td>
        <td style="text-align: left">0.07</td>
      </tr>
      <tr>
        <td style="text-align: left"><strong>total</strong></td>
        <td style="text-align: left">33.06</td>
        <td style="text-align: left">25.40</td>
        <td style="text-align: left">28.57</td>
      </tr>
    </tbody>
  </table>

</details>

<!-- markdownlint-enable MD040 MD046 -->

<h2 id="conclusion">Conclusion</h2>

<p>Vortex is a very interesting alternative to established columnar formats like Parquet. Its focus on lightweight compression encodings, late decompression and being able to run compute expressions on compressed data makes it very interesting for a wide range of use cases. With regard to DuckDB, we see that Vortex is already very performant for analytical queries, where it is on par or better than Parquet v2 on the TPC-H benchmark queries.</p>

<blockquote>
  <p>Vortex has been <a href="https://docs.vortex.dev/specs/file-format">backwards compatible</a> since version 0.36.0, which was released more than 6 months ago. Vortex is now at version 0.56.0.</p>
</blockquote>]]></content><author><name>Guillermo Sanchez, SpiralDB Team</name></author><category term="benchmark" /><summary type="html"><![CDATA[Vortex is a new columnar file format with a very promising design. SpiralDB and DuckDB Labs have partnered to give you a very fast experience while reading and writing Vortex files!]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://duckdb.org/images/blog/thumbs/vortex.svg" /><media:content medium="image" url="https://duckdb.org/images/blog/thumbs/vortex.svg" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>