Query Lance datasets directly from DuckDB.
Maintainer(s):
Xuanwo
Installing and Loading
INSTALL lance FROM community;
LOAD lance;
Example
INSTALL lance FROM community;
LOAD lance;
-- Query a local dataset
SELECT * FROM 'path/to/dataset.lance' LIMIT 10;
-- Query an S3 dataset
SELECT * FROM 's3://bucket/path/to/dataset.lance' LIMIT 10;
-- S3 authentication via DuckDB Secrets
CREATE SECRET (TYPE S3, provider credential_chain);
SELECT * FROM 's3://bucket/path/to/dataset.lance' LIMIT 10;
-- Vector search (returns `_distance`, smaller is closer)
SELECT id, label, _distance
FROM lance_vector_search(
'path/to/dataset.lance', 'vec',
[0.1, 0.2, 0.3, 0.4]::FLOAT[],
k = 5, prefilter = true
)
ORDER BY _distance ASC;
-- Full-text search (returns `_score`, larger is better)
SELECT id, text, _score
FROM lance_fts(
'path/to/dataset.lance', 'text', 'puppy',
k = 10, prefilter = true
)
ORDER BY _score DESC;
-- Hybrid search (returns `_hybrid_score`, larger is better)
SELECT id, _hybrid_score, _distance, _score
FROM lance_hybrid_search(
'path/to/dataset.lance',
'vec', [0.1, 0.2, 0.3, 0.4]::FLOAT[],
'text', 'puppy',
k = 10, alpha = 0.5, oversample_factor = 4
)
ORDER BY _hybrid_score DESC;
About lance
Lance is a modern columnar data format optimized for ML/AI workloads, with native cloud storage support. This extension brings Lance into a familiar SQL workflow. For detailed setup and usage instructions, visit the extension repository.
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| __lance_namespace_scan | table | NULL | NULL | |
| __lance_scan | table | NULL | NULL | |
| lance_fts | table | NULL | NULL | |
| lance_hybrid_search | table | NULL | NULL | |
| lance_vector_search | table | NULL | NULL |