v3.0.0-beta.2
What's Changed
Critical Fixes ‼️
- fix: deduplicate row addresses in take to prevent panic by @wjones127 in https://github.com/lance-format/lance/pull/5881
- fix: fts flat search drops rows when avg_doc_length < 1.0 by @wjones127 in https://github.com/lance-format/lance/pull/5897
New Features 🎉
- feat: add RLE support for block by @yingjianwu98 in https://github.com/lance-format/lance/pull/4937
- feat: dictionary index always 32 bits by @yingjianwu98 in https://github.com/lance-format/lance/pull/5011
- feat: abort dictionary encode if not useful by @yingjianwu98 in https://github.com/lance-format/lance/pull/5055
- feat(cdf): cdf support upsert for views by @zhangyue19921010 in https://github.com/lance-format/lance/pull/5369
- feat(compaction): binary copy capability for compaction by @zhangyue19921010 in https://github.com/lance-format/lance/pull/5434
- feat: add alter column nullable to non-nullable support by @Xuanwo in https://github.com/lance-format/lance/pull/5589
- feat: evolve all_null_layout to constant layout by @Xuanwo in https://github.com/lance-format/lance/pull/5641
- feat(java): support building vector index distributively by @majin1102 in https://github.com/lance-format/lance/pull/5664
- feat(rust): add datafusion catalog_provider through namespace by @majin1102 in https://github.com/lance-format/lance/pull/5686
- feat: support List and Struct type for KeyValue in inserted_rows.rs by @wojiaodoubao in https://github.com/lance-format/lance/pull/5713
- feat: support tencent cos by @ztorchan in https://github.com/lance-format/lance/pull/5740
- feat: add Lance-HF docs to lance.org/integrations/huggingface/ by @prrao87 in https://github.com/lance-format/lance/pull/5748
- feat(python): support namespace for tensorflow by @yuqi1129 in https://github.com/lance-format/lance/pull/5750
- feat(java): support json extraction by scanning by @majin1102 in https://github.com/lance-format/lance/pull/5770
- feat: expose blob handling APIs to python by @Xuanwo in https://github.com/lance-format/lance/pull/5790
- feat: add blob handling support for fragment by @Xuanwo in https://github.com/lance-format/lance/pull/5801
- feat: add plan/execute separation to FilteredReadExec by @LuQQiu in https://github.com/lance-format/lance/pull/5843
Bug Fixes 🐛
- fix: support system columns in dataset.take* operations by @hamersaw in https://github.com/lance-format/lance/pull/5722
- fix: skip missing indices in compaction rewrite by @AndreaBozzo in https://github.com/lance-format/lance/pull/5739
- fix(lance-linalg): check fp16kernels feature before arch-specific code by @durch in https://github.com/lance-format/lance/pull/5747
- refactor: align blob behavior that write via file format version, read via layout by @Xuanwo in https://github.com/lance-format/lance/pull/5752
- fix: fix deletion when using file-object-store:// by @cmccabe in https://github.com/lance-format/lance/pull/5760
- fix: remove unreasonable nullable check for data types in hash_joiner during merge operation by @zhangyue19921010 in https://github.com/lance-format/lance/pull/5784
- fix: allow unused_unsafe for __cpuid to support both stable and nightly by @jackye1995 in https://github.com/lance-format/lance/pull/5793
- fix: set JUnit dependency as test scope by @bryanck in https://github.com/lance-format/lance/pull/5815
- fix: fix remap so that it handles deletions correctly by @westonpace in https://github.com/lance-format/lance/pull/5828
- fix: inconsistent transposed pq code and metadata when building ivf_pq index in a distributed manner by @yanghua in https://github.com/lance-format/lance/pull/5834
- fix(java): panic when reading CreateIndex transaction by @majin1102 in https://github.com/lance-format/lance/pull/5853
- fix: fix mini-block dictionary bitpacking panic by @Xuanwo in https://github.com/lance-format/lance/pull/5860
- fix: fix boolean inline constant decoding by @Xuanwo in https://github.com/lance-format/lance/pull/5862
- fix: open additional storage options provider related apis in lance dataset by @jackye1995 in https://github.com/lance-format/lance/pull/5869
- fix: flaky test test_ann_prefilter for HNSW by @BubbleCal in https://github.com/lance-format/lance/pull/5870
- fix(java): init allocator for new dataset when checkout branch/tag by @fangbo in https://github.com/lance-format/lance/pull/5876
- fix: avoid panic when repdef serializes empty offsets by @fenfeng9 in https://github.com/lance-format/lance/pull/5890
- fix: avoid bitmap range panic on inverted bounds by @fenfeng9 in https://github.com/lance-format/lance/pull/5893
- fix: split index_statistics to reduce rustc query depth by @Xuanwo in https://github.com/lance-format/lance/pull/5894
Documentation 📚
- docs: fix issues in HF integration docs by @prrao87 in https://github.com/lance-format/lance/pull/5778
- docs: fix MkDocs protobuf reference for ConstantLayout by @Xuanwo in https://github.com/lance-format/lance/pull/5833
- docs: add array type support by @XuQianJin-Stars in https://github.com/lance-format/lance/pull/5884
Performance Improvements 🚀
- perf: add vector throughput benchmark by @westonpace in https://github.com/lance-format/lance/pull/5644
- perf: add a full text search benchmark by @westonpace in https://github.com/lance-format/lance/pull/5665
- perf: don't concat the batches for writing posting lists by @BubbleCal in https://github.com/lance-format/lance/pull/5769
- perf: add a lightweight scheduler implementation by @westonpace in https://github.com/lance-format/lance/pull/5773
- perf: use cpu pool to process all posting lists by @BubbleCal in https://github.com/lance-format/lance/pull/5780
- perf: calculate cardinality lazily by @Xuanwo in https://github.com/lance-format/lance/pull/5783
- perf: replace flatmap in build_distance_table by @wkalt in https://github.com/lance-format/lance/pull/5898
Other Changes
- refactor: change reader's get_range result to be a static future by @westonpace in https://github.com/lance-format/lance/pull/5755
- refactor(python): migrate torch.jit.script to torch.compile by @wjones127 in https://github.com/lance-format/lance/pull/5759
- test: fix tests broken by pandas 3 release by @westonpace in https://github.com/lance-format/lance/pull/5786
Full Changelog: https://github.com/lance-format/lance/compare/release-root/3.0.0-beta.N...v3.0.0-beta.2