mlx
MLX: An array framework for Apple silicon
Last updated 5 days ago
v0.30.4
Highlights
- Metal: Much faster vector fused grouped-query attention for long context
- CUDA: Several improvements to speed up LLM inference for CUDA backend
- CUDA: Support for dense MoEs
- CUDA: Better support for consumer GPUs (4090, 5090, RTX 6000, ...)
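For readers unfamiliar with the first highlight, the following is a minimal reference sketch of what grouped-query attention computes for a single query token (the "vector" decoding case). It is plain Python for illustration only and is not the Metal kernel shipped in this release. The function name `gqa_decode_step`, the nested-list layout, and the mapping of consecutive query heads to one shared key/value head are assumptions chosen to match the common convention.

```python
import math


def gqa_decode_step(q, k, v, scale=None):
    """Reference grouped-query attention for one query token (hypothetical helper).

    q: [n_q_heads][head_dim]            one query vector per query head
    k: [n_kv_heads][seq_len][head_dim]  cached keys
    v: [n_kv_heads][seq_len][head_dim]  cached values
    Returns a list shaped [n_q_heads][head_dim].
    """
    n_q_heads, n_kv_heads = len(q), len(k)
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group = n_q_heads // n_kv_heads  # query heads sharing one KV head
    head_dim = len(q[0])
    if scale is None:
        scale = 1.0 / math.sqrt(head_dim)

    out = []
    for h in range(n_q_heads):
        kv = h // group  # assumed mapping: consecutive query heads share a KV head
        # Scaled dot product of the query with every cached key.
        scores = [scale * sum(a * b for a, b in zip(q[h], key)) for key in k[kv]]
        # Numerically stable softmax over the sequence axis.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # Weighted average of the cached values.
        out.append([
            sum(w * val[d] for w, val in zip(weights, v[kv])) / total
            for d in range(head_dim)
        ])
    return out
```

Because several query heads read the same keys and values, a fused kernel can load each key/value block once per group rather than once per query head, which is where long-context speedups typically come from.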
What's Changed
- patch bump for next release by @awni in https://github.com/ml-explore/mlx/pull/2991
- Fix fence by @awni in https://github.com/ml-explore/mlx/pull/2998
- Reverts changing the MLX_IBV_DEVICES to MLX_JACCL_DEVICES by @angeloskath in https://github.com/ml-explore/mlx/pull/2999
- fix distributed all_to_sharded bias shard axis from -2 to -1 by @gufengc in https://github.com/ml-explore/mlx/pull/2987
- Fix sharding of quantized models with non-power-of-2 bits by @kernelpool in https://github.com/ml-explore/mlx/pull/3006
- Update CCCL to v3.1.3 by @zcbenz in https://github.com/ml-explore/mlx/pull/3012
- Fix python package install path in stubgen by @zcbenz in https://github.com/ml-explore/mlx/pull/3009
- Type Enhancement for Func Transforms and Bug Fix by @XXXXRT666 in https://github.com/ml-explore/mlx/pull/3003
- Do not clear disk space in setup-linux by @zcbenz in https://github.com/ml-explore/mlx/pull/3013
- Do not give workflow boolean inputs default values by @zcbenz in https://github.com/ml-explore/mlx/pull/3014
- Fix negative dim indexing by @MillaFleurs in https://github.com/ml-explore/mlx/pull/2994
- Windows CI by @zcbenz in https://github.com/ml-explore/mlx/pull/3021
- Optimize erf function with expm1f in Metal backend by @bjornefisk in https://github.com/ml-explore/mlx/pull/3025
- [CUDA] Faster grouped mm by @zcbenz in https://github.com/ml-explore/mlx/pull/3011
- PR 3007 Fix Seg Fault by @MillaFleurs in https://github.com/ml-explore/mlx/pull/3008
- Use higher precision for linspace with double by @awni in https://github.com/ml-explore/mlx/pull/3029
- Handle data smaller than BUFFER_SIZE in jaccl recv by @rltakashige in https://github.com/ml-explore/mlx/pull/3033
- build 26.0 release in actions by @awni in https://github.com/ml-explore/mlx/pull/3035
- Remove xmlrunner from macOS CI by @zcbenz in https://github.com/ml-explore/mlx/pull/3032
- Columnwise quantize by @nastya236 in https://github.com/ml-explore/mlx/pull/2989
- Turn nccl_stub into a normal target by @zcbenz in https://github.com/ml-explore/mlx/pull/3037
- Use cuda::std for math ops by @zcbenz in https://github.com/ml-explore/mlx/pull/3041
- win: symbol exports and minor fixes by @dhiltgen in https://github.com/ml-explore/mlx/pull/3024
- CUDA gather mv by @angeloskath in https://github.com/ml-explore/mlx/pull/3039
- Link with prebuilt OpenBLAS and fix shared libs build on Windows by @zcbenz in https://github.com/ml-explore/mlx/pull/3036
- Allow take on empty array when it makes sense by @awni in https://github.com/ml-explore/mlx/pull/3046
- Add missing include to buffer_cache.h by @Anri-Lombard in https://github.com/ml-explore/mlx/pull/3053
- Build and test python package on Windows CI by @zcbenz in https://github.com/ml-explore/mlx/pull/3049
- Fix some MSVC compilation errors by @zcbenz in https://github.com/ml-explore/mlx/pull/3048
- Use C++20 by @zcbenz in https://github.com/ml-explore/mlx/pull/3050
- Faster two pass sdpa by @awni in https://github.com/ml-explore/mlx/pull/3023
- Find system-installed cuDNN on Windows by @zcbenz in https://github.com/ml-explore/mlx/pull/3052
- Fix some NVCC warnings when building CUDA backend with MSVC by @zcbenz in https://github.com/ml-explore/mlx/pull/3038
- Hide symbols by default for mac/linux by @zcbenz in https://github.com/ml-explore/mlx/pull/3057
- [CUDA] Fast sorting by @awni in https://github.com/ml-explore/mlx/pull/3060
- Fix flaky macOS test by @awni in https://github.com/ml-explore/mlx/pull/3063
- Update pre-commit hooks and versions for clang-format, black, and isort by @NripeshN in https://github.com/ml-explore/mlx/pull/3059
- GPU discovery by @dhiltgen in https://github.com/ml-explore/mlx/pull/3055
- Add NAX Split-K GEMM for large-K matmuls to improve performance by @hxu296 in https://github.com/ml-explore/mlx/pull/3018
- Improve CPU discovery by @dhiltgen in https://github.com/ml-explore/mlx/pull/3068
- Fix long cache file path on Windows by @zcbenz in https://github.com/ml-explore/mlx/pull/3065
- Better support consumer CUDA GPUs by @jessegross in https://github.com/ml-explore/mlx/pull/3056
- Delay load CUDA libs and resolve DLL paths at runtime by @zcbenz in https://github.com/ml-explore/mlx/pull/3061
- Do not require ConcurrentManagedAccess when not used by @zcbenz in https://github.com/ml-explore/mlx/pull/3062
- Fp qmv by @awni in https://github.com/ml-explore/mlx/pull/2984
- remove thrust by @awni in https://github.com/ml-explore/mlx/pull/3067
New Contributors
- @gufengc made their first contribution in https://github.com/ml-explore/mlx/pull/2987
- @kernelpool made their first contribution in https://github.com/ml-explore/mlx/pull/3006
- @bjornefisk made their first contribution in https://github.com/ml-explore/mlx/pull/3025
- @rltakashige made their first contribution in https://github.com/ml-explore/mlx/pull/3033
- @dhiltgen made their first contribution in https://github.com/ml-explore/mlx/pull/3024
- @hxu296 made their first contribution in https://github.com/ml-explore/mlx/pull/3018
- @jessegross made their first contribution in https://github.com/ml-explore/mlx/pull/3056
Full Changelog: https://github.com/ml-explore/mlx/compare/v0.30.3...v0.30.4