mlx
MLX: An array framework for Apple silicon
v0.30.1
Highlights
- RDMA over Thunderbolt with the JACCL backend (macOS >= 26.2) (some numbers)
- JIT compilation for NAX kernels so they can be used in MLX Swift
- CUDA improvements
- Many improvements to SDPA (masking, T_q != T_kv)
- Faster quantize/dequantize
- QQMM to make use of faster tensor cores
- Fix in col reduce speeds up training
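The SDPA highlights above (array masks, support for T_q != T_kv) refer to MLX's fused attention kernels; the actual API is `mx.fast.scaled_dot_product_attention`. As a reference for the semantics only, here is a dependency-free sketch of scaled dot-product attention with a boolean array mask and differing query/key lengths. This is not MLX's implementation, and the function name `sdpa` is made up for illustration:

```python
import math

def sdpa(q, k, v, mask=None):
    """Scaled dot-product attention on plain nested lists.

    q: T_q x d queries, k/v: T_kv x d keys/values (T_q and T_kv
    may differ), mask: T_q x T_kv booleans, True = may attend.
    """
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for i, qi in enumerate(q):
        # Scaled dot products of this query against every key.
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        # Masked positions get -inf so softmax assigns them zero weight.
        if mask is not None:
            scores = [s if mask[i][j] else float("-inf")
                      for j, s in enumerate(scores)]
        # Numerically stable softmax over the key axis.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps]
        # Weighted sum of values.
        out.append([sum(wj * vj[t] for wj, vj in zip(w, v))
                    for t in range(len(v[0]))])
    return out
```

With a mask row of `[True, False]`, the single query attends only to the first key, so the output row equals `v[0]`.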
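The faster quantize/dequantize highlight concerns MLX's affine group quantization, exposed as `mx.quantize` / `mx.dequantize`, which return per-group scales and biases. A minimal pure-Python sketch of that scheme on a flat list, for intuition only (the helper names here are hypothetical and bear no relation to MLX's optimized kernels):

```python
def quantize(w, group_size=4, bits=4):
    """Affine group quantization of a flat list of floats.

    Each group of `group_size` values is mapped to integers in
    [0, 2**bits - 1] using a per-group scale and bias (zero point).
    """
    levels = (1 << bits) - 1
    q, scales, biases = [], [], []
    for g in range(0, len(w), group_size):
        group = w[g:g + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / levels or 1.0  # avoid div-by-zero for flat groups
        scales.append(scale)
        biases.append(lo)
        q.extend(round((x - lo) / scale) for x in group)
    return q, scales, biases

def dequantize(q, scales, biases, group_size=4):
    """Invert quantize(): x ≈ scale * q + bias per group."""
    out = []
    for g in range(0, len(q), group_size):
        s, b = scales[g // group_size], biases[g // group_size]
        out.extend(s * x + b for x in q[g:g + group_size])
    return out
```

Values that land exactly on quantization levels round-trip losslessly; in general the reconstruction error is bounded by half a step, `scale / 2`, per element.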
What's Changed
- patch + fix docs build by @awni in https://github.com/ml-explore/mlx/pull/2799
- Fix macos release target and linux arm release by @awni in https://github.com/ml-explore/mlx/pull/2802
- Fix cuda allocator copy condition by @awni in https://github.com/ml-explore/mlx/pull/2800
- [CUDA] Partly fix random for large sizes by @awni in https://github.com/ml-explore/mlx/pull/2798
- patch bump for future version by @awni in https://github.com/ml-explore/mlx/pull/2804
- Centralize NAX condition by @awni in https://github.com/ml-explore/mlx/pull/2811
- Tolerance for some ops tests on cuda by @awni in https://github.com/ml-explore/mlx/pull/2815
- Fix typo: refs/head/main => refs/heads/main by @zcbenz in https://github.com/ml-explore/mlx/pull/2818
- Add float64 Eig and complex64 SVD/Eig support (Fixes #2708) by @harsh-sutariya in https://github.com/ml-explore/mlx/pull/2737
- Fix mx.core.load type annotation by @CC-Yeh in https://github.com/ml-explore/mlx/pull/2819
- Force cudaGraphExec reinstantiation when clusters are used by @andportnoy in https://github.com/ml-explore/mlx/pull/2813
- Bump actions/checkout from 5 to 6 by @dependabot[bot] in https://github.com/ml-explore/mlx/pull/2828
- Fix mx.core.linspace type annotation by @CC-Yeh in https://github.com/ml-explore/mlx/pull/2820
- [CUDA] Exit on crash and more helpful errors by @awni in https://github.com/ml-explore/mlx/pull/2830
- [CUDA] Add debug env to save cuda graphs to dot files by @zcbenz in https://github.com/ml-explore/mlx/pull/2825
- [CUDA] Output of SDPA should have same layout with inputs by @zcbenz in https://github.com/ml-explore/mlx/pull/2826
- Merge build-cuda and build-linux actions by @zcbenz in https://github.com/ml-explore/mlx/pull/2783
- [CUDA] Support array mask in SDPA by @zcbenz in https://github.com/ml-explore/mlx/pull/2822
- [CUDA] Faster rms norm for small dimension by @awni in https://github.com/ml-explore/mlx/pull/2838
- Added clarification to apply_fn parameter of apply_to_modules by @yuchaoran2011 in https://github.com/ml-explore/mlx/pull/2831
- [CUDA] Use cuDNN attention when T_q != T_kv by @zcbenz in https://github.com/ml-explore/mlx/pull/2843
- [CUDA] Migrate conv code to new cuDNN APIs by @zcbenz in https://github.com/ml-explore/mlx/pull/2847
- Support more Numpy interfaces for masked_scatter by @CC-Yeh in https://github.com/ml-explore/mlx/pull/2832
- Use thread local capture mode by @awni in https://github.com/ml-explore/mlx/pull/2850
- Fix export scatters by @awni in https://github.com/ml-explore/mlx/pull/2852
- Reduce JVP by @awni in https://github.com/ml-explore/mlx/pull/2854
- Fix graph updating by @awni in https://github.com/ml-explore/mlx/pull/2857
- Fix init from double by @awni in https://github.com/ml-explore/mlx/pull/2861
- Update gumbel function signature parameters by @tianenchong in https://github.com/ml-explore/mlx/pull/2868
- Added support for pytree types that inherit from tuple and typing.namedtuple by @romanoneg in https://github.com/ml-explore/mlx/pull/2845
- Layer norm throws on dimension mismatch by @awni in https://github.com/ml-explore/mlx/pull/2870
- Fix compile copying by @awni in https://github.com/ml-explore/mlx/pull/2871
- Do a PyPi release for cuda on arm by @awni in https://github.com/ml-explore/mlx/pull/2866
- Add a 2-pass col reduce for CUDA by @angeloskath in https://github.com/ml-explore/mlx/pull/2863
- [CUDA] Faster general copy by @awni in https://github.com/ml-explore/mlx/pull/2873
- [CUDA] Release build for cuda 13 by @awni in https://github.com/ml-explore/mlx/pull/2872
- Make allocator::malloc throw on allocation failure by @zcbenz in https://github.com/ml-explore/mlx/pull/2874
- [Metal] No copy array init by @awni in https://github.com/ml-explore/mlx/pull/2875
- Try not to fail when there should be memory available by @awni in https://github.com/ml-explore/mlx/pull/2869
- [CUDA] Enable more graphs to be updatable by @awni in https://github.com/ml-explore/mlx/pull/2883
- Fix docs: replace nonexistent mx.random.randn with mx.random.normal by @Satyam12singh in https://github.com/ml-explore/mlx/pull/2890
- Allow events in sub graph to be updatable by @awni in https://github.com/ml-explore/mlx/pull/2886
- Bump minimum required Python version by @ngoldbaum in https://github.com/ml-explore/mlx/pull/2891
- Do not use simd neon intrinsics on x86 by @davidkoski in https://github.com/ml-explore/mlx/pull/2893
- Fix input buffer donation in compile by @CC-Yeh in https://github.com/ml-explore/mlx/pull/2897
- Update nanobind pin to most recent version by @ngoldbaum in https://github.com/ml-explore/mlx/pull/2896
- fp quantize by @nastya236 in https://github.com/ml-explore/mlx/pull/2892
- Fix grad in place updates by @awni in https://github.com/ml-explore/mlx/pull/2899
- [CUDA] Add host nodes to subgraph types for graph update by @awni in https://github.com/ml-explore/mlx/pull/2901
- Fix: possible heap-buffer-overflow in RandomBits::eval_cpu (follow for new ASAN CI tests) by @incertum in https://github.com/ml-explore/mlx/pull/2877
- Fix ccache getting disabled by @zcbenz in https://github.com/ml-explore/mlx/pull/2905
- Fix attention for large sizes by @awni in https://github.com/ml-explore/mlx/pull/2903
- No VJP for mask or sinks in attention by @awni in https://github.com/ml-explore/mlx/pull/2909
- Bump actions/upload-artifact from 5 to 6 by @dependabot[bot] in https://github.com/ml-explore/mlx/pull/2911
- Bump actions/download-artifact from 6 to 7 by @dependabot[bot] in https://github.com/ml-explore/mlx/pull/2912
- Use CUDA runtime headers from local python package by @zcbenz in https://github.com/ml-explore/mlx/pull/2906
- DOC: Add compile state example by @Satyam12singh in https://github.com/ml-explore/mlx/pull/2910
- qqmm by @nastya236 in https://github.com/ml-explore/mlx/pull/2789
- Thunderbolt RDMA communications backend by @angeloskath in https://github.com/ml-explore/mlx/pull/2808
- Add JIT support for NAX kernels by @jagrit06 in https://github.com/ml-explore/mlx/pull/2916
- Fix warnings for the NAX build by @angeloskath in https://github.com/ml-explore/mlx/pull/2921
New Contributors
- @dependabot[bot] made their first contribution in https://github.com/ml-explore/mlx/pull/2828
- @yuchaoran2011 made their first contribution in https://github.com/ml-explore/mlx/pull/2831
- @tianenchong made their first contribution in https://github.com/ml-explore/mlx/pull/2868
- @romanoneg made their first contribution in https://github.com/ml-explore/mlx/pull/2845
- @Satyam12singh made their first contribution in https://github.com/ml-explore/mlx/pull/2890
- @ngoldbaum made their first contribution in https://github.com/ml-explore/mlx/pull/2891
Full Changelog: https://github.com/ml-explore/mlx/compare/v0.30.0...v0.30.1