v25.08.00

This is a beta release of cuPyNumeric.

Pip wheels are available on PyPI at https://pypi.org/project/nvidia-cupynumeric/ for Linux (x86-64 and ARM64, with CUDA and multi-node support) and macOS (ARM64). Conda packages are available at https://anaconda.org/legate/cupynumeric for Linux (x86-64 and ARM64, with CUDA and multi-node support). GASNet-based (rather than UCX-based) conda packages are published under the gex label. Windows is currently supported through WSL.

Documentation for this release can be found at https://docs.nvidia.com/cupynumeric/25.08/.

New features

Added functionality

Multi-node multi-GPU capable SVD, specialized for tall-skinny matrices
cupynumeric.cross
cupynumeric.insert
cupynumeric.logspace
cupynumeric.real_if_close
cupynumeric.roots
cupynumeric.ravel_multi_index
cupynumeric.copyto
cupynumeric.diagflat
cupynumeric.delete
cupynumeric.nan_to_num
Support multi-axis reductions
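
A minimal sketch of a few of the additions listed above. It assumes the new cuPyNumeric functions accept the same arguments as their NumPy namesakes, which is the project's stated goal but is not spelled out in these notes. The NumPy fallback import is only there so the snippet runs on machines without cuPyNumeric.

```python
# Fall back to NumPy so the sketch still runs where cuPyNumeric is not installed.
try:
    import cupynumeric as np
except ImportError:
    import numpy as np

# Cross product of two unit vectors.
z = np.cross(np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]))

# Insert the value 3 before index 2, then delete the element at index 0.
inserted = np.insert(np.array([1, 2, 4]), 2, 3)
trimmed = np.delete(inserted, 0)

# Three points spaced evenly on a log scale between 10**0 and 10**2.
decades = np.logspace(0.0, 2.0, 3)

# Replace NaN with 0.0 (the NumPy default behavior, assumed to carry over).
cleaned = np.nan_to_num(np.array([np.nan, 1.0]))

# Multi-axis reduction: sum over the first and last axes of a 2x3x4 array.
cube = np.arange(24).reshape(2, 3, 4)
row_sums = cube.sum(axis=(0, 2))

print(z, inserted, trimmed, decades, cleaned, row_sums)
```

The multi-axis call reduces two axes in one step, leaving one value per index of the remaining middle axis.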

Performance Improvements

Improve the robustness and speed of cupynumeric.sort by combining allocations where possible and adding synchronization barriers around NCCL collectives.
Remove some extraneous blocking that was only necessary to match the behavior of NumPy 1.x.
Improve the performance of the NumPy fallback, in particular by removing extraneous array copies and adding special cases for quick fallback for functions such as cupynumeric.concatenate.

Miscellaneous

Unify all environment variables that control cuPyNumeric's NumPy fallback heuristics into a single one, CUPYNUMERIC_MAX_EAGER_VOLUME.
Allow any available BLAS implementation to be used in a source build.
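
One way the unified variable might be set from Python is sketched below. Only the variable name comes from these notes. The value 8192 is an arbitrary placeholder, and reading the name as a size threshold below which work falls back to NumPy is an assumption, so check the documentation for the actual semantics and default.

```python
import os

# Hypothetical placeholder value; set it before importing the library so the
# setting is already in the environment whenever it is first read.
os.environ["CUPYNUMERIC_MAX_EAGER_VOLUME"] = "8192"

# Fall back to NumPy so the sketch still runs where cuPyNumeric is not installed.
try:
    import cupynumeric as np
except ImportError:
    import numpy as np

# A small array, the kind of workload a fallback heuristic would plausibly target.
small = np.ones((4, 4))
total = float(small.sum())
print(total)
```

Exporting the variable in the launching shell should be equivalent and avoids any dependence on import order.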

Full Changelog: https://github.com/nv-legate/cupynumeric/compare/v25.07.00...v25.08.00
