OpenBLAS

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.

OpenMathLib/OpenBLAS

January 15, 2026

OpenBLAS 0.3.31 version

general:

reverted a matrix partitioning optimization from 0.3.30 that could lead to race conditions and subsequent invalid results in GEMM
added the bfloat16 extensions BGEMM and BGEMV
added a BLAS interface for the ?GEMM_BATCH extensions
added the BLAS extensions ?GEMM_BATCH_STRIDED and their CBLAS interface
added the basic infrastructure for half-precision float (FP16) format using SH prefix
reimplemented the LAPACK SLAED3/DLAED3 function using multithreading, thereby improving the performance of the SSYEVD/DSYEVD eigensolver for symmetric matrices on all platforms
limited the number of retries for initial memory allocation to avoid infinite hanging on low-memory systems
fixed a thread lockup situation encountered with Python 3.9 or older and NumPy
introduced a problem size threshold for multithreading in STRMV/DTRMV
introduced a problem size threshold for multithreading in CHER/CHER2/CHPR/CHPR2 and ZHER/ZHER2/ZHPR/ZHPR2
improved the problem size thresholds for multithreading in SGER/DGER
improved autodetection of the Fortran compiler
fixed passing of the INTERFACE64=1 option to the flang-new compiler
fixed a potential deadlock in multithreaded code after calling fork()
fixed builds using CMake on FreeBSD
fixed builds using CMake from within Cygwin on Windows
fixed builds using CMake and the NVHPC compiler on ARM64
fixed CMake build error from misdetecting compiler or OpenMP versions
improved contents of the CMake-generated OpenBLASConfig.cmake file
added support for cross-compilation to RISCV targets via CMake
fixed cross-compilation to x86 targets from non-x86 architectures
fixed failure to install cblas.h if NO_CBLAS=0 was specified
fixed missing user-defined pre- and postfixes on functions in lapack.h and lapacke.h
included fixes from the Reference-LAPACK project:
- fix ordering bug in ?LAED/?LASD (Reference-LAPACK PR 1140)
- revert changes in ?GEEV from PR 1129 (Reference-LAPACK PR 1142)
- fix workspace allocation in LAPACKE_?TRSEN (Reference-LAPACK PR 1144)
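
The ?GEMM_BATCH_STRIDED extensions listed above work on a batch of equally sized matrices that sit at a fixed stride inside one buffer. As an illustration of that idea only, the pure-Python sketch below computes C_i = alpha * A_i * B_i + beta * C_i for each batch member over flat row-major lists. The function name, argument order and stride parameters are assumptions made for this example and are not the OpenBLAS signature; the actual prototypes are in cblas.h.

```python
def gemm_batch_strided(m, n, k, alpha, a, stride_a, b, stride_b,
                       beta, c, stride_c, batch):
    """Reference semantics only: C_i = alpha * A_i @ B_i + beta * C_i.

    a, b and c are flat row-major lists. Matrix i of each operand starts
    at offset i * stride_*. All names here are hypothetical.
    """
    for i in range(batch):
        a0, b0, c0 = i * stride_a, i * stride_b, i * stride_c
        for row in range(m):
            for col in range(n):
                acc = 0.0
                for p in range(k):
                    acc += a[a0 + row * k + p] * b[b0 + p * n + col]
                idx = c0 + row * n + col
                c[idx] = alpha * acc + beta * c[idx]
    return c


# Two 2x2 products packed back to back (stride of 4 elements each).
a = [1.0, 2.0, 3.0, 4.0,   1.0, 0.0, 0.0, 1.0]
b = [1.0, 0.0, 0.0, 1.0,   5.0, 6.0, 7.0, 8.0]
c = [0.0] * 8
result = gemm_batch_strided(2, 2, 2, 1.0, a, 4, b, 4, 0.0, c, 4, 2)
```

A single call covers the whole batch, which is the point of the strided form: the caller passes one base pointer and one stride per operand instead of an array of pointers.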

riscv:

added optimized SBGEMM kernels for ZVL128B and ZVL256B targets
added optimized SHGEMM kernels for ZVL128B and ZVL256B targets
added optimized SBGEMV and SHGEMV kernels for ZVL128B/ZVL256B
improved performance of the GEMV kernel for ZVL256B
improved the performance of the CROT and ZROT kernels for ZVL128B and x280
improved the detection of RVV1.0 capability
improved performance of the matrix packing helper functions for ZVL128B and ZVL256B
improved performance of OMATCOPY for ZVL128B and ZVL256B
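
The SBGEMM/SBGEMV and SHGEMM/SHGEMV kernels above take bfloat16 and FP16 inputs respectively. As a rough sketch of how the two 16-bit storage formats differ, the following uses only Python's standard `struct` module: bfloat16 is the upper 16 bits of an IEEE 754 binary32 value (wide range, 7 stored mantissa bits), while FP16 is IEEE 754 binary16 (narrow range, 10 stored mantissa bits). The helper names are hypothetical and are not part of OpenBLAS.

```python
import struct


def to_bfloat16_bits(x):
    """Round a float to bfloat16 (round-to-nearest-even), return its 16 bits.

    bfloat16 keeps the sign, the 8 exponent bits and the top 7 mantissa
    bits of a binary32 value. NaN handling is omitted for brevity.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def from_bfloat16_bits(b):
    """Widen 16 bfloat16 bits back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]


def round_to_fp16(x):
    """Round a float to IEEE 754 binary16 via struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


bf = from_bfloat16_bits(to_bfloat16_bits(1.001))    # 1.0: the 0.001 is lost
hp = round_to_fp16(1.001)                           # 1.0009765625
big = from_bfloat16_bits(to_bfloat16_bits(1.0e30))  # finite in bfloat16
```

The value 1.0e30 is far beyond the FP16 range, so `struct.pack("<e", 1.0e30)` raises `OverflowError`, whereas bfloat16 represents it with reduced precision.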

arm:

fixed spurious executable stack in the getarch utility

arm64:

fixed spurious executable stack in the getarch utility
fixed compiler warnings arising from the timer macro RPCC
fixed cache size detection for Qualcomm Oryon under Windows on Arm
fixed argument handling in the default SVE kernel for SDOT/DDOT
building the BFLOAT16 kernels is now enabled by default
improved the overall performance of GEMM, SYMM and HEMM on A64FX
improved the performance of SDOT/DDOT on A64FX
improved the multithreading performance of SDOT/DDOT on A64FX by introduction of a throttling table matching thread count to problem size
improved the performance of SGER/DGER on A64FX and NEOVERSEV1
improved the multithreading performance of GEMM on A64FX and NEOVERSEV1
improved the performance of the GEMV kernel for SVE-capable targets
improved the multithreading performance of SGEMM on NEOVERSEV1 and NEOVERSEV2
added optimized SAXPY/DAXPY SVE kernels for A64FX and NEOVERSEV1
added optimized BGEMM and BGEMV kernels for NEOVERSEV1
added an optimized BGEMM kernel for NEOVERSEN2
added support for the NEOVERSEV2 CPU
added dedicated support for the Apple M4 CPU as VORTEXM4
added optimized SGEMM/SSYMM/STRMM/SSYRK/SSYR2K for SME-capable targets (ARMV9SME and VORTEXM4)
improved the precision of the SNRM2 kernel
added CPU autodetection and compiler settings for Ampere One processors
fixed CPU autodetection for Apple M systems running Linux
fixed building on macOS with AppleClang, gfortran and Xcode v16 or newer
fixed several errors in the C code replacements for the complex and double precision complex LAPACK functions that get used (only) when compiling with Microsoft C and NOFORTRAN=1 under MS Windows
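
Several entries in this release describe problem size thresholds for multithreading, and the A64FX SDOT/DDOT entry mentions a throttling table that matches thread count to problem size. The idea is that for small inputs the cost of starting and synchronizing threads can outweigh the work saved. A minimal sketch of such a lookup follows. The table values and the function name are invented for illustration; the real thresholds are tuned per kernel and per CPU inside OpenBLAS.

```python
import bisect

# Hypothetical table: (upper bound on vector length n, max threads to use).
THROTTLE_TABLE = [
    (10_000, 1),
    (100_000, 2),
    (1_000_000, 4),
]
_BOUNDS = [bound for bound, _ in THROTTLE_TABLE]


def threads_for(n, available):
    """Pick a thread count for a problem of size n, capped by `available`."""
    i = bisect.bisect_left(_BOUNDS, n)
    if i == len(THROTTLE_TABLE):
        return available  # larger than every bound: use all threads
    return min(THROTTLE_TABLE[i][1], available)
```

With this table, `threads_for(5_000, 8)` stays single-threaded, while `threads_for(5_000_000, 8)` uses all eight threads.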

power:

added initial support for the POWER11 architecture
improved performance of DGEMM and DGEMV on POWER10
fixed the default compiler flags to use "-O3" instead of the possibly unsafe "-Ofast"
fixed building under MacOS (for old G4 Macs) with CMake
fixed potential miscompilation of DGEMV and other assembly kernels by GCC 15.1
fixed compilation with recent versions of flang
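
On the switch from "-Ofast" to "-O3": in GCC and Clang, "-Ofast" also enables "-ffast-math", which permits the compiler to reorder floating-point operations as if they were associative. The release notes do not say which specific problem motivated the change on POWER, so the sketch below only illustrates the general hazard: summing the same three values in a different order gives a different result.

```python
def sum_left_to_right(values):
    """Accumulate in the given order, as strict IEEE 754 evaluation would."""
    total = 0.0
    for v in values:
        total += v
    return total


# (1e17 + 1) - 1e17: the 1.0 is absorbed, because doubles near 1e17
# are spaced 16 apart.
in_order = sum_left_to_right([1.0e17, 1.0, -1.0e17])

# (1e17 - 1e17) + 1: the large terms cancel first, so the 1.0 survives.
reordered = sum_left_to_right([1.0e17, -1.0e17, 1.0])
```

Here `in_order` is 0.0 and `reordered` is 1.0. A numerical library that relies on a particular evaluation order can therefore return different results when the compiler is allowed to reassociate.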

loongarch64:

fixed warnings and potential inaccuracies arising from incorrect saving of registers
fixed enumeration of logical cores on big NUMA servers
fixed building with LLVM and the INTERFACE64=1 option

x86:

fixed building the GEMM3M kernels for the GENERIC target
fixed several errors in the C code replacements for the complex and double precision complex LAPACK functions that get used (only) when compiling with Microsoft C and NOFORTRAN=1 under MS Windows

x86_64:

added CPU autodetection for Intel Lunar Lake (Core Ultra 200V)
changed all ?MIN and ?MAX assembly kernels to use unaligned operations
fixed several errors in the C code replacements for the complex and double precision complex LAPACK functions that get used (only) when compiling with Microsoft C and NOFORTRAN=1 under MS Windows
fixed potential crashes in builds for Cooper Lake, Sapphire Rapids or Zen5 CPUs under MS Windows

zarch:

added support for building with CMake

sparc:

fixed a potential crash in the DNRM2 kernel

md5sums:

05050271d9196f65bc4ac3a89c6a3b05 OpenBLAS-0.3.31.tar.gz
5480a9052e083e7abc9a3298fbf9079b OpenBLAS-0.3.31.zip
e9a72628979846f456ac04c440b0ede5 OpenBLAS-0.3.31-x86.zip
c6d0e83e9a543386291ade73022dc249 OpenBLAS-0.3.31-x64.zip
437f0c0611f7a473d3bd38e1e25ec967 OpenBLAS-0.3.31-x64-64.zip
a0c1f8b37fad9bd866dc924d3bc090a4 OpenBLAS-0.3.31-woa64-static.zip
fb16c99278818db855c26a3c786c470f OpenBLAS-0.3.31-woa64-dll.zip
53d3bb3e234437d6d8e43d76840c0bd6 OpenBLAS-0.3.31-woa64-64-static.zip
27474c9090dca9ca8231d2ee0d966272 OpenBLAS-0.3.31-woa64-64-dll.zip
