ROCm 7.0.2 Release
ROCm 7.0.2 release notes
The release notes provide a summary of notable changes since the previous ROCm release.
If you’re using AMD Radeon GPUs or Ryzen APUs in a workstation setting with a display connected, see the [Use ROCm on Radeon and Ryzen](https://rocm.docs.amd.com/projects/radeon-ryzen/en/latest/index.html)
documentation to verify compatibility and system requirements.
The following are notable new features and improvements in ROCm 7.0.2. For changes to individual components, see Detailed component changes.
ROCm 7.0.2 adds support for the RDNA4 architecture-based AMD Radeon RX 9060. For more information about supported AMD hardware, see Supported GPUs (Linux).
ROCm 7.0.2 adds support for the following operating systems and kernel versions:
For more information about supported operating systems, see Supported operating systems and install instructions.
Virtualization support remains unchanged in this release. For more information, see Virtualization Support.
The software for AMD Datacenter GPU products requires maintaining a hardware and software stack with interdependencies between the GPU and baseboard firmware, AMD GPU drivers, and the ROCm user space software.
| ROCm Version | GPU | PLDM Bundle (Firmware) | AMD GPU Driver (amdgpu) | AMD GPU Virtualization Driver (GIM) |
|---|---|---|---|---|
| ROCm 7.0.2 | MI355X | 01.25.15.02 (or later), 01.25.13.09 | 30.10.2, 30.10.1, 30.10 | 8.4.1.K |
| | MI350X | 01.25.15.02 (or later), 01.25.13.09 | 30.10.2, 30.10.1, 30.10 | |
| | MI325X | 01.25.04.02 (or later), 01.25.03.03 | 30.10.2, 30.10.1, 30.10, 6.4.z where z (0-3), 6.3.y where y (1-3) | |
| | MI300X | 01.25.05.00 (or later)[1], 01.25.03.12 | 30.10.2, 30.10.1, 30.10, 6.4.z where z (0–3), 6.3.y where y (0–3), 6.2.x where x (1–4) | 8.4.1.K |
| | MI300A | BKC 26 (or later), BKC 25 | | Not Applicable |
| | MI250X | IFWI 47 (or later) | | |
| | MI250 | MU5 w/ IFWI 75 (or later) | | |
| | MI210 | MU5 w/ IFWI 75 (or later) | | 8.4.0.K |
| | MI100 | VBIOS D3430401-037 | | Not Applicable |
[1]: PLDM bundle 01.25.05.00 will be available by October 31, 2025.
Multimedia Engine Reset is now supported in AMD GPU Driver (amdgpu) 30.10.2 for AMD Instinct MI300X GPUs. This finer-grain GPU resiliency feature allows recovery from faults related to VCN or JPEG without requiring a full GPU reset, thereby improving system stability and fault tolerance. Note that VCN queue reset functionality requires PLDM bundle 01.25.05.00 (or later) firmware.
ROCm support for RHEL 10.0 and Oracle 10 requires AMD GPU Driver 30.10.2 or later.
In September 2025, Retrieval-Augmented Generation (RAG) was added to the ROCm platform. Use RAG to build and deploy end-to-end AI pipelines on AMD GPUs. It enhances the accuracy and reliability of a large language model (LLM) by exposing it to up-to-date, relevant information. When queried, RAG retrieves relevant data from its knowledge base and uses it in conjunction with the query to generate accurate and informed responses. This approach minimizes hallucinations (the creation of false information) while also enabling the model to access current information not present in its original training data. For more information, see the ROCm-RAG documentation.
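The retrieve-then-generate flow described above can be illustrated with a toy sketch. This is not the ROCm-RAG API: the function names `tokenize`, `retrieve`, and `build_prompt` are hypothetical, and simple keyword overlap stands in for the embedding-based retrieval a real pipeline would typically use.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=1):
    """Return the top_k documents sharing the most tokens with the query.

    Keyword overlap is an illustrative stand-in for a real retriever.
    """
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, documents, top_k=1):
    """Combine the retrieved context with the query into one prompt string."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    knowledge_base = [
        "ROCm 7.0.2 enables support for PyTorch 2.8.",
        "gsplat is a library for Gaussian splatting.",
    ]
    print(build_prompt("Which PyTorch version does ROCm 7.0.2 support?", knowledge_base))
```

The prompt returned by `build_prompt` would then be passed to an LLM, which answers using the retrieved context rather than relying only on its training data.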
Gaussian splatting (gsplat) is an open-source library for GPU-accelerated differentiable rasterization of 3D Gaussians with Python bindings. This ROCm-enabled release of gsplat is built on top of PyTorch for ROCm, enabling innovators in computer graphics, machine learning, and 3D vision to leverage GPU acceleration with AMD Instinct GPUs. With gsplat, you can build, research, and innovate with Gaussian splatting. To install gsplat on ROCm, see installation instructions.
The ROCm Life Science (ROCm-LS) toolkit is an open-source software collection for high-performance life science and healthcare applications built on the core ROCm platform. It helps you accelerate life science processing and analyze workloads on AMD GPUs. ROCm-LS is in an early access state. Running production workloads is not recommended. For more information, see the AMD ROCm-LS documentation.
ROCm-LS provides the following tools to build a complete workflow for life science acceleration on AMD GPUs:
The hipCIM library provides powerful support for GPU-accelerated I/O operations, coupled with an array of computer vision and image processing primitives designed for N-dimensional image data in fields such as biomedical imaging. For more information, see the hipCIM documentation.
MONAI for AMD ROCm, a ROCm-enabled version of MONAI, is built on top of PyTorch for AMD ROCm, helping healthcare and life science innovators to leverage GPU acceleration with AMD Instinct GPUs for high-performance inference and training of medical AI applications. For more information, see the MONAI for AMD ROCm documentation.
ROCm provides a comprehensive ecosystem for deep learning development. For more information, see Deep learning frameworks for ROCm and the Compatibility matrix for the complete list of Deep learning and AI framework versions tested for compatibility with ROCm.
ROCm 7.0.0 introduces several newly supported versions of Deep learning and AI frameworks:
ROCm 7.0.2 enables support for PyTorch 2.8.
AMD ROCm has officially added support for the following Deep learning and AI frameworks:
FlashInfer is a library and kernel generator for Large Language Models (LLMs) that provides high-performance implementations of graphics processing unit (GPU) kernels. FlashInfer focuses on LLM serving and inference, as well as advanced performance across diverse scenarios. It is supported on ROCm 6.4.1. For more information, see FlashInfer compatibility.
llama.cpp is an open-source framework for Large Language Model (LLM) inference that runs on both central processing units (CPUs) and graphics processing units (GPUs). It is written in plain C/C++, providing a simple, dependency-free setup. It is now supported on ROCm 7.0.0 and 6.4.x. For more information, see llama.cpp compatibility.
The ROCm Offline Installer Creator 7.0.2 includes the following features and improvements:
See ROCm Offline Installer Creator for more information.
The ROCm Runfile Installer 7.0.2 adds the following features and improvements:
untar mode.
For more information, see ROCm Runfile Installer.
ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases.
Tutorials for AI developers have been expanded with the following two new inference tutorials:
For more information about the changes, see the Changelog for the AI Developer Hub.
ROCm components support a wide range of environment variables that can be used for testing, logging, debugging, experimental features, and more. The rocBLAS and RCCL components have been updated with new environment variable content.
The following table lists the versions of ROCm components for ROCm 7.0.2, including any version changes from 7.0.1 to 7.0.2. Click the component's updated version to go to a list of its changes.
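| Category | Group | Name | Version |
|---|---|---|---|
| Libraries | Machine learning and computer vision | Composable Kernel | 1.1.0 |
| Libraries | Machine learning and computer vision | MIGraphX | 2.13.0 |
| Libraries | Machine learning and computer vision | MIOpen | 3.5.0 |
| Libraries | Machine learning and computer vision | MIVisionX | 3.3.0 |
| Libraries | Machine learning and computer vision | rocAL | 2.3.0 |
| Libraries | Machine learning and computer vision | rocDecode | 1.0.0 |
| Libraries | Machine learning and computer vision | rocJPEG | 1.1.0 |
| Libraries | Machine learning and computer vision | rocPyDecode | 0.6.0 |
| Libraries | Machine learning and computer vision | RPP | 2.0.0 |
| Libraries | Communication | RCCL | 2.26.6 ⇒ 2.26.6 |
| Libraries | Communication | rocSHMEM | 3.0.0 |
| Libraries | Math | hipBLAS | 3.0.0 ⇒ 3.0.2 |
| Libraries | Math | hipBLASLt | 1.0.0 |
| Libraries | Math | hipFFT | 1.0.20 |
| Libraries | Math | hipfort | 0.7.0 |
| Libraries | Primitives | hipCUB | 4.0.0 |
| Libraries | Primitives | hipTensor | 2.0.0 |
| Libraries | Primitives | rocPRIM | 4.0.0 ⇒ 4.0.1 |
| Libraries | Primitives | rocThrust | 4.0.0 |
| Tools | System management | AMD SMI | 26.0.0 ⇒ 26.0.1 |
| Tools | System management | ROCm Data Center Tool | 1.1.0 |
| Tools | System management | rocminfo | 1.0.0 |
| Tools | System management | ROCm SMI | 7.8.0 |
| Tools | Performance | ROCm Bandwidth Test | 2.6.0 |
| Tools | Performance | ROCm Compute Profiler | 3.2.3 |
| Tools | Performance | ROCm Systems Profiler | 1.1.0 ⇒ 1.1.1 |
| Tools | Performance | ROCProfiler | |
| Tools | Development | HIPIFY | 20.0.0 |
| Tools | Development | ROCdbgapi | 0.77.3 ⇒ 0.77.4 |
| Tools | Development | ROCm CMake | 0.14.0 |
| Tools | Development | ROCm Debugger (ROCgdb) | 16.3 |
| Compilers | | HIPCC | 1.1.1 |
| Compilers | | llvm-project | 20.0.0 |
| Runtimes | | HIP | 7.0.0 ⇒ 7.0.2 |
| Runtimes | | ROCr Runtime | 1.18.0 |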
The following sections describe key changes to ROCm components.
For a historical overview of ROCm component updates, see the ROCm consolidated changelog.
bad_page_threshold_exceeded field to amd-smi static --ras, which compares retired pages count against bad page threshold. This field displays True if retired pages exceed the threshold, False if within threshold, or N/A if threshold data is unavailable. Note that sudo is required to have the bad_page_threshold_exceeded field populated.
AmdSmiTemperatureType had issues with referencing the correct attribute. As such, the following duplicate enums have been removed:
AmdSmiTemperatureType.GPUBOARD_NODE_FIRST
AmdSmiTemperatureType.GPUBOARD_VR_FIRST
AmdSmiTemperatureType.BASEBOARD_FIRST
Fixed attribute error in amd-smi monitor on Linux Guest systems, where the violations argument caused CLI to break.
Fixed certain output in amd-smi monitor when GPUs are partitioned.
amd-smi monitor -Vqt, amd-smi monitor -g 0 -Vqt -w 1, amd-smi monitor -Vqt --file /tmp/test1, etc. These commands will now be able to display as normal in partitioned GPU scenarios.
Fixed an issue where using amd-smi ras --folder <folder_name> was forcing the created folder's name to be lowercase. This fix also allows all string input options to be case insensitive.
Fixed an issue of some processes not being detected by AMD SMI despite making use of KFD resources. This fix, with the addition of KFD Fallback for process detection, ensures that all KFD processes will be detected.
Multiple CPER issues were fixed.
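The three possible values of the bad_page_threshold_exceeded field described in the AMD SMI changes above can be sketched as follows. This is a minimal illustration of the documented behavior, not AMD SMI's implementation; the function name and parameters are hypothetical, and a threshold of `None` is assumed to stand for "threshold data is unavailable".

```python
from typing import Optional


def bad_page_threshold_exceeded(retired_pages: int, threshold: Optional[int]) -> str:
    """Mirror the documented True / False / N/A values of the field.

    Hypothetical helper: None models unavailable threshold data.
    """
    if threshold is None:
        return "N/A"
    # "Exceed" is read as strictly greater than the threshold.
    return "True" if retired_pages > threshold else "False"


if __name__ == "__main__":
    for pages, limit in [(12, 10), (10, 10), (3, None)]:
        print(pages, limit, bad_page_threshold_exceeded(pages, limit))
```

A retired-page count equal to the threshold is treated here as within the threshold, which is one reading of the release note's wording.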
hipMemAllocationTypeUncached flag, enabling developers to allocate uncached memory. This flag is now supported in the following APIs:
hipMemGetAllocationGranularity determines the recommended allocation granularity for uncached memory.
hipMemCreate allocates memory with uncached properties.
hiprtc with the compiler option std=c++11.
hipLaunchHostFunc. This API is now supported and permitted to run during stream capture, aligning its behavior with CUDA.
globalWorkSize, in frameworks like MIOpen and PyTorch, where the grid size is not a multiple of the block size. To ensure correct replay behavior, HIP runtime now stores this remainder in hip::GraphKernelNode during hipExtModuleLaunchKernel capture, enabling accurate execution and preventing corruption.
gpu_metrics, which is used in threshold logic for command submission patches to GPU device(s) during CPU synchronization.
reduceCopyPacks to trigger pipelining, especially to overlap bf16 arithmetic.
--force-reduce-pipeline as an option that can be passed to the install.sh script. Passing this option will enable software-triggered pipelining bfloat16 reductions (that is, all_reduce, reduce_scatter, and reduce).
ROCBLAS_USE_HIPBLASLT_BATCHED variable to independently control the batched hipblaslt backend. Set ROCBLAS_USE_HIPBLASLT_BATCHED=0 to disable batched GEMM use of the hipblaslt backend.
rocprim::texture_cache_iterator.
hipStreamLegacy is supported. This resolves runtime errors that occur when hipStreamLegacy is used in ROCm 7.0.0 and later.
Improved the performance of:
ROCm known issues are noted on GitHub. For known issues related to individual components, review the Detailed component changes.
Red Hat Enterprise Linux (RHEL) and related distributions automatically enable a security feature named Security-Enhanced Linux (SELinux), which may prevent ROCm debugging tools, such as ROCgdb, ROCdbgapi, and ROCR Debug Agent, from working correctly.
The problem occurs when attempting to debug a program that contains code that runs on the GPU. The debugging session might become unresponsive while attempting to reach a breakpoint or executing instruction-stepping in device code. ROCgdb will still be responsive and accept interruptions by pressing Control+C, but the breakpoint in device code won't be hit, and the instruction-stepping operation will not be completed.
The ROCR Debug Agent might also become unresponsive when attempting to capture data from a program that is experiencing queue errors, memory faults, or other triggering events.
For a detailed workaround, see the Installation troubleshooting documentation. This issue will be fixed in a future ROCm release.
Applications using the MIGraphX Python API will fail when running on Python 3.13 and return the error message AttributeError: module 'migraphx' has no attribute 'parse_onnx'. The issue does not occur when you manually build MIGraphX. For detailed instructions, see Building from source. As a workaround, change the Python version to the one found in the installed location:
ls -l /opt/rocm-7.0.0/lib/libmigraphx_py_*.so
The issue will be resolved in a future ROCm release.
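The directory listing in the workaround above can also be scripted. The sketch below assumes that the binding libraries encode the Python version in their file names as libmigraphx_py_<major>.<minor>.so; that naming scheme and the helper name `packaged_python_versions` are assumptions, not something these release notes state.

```python
import re
from pathlib import Path

# Assumed file name scheme: libmigraphx_py_<major>.<minor>.so
_BINDING_NAME = re.compile(r"^libmigraphx_py_(\d+)\.(\d+)\.so$")


def packaged_python_versions(lib_dir):
    """Return sorted (major, minor) tuples parsed from binding library names."""
    versions = set()
    for path in Path(lib_dir).glob("libmigraphx_py_*.so"):
        match = _BINDING_NAME.match(path.name)
        if match:  # skip files that do not follow the assumed scheme
            versions.add((int(match.group(1)), int(match.group(2))))
    return sorted(versions)


if __name__ == "__main__":
    # Same directory as the ls command in the workaround.
    print(packaged_python_versions("/opt/rocm-7.0.0/lib"))
```

If the returned list does not include the interpreter version in use, switching to one of the listed versions follows the workaround described above.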
OpenCV packages built on Ubuntu 24.04 are incompatible with Debian 13 due to a version conflict. As a result, applications, tests, and samples that use OpenCV might fail. To avoid the version conflict, rebuild OpenCV with the version corresponding to Debian 13, then rebuild MIVisionX on top of it. As a workaround, rebuild OpenCV from source, followed by the application that uses OpenCV. This issue will be fixed in a future ROCm release.
The following changes to the ROCm software stack are anticipated for future releases.
ROCm 7.0.2 is the last official AMD-supported distribution of ROCm Execution Provider (ROCm-EP). ROCm EP will be removed from all upcoming ROCm releases. Refer to this Pull Request for more information. Migrate your applications to use the MIGraphX Execution Provider.
ROCm SMI will be phased out in an upcoming ROCm release and will enter maintenance mode. After this transition, only critical bug fixes will be addressed and no further feature development will take place.
It's strongly recommended to transition your projects to AMD SMI, the successor to ROCm SMI. AMD SMI includes all the features of the ROCm SMI and will continue to receive regular updates, new functionality, and ongoing support. For more information on AMD SMI, see the AMD SMI documentation.
Development and support for ROCTracer, ROCProfiler, rocprof, and rocprofv2 are being phased out in favor of ROCprofiler-SDK in upcoming ROCm releases. Starting with ROCm 6.4, only critical defect fixes will be addressed for older versions of the profiling tools and libraries. All users are encouraged to upgrade to the latest version of the ROCprofiler-SDK library and the rocprofv3 tool to ensure continued support and access to new features. ROCprofiler-SDK is still in beta today and will be production-ready in a future ROCm release.
It's anticipated that ROCTracer, ROCProfiler, rocprof, and rocprofv2 will reach end-of-life in a future release, around Q1 of 2026.
Access to the wavefront size as a compile-time constant via the __AMDGCN_WAVEFRONT_SIZE and __AMDGCN_WAVEFRONT_SIZE__ macros is deprecated and will be disabled in a future release. In ROCm 7.0.0, warpSize is only available as a non-constexpr variable. You're encouraged to update your code if needed to ensure future compatibility.
__AMDGCN_WAVEFRONT_SIZE__ macro and __AMDGCN_WAVEFRONT_SIZE alias will be removed in an upcoming release. It is recommended to remove any use of this macro. For more information, see AMDGPU support.
warpSize is only available as a non-constexpr variable. Where required, the wavefront size should be queried via the warpSize variable in device code, or via hipGetDeviceProperties in host code. Neither of these will result in a compile-time constant. For more information, see warpSize.
__AMDGCN_WAVEFRONT_SIZE, __AMDGCN_WAVEFRONT_SIZE__, or warpSize can be replaced with a user-defined macro or constexpr variable with the wavefront size(s) for the target hardware. For example:
#if defined(__GFX9__)
#define MY_MACRO_FOR_WAVEFRONT_SIZE 64
#else
#define MY_MACRO_FOR_WAVEFRONT_SIZE 32
#endif
ROCm Object Tooling tools roc-obj-ls, roc-obj-extract, and roc-obj were
deprecated in ROCm 6.4, and will be removed in a future release. Functionality
has been added to the llvm-objdump --offloading tool option to extract all
clang-offload-bundles into individual code objects found within the objects
or executables passed as input. The llvm-objdump --offloading tool option also
supports the --arch-name option, and only extracts code objects found with
the specified target architecture. See llvm-objdump
for more information.
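As a rough sketch of how the llvm-objdump options mentioned above might be combined from a script, the helper below builds the command line and runs it. The helper names, the file name a.out, and the gfx942 target are illustrative assumptions. The tool is also assumed to be on PATH, which may not hold for a default ROCm installation.

```python
import shutil
import subprocess


def offload_extract_command(binary, arch_name=None, tool="llvm-objdump"):
    """Build an llvm-objdump command that extracts offload bundles from binary."""
    command = [tool, "--offloading"]
    if arch_name is not None:
        # Limit extraction to code objects for one target architecture.
        command.append(f"--arch-name={arch_name}")
    command.append(str(binary))
    return command


def extract_code_objects(binary, arch_name=None):
    """Run the command and return the tool's standard output."""
    command = offload_extract_command(binary, arch_name)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on PATH")
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical inputs: replace with a real executable and target.
    print(" ".join(offload_extract_command("a.out", "gfx942")))
```

Keeping command construction separate from execution makes it possible to inspect or log the exact invocation before running it.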
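
| Category | Group | Name | Version |
|---|---|---|---|
| Libraries | Math | hipRAND | 3.0.0 |
| Libraries | Math | hipSOLVER | 3.0.0 |
| Libraries | Math | hipSPARSE | 4.0.1 |
| Libraries | Math | hipSPARSELt | 0.2.4 |
| Libraries | Math | rocALUTION | 4.0.0 |
| Libraries | Math | rocBLAS | 5.0.0 ⇒ 5.0.2 |
| Libraries | Math | rocFFT | 1.0.34 |
| Libraries | Math | rocRAND | 4.0.0 |
| Libraries | Math | rocSOLVER | 3.30.0 ⇒ 3.30.1 |
| Libraries | Math | rocSPARSE | 4.0.2 ⇒ 4.0.3 |
| Libraries | Math | rocWMMA | 2.0.0 |
| Libraries | Math | Tensile | 4.44.0 |
| Tools | System management | ROCm Validation Suite | 1.2.0 |
| Tools | Performance | ROCProfiler | 2.0.0 |
| Tools | Performance | ROCprofiler-SDK | 1.0.0 |
| Tools | Performance | ROCTracer | 4.1.0 |
| Tools | Development | ROCr Debug Agent | 2.1.0 |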