v0.5.0.0
ONNX-MLIR v0.5.0.0 is now available with exciting new features. We thank everyone who contributed to this release! Please visit onnx-mlir to learn more about ONNX-MLIR.
Key Updates
- ONNX 1.17.0
- PyBind 2.12.0
- Benchmark 1.8.4
- IBM z17 NNPA Telum II Support Enabled
What's Changed
- Add a python script for generating text using huggingface gpt2 by @tungld in https://github.com/onnx/onnx-mlir/pull/2983
- Remove a spike of memory usage in ScrubDisposablePass. by @imaihal in https://github.com/onnx/onnx-mlir/pull/2978
- RunONNXModel.py: Add a `--cache-model=path` option by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/2984
- Enable check-onnx-backend-numerical-nnpa on Jenkins s390x by @tungld in https://github.com/onnx/onnx-mlir/pull/2985
- RunONNXModel.py: save compilation info into a file when using --save-model or --cache-model by @tungld in https://github.com/onnx/onnx-mlir/pull/2994
- Fix wrong total number of phases for EmitObj and EmitJNI by @tungld in https://github.com/onnx/onnx-mlir/pull/2995
- run_gpt2_from_huggingface.py: do not download the onnx data file if it exists by @tungld in https://github.com/onnx/onnx-mlir/pull/2996
- Opening binary constants files fix on zOS by @christopherlmunoz in https://github.com/onnx/onnx-mlir/pull/2991
- [NNPA] Memory reduction of stickified constant by stickifying at file writing by @imaihal in https://github.com/onnx/onnx-mlir/pull/2917
- Option to not emit the full MLIR (only emit .tmp file) by @imaihal in https://github.com/onnx/onnx-mlir/pull/2997
- RunONNXModel.py: allow to change the default model name by @tungld in https://github.com/onnx/onnx-mlir/pull/2999
- upgrade to ONNX 1.17.0 (opset 22) by @gongsu832 in https://github.com/onnx/onnx-mlir/pull/3004
- Add decomposition for `ONNXSoftmaxCrossEntropyLossOp` by @srcarroll in https://github.com/onnx/onnx-mlir/pull/2968
- Delay scrubbing disposable elements attrs as long as possible by @tungld in https://github.com/onnx/onnx-mlir/pull/3006
- Add limitation for BFLOAT supported ops for NNPA by @Sunny-Anand in https://github.com/onnx/onnx-mlir/pull/3008
- Test the return value of omMMapBinaryFile function and terminate the main program elegantly by @tungld in https://github.com/onnx/onnx-mlir/pull/3002
- Fix a wrong function call by @tungld in https://github.com/onnx/onnx-mlir/pull/3012
- Making runtime omUnreachable static to support clang compiler by @christopherlmunoz in https://github.com/onnx/onnx-mlir/pull/3015
- Fix security vulnerabilities by @Sunny-Anand in https://github.com/onnx/onnx-mlir/pull/3019
- Do not fuse locations when normalizing constants for Add and Mul by @jorickert in https://github.com/onnx/onnx-mlir/pull/3016
- Handle full reduction over all dimensions by @tungld in https://github.com/onnx/onnx-mlir/pull/3022
- Use DisposableElementsAttr for ZHigh constant propagation by @tungld in https://github.com/onnx/onnx-mlir/pull/3013
- Re-enable diagnostic error/warning printing by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3020
- Transform SequenceAt to split for special cases by @chentong319 in https://github.com/onnx/onnx-mlir/pull/3018
- Add tolerance args to CheckONNXModel.py by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3024
- Return a failure instead of crashing if shape inference cannot be run because of unranked operand types by @jorickert in https://github.com/onnx/onnx-mlir/pull/3023
- Upgrade benchmark by @Sunny-Anand in https://github.com/onnx/onnx-mlir/pull/3027
- Update llvm-project to llvm/llvm-project@01d233ff403823389f848 by @hamptonm1 in https://github.com/onnx/onnx-mlir/pull/3011
- Update llvm-project to llvm/llvm-project@af20aff35ec3 by @hamptonm1 in https://github.com/onnx/onnx-mlir/pull/3032
- Fix biasScaleShape of GroupNormalizationV21 to support ranks > 4 by @jorickert in https://github.com/onnx/onnx-mlir/pull/3030
- Merge from repo by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3033
- Update llvm-project to llvm/llvm-project@e86910337f98 by @hamptonm1 in https://github.com/onnx/onnx-mlir/pull/3037
- Best practice by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3039
- [NNPA] Fix some bugs for ReduceMin/Max by @tungld in https://github.com/onnx/onnx-mlir/pull/3038
- Skip over uninitialized DenseResourceAttrs in verifiers by @jorickert in https://github.com/onnx/onnx-mlir/pull/3041
- [NNPA] Revise compiler options for quantization by @tungld in https://github.com/onnx/onnx-mlir/pull/3043
- Update the instruction for building multiple accelerators by @tungld in https://github.com/onnx/onnx-mlir/pull/3046
- Add a document for quantization on NNPA by @tungld in https://github.com/onnx/onnx-mlir/pull/3045
- update onnx opset by @Sunny-Anand in https://github.com/onnx/onnx-mlir/pull/3050
- Remove element type restriction in softmax lowering by @srcarroll in https://github.com/onnx/onnx-mlir/pull/3051
- Fix ASAN/UBSAN issues in DimAnalysis by @jorickert in https://github.com/onnx/onnx-mlir/pull/3052
- Build light weight PyRuntime without llvm or onnx-mlir by @chentong319 in https://github.com/onnx/onnx-mlir/pull/3044
- Option to set the number of threads for parallel compilation by @imaihal in https://github.com/onnx/onnx-mlir/pull/3048
- Update onnx requirement to 1.17.0 by @jorickert in https://github.com/onnx/onnx-mlir/pull/3054
- Optimization for Roberta unstick->reshape->transpose->reshape->stick by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3056
- Extend GridSample support by @jorickert in https://github.com/onnx/onnx-mlir/pull/3060
- Remove the pattern unstick_4ds_squeeze_stick_3ds by @tungld in https://github.com/onnx/onnx-mlir/pull/3062
- Instrumentation cleanup when operation was removed by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3061
- Add support for ONNX.shape with permutation pattern by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3066
- Update docker image to point to github registry in devcontainer-example by @jorickert in https://github.com/onnx/onnx-mlir/pull/3055
- Parallelization of ConstProp compilation by @imaihal in https://github.com/onnx/onnx-mlir/pull/3042
- Bump various ops to opset 22, adding bf16 support by @jorickert in https://github.com/onnx/onnx-mlir/pull/3059
- Bump onnx.Cast to opset 21 , adding int/uint4 support by @jorickert in https://github.com/onnx/onnx-mlir/pull/3057
- Add runtime check for Gather Op by @chentong319 in https://github.com/onnx/onnx-mlir/pull/3069
- fix weak hash by @Sunny-Anand in https://github.com/onnx/onnx-mlir/pull/3070
- Remove the compile option -nnpa-clip-to-dlfloat-range by @tungld in https://github.com/onnx/onnx-mlir/pull/3075
- Matmul CPU performance regression by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3072
- ZHigh to ONNX optimization is default on. Switch flag from enable to disable by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3074
- Since compiler generated stick/unstick is default on, change new option to disable it by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3073
- Add lit tests for KrnlMatmulOp lowering (Krnl to affine) by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3076
- Upgrading llvm and stablehlo hash by @christopherlmunoz in https://github.com/onnx/onnx-mlir/pull/3053
- Don't try to free static array in mnist example by @Zentrik in https://github.com/onnx/onnx-mlir/pull/3049
- Handle out-of-bound value for Gather alike operation by @chentong319 in https://github.com/onnx/onnx-mlir/pull/3077
- Extend instrumentSignature to print data by @chentong319 in https://github.com/onnx/onnx-mlir/pull/3078
- Modifying RunONNXModel.py to better support external performance profiling tools by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3082
- Add support to use either docker or local compiler to compile a model by @chentong319 in https://github.com/onnx/onnx-mlir/pull/3081
- Use docker and podman package in python driver by @chentong319 in https://github.com/onnx/onnx-mlir/pull/3087
- update pybind11 to version 2.12.0 by @chentong319 in https://github.com/onnx/onnx-mlir/pull/3088
- Bump Upsample to Opset 10 and change the opset versioning to allow to skip over opset versions if a newer, backwards compatible one exists. by @jorickert in https://github.com/onnx/onnx-mlir/pull/3065
- Improve scripts by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3089
- Add result type inference to RandomNormalLike and fix wrong hardcodings for dtypes by @jorickert in https://github.com/onnx/onnx-mlir/pull/3091
- Bump various ops to opset 21, adding int4/uint4 and 8 bit float support. by @jorickert in https://github.com/onnx/onnx-mlir/pull/3064
- Added minimal support to do some timing of OM Runtime functionality by @AlexandreEichenberger in https://github.com/onnx/onnx-mlir/pull/3095
- Including __errno_location call for MVS by @christopherlmunoz in https://github.com/onnx/onnx-mlir/pull/3099
- Rewriting pattern to remove WhereOp and EqualOp. by @imaihal in https://github.com/onnx/onnx-mlir/pull/3094
- Enable NNPA saturation by default and change the option to --nnpa-disable-saturation by @tungld in https://github.com/onnx/onnx-mlir/pull/3101
- Removing weak attribute of errno by @christopherlmunoz in https://github.com/onnx/onnx-mlir/pull/3103
- Fix the custom build link for docs/Docker.md by @qjivy in https://github.com/onnx/onnx-mlir/pull/3104
- Python driver for torch model by @chentong319 in https://github.com/onnx/onnx-mlir/pull/3093
- Cherry pick updates from main for z17 and fix for ZHighConstantPropagation in QuantizedStick by @Sunny-Anand in https://github.com/onnx/onnx-mlir/pull/3133
- [cherry-pick] Fix CVE-2025-32434 (#3135) by @Sunny-Anand in https://github.com/onnx/onnx-mlir/pull/3138
- Removing duplicate file (#3146) by @Sunny-Anand in https://github.com/onnx/onnx-mlir/pull/3158
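Several entries above extend the RunONNXModel.py driver with model caching (`--cache-model=path`, PR 2984) and saving of compilation info alongside `--save-model` and `--cache-model` (PR 2994). As a minimal usage sketch, assuming an onnx-mlir build is on the path and that the script accepts the model file as shown (flag names other than `--cache-model` and `--save-model` are assumptions; consult `RunONNXModel.py --help` for the exact interface):

```shell
# First run: compile model.onnx and cache the compiled artifact
# (compilation info is saved next to it).
python RunONNXModel.py --cache-model=./cache/model.so model.onnx

# Later runs reuse the cached compiled model instead of recompiling,
# which helps when iterating with external profiling tools (see PR 3082).
python RunONNXModel.py --cache-model=./cache/model.so model.onnx
```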
New Contributors
- @Zentrik made their first contribution in https://github.com/onnx/onnx-mlir/pull/3049
- @qjivy made their first contribution in https://github.com/onnx/onnx-mlir/pull/3104
NNPA Contributors for z17
- @AlexandreEichenberger
- @gongsu832
- @imaihal
- Megan Hampton hamptonm@us.ibm.com
- Mike Essenmacher essen@us.ibm.com
- @Sunny-Anand
- @chentong319
- @tungld
- @negiyas
- @christopherlmunoz
Full Changelog: https://github.com/onnx/onnx-mlir/compare/v0.4.3.0...v0.5.0.0