New
v0.1.6
MegCC
Highlight
- Support Int8 quantized convolution implemented with I8MM, a new instruction-set feature of the ARMv8.6 platform, at roughly 1.7x the performance of the DOT version.
- Support inference for the new clip model, with performance on par with megdnn.
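The I8MM speedup comes from the SMMLA matrix multiply-accumulate instruction, which multiplies two 2x8 blocks of int8 values and accumulates a 2x2 int32 result in a single instruction. The sketch below simulates that arithmetic in NumPy to show the semantics only; it is not MegCC's kernel code, and `smmla_block` is a hypothetical name.

```python
import numpy as np

def smmla_block(acc, a, b):
    """Simulate one SMMLA step: acc (2x2 int32) += a @ b.T, where a and b
    are 2x8 int8 register views and the products widen to int32.
    (Illustrative sketch of the I8MM building block, not MegCC's API.)
    """
    return acc + a.astype(np.int32) @ b.astype(np.int32).T

rng = np.random.default_rng(0)
a = rng.integers(-128, 128, size=(2, 8), dtype=np.int8)
b = rng.integers(-128, 128, size=(2, 8), dtype=np.int8)
acc = smmla_block(np.zeros((2, 2), dtype=np.int32), a, b)
```

Compared with the DOT path, which produces one int32 lane per 4-element dot product, each SMMLA retires a full 2x8 by 8x2 block, which is where the roughly 1.7x gain comes from.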
Bug Fixes
compiler-kernel
- Add NEAREST mode to the resize operator.
- Correct the applicability conditions of int8 Winograd F23 so that hybrid conv no longer incorrectly selects this kernel.
- Fix the applicability conditions of all ConvBias kernels, restricting them to channel-broadcast bias only.
- Add support for the sqrt, sin, and cos elemwise operators.
- Add a float32 computation path for int8 resize, fixing the result discrepancy between the Arm and naive implementations.
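For an NCHW tensor, the channel-broadcast restriction on ConvBias means the bias must have shape (1, C, 1, 1): one value per output channel, broadcast over batch and spatial dimensions. A minimal NumPy sketch of the accepted shape; `is_channel_broadcast_bias` is a hypothetical helper, not MegCC's API.

```python
import numpy as np

def is_channel_broadcast_bias(bias_shape, out_channels):
    """True if bias_shape is a channel-broadcast bias for NCHW output,
    i.e. (1, C, 1, 1). (Hypothetical helper illustrating the restriction.)
    """
    return tuple(bias_shape) == (1, out_channels, 1, 1)

# Conv output in NCHW layout: batch 2, 16 channels, 8x8 spatial.
out = np.zeros((2, 16, 8, 8), dtype=np.float32)
bias = np.arange(16, dtype=np.float32).reshape(1, 16, 1, 1)

assert is_channel_broadcast_bias(bias.shape, 16)
result = out + bias  # broadcasts the per-channel bias over N, H, W
```

A full-shape bias such as (2, 16, 8, 8), or a flat (16,) vector, would be rejected under this condition.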
compiler-common
- Fix the bug where MegCC could not handle a ConvBias operator without bias.
runtime
- Fix a crash caused by the runtime attempting to free a dynamic tensor whose pointer is NULL.
New Features
basic components
- Upgrade megbrain to the version containing the padding channel pass bug fix, resolving compilation failures for some MegCC models.
compiler-kernel
- Add the BatchedMatmul operator for the Arm64 platform.
- Add multi-dimensional indexing support to the IndexingMultiAxisVec operator.
- Add support for the nchw int8 conv1x1 kernel.
- Add an aarch32 int8 dot nchw conv5x5 kernel.
- Add a batched matmul operator for the Float16 data type.
- Add a naive float32 mod elemwise op.
- Support Int8 quantized convolution implemented with I8MM, a new instruction-set feature of the ARMv8.6 platform, at roughly 1.7x the performance of the DOT version.
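The multi-dimensional indexing mode of IndexingMultiAxisVec behaves like NumPy-style advanced indexing with one index vector per axis: the vectors are zipped together and each position gathers a single element. The sketch below shows the semantics with NumPy, as a reference for the expected output, not MegCC's kernel implementation.

```python
import numpy as np

# A (3, 4, 5) tensor; supply an index vector per axis to gather elements.
x = np.arange(60).reshape(3, 4, 5)
rows = np.array([0, 1, 2])
cols = np.array([1, 2, 3])
depth = np.array([4, 0, 2])

# Multi-axis vector indexing: gathers x[rows[i], cols[i], depth[i]] for each i.
gathered = x[rows, cols, depth]  # -> [9, 30, 57]
```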