MSCCL++ v0.5.2
What's Changed
- Add C++ executor test by @chhwang in https://github.com/microsoft/mscclpp/pull/304
- Cumulative Updates by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/309
- Add NPKit GPU event support by @yzygitzh in https://github.com/microsoft/mscclpp/pull/310
- Fix NPKit support for AMD by @yzygitzh in https://github.com/microsoft/mscclpp/pull/312
- Add "packet type" option for executor test by @Binyang2014 in https://github.com/microsoft/mscclpp/pull/313
- Add support for multicast reduce instruction by @roshandathathri in https://github.com/microsoft/mscclpp/pull/316
- Update quickstart.md by @angelica-moreira in https://github.com/microsoft/mscclpp/pull/314
- Simplify/improve barrier in AllReduce6 by @roshandathathri in https://github.com/microsoft/mscclpp/pull/317
- Support NCCL APIs by @caiomcbr in https://github.com/microsoft/mscclpp/pull/319
- Update allreduce_bench.py by @angelica-moreira in https://github.com/microsoft/mscclpp/pull/318
- Separate NPKit CPU timestamp access from different blocks for AMD platform by @yzygitzh in https://github.com/microsoft/mscclpp/pull/321
- AllReduce Kernel for Small Messages by @caiomcbr in https://github.com/microsoft/mscclpp/pull/322