ara
The PULP Ara is a 64-bit Vector Unit, compatible with the RISC-V Vector Extension Version 1.0, working as a coprocessor to CORE-V's CVA6 core.
November 2, 2021
v2.2.0
Fixed
Fix typo in the build instructions of the README
Fix Gnuplot installation on GitHub's CI
The number of elements requested by the Store Unit and the Element Requester now depends on both the requested eew and the previous eew of the vector register in use
When the VRF is written and EMUL > 1, the eew of all the affected registers is updated
Memory operations can change EMUL when EEW != VSEW
The LSU now correctly handles bursts with a saturated length of 256 beats
AXI transactions on the channel opposite to the one currently in use are started only after the previous transactions have completed
Fix the number of elements to be requested for a vslidedown instruction
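Two of the fixes above rest on simple arithmetic that can be sketched in Python. This is a minimal illustration and not Ara's RTL: the relation EMUL = (EEW / SEW) * LMUL comes from the RVV 1.0 specification, and the 256-beat cap is the AXI4 limit for INCR bursts. The function names are hypothetical, and the burst splitter deliberately ignores 4 KiB address-boundary rules.

```python
from fractions import Fraction
from typing import List

# AXI4 INCR bursts carry at most 256 beats (AxLEN is 8 bits and encodes beats - 1).
AXI4_MAX_BURST_BEATS = 256


def effective_lmul(eew: int, sew: int, lmul: Fraction) -> Fraction:
    """Return EMUL = (EEW / SEW) * LMUL, the RVV 1.0 relation.

    When EEW != SEW, a memory operation runs with an EMUL different from LMUL.
    """
    return Fraction(eew, sew) * lmul


def split_into_bursts(total_beats: int) -> List[int]:
    """Split a transfer into bursts of at most 256 beats each.

    Hypothetical helper for illustration only; it does not model
    4 KiB boundary crossings or Ara's address generator.
    """
    bursts = []
    remaining = total_beats
    while remaining > 0:
        beats = min(remaining, AXI4_MAX_BURST_BEATS)  # saturate at 256 beats
        bursts.append(beats)
        remaining -= beats
    return bursts
```

For instance, a load with EEW = 64 while SEW = 32 and LMUL = 2 operates on EMUL = 4 registers, which is why the eew of every register in the group has to be tracked.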
Added
benchmarks app to benchmark Ara
CI task to create roofline plots of imatmul and fmatmul, available as artifacts
Vector floating-point compare instructions (vmfeq, vmfne, vmflt, vmfle, vmfgt, vmfge)
Vector single-width floating-point/integer type-convert instructions (vfcvt.xu.f, vfcvt.x.f, vfcvt.rtz.xu.f, vfcvt.rtz.x.f, vfcvt.f.xu, vfcvt.f.x)
Vector widening floating-point/integer type-convert instructions (vfwcvt.xu.f, vfwcvt.x.f, vfwcvt.rtz.xu.f, vfwcvt.rtz.x.f, vfwcvt.f.xu, vfwcvt.f.x, vfwcvt.f.f)
Vector narrowing floating-point/integer type-convert instructions (vfncvt.xu.f, vfncvt.x.f, vfncvt.rtz.xu.f, vfncvt.rtz.x.f, vfncvt.f.xu, vfncvt.f.x, vfncvt.f.f)
Vector whole-register move instruction vmv<nr>
Vector whole-register load/store vl1r, vs1r
Vector load/store mask vle1, vse1
Whole-register instructions are also executed when vtype.vl == 0
Makefile option (trace=1) to generate waveform traces when running simulations with Verilator
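The whole-register behaviour listed above can be illustrated with a small behavioural model. This is a sketch under stated assumptions and not Ara's implementation: the register file is modelled as a list of 32 bytearrays, the function name is hypothetical, and the constraints (a register count of 1, 2, 4, or 8, with source and destination aligned to that count) follow the RVV 1.0 definition of the whole-register move. Note that vl never appears in the function, which is the point of the entry about vl == 0.

```python
VALID_NR = (1, 2, 4, 8)  # register counts allowed for a whole-register move


def whole_register_move(vrf, vd, vs, nr):
    """Copy nr whole vector registers from vs.. to vd.., independent of vl.

    vrf is a hypothetical model of the register file: a list of 32
    bytearrays, each VLEN/8 bytes long.
    """
    if nr not in VALID_NR:
        raise ValueError("nr must be 1, 2, 4, or 8")
    if vd % nr != 0 or vs % nr != 0:
        raise ValueError("vd and vs must be aligned to nr")
    # Snapshot the sources before writing; with aligned groups this is
    # not strictly required, but it keeps the model obviously correct.
    sources = [bytes(vrf[vs + i]) for i in range(nr)]
    for i, data in enumerate(sources):
        vrf[vd + i][:] = data
```

A quick usage example: with an 8-byte register model, `whole_register_move(vrf, 4, 2, 2)` copies registers 2 and 3 into registers 4 and 5 in full, whatever the current vector length is.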
Changed
Add spill register at the lane edge, to cut the timing-critical interface between the Mask unit and the VFUs
Increase the latency of the 16-bit multiplier from 0 to 1 cycle to cut an in-lane timing-critical path
Widen CVA6's cache lines
Implement back-to-back accelerator instruction issue mechanism on CVA6
Use https protocol when cloning DTC from main Makefile
Use https protocol for newlib-cygwin in .gitmodules
Cut a timing-critical path from Addrgen to Sequencer (starting an AXI transaction now takes one more cycle)
Cut a timing-critical path in the VSTU, related to the calculation of the pointer to the VRF word received from the lanes
Create ara_system wrapper containing Ara, Ariane, and an AXI mux, instantiated from within Ara's SoC
Retime address calculation of the addrgen
Push MASKU operand muxing from the lanes to the Mask Unit
Reduce CVA6's default cache size
Update Verilator to v4.214
Update bender to v0.23.1