Release v1.0.16
June 26, 2025
- MobileNetV5 backbone (w/ encoder only variant) for Gemma 3n image encoder
- Version 1.0.16 released
June 23, 2025
- Add `F.grid_sample` based 2D and factorized pos embed resize to NaFlexViT. Faster when there are lots of different sizes (based on example by https://github.com/stas-sl).
- Further speed up patch embed resample by replacing vmap with matmul (based on snippet by https://github.com/stas-sl).
- Add 3 initial native aspect NaFlexViT checkpoints created while testing, trained on ImageNet-1k with 3 different pos embed configs w/ same hparams.

| Model | Top-1 Acc | Top-5 Acc | Params (M) | Eval Seq Len |
|:---|:---:|:---:|:---:|:---:|
| naflexvit_base_patch16_par_gap.e300_s576_in1k | 83.67 | 96.45 | 86.63 | 576 |
| | 83.63 | 96.41 | 86.46 | 576 |
| | 83.50 | 96.46 | 86.63 | 576 |
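
To illustrate why a `F.grid_sample` based resize can help when a batch mixes many grid sizes, here is a minimal sketch, not the NaFlexViT implementation. The function name `resize_pos_embed_batched`, the `(1, H, W, C)` layout, and the zero-filled padding region are all assumptions made for this example. It builds one sampling grid per target size, pads the grids to a common shape, and resamples everything in a single call instead of one interpolate call per size.

```python
import torch
import torch.nn.functional as F


def resize_pos_embed_batched(pos_embed, sizes):
    """Hypothetical helper: resample one 2D pos embed grid to several sizes at once.

    pos_embed: tensor of shape (1, H, W, C).
    sizes: list of (h, w) target grid sizes.
    Returns (B, max_h, max_w, C); entries beyond each (h, w) are padding.
    """
    batch = len(sizes)
    max_h = max(h for h, _ in sizes)
    max_w = max(w for _, w in sizes)

    # Sampling grid in normalized [-1, 1] coords; last dim is (x, y).
    grid = torch.zeros(batch, max_h, max_w, 2, dtype=pos_embed.dtype)
    for b, (h, w) in enumerate(sizes):
        # Output pixel centers, matching align_corners=False conventions.
        ys = (torch.arange(h, dtype=pos_embed.dtype) + 0.5) / h * 2 - 1
        xs = (torch.arange(w, dtype=pos_embed.dtype) + 0.5) / w * 2 - 1
        yy, xx = torch.meshgrid(ys, xs, indexing="ij")
        grid[b, :h, :w, 0] = xx
        grid[b, :h, :w, 1] = yy

    # grid_sample expects channels-first input; expand shares memory across the batch.
    src = pos_embed.permute(0, 3, 1, 2).expand(batch, -1, -1, -1)
    out = F.grid_sample(
        src, grid, mode="bilinear", padding_mode="border", align_corners=False
    )
    return out.permute(0, 2, 3, 1)


pos_embed = torch.randn(1, 4, 4, 8)
resized = resize_pos_embed_batched(pos_embed, [(4, 4), (6, 8)])
print(resized.shape)
```

For a single target size, the valid region should agree with bilinear `F.interpolate` using `align_corners=False` up to floating point error. Any speedup depends on how many distinct sizes appear per batch, so treat this as a starting point for measurement.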