FastChat v0.2.36
Breaking Changes
- Gradio server migrated to OpenAI v1 API client
Features
- SGLang worker for vision language models — lower latency, higher throughput
- Vision language WebUI with Gradio support
- OpenAI-compatible API server now accepts image input
- LightLLM worker integration for higher throughput
- Apple MLX worker with async support
- Yuan 2.0 model support
- OpenAI embedding support for topic clustering
- Training with custom templates (fixes tokenization mismatch)
- Google Colab Free Tier REST API enablement
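The image-input feature above follows the OpenAI multimodal message shape, where user content is a list of typed parts. A minimal sketch of building such a message (the helper name is hypothetical; the shape assumes the standard OpenAI v1 chat-completions format):

```python
import base64

def build_image_message(text: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build an OpenAI-style multimodal user message with an inline base64 image."""
    b64 = base64.b64encode(image_bytes).decode()
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }

# Such a message would go into the `messages` list of a chat.completions request.
msg = build_image_message("What is in this image?", b"<raw png bytes>")
```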
Fixes
- Fixed ModelScope local path resolution
- Fixed BGE embedding pooling method
- Fixed vllm worker tokenizer configuration
- Fixed tokenization inconsistencies with Llama tokenizer
- Fixed handling of request content passed as a plain string
- Removed duplicate API endpoints
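The string-content fix concerns OpenAI-style messages, where `content` may be either a plain string or a list of typed parts; code that assumes the list form breaks on the string form. A hedged sketch of the normalization (function name hypothetical):

```python
def normalize_content(content):
    """Coerce message content to the list-of-parts form.

    The OpenAI chat API accepts content as either a plain string or a
    list of typed parts; downstream code that indexes into a list would
    fail on the string form, which is the class of bug fixed here.
    """
    if isinstance(content, str):
        return [{"type": "text", "text": content}]
    return content

print(normalize_content("hello"))  # -> [{'type': 'text', 'text': 'hello'}]
```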
Improvements
- Upgraded Gradio to 4.17
- Changed default tiktoken encoding to `cl100k_base`
- MLX worker updated to the new `generate_step` function signature
- WebUI copy button added
- Corrected type hints for `play_a_match_single`
Highlights
- Added SGLang worker for vision language models, with lower latency and higher throughput https://github.com/lm-sys/FastChat/pull/2928
- Vision language WebUI https://github.com/lm-sys/FastChat/pull/2960
- OpenAI-compatible API server now supports image input https://github.com/lm-sys/FastChat/pull/2928
- Added LightLLM worker for higher throughput https://github.com/lm-sys/FastChat/blob/main/docs/lightllm_integration.md
- Added Apple MLX worker https://github.com/lm-sys/FastChat/pull/2940
What's Changed
- fix specify local path issue use model from www.modelscope.cn by @liuyhwangyh in https://github.com/lm-sys/FastChat/pull/2934
- support openai embedding for topic clustering by @CodingWithTim in https://github.com/lm-sys/FastChat/pull/2729
- Remove duplicate API endpoint by @surak in https://github.com/lm-sys/FastChat/pull/2949
- Update Hermes Mixtral by @teknium1 in https://github.com/lm-sys/FastChat/pull/2938
- Enablement of REST API Usage within Google Colab Free Tier by @ggcr in https://github.com/lm-sys/FastChat/pull/2940
- Create a new worker implementation for Apple MLX by @aliasaria in https://github.com/lm-sys/FastChat/pull/2937
- feat: support Model Yuan2.0, a new generation Fundamental Large Language Model developed by IEIT System by @cauwulixuan in https://github.com/lm-sys/FastChat/pull/2936
- Fix the pooling method of BGE embedding model by @staoxiao in https://github.com/lm-sys/FastChat/pull/2926
- SGLang Worker by @BabyChouSr in https://github.com/lm-sys/FastChat/pull/2928
- Update mlx_worker to be async by @aliasaria in https://github.com/lm-sys/FastChat/pull/2958
- Integrate LightLLM into serve worker by @zeyugao in https://github.com/lm-sys/FastChat/pull/2888
- Copy button by @surak in https://github.com/lm-sys/FastChat/pull/2963
- feat: train with template by @congchan in https://github.com/lm-sys/FastChat/pull/2951
- fix content maybe a str by @zhouzaida in https://github.com/lm-sys/FastChat/pull/2968
- Adding download folder information in README by @dheeraj-326 in https://github.com/lm-sys/FastChat/pull/2972
- use cl100k_base as the default tiktoken encoding by @bjwswang in https://github.com/lm-sys/FastChat/pull/2974
- Update README.md by @merrymercy in https://github.com/lm-sys/FastChat/pull/2975
- Fix tokenizer for vllm worker by @Michaelvll in https://github.com/lm-sys/FastChat/pull/2984
- update yuan2.0 generation by @wangpengfei1013 in https://github.com/lm-sys/FastChat/pull/2989
- fix: tokenization mismatch when training with different templates by @congchan in https://github.com/lm-sys/FastChat/pull/2996
- fix: inconsistent tokenization by llama tokenizer by @congchan in https://github.com/lm-sys/FastChat/pull/3006
- Fix type hint for play_a_match_single by @MonkeyLeeT in https://github.com/lm-sys/FastChat/pull/3008
- code update by @infwinston in https://github.com/lm-sys/FastChat/pull/2997
- Update model_support.md by @infwinston in https://github.com/lm-sys/FastChat/pull/3016
- Update lightllm_integration.md by @eltociear in https://github.com/lm-sys/FastChat/pull/3014
- Upgrade gradio to 4.17 by @infwinston in https://github.com/lm-sys/FastChat/pull/3027
- Update MLX integration to use new generate_step function signature by @aliasaria in https://github.com/lm-sys/FastChat/pull/3021
- Update readme by @merrymercy in https://github.com/lm-sys/FastChat/pull/3028
- Update gradio version in `pyproject.toml` and fix a bug by @merrymercy in https://github.com/lm-sys/FastChat/pull/3029
- Update gradio demo and API model providers by @merrymercy in https://github.com/lm-sys/FastChat/pull/3030
- Gradio Web Server for Multimodal Models by @BabyChouSr in https://github.com/lm-sys/FastChat/pull/2960
- Migrate the gradio server to openai v1 by @merrymercy in https://github.com/lm-sys/FastChat/pull/3032
- Update version to 0.2.36 by @merrymercy in https://github.com/lm-sys/FastChat/pull/3033
New Contributors
- @teknium1 made their first contribution in https://github.com/lm-sys/FastChat/pull/2938
- @ggcr made their first contribution in https://github.com/lm-sys/FastChat/pull/2940
- @aliasaria made their first contribution in https://github.com/lm-sys/FastChat/pull/2937
- @cauwulixuan made their first contribution in https://github.com/lm-sys/FastChat/pull/2936
- @staoxiao made their first contribution in https://github.com/lm-sys/FastChat/pull/2926
- @zhouzaida made their first contribution in https://github.com/lm-sys/FastChat/pull/2968
- @dheeraj-326 made their first contribution in https://github.com/lm-sys/FastChat/pull/2972
- @bjwswang made their first contribution in https://github.com/lm-sys/FastChat/pull/2974
- @MonkeyLeeT made their first contribution in https://github.com/lm-sys/FastChat/pull/3008
Full Changelog: https://github.com/lm-sys/FastChat/compare/v0.2.35...v0.2.36