v0.3.03
🎉 What's Changed
The key highlights of this update include an upgrade to Python 3.12 and optimization of the dataset pipeline.
Dependency and Environment Updates:
- Upgraded the required Python version to 3.12 in `pyproject.toml` and development settings, and updated the target version for linting and type checking to Python 3.12. [1] [2] [3]
- Updated dependencies: switched from a git-based install of `llamafactory` to a fixed version, added `torchdata` and `torchaudio` with CUDA 12.6 support, and refined platform-specific dependency markers for PyTorch packages. [1] [2]
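Platform-specific dependency markers let a single `pyproject.toml` resolve different PyTorch builds per OS. The fragment below is only an illustrative sketch of that pattern; the package versions and marker choices are assumptions, not WeClone's actual dependency list.

```toml
[project]
requires-python = ">=3.12"
dependencies = [
    "torchdata",
    # CUDA 12.6 wheels are only published for Linux/Windows,
    # so gate them behind an environment marker; macOS falls
    # back to the default (CPU/MPS) wheels instead.
    # (+cu126 builds come from the PyTorch package index,
    # which is configured separately from PyPI.)
    "torchaudio; sys_platform == 'darwin'",
    "torchaudio==2.6.0+cu126; sys_platform != 'darwin'",
]
```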
Data
- Added the `<begin_chat>` marker in user messages, allowing for improved context in conversation flows.
- Updated `qa_generator.py` to include a new mechanism for managing chat member relationships, allowing the addition of contextual information about the relationship between users in conversations.
- Refactored the CSV loading function to support loading user relationship data from a `users.json` file, improving the context provided during QA generation.
- Added a new configuration option `add_relation` to the dataset settings, enabling users to toggle this feature.
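A minimal sketch of how the relationship features above might fit together; the `users.json` layout, function names, and prompt format here are assumptions for illustration, not the actual `qa_generator.py` implementation.

```python
import json
from pathlib import Path


def load_relations(path: str) -> dict[str, str]:
    """Load a user -> relationship mapping from a users.json-style file.

    Assumed layout (illustrative only): {"alice": "colleague", "bob": "family"}.
    """
    return json.loads(Path(path).read_text(encoding="utf-8"))


def build_prompt(user: str, message: str,
                 relations: dict[str, str], add_relation: bool) -> str:
    """Prefix a user message with the <begin_chat> marker and, when the
    add_relation toggle is on, a note about how this user relates to the
    dataset owner."""
    prefix = "<begin_chat>"
    if add_relation and user in relations:
        prefix += f" [{user} is a {relations[user]}]"
    return f"{prefix} {message}"
```

With `add_relation` disabled (or the user absent from `users.json`), the message is emitted with the bare marker, so the feature stays opt-in.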
Others
- Introduced `OnlineLLM` with thread-pooled batch chat and optional JSON-guided decoding; unified JSON parsing across vLLM and OpenAI results.
- fix: fix triton source from default cuda129 to 126 by @MapleWithered in https://github.com/xming521/WeClone/pull/198
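The thread-pooled batch chat and unified JSON parsing could look roughly like the sketch below. The class name matches the release note, but the method names, worker count, and fallback parsing are assumptions, not the library's actual API.

```python
import json
import re
from concurrent.futures import ThreadPoolExecutor


def parse_json_result(text: str) -> dict:
    """Unified JSON parsing: try a strict parse first, then fall back to the
    first {...} span (handles models that wrap JSON in prose or code fences)."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if match:
            return json.loads(match.group(0))
        raise


class OnlineLLM:
    """Illustrative sketch: fan prompts out to an API client on a thread pool."""

    def __init__(self, client, max_workers: int = 8):
        self.client = client  # any object with a .chat(prompt) -> str method
        self.max_workers = max_workers

    def batch_chat(self, prompts: list[str]) -> list[str]:
        # Threads (not processes) fit here: each call blocks on network I/O,
        # so the GIL is released while requests are in flight.
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            return list(pool.map(self.client.chat, prompts))
```

Routing both vLLM and OpenAI responses through one parser like `parse_json_result` keeps downstream code independent of which backend produced the text.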
New Contributors
- @MapleWithered made their first contribution in https://github.com/xming521/WeClone/pull/198
Full Changelog: https://github.com/xming521/WeClone/compare/v0.3.02...v0.3.03