liucongg/ChatGLM-Finetuning
A comprehensive toolkit for fine-tuning ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models using methods such as Freeze, LoRA, P-Tuning, and full-parameter fine-tuning.
Core Features
- Multiple fine-tuning methods: Freeze, LoRA, P-Tuning, and full-parameter fine-tuning
- Supports ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B
- Single- and multi-GPU training via DeepSpeed
- Designed to mitigate catastrophic forgetting during task adaptation
Quick Start
CUDA_VISIBLE_DEVICES=0 deepspeed --master_port 520 train.py \
    --train_path data/spo_0.json \
    --model_name_or_path ChatGLM-6B/ \
    --per_device_train_batch_size 1 \
    --max_len 1560 \
    --max_src_len 1024 \
    --learning_rate 1e-4 \
    --weight_decay 0.1 \
    --num_train_epochs 2 \
    --gradient_accumulation_steps 4 \
    --warmup_ratio 0.1 \
    --mode glm \
    --train_type freeze \
    --freeze_module_name "layers.27.,layers.26.,layers.25.,layers.24." \
    --seed 1234 \
    --ds_file ds_zero2_no_offload.json \
    --gradient_checkpointing \
    --show_loss_step 10 \
    --output_dir ./output-glm

Detailed Introduction
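The --train_path file holds the task data consumed by train.py. A minimal sketch of preparing such a file follows; the instruction/input/output field names are an assumption based on common instruction-tuning formats, so check the repo's sample data (e.g. data/spo_0.json) for the exact schema before training.

```python
import json

# Hypothetical training sample for an information-extraction task.
# The field names ("instruction", "input", "output") are an assumption;
# verify them against the repo's data/spo_0.json before use.
sample = {
    "instruction": "Extract (subject, predicate, object) triples from the text.",
    "input": "ChatGLM-6B was released by Tsinghua University.",
    "output": "(ChatGLM-6B, released_by, Tsinghua University)",
}

# Write one JSON object per line, as loaders in this style commonly expect.
with open("spo_sample.json", "w", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")

# Read the file back to confirm it round-trips cleanly.
with open("spo_sample.json", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]

print(loaded[0]["output"])
```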
This project provides a robust and flexible framework for fine-tuning large language models from the ChatGLM series (ChatGLM-6B, ChatGLM2-6B, ChatGLM3-6B) on specific downstream tasks. It implements several efficient fine-tuning techniques, including Freeze, LoRA, P-Tuning, and full-parameter fine-tuning, letting users select the approach that best fits their resource constraints and task requirements. The framework supports both single- and multi-GPU training environments and is designed to mitigate catastrophic forgetting, helping preserve the model's general capabilities after adaptation. It is particularly useful for researchers and developers customizing ChatGLM models for tasks such as information extraction, text generation, and classification.
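To illustrate why a method like LoRA is parameter-efficient: rather than updating a full weight matrix W, it learns a low-rank update ΔW = (alpha/r)·B·A with far fewer trainable parameters. The following is a minimal NumPy sketch of that idea under toy dimensions, not the project's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 8, 2    # hidden size and LoRA rank (toy values for illustration)
alpha = 4      # LoRA scaling factor

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-initialized
                                     # so the adapter starts as a no-op

def lora_forward(x):
    # Frozen path plus scaled low-rank update: x (W + (alpha/r) B A)^T
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.normal(size=(1, d))
# With B = 0, the adapted layer matches the frozen layer exactly.
assert np.allclose(lora_forward(x), x @ W.T)

# Only A and B are trained: 2*d*r parameters vs d*d for full fine-tuning.
print(2 * d * r, "trainable vs", d * d, "full")
```

At realistic sizes the gap is dramatic: for a 4096x4096 projection with rank 8, the adapter trains roughly 65K parameters instead of about 16.8M.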