LLaVA training consists of two stages: (1) feature alignment stage: use our 558K subset of the LAION-CC-SBU dataset to connect a frozen pretrained vision encoder to a frozen LLM; (2) visual instruction tuning stage: use 150K GPT-generated multimodal instruction-following data, plus around 515K VQA data from academic-oriented tasks, to teach the model to follow multimodal instructions. For legacy models, please refer to the README of this version for now.

Hyperparameters. We use a similar set of hyperparameters as Vicuna in finetuning. The hyperparameters used in both pretraining and finetuning are provided below.

Download Vicuna checkpoints (automatically). Our base model Vicuna v1.5, which is an instruction-tuned chatbot, will be downloaded automatically when you run our provided training scripts.

LLaVA is trained on 8 A100 GPUs with 80GB memory. To train on fewer GPUs, you can reduce the per_device_train_batch_size and increase the gradient_accumulation_steps accordingly. Always keep the global batch size the same: per_device_train_batch_size x gradient_accumulation_steps x num_gpus.
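To make that rule concrete, here is a minimal sketch in plain Python. The batch sizes shown (16 per device, global 128) are hypothetical illustration values, not LLaVA's actual settings:

```python
# Minimal sketch (not from the LLaVA repo): verify that a rescaled
# configuration preserves the global batch size.
def global_batch_size(per_device_train_batch_size: int,
                      gradient_accumulation_steps: int,
                      num_gpus: int) -> int:
    # global batch size = per-device batch x grad-accum steps x GPU count
    return per_device_train_batch_size * gradient_accumulation_steps * num_gpus

# Reference setup on 8 GPUs (hypothetical values for illustration).
reference = global_batch_size(per_device_train_batch_size=16,
                              gradient_accumulation_steps=1,
                              num_gpus=8)   # 128

# Fewer GPUs: halve the device count, double the accumulation steps.
rescaled = global_batch_size(per_device_train_batch_size=16,
                             gradient_accumulation_steps=2,
                             num_gpus=4)    # still 128

assert reference == rescaled  # training dynamics stay comparable
```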
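For the automatic checkpoint download described above, the provided training scripts handle everything; you should not need to fetch weights by hand. As a rough sketch of the underlying behaviour, assuming the checkpoint is hosted on the Hugging Face Hub under the id lmsys/vicuna-7b-v1.5 (pick the size you train), loading it looks like this:

```python
# A minimal sketch of the automatic download behaviour; the training
# scripts do this for you, so you rarely call it directly.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lmsys/vicuna-7b-v1.5"  # assumed Hub id for the base model

# from_pretrained() downloads and caches the weights on first use,
# so no manual checkpoint download step is needed.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
```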
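Returning to the two-stage recipe at the top: in the feature alignment stage, only the projection layer connecting the frozen vision encoder to the frozen LLM is trained. A conceptual sketch of that freezing pattern, using hypothetical placeholder modules rather than the real LLaVA components:

```python
# Conceptual sketch of stage 1 (feature alignment), with hypothetical
# placeholder modules: the vision encoder and the LLM are frozen, and
# only the projector that connects them receives gradient updates.
import torch.nn as nn

def freeze(module: nn.Module) -> nn.Module:
    # Disable gradients so the optimizer never updates these weights.
    for p in module.parameters():
        p.requires_grad = False
    return module

# Placeholder modules standing in for the real components.
vision_encoder = freeze(nn.Linear(1024, 1024))  # frozen pretrained vision encoder
llm            = freeze(nn.Linear(4096, 4096))  # frozen LLM
projector      = nn.Linear(1024, 4096)          # the only trainable piece in stage 1

trainable = [n for n, p in projector.named_parameters() if p.requires_grad]
print(trainable)  # ['weight', 'bias'] -- only projector params train
```

In the second stage, the LLM is unfrozen as well and tuned on the multimodal instruction-following data.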