| 2021-10-16 |
Edward Hu et al. |
LoRA: Low-Rank Adaptation of Large Language Models |
LoRA introduces a resource-efficient approach for adapting large pre-trained language models such as GPT-3 to specific tasks: the pretrained weights are frozen, and small trainable low-rank decomposition matrices are injected into the Transformer layers. This avoids the heavy storage and training costs of full fine-tuning, maintains model quality, adds no inference latency (the update can be merged into the frozen weights), and facilitates quick task-switching. |
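The core idea can be sketched in a few lines. This is a minimal illustration of the low-rank update, not the paper's implementation; the dimensions and rank below are hypothetical, and a real setup would train A and B by gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 64, 64, 8  # hypothetical layer sizes and LoRA rank (r << d)

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-initialized

x = rng.standard_normal(d_in)

# Forward pass: the adapted layer adds the low-rank update B @ A to W.
y = W @ x + B @ (A @ x)

# With B zero-initialized, the adapted model starts identical to the base model.
assert np.allclose(y, W @ x)

# After training, B @ A can be merged into W, so inference adds no latency.
W_merged = W + B @ A
```

Only A and B (2 * r * d parameters per layer instead of d * d) are trained and stored per task, which is what makes task-switching cheap.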
| 2023-12-19 |
Lingling Xu et al. |
Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment |
This paper reviews parameter-efficient fine-tuning (PEFT) methods for large pretrained language models, highlighting their benefits in resource-limited settings and assessing performance, efficiency, and memory usage across multiple tasks. |
| 2023-05-17 |
Rohan Anil et al. |
PaLM 2 Technical Report |
Introduces PaLM 2, a state-of-the-art language model with stronger multilingual and reasoning capabilities and better compute efficiency than its predecessor. |
| 2023-05-18 |
Chunting Zhou et al. |
LIMA: Less Is More for Alignment |
Examines the relative importance of unsupervised pretraining versus large-scale instruction tuning and reinforcement learning for aligning large language models to end tasks and user preferences, arguing that nearly all of a model's knowledge is learned during pretraining and that a small amount of carefully curated fine-tuning data can suffice for alignment. |
| 2022-03-29 |
Jordan Hoffmann et al. |
Training Compute-Optimal Large Language Models |
Investigates the optimal model size and token count for training a transformer language model under a fixed compute budget, finding that parameters and training tokens should be scaled in roughly equal proportion; the resulting 70B-parameter Chinchilla model, trained on 1.4T tokens, outperforms larger models trained with the same compute. |
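The sizing rule can be sketched with two rough approximations commonly attributed to this line of work: training FLOPs C is about 6 * N * D for N parameters and D tokens, and the compute-optimal ratio is around 20 tokens per parameter. The function below is an illustrative back-of-the-envelope calculation under those assumptions, not the paper's actual fitting procedure:

```python
def compute_optimal(C, flops_per_param_token=6, tokens_per_param=20):
    """Rough compute-optimal sizing: C ~ 6*N*D with D ~ 20*N.

    Returns (N, D): parameter count and training-token count that
    approximately exhaust a budget of C training FLOPs.
    """
    # Substitute D = 20*N into C = 6*N*D and solve for N.
    N = (C / (flops_per_param_token * tokens_per_param)) ** 0.5
    D = tokens_per_param * N
    return N, D

# Example: a budget of ~5.76e23 FLOPs yields roughly 70B parameters
# and 1.4T tokens, matching Chinchilla's reported configuration.
N, D = compute_optimal(5.76e23)
print(f"params ~ {N:.2e}, tokens ~ {D:.2e}")
```

Under the same heuristic, doubling the compute budget scales both N and D by about sqrt(2), rather than putting all the extra compute into a larger model.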