🍂Foundations of LLMs Lesson 1
Notes for lesson 1 of the W&B course on Foundations of LLMs.
Learning objectives:
Have a good mental model of when to train or fine-tune LLMs
Understand, at a high level, the key pieces needed to make it work successfully
Understand why and how to participate in the NeurIPS LLM Efficiency Challenge
Introduction


Most modern LLMs are decoder-only transformer models, trained to predict the next token in a sequence.
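A minimal sketch of next-token prediction with a small decoder-only model, assuming the Hugging Face transformers library and the GPT-2 checkpoint:

```python
# Minimal sketch: a decoder-only model scoring candidate next tokens.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# The distribution at the last position ranks every candidate next token.
next_token_id = logits[0, -1].argmax()
print(tokenizer.decode(next_token_id))  # likely " Paris"
```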
When to train or fine-tune?

Pre-train an LLM from scratch when you want full control over the training data.
Fine-tune an open-source LLM when you need more control than an API offers but want cheaper inference.
Use commercial APIs when you need to reduce time to market (see the sketch below).
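For the commercial-API route, a minimal sketch using OpenAI's Python client; the model name is only an example, and the client assumes an OPENAI_API_KEY in the environment:

```python
# Commercial-API route: no training, pay per call.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": "Summarize RLHF in one sentence."}],
)
print(response.choices[0].message.content)
```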
Types of fine-tuning:
Task-specific fine-tuning: when we have a specific task, we can fine-tune the model to perform it given a certain prompt or instruction.
Instruction tuning: we give the model text instructions and expect it to follow them. Two common approaches:
Supervised fine-tuning (SFT): we provide the model with instruction/response pairs and train it to produce the response given the instruction (see the toy step after this list).
RLHF (reinforcement learning from human feedback): a two-stage technique. First we train a reward model to score outputs according to what humans prefer; then we fine-tune the LLM with reinforcement learning against that reward model to align it with human preferences.
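A toy sketch of a single supervised fine-tuning step. The Alpaca-style instruction/response template and the hyperparameters are assumptions for illustration; real SFT pipelines typically mask the instruction tokens out of the loss:

```python
# Toy SFT step: teach the model to emit a response given an instruction.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Illustrative example and template, not a real dataset.
example = {"instruction": "Translate to French: Hello", "response": "Bonjour"}
text = f"### Instruction:\n{example['instruction']}\n\n### Response:\n{example['response']}"
batch = tokenizer(text, return_tensors="pt")

# Standard causal-LM loss: labels are the inputs, shifted by one inside the model.
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"loss: {loss.item():.3f}")
```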
Pretraining vs finetuning
ChatGPT started from a model trained on code to help with programming; in later iterations it was trained with RLHF (the technique discussed above) to align more closely with human preferences.

There are several open-source models available today that we can fine-tune.

What do we need for finetuning?
Define the goal and the evaluation criteria (see the tracking sketch after this list).
Choose a model architecture and a foundation model.
Create the right dataset.
Train and fine-tune the model efficiently on that data.
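Since this is a W&B course, a minimal sketch of tracking a fine-tuning run with wandb; the project name, config values, and metrics are placeholders:

```python
# Minimal experiment-tracking sketch with Weights & Biases.
import wandb

run = wandb.init(
    project="llm-finetuning",  # placeholder project name
    config={"base_model": "gpt2", "lr": 5e-5, "epochs": 3},
)

for epoch in range(run.config.epochs):
    train_loss = 1.0 / (epoch + 1)  # stand-in for a real training loop
    wandb.log({"epoch": epoch, "train/loss": train_loss})

run.finish()
```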
Introduction to the NeurIPS LLM Efficiency Challenge

Fine-tuning LLMs
Quantization: using lower-precision weights to fit the model into limited memory (see the QLoRA sketch below)
PEFT (parameter-efficient fine-tuning): instead of training the whole model, we train a small part of it or append new trainable parameters (see the sketches after this list)
Techniques that modify the model structure:
Adapters
LoRA (Low-Rank Adaptation)
QLoRA (Quantized LoRA): the base model is kept in lower precision while the LoRA adapters are fine-tuned
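A sketch of the QLoRA recipe using the transformers, peft, and bitsandbytes libraries: the base model is loaded with 4-bit weights and kept frozen, and only small low-rank adapters are trained. The model name, rank, and target modules are illustrative choices, not recommendations:

```python
# QLoRA sketch: 4-bit frozen base weights + trainable LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize base weights to 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4, the QLoRA data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls still run in bf16
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # example base model
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA: learn low-rank updates on top of the frozen attention projections.
lora_config = LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # a tiny fraction of the full model
```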
Techniques that modify how data is fed to the model:
Prompt tuning
Prefix tuning
P-tuning
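These input-side techniques keep the model frozen and learn a small set of "virtual token" embeddings that are prepended to the input. A sketch of prompt tuning with the peft library; the initialization text and number of virtual tokens are illustrative:

```python
# Prompt tuning sketch: learn a few virtual-token embeddings, freeze the model.
from transformers import AutoModelForCausalLM
from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")

peft_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=8,                      # length of the learned soft prompt
    prompt_tuning_init=PromptTuningInit.TEXT,  # initialize from real-token embeddings
    prompt_tuning_init_text="Classify the sentiment of this review:",
    tokenizer_name_or_path="gpt2",
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the 8 virtual-token embeddings train
```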
Data curation
Code to start experimenting