General LLM Stuff
Pretraining
Post-Training - RL
Reinforcement Learning
Agentic AI
100

OpenAI's most popular LLM

ChatGPT

100

What is the first step of pretraining?

Web scraping

100

SFT stands for

Supervised Fine-Tuning

100

RLHF stands for

Reinforcement Learning from Human Feedback

100

What do AI Agents do?

They split tasks into multiple steps

200

What is a large language model?

A neural network that processes and generates human-like text

200

One factor that allows an LLM to be "bigger" than another

More parameters

More training tokens

Bigger context window

200

Is SFT done by humans or fully AI?

Humans

200

How does Reinforcement Learning work?

Prompt the LLM many times and reinforce the responses that reach the correct answer
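
A toy sketch of the "sample many times, reward the correct answers" loop described above. It assumes a tiny stand-in "policy" with one logit per candidate answer and a REINFORCE-style update. The names `softmax` and `reinforce` and all hyperparameters are hypothetical, and a real LLM is trained very differently in detail.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce(answers, correct, steps=200, samples=8, lr=0.5, seed=0):
    """Toy policy: one logit per candidate answer, nudged toward rewarded samples."""
    rng = random.Random(seed)
    logits = [0.0] * len(answers)
    for _ in range(steps):
        probs = softmax(logits)
        # "Prompt many times": draw several attempts from the current policy.
        picks = rng.choices(range(len(answers)), weights=probs, k=samples)
        for i in picks:
            reward = 1.0 if answers[i] == correct else 0.0
            # REINFORCE-style step: reward times the gradient of log-prob of the pick.
            for j in range(len(logits)):
                grad = (1.0 if j == i else 0.0) - probs[j]
                logits[j] += lr * reward * grad / samples
    return softmax(logits)

probs = reinforce(["3", "4", "5"], correct="4")
```

After training, most of the probability mass should sit on the rewarded answer "4".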

200

Name an example of an AI Agent

Coding agents, web search agents, weather agents, etc.

300

The difference between multimodal models and LLMs

LLMs only handle text; multimodal models can also handle images, audio, etc.

300

What is inference?

Running a trained model to produce output; each generated token is one forward pass through the model

300

Two ways to fix hallucinations

Train it to say "I don't know"
Let it use tools (e.g., web search)

300

Difference between RL and RLHF

RL is used for tasks with verifiable answers; RLHF is used for tasks without a single correct answer and relies on human feedback

300

Give an example of what Agents can do with a computer

They can carry out multi-step operations: browse the web, run and test code, debug, etc.

400

What architecture is most commonly used in LLMs?

Transformers

400

Tokenization uses ____-____ encoding

byte pair
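
A minimal sketch of the idea behind byte-pair encoding: repeatedly merge the most frequent adjacent pair of symbols into one new symbol. This toy version works on characters of a single string. The function names `most_frequent_pair`, `merge_pair`, and `bpe_train` are hypothetical, and production tokenizers operate on bytes over very large corpora with many extra details.

```python
from collections import Counter

def most_frequent_pair(symbols):
    """Return the most common adjacent pair, or None if there are no pairs."""
    pairs = Counter(zip(symbols, symbols[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(symbols, pair):
    """Replace every non-overlapping occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged

def bpe_train(text, num_merges):
    """Start from single characters and apply up to `num_merges` merges."""
    symbols = list(text)
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(symbols)
        if pair is None:
            break
        symbols = merge_pair(symbols, pair)
        merges.append(pair)
    return symbols, merges

symbols, merges = bpe_train("aaabdaaabac", 1)
```

With one merge on this string, the pair ("a", "a") is merged first because it is the most frequent adjacent pair.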

400

What is the point of <User> and <Assistant> tokens?

To show the LLM which text is the user's prompt and where its own response should begin
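
A small sketch of how a conversation might be flattened into one string using such markers. The `<User>` and `<Assistant>` strings follow the wording of the clue; real models use their own special tokens, and the helper `render_chat` is hypothetical.

```python
def render_chat(messages):
    """Flatten a conversation into one string the model can continue.

    `messages` is a list of (role, text) pairs, with role "user" or "assistant".
    The <User>/<Assistant> markers are illustrative, not any model's real tokens.
    """
    markers = {"user": "<User>", "assistant": "<Assistant>"}
    parts = [f"{markers[role]}{text}" for role, text in messages]
    # End with an open assistant marker: this is where the model starts generating.
    parts.append(markers["assistant"])
    return "".join(parts)

prompt = render_chat([("user", "What is 2+2?")])
```

The trailing `<Assistant>` marker is the cue that the next tokens belong to the model's reply.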

400

What happens if you run RLHF too many times?

The model finds loopholes in the scoring and produces nonsensical responses that still get high scores

400

Why do reasoning models take so much longer?

They "think," taking different approaches and checking work before providing a solution

500

Why can't LLMs count/spell?

They don't see characters one by one; they only see and generate tokens
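
A tiny illustration of the point above, assuming a made-up two-entry vocabulary. The model receives token IDs, not letters, so a question like "how many r's?" is not directly visible in its input. The vocabulary, the IDs, and the helper `to_token_ids` are all hypothetical.

```python
# Hypothetical vocabulary; real tokenizers have their own splits and IDs.
vocab = {"straw": 101, "berry": 102}

def to_token_ids(pieces):
    """Map each text piece to its integer ID, which is all the model sees."""
    return [vocab[p] for p in pieces]

word_pieces = ["straw", "berry"]
token_ids = to_token_ids(word_pieces)            # what the model receives
letter_count = "".join(word_pieces).count("r")   # needs character-level access
```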

500

What is a base model, and how is it different from the LLMs that we use?

Base models are just token generators; they are not fine-tuned to answer questions or hold a conversation

500

Give an example of a hallucination and why it would happen

Many possible answers

500

How does RLHF avoid humans having to rate a billion LLM outputs?

It uses another neural network (a reward model) that simulates human scoring
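
A rough sketch of how a scoring model could be fit to a handful of human comparisons and then reused to score new outputs. To stay short it swaps the neural network for a linear scorer over made-up features and assumes a pairwise logistic (Bradley-Terry style) objective. The names `score` and `train_reward_model`, the features, and the data are all hypothetical.

```python
import math

def score(weights, features):
    """Linear stand-in for the reward model: higher means 'humans would prefer this'."""
    return sum(w * x for w, x in zip(weights, features))

def train_reward_model(pairs, dim, epochs=200, lr=0.1):
    """pairs: list of (preferred_features, rejected_features) from human comparisons."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for good, bad in pairs:
            # Probability the model assigns to the human's choice.
            margin = score(weights, good) - score(weights, bad)
            p = 1.0 / (1.0 + math.exp(-margin))
            # Gradient ascent on log p: push the preferred response's score up.
            for k in range(dim):
                weights[k] += lr * (1.0 - p) * (good[k] - bad[k])
    return weights

# Hypothetical features per response: [is_polite, is_on_topic]
human_pairs = [
    ([1.0, 1.0], [0.0, 1.0]),
    ([1.0, 1.0], [1.0, 0.0]),
    ([0.0, 1.0], [0.0, 0.0]),
]
weights = train_reward_model(human_pairs, dim=2)
```

Once trained, `score` can rate any number of new responses without asking a human again, which is the labor-saving step the answer refers to.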

500

Why are AI Agents regarded as the future?

They go beyond prompts and responses: they can get complex multi-step jobs done, something AI has yet to do
