This model architecture is the backbone of GPT, BERT, and T5.
Transformer
This popular open-source LLM was developed by Meta and comes in variants like 7B, 13B, and 70B.
LLaMA (Large Language Model Meta AI)
This custom hardware chip was developed by Google to accelerate deep learning training and inference.
TPU (Tensor Processing Unit)
Which company's 1981 PC set the standard for the personal-computer industry?
IBM
This robot in “Star Wars” is known for its golden armor and etiquette.
C-3PO
This is what happens when you forget to set temperature=0.1 and your model starts writing Shakespearean poems about Kubernetes.
Hallucination
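As a quick aside on that clue: temperature rescales the model's next-token distribution before sampling. A minimal sketch (the logits below are made-up scores, not from any real model):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then apply softmax.

    Low temperature sharpens the distribution (more deterministic);
    high temperature flattens it (more creative, more prone to drift).
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical next-token scores
cold = softmax_with_temperature(logits, 0.1)
hot = softmax_with_temperature(logits, 2.0)
print(cold[0] > hot[0])  # low temperature concentrates mass on the top token
```

At temperature 0.1 nearly all probability lands on the highest-scoring token, which is why low temperatures make output more repeatable.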
This term refers to an LLM’s ability to perform new tasks without being explicitly trained on them.
Zero-shot learning
This is the unit of compute used by OpenAI and others to estimate large-scale model training costs.
FLOP (Floating Point Operation)
Who is considered the father of modern computer science?
Alan Turing
This movie features an AI named Samantha that forms a romantic relationship with a human.
Her
This term describes a model trained to follow instructions in natural language.
Instruction-tuned model
This technique improves LLM reliability by combining it with external knowledge, like a search engine or database.
Retrieval-Augmented Generation (RAG)
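The core RAG loop is: retrieve relevant text, then prepend it to the prompt. A toy sketch (the corpus and word-overlap scoring are illustrative placeholders, not a real retriever):

```python
# Toy RAG sketch: pick the most relevant document, then build an
# augmented prompt that grounds the model's answer in that context.
def retrieve(query, corpus):
    """Return the corpus document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(corpus, key=lambda doc: len(q & set(doc.lower().split())))

def build_prompt(query, corpus):
    context = retrieve(query, corpus)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

corpus = [
    "TPUs are custom chips built by Google for deep learning.",
    "CUDA is NVIDIA's parallel computing platform.",
]
print(build_prompt("Which company builds TPUs", corpus))
```

Real systems replace the word-overlap scorer with dense embeddings and a vector index, but the retrieve-then-prompt shape is the same.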
NVIDIA GPUs are programmed using this parallel computing platform and programming model.
CUDA (Compute Unified Device Architecture)
This language was created by Guido van Rossum.
Python
This HBO series features lifelike AI beings known as “Hosts” in a futuristic Wild West theme park.
Westworld
This type of generative model learns to compress data into a latent space and then reconstruct it.
Variational Autoencoder (VAE)
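The trick that makes VAEs trainable is reparameterization: sample the latent vector as z = mu + sigma * eps with eps ~ N(0, 1), so gradients can flow through mu and log-variance. A minimal sketch (the numbers are illustrative, not from a trained encoder):

```python
import math
import random

def reparameterize(mu, log_var, rng=random.Random(0)):
    """Sample z = mu + sigma * eps, eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0, 1)
            for m, lv in zip(mu, log_var)]

mu = [0.0, 1.0]           # encoder's predicted latent means
log_var = [-20.0, -20.0]  # near-zero variance, so z collapses to mu
z = reparameterize(mu, log_var)
print([round(v, 3) for v in z])
```

With a tiny variance the sample sits essentially at the mean; during training the decoder reconstructs the input from z while a KL term keeps the latent distribution close to a standard normal.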
This technique lets LLMs “think” step-by-step by prompting intermediate reasoning before the final answer.
Chain-of-Thought prompting
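In practice this is just prompt construction: a worked example with intermediate reasoning, plus the trigger phrase before the model's answer. A sketch (the example question is made up):

```python
# Chain-of-Thought prompt sketch: one worked example showing the
# reasoning steps, then the new question with a step-by-step trigger.
FEW_SHOT = (
    "Q: A GPU has 4 memory modules of 24 GB each. How much memory in total?\n"
    "A: Each module holds 24 GB, and there are 4 modules, "
    "so 4 * 24 = 96 GB. The answer is 96 GB.\n"
)

def cot_prompt(question):
    return FEW_SHOT + f"Q: {question}\nA: Let's think step by step."

print(cot_prompt("If training takes 3 days on 8 GPUs, how many GPU-days is that?"))
```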
Approximately how much electricity did it take to train GPT-3 (175B parameters)?
~ 1,287 megawatt-hours (MWh) of electricity
This computer scientist coined the term “Artificial Intelligence” in 1956 and organised the Dartmouth Conference.
John McCarthy
The fictional AI “Skynet” became self-aware in this movie series.
Terminator
This form of training for LLMs incorporates preference feedback from humans to align model outputs with human values.
Reinforcement Learning from Human Feedback (RLHF)
This process creates a smaller, faster model by training it to mimic the behaviour of a larger model.
Distillation
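The usual distillation loss trains the student to match the teacher's full (temperature-softened) output distribution via KL divergence, not just its top prediction. A minimal sketch (the logits are hypothetical):

```python
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(x / temperature) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)  # soft teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [3.0, 1.0, 0.2]  # hypothetical teacher logits
aligned = distillation_loss(teacher, [3.1, 0.9, 0.3])
misaligned = distillation_loss(teacher, [0.2, 1.0, 3.0])
print(aligned < misaligned)  # closer student distribution, smaller loss
```

Raising the temperature exposes the teacher's "dark knowledge" about relative probabilities of wrong answers, which is what the small student learns from.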
This law, the energy-efficiency counterpart to Moore’s Law, observes that the number of computations possible per joule of energy has doubled roughly every 1.6 years.
Koomey’s Law
What was the name of the first chatbot, created in 1966?
ELIZA
These two AI robots assist the crew of “Interstellar”, featuring boxy designs and witty personalities.
TARS and CASE