Video: How ChatGPT works?
Summary of the key points from the video:
– ChatGPT was released in November 2022 and reached 100 million monthly active users in just two months, making it the fastest-growing consumer app up to that point.
– ChatGPT uses a Large Language Model (LLM) like GPT-3.5 at its core. LLMs are trained on massive amounts of text data to generate human-like text.
– GPT-3.5 is described as having 175 billion parameters spread across 96 neural network layers, trained on about 500 billion tokens of internet text. These figures match the ones published for GPT-3.
– The model predicts the next token (word piece) given previous tokens. It can generate grammatically correct text but needs guidance to avoid generating harmful content.
– The raw LLM is further “fine-tuned” using Reinforcement Learning from Human Feedback (RLHF) to align it with human preferences and make it safer.
– RLHF works by gathering human feedback on model outputs to train a reward model, then fine-tuning the LLM with Proximal Policy Optimization (PPO) so that its outputs earn higher rewards.
– When answering a prompt, ChatGPT injects conversational context, applies prompt engineering, and passes the output through moderation models.
– In short, ChatGPT relies on an LLM trained at massive scale, plus additional fine-tuning and safety measures, to enable natural conversation while avoiding unsafe responses.
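To make the next-token prediction point above concrete, here is a minimal sketch in Python. It is a toy, not how GPT-3.5 is implemented: a bigram counter stands in for the neural network, whole words stand in for tokens, and the function names (`train_bigram`, `next_token_probs`, `generate`) are made up for this illustration. What it does share with a real LLM is the loop: look at the text so far, pick a likely next token, append it, repeat.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count which token follows which; a toy stand-in for an LLM's learned weights."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Turn the follow-up counts for `prev` into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}


def generate(counts, prompt, max_new_tokens=5):
    """Greedy autoregressive decoding: append the most likely next token, repeat."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        probs = next_token_probs(counts, out[-1])
        if not probs:  # nothing ever followed this token in the training text
            break
        out.append(max(probs, key=probs.get))
    return out


# Tiny "training corpus"; a real model sees hundreds of billions of tokens.
corpus = "the cat sat on the mat and the cat sat on the rug".split()
model = train_bigram(corpus)

print(next_token_probs(model, "the"))  # {'cat': 0.5, 'mat': 0.25, 'rug': 0.25}
print(generate(model, ["the"], max_new_tokens=3))  # ['the', 'cat', 'sat', 'on']
```

A real LLM conditions on the whole preceding context rather than only the last token, and usually samples from the distribution instead of always taking the top choice.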
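The request flow described above (inject conversational context, apply prompt engineering, run the output through moderation) can be sketched roughly as follows. Everything here is an assumption for illustration: the instruction text, the word blocklist standing in for a moderation model, and the `toy_llm` placeholder are hypothetical and do not reflect OpenAI's actual implementation.

```python
# Hypothetical instruction text prepended to every prompt (prompt engineering).
SYSTEM_INSTRUCTIONS = "You are a helpful assistant. Answer conversationally."

# Toy stand-in for a moderation model: a fixed set of disallowed words.
BLOCKLIST = {"badword"}


def is_flagged(text):
    """Toy moderation check: flag text containing a blocklisted word."""
    return any(word in BLOCKLIST for word in text.lower().split())


def build_prompt(history, user_message):
    """Inject earlier turns as context and wrap them with the instructions."""
    lines = [SYSTEM_INSTRUCTIONS]
    for role, text in history:
        lines.append(f"{role}: {text}")
    lines.append(f"user: {user_message}")
    lines.append("assistant:")
    return "\n".join(lines)


def toy_llm(prompt):
    """Placeholder for the fine-tuned LLM; reports how much prompt it was given."""
    return f"(reply based on {len(prompt.splitlines())} lines of prompt)"


def answer(history, user_message, llm=toy_llm):
    """One chat turn: build the prompt, call the model, moderate the output."""
    prompt = build_prompt(history, user_message)
    reply = llm(prompt)
    if is_flagged(reply):
        return "Sorry, I can't help with that."
    # Only safe exchanges are remembered as context for later turns.
    history.append(("user", user_message))
    history.append(("assistant", reply))
    return reply


chat = []
print(answer(chat, "Hi"))            # (reply based on 3 lines of prompt)
print(answer(chat, "Tell me more"))  # (reply based on 5 lines of prompt)
```

The second call sees a longer prompt because the first exchange was injected as context, which is what lets the model appear to "remember" the conversation.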