• LLMs learn the statistical patterns that piece words together into sentences and paragraphs.
  • A neural network is an extremely complicated type of mathematical function, involving millions of numbers, that converts an input into an output.
  • In every round of training, an algorithm adjusts these numbers to try to improve the network's guesses, using a mathematical technique known as backpropagation.
  • This process of tuning internal numbers is what it means for a neural net to "learn".
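The tuning loop above can be sketched in miniature. This is a toy model with a single internal number `w` (real networks apply backpropagation to millions of weights); the target value and learning rate are made-up illustrative choices:

```python
# Sketch: "learning" = repeatedly nudging an internal number to
# reduce the error of the model's guesses.
def train(steps=100, lr=0.1):
    w = 0.0        # the one "internal number" being tuned
    target = 3.0   # we want f(x) = w * x to behave like 3 * x
    for _ in range(steps):          # each pass = one round of training
        x = 1.0
        error = w * x - target * x  # how wrong the current guess is
        grad = 2 * error * x        # gradient of the squared error
        w -= lr * grad              # adjust the number downhill
    return w
```

After enough rounds, `w` converges toward the target - the same idea, scaled up enormously, is what tuning a neural net's numbers amounts to.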
  • The neural net is not actually generating letters or words directly, but probabilities for each possible next word; from these it produces a "reasonable continuation".
    • reasonable - what one might expect someone to write, based on what people have written on billions of webpages.
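A minimal sketch of that idea, using made-up toy probabilities (a real model assigns a probability to every word in its vocabulary):

```python
import random

# Hypothetical model output: a probability for each candidate next
# word, not the word itself.
next_word_probs = {"cat": 0.5, "dog": 0.3, "banana": 0.2}

def continue_text(probs):
    words = list(probs)
    weights = [probs[w] for w in words]
    # A "reasonable continuation" = a word drawn according to these
    # probabilities, so likelier words appear more often.
    return random.choices(words, weights=weights, k=1)[0]
```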
  • As language models grow in size, they develop emergent abilities - capabilities that smaller models do not show.
  • If you always pick the highest-ranked next word, you get a "flat" essay with no creativity. If you sometimes pick lower-ranked words, you get a "more interesting" essay.
    • the "temperature" parameter controls how often lower-ranked words get picked.
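One common way temperature works is as a rescaling of the model's scores before they are turned into probabilities. A sketch with hypothetical scores (the word list and numbers are invented for illustration):

```python
import math
import random

# Sketch of temperature sampling: divide each score by the
# temperature, then convert to probabilities (a softmax).
def sample(scores, temperature=1.0):
    # Low temperature -> sharper distribution, top word dominates
    #   ("flat" but safe text).
    # High temperature -> flatter distribution, lower-ranked words
    #   get picked more often ("more interesting" text).
    exps = [math.exp(s / temperature) for s in scores.values()]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(list(scores), weights=probs, k=1)[0]

# Hypothetical model scores for the next word:
scores = {"the": 2.0, "a": 1.0, "banana": -1.0}
```

At a very low temperature, `sample` almost always returns the top-ranked word; raising the temperature lets the others through.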
  • There are about 40,000 reasonably commonly used words in English.
  • By looking at a large corpus of English text (say a few million books, with altogether a few hundred billion words), we can get an estimate of how common each word is.
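The frequency estimate above can be sketched with a stand-in corpus (a real estimate would use those hundreds of billions of words, not one sentence):

```python
from collections import Counter

# Tiny stand-in corpus; in practice this would be millions of books.
corpus = "the cat sat on the mat the cat ran"

counts = Counter(corpus.split())          # how often each word occurs
total = sum(counts.values())              # total words in the corpus
freqs = {word: n / total for word, n in counts.items()}
# freqs now estimates how common each word is, e.g. freqs["the"]
# is the fraction of all words that are "the".
```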
  • An LLM is a form of user interface - a wrapper that hides the underlying technical specifics.