Nov 15 2024

(Stanford)

  • AI/ML Learning - Backpropagation - computing gradients automatically
    • Broke down complex equations into building blocks. Covered forward and backward propagation, showing how a neural network can automatically compute the gradient of the loss with respect to every weight and use it to learn the weights for each feature of (almost) any function. A minimal sketch follows below.
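A minimal sketch of that idea on a toy two-layer network; the sizes, data, ReLU activation, and squared loss are made-up illustrations, not anything specific from the lecture:

```python
import numpy as np

# Toy 2-layer network: x -> W1 -> ReLU -> W2 -> prediction, squared loss.
rng = np.random.default_rng(0)
x = rng.normal(size=(4,))          # one input example (4 features)
y = 1.0                            # target value
W1 = rng.normal(size=(3, 4))       # first-layer weights
W2 = rng.normal(size=(1, 3))       # second-layer weights

# Forward propagation: compute and cache each building block.
z = W1 @ x                         # pre-activation
h = np.maximum(0, z)               # ReLU
pred = (W2 @ h)[0]                 # scalar prediction
loss = (pred - y) ** 2

# Backward propagation: apply the chain rule block by block,
# reusing the cached forward values.
dpred = 2 * (pred - y)             # dL/dpred
dW2 = dpred * h[None, :]           # dL/dW2
dh = dpred * W2[0]                 # dL/dh
dz = dh * (z > 0)                  # ReLU only passes gradient where z > 0
dW1 = np.outer(dz, x)              # dL/dW1

# One gradient-descent step on the weights.
lr = 0.1
W1 -= lr * dW1
W2 -= lr * dW2
```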
  • AI/ML Learning - Differentiable programming
    • Feedforward NN, Convolutional NN, MaxPool, SequenceRNN, SimpleRNN, LSTM, Attention, AddNorm, TransformerBlock, BERT, Collapsing, GenerateToken, LanguageModel, SequenceToSequence
    • Once we have the EmbedTokens, we have a choice to feed them either into RNNs (SequenceRNN, SimpleRNN, LSTM) or Transformers (which are based on attention - AddNorm, TransformerBlock, BERT); a minimal RNN sketch follows after this sub-list
      • We can Collapse to do categorization
      • We can generate new sequences (GenerateToken, LanguageModel, Seq2Seq)
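A minimal sketch of the RNN branch referenced above; the tanh recurrence, the mean-pool Collapse, and all dimensions are illustrative assumptions, not any particular library's SimpleRNN:

```python
import numpy as np

rng = np.random.default_rng(0)
d_embed, d_hidden, vocab = 8, 16, 100

# EmbedToken: a lookup table from token ids to vectors.
E = rng.normal(size=(vocab, d_embed))

# SimpleRNN parameters: h_t = tanh(W_h h_{t-1} + W_x x_t).
W_h = rng.normal(size=(d_hidden, d_hidden)) * 0.1
W_x = rng.normal(size=(d_hidden, d_embed)) * 0.1

def simple_rnn(xs):
    h = np.zeros(d_hidden)
    hs = []
    for x in xs:                       # process the sequence left to right
        h = np.tanh(W_h @ h + W_x @ x)
        hs.append(h)
    return np.stack(hs)                # one hidden vector per position

tokens = [3, 14, 15]                   # made-up token ids
hs = simple_rnn(E[tokens])             # EmbedToken, then the sequence model
pooled = hs.mean(axis=0)               # Collapse: sequence -> one vector
# `pooled` can now feed a linear classifier (categorization)
# or a GenerateToken step (generation).
```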
    • Sequential RNNs vs. Transformers, which use the attention mechanism: attention processes y by comparing it to each x_i, and self-attention processes each x by comparing it to every other x_i. Layer normalization and residual connections => AddNorm(f) – applies f to x safely (AddNorm(f)(x) = LayerNorm(x + f(x))), so if f() is junk, at least I still have x. Basically, it normalizes the feature vectors.
    • TransformerBlock
      • Attention(x) – lets the vectors talk to each other
      • AddNorm - to normalize all these results
      • Feedforward - applied to each individual vector
      • And AddNorm that result as well (a minimal sketch of the whole block follows below).
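A minimal NumPy sketch of that composition; it assumes single-head attention with no learned Q/K/V projections, no masking, and no learned LayerNorm parameters (all of which real implementations add), and the dimensions are made up:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each feature vector (row) to zero mean, unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def add_norm(x, f):
    # AddNorm(f): apply f "safely" -- keep x via the residual, then normalize.
    return layer_norm(x + f(x))

def self_attention(X):
    # Each vector attends to every vector in the same sequence.
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                  # compare x_i with each x_j
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over positions
    return weights @ X                             # weighted mix of the vectors

def feedforward(X, W1, W2):
    # Applied independently to each position's vector.
    return np.maximum(0, X @ W1) @ W2

def transformer_block(X, W1, W2):
    X = add_norm(X, self_attention)                    # let the vectors talk, then AddNorm
    X = add_norm(X, lambda Z: feedforward(Z, W1, W2))  # per-vector FFN, then AddNorm
    return X

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))            # 5 tokens, 8-dim embeddings (made up)
W1 = rng.normal(size=(8, 32)) * 0.1
W2 = rng.normal(size=(32, 8)) * 0.1
out = transformer_block(X, W1, W2)     # same shape as X: (5, 8)
```

The block keeps the input shape, which is what lets you stack it many times (as BERT does below).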
    • BERT - Embeds the tokens (words or letters of the language), then applies a TransformerBlock 24 times! The result is a series of vectors that are highly contextualized and nuanced and carry a lot of rich information about the sentence.
      • From here you can use Collapse to reduce the vectors into one value → category (a hedged PyTorch-style sketch of this path follows after this sub-list)
      • Or use it to select an answer to a question (predicting the next token)
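A hedged sketch of the classification path in PyTorch; this is not BERT itself, just an embedding layer, a stack of 24 standard encoder blocks, and a mean-pool Collapse feeding a linear head, with made-up sizes (real BERT also adds positional embeddings and is pretrained):

```python
import torch
import torch.nn as nn

vocab, d_model, n_classes = 30000, 256, 3   # made-up sizes

embed = nn.Embedding(vocab, d_model)                      # EmbedTokens
block = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(block, num_layers=24)     # TransformerBlock x 24
classify = nn.Linear(d_model, n_classes)                  # head after Collapse

tokens = torch.randint(0, vocab, (1, 12))   # one sentence of 12 token ids
h = encoder(embed(tokens))                  # (1, 12, d_model): contextualized vectors
pooled = h.mean(dim=1)                      # Collapse: sequence -> one vector
logits = classify(pooled)                   # -> category scores
```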
    • Tokens
      • GenerateToken: x[] -> y (e.g., 'abc') – generates token y based on the score [ x * EmbedToken(y) ] for each candidate y
      • EmbedToken - Looks up and returns the vector for a token: 'abc' -> x[] (minimal sketch of both below)
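A minimal sketch of those two modules; the vocabulary, the embedding table, and the greedy pick are made-up illustrations (a trained model would learn the table and usually sample from the softmax):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "quick", "brown", "fox"]        # made-up vocabulary
E = rng.normal(size=(len(vocab), 8))            # one 8-dim vector per token

def embed_token(token):
    # EmbedToken: 'abc' -> x[]  (look up the token's vector)
    return E[vocab.index(token)]

def generate_token(x):
    # GenerateToken: x[] -> y  (score each candidate y by x . EmbedToken(y))
    scores = E @ x
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                         # softmax over the vocabulary
    return vocab[int(np.argmax(probs))]          # greedy pick (could also sample)

x = embed_token("quick")
print(generate_token(x))                         # returns the best-scoring token
```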
    • Language Model
      • If x is ('the quick brown') → what's the next predicted word? i.e., generate the next token in the sequence…
      • LanguageModel(x='the quick brown') = GenerateToken(Collapse(SequenceModel(EmbedToken(x)))) (toy sketch below)
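A toy end-to-end sketch of exactly that composition; the sequence model here is just a running mean, Collapse takes the last vector, and nothing is trained, so the prediction is arbitrary; it only shows how the modules plug together:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "quick", "brown", "fox", "jumps"]   # made-up vocabulary
E = rng.normal(size=(len(vocab), 8))

def embed_tokens(words):          # EmbedToken over a whole sequence
    return np.stack([E[vocab.index(w)] for w in words])

def sequence_model(X):            # stand-in for SequenceRNN / TransformerBlock:
    # each position sees a running mean of everything up to it
    return np.cumsum(X, axis=0) / np.arange(1, len(X) + 1)[:, None]

def collapse(H):                  # Collapse: keep the last contextual vector
    return H[-1]

def generate_token(x):            # GenerateToken: best-scoring vocabulary item
    return vocab[int(np.argmax(E @ x))]

def language_model(words):
    # LanguageModel(x) = GenerateToken(Collapse(SequenceModel(EmbedToken(x))))
    return generate_token(collapse(sequence_model(embed_tokens(words))))

print(language_model(["the", "quick", "brown"]))    # untrained, so output is arbitrary
```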
    • Sequence-to-Sequence models [ la maison bleue ] => [ the blue house ]
      • Generate a sequence from another sequence (sentence to its translation, document to its summary, semantic parsing → e.g., sentence to code); a toy sketch follows below
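A toy sketch of the sequence-to-sequence pattern: encode the source into one vector, then repeatedly GenerateToken to grow the output. The tiny vocabularies, the averaging encoder, the feedback rule, and the <eos> stopping condition are all made-up assumptions and nothing is trained:

```python
import numpy as np

rng = np.random.default_rng(0)
src_vocab = ["la", "maison", "bleue"]               # made-up source vocabulary
tgt_vocab = ["the", "blue", "house", "<eos>"]       # made-up target vocabulary
E_src = rng.normal(size=(len(src_vocab), 8))
E_tgt = rng.normal(size=(len(tgt_vocab), 8))

def encode(words):
    # Stand-in encoder: average the source token embeddings into one vector.
    return np.mean([E_src[src_vocab.index(w)] for w in words], axis=0)

def generate_token(state):
    # GenerateToken against the target vocabulary.
    return tgt_vocab[int(np.argmax(E_tgt @ state))]

def seq2seq(source, max_len=4):
    state = encode(source)
    out = []
    for _ in range(max_len):                        # greedy decoding loop
        y = generate_token(state)
        if y == "<eos>":
            break
        out.append(y)
        state = state + E_tgt[tgt_vocab.index(y)]   # feed the chosen token back in
    return out

print(seq2seq(["la", "maison", "bleue"]))    # untrained, so output is arbitrary
```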