endlessly fascinating how a traditional machine learning background is basically not that helpful for modern AI. we use deep NNs and do SGD with one of two losses.
most day-to-day work lies in abstractions *on top* of this layer. everything is really just a massive system of data, models, and rules, piped into each other in different directions
‣ pre-training? that's just data
‣ RL? that's a model plus data generated from that model, plus some rules
‣ post-training? that's a model, plus some data which makes another model. then you use this plus the model's data to make a new model, use that model to make a bunch of other models, then use all those models to make a model again
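
to make the RL line concrete, here's a toy sketch of "a model plus data generated from that model, plus some rules". everything in it is made up for illustration: the "model" is just a dict of sampling weights, `rule` is a hypothetical hand-written scorer, and the update is a naive reweighting, not a real RL algorithm.

```python
import random

def generate(model, n, rng):
    """data generated from the model: sample n outputs by weight."""
    choices = list(model)
    weights = [model[c] for c in choices]
    return rng.choices(choices, weights=weights, k=n)

def rule(sample):
    """a rule: a hand-written check that scores a sample (hypothetical)."""
    return 1.0 if sample == "good" else 0.0

def rl_step(model, n, rng, lr=0.5):
    """model + data from that model + rules -> a new model."""
    data = generate(model, n, rng)
    new = dict(model)
    for s in data:
        new[s] += lr * rule(s)  # upweight outputs the rule rewards
    total = sum(new.values())
    return {k: v / total for k, v in new.items()}  # renormalize to sum to 1

rng = random.Random(0)
model = {"good": 0.5, "bad": 0.5}
for _ in range(5):
    # each step pipes the current model's own samples back into it
    model = rl_step(model, n=8, rng=rng)
```

the point is the shape, not the math: the only inputs are a model, samples from that same model, and a rule. the output is another model you can feed straight back in.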














