This post lists resources that I find useful on my journey of learning LLMs as a systems engineer.
Neural Networks
LLMs are large neural networks, so having a basic understanding of what neural networks are is helpful.
Victor Zhou’s Neural Networks From Scratch
Neural Networks From Scratch is a 4-post series that introduces classic neural networks, recurrent neural networks (RNNs), and convolutional neural networks (CNNs). It doesn’t require any prior knowledge beyond some math. One good thing about this series is that it’s very hands-on: you learn, step by step, how to write a simple neural network from scratch, using only numpy, to solve a real problem.
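To give a flavor of the kind of numpy-only network the series builds, here is a minimal sketch (not the series' own code): a one-hidden-layer network trained with plain gradient descent. XOR is used as a stand-in task; the posts work through their own example problems.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: XOR, a classic problem a linear model cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(W1, b1, W2, b2):
    h = sigmoid(X @ W1 + b1)    # hidden activations
    out = sigmoid(h @ W2 + b2)  # predictions
    return h, out

# Weights: 2 inputs -> 4 hidden units -> 1 output.
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)

_, out = forward(W1, b1, W2, b2)
initial_loss = float(((out - y) ** 2).mean())

lr = 1.0
for _ in range(5000):
    h, out = forward(W1, b1, W2, b2)
    # Backpropagation of the mean-squared-error loss through
    # both sigmoid layers, then a gradient descent update.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

_, out = forward(W1, b1, W2, b2)
final_loss = float(((out - y) ** 2).mean())
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}")
```

The whole forward and backward pass fits in a handful of matrix expressions, which is exactly what makes the from-scratch approach so instructive.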
Michael Nielsen’s Visual proof that neural nets can compute any function
Visual proof that neural nets can compute any function gives a visual explanation of the universal approximation theorem: a neural network can approximate any continuous function to any desired precision.
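The core trick of the visual proof is easy to reproduce numerically: a sigmoid with a large weight approximates a step function, and the difference of two shifted steps is a "bump". Sums of weighted bumps can then approximate any continuous function. A small sketch of the bump construction:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# With a large weight w, sigmoid(w * (x - s)) approximates a step at x = s.
x = np.linspace(0.0, 1.0, 101)
w = 1000.0

# Subtracting two shifted steps gives a bump supported on [0.4, 0.6]:
bump = sigmoid(w * (x - 0.4)) - sigmoid(w * (x - 0.6))

# ~1 inside the interval, ~0 outside it.
print(bump[50], bump[10])
```

Each bump needs only two hidden sigmoid neurons, so a single hidden layer, made wide enough, can tile the input with bumps of the right heights.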
LLM
Andrej Karpathy’s Neural Networks: Zero to Hero
Neural Networks: Zero to Hero is a hands-on course that starts with building a simple neural network from scratch and ends with building a GPT from scratch. It’s a great course for learning LLMs progressively.
Inference
IMO, the best way to learn LLM inference is to write an inference engine from scratch; doing so also helps you better understand model architectures.
Andrej Karpathy’s llama2.c
llama2.c is an inference engine for Llama 2 in one file of pure C. It can give you a taste of what LLM inference looks like. Based on it, I wrote a Rust implementation with optimizations like tensor parallelism.