Natural Language Processing — TTIC, Winter 2027

✦

Course Information

✦

Instructor: Nicholas Tomlin
Time: Tues & Thurs, 2:00–3:20 PM CT
Location: TBD

Please note: This syllabus is under construction and is subject to change!

This course covers the fundamentals of natural language processing, with a focus on language models and reinforcement learning. The central organizing question is: what objective are we optimizing, and how do we optimize it efficiently? We'll cover everything from next-token prediction to reinforcement learning from human feedback, emphasizing the theoretical foundations and practical details behind today's language models. Students will implement language models from scratch and build a modern post-training stack.

✦

Schedule

✦

Wk	Date	Topic
Block I — Next-token prediction: the language modeling objective
I	Tue, Jan 5	The language modeling objective; a simple n-gram model
I	Thu, Jan 7	Tokenization; learning vector representations of words
II	Tue, Jan 12	RNNs and vanishing gradients; LSTMs
II	Thu, Jan 14	Sequence-to-sequence modeling; evaluation metrics and BLEU score; attention
III	Tue, Jan 19	The transformer architecture; scaling laws
III	Thu, Jan 21	Sampling strategies; retrieval; alternative architectures
Block II — Verifiable rewards: optimizing for correctness
IV	Tue, Jan 26	Filtered behavior cloning; weakly supervised semantic parsing; STaR
IV	Thu, Jan 28	A crash-course intro to RL; REINFORCE
V	Tue, Feb 2	Reasoning models; GRPO
V	Thu, Feb 4	Midterm Examination
Block III — Learned reward models: beyond verifiable rewards
VI	Tue, Feb 9	RLHF: preference data collection, Bradley-Terry, reward modeling; PPO
VI	Thu, Feb 11	DPO and offline RLHF; learning from rubrics; process reward models
VII	Tue, Feb 16	Distillation; context distillation; self-distillation
VII	Thu, Feb 18	Additional considerations: LoRA, asynchronous RL, etc.
Block IV — Listener models: optimizing for interaction
VIII	Tue, Feb 23	Computational pragmatics; the Rational Speech Acts framework
VIII	Thu, Feb 25	Training language models with user simulators
IX	Tue, Mar 2	LLM agents; tool use; vision-language models
Block V — Objective gaming and misspecification: when objectives break down
IX	Thu, Mar 4	Goodhart’s Law; reward hacking; open problems in AI safety
X	Thu, Mar 11	Final Examination

✦

Assignments

✦

Four coding assignments and two examinations.

Assessment	Description	Due
Assignment I	Implement an n-gram language model and word2vec	End of Week II
Assignment II	Implement a transformer language model	End of Week IV
Midterm	—	Week V
Assignment III	Implement GRPO for arithmetic problems	End of Week VI
Assignment IV	Implement a reward model based on human preference data	End of Week VIII
Final	—	Week X