
CS 224N: Natural Language Processing with Deep Learning

Winter 2023

Stanford University

CS 224N provides an in-depth introduction to neural networks for NLP, focusing on end-to-end neural models. The course covers topics such as word vectors, recurrent neural networks, and transformer models, among others.

Course Page

Overview

Natural language processing (NLP) is a crucial part of artificial intelligence (AI), modeling how people share information. In recent years, deep learning approaches have obtained very high performance on many NLP tasks. In this course, students gain a thorough introduction to cutting-edge neural networks for NLP.

Prerequisites

  • Proficiency in Python: All class assignments will be in Python (using NumPy and PyTorch). If you need to remind yourself of Python, or you're not very familiar with NumPy, you can come to the Python review session in week 1 (listed in the schedule). If you have a lot of programming experience, but in a different language (e.g., C/C++/MATLAB/Java/JavaScript), you will probably be fine.

  • College Calculus, Linear Algebra (e.g., MATH 51, CME 100): You should be comfortable taking (multivariable) derivatives and understanding matrix/vector notation and operations.

  • Basic Probability and Statistics (e.g., CS 109 or equivalent): You should know the basics of probability, Gaussian distributions, means, standard deviations, etc.

  • Foundations of Machine Learning (e.g., CS221, CS229, CS230, or CS124): We will be formulating cost functions, taking derivatives, and performing optimization with gradient descent. If you already have basic machine learning and/or deep learning knowledge, the course will be easier; however, it is possible to take CS224N without it. There are many introductions to ML, in webpage, book, and video form. One approachable introduction is Hal Daumé's in-progress A Course in Machine Learning. Reading the first 5 chapters of that book would be good background. Knowing the first 7 chapters would be even better!
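To make the last prerequisite concrete, here is a minimal sketch of the workflow it describes: formulate a cost function, take its derivative, and minimize it with gradient descent. The quadratic cost, learning rate, and step count below are arbitrary illustration choices, not anything specific to the course assignments.

```python
# Minimal gradient descent sketch (pure Python, no NumPy/PyTorch needed).
# Cost function: J(w) = (w - 3)^2, which has its minimum at w = 3.
# Its derivative: dJ/dw = 2 * (w - 3).

def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step opposite the gradient: w <- w - lr * grad(w)."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

# Minimize J starting from w = 0.0.
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(round(w_star, 4))  # converges toward the minimum at w = 3
```

In the assignments the same loop appears at scale: the scalar `w` becomes the parameter tensors of a neural network, and the hand-written derivative is replaced by automatic differentiation in PyTorch.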

Learning objectives

What is this course about?

Natural language processing (NLP), or computational linguistics, is one of the most important technologies of the information age. Applications of NLP are everywhere because people communicate almost everything in language: web search, advertising, emails, customer service, language translation, virtual agents, medical reports, politics, etc. In the last decade, deep learning (or neural network) approaches have obtained very high performance across many different NLP tasks, using single end-to-end neural models that do not require traditional, task-specific feature engineering. In this course, students will gain a thorough introduction to cutting-edge research in deep learning for NLP. Through lectures, assignments, and a final project, students will learn the necessary skills to design, implement, and understand their own neural network models, using the PyTorch framework.

“Take it. CS221 taught me algorithms. CS229 taught me math. CS224N taught me how to write machine learning models.” – A CS224N student on Carta

Textbooks and other notes

Reference Texts

The following texts are useful, but none are required. All of them can be read free online.

If you have no background in neural networks but would like to take the course anyway, you may well find one of these books helpful for additional background:

Other courses in Natural Language Processing

11-411/611 Natural Language Processing

Spring 2021

Carnegie Mellon University

CSE 447 and 517 Natural Language Processing

Winter 2022

University of Washington

CS 124: From Languages to Information

Winter 2023

Stanford University

COS 484: Natural Language Processing

Spring 2023

Princeton University

Courseware availability

Lecture slides and notes available at Schedule

Videos of Winter 2019 offering available on YouTube

Assignments available at Schedule

Course Materials available at Schedule

Covered concepts