Transformer (machine learning model)

The Transformer is a deep learning architecture built around the attention mechanism. Because it processes all positions of a sequence in parallel rather than step by step, it requires less training time than earlier recurrent architectures. It is commonly used to train large language models on large datasets and has been applied in fields such as natural language processing, computer vision, audio, and multimodal processing. The attention mechanism at its core was proposed in 2014 for machine translation; the Transformer architecture itself was introduced in the 2017 paper "Attention Is All You Need" and has led to pre-trained systems such as GPT and BERT.
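
To make the mechanism concrete, here is a minimal sketch of scaled dot-product attention, the core operation inside every Transformer layer. It uses plain NumPy; the function name, matrix shapes, and toy inputs are illustrative assumptions rather than code from any course listed below.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    # Similarity scores between each query and each key,
    # scaled by sqrt(d_k) to keep the softmax well-behaved.
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors.
    return weights @ V

# Toy example: 3 query positions attending over 4 key/value positions.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```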

8 courses cover this concept

CS 182/282A: Deep Neural Networks

UC Berkeley

Fall 2022

An advanced course on deep networks in computer vision, language technology, robotics, and control. It covers deep learning fundamentals, model families, and real-world applications. A strong mathematical background in calculus, linear algebra, probability, optimization, and statistical learning is necessary.

CS 230 Deep Learning

Stanford University

Fall 2022

An in-depth course focused on building neural networks and leading successful machine learning projects. It covers convolutional networks, RNNs, LSTMs, Adam, dropout, batch normalization, Xavier/He initialization, and more. Students are expected to have basic computer science skills, knowledge of probability theory, and familiarity with linear algebra.

CS 224N: Natural Language Processing with Deep Learning

Stanford University

Winter 2023

CS 224N provides an in-depth introduction to neural networks for NLP, focusing on end-to-end neural models. The course covers topics such as word vectors, recurrent neural networks, and transformer models.

CSE 490 G1 / 599 G1 Introduction to Deep Learning

University of Washington

Autumn 2019

A survey course on neural network implementation and applications, including image processing, classification, detection, and segmentation. The course also covers semantic understanding, translation, and question-answering applications. It is ideal for those with a background in machine learning, neural networks, optimization, and CNNs.

COS 484: Natural Language Processing

Princeton University

Spring 2023

This course introduces the basics of NLP, including recent deep learning approaches. It covers a wide range of topics, such as language modeling, text classification, machine translation, and question answering.

CS 271 / BIOMEDIN 220 Artificial Intelligence in Healthcare

Stanford University

Fall 2022-2023

This course focuses on AI applications in healthcare, exploring deep learning models for image, text, multimodal, and time-series data in the healthcare context. It also addresses AI integration challenges such as interpretability and privacy.

CS231n: Deep Learning for Computer Vision

Stanford University

Spring 2022

A deep dive into the details of deep learning architectures for visual recognition tasks. The course teaches students to implement and train their own neural networks and to understand state-of-the-art computer vision research. It requires Python proficiency and familiarity with calculus, linear algebra, probability, and statistics.

CSCI 1470/2470 Deep Learning

Brown University

Spring 2022

Brown University's Deep Learning course acquaints students with the transformative capabilities of deep neural networks in computer vision, NLP, and reinforcement learning. Using the TensorFlow framework, the course addresses topics such as CNNs, RNNs, deepfakes, and reinforcement learning, with an emphasis on ethical applications and potential societal impacts.
