Data Science: Transformers for Natural Language Processing

BERT, GPT, Deep Learning, Machine Learning, & NLP with Hugging Face, Attention in Python, TensorFlow, PyTorch, & Keras

Course Data

Lectures: 105
Length: 15h 10m
Skill Level: All Levels
Languages: English
Includes: Lifetime access

Course Description

Hello friends!

Welcome to Data Science: Transformers for Natural Language Processing.

Ever since Transformers arrived on the scene, deep learning hasn't been the same.

  • Machine learning is able to generate text essentially indistinguishable from that created by humans
  • We've reached new state-of-the-art performance in many NLP tasks, such as machine translation, question-answering, entailment, named entity recognition, and more
  • We've created multi-modal (text and image) models that can generate amazing art using only a text prompt
  • We've solved a longstanding problem in molecular biology known as "protein structure prediction"

In this course, you will learn very practical skills for applying transformers and, if you want, the detailed theory behind how transformers and attention work.

This is different from most other resources, which only cover the former.

The course is split into 3 major parts:

  1. Using Transformers
  2. Fine-Tuning Transformers
  3. Transformers In-Depth

PART 1: Using Transformers

In this section, you will learn how to use transformers that have already been trained for you. Training these models from scratch can cost millions of dollars, so it's not something you want to try by yourself!

We'll see how these prebuilt models can already be used for a wide array of tasks, including:
  • text classification (e.g. spam detection, sentiment analysis, document categorization)
  • named entity recognition
  • text summarization
  • machine translation
  • question-answering
  • generating (believable) text
  • masked language modeling (article spinning)
  • zero-shot classification

This is already very practical.

If you need to do sentiment analysis, document categorization, entity recognition, translation, summarization, etc. on documents at your workplace or for your clients, you already have powerful state-of-the-art models at your fingertips with very few lines of code.

One of the most amazing applications is "zero-shot classification", where you will observe that a pretrained model can categorize your documents into labels you choose, without any additional training on those labels.
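
To give a flavor of what "very few lines of code" can look like, here is a minimal sketch of zero-shot classification using the Hugging Face `pipeline` function. It is not taken from the course notebooks: the helper name `zero_shot_classify` and its optional `classifier` argument are assumptions made for illustration, and the default pipeline downloads whichever model the library currently selects for this task.

```python
def zero_shot_classify(text, candidate_labels, classifier=None):
    """Rank candidate labels for `text` without any task-specific training.

    `classifier` is a hypothetical hook added for this sketch so that a
    pipeline can be built once and reused across calls. If it is omitted,
    a default Hugging Face zero-shot pipeline is created, which downloads
    a pretrained model the first time it runs.
    """
    if classifier is None:
        # Imported lazily so the helper can be defined without the library.
        from transformers import pipeline
        classifier = pipeline("zero-shot-classification")

    # The pipeline returns a dict whose "labels" and "scores" lists are
    # sorted from most likely to least likely.
    result = classifier(text, candidate_labels=candidate_labels)
    return list(zip(result["labels"], result["scores"]))


# Example usage (requires `pip install transformers` plus a backend such as
# PyTorch, and an internet connection for the first model download):
#
#   ranked = zero_shot_classify(
#       "The new graphics card doubles the frame rate of its predecessor.",
#       ["technology", "cooking", "politics"],
#   )
#   print(ranked[0])  # the highest-scoring label and its score
```

Note that the labels are plain strings supplied at prediction time, which is what makes the approach "zero-shot": nothing is trained on your categories.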

PART 2: Fine-Tuning Transformers

In this section, you will learn how to improve the performance of transformers on your own custom datasets. By using "transfer learning", you can leverage the millions of dollars of training that have already gone into making transformers work very well.

You'll see that you can fine-tune a transformer with relatively little work (and little cost).

We'll cover how to fine-tune transformers for the most practical tasks in the real world, like text classification (sentiment analysis, spam detection), entity recognition, and machine translation.

PART 3: Transformers In-Depth

In this section, you will learn how transformers really work. The previous sections are nice, but a little too nice. Libraries are OK for people who just want to get the job done, but they don't work if you want to do anything new or interesting.

Let's be clear: this is very practical.

How practical, you might ask?

Well, this is where the big bucks are.

Those who have a deep understanding of these models and can do things no one has ever done before are in a position to command higher salaries and prestigious titles. Machine learning is a competitive field, and a deep understanding of how things work can be the edge you need to come out on top.

We'll also look at how to implement transformers from scratch.

As the great Richard Feynman once said, "What I cannot create, I do not understand."
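
As a small taste of the "from scratch" spirit, here is a rough sketch of scaled dot-product attention, the core operation inside a transformer, written in plain Python with no libraries. This is an illustration rather than the course's own implementation: the function names are assumptions, vectors are plain lists of floats, and a real implementation would use matrix operations in NumPy, TensorFlow, or PyTorch, plus masking and multiple heads.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V using lists of vectors.

    queries: list of query vectors, each of length d_k
    keys:    list of key vectors, each of length d_k
    values:  list of value vectors, one per key
    Returns one output vector per query.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        output = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(output)
    return outputs
```

When a query lines up closely with one key, nearly all of the weight goes to that key's value; when every key looks the same to the query, the output is simply the average of the values.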


Suggested Prerequisites

  • Decent Python coding skills
  • Deep learning with CNNs and RNNs (useful but not required)
  • Deep learning with Seq2Seq models (useful but not required)
  • For the in-depth section: understanding the theory behind CNNs, RNNs, and seq2seq is very useful


  • More fine-tuning applications
  • More in-depth conceptual lectures
  • Transformers implemented from scratch

Thank you for reading and I hope to see you soon!



2 Lectures · 13min

Getting Setup

3 Lectures · 15min
  1. Where to get the code and data - instant access (01:42)
  2. How to use Github & Extra Coding Tips (Optional) (08:56)
  3. Are You Beginner, Intermediate, or Advanced? All are OK! (05:01)

Beginner's Corner

19 Lectures · 02hr 58min
  1. Transformers Section Introduction (10:14)
  2. From RNNs to Attention and Transformers - Intuition (17:01)
  3. Sentiment Analysis (10:32)
  4. Sentiment Analysis in Python (17:00)
  5. Text Generation (10:47)
  6. Text Generation in Python (11:47)
  7. Masked Language Modeling (Article Spinner) (11:37)
  8. Masked Language Modeling (Article Spinner) in Python (08:26)
  9. Named Entity Recognition (NER) (04:53)
  10. Named Entity Recognition (NER) in Python (09:49)
  11. Text Summarization (05:15)
  12. Text Summarization in Python (07:00)
  13. Neural Machine Translation (06:18)
  14. Neural Machine Translation in Python (09:50)
  15. Question Answering (07:20)
  16. Question Answering in Python (06:14)
  17. Zero-Shot Classification (05:30)
  18. Zero-Shot Classification in Python (13:47)
  19. Transformers Section Summary (04:53)

Fine-Tuning (Intermediate)

14 Lectures · 02hr 27min
  1. Fine-Tuning Section Introduction (04:30)
  2. Text Preprocessing and Tokenization Review (13:35)
  3. Models and Tokenizers (15:22)
  4. Models and Tokenizers in Python (13:16)
  5. Transfer Learning & Fine-Tuning (pt 1) (09:29)
  6. Transfer Learning & Fine-Tuning (pt 2) (10:37)
  7. Transfer Learning & Fine-Tuning (pt 3) (10:08)
  8. Fine-Tuning Sentiment Analysis and the GLUE Benchmark (12:22)
  9. Fine-Tuning Sentiment Analysis in Python (19:36)
  10. Fine-Tuning Transformers with Custom Dataset (15:04)
  11. Hugging Face AutoConfig (05:45)
  12. Fine-Tuning with Multiple Inputs (Textual Entailment) (07:16)
  13. Fine-Tuning Transformers with Multiple Inputs in Python (07:36)
  14. Fine-Tuning Section Summary (03:13)

Named Entity Recognition (NER) and POS Tagging (Intermediate)

15 Lectures · 01hr 34min
  1. Token Classification Section Introduction (06:58)
  2. Data & Tokenizer (Code Preparation) (05:04)
  3. Data & Tokenizer (Code) (07:45)
  4. Target Alignment (Code Preparation) (09:57)
  5. Create Tokenized Dataset (Code Preparation) (03:46)
  6. Target Alignment (Code) (10:09)
  7. Data Collator (Code Preparation) (03:42)
  8. Data Collator (Code) (03:15)
  9. Metrics (Code Preparation) (06:47)
  10. Metrics (Code) (05:40)
  11. Model and Trainer (Code Preparation) (02:26)
  12. Model and Trainer (Code) (03:27)
  13. POS Tagging & Custom Datasets (Exercise Prompt) (05:18)
  14. POS Tagging & Custom Datasets (Solution) (18:16)
  15. Token Classification Section Summary (02:02)

Seq2Seq and Neural Machine Translation (Intermediate)

11 Lectures · 01hr 05min
  1. Translation Section Introduction (04:34)
  2. Data & Tokenizer (Code Preparation) (05:35)
  3. Data & Tokenizer (Code) (06:16)
  4. Aside: Seq2Seq Basics (Optional) (10:39)
  5. Model Inputs (Code Preparation) (08:15)
  6. Model Inputs (Code) (08:05)
  7. Translation Metrics (BLEU Score & BERT Score) (Code Preparation) (03:52)
  8. Translation Metrics (BLEU Score & BERT Score) (Code) (05:43)
  9. Train & Evaluate (Code Preparation) (04:34)
  10. Train & Evaluate (Code) (05:00)
  11. Translation Section Summary (02:39)

Question-Answering (Advanced)

18 Lectures · 02hr 34min
  1. Question-Answering Section Introduction (04:50)
  2. Exploring the Dataset (SQuAD) (04:20)
  3. Exploring the Dataset (SQuAD) in Python (05:05)
  4. Using the Tokenizer (08:30)
  5. Using the Tokenizer in Python (11:55)
  6. Aligning the Targets (14:53)
  7. Aligning the Targets in Python (16:15)
  8. Applying the Tokenizer (09:56)
  9. Applying the Tokenizer in Python (10:39)
  10. Question-Answering Metrics (03:46)
  11. Question-Answering Metrics in Python (02:41)
  12. From Logits to Answers (21:23)
  13. From Logits to Answers in Python (16:14)
  14. Computing Metrics (05:30)
  15. Computing Metrics in Python (05:49)
  16. Train and Evaluate (02:53)
  17. Train and Evaluate in Python (05:45)
  18. Question-Answering Section Summary (04:00)

Transformers and Attention Theory (Advanced)

9 Lectures · 01hr 08min
  1. Theory Section Introduction (05:06)
  2. Basic Self-Attention (09:35)
  3. Self-Attention & Scaled Dot-Product Attention (18:02)
  4. Attention Efficiency (04:36)
  5. Attention Mask (03:56)
  6. Multi-Head Attention (07:13)
  7. Transformer Block (06:45)
  8. Positional Encodings (07:16)
  9. Encoder Architecture (06:23)

Setting Up Your Environment (Appendix/FAQ by Student Request)

2 Lectures · 37min
  1. Windows-Focused Environment Setup 2018 (20:21)
  2. How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow (17:33)

Extra Help With Python Coding for Beginners (Appendix/FAQ by Student Request)

6 Lectures · 01hr 05min
  1. Beginner's Coding Tips (13:22)
  2. How to Code Yourself (part 1) (15:55)
  3. How to Code Yourself (part 2) (09:24)
  4. Proof that using Jupyter Notebook is the same as not using it (12:29)
  5. Python 2 vs Python 3 (04:38)
  6. Is Theano Dead? (10:04)

Effective Learning Strategies for Machine Learning (Appendix/FAQ by Student Request)

4 Lectures · 59min
  1. How to Succeed in this Course (Long Version) (10:25)
  2. Is this for Beginners or Experts? Academic or Practical? Fast or slow-paced? (22:05)
  3. What order should I take your courses in? (part 1) (11:19)
  4. What order should I take your courses in? (part 2) (16:07)

Appendix / FAQ Finale

2 Lectures · 08min
  1. What is the Appendix? (02:48)
  2. Where to get discount coupons and FREE deep learning material (05:31)