deep-learning 7
- Notes on PyTorch's Distributed Data Parallel (DDP)
- Different Transformers Attention Variants
- Distributed training technologies for Transformers: Overview
- Named Entity Recognition (NER) as Machine Reading Comprehension (MRC)
- Train BERT for Question Answering Task
- Abstractive Text Summarization with GPT2
- How to Improve YOLOv3