Parameter-Efficient Fine-Tuning Notes
An explanation of different Parameter-Efficient Fine-Tuning techniques.
A brief summary of PyTorch's implementation of Distributed Data Parallel (DDP).
A brief summary of different distributed training strategies used to train LLMs.
In this article, I will give you a brief overview of Named Entity Recognition (NER), its importance in information extraction, and a few approaches to performing NER, and at the end I will also show you how t...
Let us dive into BERT's architecture and the details of formulating the Question Answering NLP task for transformer models.
In this article, I will discuss an efficient abstractive text summarization approach using GPT-2 in PyTorch with the CNN/Daily Mail dataset.
YOLO has been a very popular and fast object detection algorithm, but unfortunately not the best-performing. In this article, I will highlight simple training heuristics and small architectural chan...
Essential Mechanics of Distributed Deep Learning (skrohit, 2025-06-13): Principles that every distributed training algorithm follows.
In this article, we will discuss different variants of self-attention that have been designed specifically to overcome the memory-bandwidth limitations of modern GPUs and improve transformer decoding...
A brief note on context parallelism in transformers.