PyTorch Memory Deep Dive: view, reshape, transpose, permute and the Contiguity Puzzle

A practical deep dive into how PyTorch tensors use storage, stride, views and contiguity.

Mar 20, 2026

Parameter Efficient Fine Tuning Notes

An explanation of different parameter-efficient fine-tuning techniques.

Jun 15, 2025

Notes on PyTorch's Distributed Data Parallel (DDP)

A brief summary of PyTorch's implementation of Distributed Data Parallel (DDP).

May 13, 2025

Zero Redundancy Optimizer (ZeRO): Paper Summary

A brief note on ZeRO's workings.

Sep 28, 2024

Distributed training technologies for Transformers: Overview

A brief summary of different distributed training strategies used to train LLMs.

Aug 30, 2024

Named Entity Recognition (NER) as Machine Reading Comprehension (MRC)

In this article, I will give you a brief overview of Named Entity Recognition (NER), its importance in information extraction, and a few approaches to performing NER. At the end, I will also show you how to implement NER as an MRC problem.

Aug 20, 2021

Train BERT for Question Answering Task

Let us dive into BERT's architecture and the details of formulating the Question Answering NLP task for transformer models.

Jan 14, 2021

Abstractive Text Summarization with GPT2

In this article I will discuss an efficient abstractive text summarization approach using GPT-2 on PyTorch with the CNN/Daily Mail dataset.

Aug 15, 2020

How to Improve YOLOv3

YOLO has been a very popular and fast object detection algorithm, but unfortunately not the best-performing one. In this article I will highlight simple training heuristics and small architectural changes that can make YOLOv3 perform better than models like Faster R-CNN and Mask R-CNN.

May 20, 2020