Parameter-Efficient Fine-Tuning Notes
An explanation of different Parameter-Efficient Fine-Tuning techniques.
A brief summary of PyTorch's implementation of Distributed Data Parallel (DDP).
A brief summary of different distributed training strategies used to train LLMs.
In this article, I will give you a brief overview of Named Entity Recognition (NER), its importance in information extraction, and a few approaches to performing NER, and at the end I will also show you how t...
Let us dive into BERT's architecture and the details of formulating the Question Answering NLP task for transformer models.
In this article, I will discuss an efficient abstractive text summarization approach using GPT-2 in PyTorch with the CNN/Daily Mail dataset.
YOLO has been a very popular and fast object detection algorithm, but unfortunately not the best-performing. In this article, I will highlight simple training heuristics and small architectural chan...
Essential Mechanics of Distributed Deep Learning (skrohit, 2025-06-13): Principles that every distributed training algorithm follows.
In this article, we will discuss different variants of self-attention that have been designed specifically to overcome the memory-bandwidth limitations of modern GPUs and improve transformer decoding...
A brief note on context parallelism in transformers.