Context Parallelism in Transformers: A Brief Overview | A brief note on context parallelism in transformers. | Sep 28, 2024 | Context Parallelism
Distributed Tensor (DTensor) in PyTorch: Overview | A brief note on DTensor's workings. | Sep 28, 2024 | DTensor, pytorch
Zero Redundancy Optimizer (ZeRO): Paper Summary | A brief note on ZeRO's workings. | Sep 28, 2024 | ZeRO Optimizer