this is a space to document and distill my thoughts on publications and articles within the research community. literature reviews are personal and messy; blog posts are intended for a wider audience (marked with a dot).

i write for my own edification, and posts are subject to my interest in the topic at the time of writing.

maximize gpu utilization

August 29, 2024

i read two posts [1, 2] which led to some notes, which led to this post.

---

Most people are resource-constrained, and even those who aren't still need to utilize their resources effectively. Optimizing deep learning systems is therefore crucial, especially as models grow larger and more complex. To do this effectively, we need to understand the kinds of constraints that our system may suffer under, and how these constraints interact with different aspects of model architecture and training pipelines. Typically, the accelerator of choice here is a GPU, and...


the bitter transformer lesson

July 24, 2024

Richard Sutton's The Bitter Lesson is a fantastic piece that I highly recommend you read. He posits that progress in AI research over the past 70 years fundamentally boils down to two principles:

- Develop general methods, discarding assumptions and attempts at modeling intelligence.
- Leverage computation by scaling data and compute resources.

This approach has proven successful across prominent ML fields, including computer vision, reinforcement learning, and speech recognition. The latest example is the astounding progress in NLP. As available compute increases at an extraordinary scale, leveraging it consistently...