Blog posts

2026

Reinforcement Learning: From Value-Iteration to Q-Learning to Deep-Q-Networks (DQNs)

10 minute read

Published: January 17, 2026

A walk through reinforcement learning algorithms in increasing order of complexity, from value iteration to Q-learning to Deep Q-Networks, and how they relate to each other.

2025

Notes about vectorization in CPUs

3 minute read

Published: December 13, 2025

Notes on vectorization and its connection to CPU clock rate.

Transferring data between GPUs in fully parallel fashion in PyTorch

10 minute read

Published: November 16, 2025

A summary of results from my experiments transferring data (PyTorch tensors) between GPUs in a fully parallel fashion, i.e. without blocking the host CPU or the sender and receiver GPUs.

Measuring CPU cache line size using Python code

7 minute read

Published: October 31, 2025

Estimating the cache line size of a CPU (an Apple M1 in this case) with a small experiment written in Python.

Understanding asynchronous computations in GPU in PyTorch

2 minute read

Published: October 11, 2025

Using a small experiment to understand how PyTorch (almost) always runs GPU computations asynchronously.

Notes about in-place operations on tensors in PyTorch

2 minute read

Published: August 05, 2025

Notes on some cases of in-place operations on tensors in PyTorch.

Two ways to think about matrix-vector multiplication

1 minute read

Published: July 10, 2025

Two different ways to think about matrix-vector multiplication, each useful in different contexts.

Thinking about Backpropagation and Computation Graph in PyTorch

4 minute read

Published: June 19, 2025

A guide to thinking about what happens when one calls tensor.backward() in PyTorch.

Shashank Katiyar
