Claire Zhao
Machine Learning
Mathematics
Low Precision LLM Pre-training with NVFP4
mixed-precision
quantization
engineering
MoE Kernels
Fast GPU Kernel for Mixture-of-Experts Training
infra
gpu kernels
engineering
Time Reversal SDE in Diffusion Models
Heuristic for reversing time in a diffusion process.
ml-theory
diffusion model
sde
The Fokker Planck Equation
Switching lenses between the SDE and operator views of the Fokker-Planck equation in diffusion models.
ml-theory
diffusion model
old-blog
Optimal Transportation and Diffusion Models
Switching lenses between the SDE and operator views of the Fokker-Planck equation in diffusion models.
ml-theory
diffusion model
old-blog
Matrix Calculus
Matrix derivative, Laplacian, polar body, convexity theorems
math
Operator Identities
Concerning extreme eigenvalues of some linear operators between Euclidean spaces.
math
Primal Dual Langevin Monte Carlo Algorithm
ml-theory
optimization
old-blog
Claire Zhao © 2026