Research notes about AGI

Essays and field notes about engineering decisions, debugging, product tradeoffs, and the systems behind shipping software.

Mar 17, 2026

A detailed reading note on Kimi K2.5, native multimodal training, and parallel agent orchestration.

Multi-Modal Understanding, Training, Model Architecture

Mar 14, 2026

A short framework for reading cross-attention design choices in modern vision-language model papers.

Model Architecture

Mar 10, 2026

Notes on reducing visual token load while keeping cross-modal reasoning stable in large VLMs.

Multi-Modal Understanding

Mar 6, 2026

A working summary of the training choices that most affect stability, convergence, and downstream transfer.

Training

Feb 28, 2026

A concise look at why unified latent representations keep appearing in modern image, video, and audio generation systems.

Multi-Modal Generation