Research notes about AGI
This site collects reading notes, paper breakdowns, and working hypotheses on multi-modal understanding, generation, training systems, and model design.
What this archive covers
The focus is current research on vision-language models and adjacent systems: how they are trained, how they fuse modalities, and where their design tradeoffs show up.
Topics
Multi-modal understanding, multi-modal generation, training recipes, cross-attention design, token compression, and alignment strategies.
Approach
Short summaries, architecture-first reading, and practical comparisons between methods, rather than generic commentary.