Research notes about AGI
This site collects reading notes, paper breakdowns, and working hypotheses on multi-modal understanding, generation, training systems, and model design.
What this archive covers
The focus is current research on vision-language models and adjacent systems: how they are trained, how they fuse modalities, and where their design tradeoffs show up.
Topics
Multi-modal understanding, multi-modal generation, training recipes, cross-attention design, token compression, and alignment strategies.
Approach
Short summaries, architecture-first reading, and practical comparisons between methods, rather than generic commentary.