Research Notes

Blog

Notes on frontier AI safety, cybersecurity agents, benchmark design, and trustworthy evaluation.

Latest Posts

Building Trustworthy Cyber Agent Evaluations

A short opening note on why realistic cyber-agent evaluation needs end-to-end ranges, auditable harnesses, and attention to evaluation awareness.