Evaluation framework for AI agents on multi-participant coordination tasks. Built with LangGraph and custom MCP tools, and scored with LLM-as-a-Judge evaluation. MSc dissertation project (University of Edinburgh, 2025).