AI researcher building LLMs for game theory, strategic reasoning, and structured outputs. Fine-tuning Qwen models with GRPO, LoRA & RLVR.
A full research stack: benchmark dataset → fine-tuned solver → GRPO-trained reasoner → formulator → live demos
The first comprehensive, RLVR-ready game theory dataset for LLM training and evaluation. 2,913 computationally verified problems spanning 10 categories and 35+ subcategories — each with a natural-language statement, step-by-step solution, concise answer, and machine-checkable verification object.
Fine-tuned and GRPO-trained language models on the Hugging Face Hub
Qwen2.5-7B fine-tuned with reinforcement learning (GRPO) for game theory reasoning. Trained on GameTheory-Bench to compute Nash equilibria, identify dominant strategies, and solve multi-step strategic problems.
Qwen2.5-7B with supervised fine-tuning on the GameTheory-Bench dataset, for solving game theory problems such as Nash equilibria and economic games.
Specialised model for converting natural-language strategic scenarios into formal game-theoretic representations — the formulation step in the full pipeline.
LoRA adapter on Qwen2.5-3B-Instruct fine-tuned for reliable structured JSON output generation — useful for tool-use and agentic pipelines.
Ultra-compact Qwen2-0.5B fine-tuned for function calling and calculator tool use. Explores how reliably small models can perform structured tool invocations.
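To make "structured tool invocation" concrete, here is a minimal sketch of the consumer side of a calculator tool call: parse the model's JSON, validate it, and dispatch. The call schema (`name`, `arguments`, `op`, `a`, `b`) and the function `run_tool_call` are assumptions made for illustration, not the format these models were actually trained on.

```python
import json
import operator

# Hypothetical set of supported calculator operations.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}


def run_tool_call(raw: str) -> dict:
    """Parse a model-emitted tool call and execute it, reporting errors as data."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"invalid JSON: {exc.msg}"}

    # Assumed schema: {"name": "calculator", "arguments": {"op": ..., "a": ..., "b": ...}}
    if not isinstance(call, dict) or call.get("name") != "calculator":
        return {"ok": False, "error": "unknown or missing tool name"}

    args = call.get("arguments")
    if not isinstance(args, dict):
        return {"ok": False, "error": "arguments must be an object"}

    op = args.get("op")
    if not isinstance(op, str) or op not in OPS:
        return {"ok": False, "error": "unsupported operation"}

    a, b = args.get("a"), args.get("b")
    for value in (a, b):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return {"ok": False, "error": "operands must be numbers"}

    if op == "div" and b == 0:
        return {"ok": False, "error": "division by zero"}

    return {"ok": True, "result": OPS[op](a, b)}
```

Returning errors as values rather than raising keeps a malformed call from crashing an agent loop, and the share of `ok` responses over a test set gives one simple reliability measure for a small model.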
Open datasets published for the research community
2,913 computationally verified game theory problems spanning 10 categories and 35+ subcategories. The first RLVR-ready game theory benchmark for LLMs, with machine-checkable verification objects.
Original business-strategy scenario dataset. Superseded by the broader GameTheory-Bench collection, which covers 6 domains including 220+ business scenarios with full formulation steps.
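As an illustration of what "machine-checkable" can mean for a game theory answer, the sketch below recomputes the pure-strategy Nash equilibria of a two-player game and turns the comparison into a binary reward of the kind RLVR training relies on. The payoff-matrix layout and the names `pure_nash_equilibria` and `verifiable_reward` are assumptions for this example; the dataset's actual verification-object schema is not shown here.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def pure_nash_equilibria(row_payoffs: Matrix, col_payoffs: Matrix) -> List[Tuple[int, int]]:
    """Return every pure-strategy Nash equilibrium (row, column) of a bimatrix game."""
    n_rows, n_cols = len(row_payoffs), len(row_payoffs[0])
    equilibria = []
    for i in range(n_rows):
        for j in range(n_cols):
            # Row player cannot gain by switching rows while column j is fixed.
            row_best = all(row_payoffs[i][j] >= row_payoffs[k][j] for k in range(n_rows))
            # Column player cannot gain by switching columns while row i is fixed.
            col_best = all(col_payoffs[i][j] >= col_payoffs[i][m] for m in range(n_cols))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria


def verifiable_reward(
    predicted: Sequence[Tuple[int, int]], row_payoffs: Matrix, col_payoffs: Matrix
) -> float:
    """Binary reward: 1.0 only if the predicted equilibrium set matches exactly."""
    expected = set(pure_nash_equilibria(row_payoffs, col_payoffs))
    return 1.0 if set(predicted) == expected else 0.0


# Prisoner's dilemma with strategy 0 = cooperate and 1 = defect.
PD_ROW = [[-1, -3], [0, -2]]
PD_COL = [[-1, 0], [-3, -2]]
```

For the prisoner's dilemma above, `pure_nash_equilibria(PD_ROW, PD_COL)` yields `[(1, 1)]` (mutual defection), so a model answer of `[(1, 1)]` earns reward 1.0 and any other answer earns 0.0. Mixed-strategy equilibria would need a separate check and are outside this sketch.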
Interactive demos and live applications
Interactive demo for the GameTheory-Solver model. Enter a game theory problem and watch the model find Nash equilibria and optimal strategies step by step.
Conversational interface for the GameTheory models. Ask strategic questions, explore dominant strategies, and reason through multi-player games interactively.
Activation visualisation and interpretability tool for transformer models. Explore attention patterns, hidden states, and mechanistic interpretability of Qwen models.
Research blog documenting the methodology, experiments, and findings from training LLMs on game theory — covering GRPO, RLVR, and fine-tuning insights.
AI medical reasoning application exploring LLM capabilities in clinical and diagnostic reasoning tasks.