MIRAI: Evaluating LLM Agents for Event Forecasting
We introduce MIRAI, a benchmark crafted for evaluating LLM agents in temporal forecasting of international events with tool use and complex reasoning. With 59,161 unique events and 296,630 unique news articles, we curate a test set of 705 forecasting query-answer pairs.
Jul 1, 2024