GEO-Bench-2: A Capability-Aware Benchmark for Geospatial Foundation Models

Naomi Simumba¹†, Nils Lehmann²,⁹†, Paolo Fraccaro¹†‡, Hamed Alemohammad³, Geeth De Mel¹, Salman Khan⁵, Manil Maskey⁶, Nicolas Longepe⁷, Xiao Xiang Zhu²,⁹, Hannah Kerner⁴, Juan Bernabe-Moreno¹, Alexandre Lacoste⁸‡

¹ IBM Research Europe; ² Technical University Munich; ³ Clark University; ⁴ Arizona State University;
⁵ MBZUAI; ⁶ NASA IMPACT; ⁷ ESA Φ-lab; ⁸ ServiceNow AI Research; ⁹ Munich Center for Machine Learning (MCML)
† Equal contribution; ‡ Corresponding author

GEO-Bench-2 is a large-scale, capability-aware benchmark that evaluates Geospatial Foundation Models by fine-tuning them across diverse sensing modalities, temporal contexts, and downstream applications. It emphasizes open licensing, reproducibility, and capability-specific evaluation, so that the community can measure progress in perception, reasoning, and generalization within the geospatial domain.

For zero-shot evaluation of Geospatial Vision-Language Models, please refer to our complementary benchmark GEO-Bench-VLM.