kerner-lab

Code repository for EarthShift: Benchmarking the robustness of geospatial foundation models (GFMs) to realistic distribution shifts in Earth Observation

100% credibility
Found May 30, 2026 at 19 stars.
AI Analysis
Python
AI Summary

EarthShift is a benchmarking testbed for evaluating the robustness and generalization capabilities of geospatial foundation models across various out-of-distribution scenarios including geographic, temporal, sensor, data source, and spatial scale shifts.

AI-Generated Review

What is earthshift?

EarthShift is a Python testbed that benchmarks how well geospatial AI models handle real-world deployment scenarios. It targets a problem that has received little attention: satellite and remote sensing models lose performance when they encounter data from new geographies, time periods, sensors, data sources, or spatial scales. The pipeline runs standardized experiments that compare in-distribution and out-of-distribution performance across classification and segmentation tasks. You pick a model and a shift type (sensor, temporal, geographic, scale, or data source), and it quantifies how much the model's accuracy drops.
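The in-distribution versus out-of-distribution comparison described above can be illustrated with a minimal sketch. This is not EarthShift's actual API: the names `SHIFT_TYPES`, `ShiftResult`, `accuracy`, and `evaluate_shift` are hypothetical, and the predictions and labels are toy values chosen only to show how an accuracy drop might be computed for one shift type.

```python
from dataclasses import dataclass

# Shift types named in the review; the identifiers themselves are assumptions.
SHIFT_TYPES = ("sensor", "temporal", "geographic", "scale", "data_source")


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


@dataclass
class ShiftResult:
    """Hypothetical container for one ID/OOD comparison."""
    shift_type: str
    id_accuracy: float
    ood_accuracy: float

    @property
    def absolute_drop(self):
        # Accuracy lost when moving from ID to OOD data.
        return self.id_accuracy - self.ood_accuracy

    @property
    def relative_drop(self):
        # Drop expressed as a fraction of the ID accuracy.
        if self.id_accuracy == 0:
            return 0.0
        return self.absolute_drop / self.id_accuracy


def evaluate_shift(shift_type, id_preds, id_labels, ood_preds, ood_labels):
    """Score the same model on matched ID and OOD splits."""
    if shift_type not in SHIFT_TYPES:
        raise ValueError(f"unknown shift type: {shift_type}")
    return ShiftResult(
        shift_type,
        accuracy(id_preds, id_labels),
        accuracy(ood_preds, ood_labels),
    )


# Toy example: 4 of 5 correct in-distribution, 2 of 5 correct out-of-distribution.
result = evaluate_shift(
    "geographic",
    id_preds=[0, 1, 1, 0, 1], id_labels=[0, 1, 1, 0, 0],
    ood_preds=[0, 0, 1, 0, 1], ood_labels=[0, 1, 1, 1, 0],
)
print(result.shift_type, result.id_accuracy, result.ood_accuracy)
```

Reporting both the absolute and the relative drop is a design choice in this sketch, since a percentage gap can be read either way.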

Why is it gaining traction?

The headline finding is a 20% performance gap across every model the authors tested, regardless of size or pretraining approach. GFM robustness turns out to be roughly equal to that of plain ImageNet-pretrained models, which matters for anyone building satellite-based products. The hook is that this is the first public benchmark specifically designed for out-of-distribution testing in Earth observation, addressing the gap between high benchmark scores and real deployment reliability. Researchers get paired datasets with matched in-distribution and out-of-distribution splits, and developers get CLI commands to run experiments without building their own evaluation infrastructure.

Who should use this?

ML researchers evaluating foundation models for satellite applications. Teams building crop monitoring, flood detection, or land-use classification systems who need to know if their model will survive deployment. Practitioners tired of discovering distribution shift problems only after production. Dataset maintainers who want standardized robustness metrics for their geospatial benchmarks.

Verdict

With just 19 stars, EarthShift is clearly early-stage academic work rather than production-ready tooling. The 1.0% credibility score reflects this novelty, not a flaw in methodology. Documentation is available through the arXiv paper, the conda environment covers dependencies comprehensively, and the included sub-project has Docker support. If you're researching geospatial model robustness, this is worth exploring now. If you need battle-tested infrastructure for deployment, wait for community adoption to mature.
