The organizations
The National Oceanic and Atmospheric Administration (NOAA) is a leading U.S. science agency within the Department of Commerce. Its mission is to understand and predict changes in climate, weather, oceans, and coasts—and to use that knowledge to protect communities and natural resources. Predicting space weather, including geomagnetic storms, is a critical part of that mission.
DrivenData partnered directly with two key NOAA-affiliated organizations:
- The National Centers for Environmental Information (NCEI), which maintains one of the world’s largest environmental data archives and develops tools for scientific data discovery and use.
- The Cooperative Institute for Research in Environmental Sciences (CIRES), a joint institute of NOAA and the University of Colorado Boulder, which brings together over 800 researchers focused on advancing Earth and space science.
The challenge
Geomagnetic storms—caused by solar activity—can disrupt power grids, satellite operations, GPS, and communications. To reduce those risks, forecasters rely on the Disturbance Storm-time Index (Dst), a key measure of geomagnetic activity.
NOAA set out to improve real-time Dst forecasting by inviting the global data science community to help build better models. The goal was to develop accurate, operationally viable models that forecast Dst for the current and following hour, using real-time solar-wind data from spacecraft such as ACE and DSCOVR.
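To make the forecasting task concrete, here is a minimal sketch of how hourly training examples might be framed: for each hour t, the features summarize a preceding window of solar-wind measurements, and the two targets are Dst at hour t and at hour t+1. The field names (`bz`, `speed`), the 3-hour window, and the mean aggregation are illustrative assumptions, not the competition's actual data schema or any winner's feature set.

```python
from statistics import mean


def build_examples(solar_wind, dst, window=3):
    """Pair each hour t with features from the previous `window` hours
    and with targets (Dst at t, Dst at t+1).

    solar_wind: list of dicts, one per hour (hypothetical fields).
    dst: list of hourly Dst values in nT, aligned with solar_wind.
    """
    examples = []
    # Each example needs `window` hours of history and one hour after t.
    for t in range(window, len(dst) - 1):
        history = solar_wind[t - window:t]  # hours t-window .. t-1 only
        features = {
            "bz_mean": mean(h["bz"] for h in history),
            "speed_mean": mean(h["speed"] for h in history),
        }
        targets = (dst[t], dst[t + 1])  # current hour and following hour
        examples.append((features, targets))
    return examples


# Toy values for illustration only (not real measurements).
solar_wind = [
    {"bz": -2.0, "speed": 400.0},
    {"bz": -4.0, "speed": 420.0},
    {"bz": -6.0, "speed": 440.0},
    {"bz": -8.0, "speed": 460.0},
    {"bz": -5.0, "speed": 450.0},
]
dst = [-10.0, -15.0, -25.0, -40.0, -35.0]

examples = build_examples(solar_wind, dst)
```

Restricting the features to hours strictly before t keeps the sketch free of look-ahead, which matters for any model meant to run in real time.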
The approach
DrivenData partnered with NOAA, NCEI, and CIRES to design and run the MagNet: Model the Geomagnetic Field data science competition. Our work included:
- Problem framing and dataset preparation: We worked closely with NOAA scientists to define the forecasting task, identify operationally relevant evaluation metrics, and prepare a high-quality dataset from real-time solar-wind data.
- Global competition management: We hosted the challenge on the DrivenData platform, providing clear instructions, a real-time leaderboard, and community support for more than 600 participating data scientists and researchers.
- Operational realism: We designed a custom code execution environment to simulate real-world constraints, including data latency and resource limitations, ensuring that models were not only accurate but also practical for deployment. This infrastructure allowed participants to submit fully executable code, which was evaluated in conditions that mirrored NOAA’s operational forecasting environment.
- Evaluation and verification: We implemented a rigorous, multi-stage evaluation process, and NOAA independently validated the top-performing models on recent, unseen data to confirm real-world applicability.
Comparison of real-time Disturbance Storm-time (Dst) index values with predictions from an operational machine learning model ("Dst CNN") developed through the MagNet data science competition. The model, based on the second-place winning solution, runs in real time using solar wind inputs and is now publicly available via NOAA and CIRES.
The results
- High-performing forecasting models: More than 600 participants made over 1,200 submissions. The winning models, which used techniques such as LSTMs, GRUs, CNNs, and LightGBM, significantly outperformed existing benchmarks for forecasting the Dst index. An ensemble of the top solutions advanced the state of the art on unseen data, reducing error by 30% compared with the NCEI benchmark model.
- Operational adoption and public release: Following the challenge, NOAA adopted the top-performing solutions and integrated them into operational forecasting systems. NOAA and NCEI researchers partnered with one of the winners to productionize an ensemble of the two best models and integrated it into NOAA’s High Definition Geomagnetic Model (HDGM). Real-time predictions are now publicly available, advancing national capabilities for space weather prediction.
- Scientific recognition: The competition and its results were published in Space Weather, a journal of the American Geophysical Union. The paper was among the top 10% most-read in 2023, highlighting the project's impact on the research community.
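As a reading aid for the error-reduction figure reported in the results above, the sketch below shows how a relative reduction could be computed from two models' errors. It assumes root-mean-square error (RMSE) as the error measure, which the text here does not specify, and every number in it is made up for illustration.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(benchmark_error, model_error):
    """Fraction by which model_error is lower than benchmark_error."""
    return (benchmark_error - model_error) / benchmark_error


# Made-up hourly Dst values in nT, for illustration only.
observed = [-20.0, -30.0, -50.0, -40.0]
benchmark = [-10.0, -20.0, -40.0, -30.0]  # off by 10 nT every hour
candidate = [-13.0, -37.0, -43.0, -47.0]  # off by 7 nT every hour

benchmark_rmse = rmse(benchmark, observed)
candidate_rmse = rmse(candidate, observed)
reduction = relative_reduction(benchmark_rmse, candidate_rmse)
```

With these toy values the candidate's error is 7 nT against the benchmark's 10 nT, so `reduction` comes out at 0.3, which is the kind of comparison a "30% lower error" statement summarizes.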