Objectives
- Build a Collaborative Ocean-AI Community
- Establish Scientifically Rigorous Pathways for AI in Oceanography
- Enable Future NASA Ocean Missions Through AI Innovation
Program highlights
- Lectures and tutorials on ocean observing systems, remote sensing, and AI
- Short research talks from participants and mentors
- Team-based hackathons using ML-ready datasets
- Project- and proposal-driven hackathons that promote joint proposal writing during the workshop
- Development and mentoring for early-career researchers (undergraduate students, graduate students, and postdocs)
- Interaction with funding agencies and industry leaders
Our mission is to enhance and amplify NASA oceanography efforts by leveraging modern data science, including machine learning and artificial intelligence.
This working group serves as a conduit to bridge disciplinary gaps, foster collaboration on existing projects, and initiate new endeavors.
Our objectives
- Advance the development of state-of-the-art machine learning tools designed for the analysis and synthesis of NASA’s oceanography mission data.
- Transform ocean data analysis to address the big data challenges of modern remote sensing.
- Facilitate integration of multi-mission datasets, enabling real-time applications and supporting in-depth scientific exploration of causal relationships and future mission formulation.
What Is Our Approach?
Our work will enable multi-mission data synthesis, causal investigations, scientific applications, and future mission formulation. It will also aid forecasting and real-time applications.
Our Roadmap
1. Data Foundations: High-Quality, Multi-Modal Training Data
High-quality training data form the backbone of any Ocean–AI system. This pillar focuses on constructing integrated, multi-modal datasets that combine satellite observations (e.g., SWOT, PACE), high-resolution numerical simulations (such as the 1/48-degree MITgcm LLC4320 ocean simulation), and reanalysis products (such as ECCO) into a unified framework. Each data source contributes complementary strengths: satellites provide global coverage with realistic sampling and noise characteristics; simulations resolve fine-scale dynamics and offer physically consistent fields; reanalyses bridge observations and models to produce dynamically coherent, multi-variable estimates of the ocean state.
The central challenge is not simply aggregation, but harmonization. This includes aligning spatial and temporal resolutions, reconciling variable definitions, preserving cross-variable relationships, and explicitly characterizing uncertainties such as instrument noise, sampling gaps, and model biases. Equally important is maintaining the multi-scale structure of ocean variability, from submesoscale turbulence to basin-scale circulation.
By building such curated datasets, this pillar establishes a data-centric representation of the ocean system that is both physically meaningful and machine-learning ready. These datasets enable robust training of downstream models and ensure that learned representations reflect true ocean dynamics rather than artifacts of individual data sources.
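As a rough illustration of one harmonization step, the sketch below block-averages a fine-resolution field onto a coarser grid while ignoring missing values, and records how much of each coarse cell was actually observed. The function name `coarsen_nanmean`, the toy array, and the integer coarsening factor are hypothetical choices made for this example; real satellite and model products would typically need proper regridding on their native (often curvilinear) grids.

```python
import numpy as np

def coarsen_nanmean(field, factor):
    """Block-average a 2-D field by an integer factor, ignoring NaN gaps.

    Returns the coarse field and the fraction of valid fine cells per block,
    which can serve as a simple per-cell sampling-uncertainty indicator.
    """
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("field shape must be divisible by factor")
    # Axes 1 and 3 index positions inside each (factor x factor) block.
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    valid = ~np.isnan(blocks)
    count = valid.sum(axis=(1, 3))
    total = np.where(valid, blocks, 0.0).sum(axis=(1, 3))
    # Blocks with no valid data stay NaN instead of dividing by zero.
    coarse = np.full(count.shape, np.nan)
    np.divide(total, count, out=coarse, where=count > 0)
    return coarse, count / factor**2

# Toy fine-resolution field with one missing value (hypothetical data).
fine = np.arange(16, dtype=float).reshape(4, 4)
fine[0, 0] = np.nan
coarse, coverage = coarsen_nanmean(fine, 2)
```

Keeping the coverage fraction alongside the averaged field is one simple way to carry sampling gaps forward explicitly rather than hiding them in the harmonized product.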
2. Data Refinement Models: Task-Specific Foundation Models
Task-specific foundation models serve as the data refinement layer of the Ocean–AI ecosystem, focusing on improving the quality, completeness, and interpretability of ocean observations. These models are trained to perform targeted operations such as gap-filling, denoising, bias correction, and feature extraction, addressing the inherent limitations of observational systems.
Ocean data are often sparse, noisy, and irregularly sampled. For example, satellite swaths contain missing regions, radiometric measurements include instrument noise, and retrieval algorithms introduce systematic biases. Task-specific models leverage spatiotemporal structure and multi-variable relationships to reconstruct missing information, suppress noise while preserving signals, and correct systematic errors. Importantly, these models can operate in both space and time, enabling reconstruction of evolving ocean fields rather than static snapshots.
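To make the gap-filling task concrete, here is a minimal sketch of how such a reconstruction might be set up and scored: a known field is masked to mimic a missing swath region, the gap is filled, and the error is measured only over the withheld points. The fill itself is a simple non-learned baseline (iterative averaging of the four nearest neighbours), standing in for a trained model; the synthetic field, gap location, and function name `fill_gaps` are assumptions for illustration only.

```python
import numpy as np

def fill_gaps(field, n_iter=200):
    """Fill NaN gaps by repeatedly averaging the four nearest neighbours.

    Observed values are never modified; only gap cells are updated.
    """
    gap = np.isnan(field)
    # Start from the mean of the observed values inside the gaps.
    filled = np.where(gap, np.nanmean(field), field)
    for _ in range(n_iter):
        padded = np.pad(filled, 1, mode="edge")
        neighbour_mean = 0.25 * (padded[:-2, 1:-1] + padded[2:, 1:-1]
                                 + padded[1:-1, :-2] + padded[1:-1, 2:])
        filled[gap] = neighbour_mean[gap]
    return filled

# Synthetic smooth "truth" field and a simulated rectangular data gap.
y, x = np.mgrid[0:32, 0:32]
truth = np.sin(2 * np.pi * x / 32) + 0.5 * np.cos(2 * np.pi * y / 32)
observed = truth.copy()
observed[12:18, 10:20] = np.nan

reconstructed = fill_gaps(observed)

# Score only over the withheld points, against a constant-mean fill.
gap = np.isnan(observed)
rmse = np.sqrt(np.mean((reconstructed[gap] - truth[gap]) ** 2))
rmse_mean_fill = np.sqrt(np.mean((np.nanmean(observed) - truth[gap]) ** 2))
```

The same withhold-and-score pattern could be used to compare a learned gap-filling model against simple baselines like this one.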
Beyond data cleaning, these models enable scientific interpretation through classification and segmentation tasks. They can identify dynamically meaningful structures such as mesoscale eddies, fronts, filaments, and internal waves, translating raw data into physically interpretable features.
By transforming raw observations into analysis-ready, self-consistent datasets, this pillar creates a critical bridge between measurements and scientific applications. It ensures that downstream models and analyses are built on reliable, high-fidelity representations of the ocean state.
3. Ocean Foundation Models: Physics-Constrained, Multi-Variable Learning
Ocean foundation models represent the unifying layer of the Ocean–AI strategy, designed to learn the coupled dynamics of the ocean system from multi-variable data while incorporating physical constraints. Unlike task-specific models, these models aim to develop generalizable representations that capture relationships across variables (e.g., SSH, SST, salinity, currents) and across scales.
A key requirement is the integration of physical knowledge into learning. This may include enforcing conservation laws, embedding geophysical balances (e.g., geostrophy), or constraining model outputs using known dynamical relationships. By combining data-driven learning with physics, these models aim to retain the flexibility of data-driven methods while remaining physically interpretable.
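One way a geophysical balance such as geostrophy could enter training is as a soft penalty on predicted surface currents. The sketch below computes geostrophic velocities from sea surface height using the standard f-plane relations u = -(g/f) ∂η/∂y and v = (g/f) ∂η/∂x, then measures how far a predicted velocity field departs from them. The function names, the uniform grid spacing, and the single reference latitude are simplifying assumptions for illustration; the relations are not valid near the equator, where f approaches zero.

```python
import numpy as np

G = 9.81            # gravitational acceleration, m s^-2
OMEGA = 7.2921e-5   # Earth's rotation rate, rad s^-1

def geostrophic_velocity(ssh, dx, dy, lat_deg):
    """Geostrophic (u, v) from SSH on a uniform grid, f-plane approximation.

    ssh has shape (ny, nx) in metres; dx and dy are grid spacings in metres.
    """
    f = 2.0 * OMEGA * np.sin(np.deg2rad(lat_deg))  # Coriolis parameter
    deta_dy, deta_dx = np.gradient(ssh, dy, dx)    # axis 0 is y, axis 1 is x
    return -G / f * deta_dy, G / f * deta_dx

def geostrophic_penalty(u_pred, v_pred, ssh, dx, dy, lat_deg):
    """Mean squared departure of predicted currents from geostrophic balance."""
    u_g, v_g = geostrophic_velocity(ssh, dx, dy, lat_deg)
    return np.mean((u_pred - u_g) ** 2 + (v_pred - v_g) ** 2)

# Toy case: SSH rising 1 m per 1000 km eastward at 30 degrees N.
dx = dy = 10_000.0
yy, xx = np.mgrid[0:20, 0:30] * dx
ssh = 1e-6 * xx
u_g, v_g = geostrophic_velocity(ssh, dx, dy, lat_deg=30.0)
penalty = geostrophic_penalty(u_g + 0.1, v_g, ssh, dx, dy, lat_deg=30.0)
```

In a learning setting, a term like `geostrophic_penalty` would typically be added to the data-fitting loss with a tunable weight, so that balanced flow is encouraged without forbidding ageostrophic motion.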
Such models enable transformative capabilities. In data assimilation, they can learn nonlinear, flow-dependent relationships that improve state estimation beyond traditional methods. In dynamical studies, they provide new tools to separate and analyze processes such as wave–eddy interactions and submesoscale variability. They can also inform future flight mission design by identifying optimal sampling strategies and observational requirements. Finally, they support forecasting, offering data-driven predictions across multiple time scales.
Together, ocean foundation models act as integrated engines of ocean intelligence, bridging observations, models, and theory to advance both scientific understanding and predictive capability.