The main architectural change compared to WeatherMesh-1 is the use of neighborhood attention instead of Swin transformer blocks.
I have a draft implementation of the key parts of the architecture, based on the blog, at Brayden-Zhang/WeatherMesh, although I'm not fully confident in its correctness.
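For reference, here is a minimal 1-D sketch of the neighborhood-attention idea (each query attends only to keys within a fixed radius, rather than Swin's shifted windows). This is my own toy illustration in NumPy, not the authors' implementation; the function name and the single-head, unbatched shapes are assumptions for clarity.

```python
import numpy as np

def neighborhood_attention_1d(q, k, v, radius=1):
    """Toy 1-D neighborhood attention (hypothetical sketch):
    each query position i attends only to keys within `radius`
    positions of i, instead of a global or Swin-style window.
    q, k, v: (n, d) arrays for a single head, no batching."""
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        scores = q[i] @ k[lo:hi].T / np.sqrt(d)   # local attention logits
        w = np.exp(scores - scores.max())          # stable softmax
        w /= w.sum()
        out[i] = w @ v[lo:hi]                      # weighted local values
    return out
```

With `radius` at least the sequence length, this reduces to ordinary full attention, which is a useful sanity check; the real model applies the 2-D/3-D analogue over the latent grid.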
Arxiv/Blog/Paper Link
https://windbornesystems.com/blog/weathermesh-2-technical-blog
Detailed Description
An interesting model: trained entirely on RTX 4090s (per the blog) at only 180M parameters, yet with strong forecasting performance and the ability to swap different encoders, decoders, and processors together. Rollouts extend out to six days during training. Most processing happens in the latent space with pure transformer blocks, so there is no decode/encode step at each rollout step, which reduces accumulated error.
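The latent-rollout structure can be sketched as follows. This is a hedged illustration of the idea described above, not the authors' code: encode once, step the processor repeatedly in latent space, and decode only when a forecast is needed. The function names (`encode`, `processor`, `decode`) are placeholders I chose.

```python
import numpy as np

def latent_rollout(encode, processor, decode, x0, n_steps):
    """Sketch of a latent-space autoregressive rollout (assumed names):
    encode the initial state once, advance entirely in latent space,
    and decode per lead time -- no decode/encode round trip each step."""
    z = encode(x0)                 # single encode into latent space
    forecasts = []
    for _ in range(n_steps):
        z = processor(z)           # one latent time step (e.g. a transformer block stack)
        forecasts.append(decode(z))  # decode only to produce the forecast
    return forecasts
```

Keeping the state in latent space across steps is what avoids repeatedly paying the encoder/decoder's approximation error at every step.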
U-Net encoding into the latent space, with separate encoders for surface variables and pressure-level variables. The full latent space is ordered from low pressure to high pressure, with the bottom layer holding the surface variables.
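A minimal sketch of that encoder layout, assuming (hypothetically) that each encoder maps its input to a per-level latent grid of the same spatial shape and the latents are stacked along a level axis with the surface latent as the bottom (last) layer:

```python
import numpy as np

def encode_state(surface, pressure_levels, enc_sfc, enc_pl):
    """Sketch (assumed interface): separate encoders for surface and
    pressure-level variables, latents stacked low -> high pressure with
    the surface latent as the bottom layer."""
    z_pl = [enc_pl(x) for x in pressure_levels]  # ordered low -> high pressure
    z_sfc = enc_sfc(surface)                     # surface variables, own encoder
    return np.stack(z_pl + [z_sfc], axis=0)      # (n_levels + 1, H, W, ...)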
Uses CPU checkpointing, offloading activations to CPU memory during long rollouts in training, so GPU memory doesn't grow with rollout length.
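The offloading pattern can be illustrated with a small host-memory stand-in (my own simplified sketch; in PyTorch the analogous built-in is the `torch.autograd.graph.save_on_cpu` saved-tensor hook, which the blog may or may not use):

```python
class OffloadedActivations:
    """Sketch of CPU activation offloading for long rollouts: each step's
    activations are copied off the accelerator after the forward pass
    (simulated here by a host-side list) and fetched back for backward.
    This is an illustration of the idea, not a real device transfer."""
    def __init__(self):
        self._host = []                  # stand-in for (pinned) CPU memory

    def stash(self, activation):
        self._host.append(activation)    # "device -> host" copy after forward
        return len(self._host) - 1       # handle used to retrieve it later

    def fetch(self, handle):
        return self._host[handle]        # "host -> device" copy for backward
```

The point is that only the current step's activations need to live on the GPU, so a six-day rollout costs host memory and transfer bandwidth rather than GPU memory.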
Context
Good, interesting performance from a relatively small model that also seems relatively easy to train. Unfortunately, it doesn't look like the model will be open-sourced, but the report contains a fair amount of detail.