[Paper/Article] WeatherMesh-2 Model #130

Open
jacobbieker opened this issue Jan 20, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@jacobbieker
Member

Arxiv/Blog/Paper Link

https://windbornesystems.com/blog/weathermesh-2-technical-blog

Detailed Description

Interesting model, apparently trained only on RTX 4090 GPUs, with 180M parameters but good forecasting performance. Different encoders, decoders, and processors can be swapped and combined. Training rollouts extend all the way out to 6 days. Most processing is done in the latent space with pure transformer blocks, so there isn't a decode/encode step at every forecast step, which reduces accumulated error.
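
To make the "no decode/encode step each time" point concrete, here is a minimal sketch of the two rollout styles. All names (`latent_rollout`, `pixel_rollout`, the toy encoder/processor/decoder) are hypothetical stand-ins, not the blog's actual components; the toy functions are lossless, so only the number of encode/decode calls differs.

```python
from typing import Callable, List

Vector = List[float]
Fn = Callable[[Vector], Vector]


def latent_rollout(encode: Fn, process: Fn, decode: Fn,
                   state: Vector, n_steps: int) -> Vector:
    """Encode once, step the processor n_steps times in latent space, decode once."""
    z = encode(state)
    for _ in range(n_steps):
        z = process(z)
    return decode(z)


def pixel_rollout(encode: Fn, process: Fn, decode: Fn,
                  state: Vector, n_steps: int) -> Vector:
    """Baseline for contrast: decode and re-encode after every processor step."""
    for _ in range(n_steps):
        state = decode(process(encode(state)))
    return state


# Toy stand-ins (assumptions for illustration only) that count their calls.
calls = {"encode": 0, "decode": 0}


def toy_encode(x: Vector) -> Vector:
    calls["encode"] += 1
    return [2.0 * v for v in x]


def toy_decode(z: Vector) -> Vector:
    calls["decode"] += 1
    return [v / 2.0 for v in z]


def toy_process(z: Vector) -> Vector:
    return [v + 1.0 for v in z]


forecast = latent_rollout(toy_encode, toy_process, toy_decode, [0.0, 1.0], n_steps=4)
print(forecast, calls)  # one encode and one decode for four processor steps
```

With a real (lossy) encoder/decoder pair, every extra decode/encode round trip in the second style is a chance to add error, which is presumably the motivation for staying in latent space.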

The encoding to the latent space is U-Net-style, with different encoders for surface and pressure-level variables. The full latent space is stacked from low pressure to high pressure, with the bottom layer holding the surface variables.
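
A rough sketch of that latent layout, as I read it: pressure-level latents ordered from low to high pressure, with the surface latent appended as the bottom layer. The function name, the string placeholders, and the pressure levels (in hPa) are assumptions for illustration, not values from the blog.

```python
from typing import Dict, List


def stack_latent_levels(pressure_latents: Dict[int, str],
                        surface_latent: str) -> List[str]:
    """Order latent slabs from low pressure (top of atmosphere) to high
    pressure, then append the surface slab as the bottom layer."""
    ordered = [pressure_latents[p] for p in sorted(pressure_latents)]
    return ordered + [surface_latent]


# Hypothetical pressure levels in hPa; strings stand in for latent tensors
# produced by separate pressure-level and surface encoders.
pressure_latents = {1000: "z_1000", 50: "z_50", 500: "z_500"}
stack = stack_latent_levels(pressure_latents, "z_surface")
print(stack)
```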

Uses CPU checkpointing to move activations to CPU when doing long rollouts during training.
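
A toy accounting sketch of why this helps, assuming one block of saved activations per rollout step. It only counts how many blocks are resident on the accelerator at once; the function name is hypothetical and no real tensors are moved. In PyTorch, `torch.autograd.graph.save_on_cpu` is a context manager for this kind of offloading, though the blog doesn't say whether that is what they use.

```python
def peak_device_activations(n_steps: int, offload: bool) -> int:
    """Return the peak number of per-step activation blocks held on the
    accelerator across a forward and backward pass of an n_steps rollout."""
    device, host = [], []
    peak = 0
    # Forward pass: each rollout step saves one block of activations.
    for step in range(n_steps):
        device.append(step)
        peak = max(peak, len(device))
        if offload:
            host.append(device.pop())  # move the block to CPU memory right away
    # Backward pass: steps are revisited in reverse order.
    for step in reversed(range(n_steps)):
        if offload:
            device.append(host.pop())  # fetch only the block needed now
            peak = max(peak, len(device))
        block = device.pop()
        assert block == step  # the right activations are available for this step
    return peak


print(peak_device_activations(24, offload=False))
print(peak_device_activations(24, offload=True))
```

Without offloading, accelerator memory for saved activations grows linearly with rollout length; with it, the cost moves to host memory and transfer time instead.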

Context

Good, interesting performance from a relatively small model that seems relatively easy to train. Unfortunately, it doesn't seem like the model will be open-sourced, but there is an okay amount of detail in the report.

jacobbieker added the enhancement label on Jan 20, 2025
@Brayden-Zhang

Brayden-Zhang commented Jan 21, 2025

Main change in the architecture (compared to WM-1) is using neighborhood attention instead of Swin transformers.
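
For anyone unfamiliar with the difference: Swin attends within fixed, non-overlapping (shifted) windows, while neighborhood attention gives every token its own window centered on itself. Below is a minimal 1D, single-head sketch in plain Python with no learned projections. It is an illustration under those assumptions, not the WM-2 implementation; shifting the window inward at the borders is one common convention.

```python
import math
from typing import List

Matrix = List[List[float]]


def neighborhood(i: int, n: int, kernel_size: int) -> range:
    """Indices of the kernel_size tokens nearest to position i among n tokens.
    Near the borders the window is shifted inward so it always has the same size.
    Assumes kernel_size <= n."""
    start = min(max(i - kernel_size // 2, 0), n - kernel_size)
    return range(start, start + kernel_size)


def neighborhood_attention_1d(q: Matrix, k: Matrix, v: Matrix,
                              kernel_size: int) -> Matrix:
    """Scaled dot-product attention where token i only attends to its neighborhood."""
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        idx = list(neighborhood(i, n, kernel_size))
        scores = [sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d) for j in idx]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[j][c] for w, j in zip(weights, idx))
                    for c in range(len(v[0]))])
    return out


# With all-zero queries the weights are uniform, so each output is the
# plain average of the values in that token's neighborhood.
q = [[0.0] for _ in range(4)]
k = [[1.0] for _ in range(4)]
v = [[1.0], [2.0], [3.0], [4.0]]
print(neighborhood_attention_1d(q, k, v, kernel_size=3))
```

Setting `kernel_size` equal to the sequence length recovers ordinary full self-attention, which is a handy sanity check for an implementation.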

I have a draft implementation of the key parts of the architecture, based on the blog, at Brayden-Zhang/WeatherMesh, although I'm not totally sure it's correct.

@jacobbieker
Member Author

Yeah, I noticed that too! And cool repo. Since they don't release that much detail, it's always hard to really figure out how correct it is.


2 participants