Update to new nwp-consumer #697
Comments
Started doing this in #715 |
This is for version 1.0.5. |
Version 1.0.5 is 331 MB (4556 files), whereas 0.5.33 is 34.3 MB (182 files). |
This is because it pulls in all the ECMWF live data, which covers both the UK and India. Crops are taken when the data is loaded anyway by the forecasters so it shouldn't be a problem downstream. |
OK, I could clip to the UK before it does the regridding, which should help. |
NWP Consumer 1.0.7 works for PVnet 2.4.18 and for PVNet DA 2.4.18 for ECMWF EDIT: not yet! |
Made a bug report here - joblib/joblib#1637, just to see if anyone can help |
PVnet and PVnet DA work for 2.4.19 |
PVnet on dev makes the forecast look weird. Need to investigate. |
tested ECMWF India and saved to s3://india-nwp-development/ecmwf/data//2025010706.zarr/ |
This should solve PVnet 19.30 bug |
UKV does not work yet, for new nwp-consumer |
u10 values look quite different too. I wonder if we aren't taking surface variables or something like that? |
Pre-refactor consumer (https://github.com/openclimatefix/nwp-consumer/blob/612bb6f9dbd09e52283f966485a1415338826ccb/src/nwp_consumer/internal/inputs/noaa/aws.py#L93):
# URLs
filename=f"gfs.t{it.hour:02}z.pgrb2.1p00.f{step:03}"
url=f"{self.baseurl}/gfs.{it.strftime('%Y%m%d')}/{it.hour:02}/atmos"
# Process
# * Splits files, then re merges
surface = [d for d in ds if "surface" in d.coords]
heightAboveGround = [d for d in ds if "heightAboveGround" in d.coords]
isobaricInhPa = [d for d in ds if "isobaricInhPa" in d.coords]
for i, d in enumerate(surface):
    unwanted_variables = [v for v in d.data_vars if v not in self.parameters]
    surface[i] = d.drop_vars(unwanted_variables)
for i, d in enumerate(heightAboveGround):
    unwanted_variables = [v for v in d.data_vars if v not in self.parameters]
    heightAboveGround[i] = d.drop_vars(unwanted_variables)
for i, d in enumerate(isobaricInhPa):
    unwanted_variables = [v for v in d.data_vars if v not in self.parameters]
    isobaricInhPa[i] = d.drop_vars(unwanted_variables)
surface_merged = xr.merge(surface, compat="override").drop_vars(
    ["unknown_surface_instant", "valid_time"],
    errors="ignore",
)
del surface
hag_merged = xr.merge(heightAboveGround).drop_vars("valid_time", errors="ignore")
del heightAboveGround
iso_merged = xr.merge(isobaricInhPa).drop_vars("valid_time", errors="ignore")
del isobaricInhPa
total_ds = (
    xr.merge([surface_merged, hag_merged, iso_merged])
    .rename({"time": "init_time"})
    .expand_dims("init_time")
    .expand_dims("step")
    .transpose("init_time", "step", ...)
    .sortby("step")
    .chunk({"init_time": 1, "step": 1})
)
del surface_merged, hag_merged, iso_merged
Refactored consumer:
# Process
dss: list[xr.Dataset] = cfgrib.open_datasets(
    path.as_posix(),
    backend_kwargs={
        "squeeze": True,
        "ignore_keys": {
            "levelType": ["isobaricInhPa", "depthBelowLandLayer", "meanSea"],
        },
        "errors": "raise",
        "indexpath": "",  # TODO: Change when above TODO is resolved
    },
)
processed_das: list[xr.DataArray] = []
for i, ds in enumerate(dss):
    ds = entities.Parameter.rename_else_drop_ds_vars(
        ds=ds,
        allowed_parameters=NOAAS3RawRepository.model().expected_coordinates.variable,
    )
    # Ignore datasets with no variables of interest
    if len(ds.data_vars) == 0:
        continue
    # Ignore datasets with multi-level variables
    # * This would not work without the "squeeze" option in the open_datasets call,
    #   which reduces single-length dimensions to scalar coordinates
    if any(x not in ["latitude", "longitude", "time"] for x in ds.dims):
        continue
    da: xr.DataArray = (
        ds
        .drop_vars(names=[
            c for c in ds.coords if c not in ["time", "step", "latitude", "longitude"]
        ])
        .rename(name_dict={"time": "init_time"})
        .expand_dims(dim="init_time")
        .expand_dims(dim="step")
        .to_dataarray(name=NOAAS3RawRepository.model().name)
    )
    da = (
        da.drop_vars(
            names=[
                c for c in da.coords
                if c not in NOAAS3RawRepository.model().expected_coordinates.dims
            ],
        )
        .transpose(*NOAAS3RawRepository.model().expected_coordinates.dims)
        .assign_coords(coords={"longitude": (da.coords["longitude"] + 180) % 360 - 180})
        .sortby(variables=["step", "variable", "longitude"])
        .sortby(variables="latitude", ascending=False)
    )
There's a difference in the processing step between the two: the pre-refactor consumer split the datasets and then re-merged them, whereas the refactored consumer avoids this, since it writes regionally and so each dataset can be written individually. I'm investigating whether this logic difference could be the cause of the discrepancy. |
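To make two steps of the refactored block concrete, here is a minimal plain-Python sketch of the check that skips any dataset with a dimension other than latitude, longitude or time, and of the longitude re-wrap from 0..360 to -180..180. The names `ALLOWED_DIMS`, `has_extra_dims` and `wrap_longitude` are made up for illustration and are not part of nwp-consumer; dimensions are passed as plain tuples rather than xarray objects.

```python
ALLOWED_DIMS = ("latitude", "longitude", "time")


def has_extra_dims(dims):
    """Mirror of the `any(x not in [...] for x in ds.dims)` check."""
    return any(d not in ALLOWED_DIMS for d in dims)


def wrap_longitude(lon):
    """Mirror of `(lon + 180) % 360 - 180`: maps 0..360 onto -180..180."""
    return (lon + 180) % 360 - 180


# With squeeze=True a single-level dataset keeps only its horizontal dims,
# so it passes the check; a multi-level one keeps its level dim and is skipped.
print(has_extra_dims(("latitude", "longitude")))                   # False
print(has_extra_dims(("isobaricInhPa", "latitude", "longitude")))  # True

# Longitudes from 180 eastwards become negative, so sorting puts -180 first.
print(sorted(wrap_longitude(x) for x in (0, 90, 180, 270, 359)))
# [-180, -90, -1, 0, 90]
```

Note that the check skips the whole dataset rather than selecting a single level from it, so it never chooses between levels of the same variable.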
Temperature exists at multiple levels in the GRIB, of course, so I'm wondering whether the wrong one (i.e. not the surface temperature) is being surfaced in the consumer.
$ grib_ls -w shortName=t gfs.t00z.pgrb2.1p00.f000.grib
gfs.t00z.pgrb2.1p00.f000.grib
edition centre date dataType gridType stepRange typeOfLevel level shortName packingType
2 kwbc 20250128 fc regular_ll 0 isobaricInPa 1 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInPa 2 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInPa 4 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInPa 7 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInPa 10 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInPa 20 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInPa 40 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInPa 70 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 1 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 2 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 3 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 5 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 7 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 10 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 15 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 20 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 30 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 40 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 50 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 70 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 100 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 150 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 200 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 250 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 300 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 350 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 400 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 450 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 500 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 550 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 600 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 650 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 700 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 750 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 800 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 850 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 900 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 925 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 950 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 975 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 1000 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 surface 0 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 tropopause 0 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 maxWind 0 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 heightAboveGround 80 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 heightAboveGround 100 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 heightAboveSea 1829 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 heightAboveSea 2743 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 heightAboveSea 3658 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 pressureFromGroundLayer 3000 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 sigma 1 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 potentialVorticity 2000 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 potentialVorticity 2147485648 t grid_complex_spatial_differencing
53 of 696 messages in gfs.t00z.pgrb2.1p00.f000.grib
53 of 696 total messages in 1 files |
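A listing this long is easier to reason about when grouped by level type. The sketch below is a plain-Python helper (the name `levels_by_type` is made up for illustration) that counts `grib_ls` rows for one shortName per `typeOfLevel`. It assumes the ten-column layout shown in the header above and uses four rows copied from the listing as sample input.

```python
from collections import Counter

# Four rows copied verbatim from the listing above; paste the full
# grib_ls output in their place to get the breakdown of all 53 messages.
SAMPLE = """\
2 kwbc 20250128 fc regular_ll 0 isobaricInhPa 1000 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 surface 0 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 heightAboveGround 80 t grid_complex_spatial_differencing
2 kwbc 20250128 fc regular_ll 0 heightAboveGround 100 t grid_complex_spatial_differencing
"""


def levels_by_type(listing, short_name="t"):
    """Count grib_ls rows for `short_name`, grouped by typeOfLevel."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Data rows have exactly ten columns; shortName is the ninth,
        # typeOfLevel the seventh. Header and summary lines do not match.
        if len(fields) == 10 and fields[8] == short_name:
            counts[fields[6]] += 1
    return counts


print(dict(levels_by_type(SAMPLE)))
# {'isobaricInhPa': 1, 'surface': 1, 'heightAboveGround': 2}
```

In the listing above, the only `heightAboveGround` rows for `t` are at levels 80 and 100, so a 2 m temperature does not appear under this shortName.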
See openclimatefix/nwp-consumer#232. No longer pulls "t" to avoid overriding "t2m". |
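One way an override like this can arise is sketched below in plain Python. This is an assumption for illustration only: the alias table, the parameter name `temperature_sl` as used here, and the `collect` function are hypothetical and have not been checked against how nwp-consumer or #232 actually handle renaming.

```python
# Hypothetical alias table: two GRIB short names map to one output parameter.
ALIASES = {"t2m": "temperature_sl", "t": "temperature_sl"}


def collect(datasets, aliases):
    """Keep the last value seen for each output parameter name."""
    out = {}
    for short_name, value in datasets:
        target = aliases.get(short_name)
        if target is not None:
            # A later match silently replaces an earlier one.
            out[target] = value
    return out


datasets = [("t2m", "2 m temperature"), ("t", "surface temperature")]

print(collect(datasets, ALIASES))
# {'temperature_sl': 'surface temperature'}

# Without the "t" alias, the 2 m field is the one that survives.
print(collect(datasets, {"t2m": "temperature_sl"}))
# {'temperature_sl': '2 m temperature'}
```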
Detailed Description
Update to new NWP-consumer, after major refactor
Context
Possible Implementation