Merge branch 'main' into main
josephnowak authored Jan 29, 2025
2 parents 8819f1f + 018b241 commit 3e7aa91
Showing 26 changed files with 535 additions and 193 deletions.
1 change: 1 addition & 0 deletions doc/ecosystem.rst
Expand Up @@ -38,6 +38,7 @@ Geosciences
- `salem <https://salem.readthedocs.io>`_: Adds geolocalised subsetting, masking, and plotting operations to xarray's data structures via accessors.
- `SatPy <https://satpy.readthedocs.io/>`_ : Library for reading and manipulating meteorological remote sensing data and writing it to various image and data file formats.
- `SARXarray <https://tudelftgeodesy.github.io/sarxarray/>`_: xarray extension for reading and processing large Synthetic Aperture Radar (SAR) data stacks.
- `shxarray <https://shxarray.wobbly.earth/>`_: Convert, filter, and map geodesy-related spherical harmonic representations of gravity and terrestrial water storage through an xarray extension.
- `Spyfit <https://spyfit.readthedocs.io/en/master/>`_: FTIR spectroscopy of the atmosphere
- `windspharm <https://ajdawson.github.io/windspharm/index.html>`_: Spherical
harmonic wind analysis in Python.
62 changes: 62 additions & 0 deletions doc/internals/time-coding.rst
Expand Up @@ -473,3 +473,65 @@ on-disk resolution, if possible.
coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-datetimes2.nc", decode_times=coder)
Similar logic applies for decoding timedelta values. The default resolution is
``"ns"``:

.. ipython:: python

    attrs = {"units": "hours"}
    ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
    ds.to_netcdf("test-timedeltas1.nc")

.. ipython:: python
    :okwarning:

    xr.open_dataset("test-timedeltas1.nc")

By default, timedeltas will be decoded to the same resolution as datetimes:

.. ipython:: python
    :okwarning:

    coder = xr.coders.CFDatetimeCoder(time_unit="s")
    xr.open_dataset("test-timedeltas1.nc", decode_times=coder)

but if one would like to decode timedeltas to a different resolution, one can
provide a coder specifically for timedeltas to ``decode_timedelta``:

.. ipython:: python

    timedelta_coder = xr.coders.CFTimedeltaCoder(time_unit="ms")
    xr.open_dataset(
        "test-timedeltas1.nc", decode_times=coder, decode_timedelta=timedelta_coder
    )

As with datetimes, if a coarser unit is requested the timedeltas are decoded
into their native on-disk resolution, if possible:

.. ipython:: python

    attrs = {"units": "milliseconds"}
    ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
    ds.to_netcdf("test-timedeltas2.nc")

.. ipython:: python
    :okwarning:

    xr.open_dataset("test-timedeltas2.nc")

.. ipython:: python
    :okwarning:

    coder = xr.coders.CFDatetimeCoder(time_unit="s")
    xr.open_dataset("test-timedeltas2.nc", decode_times=coder)

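The resolution picked in the examples above can be modelled with a few lines of plain Python. This is only a simplified sketch of the documented behaviour, not xarray's implementation: the names ``RESOLUTIONS``, ``NATIVE_RESOLUTION`` and ``decoded_resolution`` are hypothetical, and the model assumes integer-valued data and ignores details such as how non-integer values are handled.

```python
# Simplified model of the resolution rule; hypothetical names, not xarray's code.

# Supported decoded resolutions, ordered from coarsest to finest.
RESOLUTIONS = ["s", "ms", "us", "ns"]

# Coarsest supported resolution that can hold integer counts of each
# on-disk unit exactly (assumption of this sketch).
NATIVE_RESOLUTION = {
    "days": "s",
    "hours": "s",
    "minutes": "s",
    "seconds": "s",
    "milliseconds": "ms",
    "microseconds": "us",
}


def decoded_resolution(on_disk_units: str, requested: str = "ns") -> str:
    """Return the finer of the requested and the native on-disk resolution."""
    native = NATIVE_RESOLUTION[on_disk_units]
    # A later position in RESOLUTIONS means a finer resolution.
    return max(native, requested, key=RESOLUTIONS.index)


print(decoded_resolution("hours"))  # default request is "ns"
print(decoded_resolution("hours", "s"))
print(decoded_resolution("milliseconds", "s"))
```

Under this model, requesting ``"s"`` for data stored in milliseconds falls back to ``"ms"``, which matches the last example above.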
To opt out of timedelta decoding (see issue `Undesired decoding to timedelta64 <https://github.com/pydata/xarray/issues/1621>`_), pass ``False`` to ``decode_timedelta``:

.. ipython:: python

    xr.open_dataset("test-timedeltas2.nc", decode_timedelta=False)

.. note::

    In the future the default value of ``decode_timedelta`` will be
    ``False`` rather than ``None``.
2 changes: 1 addition & 1 deletion doc/user-guide/pandas.rst
Expand Up @@ -78,7 +78,7 @@ To create a ``Dataset`` from a ``DataFrame``, use the
xr.Dataset.from_dataframe(df)
Notice that that dimensions of variables in the ``Dataset`` have now
Notice that the dimensions of variables in the ``Dataset`` have now
expanded after the round-trip conversion to a ``DataFrame``. This is because
every object in a ``DataFrame`` must have the same indices, so we need to
broadcast the data of each array to the full size of the new ``MultiIndex``.
52 changes: 40 additions & 12 deletions doc/whats-new.rst
Expand Up @@ -19,23 +19,36 @@ What's New
v2025.01.2 (unreleased)
-----------------------

This release brings non-nanosecond datetime resolution to xarray. In the
last couple of releases xarray has been prepared for that change. The code had
to be changed and adapted in numerous places, affecting especially the test suite.
The documentation has been updated accordingly and a new internal chapter
on :ref:`internals.timecoding` has been added.

To make the transition as smooth as possible this is designed to be fully backwards
compatible, keeping the current default of ``'ns'`` resolution on decoding.
To opt-in decoding into other resolutions (``'us'``, ``'ms'`` or ``'s'``) the
new :py:class:`coders.CFDatetimeCoder` is used as parameter to ``decode_times``
kwarg (see also :ref:`internals.default_timeunit`):
This release brings non-nanosecond datetime and timedelta resolution to xarray.
In the last couple of releases xarray has been prepared for that change. The
code had to be changed and adapted in numerous places, affecting especially the
test suite. The documentation has been updated accordingly and a new internal
chapter on :ref:`internals.timecoding` has been added.

To make the transition as smooth as possible this is designed to be fully
backwards compatible, keeping the current default of ``'ns'`` resolution on
decoding. To opt into decoding to other resolutions (``'us'``, ``'ms'`` or
``'s'``) an instance of the newly public :py:class:`coders.CFDatetimeCoder`
class can be passed through the ``decode_times`` keyword argument (see also
:ref:`internals.default_timeunit`):

.. code-block:: python

    coder = xr.coders.CFDatetimeCoder(time_unit="s")
    ds = xr.open_dataset(filename, decode_times=coder)

Similar control of the resolution of decoded timedeltas can be achieved through
passing a :py:class:`coders.CFTimedeltaCoder` instance to the
``decode_timedelta`` keyword argument:

.. code-block:: python

    coder = xr.coders.CFTimedeltaCoder(time_unit="s")
    ds = xr.open_dataset(filename, decode_timedelta=coder)

though by default timedeltas will be decoded to the same ``time_unit`` as
datetimes.
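
The defaulting rule for ``decode_timedelta`` can be sketched in plain Python. The classes and the function ``resolve_decode_timedelta`` below are hypothetical stand-ins written for illustration; they are not xarray's classes or its actual code path, and the per-variable mapping form is left out.

```python
from dataclasses import dataclass


# Stand-ins for the real coder classes; hypothetical, illustration only.
@dataclass(frozen=True)
class DatetimeCoderStandIn:
    time_unit: str = "ns"


@dataclass(frozen=True)
class TimedeltaCoderStandIn:
    time_unit: str = "ns"


def resolve_decode_timedelta(decode_times, decode_timedelta=None):
    """Model how an unset ``decode_timedelta`` follows ``decode_times``."""
    if decode_timedelta is not None:
        # An explicit bool or timedelta coder always wins.
        return decode_timedelta
    if isinstance(decode_times, DatetimeCoderStandIn):
        # Inherit the resolution requested for datetimes.
        return TimedeltaCoderStandIn(time_unit=decode_times.time_unit)
    # Otherwise reuse the plain value (e.g. True or False) of decode_times.
    return decode_times


print(resolve_decode_timedelta(DatetimeCoderStandIn("s")))
print(resolve_decode_timedelta(DatetimeCoderStandIn("s"), TimedeltaCoderStandIn("ms")))
```

In this model a timedelta coder is only derived from the datetime coder when ``decode_timedelta`` is left unset; any explicit value is passed through unchanged.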

There might be slight changes when encoding/decoding times as some warning and
error messages have been removed or rewritten. Xarray will now also allow
non-nanosecond datetimes (with ``'us'``, ``'ms'`` or ``'s'`` resolution) when
Expand All @@ -50,10 +63,19 @@ eventually be deprecated.

New Features
~~~~~~~~~~~~
- Relax nanosecond datetime restriction in CF time decoding (:issue:`7493`, :pull:`9618`).
- Relax nanosecond datetime restriction in CF time decoding (:issue:`7493`, :pull:`9618`, :pull:`9977`, :pull:`9966`, :pull:`9999`).
By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_ and `Spencer Clark <https://github.com/spencerkclark>`_.
- Enable the ``compute=False`` option in :py:meth:`DataTree.to_zarr`. (:pull:`9958`).
By `Sam Levang <https://github.com/slevang>`_.
- Improve the error message raised when no key is matching the available variables in a dataset. (:pull:`9943`)
By `Jimmy Westling <https://github.com/illviljan>`_.
- Added a ``time_unit`` argument to :py:meth:`CFTimeIndex.to_datetimeindex`.
Note that in a future version of xarray,
:py:meth:`CFTimeIndex.to_datetimeindex` will return a microsecond-resolution
:py:class:`pandas.DatetimeIndex` instead of a nanosecond-resolution
:py:class:`pandas.DatetimeIndex` (:pull:`9965`). By `Spencer Clark
<https://github.com/spencerkclark>`_ and `Kai Mühlbauer
<https://github.com/kmuehlbauer>`_.
- :py:meth:`DatasetGroupBy.first` and :py:meth:`DatasetGroupBy.last` can now use ``flox`` if available. (:issue:`9647`)
By `Deepak Cherian <https://github.com/dcherian>`_.

Expand All @@ -63,6 +85,12 @@ Breaking changes

Deprecations
~~~~~~~~~~~~
- In a future version of xarray decoding of variables into
:py:class:`numpy.timedelta64` values will be disabled by default. To silence
warnings associated with this, set ``decode_timedelta`` to ``True``,
``False``, or a :py:class:`coders.CFTimedeltaCoder` instance when opening
data (:issue:`1621`, :pull:`9966`). By `Spencer Clark
<https://github.com/spencerkclark>`_.
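
The deprecation described above amounts to warning only when ``decode_timedelta`` is left unset and a variable carries timedelta-like units. The following is a rough sketch under stated assumptions: ``should_decode_timedelta`` is a hypothetical helper, the warning category and message are assumptions, and time decoding is assumed to be enabled. It is not xarray's implementation.

```python
import warnings

# Units listed in the ``decode_timedelta`` docstring as timedelta-like.
TIMEDELTA_UNITS = {
    "days", "hours", "minutes", "seconds", "milliseconds", "microseconds",
}


def should_decode_timedelta(units, decode_timedelta=None):
    """Hypothetical helper: decide whether a variable is decoded to timedeltas."""
    if units not in TIMEDELTA_UNITS:
        return False
    if decode_timedelta is None:
        # Relying on the default: still decode for now, but warn about the
        # planned change (category and wording are assumptions of this sketch).
        warnings.warn(
            "decoding to timedelta64 will be disabled by default in the future; "
            "pass decode_timedelta explicitly to silence this warning",
            FutureWarning,
            stacklevel=2,
        )
        return True
    # True or a coder instance opts in; False opts out.
    return decode_timedelta is not False


print(should_decode_timedelta("hours", decode_timedelta=True))
print(should_decode_timedelta("hours", decode_timedelta=False))
```

Passing any explicit value takes the silent branch, which is why setting ``decode_timedelta`` when opening data avoids the warning.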


Bug fixes
43 changes: 31 additions & 12 deletions xarray/backends/api.py
Expand Up @@ -33,7 +33,7 @@
_normalize_path,
)
from xarray.backends.locks import _get_scheduler
from xarray.coders import CFDatetimeCoder
from xarray.coders import CFDatetimeCoder, CFTimedeltaCoder
from xarray.core import indexing
from xarray.core.combine import (
_infer_concat_order_from_positions,
Expand Down Expand Up @@ -487,7 +487,10 @@ def open_dataset(
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | Mapping[str, bool] | None = None,
decode_timedelta: bool
| CFTimedeltaCoder
| Mapping[str, bool | CFTimedeltaCoder]
| None = None,
use_cftime: bool | Mapping[str, bool] | None = None,
concat_characters: bool | Mapping[str, bool] | None = None,
decode_coords: Literal["coordinates", "all"] | bool | None = None,
Expand Down Expand Up @@ -555,11 +558,14 @@ def open_dataset(
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
decode_timedelta : bool or dict-like, optional
decode_timedelta : bool, CFTimedeltaCoder, or dict-like, optional
If True, decode variables and coordinates with time units in
{"days", "hours", "minutes", "seconds", "milliseconds", "microseconds"}
into timedelta objects. If False, leave them encoded as numbers.
If None (default), assume the same value of decode_time.
If None (default), assume the same value of ``decode_times``; if
``decode_times`` is a :py:class:`coders.CFDatetimeCoder` instance, this
takes the form of a :py:class:`coders.CFTimedeltaCoder` instance with a
matching ``time_unit``.
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
Expand Down Expand Up @@ -712,7 +718,7 @@ def open_dataarray(
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | None = None,
decode_timedelta: bool | CFTimedeltaCoder | None = None,
use_cftime: bool | None = None,
concat_characters: bool | None = None,
decode_coords: Literal["coordinates", "all"] | bool | None = None,
Expand Down Expand Up @@ -785,7 +791,10 @@ def open_dataarray(
If True, decode variables and coordinates with time units in
{"days", "hours", "minutes", "seconds", "milliseconds", "microseconds"}
into timedelta objects. If False, leave them encoded as numbers.
If None (default), assume the same value of decode_time.
If None (default), assume the same value of ``decode_times``; if
``decode_times`` is a :py:class:`coders.CFDatetimeCoder` instance, this
takes the form of a :py:class:`coders.CFTimedeltaCoder` instance with a
matching ``time_unit``.
This keyword may not be supported by all the backends.
use_cftime: bool, optional
Only relevant if encoded dates come from a standard calendar
Expand Down Expand Up @@ -927,7 +936,10 @@ def open_datatree(
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | Mapping[str, bool] | None = None,
decode_timedelta: bool
| CFTimedeltaCoder
| Mapping[str, bool | CFTimedeltaCoder]
| None = None,
use_cftime: bool | Mapping[str, bool] | None = None,
concat_characters: bool | Mapping[str, bool] | None = None,
decode_coords: Literal["coordinates", "all"] | bool | None = None,
Expand Down Expand Up @@ -995,7 +1007,10 @@ def open_datatree(
If True, decode variables and coordinates with time units in
{"days", "hours", "minutes", "seconds", "milliseconds", "microseconds"}
into timedelta objects. If False, leave them encoded as numbers.
If None (default), assume the same value of decode_time.
If None (default), assume the same value of ``decode_times``; if
``decode_times`` is a :py:class:`coders.CFDatetimeCoder` instance, this
takes the form of a :py:class:`coders.CFTimedeltaCoder` instance with a
matching ``time_unit``.
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
Expand Down Expand Up @@ -1150,7 +1165,10 @@ def open_groups(
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | Mapping[str, bool] | None = None,
decode_timedelta: bool
| CFTimedeltaCoder
| Mapping[str, bool | CFTimedeltaCoder]
| None = None,
use_cftime: bool | Mapping[str, bool] | None = None,
concat_characters: bool | Mapping[str, bool] | None = None,
decode_coords: Literal["coordinates", "all"] | bool | None = None,
Expand Down Expand Up @@ -1222,9 +1240,10 @@ def open_groups(
If True, decode variables and coordinates with time units in
{"days", "hours", "minutes", "seconds", "milliseconds", "microseconds"}
into timedelta objects. If False, leave them encoded as numbers.
If None (default), assume the same value of decode_time.
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
If None (default), assume the same value of ``decode_times``; if
``decode_times`` is a :py:class:`coders.CFDatetimeCoder` instance, this
takes the form of a :py:class:`coders.CFTimedeltaCoder` instance with a
matching ``time_unit``.
This keyword may not be supported by all the backends.
use_cftime: bool or dict-like, optional
Only relevant if encoded dates come from a standard calendar
2 changes: 2 additions & 0 deletions xarray/backends/zarr.py
Expand Up @@ -444,6 +444,7 @@ def extract_zarr_variable_encoding(
safe_to_drop = {"source", "original_shape", "preferred_chunks"}
valid_encodings = {
"chunks",
"shards",
"compressor", # TODO: delete when min zarr >=3
"compressors",
"filters",
Expand Down Expand Up @@ -825,6 +826,7 @@ def open_store_variable(self, name):
{
"compressors": zarr_array.compressors,
"filters": zarr_array.filters,
"shards": zarr_array.shards,
}
)
if self.zarr_group.metadata.zarr_format == 3:
6 changes: 2 additions & 4 deletions xarray/coders.py
Expand Up @@ -3,8 +3,6 @@
"encoding/decoding" process.
"""

from xarray.coding.times import CFDatetimeCoder
from xarray.coding.times import CFDatetimeCoder, CFTimedeltaCoder

__all__ = [
"CFDatetimeCoder",
]
__all__ = ["CFDatetimeCoder", "CFTimedeltaCoder"]