geocat.comp.ndpolyfit

geocat.comp.ndpolyfit(x: Iterable, y: Iterable, deg: int, axis: int = 0, **kwargs) -> (xarray.DataArray, dask.array.Array)

An extension of numpy.polyfit that works with multi-dimensional arrays. For example, if y has shape (s0, s1, s2, s3), axis=1, and deg=1, the output has shape (s0, 2, s2, s3): the function fits a first-degree polynomial (because deg=1) along the second dimension (because axis=1) for every combination of the other dimensions. Unlike numpy.polyfit, this function also handles missing values. It also supports Dask arrays, including chunked ones.
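The shape rule described above can be illustrated with plain NumPy. The following is only a minimal sketch, not the geocat implementation: `polyfit_along_axis` is a hypothetical helper, and it ignores missing values, Dask inputs, and all keyword options.

```python
import numpy as np


def polyfit_along_axis(x, y, deg, axis=0):
    """Hypothetical helper: fit a polynomial along one axis with plain NumPy."""
    y = np.asarray(y, dtype=float)
    # Move the fitting axis to the front and flatten the remaining axes into
    # columns, because np.polyfit accepts a 2-D y of shape (M, K).
    y_front = np.moveaxis(y, axis, 0)
    rest_shape = y_front.shape[1:]
    y_2d = y_front.reshape(y_front.shape[0], -1)
    coeffs = np.polyfit(np.asarray(x, dtype=float).ravel(), y_2d, deg)  # (deg + 1, K)
    # Restore the remaining dimensions and move the coefficient axis back.
    coeffs = coeffs.reshape((deg + 1,) + rest_shape)
    return np.moveaxis(coeffs, 0, axis)


x = np.arange(10, dtype=float)
line = 2 * x + 3
y = np.tile(line.reshape(1, 10, 1, 1), [2, 1, 3, 4])  # shape (2, 10, 3, 4)
p = polyfit_along_axis(x, y, deg=1, axis=1)
print(p.shape)
```

With deg=1 and axis=1, the second dimension of length 10 is replaced by the two coefficients of the fitted line, while the other dimensions are unchanged.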

Parameters
  • x (array_like) – x-coordinate, an Iterable object of shape (M,), (M, 1), or (1, M) where M = y.shape[axis]. It must not contain NaN or missing values.

  • y (array_like) – y-coordinate, an Iterable containing the data. It can be a list, numpy.ndarray, xr.DataArray, Dask array, or any Iterable convertible to numpy.ndarray. A Dask array may be chunked, but it is recommended not to chunk along the axis provided.

  • axis (int) – the axis to fit the polynomial to. Default is 0.

  • deg (int) – degree of the fitting polynomial

  • kwargs (dict, optional) –

    Extra parameters controlling the method's behavior:

    rcond (float, optional):

    Relative condition number of the fit. Refer to numpy.polyfit for further details.

    full (bool, optional):

    Switch determining nature of return value. Refer to numpy.polyfit for further details.

    w (array_like, optional):

    Weights applied to the y-coordinates of the sample points. Refer to numpy.polyfit for further details.

    cov (bool, optional):

    Determines whether to return the covariance matrix. Refer to numpy.polyfit for further details.

    missing_value (number or np.nan, optional):

    The value to be treated as missing. Default is np.nan

    meta (bool, optional):

    If set to True and the input, i.e. y, is of type xr.DataArray, the attributes associated with the input are transferred to the output.
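To make the role of missing_value concrete, here is a minimal one-dimensional sketch in plain NumPy. It assumes that missing samples are simply excluded before fitting; `polyfit_ignoring_missing` is a hypothetical helper and may differ from how geocat handles missing data internally.

```python
import numpy as np


def polyfit_ignoring_missing(x, y, deg, missing_value=np.nan):
    """Hypothetical 1-D helper: drop missing samples, then call np.polyfit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # NaN is always treated as missing; a sentinel value is checked in addition.
    missing = np.isnan(y)
    if not np.isnan(missing_value):
        missing |= (y == missing_value)
    return np.polyfit(x[~missing], y[~missing], deg)


x = np.arange(10, dtype=float)
y = 4 * x * x + 3 * x + 2
y[7:] = 999  # sentinel marking missing data
coeffs = polyfit_ignoring_missing(x, y, deg=2, missing_value=999)
print(coeffs)
```

Because the seven remaining samples lie exactly on 4x² + 3x + 2, the fit recovers those coefficients up to floating-point error.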

Returns

An xarray.DataArray or numpy.ndarray containing the coefficients of the fitted polynomial.
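The examples in this section return the coefficients with the highest degree first, which matches numpy.polyfit. Assuming that ordering, the fitted polynomial can be evaluated with numpy.polyval. The sketch below uses numpy.polyfit as a stand-in for the one-dimensional case.

```python
import numpy as np

x = np.arange(10, dtype=float)
y = 4 * x * x + 3 * x + 2

# Coefficients are ordered from the highest degree to the constant term.
coeffs = np.polyfit(x, y, 2)

# Evaluate the fitted polynomial at the original x-coordinates.
y_fit = np.polyval(coeffs, x)
print(np.allclose(y_fit, y))
```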

Examples

  • Fitting a line to a one dimensional array:

>>> import numpy as np
>>> from geocat.comp.polynomial import ndpolyfit
>>> x = np.arange(10, dtype=float)
>>> y = 2*x + 3
>>> p = ndpolyfit(x, y, deg=1)
>>> print(p)
<xarray.DataArray (dim_0: 2)>
array([2., 3.])
Dimensions without coordinates: dim_0
Attributes:
    deg:             1
    provided_rcond:  None
    full:            False
    weights:         None
    covariance:      False
  • Fitting a second degree polynomial to a one dimensional array:

>>> y = 4*x*x + 3*x + 2
>>> p = ndpolyfit(x, y, deg=2)
>>> print(p)
<xarray.DataArray (dim_0: 3)>
array([4., 3., 2.])
Dimensions without coordinates: dim_0
Attributes:
    deg:             2
    provided_rcond:  None
    full:            False
    weights:         None
    covariance:      False
  • Fitting a polynomial with missing values: Ordinarily, NaNs are treated as missing values. In this example, let’s introduce a different value to indicate missing data.

>>> # Let's introduce some missing values:
>>> y[7:] = 999
>>> p = ndpolyfit(x, y, deg=2)
>>> print(p)
<xarray.DataArray (dim_0: 3)>
array([ 21.15909091, -62.14090909,  20.4       ])
Dimensions without coordinates: dim_0
Attributes:
    deg:             2
    provided_rcond:  None
    full:            False
    weights:         None
    covariance:      False
>>> # As you can see, we got different coefficients
>>> # Now let's define 999 as missing value
>>> p = ndpolyfit(x, y, deg=2, missing_value=999)
>>> print(p)
<xarray.DataArray (dim_0: 3)>
array([4., 3., 2.])
Dimensions without coordinates: dim_0
Attributes:
    deg:             2
    provided_rcond:  None
    full:            False
    weights:         None
    covariance:      False
>>> # Now we got the coefficients we were looking for
  • Fitting a polynomial with NaN as missing values: NaN is always treated as a missing value, even when missing_value is not specified.

>>> import numpy as np
>>> from geocat.comp.polynomial import ndpolyfit
>>> x = np.arange(10, dtype=float)
>>> y = 4*x*x + 3*x + 2
>>> y[7:] = np.nan
>>> print(y)
[  2.   9.  24.  47.  78. 117. 164.  nan  nan  nan]
>>> p = ndpolyfit(x, y, deg=2)
>>> print(p)
<xarray.DataArray (dim_0: 3)>
array([4., 3., 2.])
Dimensions without coordinates: dim_0
Attributes:
    deg:             2
    provided_rcond:  None
    full:            False
    weights:         None
    covariance:      False
>>> # As you can see, despite not specifying NaN as the missing value, the coefficients are properly calculated
  • Fitting a second degree polynomial to a multi-dimensional array:

>>> # Redefine y without missing values
>>> y = 4*x*x + 3*x + 2
>>> y_md = np.tile(y.reshape(1, 10, 1, 1), [2, 1, 3, 4])
>>> y_md.shape
(2, 10, 3, 4)
>>> print(y)
[  2.   9.  24.  47.  78. 117. 164. 219. 282. 353.]
>>> print(y_md[1, :, 1, 1])
[  2.   9.  24.  47.  78. 117. 164. 219. 282. 353.]
>>> p = ndpolyfit(x, y_md, deg=2, axis=1)