Important

The former name of this package is vf (Various Functions). It has been renamed to utils.

Utilities (Various functions)

The utils package provides a collection of supporting utility functions for various purposes. It is designed to be expanded with more functions over time.

Main Features:
  • idx: Get the index of val from array arr

  • parse_datetime: Parses an input from various formats into a datetime

  • storm_idx: Identify storms in Dst based on a threshold

  • fixfill: Fix invalid values in array

fixfill(time, data, fillval[, method, ...])

Fix invalid values in time-series data using NaN replacement or interpolation.

idx(arr, val[, tol])

Get the index of val from array arr with an optional tolerance.

parse_datetime(date_input)

Parses a datetime input from various formats into a Python datetime object.

storm_idx(time, Dst[, threshold, gap_hours, ...])

Identify storms in Dst based on a threshold, then either return their "onset" or "minimum Dst" index.

Functions

rbamlib.utils.idx(arr, val, tol=None)

Get the index of val from array arr with an optional tolerance.

If tol is specified and the minimum absolute difference between any array element and val is greater than tol, returns NaN. Without tol, returns the index of the nearest value in arr to val.

Parameters:
  • arr (array_like) – Input array.

  • val (float) – The value to search for.

  • tol (float, optional) – Tolerance for the index selection. If not provided, the nearest index is returned.

Returns:

The index of the value in arr nearest to val. If tol is specified and the nearest value differs from val by more than tol, NaN is returned instead.

Return type:

int or NaN
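The selection rule described above can be sketched in plain NumPy. This is a minimal illustration under the documented behavior, not the rbamlib implementation; the name idx_sketch is hypothetical and exists only for this example.

```python
import numpy as np


def idx_sketch(arr, val, tol=None):
    """Hypothetical stand-in that mirrors the documented behavior of idx."""
    arr = np.asarray(arr, dtype=float)
    # Absolute distance of every element from the requested value.
    diff = np.abs(arr - val)
    # Position of the nearest element.
    nearest = int(np.argmin(diff))
    # With a tolerance, reject the match when even the nearest element is too far.
    if tol is not None and diff[nearest] > tol:
        return np.nan
    return nearest


arr = np.array([1, 2, 3, 4, 5])
print(idx_sketch(arr, 3.1))            # 2
print(idx_sketch(arr, 3.1, tol=0.05))  # nan
print(idx_sketch(arr, 3.1, tol=0.2))   # 2
```

Because the result may be NaN, callers would typically check it (for example with np.isnan) before using it to index an array.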

Examples

>>> import numpy as np
>>> from rbamlib.utils import idx
>>> arr = np.array([1, 2, 3, 4, 5])
>>> idx(arr, 3.1)
2
>>> idx(arr, 3.1, tol=0.05)
nan

rbamlib.utils.parse_datetime(date_input)

Parses a datetime input from various formats into a Python datetime object.

Parameters:

date_input (str or datetime) –

Input date in various formats, such as:

  • '2025010112' → YYYYMMDDHH

  • '2025-01-01' → YYYY-MM-DD

  • '20250101' → YYYYMMDD

  • '20250101T12:00' → ISO-like format

  • '2025-01-01T12:00' → ISO-like format

  • '2025-01-01 12:30' → Standard format

  • '01-01-2025' → European format

  • 'Jan 01, 2025' → Human-readable

Returns:

Parsed datetime object.

Return type:

datetime.datetime

Raises:

ValueError – If the input format is invalid.
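One common way to accept several input formats is to try a list of strptime patterns in turn. The sketch below illustrates that idea for the formats listed above. It is an assumption about how such parsing could work, not the rbamlib implementation; the names parse_datetime_sketch and _FORMATS are hypothetical.

```python
from datetime import datetime

# Candidate patterns, one per documented input style (assumed mapping).
_FORMATS = (
    "%Y%m%d%H",        # 2025010112
    "%Y-%m-%d",        # 2025-01-01
    "%Y%m%d",          # 20250101
    "%Y%m%dT%H:%M",    # 20250101T12:00
    "%Y-%m-%dT%H:%M",  # 2025-01-01T12:00
    "%Y-%m-%d %H:%M",  # 2025-01-01 12:30
    "%d-%m-%Y",        # 01-01-2025 (day first)
    "%b %d, %Y",       # Jan 01, 2025 (month name is locale dependent)
)


def parse_datetime_sketch(date_input):
    """Hypothetical stand-in: try each known pattern until one matches."""
    if isinstance(date_input, datetime):
        # Already a datetime: return it unchanged.
        return date_input
    for fmt in _FORMATS:
        try:
            return datetime.strptime(date_input, fmt)
        except ValueError:
            continue  # This pattern does not fit; try the next one.
    raise ValueError(f"Unrecognized date format: {date_input!r}")


print(parse_datetime_sketch("2025010112"))        # 2025-01-01 12:00:00
print(parse_datetime_sketch("2025-01-01 12:30"))  # 2025-01-01 12:30:00
print(parse_datetime_sketch("01-01-2025"))        # 2025-01-01 00:00:00
```

In this sketch, an input that matches none of the patterns raises ValueError, in line with the documented behavior.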

rbamlib.utils.storm_idx(time, Dst, threshold=-40.0, gap_hours=1.0, method='onset')

Identify storms in Dst based on a threshold, then either return their “onset” or “minimum Dst” index.

Parameters:
  • time (1D array-like of datetime.datetime) – Strictly increasing times.

  • Dst (1D array-like of float) – Dst index at the same times as time.

  • threshold (float, optional) – Storm threshold. Values below this threshold are considered “in a storm”.

  • gap_hours (float, optional) – If the time difference to the previous storm point is less than this, we treat it as the same storm. Otherwise, we start a new storm region.

  • method (str, optional) –

    • 'onset': Return the “start index” for each storm region (after backtracking).

    • 'minimum': Return the single index at which Dst is minimal within each region.

Returns:

storm_indices – Indices in time (and Dst), one per identified storm, either onset or min.

Return type:

list of int

Steps

  1. Find all times where Dst < threshold.

  2. Group those indices into contiguous “storm regions” separated by at least gap_hours.

  3. For each storm region:

    • If method='onset':

      • Take the earliest index in that region (the threshold crossing),

      • Backtrack until Dst >= 0 or the start of the array (index 0) is reached. That index is the final “onset.”

    • If method='minimum': Return the index in that region where Dst is minimum. Double dips remain in the same region, so only one min is reported per region.
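The steps above can be sketched with NumPy as follows. This is an illustration that follows the documented wording, not the rbamlib source; storm_idx_sketch and the synthetic hourly data are hypothetical. The sketch takes the text literally and keeps a point in the same storm only when its gap to the previous storm point is strictly less than gap_hours, so the example uses gap_hours=2.0 with hourly samples.

```python
from datetime import datetime, timedelta

import numpy as np


def storm_idx_sketch(time, dst, threshold=-40.0, gap_hours=1.0, method="onset"):
    """Hypothetical stand-in that follows the documented steps."""
    dst = np.asarray(dst, dtype=float)

    # Step 1: all indices where Dst is below the threshold.
    below = np.flatnonzero(dst < threshold)
    if below.size == 0:
        return []

    # Step 2: group indices into regions; a gap of at least gap_hours
    # to the previous storm point starts a new region.
    regions = [[int(below[0])]]
    for prev, cur in zip(below[:-1], below[1:]):
        gap = (time[cur] - time[prev]).total_seconds() / 3600.0
        if gap < gap_hours:
            regions[-1].append(int(cur))
        else:
            regions.append([int(cur)])

    # Step 3: one index per region.
    result = []
    for region in regions:
        if method == "onset":
            i = region[0]  # threshold crossing
            # Backtrack until Dst >= 0 or the first sample is reached.
            while i > 0 and dst[i] < 0:
                i -= 1
            result.append(i)
        elif method == "minimum":
            result.append(region[int(np.argmin(dst[region]))])
        else:
            raise ValueError(f"Unknown method: {method!r}")
    return result


# Synthetic hourly series containing two storms (made-up values).
dst = [5, 2, -10, -30, -50, -80, -60, -45, -20, -5, 3, -15, -42, -55, -41, -10]
time = [datetime(2013, 10, 1) + timedelta(hours=h) for h in range(len(dst))]

print(storm_idx_sketch(time, dst, gap_hours=2.0, method="onset"))    # [1, 10]
print(storm_idx_sketch(time, dst, gap_hours=2.0, method="minimum"))  # [5, 13]
```

With the strict comparison used here, hourly samples and gap_hours=1.0 would place every below-threshold point in its own region; the real function may treat that boundary differently.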

Examples

>>> import datetime
>>> from rbamlib.web import omni
>>> from rbamlib.utils import storm_idx
>>> time, Dst = omni('20131001', '20131101', {'Dst'})
>>> # ONSET method
>>> s_onsets = storm_idx(time, Dst, threshold=-40, gap_hours=1.0, method='onset')
>>> # MINIMUM method
>>> s_mins = storm_idx(time, Dst, threshold=-40, gap_hours=1.0, method='minimum')

rbamlib.utils.fixfill(time, data, fillval, method='nan', fillval_mode='eq')

Fix invalid values in time-series data using NaN replacement or interpolation.

This function identifies missing or invalid data based on a fillval rule, marks those values as NaN, and optionally interpolates over them using a linear scheme.

Parameters:
  • time (ndarray of datetime.datetime) – 1D array of strictly increasing datetime objects.

  • data (ndarray) – 1D array of numerical values corresponding to time.

  • fillval (float) – The fill value or threshold to identify missing data.

  • method (str, optional) –

    • 'nan': Replace missing values with NaN.

    • 'interp': Replace and linearly interpolate missing values.

  • fillval_mode (str, optional) –

    • 'eq': treat fillval as an exact match, equal.

    • 'gt': treat all data >= fillval as missing.

    • 'lt': treat all data <= fillval as missing.

Returns:

fixed_data – The cleaned data array with missing values replaced or interpolated.

Return type:

ndarray

Examples

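>>> from datetime import datetime, timedelta
>>> import numpy as np
>>> from rbamlib.utils import fixfill
>>> time = [datetime(2023,1,1,0,0) + timedelta(minutes=5*i) for i in range(6)]
>>> data = np.array([1.0, 2.0, 999, 4.0, 999, 6.0])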

Mark fill values as NaN:

>>> fixfill(time, data, fillval=999, method='nan', fillval_mode='eq')
array([ 1.,  2., nan,  4., nan,  6.])

Replace fill values and interpolate linearly:

>>> fixfill(time, data, fillval=999, method='interp', fillval_mode='eq')
array([1., 2., 3., 4., 5., 6.])
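
For readers who want to see the masking and interpolation logic spelled out, the following sketch reproduces the documented behavior with NumPy. It is a hedged illustration, not the rbamlib implementation; fixfill_sketch is a hypothetical name, and the choice to interpolate against elapsed seconds is an assumption.

```python
from datetime import datetime, timedelta

import numpy as np


def fixfill_sketch(time, data, fillval, method="nan", fillval_mode="eq"):
    """Hypothetical stand-in that mirrors the documented options of fixfill."""
    fixed = np.array(data, dtype=float)  # work on a copy

    # Build the mask of invalid samples according to fillval_mode.
    if fillval_mode == "eq":
        bad = fixed == fillval
    elif fillval_mode == "gt":
        bad = fixed >= fillval
    elif fillval_mode == "lt":
        bad = fixed <= fillval
    else:
        raise ValueError(f"Unknown fillval_mode: {fillval_mode!r}")
    fixed[bad] = np.nan

    if method == "interp":
        # Interpolate against elapsed seconds so uneven sampling is respected.
        t = np.array([(ti - time[0]).total_seconds() for ti in time])
        good = ~np.isnan(fixed)
        if good.any():
            fixed[~good] = np.interp(t[~good], t[good], fixed[good])
    elif method != "nan":
        raise ValueError(f"Unknown method: {method!r}")
    return fixed


time = [datetime(2023, 1, 1) + timedelta(minutes=5 * i) for i in range(6)]
data = np.array([1.0, 2.0, 999, 4.0, 999, 6.0])

print(fixfill_sketch(time, data, fillval=999, method="nan"))
print(fixfill_sketch(time, data, fillval=999, method="interp"))
```

Note that np.interp holds the nearest valid value at the edges, so in this sketch leading or trailing gaps are filled with a constant rather than extrapolated.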