Important

This package was formerly named vf (Various Functions). It has been renamed to utils.

Utilities (Various functions)

The utils package provides a collection of supporting utility functions for various purposes. It is designed to be expanded with more functions over time.

Functions

rbamlib.utils.idx(arr, val, tol=None)

Get the index of val from array arr with an optional tolerance.

If tol is specified and the minimum absolute difference between any array element and val is greater than tol, returns NaN. Without tol, returns the index of the nearest value in arr to val.

Parameters:
  • arr (array_like) – Input array.

  • val (float) – The value to search for.

  • tol (float, optional) – Tolerance for the index selection. If not provided, the nearest index is returned.

Returns:

The index of the value in arr nearest to val. If tol is specified and the nearest value differs from val by more than tol, NaN is returned instead.

Return type:

int or NaN
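The behavior described above can be sketched in a few lines of NumPy. This is only an illustration of the documented rule, not the library's implementation; the name idx_sketch is hypothetical.

```python
import numpy as np


def idx_sketch(arr, val, tol=None):
    """Hypothetical re-creation of the documented behavior of idx."""
    arr = np.asarray(arr, dtype=float)
    # Absolute distance of every element from the requested value.
    diff = np.abs(arr - val)
    nearest = int(np.argmin(diff))
    # With a tolerance, reject the match if even the nearest value is too far.
    if tol is not None and diff[nearest] > tol:
        return np.nan
    return nearest
```

Because the result may be NaN, callers that use it for indexing would presumably check it with np.isnan first.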

Examples

>>> import numpy as np
>>> from rbamlib.utils import idx
>>> arr = np.array([1, 2, 3, 4, 5])
>>> idx(arr, 3.1)
2
>>> idx(arr, 3.1, tol=0.05)
nan

rbamlib.utils.parse_datetime(date_input)

Parses a datetime input from various formats into a Python datetime object.

Parameters:

date_input (str or datetime) –

Input date in various formats, such as:

  • '2025010112' → YYYYMMDDHH

  • '2025-01-01' → YYYY-MM-DD

  • '20250101' → YYYYMMDD

  • '20250101T12:00' → ISO-like format

  • '2025-01-01T12:00' → ISO-like format

  • '2025-01-01 12:30' → Standard format

  • '01-01-2025' → European format

  • 'Jan 01, 2025' → Human-readable

Returns:

Parsed datetime object.

Return type:

datetime.datetime

Raises:

ValueError – If the input format is invalid.
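One common way to accept several layouts like those listed above is to try a sequence of strptime patterns in turn. The sketch below illustrates that approach only; the actual parsing strategy of parse_datetime is not stated here. The name parse_datetime_sketch and the format table are hypothetical, and '01-01-2025' is assumed to be day-first.

```python
from datetime import datetime

# Candidate layouts mirroring the documented input formats (assumed mapping).
_FORMATS = (
    "%Y%m%d%H",        # 2025010112
    "%Y-%m-%d",        # 2025-01-01
    "%Y%m%d",          # 20250101
    "%Y%m%dT%H:%M",    # 20250101T12:00
    "%Y-%m-%dT%H:%M",  # 2025-01-01T12:00
    "%Y-%m-%d %H:%M",  # 2025-01-01 12:30
    "%d-%m-%Y",        # 01-01-2025 (assumed day-first)
    "%b %d, %Y",       # Jan 01, 2025 (month names depend on the locale)
)


def parse_datetime_sketch(date_input):
    """Hypothetical multi-format parser; returns a datetime or raises ValueError."""
    if isinstance(date_input, datetime):
        return date_input  # already parsed, pass through unchanged
    text = str(date_input).strip()
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this layout did not match; try the next one
    raise ValueError(f"Unrecognized date format: {date_input!r}")
```

strptime rejects strings with leftover characters, so a date-only pattern never silently truncates an input that also carries a time.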

rbamlib.utils.storm_idx(time, Dst, threshold=-40.0, gap_hours=1.0, method='onset')

Identify storms in Dst based on a threshold, then return either the “onset” index or the “minimum Dst” index of each storm.

Steps

  1. Find all times where Dst < threshold.

  2. Group those indices into contiguous “storm regions” separated by at least gap_hours.

  3. For each storm region:

    • If method='onset': a) Take the earliest index in that region (the threshold crossing), b) Backtrack from there until Dst >= 0 or index 0 is reached. The resulting index is the final “onset.”

    • If method='minimum': Return the index in that region where Dst is minimum. Double dips remain in the same region, so only one min is reported per region.
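The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: storm_idx_sketch is a hypothetical name, and the grouping takes the "less than gap_hours" comparison from the gap_hours description literally, so the library may treat the boundary case differently.

```python
from datetime import datetime, timedelta

import numpy as np


def storm_idx_sketch(time, dst, threshold=-40.0, gap_hours=1.0, method="onset"):
    """Hypothetical walk-through of the documented storm selection steps."""
    dst = np.asarray(dst, dtype=float)
    # Step 1: indices of all samples below the storm threshold.
    below = np.flatnonzero(dst < threshold)
    if below.size == 0:
        return []
    # Step 2: group indices; a time gap of at least gap_hours starts a new region.
    regions = [[int(below[0])]]
    for prev, cur in zip(below[:-1], below[1:]):
        hours = (time[cur] - time[prev]).total_seconds() / 3600.0
        if hours < gap_hours:
            regions[-1].append(int(cur))
        else:
            regions.append([int(cur)])
    # Step 3: reduce each region to a single index.
    result = []
    for region in regions:
        if method == "onset":
            i = region[0]
            # Backtrack until Dst >= 0 or the start of the series is reached.
            while i > 0 and dst[i] < 0:
                i -= 1
            result.append(i)
        elif method == "minimum":
            result.append(region[int(np.argmin(dst[region]))])
        else:
            raise ValueError(f"Unknown method: {method!r}")
    return result


# Synthetic hourly series with two dips below -40 (illustrative data only).
t = [datetime(2013, 10, 1) + timedelta(hours=h) for h in range(10)]
d = [5, -10, -50, -60, -20, 3, -5, -45, -30, 2]
onsets = storm_idx_sketch(t, d, gap_hours=3.0, method="onset")
minima = storm_idx_sketch(t, d, gap_hours=3.0, method="minimum")
```

For this synthetic series the sketch yields onsets [0, 5] and minima [3, 7]. gap_hours=3.0 is used because, with a strict "less than" test, hourly samples exactly one hour apart would not be merged at gap_hours=1.0.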

Parameters:
  • time (1D array-like of datetime.datetime) – Strictly increasing times.

  • Dst (1D array-like of float) – Dst index at the same times as time.

  • threshold (float, default=-40.0) – Storm threshold. Values below are considered “in a storm”.

  • gap_hours (float, default=1.0) – If the time difference to the previous storm point is less than this, we treat it as the same storm. Otherwise, we start a new storm region.

  • method ({'onset', 'minimum'}, default='onset') –

    • 'onset': Return the “start index” for each storm region (after backtracking).

    • 'minimum': Return the single index at which Dst is minimal within each region.

Returns:

storm_indices – Indices in time (and Dst), one per identified storm, either onset or min.

Return type:

list of int

Examples

>>> from rbamlib.utils import storm_idx
>>> from rbamlib.web import omni
>>> time, Dst = omni('20131001', '20131101', {'Dst'})
>>> # ONSET method
>>> s_onsets = storm_idx(time, Dst, threshold=-40, gap_hours=1.0, method='onset')
>>> # MINIMUM method
>>> s_mins = storm_idx(time, Dst, threshold=-40, gap_hours=1.0, method='minimum')

rbamlib.utils.fixfill(time, data, fillval, method='nan', fillval_mode='eq')

Fix invalid values in time-series data using NaN replacement or interpolation.

This function identifies missing or invalid data based on a fillval rule, marks those values as NaN, and optionally interpolates over them using a linear scheme.

Parameters:
  • time (ndarray of datetime.datetime) – 1D array of strictly increasing datetime objects.

  • data (ndarray) – 1D array of numerical values corresponding to time.

  • fillval (float) – The fill value or threshold to identify missing data.

  • method ({'nan', 'interp'}, default='nan') –

    • 'nan': Replace missing values with NaN.

    • 'interp': Replace and linearly interpolate missing values.

  • fillval_mode ({'eq', 'gt', 'lt'}, default='eq') –

    • 'eq': treat fillval as an exact match, equal.

    • 'gt': treat all data >= fillval as missing.

    • 'lt': treat all data <= fillval as missing.

Returns:

fixed_data – The cleaned data array with missing values replaced or interpolated.

Return type:

ndarray
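The masking and interpolation described above might look roughly like the sketch below. It is an illustration under assumptions, not the library's implementation: fixfill_sketch is a hypothetical name, interpolation is done with np.interp over elapsed seconds, and values already NaN are treated as missing too.

```python
from datetime import datetime, timedelta

import numpy as np


def fixfill_sketch(time, data, fillval, method="nan", fillval_mode="eq"):
    """Hypothetical re-creation of the documented fill-value handling."""
    fixed = np.array(data, dtype=float)  # work on a float copy
    # Build the mask of missing samples according to the fillval rule.
    if fillval_mode == "eq":
        bad = fixed == fillval
    elif fillval_mode == "gt":
        bad = fixed >= fillval
    elif fillval_mode == "lt":
        bad = fixed <= fillval
    else:
        raise ValueError(f"Unknown fillval_mode: {fillval_mode!r}")
    bad = bad | np.isnan(fixed)  # assumption: existing NaNs also count as missing
    fixed[bad] = np.nan
    if method == "interp" and bad.any() and not bad.all():
        # Interpolate linearly in time, using seconds since the first sample.
        seconds = np.array([(t - time[0]).total_seconds() for t in time])
        fixed[bad] = np.interp(seconds[bad], seconds[~bad], fixed[~bad])
    return fixed


time = [datetime(2023, 1, 1) + timedelta(minutes=5 * i) for i in range(6)]
data = np.array([1.0, 2.0, 999, 4.0, 999, 6.0])
as_nan = fixfill_sketch(time, data, fillval=999, method="nan")
as_interp = fixfill_sketch(time, data, fillval=999, method="interp")
```

Note that np.interp holds the nearest valid value constant beyond the first and last valid samples, so leading or trailing gaps are filled flat in this sketch rather than extrapolated.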

Examples

>>> from datetime import datetime, timedelta
>>> import numpy as np
>>> from rbamlib.utils import fixfill
>>> time = [datetime(2023,1,1,0,0) + timedelta(minutes=5*i) for i in range(6)]
>>> data = np.array([1.0, 2.0, 999, 4.0, 999, 6.0])

Mark fill values as NaN:

>>> fixfill(time, data, fillval=999, method='nan', fillval_mode='eq')
array([ 1.,  2., nan,  4., nan,  6.])

Replace fill values and interpolate linearly:

>>> fixfill(time, data, fillval=999, method='interp', fillval_mode='eq')
array([1., 2., 3., 4., 5., 6.])