Important
The former name of this package is vf (Various Functions). It has been renamed to utils.
Utilities (Various functions)
The utils package provides a collection of supporting utility functions for various purposes. It is designed to be expanded with more functions over time.
- Main Features:
idx: Get the index of val from array arr
parse_datetime: Parses an input from various formats into a datetime
storm_idx: Identify storms in Dst based on a threshold
fixfill: Fix invalid values in array
| fixfill | Fix invalid values in time-series data using NaN replacement or interpolation. |
| idx | Get the index of val from array arr with an optional tolerance. |
| parse_datetime | Parses a datetime input from various formats into a Python datetime object. |
| storm_idx | Identify storms in Dst based on a threshold, then either return their "onset" or "minimum Dst" index. |
Functions
- rbamlib.utils.idx(arr, val, tol=None)
Get the index of val from array arr with an optional tolerance.
If tol is specified and the minimum absolute difference between any array element and val is greater than tol, returns NaN. Without tol, returns the index of the nearest value in arr to val.
- Parameters:
arr (array_like) – Input array.
val (float) – The value to search for.
tol (float, optional) – Tolerance for the index selection. If not provided, the nearest index is returned.
- Returns:
The index of val in arr if within tol when specified, otherwise the index of the nearest value. Returns NaN if the condition is not met or val is not found within tol.
- Return type:
int or NaN
Examples
>>> import numpy as np
>>> from rbamlib.utils import idx
>>> arr = np.array([1, 2, 3, 4, 5])
>>> idx(arr, 3.1)
2
>>> idx(arr, 3.1, tol=0.05)
nan
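The nearest-value lookup with an optional tolerance described for idx can be illustrated with a minimal sketch. The helper name nearest_index is hypothetical and this is not the rbamlib implementation; it only assumes the behaviour stated above (nearest index, or NaN when the closest element is farther than tol).

```python
import numpy as np

def nearest_index(arr, val, tol=None):
    """Hypothetical stand-in for idx: nearest index, or NaN if outside tol."""
    arr = np.asarray(arr, dtype=float)
    diff = np.abs(arr - val)          # distance of every element to val
    i = int(np.argmin(diff))          # first index with the smallest distance
    if tol is not None and diff[i] > tol:
        return np.nan                 # closest element is still too far away
    return i

arr = np.array([1, 2, 3, 4, 5])
print(nearest_index(arr, 3.1))            # 2
print(nearest_index(arr, 3.1, tol=0.05))  # nan
print(nearest_index(arr, 3.1, tol=0.2))   # 2
```

Because the return value may be NaN rather than an integer, callers would typically check it with np.isnan before using it as an array index.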
- rbamlib.utils.parse_datetime(date_input)
Parses a datetime input from various formats into a Python datetime object.
- Parameters:
date_input (str or datetime) –
Input date in various formats, such as:
'2025010112' → YYYYMMDDHH
'2025-01-01' → YYYY-MM-DD
'20250101' → YYYYMMDD
'20250101T12:00' → ISO-like format
'2025-01-01T12:00' → ISO-like format
'2025-01-01 12:30' → Standard format
'01-01-2025' → European format
'Jan 01, 2025' → Human-readable
- Returns:
Parsed datetime object.
- Return type:
datetime.datetime
- Raises:
ValueError – If the input format is invalid.
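One common way to accept several input formats is to try a list of strptime patterns in order. The sketch below follows that approach for the formats listed above; the name parse_any and the pattern list are assumptions made for illustration, and the actual parse_datetime may work differently. The 'Jan 01, 2025' pattern relies on English month abbreviations, which holds for the default C locale.

```python
from datetime import datetime

# Candidate patterns, tried in order. "%Y%m%d" comes before "%Y%m%d%H" so that
# an eight-digit date is never misread as a shorter date plus an hour.
_FORMATS = (
    "%Y%m%d",          # 20250101
    "%Y%m%d%H",        # 2025010112
    "%Y-%m-%d",        # 2025-01-01
    "%Y%m%dT%H:%M",    # 20250101T12:00
    "%Y-%m-%dT%H:%M",  # 2025-01-01T12:00
    "%Y-%m-%d %H:%M",  # 2025-01-01 12:30
    "%d-%m-%Y",        # 01-01-2025 (day first)
    "%b %d, %Y",       # Jan 01, 2025
)

def parse_any(date_input):
    """Hypothetical helper: return a datetime for any of the formats above."""
    if isinstance(date_input, datetime):
        return date_input             # already parsed, pass through unchanged
    text = str(date_input).strip()
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue                  # pattern did not match, try the next one
    raise ValueError(f"Unrecognized date format: {date_input!r}")

print(parse_any("2025010112"))        # 2025-01-01 12:00:00
print(parse_any("Jan 01, 2025"))      # 2025-01-01 00:00:00
```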
- rbamlib.utils.storm_idx(time, Dst, threshold=-40.0, gap_hours=1.0, method='onset')
Identify storms in Dst based on a threshold, then either return their “onset” or “minimum Dst” index.
- Parameters:
time (1D array-like of datetime.datetime) – Strictly increasing times.
Dst (1D array-like of float) – Dst index at the same times as time.
threshold (float, optional) – Storm threshold. Values below are considered “in a storm”.
gap_hours (float, optional) – If the time difference to the previous storm point is less than this, the point is treated as part of the same storm. Otherwise, a new storm region starts.
method (str, optional) –
'onset': Return the “start index” for each storm region (after backtracking).
'minimum': Return the single index at which Dst is minimal within each region.
- Returns:
storm_indices – Indices in time (and Dst), one per identified storm, either onset or min.
- Return type:
list of int
Steps
1. Find all times where Dst < threshold.
2. Group those indices into contiguous “storm regions” separated by at least gap_hours.
3. For each storm region:
   - If method='onset': take the earliest index in that region (the threshold crossing), then backtrack to where Dst >= 0 or index=0. That is the final “onset.”
   - If method='minimum': return the index in that region where Dst is minimum. Double dips remain in the same region, so only one minimum is reported per region.
Examples
>>> import datetime
>>> from rbamlib.web import omni
>>> from rbamlib.utils import storm_idx
>>> time, Dst = omni('20131001', '20131101', {'Dst'})
>>> # ONSET method
>>> s_onsets = storm_idx(time, Dst, threshold=-40, gap_hours=1.0, method='onset')
>>> # MINIMUM method
>>> s_mins = storm_idx(time, Dst, threshold=-40, gap_hours=1.0, method='minimum')
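The steps listed for storm_idx can be sketched on synthetic data without downloading OMNI. The function storm_indices_sketch and the Dst values below are made up for illustration, and this is not the library's code. In particular, points are merged when they are closer than gap_hours, following the parameter description; how the real function treats a spacing exactly equal to gap_hours is not stated here, so the example uses gap_hours=2.0 with hourly data to stay clear of that boundary.

```python
from datetime import datetime, timedelta
import numpy as np

def storm_indices_sketch(time, dst, threshold=-40.0, gap_hours=1.0, method="onset"):
    """Hypothetical re-implementation of the documented steps."""
    if method not in ("onset", "minimum"):
        raise ValueError("method must be 'onset' or 'minimum'")
    dst = np.asarray(dst, dtype=float)
    below = np.flatnonzero(dst < threshold)      # step 1: points inside a storm
    if below.size == 0:
        return []
    regions = [[int(below[0])]]                  # step 2: group by time gap
    for prev, cur in zip(below[:-1], below[1:]):
        hours = (time[cur] - time[prev]).total_seconds() / 3600.0
        if hours < gap_hours:
            regions[-1].append(int(cur))         # same storm region
        else:
            regions.append([int(cur)])           # gap reached: new region
    result = []
    for region in regions:                       # step 3: one index per region
        if method == "onset":
            i = region[0]                        # first threshold crossing
            while i > 0 and dst[i] < 0:          # backtrack to Dst >= 0 or index 0
                i -= 1
            result.append(i)
        else:
            result.append(region[int(np.argmin(dst[region]))])
    return result

# Synthetic hourly series with two dips below -40 nT (values are invented).
time = [datetime(2013, 10, 1) + timedelta(hours=h) for h in range(10)]
dst = [5, -10, -50, -80, -45, -20, 10, -5, -60, -30]
print(storm_indices_sketch(time, dst, gap_hours=2.0, method="onset"))    # [0, 6]
print(storm_indices_sketch(time, dst, gap_hours=2.0, method="minimum"))  # [3, 8]
```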
- rbamlib.utils.fixfill(time, data, fillval, method='nan', fillval_mode='eq')
Fix invalid values in time-series data using NaN replacement or interpolation.
This function identifies missing or invalid data based on a fillval rule, marks those values as NaN, and optionally interpolates over them using a linear scheme.
- Parameters:
time (ndarray of datetime.datetime) – 1D array of strictly increasing datetime objects.
data (ndarray) – 1D array of numerical values corresponding to time.
fillval (float) – The fill value or threshold to identify missing data.
method (str, optional) –
'nan': Replace missing values with NaN.
'interp': Replace and linearly interpolate missing values.
fillval_mode (str, optional) –
'eq': treat data exactly equal to fillval as missing.
'gt': treat all data >= fillval as missing.
'lt': treat all data <= fillval as missing.
- Returns:
fixed_data – The cleaned data array with missing values replaced or interpolated.
- Return type:
ndarray
Examples
>>> from datetime import datetime, timedelta
>>> import numpy as np
>>> from rbamlib.utils import fixfill
>>> time = [datetime(2023,1,1,0,0) + timedelta(minutes=5*i) for i in range(6)]
>>> data = np.array([1.0, 2.0, 999, 4.0, 999, 6.0])
Mark fill values as NaN:
>>> fixfill(time, data, fillval=999, method='nan', fillval_mode='eq')
array([ 1., 2., nan, 4., nan, 6.])
Replace fill values and interpolate linearly:
>>> fixfill(time, data, fillval=999, method='interp', fillval_mode='eq')
array([1., 2., 3., 4., 5., 6.])
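The mask-then-interpolate idea behind fixfill can be sketched with NumPy alone. The function fixfill_sketch is a hypothetical illustration rather than the rbamlib source: it assumes that times are converted to elapsed seconds for np.interp, that pre-existing NaNs count as missing, and that missing values at the edges take the nearest valid value (np.interp's default clamping).

```python
from datetime import datetime, timedelta
import numpy as np

def fixfill_sketch(time, data, fillval, method="nan", fillval_mode="eq"):
    """Hypothetical sketch: mask fill values, then optionally interpolate."""
    if method not in ("nan", "interp"):
        raise ValueError("method must be 'nan' or 'interp'")
    out = np.array(data, dtype=float)            # work on a float copy
    if fillval_mode == "eq":
        bad = out == fillval
    elif fillval_mode == "gt":
        bad = out >= fillval
    elif fillval_mode == "lt":
        bad = out <= fillval
    else:
        raise ValueError("fillval_mode must be 'eq', 'gt' or 'lt'")
    bad |= np.isnan(out)                         # assumption: NaNs are missing too
    out[bad] = np.nan
    if method == "interp" and bad.any() and (~bad).any():
        # Elapsed seconds since the first sample serve as the x-axis.
        t = np.array([(ti - time[0]).total_seconds() for ti in time])
        out[bad] = np.interp(t[bad], t[~bad], out[~bad])
    return out

time = [datetime(2023, 1, 1) + timedelta(minutes=5 * i) for i in range(6)]
data = np.array([1.0, 2.0, 999, 4.0, 999, 6.0])
print(fixfill_sketch(time, data, fillval=999, method="nan"))
print(fixfill_sketch(time, data, fillval=999, method="interp"))
```

Interpolating against elapsed time rather than array position matters when samples are unevenly spaced, since a position-based interpolation would weight neighbours equally regardless of how far apart they are in time.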