perda.core_data_structures.joins
- perda.core_data_structures.joins.inner_join(left_ts, left_val, right_ts, right_val, *, tolerance)
Inner join: keep only left timestamps that have a matching right timestamp within tolerance.
Process:
1. For each left timestamp, find the closest right timestamp.
2. Keep the left timestamp only if the distance is within tolerance.
3. Match right values to the kept left timestamps.
- Parameters:
left_ts (np.ndarray) – Timestamps for left series
left_val (np.ndarray) – Values for left series
right_ts (np.ndarray) – Timestamps for right series
right_val (np.ndarray) – Values for right series
tolerance (float) – Maximum allowed distance between a left timestamp and its closest right timestamp for the pair to count as a match. Left timestamps whose closest right timestamp is farther away than tolerance are dropped.
- Return type:
Tuple[ndarray, ndarray, ndarray]
- Returns:
timestamps (np.ndarray) – Subset of left timestamps that have matches within tolerance
left_values (np.ndarray) – Left values at the matched timestamps
right_values (np.ndarray) – Right values interpolated to the matched timestamps
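The three-step process above can be sketched in plain NumPy. This is only an illustration of the documented behaviour, not the perda implementation: the name inner_join_sketch is hypothetical, right_ts is assumed to be sorted ascending and non-empty, and the right value is taken from the closest right sample rather than interpolated.

```python
import numpy as np


def inner_join_sketch(left_ts, left_val, right_ts, right_val, *, tolerance):
    """Hypothetical nearest-match inner join (assumes right_ts is sorted ascending)."""
    left_ts = np.asarray(left_ts, dtype=float)
    left_val = np.asarray(left_val, dtype=float)
    right_ts = np.asarray(right_ts, dtype=float)
    right_val = np.asarray(right_val, dtype=float)

    # Step 1: for each left timestamp, find the index of the closest right timestamp.
    idx = np.searchsorted(right_ts, left_ts)
    lo = np.clip(idx - 1, 0, len(right_ts) - 1)  # neighbour at or before
    hi = np.clip(idx, 0, len(right_ts) - 1)      # neighbour at or after
    pick_hi = np.abs(right_ts[hi] - left_ts) < np.abs(left_ts - right_ts[lo])
    nearest = np.where(pick_hi, hi, lo)

    # Step 2: keep only left timestamps whose closest match is within tolerance.
    keep = np.abs(right_ts[nearest] - left_ts) <= tolerance

    # Step 3: pair each kept left timestamp with its matched right value.
    return left_ts[keep], left_val[keep], right_val[nearest[keep]]


ts, lv, rv = inner_join_sketch(
    [0.0, 1.0, 2.0, 5.0], [10.0, 11.0, 12.0, 15.0],
    [0.1, 2.05, 3.0], [100.0, 200.0, 300.0],
    tolerance=0.2,
)
```

Here the left timestamps 1.0 and 5.0 have no right timestamp within 0.2, so only 0.0 and 2.0 survive.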
- perda.core_data_structures.joins.left_join(left_ts, left_val, right_ts, right_val)
Left join: keep all left timestamps, match and interpolate right values.
- Parameters:
left_ts (np.ndarray) – Timestamps for left series (these are kept exactly)
left_val (np.ndarray) – Values for left series
right_ts (np.ndarray) – Timestamps for right series (these will be matched to left)
right_val (np.ndarray) – Values for right series
interpolate (bool, optional) – If True, interpolate right values to fill all left timestamps. If False, only use matched values (NaN for unmatched). Default is True.
- Return type:
Tuple[ndarray, ndarray, ndarray]
- Returns:
timestamps (np.ndarray) – The left timestamps (unchanged)
left_values (np.ndarray) – The left values (unchanged)
right_values (np.ndarray) – Right values matched/interpolated to left timestamps
Notes
Process:
1. For each right timestamp, find the closest left timestamp.
2. If multiple right timestamps map to the same left timestamp, average them.
3. Interpolate right values to fill remaining left timestamps (if interpolate=True).
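A minimal NumPy sketch of the process described in these notes follows. It is an assumption-laden illustration rather than the library's code: left_join_sketch is a hypothetical name, left_ts is assumed to be sorted ascending, and the gap filling uses np.interp, which holds the edge value for left timestamps outside the matched range.

```python
import numpy as np


def left_join_sketch(left_ts, left_val, right_ts, right_val, interpolate=True):
    """Hypothetical left join (assumes left_ts is sorted ascending)."""
    left_ts = np.asarray(left_ts, dtype=float)
    left_val = np.asarray(left_val, dtype=float)
    right_ts = np.asarray(right_ts, dtype=float)
    right_val = np.asarray(right_val, dtype=float)
    n = len(left_ts)

    # Step 1: for each right timestamp, find the index of the closest left timestamp.
    idx = np.searchsorted(left_ts, right_ts)
    lo = np.clip(idx - 1, 0, n - 1)
    hi = np.clip(idx, 0, n - 1)
    pick_hi = np.abs(left_ts[hi] - right_ts) < np.abs(right_ts - left_ts[lo])
    nearest = np.where(pick_hi, hi, lo)

    # Step 2: average right values that land on the same left timestamp.
    sums = np.bincount(nearest, weights=right_val, minlength=n)
    counts = np.bincount(nearest, minlength=n)
    matched = counts > 0
    out = np.full(n, np.nan)
    out[matched] = sums[matched] / counts[matched]

    # Step 3: optionally fill unmatched left timestamps by linear interpolation
    # between matched ones (np.interp repeats the edge value outside that range).
    if interpolate and matched.any():
        out[~matched] = np.interp(left_ts[~matched], left_ts[matched], out[matched])

    return left_ts, left_val, out


args = ([0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0], [0.1, 1.9, 2.1], [10.0, 30.0, 50.0])
_, _, filled = left_join_sketch(*args)
_, _, sparse = left_join_sketch(*args, interpolate=False)
```

In this example the right samples at 1.9 and 2.1 both map to the left timestamp 2.0 and are averaged to 40.0. With interpolate=False the left timestamps 1.0 and 3.0 stay NaN.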
- perda.core_data_structures.joins.outer_join(left_ts, left_val, right_ts, right_val, *, drop_nan=True, fill=0.0)
Outer join: union of timestamps with linear interpolation.
- Parameters:
left_ts (np.ndarray) – Timestamps for left series
left_val (np.ndarray) – Values for left series
right_ts (np.ndarray) – Timestamps for right series
right_val (np.ndarray) – Values for right series
drop_nan (bool, optional) – If True, drop rows where either series has NaN after interpolation. Default is True.
fill (float, optional) – Fill value for NaNs when drop_nan is False. Default is 0.0.
- Return type:
Tuple[ndarray, ndarray, ndarray]
- Returns:
timestamps (np.ndarray) – Union of all timestamps
left_values (np.ndarray) – Left values interpolated to union timestamps
right_values (np.ndarray) – Right values interpolated to union timestamps
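The outer join can be approximated as below. This sketch is not taken from the perda source: outer_join_sketch is a hypothetical name, both timestamp arrays are assumed to be sorted ascending, and it assumes that NaNs arise where a union timestamp falls outside a series' own time range (no extrapolation).

```python
import numpy as np


def outer_join_sketch(left_ts, left_val, right_ts, right_val, *, drop_nan=True, fill=0.0):
    """Hypothetical outer join (assumes both timestamp arrays are sorted ascending)."""
    left_ts = np.asarray(left_ts, dtype=float)
    left_val = np.asarray(left_val, dtype=float)
    right_ts = np.asarray(right_ts, dtype=float)
    right_val = np.asarray(right_val, dtype=float)

    # Sorted union of all timestamps from both series.
    ts = np.union1d(left_ts, right_ts)

    # Linearly interpolate each series onto the union; mark out-of-range points as NaN
    # (an assumption of this sketch, not confirmed behaviour of the library).
    lv = np.interp(ts, left_ts, left_val, left=np.nan, right=np.nan)
    rv = np.interp(ts, right_ts, right_val, left=np.nan, right=np.nan)

    if drop_nan:
        # Drop rows where either series is NaN.
        keep = ~(np.isnan(lv) | np.isnan(rv))
        return ts[keep], lv[keep], rv[keep]

    # Otherwise replace NaNs with the fill value.
    return ts, np.where(np.isnan(lv), fill, lv), np.where(np.isnan(rv), fill, rv)


ts_drop, lv_drop, rv_drop = outer_join_sketch([0.0, 2.0], [0.0, 20.0], [1.0, 3.0], [100.0, 300.0])
ts_fill, lv_fill, rv_fill = outer_join_sketch(
    [0.0, 2.0], [0.0, 20.0], [1.0, 3.0], [100.0, 300.0], drop_nan=False, fill=-1.0
)
```

With drop_nan=True only the overlapping timestamps 1.0 and 2.0 remain. With drop_nan=False all four union timestamps are returned and the out-of-range entries take the fill value.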