perda.core_data_structures.joins

perda.core_data_structures.joins.inner_join(left_ts, left_val, right_ts, right_val, *, tolerance)

Inner join: keep only left timestamps that have a matching right timestamp within tolerance.

Process:
  1. For each left timestamp, find the closest right timestamp.

  2. Keep the left timestamp only if the distance is within tolerance.

  3. Match right values to the kept left timestamps.

Parameters:
  • left_ts (np.ndarray) – Timestamps for left series

  • left_val (np.ndarray) – Values for left series

  • right_ts (np.ndarray) – Timestamps for right series

  • right_val (np.ndarray) – Values for right series

  • tolerance (float) – Maximum allowed distance between a left timestamp and its closest right timestamp for a match. Left timestamps whose distance exceeds tolerance are dropped.

Return type:

Tuple[ndarray, ndarray, ndarray]

Returns:

  • timestamps (np.ndarray) – Subset of left timestamps that have matches within tolerance

  • left_values (np.ndarray) – Left values at the matched timestamps

  • right_values (np.ndarray) – Right values interpolated to the matched timestamps
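
Example

The process above can be sketched with plain NumPy. This is a minimal illustration, not perda's implementation: the name inner_join_sketch is hypothetical, right_ts is assumed to be sorted ascending and non-empty, and right values are assumed to be linearly interpolated with np.interp (clamped at the ends of the right series).

```python
import numpy as np


def inner_join_sketch(left_ts, left_val, right_ts, right_val, *, tolerance):
    """Hypothetical stand-in for inner_join; assumes right_ts is sorted and non-empty."""
    left_ts = np.asarray(left_ts, dtype=float)
    left_val = np.asarray(left_val, dtype=float)
    right_ts = np.asarray(right_ts, dtype=float)
    right_val = np.asarray(right_val, dtype=float)

    # Step 1: for each left timestamp, locate its two right neighbours
    # and pick whichever one is closer.
    idx = np.searchsorted(right_ts, left_ts)
    lo = np.clip(idx - 1, 0, len(right_ts) - 1)
    hi = np.clip(idx, 0, len(right_ts) - 1)
    hi_is_closer = np.abs(right_ts[hi] - left_ts) < np.abs(left_ts - right_ts[lo])
    nearest = np.where(hi_is_closer, hi, lo)

    # Step 2: keep only left timestamps whose nearest right timestamp
    # lies within the tolerance.
    keep = np.abs(right_ts[nearest] - left_ts) <= tolerance

    # Step 3: evaluate the right series at the kept left timestamps
    # (assumption: linear interpolation, clamped outside the right range).
    right_at_left = np.interp(left_ts[keep], right_ts, right_val)
    return left_ts[keep], left_val[keep], right_at_left


ts, lv, rv = inner_join_sketch(
    [0.0, 1.0, 2.0, 5.0], [10.0, 11.0, 12.0, 15.0],
    [0.1, 2.05, 3.0], [1.0, 2.0, 3.0],
    tolerance=0.2,
)
```

With these inputs only the left timestamps 0.0 and 2.0 survive: 1.0 and 5.0 have no right timestamp within 0.2 of them.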

perda.core_data_structures.joins.left_join(left_ts, left_val, right_ts, right_val)

Left join: keep all left timestamps; match and interpolate right values onto them.

Parameters:
  • left_ts (np.ndarray) – Timestamps for left series (these are kept exactly)

  • left_val (np.ndarray) – Values for left series

  • right_ts (np.ndarray) – Timestamps for right series (these will be matched to left)

  • right_val (np.ndarray) – Values for right series

  • interpolate (bool, optional) – If True, interpolate right values to fill all left timestamps. If False, only use matched values (NaN for unmatched). Default is True.

Return type:

Tuple[ndarray, ndarray, ndarray]

Returns:

  • timestamps (np.ndarray) – The left timestamps (unchanged)

  • left_values (np.ndarray) – The left values (unchanged)

  • right_values (np.ndarray) – Right values matched/interpolated to left timestamps

Notes

Process:
  1. For each right timestamp, find the closest left timestamp.

  2. If multiple right timestamps map to the same left timestamp, average them.

  3. Interpolate right values to fill remaining left timestamps (if interpolate=True).
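
Example

The three steps can be sketched as follows. This is an illustration under stated assumptions, not perda's implementation: left_join_sketch is a hypothetical name, left_ts is assumed to be sorted ascending and non-empty, and gaps are filled with np.interp, which clamps to the first or last matched value outside the matched range.

```python
import numpy as np


def left_join_sketch(left_ts, left_val, right_ts, right_val, interpolate=True):
    """Hypothetical stand-in for left_join; assumes left_ts is sorted and non-empty."""
    left_ts = np.asarray(left_ts, dtype=float)
    left_val = np.asarray(left_val, dtype=float)
    right_ts = np.asarray(right_ts, dtype=float)
    right_val = np.asarray(right_val, dtype=float)
    n = len(left_ts)

    # Step 1: map every right timestamp to the index of its closest left timestamp.
    idx = np.searchsorted(left_ts, right_ts)
    lo = np.clip(idx - 1, 0, n - 1)
    hi = np.clip(idx, 0, n - 1)
    hi_is_closer = np.abs(left_ts[hi] - right_ts) < np.abs(right_ts - left_ts[lo])
    nearest = np.where(hi_is_closer, hi, lo)

    # Step 2: average the right values that landed on the same left timestamp.
    sums = np.bincount(nearest, weights=right_val, minlength=n)
    counts = np.bincount(nearest, minlength=n)
    matched = counts > 0
    right_at_left = np.full(n, np.nan)
    right_at_left[matched] = sums[matched] / counts[matched]

    # Step 3: optionally fill unmatched left timestamps by linear interpolation
    # between the matched ones (assumption: np.interp, clamped at the ends).
    if interpolate and matched.any():
        right_at_left[~matched] = np.interp(
            left_ts[~matched], left_ts[matched], right_at_left[matched]
        )
    return left_ts, left_val, right_at_left


ts, lv, rv = left_join_sketch(
    [0.0, 1.0, 2.0, 3.0], [0.0, 10.0, 20.0, 30.0],
    [0.1, 0.2, 2.9], [1.0, 3.0, 9.0],
)
```

Here the right samples at 0.1 and 0.2 both map to left timestamp 0.0 and are averaged to 2.0, the sample at 2.9 maps to 3.0, and the left timestamps 1.0 and 2.0 are filled by interpolation. With interpolate=False those two entries stay NaN.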

perda.core_data_structures.joins.outer_join(left_ts, left_val, right_ts, right_val, *, drop_nan=True, fill=0.0)

Outer join: union of timestamps with linear interpolation.

Parameters:
  • left_ts (np.ndarray) – Timestamps for left series

  • left_val (np.ndarray) – Values for left series

  • right_ts (np.ndarray) – Timestamps for right series

  • right_val (np.ndarray) – Values for right series

  • drop_nan (bool, optional) – If True, drop rows where either series has NaN after interpolation. Default is True.

  • fill (float, optional) – Fill value for NaNs when drop_nan is False. Default is 0.0.

Return type:

Tuple[ndarray, ndarray, ndarray]

Returns:

  • timestamps (np.ndarray) – Union of all timestamps

  • left_values (np.ndarray) – Left values interpolated to union timestamps

  • right_values (np.ndarray) – Right values interpolated to union timestamps
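
Example

A minimal NumPy sketch of an outer join on the union of timestamps. It is not perda's implementation: outer_join_sketch is a hypothetical name, both timestamp arrays are assumed to be sorted ascending, and the sketch assumes that a series yields NaN outside its own time range (no extrapolation), which is what gives drop_nan and fill something to act on.

```python
import numpy as np


def outer_join_sketch(left_ts, left_val, right_ts, right_val, *, drop_nan=True, fill=0.0):
    """Hypothetical stand-in for outer_join; assumes sorted timestamp arrays."""
    left_ts = np.asarray(left_ts, dtype=float)
    left_val = np.asarray(left_val, dtype=float)
    right_ts = np.asarray(right_ts, dtype=float)
    right_val = np.asarray(right_val, dtype=float)

    # Sorted, de-duplicated union of both timestamp arrays.
    ts = np.union1d(left_ts, right_ts)

    # Linearly interpolate each series onto the union.
    # Assumption: points outside a series' own range become NaN.
    lv = np.interp(ts, left_ts, left_val, left=np.nan, right=np.nan)
    rv = np.interp(ts, right_ts, right_val, left=np.nan, right=np.nan)

    if drop_nan:
        # Drop rows where either series is NaN.
        keep = ~(np.isnan(lv) | np.isnan(rv))
        return ts[keep], lv[keep], rv[keep]

    # Otherwise keep every row and replace NaNs with the fill value.
    lv = np.where(np.isnan(lv), fill, lv)
    rv = np.where(np.isnan(rv), fill, rv)
    return ts, lv, rv


ts, lv, rv = outer_join_sketch([0.0, 2.0], [0.0, 20.0], [1.0, 3.0], [100.0, 300.0])
```

In this example the union is 0, 1, 2, 3. With drop_nan=True only timestamps 1 and 2 remain, because the right series has no value at 0 and the left series has none at 3. With drop_nan=False all four rows are returned and the missing entries are set to fill.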