rollit.zscore: Python Rolling Z-Score for NumPy Arrays
The rollit.zscore() function computes the **rolling z-score** of the last element in each sliding window. It is designed for streaming outlier detection, anomaly flagging, and signal processing, and it works directly on NumPy arrays without requiring pandas.
Formula
For each window, the z-score of the last value is computed from that window's sample statistics:

$$ z = \frac{x_{\text{last}} - \mu}{\sigma} $$

where x_last is the final element of the sliding window, μ is the window mean, and σ is the window sample standard deviation (ddof=1). If σ = 0, the z-score is returned as NaN.
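The definition above can be sketched in plain NumPy. This is a reference illustration only, not rollit's implementation: the helper name `rolling_zscore_reference` is hypothetical, it assumes one output per full window, and it relies on `sliding_window_view` (NumPy 1.20 or later).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_zscore_reference(arr, window):
    """Z-score of the last element of each full window (hypothetical helper)."""
    arr = np.asarray(arr, dtype=float)
    if arr.ndim != 1:
        raise ValueError("arr must be one-dimensional")
    if window < 2:
        raise ValueError("window must be >= 2")
    if arr.size < window:
        return np.empty(0)

    windows = sliding_window_view(arr, window)  # shape: (n - window + 1, window)
    mean = windows.mean(axis=1)
    std = windows.std(axis=1, ddof=1)           # sample standard deviation

    z = np.full(mean.shape, np.nan)             # NaN wherever std is 0 (or NaN)
    nonzero = std > 0
    z[nonzero] = (windows[nonzero, -1] - mean[nonzero]) / std[nonzero]
    return z


z = rolling_zscore_reference([1.0, 2.0, 3.0, 3.0, 3.0, 3.0], window=3)
print(z)  # first window [1, 2, 3] gives (3 - 2) / 1 = 1.0; constant windows give NaN
```

The first window has mean 2 and sample standard deviation 1, so its last element scores exactly 1.0, while the two constant windows at the end have σ = 0 and therefore yield NaN.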
Parameters
arr (np.ndarray)
A one-dimensional NumPy array of numeric values.
window (int)
The number of elements in each window, including the element being scored. Must be an integer >= 2, since the sample standard deviation (ddof=1) is undefined for a single value.
min_periods (int | None, optional)
Minimum number of valid (non-NaN) observations required in a window. If set, windows that contain some NaNs are still scored, using only their valid elements.
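To make the effect of min_periods concrete, here is a minimal sketch of NaN-aware windowing. It is not rollit's code, and several behaviours are assumptions made for illustration: the helper name `nan_aware_zscore` is hypothetical, min_periods is assumed to default to the full window, and a window with too few valid values (or a NaN last element) is assumed to produce NaN.

```python
import numpy as np


def nan_aware_zscore(arr, window, min_periods=None):
    """Hypothetical NaN-aware rolling z-score; not rollit's implementation."""
    arr = np.asarray(arr, dtype=float)
    if min_periods is None:
        min_periods = window           # assumption: require a fully valid window
    min_periods = max(min_periods, 2)  # sample std needs at least 2 values

    n_out = max(arr.size - window + 1, 0)
    out = np.full(n_out, np.nan)
    for i in range(n_out):
        frame = arr[i:i + window]
        valid = frame[~np.isnan(frame)]
        last = frame[-1]
        if np.isnan(last) or valid.size < min_periods:
            continue                   # assumption: too few valid values -> NaN
        std = valid.std(ddof=1)
        if std > 0:
            out[i] = (last - valid.mean()) / std
    return out


data = np.array([1.0, np.nan, 3.0, 5.0, 7.0])
strict = nan_aware_zscore(data, window=3)                  # only [3, 5, 7] is scored
partial = nan_aware_zscore(data, window=3, min_periods=2)  # windows with one NaN are scored too
print(strict)
print(partial)
```

In this sketch the strict call scores only the last window, while min_periods=2 also scores the two windows that contain a single NaN, using their two valid elements.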
Stream Processing Design: A global z-score uses the mean and standard deviation of the entire series, so the score of each value depends on observations that arrive later. rollit.zscore() scores only the last element of each window against that window's own statistics, so every output depends solely on current and past values. This makes it safe for real-time streaming data, because no future information leaks into the past.
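The same idea can be sketched as an online scorer that sees one value at a time. The class `StreamingZScore` below is a hypothetical illustration, not part of rollit; it assumes NaN is emitted until the window is full, which may differ from how rollit handles the warm-up period.

```python
from collections import deque
import math


class StreamingZScore:
    """Hypothetical online scorer that keeps only the last `window` values."""

    def __init__(self, window):
        if window < 2:
            raise ValueError("window must be >= 2")
        self.buffer = deque(maxlen=window)

    def update(self, x):
        """Add one observation and return its z-score, or NaN if undefined."""
        self.buffer.append(x)
        if len(self.buffer) < self.buffer.maxlen:
            return math.nan                    # assumption: NaN until the window is full
        n = len(self.buffer)
        mean = sum(self.buffer) / n
        var = sum((v - mean) ** 2 for v in self.buffer) / (n - 1)  # ddof=1
        if var == 0:
            return math.nan
        return (x - mean) / math.sqrt(var)


scorer = StreamingZScore(window=3)
scores = [scorer.update(x) for x in [100.0, 101.0, 100.5, 102.0, 150.0, 101.0]]
print(scores)
```

Each call to `update` uses only values already received, so a score never changes when later data arrives.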
Usage for Outlier Detection
import numpy as np
import rollit
# Generate flat 1D array of data (e.g. sensor stream with an anomaly)
stream = np.array([100.0, 101.0, 100.5, 102.0, 150.0, 101.0])
# Compute rolling z-score of last elements in a sliding window (window = 3)
# Z = (x_last - mean) / std
zscores = rollit.zscore(stream, window=3)
print("Stream values:", stream)
print("Rolling z-scores:", zscores)
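# Expected from Z = (x_last - mean) / std with ddof=1, rounded: [ 0.00, 1.09, 1.15, -0.60 ]
# The spike at index 4 (150.0) yields the largest z-score (about 1.15).
# With window=3 and ddof=1, |z| cannot exceed 2 / sqrt(3) ≈ 1.155, so this is close to the maximum.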
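When turning scores into outlier flags, the window size limits how large a score can get: with the sample standard deviation (ddof=1), no element of an n-element window can have |z| above (n - 1) / sqrt(n). The sketch below illustrates this and a simple threshold rule. The helpers `max_abs_zscore` and `flag_outliers` are hypothetical, written in plain NumPy rather than with rollit, and the longer stream is made-up illustration data.

```python
import math
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def max_abs_zscore(window):
    """Upper bound on |z| for any element of a window when ddof=1."""
    return (window - 1) / math.sqrt(window)


def flag_outliers(arr, window, threshold):
    """Indices of arr whose last-element |z| exceeds threshold (hypothetical helper)."""
    arr = np.asarray(arr, dtype=float)
    frames = sliding_window_view(arr, window)
    std = frames.std(axis=1, ddof=1)
    # Zero-variance windows divide by zero; the resulting NaN never exceeds the threshold.
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (frames[:, -1] - frames.mean(axis=1)) / std
    hits = np.flatnonzero(np.abs(z) > threshold)
    return hits + window - 1                   # map window index back to arr index


print(round(max_abs_zscore(3), 4))             # 1.1547: a threshold of 3 can never fire
print(max_abs_zscore(11) > 3.0)                # True: 11 is the smallest window that can exceed 3

stream = np.array([100.0, 101.0, 100.5, 102.0, 101.5, 100.0,
                   101.0, 100.5, 102.0, 101.0, 150.0, 101.0])
flagged = flag_outliers(stream, window=11, threshold=3.0)
print(flagged)                                 # [10], the position of the 150.0 spike
```

A common |z| > 3 rule therefore needs a window of at least 11 elements; with smaller windows, a lower threshold or a different score is needed.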