Documentation
Complete technical guide for the rollit library. Learn functions, algorithms, NaN propagation rules, and specifications.
Quickstart & Installation
`rollit` requires only NumPy to run. Install it using pip:
Understanding Output Lengths
Rolling calculations are emitted only for complete windows, so the output array is shorter than the input by `window - 1` elements. For an input of length `N` and a window of size `W`, the output length is `N - W + 1`.
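As a quick check of the length rule, the sketch below counts complete windows with NumPy's own `sliding_window_view`. It does not call `rollit`; the helper name `expected_output_length` is hypothetical and exists only for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def expected_output_length(n: int, window: int) -> int:
    """Number of complete windows of size `window` in an array of length `n`."""
    return n - window + 1


data = np.arange(7, dtype=float)

# One row per complete window; the row count is the rolling output length.
windows = sliding_window_view(data, 3)

print(windows.shape[0])                      # 5
print(expected_output_length(len(data), 3))  # 5
```

With `window=1` the output has the same length as the input, and with `window == len(data)` it has exactly one element.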
Basic Usage
```python
import numpy as np
import rollit

# Generate input data
data = np.array([10.0, 12.0, 15.0, 13.0, 16.0, 18.0, 20.0])

# Compute simple 3-day moving average
ma = rollit.mean(data, window=3)
print(ma)
# Output: [12.33, 13.33, 14.67, 15.67, 18.00]
# Output length: 7 - 3 + 1 = 5
```
Core Reductions Reference
These functions run their rolling reductions in vectorized NumPy operations (compiled C loops) rather than Python-level loops.
- `rollit.mean(arr, window, min_periods=None)`: Computes the moving average. Matches pandas `df.rolling(w).mean()`.
- `rollit.std(arr, window, min_periods=None)`: Computes the rolling sample standard deviation (uses Bessel's correction, `ddof=1`, to match pandas).
- `rollit.sum(arr, window, min_periods=None)`: Computes rolling sums.
- `rollit.min(arr, window)` / `rollit.max(...)`: Compute the rolling minimum and maximum within the sliding window.
Example
```python
import numpy as np
import rollit

arr = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

# Moving average: μ = 1/W * ∑ x_i
means = rollit.mean(arr, window=3)  # [2.33, 4.67, 9.33]

# Sample standard deviation (ddof=1, matching pandas)
# σ = √[ 1/(W-1) * ∑ (x_i - μ)² ]
stds = rollit.std(arr, window=3)    # [1.53, 3.06, 6.11]

# Moving sum
sums = rollit.sum(arr, window=3)    # [7.0, 14.0, 28.0]

# Rolling min/max
mins = rollit.min(arr, window=3)    # [1.0, 2.0, 4.0]
maxs = rollit.max(arr, window=3)    # [4.0, 8.0, 16.0]
```
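To make the definitions concrete, the same reductions can be reproduced with plain NumPy. This is a reference sketch for cross-checking values, not `rollit`'s implementation; it assumes NumPy 1.20 or later for `sliding_window_view`, and the `ref_*` names are illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

# One row per complete window: [[1, 2, 4], [2, 4, 8], [4, 8, 16]]
windows = sliding_window_view(arr, 3)

ref_mean = windows.mean(axis=1)
ref_std = windows.std(axis=1, ddof=1)  # ddof=1 applies Bessel's correction
ref_sum = windows.sum(axis=1)
ref_min = windows.min(axis=1)
ref_max = windows.max(axis=1)

print(np.round(ref_mean, 2))  # [2.33 4.67 9.33]
print(np.round(ref_std, 2))   # [1.53 3.06 6.11]
```

Reducing along `axis=1` collapses each window to a single value, which is why every result has `N - W + 1` entries.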
Outliers & Normalization Functions
`rollit` includes built-in functions for anomaly detection and scaling.
- `rollit.zscore(arr, window, min_periods=None)`: Computes the rolling Z-score of the last element in the window: `(x_last - mean) / std`.
- `rollit.normalize(arr, window, min_periods=None)`: Scales the last element in the window to [0, 1] relative to the local min/max: `(x_last - min) / (max - min)`.
- `rollit.apply(arr, window, fn, min_periods=None)`: Custom callback support (e.g. median). Runs an internal Python loop, so it is slower than the vectorized C loops.
Example
```python
import numpy as np
import rollit

prices = np.array([100.0, 102.0, 101.0, 105.0, 110.0, 95.0, 100.0])

# Z-score of the last element in each window, relative to that window's stats:
# Z = (x_last - μ) / σ
# Assuming σ is the sample standard deviation (ddof=1, as in rollit.std),
# this gives approximately [0.00, 1.12, 1.03, -1.09, -0.22]
z = rollit.zscore(prices, window=3)

# Min-max normalization: scales the last element of each window to [0, 1]
# Norm = (x_last - min) / (max - min)
# Returns: [0.5, 1.0, 1.0, 0.0, 0.33]
normalized = rollit.normalize(prices, window=3)

# Escape hatch for custom calculations (e.g. median)
# Runs a standard Python loop internally; slower than the C reductions
medians = rollit.apply(prices, window=3, fn=np.median)
```
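The two formulas can be evaluated directly in NumPy to check any result by hand. The sketch below is not `rollit`'s code: it assumes the Z-score uses the sample standard deviation (`ddof=1`, as documented for `rollit.std`), and the `ref_*` names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

prices = np.array([100.0, 102.0, 101.0, 105.0, 110.0, 95.0, 100.0])
windows = sliding_window_view(prices, 3)
last = windows[:, -1]  # the last element of every window

# Z-score: (x_last - mean) / std
mu = windows.mean(axis=1)
sigma = windows.std(axis=1, ddof=1)  # assumption: same ddof as rollit.std
ref_z = (last - mu) / sigma          # a constant window would divide by zero

# Min-max normalization: (x_last - min) / (max - min)
lo = windows.min(axis=1)
hi = windows.max(axis=1)
ref_norm = (last - lo) / (hi - lo)   # likewise undefined when max == min

# A custom callback applied window by window in a plain Python loop
ref_median = np.array([np.median(w) for w in windows])

print(np.round(ref_z, 2))
print(np.round(ref_norm, 2))
print(ref_median)  # [101. 102. 105. 105. 100.]
```

Both vectorized formulas break down on a constant window, where the standard deviation and the min-max range are zero; how `rollit` handles that case is not stated here.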
NaN Propagation & Memory Safety
NaN Handling (`min_periods`)
By default, if a window contains a `NaN` value, the resulting calculation is `NaN`. To calculate rolling statistics over windows containing missing data, specify `min_periods` representing the minimum number of valid (non-NaN) values required.
```python
import numpy as np
import rollit

# Array with a missing value
arr = np.array([1.0, np.nan, 3.0, 4.0, 5.0])

# Default min_periods=None requires all values in the window to be non-NaN
print(rollit.mean(arr, window=3))
# Output: [nan, nan, 4.0]

# Setting min_periods=2 allows calculations with at least 2 non-NaN values
print(rollit.mean(arr, window=3, min_periods=2))
# Output: [2.0, 3.5, 4.0]
# Window 1: [1.0, NaN, 3.0] -> 2 values -> (1+3)/2 = 2.0
# Window 2: [NaN, 3.0, 4.0] -> 2 values -> (3+4)/2 = 3.5
```
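The `min_periods` rule can be sketched as "count the valid values in each window, then average only those". The function below is a hypothetical reference written in plain NumPy, not `rollit`'s implementation; treating `None` as the full window size is taken from the behavior described above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_mean_min_periods(arr, window, min_periods=None):
    """Hypothetical reference for a NaN-aware rolling mean."""
    if min_periods is None:
        min_periods = window  # every value in the window must be valid

    windows = sliding_window_view(np.asarray(arr, dtype=float), window)
    valid = ~np.isnan(windows)
    counts = valid.sum(axis=1)                        # valid values per window
    sums = np.where(valid, windows, 0.0).sum(axis=1)  # NaNs contribute 0

    out = np.full(counts.shape, np.nan)
    ok = (counts >= min_periods) & (counts > 0)       # never divide by zero
    out[ok] = sums[ok] / counts[ok]
    return out


arr = np.array([1.0, np.nan, 3.0, 4.0, 5.0])
print(rolling_mean_min_periods(arr, 3))                 # [nan nan  4.]
print(rolling_mean_min_periods(arr, 3, min_periods=2))  # [2.  3.5 4. ]
```

Windows that fall short of `min_periods` stay `NaN`, so the output length is still `N - W + 1` regardless of how many values are missing.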
Safe Memory Handling
Using NumPy strides (`as_strided`) is highly performant but can cause segmentation faults and crash the Python interpreter if indexing boundaries are violated. `rollit` incorporates safety measures:
- Write-Lock Flags: Returned arrays are explicitly marked as read-only. Writing to them raises a `ValueError`, protecting the memory of the original array.
- Dimension Validation: The library validates that the window size is a positive integer and does not exceed the input array size.
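Both safeguards can be illustrated with NumPy alone. The sketch below uses `sliding_window_view`, which returns a read-only view by default, in place of a raw `as_strided` call; `checked_windows` is a hypothetical helper, and `rollit`'s actual checks and error messages may differ.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def checked_windows(arr, window):
    """Hypothetical helper: validate inputs, then return a read-only windowed view."""
    arr = np.asarray(arr)
    if arr.ndim != 1:
        raise ValueError("arr must be one-dimensional")
    is_int = isinstance(window, (int, np.integer)) and not isinstance(window, bool)
    if not is_int or window < 1:
        raise ValueError("window must be a positive integer")
    if window > arr.size:
        raise ValueError("window must not exceed the array size")
    return sliding_window_view(arr, window)  # writeable=False by default


data = np.arange(5.0)
view = checked_windows(data, 3)
print(view.flags.writeable)  # False

try:
    view[0, 0] = 99.0  # the view shares memory with `data`
except ValueError as exc:
    print("write rejected:", exc)
```

Because every window is a view onto the same buffer, one write through a writable view would change several windows and the source array at once, which is the failure mode the read-only flag prevents.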
Parameter Guide
All library operations expose a consistent parameter contract:
| Parameter | Type | Description |
|---|---|---|
| arr | np.ndarray | Input 1D numeric NumPy array. Must contain floating-point or integer values. |
| window | int | Size of the moving window. Must be a positive integer no larger than the array size. |
| min_periods | int or None | Minimum number of valid (non-NaN) observations required in the window. Defaults to `None`, which is treated as the window size, so any window containing a NaN yields NaN. |