Time-series anomaly detection
Background
Detecting anomalies (spikes, dropouts, sensor faults) in a time series is a core monitoring task. A naive z-score using the mean and standard deviation is itself wrecked by the very outliers you want to find — a single huge spike inflates the std and hides itself. The robust fix is the modified z-score (Iglewicz & Hoaglin), built on the median and the median absolute deviation (MAD), both of which barely move when a few points go haywire.
Problem statement
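Implement detect_anomalies(x, threshold=3.5), returning the indices of points whose modified z-score exceeds the threshold in magnitude:
$$M_i = \frac{0.6745\,(x_i - \tilde{x})}{\mathrm{MAD}}, \qquad \mathrm{MAD} = \operatorname{median}_i\,|x_i - \tilde{x}|,$$
where $\tilde{x}$ is the median of $x$. Flag index $i$ when $|M_i| > \text{threshold}$.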
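If MAD = 0, fall back to the mean absolute deviation, conventionally $M_i = (x_i - \tilde{x}) / (1.253314 \cdot \mathrm{MeanAD})$ with $\mathrm{MeanAD} = \frac{1}{n}\sum_i |x_i - \tilde{x}|$ (absolute deviations taken about the median $\tilde{x}$). If that is also 0 (all values identical), report no anomalies.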
Input
- x: 1-D sequence (np.ndarray or list) of values.
- threshold: float, the modified-z cutoff (default 3.5, the conventional value).
Output
An np.ndarray of integer indices (ascending) flagged as anomalies; empty if none.
Examples
Example 1
Input: x = [1, 2, 1, 2, 1, 100, 2, 1], threshold = 3.5
Output: [5]
Explanation: the median is 1.5 and MAD is 0.5. The spike at index 5 has modified z-score $0.6745 \times (100 - 1.5) / 0.5 \approx 132.9$, while every other point scores under 1 in magnitude.
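The arithmetic in Example 1 can be checked step by step. The snippet below is a small sketch that assumes the standard Iglewicz & Hoaglin scaling constant 0.6745; the variable names are illustrative only.

```python
import numpy as np

x = np.array([1, 2, 1, 2, 1, 100, 2, 1], dtype=float)

med = np.median(x)                  # 1.5 (average of the two middle values 1 and 2)
mad = np.median(np.abs(x - med))    # 0.5 (seven deviations of 0.5, one of 98.5)

# Modified z-score, assuming the usual 0.6745 constant.
scores = 0.6745 * (x - med) / mad

print(med, mad)
print(np.round(scores, 4))          # index 5 is about 132.8765; the rest are +/-0.6745
```

Only index 5 clears the 3.5 cutoff, which matches the expected output [5].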
Constraints
- Use the median and MAD, not the mean and std — the whole point is robustness to the outliers.
- The constant 0.6745 rescales the MAD so that MAD / 0.6745 approximates the standard deviation for normally distributed data.
- Handle MAD == 0 with the mean-absolute-deviation fallback; if all values are equal, return an empty array.
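One way the specification and constraints above could be satisfied is sketched below. It is not the reference solution: the 0.6745 constant and the fallback scaling of 1.253314 applied to the mean absolute deviation about the median follow the common Iglewicz & Hoaglin convention and are assumptions here, as is the early return for empty input.

```python
import numpy as np

def detect_anomalies(x, threshold=3.5):
    """Return ascending indices whose modified z-score exceeds `threshold` in magnitude."""
    x = np.asarray(x, dtype=float)
    if x.size == 0:
        # Assumption: an empty series has no anomalies.
        return np.array([], dtype=int)

    med = np.median(x)
    abs_dev = np.abs(x - med)
    mad = np.median(abs_dev)

    if mad > 0:
        # Primary path: median / MAD based score.
        scores = 0.6745 * (x - med) / mad
    else:
        # Fallback (assumed convention): mean absolute deviation about the median.
        mean_ad = abs_dev.mean()
        if mean_ad == 0:
            # All values identical: nothing to flag.
            return np.array([], dtype=int)
        scores = (x - med) / (1.253314 * mean_ad)

    # flatnonzero returns integer indices in ascending order.
    return np.flatnonzero(np.abs(scores) > threshold)
```

For example, detect_anomalies([1, 2, 1, 2, 1, 100, 2, 1]) flags index 5, while a constant series such as [3, 3, 3, 3] yields an empty array. A series like [1, 1, 1, 1, 1, 1, 1, 50] has MAD = 0 and goes through the fallback branch.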
Notes
- A threshold of 3.5 flags points beyond ~3.5 robust standard deviations — a common default from the original paper.
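- The mean/std z-score suffers from masking: with $n$ points, one outlier can never exceed $(n-1)/\sqrt{n}$ (using the sample standard deviation), so for small $n$ it may be impossible to flag at threshold 3; for $n \le 10$ the bound stays below 3.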
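The masking effect can be seen numerically. The sketch below assumes the sample standard deviation (ddof = 1), for which the largest attainable z-score among n points is (n - 1)/sqrt(n); the helper name max_outlier_z is hypothetical. It places one spike among n - 1 zeros, and the resulting z-score does not depend on how large the spike is.

```python
import math
import statistics

def max_outlier_z(n, spike):
    """Mean/std z-score of a single spike among n - 1 zeros (sample std)."""
    data = [0.0] * (n - 1) + [spike]
    mean = statistics.mean(data)
    std = statistics.stdev(data)  # sample standard deviation (ddof = 1)
    return (spike - mean) / std

for n in (5, 10, 11, 20):
    bound = (n - 1) / math.sqrt(n)
    # The spike's z-score sits at the bound however large the spike is.
    print(n, round(max_outlier_z(n, 1e6), 4), round(bound, 4))
```

With 10 points the score stays near 2.846 even for an enormous spike, so a cutoff of 3 never fires; the median/MAD score has no such ceiling.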
This problem ships 5 hidden tests. They run in your browser via Pyodide, with no backend and no submission queue.
- Reference example flags the spike at index 5
- Clean data yields no anomalies
- Detects multiple outliers (nonzero MAD)
- All-identical series has no anomalies
- MAD == 0 falls back to mean absolute deviation