Search Anomaly Index (SAI)
How we identify unusual search patterns by measuring deviations from an operator's typical behavior
Overview
A Search Anomaly Index (SAI) is shown with every search result. This composite index tells you how “anomalous” a search is, meaning how much it deviates from the operator’s typical patterns. We analyze each operator’s typical work hours, search volumes, and search scopes, then score every search by how far it deviates from that baseline.
| Score Range | Classification | Meaning | Rationale |
|---|---|---|---|
| < 1.75 | Normal | In-shift; typical patterns | Minimal deviation |
| 1.75 – 5.0 | Elevated | In-shift; unusual burst | Z-score deviation |
| 5.0 – 8.0 | Anomaly | Out-of-shift; moderate | 5× temporal penalty |
| ≥ 8.0 | Critical | Out-of-shift; irregular | High combined risk |
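As a minimal sketch, the table above can be read as a simple threshold lookup. The function name `classify_sai` is hypothetical, and the sketch assumes each range includes its lower bound (so a score of exactly 5.0 counts as Anomaly), which the table does not state explicitly.

```python
def classify_sai(score: float) -> str:
    """Map an SAI score to its classification label.

    Assumption: lower bounds are inclusive, upper bounds exclusive.
    """
    if score < 1.75:
        return "Normal"
    if score < 5.0:
        return "Elevated"
    if score < 8.0:
        return "Anomaly"
    return "Critical"


if __name__ == "__main__":
    for s in (0.4, 1.75, 5.0, 9.2):
        print(s, classify_sai(s))
```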
Example
Here’s a “normal” search: an operator who is typically most active from 9am to 9pm performs a search at 3pm. They typically conduct about 5 warrantless searches per hour around 3pm; in the hour around this search, they conducted 5.

Compare that to this high-SAI search from the same operator, conducted at 2am. Their normal pattern shows virtually no activity at this time, averaging 0.1 searches between 2am and 3am.

The SAI indicates that it is statistically less likely that this search falls within the scope of the operator’s regular duties.
Primary Active Window
The first component is temporal. The system calculates the operator’s Primary Active Window (PAW): the shortest continuous block of hours (e.g., 10 or 12 hours) that accounts for 85% of the operator’s total historical search activity. The window is dynamic and statistically determined, so it differs from operator to operator. Searches outside this window are treated as out-of-shift and receive a penalty.
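One way to find such a window is a brute-force scan over every start hour and duration. The sketch below is illustrative only: the function name `primary_active_window` is hypothetical, and it assumes the window may wrap past midnight and that ties go to the earliest start hour, neither of which is specified here.

```python
from typing import List, Optional, Tuple


def primary_active_window(
    hourly_counts: List[float], coverage: float = 0.85
) -> Optional[Tuple[int, int]]:
    """Return (start_hour, duration_hours) of the shortest continuous
    window holding at least `coverage` of all activity.

    Assumption: the day is treated as circular, so a window can wrap
    around midnight (e.g. 22:00 to 02:00).
    """
    if len(hourly_counts) != 24:
        raise ValueError("expected 24 hourly counts")
    total = sum(hourly_counts)
    if total <= 0:
        return None  # no history, so no window can be determined
    target = coverage * total
    for duration in range(1, 25):
        best_start, best_sum = 0, -1.0
        for start in range(24):
            window_sum = sum(
                hourly_counts[(start + i) % 24] for i in range(duration)
            )
            if window_sum > best_sum:
                best_start, best_sum = start, window_sum
        if best_sum >= target:
            return best_start, duration
    return 0, 24


if __name__ == "__main__":
    counts = [0] * 24
    for hour in range(9, 21):  # activity from 9am up to 9pm
        counts[hour] = 10
    print(primary_active_window(counts))
```

With perfectly even activity from 9am to 9pm, the shortest window holding 85% is 11 hours rather than the full 12, because 11 of the 12 active hours already cover about 92% of searches.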
Intensity Deviation
The second component measures how far the current search’s characteristics deviate from the operator’s established average behavior. We use the standard Z-score to calculate this distance: Z = (x − μ) / σ, where x is the current value and μ and σ are the baseline mean and standard deviation. A Z-score therefore expresses deviation in units of standard deviations (σ).
Volume Contribution (Bursts of Activity)
Measures whether the operator is searching at an unusually high or low rate during the hour in which the search occurred.
- Current Value: Total searches by this operator in the current 60-minute window.
- Baseline: The operator’s historical average search volume for that specific hour of day and its variability. This hour-specific baseline is more accurate than an overall average because operators often have predictable daily patterns.
- Weight: 0.5
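The volume contribution can be sketched as a weighted Z-score against the hour-specific baseline. The function names are hypothetical, and taking the absolute value (so that unusually low rates also count) is an assumption based on the "high or low" wording above, not a confirmed detail.

```python
VOLUME_WEIGHT = 0.5  # weight stated for the volume contribution


def z_score(current: float, mean: float, std: float) -> float:
    """Standard Z-score: deviation from the mean in standard deviations."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    return (current - mean) / std


def volume_contribution(
    current_count: float, hour_mean: float, hour_std: float
) -> float:
    """Weighted deviation of this hour's search count from its baseline.

    Assumption: the magnitude of the Z-score is used, so both bursts
    and unusually quiet hours contribute.
    """
    return VOLUME_WEIGHT * abs(z_score(current_count, hour_mean, hour_std))


if __name__ == "__main__":
    # 9 searches in an hour that normally sees 5 (std 2): Z = 2.0
    print(volume_contribution(9, 5.0, 2.0))
```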
Complexity Contribution (Scope of Search)
Measures whether the search scope (the data accessed) is unusual for the operator.
- Current Value: Total number of devices searched.
- Baseline: The operator’s historical average device search count (excluding zero-device records for accuracy) and its variability.
- Weight: 0.25
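A rough sketch of the complexity baseline and its weighted contribution follows. The function names are hypothetical, and using the sample standard deviation (`statistics.stdev`) and the absolute Z-score are both assumptions; the text does not say which variants are used.

```python
from statistics import mean, stdev
from typing import List, Optional, Tuple

COMPLEXITY_WEIGHT = 0.25  # weight stated for the complexity contribution


def complexity_baseline(
    device_counts: List[int],
) -> Optional[Tuple[float, float]]:
    """Mean and standard deviation of historical device counts,
    excluding zero-device records as described above.

    Assumption: sample standard deviation; returns None when fewer
    than two non-zero records exist.
    """
    nonzero = [c for c in device_counts if c > 0]
    if len(nonzero) < 2:
        return None
    return mean(nonzero), stdev(nonzero)


def complexity_contribution(
    devices_searched: int, base_mean: float, base_std: float
) -> float:
    """Weighted magnitude of the device-count Z-score (assumed form)."""
    if base_std <= 0:
        raise ValueError("standard deviation must be positive")
    return COMPLEXITY_WEIGHT * abs((devices_searched - base_mean) / base_std)


if __name__ == "__main__":
    history = [0, 2, 4, 0, 6]  # the two zero-device records are ignored
    baseline = complexity_baseline(history)
    print(baseline)
    print(complexity_contribution(10, *baseline))
```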
Special Cases
- Zero-Baseline Override: If an operator has no historical activity for a particular hour and then searches during it, the system assigns a fixed high Z-score of 10.0. First-time activity during an unusual hour is inherently anomalous.
- Low-Volume Gate: The system suppresses Z-score calculations when the current hourly count is 5 or fewer, preventing false positives from sparse baselines.
- Org-Level Median Fallback: When an operator has incomplete baseline data (e.g., no historical device search statistics), the system falls back to organization-level median values to prevent extreme scores for new operators or those with sparse data.
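The three special cases can be sketched as guards placed around the Z-score calculation. Several details here are assumptions rather than documented behavior: the order in which the guards are checked (zero-baseline first), treating a suppressed Z-score as 0.0, and the handling of a zero standard deviation. All function and constant names are hypothetical.

```python
from typing import Optional, Tuple

ZERO_BASELINE_Z = 10.0    # fixed Z-score for first-time activity in an hour
LOW_VOLUME_THRESHOLD = 5  # hourly counts at or below this are gated


def guarded_volume_z(
    current_count: int, hour_mean: Optional[float], hour_std: Optional[float]
) -> float:
    """Volume Z-score with the zero-baseline override and low-volume gate.

    Assumptions: the override is checked before the gate, and a
    suppressed calculation yields a Z-score of 0.0.
    """
    if current_count > 0 and not hour_mean:
        return ZERO_BASELINE_Z  # no history for this hour
    if current_count <= LOW_VOLUME_THRESHOLD:
        return 0.0  # low-volume gate suppresses the calculation
    if not hour_std:
        return 0.0  # assumption: no variability recorded, nothing to compare
    return (current_count - hour_mean) / hour_std


def resolve_complexity_baseline(
    operator_baseline: Optional[Tuple[float, float]],
    org_median_baseline: Tuple[float, float],
) -> Tuple[float, float]:
    """Fall back to organization-level medians when the operator has
    no device search statistics."""
    if operator_baseline is None:
        return org_median_baseline
    return operator_baseline


if __name__ == "__main__":
    print(guarded_volume_z(1, 0.0, 0.0))   # override applies
    print(guarded_volume_z(3, 4.0, 1.0))   # gated
    print(guarded_volume_z(12, 6.0, 2.0))  # ordinary Z-score
```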
Final Score Calculation
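The SAI score combines the weighted Volume Contribution, the weighted Complexity Contribution, and a Context Multiplier.
The Context Multiplier is 5.0 for out-of-shift searches (those outside the operator’s PAW) and 1.0 otherwise.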
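The exact formula is not reproduced in this section, so the following is only one possible reading, offered as a sketch. It assumes the weighted Z-score magnitudes are added to a base term of 1.0 and the sum is then scaled by the Context Multiplier. The base term is hypothetical, chosen so that an out-of-shift search with no intensity deviation scores 5.0, the lower edge of the Anomaly range in the table. Only the weights (0.5, 0.25) and multiplier values (5.0, 1.0) come from the text.

```python
VOLUME_WEIGHT = 0.5
COMPLEXITY_WEIGHT = 0.25
OUT_OF_SHIFT_MULTIPLIER = 5.0
IN_SHIFT_MULTIPLIER = 1.0
BASE_TERM = 1.0  # hypothetical; not stated in the source text


def sai_score(volume_z: float, complexity_z: float, in_shift: bool) -> float:
    """Illustrative SAI combination under the assumptions listed above."""
    multiplier = IN_SHIFT_MULTIPLIER if in_shift else OUT_OF_SHIFT_MULTIPLIER
    intensity = (
        VOLUME_WEIGHT * abs(volume_z) + COMPLEXITY_WEIGHT * abs(complexity_z)
    )
    return multiplier * (BASE_TERM + intensity)


if __name__ == "__main__":
    print(sai_score(0.0, 0.0, in_shift=True))    # typical in-shift search
    print(sai_score(2.0, 1.0, in_shift=True))    # in-shift burst
    print(sai_score(10.0, 0.0, in_shift=False))  # first-time 2am search
```

Under this reading, the multiplier dominates: any out-of-shift search starts at 5.0, and intensity deviations push it toward the Critical range.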
Baselines Stored Per Operator
- Temporal Baseline: The PAW start hour and dynamic duration (minimum hours needed to capture 85% of total activity).
- Hourly Volume Baseline: For each hour of the day (0–23), the mean and standard deviation of search counts.
- Complexity Baseline: The mean and standard deviation for device search counts (excluding zero-device records).
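The three stored baselines could be held in a single per-operator record, as in the sketch below. The class and field names are hypothetical and do not reflect the actual storage schema. The `is_in_shift` helper assumes the PAW may wrap past midnight.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class OperatorBaseline:
    """Hypothetical container for one operator's stored baselines."""

    paw_start_hour: int       # temporal baseline: PAW start (0-23)
    paw_duration_hours: int   # temporal baseline: dynamic PAW length
    hourly_volume: List[Tuple[float, float]]   # 24 (mean, std) pairs
    complexity: Optional[Tuple[float, float]]  # (mean, std), None if no data

    def is_in_shift(self, hour: int) -> bool:
        """True if `hour` falls inside the PAW (wrap-around allowed)."""
        return (hour - self.paw_start_hour) % 24 < self.paw_duration_hours


if __name__ == "__main__":
    baseline = OperatorBaseline(
        paw_start_hour=9,
        paw_duration_hours=12,
        hourly_volume=[(0.0, 0.0)] * 24,
        complexity=None,
    )
    print(baseline.is_in_shift(15))  # 3pm search
    print(baseline.is_in_shift(2))   # 2am search
```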
Caveats
SAI values may change as the algorithm is further tuned and refined. As always, check the underlying sources and do your own math before drawing any hard conclusions.