HTTPS Anomaly Detection Module
This document describes how the anomaly_detection_https module detects anomalies from TLS/HTTPS traffic in Slips.

Goal
Detect unusual HTTPS behavior per host, using:
Hourly behavior changes (volume and novelty patterns).
Flow-level deviations (for known servers).
Adaptive baselines that update over time, with poisoning resistance.
Input data used
The module subscribes to SSL/TLS events and reads the related connection metadata for the same UID from the database.
Main fields used:
SSL: uid, server_name (SNI), ja3, ja3s, dport, sport
Conn (correlated): destination IP, total bytes, timing info
Traffic-time logic
All detection windows are based on traffic timestamps (packet/log time), not wall clock time.
This keeps behavior consistent for:
live interface capture,
live Zeek folder input,
offline PCAP,
offline Zeek logs.
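A minimal sketch of traffic-time windowing, assuming flow timestamps are epoch seconds and that hour windows are aligned to epoch hours. Both are assumptions for illustration; the function names are hypothetical and not taken from the module.

```python
def hour_bucket(flow_ts: float) -> int:
    """Index of the hour window that contains a traffic timestamp (epoch seconds)."""
    return int(flow_ts // 3600)


def hour_closed(current_bucket: int, flow_ts: float) -> bool:
    """True when a flow's timestamp falls in a later hour than the open window.

    Only traffic timestamps are compared, never the wall clock, so replaying
    a PCAP gives the same windows as a live capture of the same traffic.
    """
    return hour_bucket(flow_ts) > current_bucket
```

Because the decision depends only on the timestamps carried by the flows, an offline run over old logs closes hours at the same points as a live run would have.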
Features
The module computes per-host hourly features:
ssl_flows: number of SSL flows in the hour.
unique_servers: number of distinct destination servers.
new_servers: number of servers not seen before for that host.
ja3_changes: number of new JA3 variants seen per server in the hour.
known_server_avg_bytes: mean bytes for flows to already-known servers.
Flow-level feature:
bytes_to_known_server: per-server bytes deviation on each flow.
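The hourly features can be sketched as a single pass over one host's flows for the closed hour. The TlsFlow type, the function name, and the rule that JA3 changes are counted only for already-known servers are assumptions made for this illustration, not details confirmed by the module.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TlsFlow:
    server: str       # SNI, or destination IP when SNI is absent (assumption)
    ja3: str
    total_bytes: int  # taken from the correlated conn record


def hourly_features(flows, known_servers, known_ja3):
    """Compute the per-host hourly features for one closed hour.

    known_servers: set of servers seen by this host before this hour.
    known_ja3: dict mapping server -> set of JA3 values seen before this hour.
    """
    servers = {f.server for f in flows}
    seen_this_hour = set()
    ja3_changes = 0
    for f in flows:
        pair = (f.server, f.ja3)
        # Assumption: only already-known servers contribute JA3 changes,
        # and each new (server, JA3) pair is counted once per hour.
        if (f.server in known_servers
                and f.ja3 not in known_ja3.get(f.server, set())
                and pair not in seen_this_hour):
            ja3_changes += 1
        seen_this_hour.add(pair)
    known_bytes = [f.total_bytes for f in flows if f.server in known_servers]
    return {
        "ssl_flows": len(flows),
        "unique_servers": len(servers),
        "new_servers": len(servers - known_servers),
        "ja3_changes": ja3_changes,
        "known_server_avg_bytes": mean(known_bytes) if known_bytes else 0.0,
    }
```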
Baseline and training
Each host has independent models.
Training phase (training_hours > 0)
For the first training_hours hours, which are treated as benign, the module only fits the baseline (Welford online moments by default):
no detection decisions are emitted from hourly z-score rules before training ends,
baseline mean/variance are learned strongly from this period.
Training fit technique is selected by training_fit_method:
training_fit_method = welford -> Welford benign fit.
training_fit_method = ewma -> EWMA-style training adaptation.
When training_fit_method = ewma, training_alpha controls the fit strength:
higher training_alpha = faster adaptation,
lower training_alpha = slower adaptation.
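To make the two fit methods concrete, here is a minimal sketch of a Welford online fit and an EWMA-style mean/variance step. The class and function names are hypothetical, and the exponentially weighted variance recursion is one common formulation chosen for illustration; the module's exact update may differ.

```python
class WelfordFit:
    """Online mean/variance over all training values, equally weighted."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; 0.0 until at least two values were seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


def ewma_step(mean: float, var: float, x: float, alpha: float):
    """One EWMA update; higher alpha weights the newest value more."""
    delta = x - mean
    new_mean = mean + alpha * delta
    new_var = (1.0 - alpha) * (var + alpha * delta * delta)
    return new_mean, new_var
```

With Welford, every training hour has the same weight, so there is no alpha to tune. With EWMA, an alpha of 1.0 makes the baseline jump to the latest value, while a small alpha keeps most of the previous estimate.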
No explicit training (training_hours = 0)
Detection starts immediately using online adaptation.
Special fallback only for ja3_changes:
if hourly ja3_changes < ja3_min_variants_per_server, that hourly signal is ignored until enough activity exists.
Scoring
Each modeled feature uses robust scoring in three explicit steps:
Transform heavy-tail signals:
y = log(1 + x) (log1p) for non-negative count/bytes features.
Estimate robust center/scale on recent transformed values:
m = median(y)
MAD = median(|y - m|)
sigma_robust = max(1.4826 * MAD, min_std_floor)
Score deviation:
z_robust = |y_t - m| / sigma_robust
Why this is used:
HTTPS counts and byte volumes are typically right-skewed and heavy-tailed,
mean/std-only scoring overreacts to bursts and underreacts after outliers,
log1p + median/MAD is more stable under non-Gaussian traffic.
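The three scoring steps can be sketched directly from the formulas above. The function name and the default value of min_std_floor are assumptions for illustration only.

```python
import math
from statistics import median


def robust_zscore(history, x_t, min_std_floor=1e-3):
    """Robust z-score of x_t against recent raw values in `history`.

    history: non-negative counts or byte volumes from recent windows.
    min_std_floor: lower bound on the scale (default here is hypothetical).
    """
    # Step 1: compress the heavy right tail.
    ys = [math.log1p(v) for v in history]
    y_t = math.log1p(x_t)
    # Step 2: robust center and scale on the transformed values.
    m = median(ys)
    mad = median([abs(y - m) for y in ys])
    sigma_robust = max(1.4826 * mad, min_std_floor)
    # Step 3: deviation of the current value in robust units.
    return abs(y_t - m) / sigma_robust
```

The floor matters when recent values are nearly identical: the MAD is then close to zero, and without a floor any small change would produce a very large score.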
Thresholds:
empirical thresholds calibrated from benign training when training_hours > 0,
otherwise defaults (hourly_zscore_threshold, flow_zscore_threshold).
Calibration rule:
per signal, collect robust z-scores on confirmed benign training data,
set the threshold to a high benign quantile (empirical_threshold_quantile, default 0.995),
fall back to defaults if training data is insufficient.
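A minimal sketch of the calibration rule, assuming a nearest-rank quantile and a hypothetical min_points cutoff for what counts as insufficient training data. Neither detail is stated in this document, so treat both as illustrative.

```python
import math


def calibrate_threshold(benign_z, default_threshold, quantile=0.995, min_points=30):
    """Pick a per-signal threshold from robust z-scores seen in benign training.

    benign_z: robust z-scores collected for one signal during training.
    min_points: hypothetical minimum sample size before calibration is trusted.
    """
    if len(benign_z) < min_points:
        # Not enough benign evidence: keep the configured default.
        return default_threshold
    ordered = sorted(benign_z)
    # Nearest-rank quantile: smallest value with at least `quantile` of the data at or below it.
    rank = math.ceil(quantile * len(ordered))
    index = min(len(ordered) - 1, max(0, rank - 1))
    return ordered[index]
```

Calibrating per signal lets a naturally noisy feature get a higher threshold than a stable one, instead of sharing a single default.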
Adaptation states
After each hour closes, the module chooses a model update mode.
Update event semantics:
training_fit: initial benign baseline fit while trained_hours < training_hours; uses the training fit method (Welford-style), not an EWMA alpha.
baseline_update: normal post-training adaptation; uses EWMA with baseline_alpha. In ADWIN mode, this is used when ADWIN does not signal drift.
drift_update: post-training drift adaptation; uses EWMA with drift_alpha. In ADWIN mode, this is used only after an ADWIN drift signal and a small/drift-like classification.
suspicious_update: post-training conservative adaptation; uses EWMA with suspicious_alpha. In ADWIN mode, this is used only after an ADWIN drift signal and a suspicious classification.
When use_adwin_drift=false:
training_fit: during benign training, Welford fit (no EWMA alpha).
drift_update: if the anomaly score is small (hourly_score <= adaptation_score_threshold) and the flow anomaly count is small (<= max_small_flow_anomalies), update with drift_alpha.
suspicious_update: otherwise update with suspicious_alpha (much smaller), to limit poisoning.
For normal non-anomalous periods outside training, per-feature EWMA uses baseline_alpha.
ADWIN drift trigger (use_adwin_drift=true)
If use_adwin_drift=true and river is installed, ADWIN is the only drift trigger in both paths:
Hourly path: ADWIN receives each raw hourly feature stream.
Flow path: ADWIN receives each raw per-flow signal stream.
ADWIN drift detected -> classify as drift_update or suspicious_update using existing thresholds.
No ADWIN drift -> use baseline_update (baseline_alpha), even if anomalies exist.
During benign training, ADWIN is still warmed with benign scores to reduce cold-start noise after training.
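The update-mode choice for both settings of use_adwin_drift can be sketched as one decision function. It assumes that the "existing thresholds" are adaptation_score_threshold and max_small_flow_anomalies, and it takes the ADWIN result as a plain boolean (with River this would typically be the detector's drift_detected flag after an update). The function name is hypothetical.

```python
def choose_update_mode(
    trained_hours: int,
    training_hours: int,
    hourly_score: float,
    flow_anomalies: int,
    adaptation_score_threshold: float,
    max_small_flow_anomalies: int,
    use_adwin_drift: bool = True,
    drift_detected: bool = False,
) -> str:
    """Return the update event name for a closed hour."""
    if trained_hours < training_hours:
        return "training_fit"
    if use_adwin_drift and not drift_detected:
        # ADWIN is the only drift trigger: no drift means a normal baseline update.
        return "baseline_update"
    small = (hourly_score <= adaptation_score_threshold
             and flow_anomalies <= max_small_flow_anomalies)
    # Small deviations are treated as benign drift; anything larger is adapted
    # to very slowly so an attacker cannot easily poison the baseline.
    return "drift_update" if small else "suspicious_update"
```

Each returned name maps to one learning rate: baseline_alpha, drift_alpha, or suspicious_alpha, with training_fit using the training fit method instead.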
Why raw signals:
drift is a distribution change in the observed variables, so ADWIN tracks the raw feature streams directly,
z-scores are still used for anomaly magnitude and evidence reasons, but not as the primary drift input.
Performance note:
hourly ADWIN cost scales with hourly feature count,
flow ADWIN cost scales with per-flow signal count,
both are constant-time scalar updates and usually lightweight.
Current tuned defaults for faster ADWIN reaction:
adwin_delta: 0.01
adwin_clock: 1
adwin_grace_period: 5
adwin_min_window_length: 5
New server vs JA3 behavior
new_servers is modeled as an hourly statistical feature and adapted over time.
new_server can also appear as a direct flow-level novelty reason.
ja3_changes is handled statistically at hourly level (with the fallback gate only when training is zero).
new_ja3s can appear as a direct flow-level novelty reason.
Confidence and threat level
Each detection computes a confidence score in [0, 1] from multiple factors:
anomaly severity,
persistence in recent history,
baseline quality,
multi-signal agreement.
Mapped levels:
low / medium / high confidence
Threat level used in evidence:
low for low or medium confidence
medium for high confidence
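The mapping from confidence to threat level can be sketched as a lookup table. The level-to-threat mapping follows the text above; the numeric cut points that turn a score into low, medium, or high are hypothetical, since this document does not state them.

```python
# Mapping stated in this document: threat level never exceeds "medium".
THREAT_LEVEL_BY_CONFIDENCE = {"low": "low", "medium": "low", "high": "medium"}


def confidence_level(score: float, medium_cut: float = 0.4, high_cut: float = 0.7) -> str:
    """Bucket a confidence score in [0, 1]; the cut points are hypothetical."""
    if score >= high_cut:
        return "high"
    if score >= medium_cut:
        return "medium"
    return "low"


def threat_level(score: float) -> str:
    """Threat level reported in evidence for a given confidence score."""
    return THREAT_LEVEL_BY_CONFIDENCE[confidence_level(score)]
```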
Evidence format
Evidence description is human-readable and concise:
HTTPS anomaly: type=<type>; confidence=<level> (<score>); reason=<reason>; value=<value>; why=<explanation>.
Examples of reasons:
New Server
New JA3S
Bytes to Known Server
Hourly feature deviations (e.g., New Servers Count, JA3 Changes)
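As an illustration of the evidence format, the helper below fills in the template shown above. The function name, the two-decimal score formatting, and the example field values are assumptions, not output copied from the module.

```python
def format_evidence(anomaly_type, level, score, reason, value, why):
    """Build the human-readable description following the documented template."""
    return (
        f"HTTPS anomaly: type={anomaly_type}; "
        f"confidence={level} ({score:.2f}); "  # two decimals is an assumption
        f"reason={reason}; value={value}; why={why}."
    )


# Hypothetical example values for a flow-level novelty detection.
description = format_evidence(
    "flow", "high", 0.9, "New Server", "example.com",
    "server not seen before for this host",
)
```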
Configuration keys
Section: anomaly_detection_https in config/slips.yaml.
Main keys:
training_hours
training_fit_method
training_alpha
hourly_zscore_threshold
flow_zscore_threshold
adaptation_score_threshold
baseline_alpha
drift_alpha
suspicious_alpha
min_baseline_points
max_small_flow_anomalies
ja3_min_variants_per_server
use_adwin_drift
adwin_delta
adwin_clock
adwin_grace_period
adwin_min_window_length
empirical_threshold_quantile
log_verbosity
Defaults (from parser/config):
training_alpha: 1.0
training_fit_method: welford
use_adwin_drift: true
adwin_delta: 0.01
adwin_clock: 1
adwin_grace_period: 5
adwin_min_window_length: 5
References:
River ADWIN: https://riverml.xyz/latest/api/drift/ADWIN/
Data transformations for skew/heavy tails: https://otexts.com/fpp3/transformations.html
Robust scale (MAD): https://en.wikipedia.org/wiki/Median_absolute_deviation
Operational logs
The module logs key events such as:
flow arrivals,
hour close and computed features,
training fit updates,
drift updates,
suspicious updates,
detections and emitted evidence.