Behavioural analytics and ML detection

Dr. Crucible has been experimenting. He’s applying machine learning to authentication logs, trying to detect anomalies that rule-based detection misses.

“Look at this,” he shows Angua. “User logs in at 9:00 from Ankh-Morpork. Then at 9:15 AM from Tsort. Physically impossible. But our rules only flag it if it’s the same session. Different sessions, we miss it.”

“Impossible travel,” Angua says. “Classic indicator of compromised credentials.”

“Right. But there are more subtle patterns. Users who normally work 9-5 suddenly logging in at 15:00. Access patterns changing. Unusual sequences of actions. Machine learning can detect these.”

What they built

Dr. Crucible implements User and Entity Behaviour Analytics (UEBA) as a Graylog pipeline. Custom Python processors analyse authentication events, access patterns, command histories.

Baseline calculation: 90 days of normal behaviour per user. Login times, access locations, systems accessed, command patterns, data volumes transferred.

Feature extraction for ML model:

  • Login time deviation from normal

  • Geographic anomalies

  • Access pattern changes

  • Privilege escalation attempts

  • Volume anomalies

Isolation Forest algorithm detects outliers. Anomaly scores 0-1. Scores above 0.7 trigger SOC alerts.

User context matters. Developer accessing production at 2 AM during incident response? Not anomalous. Same developer accessing production at 2 AM on Saturday with no incident? Very anomalous.

Entity behaviour extends beyond users: service accounts, API keys, applications. A service account suddenly accessing new databases? Alert.

False positive rate starts high (18%). Tuning over three months drops it to 3.2%. Angua provides feedback on every alert, training the model.

First major catch: compromised service account. Behaving normally for weeks (attackers were patient), then suddenly exfiltrating data at unusual volume. ML model scored it 0.89. Alert fired. Incident response: 15 minutes from detection to containment.

Runbooks

  • UEBA pipeline implementation

  • Baseline calculation

  • Feature engineering

  • Model training

  • Alert tuning

  • Integration with SOC workflows