Purple team coordination¶
Purple team is where Mr. Teatime and Angua sit in the same room. This is a monthly occurrence that both of them approach with something that might generously be described as mutual professional respect. Mr. Teatime brings techniques; Angua brings detection capability. Neither is trying to win. The goal is that by the end of each session, Golem Trust’s defences are better than they were at the start, which is a goal both parties can endorse without reservation.
Exercise structure¶
Purple team exercises run on the last Thursday of each month. The schedule is:
09:00 to 09:30: scenario briefing (Mr. Teatime presents the technique to be demonstrated)
09:30 to 11:30: two-hour attack window (Mr. Teatime executes, Angua defends)
11:30 to 12:30: one-hour debrief (both teams, plus Carrot)
12:30: results recorded in the detection coverage matrix
Unlike a real engagement, Mr. Teatime announces the specific MITRE ATT&CK technique at the start of the attack window. This is not how real attackers operate. The purpose here is learning: Angua should be able to see the technique in action and confirm whether her detection rules are working, then understand exactly what she missed if they are not.
Scenario selection¶
Scenario selection uses the previous month’s threat intelligence from MISP. At the start of each month, Dr. Crucible pulls the top five techniques observed in threat reports from the past 30 days:
# Query MISP for ATT&CK techniques from last 30 days
curl -s -X POST https://misp.golemtrust.am/attributes/restSearch \
-H "Authorization: MISP_API_KEY" \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-d '{
"returnFormat": "json",
"type": "mitre-attack-pattern",
"last": "30d"
}' | jq '.response.Attribute | group_by(.value) |
map({technique: .[0].value, count: length}) |
sort_by(.count) | reverse | .[0:5]'
Mr. Teatime selects from this list, favouring techniques that are not yet in the detection coverage matrix or that previously failed to generate an alert.
Caldera scenarios and MITRE ATT&CK mapping¶
Each purple team scenario is built as a Caldera adversary profile, mapped to a specific ATT&CK technique. The profiles live in the Caldera stockpile plugin data directory.
# Example adversary profile for T1059.003 (Command and Scripting: Windows Command Shell)
# /opt/caldera/data/adversaries/purple-t1059-003.yml
id: 8f5e9bba-2021-4a44-99c1-purple-t1059-003
name: Purple - T1059.003 Windows Command Shell
description: Purple team scenario for command execution via cmd.exe
atomic_ordering:
- 7b674f21-6b38-4b2b-9498-b6534cac09a7 # execute via cmd
- a6cb2fdc-3e1b-4e7a-a5f9-f7d5b9e1c003 # write output to file
The atomic_ordering field references ability IDs from the stockpile. Each ability corresponds to a specific action a real attacker might take.
Detection coverage matrix¶
The detection coverage matrix lives at security/red-team/detection-coverage.md. After each purple team exercise, the matrix is updated with four data points:
ATT&CK Technique |
Exercise date |
Detected |
Time to detect (mins) |
Detection source |
|---|---|---|---|---|
T1566.001 Spearphishing Attachment |
2025-11-28 |
Yes |
12 |
Graylog email alert |
T1059.001 PowerShell |
2025-11-28 |
Yes |
4 |
Falco rule proc.name=powershell |
T1021.002 SMB/Windows Admin Shares |
2025-12-19 |
Yes |
45 |
Zeek SMB anomaly |
T1048.003 Exfiltration over HTTPS |
2025-12-19 |
No |
N/A |
Missed |
T1078.002 Valid Accounts: Domain |
2026-01-30 |
Yes |
18 |
Wazuh authentication alert |
For techniques marked “No” in the Detected column, the debrief identifies why detection failed: missing rule, rule present but threshold too high, alert present but routed to wrong stream, or technique genuinely difficult to detect at this layer.
Updating detection rules after exercises¶
The debrief produces a list of detection gaps. Each gap becomes a task in the security team’s GitLab issue tracker, assigned to Angua with a two-week resolution target.
For a missed technique, the typical remediation workflow is:
Identify which log source would have contained evidence (Zeek, Wazuh, Falco, Keycloak, Teleport).
Confirm the log source is capturing the relevant event type.
Write or update a Graylog pipeline rule or alert condition.
Re-test: Mr. Teatime re-runs the specific technique in a focused 30-minute session.
Confirm detection fires. Record result in the coverage matrix.
Example Graylog pipeline rule added after the T1048.003 miss:
rule "Detect large HTTPS egress to new destination"
when
has_field("bytes_out") AND
to_long($message.bytes_out) > 10485760 AND
NOT cidr_match("10.0.0.0/8", to_ip($message.dst_ip)) AND
NOT has_field("known_egress_ip")
then
set_field("alert_type", "potential_data_exfiltration");
set_field("alert_severity", "high");
route_to_stream("SOC Alerts");
end
Tracking detection coverage improvement¶
The percentage of tested ATT&CK techniques with active detection is calculated monthly and reported to Carrot and Adora Belle.
# Calculate coverage percentage from the matrix CSV export
import csv
with open('detection-coverage.csv') as f:
rows = list(csv.DictReader(f))
# Use only the most recent result per technique
latest = {}
for row in rows:
technique = row['technique_id']
if technique not in latest or row['date'] > latest[technique]['date']:
latest[technique] = row
total = len(latest)
detected = sum(1 for r in latest.values() if r['detected'] == 'Yes')
coverage_pct = (detected / total) * 100 if total > 0 else 0
print(f"Techniques tested: {total}")
print(f"Techniques detected: {detected}")
print(f"Coverage: {coverage_pct:.1f}%")
Current coverage as of the last quarterly review: 71% of tested techniques have active detection with an alert that fires within 30 minutes. The target for the end of the year is 85%.
Debrief format¶
The debrief is a structured conversation, not a post-mortem. Both teams are expected to participate constructively. Mr. Teatime explains what he did, what he expected to happen, and what actually happened. Angua explains what she saw, what she missed, and what she would change about her detection rules.
Carrot chairs the debrief. He is good at this: he takes both sides seriously, does not assign blame, and makes sure action items are clearly owned and dated.
The debrief notes are recorded in the GitLab wiki at security/red-team/purple-team/YYYY-MM.md. They are accessible to both red and blue teams, and to Adora Belle.