Network Event Cause & Risk Analysis Dataset for Slips IDS
1. Task Description
This report presents a dataset generation pipeline that creates training data for fine-tuning LLMs on network security risk assessment and root cause analysis. The pipeline processes security alerts generated by Slips and produces expert-level assessments that identify both the root cause of each detected threat and its potential risk to the network infrastructure. The enriched dataset serves as labeled training data for specialized security analysis models capable of automated threat assessment and incident prioritization. Unlike the parallel summarization workflow, which converts technical evidence into human-readable behavioral summaries, the cause and risk workflow targets the analytical reasoning required for security decision-making and threat evaluation.
The workflow used by this task focuses on structured security analysis rather than event summarization, providing:
Cause Analysis - Categorized incident attribution (Malicious Activity / Legitimate Activity / Misconfigurations)
Risk Assessment - Structured evaluation (Risk Level / Business Impact / Investigation Priority)
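The two output dimensions above can be pictured as simple typed records. The following is only an illustrative sketch: the class and field names (`CauseCategory`, `CauseHypothesis`, `RiskAssessment`) are hypothetical and are not taken from the pipeline, which stores both analyses as Markdown-formatted strings.

```python
from dataclasses import dataclass
from enum import Enum


class CauseCategory(Enum):
    """The three attribution categories named in the task description."""
    MALICIOUS_ACTIVITY = "Malicious Activity"
    LEGITIMATE_ACTIVITY = "Legitimate Activity"
    MISCONFIGURATIONS = "Misconfigurations"


@dataclass
class CauseHypothesis:
    """One candidate explanation for an incident (hypothetical structure)."""
    category: CauseCategory
    explanation: str


@dataclass
class RiskAssessment:
    """The three risk fields named in the task description (hypothetical structure)."""
    risk_level: str
    business_impact: str
    investigation_priority: str


# Example values paraphrased from the sample output shown later in this report.
hypothesis = CauseHypothesis(
    category=CauseCategory.MALICIOUS_ACTIVITY,
    explanation="Port scanning indicates reconnaissance",
)
risk = RiskAssessment(
    risk_level="High",
    business_impact="Potential data breach or service disruption",
    investigation_priority="Immediate",
)
```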
The same hardware constraints as in the summarization workflow apply (Raspberry Pi 5, 1.5B-3B parameter models).
The current version of the dataset for fine-tuning LLMs on cause and risk analysis can be found here
2. Relationship to Summarization Workflow
This workflow and the summarization workflow (see DATASET_REPORT.md) share identical Stages 1-2 (incident sampling and DAG generation) but diverge in their LLM analysis approach. The table below highlights the differences in scripts and goals.
| Aspect | Summarization Workflow | Risk Analysis Workflow |
|---|---|---|
| Analysis Script | | |
| Correlation Script | | |
| Output Fields | | |
| LLM Prompts | 2 per incident (event summarization + behavior patterns) | 2 per incident (cause attribution + risk scoring) |
| Primary Use Case | Incident timeline reconstruction, behavior pattern identification | Root cause analysis, threat prioritization, SOC decision support |
Recommendation: Generate both datasets from the same sampled incidents to enable comparative analysis and multi-task model training.
3. Dataset Generation Workflow
For a detailed implementation description, see README_dataset_risk_workflow.md.
Workflow Overview
Stages 1-2 (Sampling + DAG): See DATASET_REPORT.md §3 - identical to summarization workflow.
Quick commands:
```bash
# Stage 1: Sample 100 incidents
./sample_dataset.sh 100 my_dataset --seed 42

# Stage 2: Generate DAG analysis
./generate_dag_analysis.sh datasets/my_dataset.jsonl
```
Stage 3: Multi-Model Cause & Risk Analysis
Query LLMs with dual prompts for cause attribution and risk assessment:
```bash
# GPT-4o-mini (recommended baseline)
./generate_cause_risk_analysis.sh datasets/my_dataset.jsonl \
    --model gpt-4o-mini --group-events

# Qwen2.5:3b (target deployment model)
./generate_cause_risk_analysis.sh datasets/my_dataset.jsonl \
    --model qwen2.5:3b \
    --base-url http://10.147.20.102:11434/v1 --group-events
```
Output Structure (per incident):
```json
{
  "cause_analysis": "**Possible Causes:**\n\n**1. Malicious Activity:**\n• Port scanning indicates reconnaissance...\n\n**2. Legitimate Activity:**\n• Could be network monitoring tools...\n\n**3. Misconfigurations:**\n• Firewall allowing unrestricted scanning...\n\n**Conclusion:** Most likely malicious reconnaissance activity.",
  "risk_assessment": "**Risk Level:** High\n\n**Justification:** Active scanning + C2 connections...\n\n**Business Impact:** Potential data breach or service disruption...\n\n**Likelihood of Malicious Activity:** High - Systematic attack pattern...\n\n**Investigation Priority:** Immediate - Block source IP and investigate."
}
```
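Because `risk_assessment` is a Markdown string with bold field labels, downstream tooling may want the individual fields. The sketch below is one possible way to split it, assuming every field follows the `**Label:** value` pattern and that fields are separated by blank lines, as in the sample above. The function name `parse_risk_assessment` is hypothetical and is not part of the pipeline.

```python
import re

# Matches "**Label:** value", where the value runs until the next
# blank-line-separated bold label or the end of the string.
FIELD_PATTERN = re.compile(
    r"\*\*(?P<name>[^*:]+):\*\*\s*(?P<value>.*?)(?=\n\n\*\*|\Z)",
    re.DOTALL,
)


def parse_risk_assessment(text: str) -> dict[str, str]:
    """Split a Markdown-style risk assessment into a label -> value mapping."""
    return {
        match.group("name").strip(): match.group("value").strip()
        for match in FIELD_PATTERN.finditer(text)
    }


# Sample text copied from the output structure shown above.
sample = (
    "**Risk Level:** High\n\n"
    "**Justification:** Active scanning + C2 connections...\n\n"
    "**Business Impact:** Potential data breach or service disruption...\n\n"
    "**Likelihood of Malicious Activity:** High - Systematic attack pattern...\n\n"
    "**Investigation Priority:** Immediate - Block source IP and investigate."
)
fields = parse_risk_assessment(sample)
print(fields["Risk Level"])
```

Model output that deviates from this layout (missing labels, extra headings) would need more tolerant parsing than this sketch provides.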
Stage 4: Dataset Correlation
Merge all analyses (DAG + LLM cause/risk assessments) by incident ID:
```bash
python3 correlate_risks.py datasets/my_dataset.*.json \
    --jsonl datasets/my_dataset.jsonl \
    -o datasets/final_dataset_risk.json
```
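Conceptually, the correlation step is a join on the incident ID. The following sketch illustrates that idea only. It does not reproduce the logic of `correlate_risks.py`, and the function name `correlate` and the `assessments` key are assumptions made for illustration.

```python
def correlate(incidents, analyses_by_model):
    """Attach each model's cause/risk output to the incident with the same ID.

    incidents: list of dicts, each with an "incident_id" key.
    analyses_by_model: dict mapping a model name to a list of per-incident
        dicts with "incident_id", "cause_analysis" and "risk_assessment".
    """
    merged = {
        incident["incident_id"]: {**incident, "assessments": {}}
        for incident in incidents
    }
    for model, records in analyses_by_model.items():
        for record in records:
            target = merged.get(record["incident_id"])
            if target is None:
                # Analysis for an incident that was not sampled: skip it.
                continue
            target["assessments"][model] = {
                "cause_analysis": record.get("cause_analysis"),
                "risk_assessment": record.get("risk_assessment"),
            }
    return list(merged.values())


# Toy data; the IDs and texts are made up for the example.
incidents = [{"incident_id": "inc-1"}, {"incident_id": "inc-2"}]
analyses = {
    "gpt_4o_mini": [
        {"incident_id": "inc-1", "cause_analysis": "A", "risk_assessment": "B"},
        {"incident_id": "inc-9", "cause_analysis": "X", "risk_assessment": "Y"},
    ],
    "qwen2_5": [
        {"incident_id": "inc-1", "cause_analysis": "C", "risk_assessment": "D"},
    ],
}
final_dataset = correlate(incidents, analyses)
```

Keying on the incident ID keeps incidents that some models did not analyze in the output (with fewer entries under `assessments`), which makes coverage gaps visible instead of silently dropping records.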
Dataset Structure
Each incident in the final dataset includes:
1. Metadata
incident_id, category, source_ip, timewindow, timeline, threat_level, event_count
2. DAG Analysis
A chronological breakdown of events within the incident window.
Highlights significant behaviors (e.g., port scans, brute-force attempts).
Includes threat indicators and score implications.
3. Model-Specific Assessments
For each evaluated model (e.g., gpt_4o_mini, gpt_4o, qwen2_5, etc.):
Cause Analysis
A narrative explanation describing the likely cause of the incident, derived from the DAG analysis and event pattern correlations.
Risk Assessment
A structured evaluation detailing severity, potential impact, and overall risk level, including justification for the assigned rating.
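A quick completeness check over the final dataset can catch incidents that lack metadata or a model's assessment before they are used for fine-tuning. The sketch below assumes a hypothetical record layout: metadata fields at the top level, the DAG text under `dag_analysis`, and per-model results under `assessments`. The actual key names in `final_dataset_risk.json` may differ.

```python
# Metadata field names as listed in the dataset structure above.
REQUIRED_METADATA = (
    "incident_id", "category", "source_ip", "timewindow",
    "timeline", "threat_level", "event_count",
)
ASSESSMENT_FIELDS = ("cause_analysis", "risk_assessment")


def find_missing(record, models):
    """Return a list of missing fields for one incident record."""
    missing = [key for key in REQUIRED_METADATA if key not in record]
    if "dag_analysis" not in record:  # assumed key name
        missing.append("dag_analysis")
    assessments = record.get("assessments", {})  # assumed key name
    for model in models:
        for field in ASSESSMENT_FIELDS:
            if not assessments.get(model, {}).get(field):
                missing.append(f"{model}.{field}")
    return missing


# Toy record with made-up values; qwen2_5 has no risk assessment.
record = {
    "incident_id": "inc-1", "category": "example", "source_ip": "192.0.2.1",
    "timewindow": 1, "timeline": [], "threat_level": 0.5, "event_count": 3,
    "dag_analysis": "...",
    "assessments": {
        "gpt_4o_mini": {"cause_analysis": "A", "risk_assessment": "B"},
        "qwen2_5": {"cause_analysis": "C"},
    },
}
problems = find_missing(record, ["gpt_4o_mini", "qwen2_5"])
```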