Network Event Cause & Risk Analysis Dataset for Slips IDS

Table of Contents

1. Task Description
2. Relationship to Summarization Workflow
3. Dataset Generation Workflow
4. Use Cases and Applications

1. Task Description

The following report presents a dataset generation pipeline designed to create training data for fine-tuning LLMs on network security risk assessment and root cause analysis tasks. The pipeline processes security alerts generated by Slips, to generate expert-level assessments identifying both the root cause of detected threats and their potential risk to the network infrastructure. This enriched dataset serves as labeled training data for developing specialized security analysis models capable of automated threat assessment and incident prioritization. Unlike the parallel summarization workflow which focuses on converting technical evidence into human-readable behavioral summaries, the cause and risk workflow specifically targets the analytical reasoning required for security decision-making and threat evaluation.

The workflow used by this task focuses on structured security analysis rather than event summarization, providing:

Cause Analysis - Categorized incident attribution (Malicious Activity / Legitimate Activity / Misconfigurations)
Risk Assessment - Structured evaluation (Risk Level / Business Impact / Investigation Priority)

Same hardware constraints as summarization workflow (Raspberry Pi 5, 1.5B-3B parameter models) are considered.

The current version of the dataset for finetuning LLM for cause and risk analysis can be found here

2. Relationship to Summarization Workflow

Compared with the Summarization workflow DATASET_REPORT.md, both workflows share identical Stages 1-2 (incident sampling and DAG generation) but diverge in LLM analysis approach. The table below highgliths the differences in terms of scripts and goals

Aspect	Summarization Workflow	Risk Analysis Workflow
Analysis Script	`generate_llm_analysis.sh`	`generate_cause_risk_analysis.sh`
Correlation Script	`correlate_incidents.py`	`correlate_risks.py`
Output Fields	`summary` + `behavior_analysis`	`cause_analysis` + `risk_assessment`
LLM Prompts	2 per incident (event summarization + behavior patterns)	2 per incident (cause attribution + risk scoring)
Primary Use Case	Incident timeline reconstruction, behavior pattern identification	Root cause analysis, threat prioritization, SOC decision support

Recommendation: Generate both datasets from the same sampled incidents to enable comparative analysis and multi-task model training.

3. Dataset Generation Workflow

For a detailed implementation description see README_dataset_risk_workflow.md

Workflow Overview

Stages 1-2 (Sampling + DAG): See DATASET_REPORT.md §3 - identical to summarization workflow.

Quick commands:

# Stage 1: Sample 100 incidents
./sample_dataset.sh 100 my_dataset --seed 42

# Stage 2: Generate DAG analysis
./generate_dag_analysis.sh datasets/my_dataset.jsonl

Stage 3: Multi-Model Cause & Risk Analysis

Query LLMs with dual prompts for cause attribution and risk assessment:

# GPT-4o-mini (recommended baseline)
./generate_cause_risk_analysis.sh datasets/my_dataset.jsonl \
  --model gpt-4o-mini --group-events

# Qwen2.5:3b (target deployment model)
./generate_cause_risk_analysis.sh datasets/my_dataset.jsonl \
  --model qwen2.5:3b \
  --base-url http://10.147.20.102:11434/v1 --group-events

Output Structure (per incident):

{
  "cause_analysis": "**Possible Causes:**\n\n**1. Malicious Activity:**\n• Port scanning indicates reconnaissance...\n\n**2. Legitimate Activity:**\n• Could be network monitoring tools...\n\n**3. Misconfigurations:**\n• Firewall allowing unrestricted scanning...\n\n**Conclusion:** Most likely malicious reconnaissance activity.",

  "risk_assessment": "**Risk Level:** High\n\n**Justification:** Active scanning + C2 connections...\n\n**Business Impact:** Potential data breach or service disruption...\n\n**Likelihood of Malicious Activity:** High - Systematic attack pattern...\n\n**Investigation Priority:** Immediate - Block source IP and investigate."
}

Stage 4: Dataset Correlation

Merge all analyses (DAG + LLM cause/risk assessments) by incident ID:

python3 correlate_risks.py datasets/my_dataset.*.json \
  --jsonl datasets/my_dataset.jsonl \
  -o datasets/final_dataset_risk.json

Dataset Structure

Each incident in the final dataset includes:

1. Metadata

incident_id
category
source_ip
timewindow
timeline
threat_level
event_count

2. DAG Analysis

A chronological breakdown of events within the incident window.
Highlights significant behaviors (e.g., port scans, brute-force attempts).
Includes threat indicators and score implications.

3. Model-Specific Assessments

For each evaluated model (e.g., gpt_4o_mini, gpt_4o, qwen2_5, etc.):

Cause Analysis
A narrative explanation describing the likely cause of the incident, derived from the DAG analysis and event pattern correlations.
Risk Assessment
A structured evaluation detailing severity, potential impact, and overall risk level, including justification for the assigned rating.