Flow to Insight: AI-Driven Threat Detection with Generative Agents

Interpreting network anomalies with LLMs, suggesting remediation and generating investigation narratives — autonomously with AI.

14 min read · Apr 12, 2025
Illustration generated with OpenAI DALL·E via ChatGPT

In my previous post, From Tap to Lake: Scalable, Resilient, AI-Ready NetFlow Observability, I discussed building a robust NetFlow observability pipeline using nProbe, Kafka, and Bauplan. This time, we’ll explore how generative AI agents — backed by a resilient data lakehouse architecture and DPI-enriched NetFlow data — can autonomously interpret network anomalies, suggest remediations, and generate investigation narratives.

This post showcases two real-world applications:

  • 🗣️A Conversational AI Agent
  • 🧠 A Multi-Agent Cybersecurity System

These two AI applications are built upon the Pydantic AI agent framework, which supports integration with external tools via the Model Context Protocol (MCP), as well as advanced tracing with the Logfire companion service. In this architecture, MCP servers act as bridges between the AI agents and external data sources or services. Specifically, I developed and published an MCP server for Bauplan, enabling seamless interaction with the data store:

This server provides tools to list tables, retrieve schemas, and execute SQL queries on Bauplan’s Iceberg tables. By integrating this MCP server, the conversational agent can perform complex data retrieval and analysis tasks based on user prompts.
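To give a sense of what such a server exposes, here is a minimal sketch built on the MCP Python SDK’s FastMCP helper and the Bauplan Python client. It is illustrative rather than the published implementation, and the exact Bauplan client calls (get_tables, query) are assumptions to be checked against the SDK:

# Illustrative sketch of a Bauplan MCP server, not the published implementation.
from mcp.server.fastmcp import FastMCP
import bauplan

mcp = FastMCP("bauplan")
client = bauplan.Client()  # credentials come from the local Bauplan profile

@mcp.tool()
def list_tables(branch: str = "main") -> list[str]:
    """List the Iceberg tables available on a Bauplan branch."""
    return [t.name for t in client.get_tables(ref=branch)]

@mcp.tool()
def run_sql(sql: str, branch: str = "main") -> str:
    """Run a read-only SQL query against Bauplan and return the rows as text."""
    table = client.query(sql, ref=branch)  # returns an Arrow table
    return table.to_pandas().to_string(index=False)

if __name__ == "__main__":
    mcp.run()  # serve the tools over stdio for the agent to connect to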

Additional MCP servers employed include tools for web page fetching, IP address validation, access to the NIST vulnerability database, and execution of shell commands. These integrations expand the agent’s capabilities, allowing it to incorporate diverse data sources into its responses.

🗣️Conversational AI: Natural Language Access to Network Intelligence

At every stage of the threat analysis workflow, a conversational AI agent interfaces with the Bauplan data store, enabling users to interact with complex network data through natural language queries. This agent leverages large language models (LLMs) to interpret user intents, retrieve relevant data, and present insights in an accessible format.

🧪Investigating Suspicious Traffic

When prompted with:

“Find all the source IPs that communicated with domains containing the string ‘coin’ over the last 2 hours.”

The conversational agent processes the request, constructs the appropriate SQL query, interacts with the Bauplan data store via the MCP server, and returns the relevant results.

Sample run of the conversational agent

This task took approximately two minutes to complete and involved 14 interaction turns with Claude 3.7 Sonnet, utilizing 13 tool calls to query the Bauplan data store via the MCP server. Unlike a single-prompt LLM query, the agent engaged in iterative reasoning to refine its search strategy.

The Logfire trace provided insights into the agent’s thought process:

  • Initially, the agent executed a SQL query targeting the DNS_QUERY field in the netflows table for entries containing "coin" within the past 2 hours.
  • Upon finding no results, it autonomously extended the time frame to 24 hours.
  • Still encountering no matches, the agent reconsidered its approach, hypothesizing that the DNS_QUERY field might not capture all relevant traffic.
  • It then expanded the search to include additional fields such as HTTP_URL, TLS_SERVER_NAME, and HTTP_SITE, broadening the scope to identify potential matches.
  • This iterative process continued until the agent successfully retrieved relevant results.

Below are snippets from Logfire’s extensive tracing of the agent’s “thinking” process as it collected the information needed to complete the assigned task (partial trace); an illustrative reconstruction of the first and final queries follows these trace notes:

The first query was a SQL statement to search inside the DNS_QUERY field of the netflows table.
The agent is extending the time frame autonomously to 24 hours since nothing was found in the past 2 hours.
The previous queries targeted the DNS_QUERY attribute of the netflows table for text containing the word “coin”; since they did not generate any results, a new search strategy is put in place.
This query now includes additional fields in search of the “coin” text inside.
This query included the L7_INFO fields as well.
The final query had all the HTTP_URL, TLS_SERVER_NAME, HTTP_SITE fields in the query.
Final step with results.
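Putting the trace steps together, the progression from the first narrow query to the final broadened one looks roughly like the reconstruction below. The SQL is illustrative rather than the literal statements from the trace, and column names such as IPV4_SRC_ADDR and FLOW_START_TIME are assumptions based on typical nProbe exports; the DNS_QUERY, HTTP_URL, TLS_SERVER_NAME, and HTTP_SITE fields and the netflows table come from the trace itself.

# Illustrative reconstruction of the agent's query refinement (not the literal trace output).
# IPV4_SRC_ADDR and FLOW_START_TIME are assumed column names from a typical nProbe export.
initial_query = """
SELECT DISTINCT IPV4_SRC_ADDR, DNS_QUERY
FROM netflows
WHERE DNS_QUERY LIKE '%coin%'
  AND FLOW_START_TIME >= NOW() - INTERVAL '2 hours'
"""

# After empty results, the agent widened both the time window and the searched fields.
final_query = """
SELECT DISTINCT IPV4_SRC_ADDR, DNS_QUERY, HTTP_URL, TLS_SERVER_NAME, HTTP_SITE
FROM netflows
WHERE (DNS_QUERY LIKE '%coin%'
       OR HTTP_URL LIKE '%coin%'
       OR TLS_SERVER_NAME LIKE '%coin%'
       OR HTTP_SITE LIKE '%coin%')
  AND FLOW_START_TIME >= NOW() - INTERVAL '24 hours'
"""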

This example highlights the agent’s ability to autonomously adjust its strategy based on intermediate outcomes, showcasing a level of reasoning beyond static prompt responses.

The hybrid reasoning capabilities of Claude 3.7 Sonnet, which allow for both rapid responses and extended, step-by-step thinking, are particularly beneficial in complex tasks like threat detection and analysis.

This scenario underscores the value of integrating advanced AI reasoning into network security workflows, enabling more thorough and adaptive threat investigations.

🧪 Investigating Anomalous Outbound Traffic

The following is another real-world example of the AI conversational agent in action, showcasing how it supports step-by-step investigation and decision-making in a security context.

The interaction begins with a simple request:

“Compare network traffic from the past 4 hours with the traffic at the same time yesterday”

🔍 The agent processes and executes the comparison, detecting a noticeable increase in outbound traffic. This triggers a follow-up inquiry:

“Where is the increase in outbound traffic going?”

📍Once the server generating the spike is identified, the analyst asks:

“What type of data was transferred from 192.168.50.189?”

📡 The agent evaluates the types of destinations, protocols, and payload patterns, and flags several high-risk indicators suggesting potential data exfiltration.

Based on its internal risk scoring and contextual analysis, the agent concludes:

Immediate investigation is warranted. Suggested next steps include isolating the server, triggering DLP logging, and notifying a human analyst or escalating to a more specialized AI agent for deep packet analysis.

This conversation instead uses Claude 3.5 Sonnet, which exhibits a more concise, question-focused interaction style than the earlier example powered by Claude 3.7. The model follows prompts with higher fidelity, providing direct, well-bounded answers that align precisely with the investigation flow.

Under the Hood: A Glimpse at the Agent Code

While the production version of the agent includes additional features — such as token management, call safeguards, and a robust system prompt — the core logic remains surprisingly straightforward.

Conversational agent skeleton code.
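A minimal sketch of such a skeleton, using the same Pydantic AI patterns that appear later in this post; the Bauplan MCP server launch command is a placeholder and the chat loop is reduced to its essentials:

# Minimal conversational-agent sketch (illustrative, not the production agent).
import asyncio
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

# Placeholder launch command; substitute the published Bauplan MCP server entry point.
bauplan_server = MCPServerStdio("python", args=["-m", "bauplan_mcp_server"])

agent = Agent(
    "anthropic:claude-3-7-sonnet-latest",
    system_prompt="You are a network security analyst with SQL access to NetFlow data.",
    mcp_servers=[bauplan_server],
    instrument=True,  # enable Logfire tracing
)

async def chat() -> None:
    history = []
    async with agent.run_mcp_servers():
        while True:
            prompt = input("analyst> ")
            if prompt.lower() in {"exit", "quit"}:
                break
            result = await agent.run(prompt, message_history=history)
            history = result.all_messages()  # keep conversational context
            print(result.data)

if __name__ == "__main__":
    asyncio.run(chat())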

This example illustrates how conversational agents — when backed by a robust data lakehouse architecture and DPI-enriched NetFlow data — can do more than just query information; they can autonomously guide and escalate real investigations.

By combining natural language interaction with contextual memory and procedural reasoning, these agents become a powerful interface for SOC workflows.

🧠Multi-Agent Cybersecurity System

The Multi-Agent Alerts Investigation System is designed to autonomously handle security alerts. Given a time window, for instance the last 12 hours, the agent fetches all relevant alerts and performs the following steps:

  1. Contextual Analysis: Evaluates the alert in the context of historical data and current network activity.
  2. Threat Intelligence Correlation: Cross-references the alert with known threat intelligence feeds to assess credibility.
  3. Remediation Suggestions: Proposes actionable steps to mitigate the identified threat.
  4. Narrative Generation: Compiles a detailed narrative of the investigation, suitable for reporting and auditing purposes.

1. System Overview

The Cybersecurity Alert Analysis System implements a multi-agent architecture for comprehensive security alert analysis. The system employs specialized AI agents to decompose complex security analysis into distinct functional domains, allowing for thorough investigation of high-severity alerts and suspicious network activities.

2. Multi-Agent Architecture

The system is built on a coordinated multi-agent framework where specialized agents perform discrete analytical tasks under the orchestration of a central coordinator. This architecture provides several advantages:

  • Separation of concerns: Each agent focuses on specific analytical domains
  • Specialization: Agents leverage domain-specific knowledge and techniques
  • Parallel processing potential: Components can be executed concurrently (though currently implemented sequentially)
  • Extensibility: New agent types can be added without modifying existing ones

🧩Architecture Diagram

          ┌─────────────────┐
          │   Coordinator   │
          │      Agent      │
          └────────┬────────┘
                   │ Orchestrates
┌─────────────────────────────────────────────────────────┐
│                                                         │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌──────────┐    │
│  │  Alert  │  │   IP    │  │ Network │  │Validation│    │
│  │ Analyzer│  │Investig.│  │ Analyzer│  │  Agent   │    │
│  └─────────┘  └─────────┘  └─────────┘  └──────────┘    │
│                                                         │
│              ┌─────────────────────┐                    │
│              │   Recommendation    │                    │
│              │        Agent        │                    │
│              └─────────────────────┘                    │
│                                                         │
└─────────────────────────────────────────────────────────┘

The Pydantic AI Framework provides the foundation for creating the agents in the system. Logfire provides execution tracing and timing.

3. Agent Roles & Functions

3.1 Coordinator Agent

  • Primary Function: Orchestrates workflow between specialized agents and maintains high-level investigation coherence
  • Key Responsibilities:
      • Directing information flow between agents
      • Resolving analysis inconsistencies
      • Ensuring comprehensive coverage of security aspects
      • Producing final coherent reports

Implementation: Uses broad system prompts to maintain balanced reporting with appropriate confidence levels

3.2 Alert Analyzer Agent

  • Primary Function: Analyzes security alerts to identify patterns and trends
  • Key Capabilities:
      • Querying alert databases using SQL
      • Pattern recognition across multiple alerts
      • Alert severity assessment
      • Temporal correlation of alerts

Implementation: Leverages Bauplan MCP Server for alert data queries and employs structured pattern identification techniques

3.3 IP Investigator Agent

  • Primary Function: Gathers and analyzes intelligence about IP addresses
  • Key Capabilities:
      • IP geolocation and ownership analysis
      • Organizational profiling of IP owners
      • Historical IP behavior assessment
      • Domain resolution and WHOIS lookups

Implementation: Interfaces with ipinfo and whois MCP servers for enrichment data

3.4 Network Analyzer Agent

  • Primary Function: Analyzes network flow data for suspicious patterns
  • Key Capabilities:
      • Traffic timing pattern analysis
      • Data volume anomaly detection
      • Protocol and port usage assessment
      • Connection pattern recognition

Implementation: Uses SQL queries against the netflows table together with the LLM’s analytical capabilities to identify patterns; the agent interprets the results to detect timing regularities, data volume anomalies, and connection characteristics.

3.5 Validation Agent

  • Primary Function: Provides alternative explanations and assesses evidence strength
  • Key Capabilities:
      • Generating non-malicious explanations for suspicious patterns
      • Assessing confidence levels for conclusions
      • Identifying information gaps
      • Evaluating evidence strength

Implementation: Applies structured critical thinking to prevent false positives and maintain balanced assessment

3.6 Recommendation Agent

  • Primary Function: Develops balanced, prioritized security action plans
  • Key Capabilities:
      • Creating phased remediation approaches
      • Prioritizing actions based on risk and confidence
      • Balancing security with operational impact
      • Developing comprehensive threat assessments

Implementation: Uses a four-phase approach (Verification, Containment, Investigation, Remediation)

4. Agent Coordination Workflow

The system implements a sequential workflow with structured information passing:

Coordinator
    │ 1. Initiates analysis
    │
Alert Analyzer ────────────┐
    │                      │
    │ 2. Alert             │ 3. Pattern
    │    Summary           │    Analysis
    ▼                      │
Network Analyzer ◄─────────┘
    │ 4. Network
    │    Timing
    │
IP Investigator
    │ 5. IP
    │    Intelligence
    │
Validation Agent
    │ 6. Alternative
    │    Explanations
    │
Recommendation Agent ──────┐
    │                      │
    │ 7. Threat            │ 8. Recommendations
    │    Assessment        │
    ▼                      │
Coordinator ◄──────────────┘
    │ 9. Executive Summary
    │    & Final Report
    │
Final Report

5. Shared Memory Implementation

The system employs a simple shared memory mechanism for inter-agent communication:

from typing import Any

class SharedMemory:
    def __init__(self):
        self._memory = {}
        self._schema_cache = {}  # For database schemas

    def store(self, key: str, value: Any) -> None:
        self._memory[key] = value

    def retrieve(self, key: str) -> Any:
        return self._memory.get(key)

    # Additional methods for schema caching & memory management

This provides efficient data sharing while maintaining clear boundaries between agent responsibilities. The shared memory stores intermediate analysis results, ensuring consistent information access across all agents.

6. External Integration

The system integrates with external data sources through MCP servers:

┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│ Fetch Server  │     │Bauplan Server │     │ IPInfo Server │
└───────┬───────┘     └───────┬───────┘     └───────┬───────┘
        │                     │                     │
        └─────────────┬───────┴─────────────┬───────┘
                      │                     │
                ┌─────▼─────┐         ┌─────▼─────┐
                │   Agent   │         │   WHOIS   │
                │  System   │◄────────┤  Server   │
                └───────────┘         └───────────┘

Key integrations include:

  • Fetch Server: General data retrieval from URLs
  • Bauplan Server: Bauplan data store access for alerts and netflows
  • IPInfo Server: IP address enrichment
  • WHOIS Server: Domain information lookup

7. Implementation Details

7.1 Agent Initialization

The system initializes agents with specialized system prompts and shared MCP servers:

def initialize_agents():
    mcp_servers = initialize_mcp_servers()

    coordinator_agent = Agent(
        'anthropic:claude-3-7-sonnet-latest',
        instrument=True,
        system_prompt=COORDINATOR_PROMPT,
        mcp_servers=mcp_servers
    )

    # Initialize additional specialized agents
    # ...

    return {
        "coordinator": coordinator_agent,
        "alert_analyzer": alert_analyzer_agent,
        # Other agents...
    }
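The initialize_mcp_servers() helper follows the same pattern. A minimal sketch is shown below; the launch commands for the Bauplan, IPInfo, and WHOIS servers are placeholders for the actual server entry points:

# Sketch of the MCP server setup referenced by initialize_agents() above.
# Only mcp-server-fetch is a published command; the other launch commands are placeholders.
from pydantic_ai.mcp import MCPServerStdio

def initialize_mcp_servers():
    return [
        MCPServerStdio("uvx", args=["mcp-server-fetch"]),              # web page fetching
        MCPServerStdio("python", args=["-m", "bauplan_mcp_server"]),   # Bauplan data store (placeholder)
        MCPServerStdio("python", args=["-m", "ipinfo_mcp_server"]),    # IP enrichment (placeholder)
        MCPServerStdio("python", args=["-m", "whois_mcp_server"]),     # WHOIS lookups (placeholder)
    ]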

7.2 Task Execution

Tasks are executed via a standardized function that handles MCP server lifecycle:

async def execute_agent_task(agent, prompt, model_settings=None):
    async with agent.run_mcp_servers():
        if model_settings:
            result = await agent.run(prompt, model_settings=model_settings)
        else:
            result = await agent.run(prompt)
        return result.data

7.3 Analysis Workflow

The main analysis workflow is implemented as a sequence of agent invocations with progressive knowledge building:

async def analyze_recent_alerts_multi_agent(time_hours=24, min_score=100):
    # Initialize agents and shared memory
    agents = initialize_agents()
    shared_memory.clear()

    # Calculate time window
    current_time = datetime.now()
    start_time = current_time - timedelta(hours=time_hours)
    start_time_str = start_time.strftime("%Y-%m-%d %H:%M:%S")

    # Store common parameters
    shared_memory.store("start_time_str", start_time_str)
    shared_memory.store("min_score", min_score)

    # Step 1: Alert Summary (Alert Analyzer)
    alert_summary = await execute_agent_task(agents["alert_analyzer"], alert_summary_prompt)
    shared_memory.store("alert_summary", alert_summary)

    # Steps 2-9: Execute remaining analysis steps
    # ...

    # Assemble final report
    full_report = (
        f"{report_header}"
        f"{shared_memory.retrieve('executive_summary')}\n\n"
        f"{shared_memory.retrieve('alert_summary')}\n\n"
        # Additional sections...
    )

    return full_report
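Running an investigation then comes down to a single call, for example over a four-hour window as in the sample run below:

# Example invocation (a sketch): analyze high-severity alerts from the past 4 hours.
import asyncio

if __name__ == "__main__":
    report = asyncio.run(analyze_recent_alerts_multi_agent(time_hours=4, min_score=100))
    print(report)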

8. Confidence Classification Framework

The system employs a standardized confidence classification system:

from typing import Tuple

def classify_confidence(evidence_strength: int) -> Tuple[str, str, str]:
    if evidence_strength >= 9:
        return ("High Confidence", "Strong evidence supports this conclusion",
                "This is almost certainly")
    elif evidence_strength >= 6:
        return ("Medium Confidence", "Multiple indicators suggest this, but alternative explanations exist",
                "This likely represents")
    elif evidence_strength >= 3:
        return ("Low Confidence", "Some indicators suggest this, but evidence is limited",
                "This could potentially be")
    else:
        return ("Speculative", "Limited evidence with multiple possible interpretations",
                "This might possibly be")

This ensures consistent language across all agents when expressing certainty levels.
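For example, an evidence strength of 7 falls in the medium band, and the returned phrases can be embedded directly in an agent’s narrative:

label, rationale, phrase = classify_confidence(7)
print(label)      # "Medium Confidence"
print(rationale)  # "Multiple indicators suggest this, but alternative explanations exist"
print(phrase)     # "This likely represents"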

9. Current Limitations & Future Extensions

The current implementation has several limitations that future versions will address:

  1. Sequential Execution: Agents operate sequentially rather than in parallel, limiting performance
  2. Limited Data Sources: The system would benefit from additional data source integrations
  3. In-Memory Storage: Persistent storage would improve historical analysis capabilities
  4. Manual Agent Coordination: The coordinator uses predefined workflows rather than dynamic orchestration

Planned extensions include:

  • Parallel agent execution for improved performance (a possible shape is sketched after this list)
  • Integration with additional security tools and data sources
  • Implementation of persistent storage for historical analysis
  • Dynamic workflow orchestration based on investigation needs
  • Enhanced visualization capabilities for security findings
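As an illustration of the first planned extension, agents whose prompts depend only on data already in shared memory could run concurrently. The sketch below shows one possible shape using asyncio.gather; it is not current behavior, and the agent dictionary keys are assumed:

# Possible future shape: run agents with independent inputs concurrently.
# Assumes both prompts depend only on data already in shared memory.
import asyncio

async def run_independent_analyses(agents, ip_prompt, network_prompt):
    ip_intelligence, network_timing = await asyncio.gather(
        execute_agent_task(agents["ip_investigator"], ip_prompt),
        execute_agent_task(agents["network_analyzer"], network_prompt),
    )
    shared_memory.store("ip_intelligence", ip_intelligence)
    shared_memory.store("network_timing", network_timing)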

10. Conclusion

The Cybersecurity Alert Analysis System demonstrates the effectiveness of a multi-agent architecture for complex security analysis tasks. By decomposing analysis into specialized domains while maintaining cohesive orchestration, the system produces comprehensive, balanced security assessments with appropriate confidence levels and actionable recommendations.

🧪Sample Run: Real-Time Threat Analysis

Executing an analysis of network alerts in the past 4 hours:

📌 Key Findings and Recommendations

After 6 minutes, these are the key findings:

Key Security Findings

  • Two distinct high-severity patterns were identified in the network traffic
  • Pattern 1: Regular 5-minute interval communications between host 192.168.50.71 and Ubuntu repositories
      • Uses obsolete nginx servers (versions 1.14.0/1.18.0)
      • Consistently missing User-Agent headers
      • Likely represents automated system maintenance using vulnerable components
  • Pattern 2: Irregular communications between host 192.168.50.189 and Apple CDN servers
      • Shows unusual port usage (TLS on port 80)
      • Exhibits connection failures and protocol anomalies
      • Has a concerning 5:1 ratio of outbound to inbound data
      • Could represent misconfigured application, mobile device sync issues, or suspicious data movement

✅ Recommended Actions

  • Verification Phase
      • Inspect Ubuntu update mechanisms on 192.168.50.71
      • Analyze full packet content of communications with Apple CDN servers
      • Establish traffic baselines for comparison with historical patterns
  • Containment Phase
      • Implement temporary egress filtering for unusual communications
      • Deploy enhanced monitoring for protocol anomalies
      • Conduct endpoint security scans for vulnerable components
  • Investigation Phase
      • Perform process-to-connection analysis to identify initiating processes
      • Review application inventory against observed traffic patterns
      • Examine asymmetric data transfer patterns, especially the 5:1 outbound ratio
  • Remediation Phase
      • Update obsolete nginx servers regardless of malicious intent confirmation
      • Correct application configurations if misconfiguration is confirmed
      • Implement ongoing monitoring controls for protocol/port anomalies and empty User-Agent headers

Report Structure

The agent generated a security report with clearly defined sections:

  • Executive Summary: Concise overview of critical findings with high-level assessment of two distinct network activity patterns, including regular Ubuntu repository communications using obsolete components and irregular Apple CDN connections with unusual protocol characteristics
  • Alert Summary: Overview of high-severity alerts (score ≥ 100) showing concentration from two primary internal hosts (192.168.50.71 and 192.168.50.189) with alerts related to obsolete nginx servers, missing User-Agent headers, and TCP probing attempts
  • Top Alert Patterns: Identification of three primary patterns, including Pattern 1 (obsolete nginx communications), Pattern 2 (TCP probing activities), and Pattern 3 (internal network communications)
  • Alert Pattern Analysis: Detailed breakdown of observed patterns with factual observations and potential interpretations, examining protocol usage, alert classifications, and possible explanations for each pattern
  • IP Intelligence: Comprehensive analysis of the external IPs involved, including ownership information for Canonical Group Limited and Apple Inc., along with threat assessments for each entity
  • Network Timing Analysis: Examination of temporal patterns in communications, showing consistent 5-minute intervals for Ubuntu traffic and irregular patterns with failed connection attempts for Apple traffic
  • Alternative Explanations: Balanced consideration of non-malicious explanations for observed patterns, including system updates, legacy applications, mobile device synchronization issues, and misconfigured applications
  • Threat Assessment: Overall security risk evaluation (Medium with Medium-High Confidence) with specific risk ratings for each pattern and assessment of potential impacts
  • Recommended Actions: Structured approach with four phases (Verification, Containment, Investigation, Remediation) prioritizing inspection of update mechanisms, protocol monitoring, process analysis, and component updates

The full report is available here:

Essential Tools to Address Information Gaps

To address the information gaps identified in the report, future improvements should give the multi-agent system access to:

1. Process Information

  • Centralized Monitoring Systems:
      • Prometheus with node exporters on target systems
      • Elastic Stack (ELK) with Beats agents for process tracking
      • API access to EDR solutions like CrowdStrike or Microsoft Defender for Endpoint

2. Application Inventory

  • IT Asset Management Systems:
      • CMDB (Configuration Management Database) API access
      • API integration with software inventory tools like ServiceNow, Qualys, or Tanium
      • Application dependency mapping from tools like Dynatrace or AppDynamics

3. User-Agent Testing

  • Testing Infrastructure:
      • Sandboxed network environment for policy testing
      • Network policy simulation tools with API access
      • Canary deployment capability for traffic rule testing

4. Organizational Context

  • Enterprise Systems:
      • IT asset database with system classification data
      • MDM system API for device information
      • DHCP/DNS management system with hostname-to-IP mapping
      • Identity management system integration

5. Network Segmentation Details

  • Network Policy Systems:
      • Firewall management platform with rule base access
      • Network segmentation policy documentation system
      • Zero Trust policy engine API access
      • Network Access Control system integration

All these systems should provide API-based access, and possibly MCP servers, so that the multi-agent system can query and analyze data without requiring direct access to the monitored endpoints.

Summary and Key Benefits

This comprehensive network threat detection and analysis solution effectively integrates advanced sensor technology, scalable data infrastructure, sophisticated transformation pipelines, and state-of-the-art generative agentic AI workflows:

  • Enhanced Detection Capability: Deployment of nProbe with integrated nDPI provides high-quality, enriched NetFlow data, enabling rapid detection and scoring of potential threats across diverse risk categories.
  • Scalable Data Ingestion and Storage: Kafka and S3-based infrastructure deliver highly scalable, schema-enforced, cost-efficient storage solutions, ensuring robustness and reliability in high-volume operational environments.
  • Efficient Data Transformation: Bauplan-based automated pipelines systematically convert raw network data into actionable intelligence via structured lakehouses, Iceberg tables, and customized alert generation processes.
  • Automated AI-Driven Analysis: The integrated generative AI agentic workflow streamlines alert management, significantly reducing false positives and response times, while delivering enriched, insightful, automated incident reports.

The comprehensive structure of the generated report demonstrates how the AI agent can systematically analyze security alerts, correlate them with network flows, and produce actionable intelligence with prioritized recommendations.


Written by Marco Graziano

Engineer, apprentice renaissance man. I am the founder of technology start-ups in Palo Alto and of Graziano Labs Corp.
