SEC401 – Network Forensics
Lab 1.3 - AWS VPC Flow Log Analysis
Solo, Lab
Focus: Cloud Network Forensics
Level: SEC401
Date: Apr 2026
Artifacts: Sanitized screenshots from VPC flow log analysis and NetFlow conversion
TL;DR
- Extracted 33,232 attacker flows from 173K records across 579 compressed VPC log files
- Identified 6.5-hour attack window with 265MB exfiltrated on port 8889 and 190MB on port 80
- Confirmed full attack surface (ports 80, 22, 8889) via PCAP-to-NetFlow conversion with nfpcapd/nfdump
Skills demonstrated
Note: Course-provided PCAPs and lab instructions are not shared. Only my own captures and sanitized notes are published.
Why this matters
VPC Flow Logs are often the first data source available during a cloud incident. Knowing how to rapidly extract attacker flows from hundreds of compressed log files, calculate data exfiltration volumes, and correlate with PCAP-derived NetFlow is exactly what a SOC analyst or incident responder does when investigating a breach in AWS.
Context
This lab demonstrates how to analyze AWS VPC Flow Logs to investigate attacker activity at scale. Starting with 579 gzip-compressed log files containing 173,198 flow records, the goal was to extract, filter, and quantify traffic from a known attacker IP (20.106.124.93), determine the attack timeframe, calculate data transfer volumes per service, and convert PCAP data into NetFlow format for comparison analysis.
Tools used
Steps taken
1. List and identify VPC flow log files
Listed all files in the log directory: 579 gzip-compressed VPC flow log files. Used the file command to confirm they were gzip compressed data from a FAT filesystem, original size ~32KB each.
$ ls /sec401/labs/1.3/20230928/ | wc -l
$ file /sec401/labs/1.3/20230928/2226771286B0_vpcflowlogs_us-east-2_fl-0272f42338e6eeaaf_20230928T23552_e92fb168.log.gz
- wc -l: count files
- file: identify file type and compression
2. Inspect flow log format and sample records
Decompressed a log file with zcat and piped to head -4 to see the header and first records. The VPC flow log format includes: version, region, account-id, instance-id, interface-id, type, srcaddr, dstaddr, srcport, dstport, protocol, bytes, packets, tcp-flags, start, end, action, log-status, flow-direction, traffic-path. First records showed 35.203.211.65 being REJECT'd and 10.130.8.94 ACCEPT'd traffic.
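The later steps refer to fields by column number ($9, $10, $12, $15), so it helps to derive those numbers from the header instead of counting by hand. The sketch below is a minimal example, assuming the header is a single space-separated line in the field order listed above; the exact spelling of the names in the real log header is an assumption, and `col_of` is a hypothetical helper name.

```shell
# Field order as listed in the write-up; the on-disk header spelling is an assumption.
HEADER="version region account-id instance-id interface-id type srcaddr dstaddr srcport dstport protocol bytes packets tcp-flags start end action log-status flow-direction traffic-path"

# col_of NAME: print the 1-based column number of a field in $HEADER.
col_of() {
  printf '%s\n' "$HEADER" |
    awk -v name="$1" '{ for (i = 1; i <= NF; i++) if ($i == name) print i }'
}

# Print the awk column reference for the fields used in later steps.
for field in srcaddr dstaddr srcport dstport bytes start; do
  printf '%-8s $%s\n' "$field" "$(col_of "$field")"
done
```

With this field order, srcport maps to $9, dstport to $10, bytes to $12, and start to $15. Against a real log, HEADER could be filled from the first line of one decompressed file instead of being typed in.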
$ zcat /sec401/labs/1.3/20230928/2226771286B0_vpcflowlogs_us-east-2_fl-0272f42338e6eeaaf_20230928T23552_e92fb168.log.gz | head -4
- zcat: decompress and output to stdout
- head -4: show header + 3 sample records
3. Count total flow records
Decompressed all 579 log files and counted total lines: 173,198 flow records to investigate.
$ zcat /sec401/labs/1.3/20230928/*log.gz | wc -l
- *log.gz: glob all compressed logs
- wc -l: count total lines
4. Extract attacker flows
Used zgrep to search all compressed log files for the known attacker IP (20.106.124.93) and redirected matches to attacker-flows.log. Result: 33,232 flow records from the attacker.
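One caveat with a plain grep on an IP address: the pattern matches substrings, so 20.106.124.93 would also match 120.106.124.93, and each unescaped dot matches any character. A stricter variant compares the srcaddr and dstaddr columns exactly. The sketch below is illustrative only: `flows_for_ip` is a hypothetical helper, and the sample records are synthetic, not lab data.

```shell
# flows_for_ip IP [FILE...]: keep records whose srcaddr ($7) or dstaddr ($8)
# equals IP exactly. Reads stdin when no file is given.
flows_for_ip() {
  ip="$1"; shift
  awk -v ip="$ip" '$7 == ip || $8 == ip' "$@"
}

# Synthetic records in the 20-field layout described above (not lab data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
2 us-east-2 000000000000 i-0example eni-0example IPv4 20.106.124.93 10.130.8.94 51278 80 6 60 1 2 1695921755 1695921760 ACCEPT OK ingress -
2 us-east-2 000000000000 i-0example eni-0example IPv4 10.130.8.94 20.106.124.93 36072 8889 6 1500 3 24 1695921800 1695921860 ACCEPT OK egress -
2 us-east-2 000000000000 i-0example eni-0example IPv4 120.106.124.93 10.130.8.94 40000 443 6 40 1 2 1695921900 1695921905 REJECT OK ingress -
EOF

echo "substring match: $(grep -c '20.106.124.93' "$sample") records"   # 3, includes 120.106.124.93
echo "exact match:     $(flows_for_ip 20.106.124.93 "$sample" | wc -l | tr -d ' ') records"   # 2
```

For the compressed logs, the same function can sit at the end of a zcat pipeline, since awk reads standard input when no file argument is passed.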
$ zgrep --no-filename 20.106.124.93 /sec401/labs/1.3/20230928/*log.gz > /sec401/labs/1.3/attacker-flows.log
$ wc -l /sec401/labs/1.3/attacker-flows.log
- zgrep: grep compressed files
- --no-filename: omit file names from output
- >: redirect to attacker-flows.log
5. Determine attack timeframe
Sorted attacker flows by the start-time epoch field (column 15) to find the earliest and latest timestamps. Converted epochs with date -d: the attack ran from Sep 28, 2023 5:22 PM to 11:59 PM UTC, roughly 6.5 hours.
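The "roughly 6.5 hours" figure can be checked with plain shell arithmetic on the two epoch values from this step. The sketch below also shows a single-pass alternative to sorting twice; `start_range` is a hypothetical helper, and the sample rows are synthetic apart from the first and last start epochs, which are the ones reported here.

```shell
# start_range FILE: print the smallest and largest start epoch ($15) in one pass.
start_range() {
  awk 'NR == 1 { min = max = $15 + 0 }
       { v = $15 + 0; if (v < min) min = v; if (v > max) max = v }
       END { printf "%d %d\n", min, max }' "$1"
}

# Synthetic rows; only the first and last start epochs come from the write-up.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2 us-east-2 000000000000 i-0example eni-0example IPv4 20.106.124.93 10.130.8.94 51278 80 6 60 1 2 1695930000 1695930005 ACCEPT OK ingress -
2 us-east-2 000000000000 i-0example eni-0example IPv4 20.106.124.93 10.130.8.94 51279 80 6 60 1 2 1695921755 1695921760 ACCEPT OK ingress -
2 us-east-2 000000000000 i-0example eni-0example IPv4 10.130.8.94 20.106.124.93 36072 8889 6 1500 3 24 1695945545 1695945600 ACCEPT OK egress -
EOF

set -- $(start_range "$sample")
first=$1
last=$2

elapsed=$(( last - first ))
hours=$(( elapsed / 3600 ))
minutes=$(( (elapsed % 3600) / 60 ))
seconds=$(( elapsed % 60 ))
printf '%d seconds = %dh %dm %ds\n' "$elapsed" "$hours" "$minutes" "$seconds"
# prints: 23790 seconds = 6h 36m 30s
```

So the window between the first and last flow start is about 6.6 hours, which is consistent with the rounded figure.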
$ sort -nk 15 /sec401/labs/1.3/attacker-flows.log | head -1
$ date -d @1695921755
$ sort -nk 15 /sec401/labs/1.3/attacker-flows.log | tail -1
$ date -d @1695945545
- sort -nk 15: numeric sort on column 15 (start epoch)
- date -d @epoch: convert epoch to human-readable
6. Quantify data transfer by port
Used awk to filter attacker flows by port and sum the bytes field (column 12). Flows with destination port 8889 (column 10) carried 265,183,813 bytes (~265MB), and flows with source port 80 (column 9, the web server's responses to the attacker) carried 190,703,527 bytes (~190MB). The high volume on port 8889 is a strong indicator of data exfiltration over a non-standard port.
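The two sums in this step can be folded into one reusable function that takes the column and port as arguments. This is a minimal sketch, not the lab's method: `sum_bytes` and `to_units` are hypothetical helpers, and the sample rows are synthetic. The unit conversion at the end uses the byte totals reported in this step.

```shell
# sum_bytes FILE COLUMN PORT: total of the bytes field ($12) for rows
# where field number COLUMN equals PORT.
sum_bytes() {
  awk -v col="$2" -v port="$3" '$col == port { sum += $12 } END { printf "%d\n", sum }' "$1"
}

# to_units BYTES: show a byte count in decimal megabytes and binary mebibytes.
to_units() {
  awk -v b="$1" 'BEGIN { printf "%.1f MB, %.1f MiB\n", b / 1e6, b / 1048576 }'
}

# Synthetic rows (not lab data): two flows to dstport 8889, one response from
# srcport 80, and one request to dstport 80.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2 us-east-2 000000000000 i-0example eni-0example IPv4 10.130.8.94 20.106.124.93 36072 8889 6 1500 3 24 1695921800 1695921860 ACCEPT OK egress -
2 us-east-2 000000000000 i-0example eni-0example IPv4 10.130.8.94 20.106.124.93 36072 8889 6 2500 4 24 1695921900 1695921960 ACCEPT OK egress -
2 us-east-2 000000000000 i-0example eni-0example IPv4 10.130.8.94 20.106.124.93 80 51278 6 700 2 24 1695921755 1695921760 ACCEPT OK egress -
2 us-east-2 000000000000 i-0example eni-0example IPv4 20.106.124.93 10.130.8.94 51278 80 6 60 1 2 1695921755 1695921760 ACCEPT OK ingress -
EOF

sum_bytes "$sample" 10 8889   # dstport 8889 -> 4000
sum_bytes "$sample" 9 80      # srcport 80   -> 700

to_units 265183813   # 265.2 MB, 252.9 MiB
to_units 190703527   # 190.7 MB, 181.9 MiB
```

The "~265MB" and "~190MB" figures are decimal megabytes; the same totals in binary units come out roughly 5% smaller, which matters when comparing against tools that report MiB.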
$ cat attacker-flows.log | awk '$10 == "8889"' | awk '{SUM=SUM+$12} END{print "Total bytes transferred: "SUM}'
$ cat attacker-flows.log | awk '$9 == "80"' | awk '{SUM=SUM+$12} END{print "Total bytes transferred: "SUM}'
- $10 == "8889": filter by dst port 8889
- $9 == "80": filter by src port 80 (column 9 is srcport, so these are the web server's responses)
- $12: bytes field
- SUM+$12: running total
7. Convert PCAP to NetFlow with nfpcapd
Used nfpcapd to convert the investigate.pcap from Lab 1.2 into NetFlow format, outputting to exported-netflow/ directory. This enables flow-level analysis of the same traffic using NetFlow tools.
$ nfpcapd -r /sec401/labs/1.2/investigate.pcap -w exported-netflow/
- -r: read PCAP file
- -w: write NetFlow output directory
8. Analyze NetFlow with nfdump
Dumped the converted NetFlow data to a text file and opened it. The output shows Date first seen, Duration, Proto, Src/Dst IP:Port, Packets, Bytes, and Flows columns. This structured format makes it easy to filter and correlate with VPC flow log findings.
$ nfdump -R exported-netflow/ > pcap-derived-netflow.txt
- -R: read recursively from directory
9. Filter NetFlow for attacker on port 80
Filtered the PCAP-derived NetFlow for the attacker IP on port 80. Confirmed HTTP traffic: 20.106.124.93:51278 to 10.130.8.94:80, matching the WordPress brute-force activity found in Labs 1.1 and 1.2.
$ head -1 pcap-derived-netflow.txt; cat pcap-derived-netflow.txt | grep 20.106.124.93 | head -2
10. Filter for attacker SSH traffic
Excluded port 80 and filtered for remaining attacker flows. Found SSH connections on port 22 from 20.106.124.93:38504 to 10.130.8.94:22, indicating the attacker also accessed the server via SSH.
$ head -1 pcap-derived-netflow.txt; cat pcap-derived-netflow.txt | grep 20.106.124.93 | grep -v :80 | head -2
11. Identify non-standard port activity
Excluded ports 80 and 22, revealing traffic on port 8889: 20.106.124.93:8889 to 10.130.8.94:36072. Port 8889 is not a well-known service (confirmed via /etc/services), making this a likely data exfiltration channel consistent with the 265MB volume found in VPC flow logs.
$ head -1 pcap-derived-netflow.txt; cat pcap-derived-netflow.txt | grep 20.106.124.93 | grep -v :80 | grep -v :22 | head -2
- grep -v: exclude matches
- Sequential exclusion isolates unknown services
12. Confirm complete attack surface
Excluded all three known ports (80, 22, 8889) from attacker flows. Empty result confirmed the attacker used only these three services: HTTP for the initial brute-force, SSH for interactive access, and port 8889 for data exfiltration.
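A caveat worth keeping in mind: grep -v :80 and grep -v :22 match substrings, so they also drop lines containing :8080 or :2222, and :22 can match the minutes or seconds in a timestamp such as 17:22:35. An empty result is therefore strong evidence rather than proof. The sketch below excludes only exact port tokens. `drop_ports` is a hypothetical helper, the sample lines only approximate nfdump's text layout, and the port 8080 flow is invented to show the effect.

```shell
# drop_ports PORT...: read nfdump-style text on stdin and drop lines where an
# endpoint's port is exactly one of the given ports (":PORT" followed by a
# space or end of line).
drop_ports() {
  pattern=$(printf '%s|' "$@")
  pattern=":(${pattern%|})( |\$)"
  # grep exits 1 when nothing is left, which is an expected outcome here.
  grep -Ev "$pattern" || true
}

# Synthetic lines approximating nfdump output (not lab data). The last flow,
# on port 8080, is hypothetical.
flows=$(mktemp)
cat > "$flows" <<'EOF'
2023-09-28 17:22:35.000     0.010 TCP      20.106.124.93:51278 ->      10.130.8.94:80           5      300     1
2023-09-28 18:00:00.000     0.020 TCP      20.106.124.93:38504 ->      10.130.8.94:22           8      640     1
2023-09-28 19:00:00.000     1.000 TCP      20.106.124.93:8889 ->      10.130.8.94:36072       10     9000     1
2023-09-28 20:22:10.000     0.005 TCP      20.106.124.93:40000 ->      10.130.8.94:8080         2      120     1
EOF

echo "substring exclusion leaves: $(grep -v :80 "$flows" | grep -v :22 | grep -v :8889 | wc -l | tr -d ' ')"   # 0
echo "exact exclusion leaves:     $(drop_ports 80 22 8889 < "$flows" | wc -l | tr -d ' ')"                     # 1
```

nfdump also accepts its own filter expressions on the command line, which may be a cleaner route than text matching; the exact syntax is worth checking in the man page for the installed version.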
$ head -1 pcap-derived-netflow.txt; cat pcap-derived-netflow.txt | grep 20.106.124.93 | grep -v :80 | grep -v :22 | grep -v :8889 | head -2
Key findings
Outcome / Lessons learned
This lab demonstrated how to investigate attacker activity at scale using VPC Flow Logs. Starting from 579 compressed log files with 173K records, I isolated the attacker, mapped their 6.5-hour attack window, quantified data exfiltration volumes, and confirmed the complete attack surface across three services. The PCAP-to-NetFlow conversion bridged packet-level evidence from previous labs with flow-level cloud telemetry, showing how both data sources tell the same story from different angles.
If this were production: I'd feed the attacker IP into threat intel platforms for enrichment, check if port 8889 traffic triggered any IDS/IPS alerts, audit what data was accessible from the compromised instance, verify whether the SSH session was used for lateral movement to other instances, and configure VPC Flow Log alerts for anomalous outbound traffic volumes and non-standard ports.
Security controls relevant
- Enable VPC Flow Logs on all subnets and ENIs
- Alert on high-volume outbound traffic to non-standard ports
- Network ACLs restricting egress to approved ports only
- Security group rules limiting SSH access to known IPs
- GuardDuty for automated anomaly detection on flow data
- Centralized log aggregation (CloudWatch, S3, SIEM)
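The "alert on high-volume outbound traffic to non-standard ports" control can be prototyped against flow log text before wiring it into a SIEM. The sketch below is a rough heuristic under stated assumptions: it uses the 20-field layout from this lab, treats a flow as suspicious only when neither port is on the allowlist (so responses from an allowed service are skipped), and uses a hypothetical allowlist and byte threshold. `flag_unusual_ports` is a hypothetical helper and the sample rows are synthetic.

```shell
# flag_unusual_ports FILE THRESHOLD_BYTES ALLOWED_PORTS
# Sum bytes ($12) per dstaddr:dstport ($8:$10) for egress flows ($19) where
# neither srcport ($9) nor dstport ($10) is in the comma-separated allowlist,
# then print the pairs whose total exceeds the threshold.
flag_unusual_ports() {
  awk -v limit="$2" -v allowed="$3" '
    BEGIN { n = split(allowed, a, ","); for (i = 1; i <= n; i++) ok[a[i]] = 1 }
    $19 == "egress" && !($9 in ok) && !($10 in ok) { total[$8 ":" $10] += $12 }
    END { for (k in total) if (total[k] > limit) printf "%s %d\n", k, total[k] }
  ' "$1"
}

# Synthetic rows (not lab data); addresses other than the lab IPs are
# documentation ranges.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2 us-east-2 000000000000 i-0example eni-0example IPv4 10.130.8.94 20.106.124.93 36072 8889 6 600000 400 24 1695921800 1695921860 ACCEPT OK egress -
2 us-east-2 000000000000 i-0example eni-0example IPv4 10.130.8.94 20.106.124.93 36073 8889 6 500000 350 24 1695921900 1695921960 ACCEPT OK egress -
2 us-east-2 000000000000 i-0example eni-0example IPv4 10.130.8.94 20.106.124.93 80 51278 6 900000 600 24 1695921755 1695921760 ACCEPT OK egress -
2 us-east-2 000000000000 i-0example eni-0example IPv4 10.130.8.94 192.0.2.10 40100 443 6 2000000 1400 24 1695922000 1695922060 ACCEPT OK egress -
2 us-east-2 000000000000 i-0example eni-0example IPv4 10.130.8.94 198.51.100.7 40200 9999 6 100 1 2 1695922100 1695922105 ACCEPT OK egress -
EOF

# Hypothetical policy: 1,000,000 byte threshold, ports 22/80/443 allowed.
flag_unusual_ports "$sample" 1000000 "22,80,443"
# prints: 20.106.124.93:8889 1100000
```

The threshold and allowlist would need tuning per environment; the point is only that per-destination byte totals on unlisted ports are cheap to compute from data that is already being collected.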
What I took away from this
265MB on port 8889 is the finding that matters most in this lab, and it's the one that would be hardest to catch in production. Where egress filtering exists at all, it usually targets known ports: block outbound SSH, restrict HTTP to approved destinations. Port 8889 isn't on those lists because it's not a well-known service, and AWS security groups allow all outbound traffic by default. The attacker likely chose it because it falls into the gap between 'explicitly blocked' and 'actively monitored.' The fix isn't blocking port 8889 specifically. It's flipping the egress model: deny all outbound traffic by default, and only allow what's explicitly needed.
The 6.5-hour attack window raises a practical question: how long until someone notices? In this scenario, the attacker had more than six and a half hours of uninterrupted access. That's enough to exfiltrate an entire database, establish persistence, and pivot to other instances. Most organizations measure their mean-time-to-detect (MTTD) in days, not hours. GuardDuty could have flagged the unusual outbound volume, but only if it was enabled. VPC Flow Logs were there the entire time, recording everything, but nobody was watching in real time.
Converting the PCAP to NetFlow with nfpcapd proved something important: packet-level and flow-level data tell the same story from different angles. In a real investigation, you often have one or the other, not both. Knowing how to work with both formats and correlate between them is the difference between confirming an attack and missing half the picture. The attacker used HTTP to get in, SSH for interactive access, and port 8889 to pull data out. Flow logs showed the volume, PCAP showed the content. Together, they give you the complete kill chain.