The AWS logs you miss during an incident
Incident response in the cloud is derailed not by a lack of skill, but by a lack of visibility. Security teams frequently discover critical blind spots only after an incident is already underway, leading to delayed containment, inaccurate attribution, and incomplete forensic analysis.
This report walks through six realistic, real-world inspired scenarios where missing log sources prevented effective investigations. Each scenario highlights the specific AWS logs required to answer the most important questions during a security incident.
The “basic logging trifecta” is essential but insufficient
AWS CloudTrail (management events), VPC Flow Logs, and Route 53 Resolver Query Logs provide a strong foundational baseline for cloud security visibility. However, modern AWS environments rely heavily on serverless, containerized, and data-plane services that require additional logging.
Relying only on this baseline leaves critical gaps in areas such as object-level access, Kubernetes control-plane activity, Lambda invocations, host-level telemetry, and DNS analysis.
Critical visibility gaps and solutions
| Blind Spot | Missing Log Source | Impact |
|---|---|---|
| Network Traffic Origin | VPC Flow Logs | Inability to trace the internal source IP responsible for suspicious egress traffic |
| Data Exfiltration Scope | S3 Server Access Logs | Inability to determine which specific S3 objects were accessed or downloaded |
| Kubernetes Attribution | EKS Audit Logs | Inability to attribute Kubernetes control-plane actions, such as pod creation, to a specific Kubernetes identity |
| Serverless Abuse | CloudTrail Lambda Data Events | Inability to detect and attribute direct lambda:InvokeFunction API calls outside normal triggers |
| Host-Level Forensics | OS/App Logs via CloudWatch Agent | Lack of host-level evidence such as successful logins, privilege changes, and system activity |
| DNS C2 Channels | Route 53 Resolver Query Logs | Inability to see the actual domains being queried and identify DNS-based command-and-control activity |
Scenario 1: Missing VPC Flow Logs expose a blind spot in network traffic
At 03:12 UTC, an automated alert fires for a sudden and sustained spike in data transfer costs, traced back to a production NAT Gateway. The security team confirms the high-volume egress traffic but quickly reaches a dead end.
They have GuardDuty findings and basic AWS monitoring data, but the most important log source, VPC Flow Logs, was never enabled for the VPC. As a result, they have no way to determine which internal resource is responsible for the traffic.
Investigation breakdown
Step 1: Initial anomaly detected
- Question: “Why did our cloud bill suddenly spike?”
- Log/Data: CloudWatch Metrics for NAT Gateway
- Signal: A sudden, sustained spike in the BytesOutToDestination metric for the production NAT Gateway.
Step 2: Corroborating threat intelligence
- Question: “Is this traffic associated with a known threat?”
- Log/Data: GuardDuty Findings
- Signal: A GuardDuty alert such as Backdoor:EC2/C&CActivity.B appears, indicating suspicious outbound communication. The public IP corresponds to the NAT Gateway, but the private source inside the VPC remains unknown.
Step 3: The search for the source (the blind spot)
- Question: “Which instance inside the VPC is sending all this traffic?”
- Log/Data: VPC Flow Logs
- Signal: LOGS MISSING. Flow Logs were never configured, leaving no record of the internal source IP.
Step 4: The painstaking manual search
- Question: “Without flow logs, how can we find the source?”
- Action: Manually SSH into multiple EC2 instances to inspect active connections
- Outcome: After hours of manual effort, a compromised instance is finally identified.
Missing log highlight
Without VPC Flow Logs, the team lacked visibility into the internal source of the traffic. These logs would have provided exact srcaddr and dstaddr details, enabling immediate identification of the affected host.
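As a concrete illustration, here is a minimal Python sketch of the kind of triage flow logs make possible. It assumes the default version 2 space-separated record format (fields 4 and 10 are srcaddr and bytes); the sample records and addresses are made up for the example:

```python
from collections import Counter

def top_talkers(flow_log_lines, top_n=3):
    """Aggregate bytes by source address from default-format (v2) VPC Flow Log records."""
    bytes_by_src = Counter()
    for line in flow_log_lines:
        fields = line.split()
        if len(fields) < 14 or fields[0] == "version":
            continue  # skip header lines and malformed records
        srcaddr, byte_count = fields[3], fields[9]
        if byte_count.isdigit():
            bytes_by_src[srcaddr] += int(byte_count)
    return bytes_by_src.most_common(top_n)

records = [
    "2 123456789012 eni-0a1b2c3d 10.0.1.5 203.0.113.9 44321 443 6 900 72000000 1692000000 1692000600 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.2.7 198.51.100.4 51515 443 6 10 8000 1692000000 1692000600 ACCEPT OK",
]
print(top_talkers(records))
```

With flow logs present, a query like this turns an hours-long manual hunt into a single aggregation that points directly at the loudest internal host.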
Lesson learned
Enable VPC Flow Logs for all production VPCs and critical subnets. They are essential for tracing network activity and drastically reducing investigation time.
Scenario 2: S3 data exfiltration: Which files were stolen?
A GuardDuty alert flags suspicious S3 activity from a malicious IP address. Investigation reveals that IAM credentials were compromised. CloudTrail management events show activity against a critical S3 bucket, and S3 data events are enabled, but S3 Server Access Logs were never configured.
Investigation breakdown
Step 1: Initial detection
- Question: “Why is an IAM user accessing S3 from a malicious IP?”
- Log/Data: GuardDuty Findings
- Signal: Alert Exfiltration:S3/MaliciousIPCaller is generated.
Step 2: Identify compromised principal
- Log/Data: CloudTrail Management Events
- Signal: CloudTrail confirms bucket activity but does not reveal which objects were downloaded.
Step 3: Attempt object-level visibility
- Log/Data: CloudTrail S3 Data Events
- Signal: Data events show API calls like GetObject, but attribution is incomplete.
Step 4: The practical forensic gap
- Log/Data: S3 Server Access Logs
- Signal: LOGS MISSING. No detailed object access information is available.
Missing log highlight
CloudTrail captures API activity but lacks the detailed context needed for real-world forensics. S3 Server Access Logs provide crucial evidence such as:
- Requester IP address
- Object key accessed
- User agent
- Amount of data transferred
Without them, precise breach scope cannot be determined.
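To show what that evidence looks like in practice, here is a simplified Python sketch that extracts the requester IP, object key, and bytes sent from an S3 server access log entry. The regex covers only the leading fields of the documented format (real entries carry many more trailing fields), and the sample line is fabricated:

```python
import re

# Simplified pattern for the leading fields of an S3 server access log entry.
LOG_PATTERN = re.compile(
    r'^(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<remote_ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+) '
    r'"(?P<request_uri>[^"]*)" (?P<status>\S+) (?P<error>\S+) (?P<bytes_sent>\S+)'
)

def parse_entry(line):
    """Return the forensic fields of one access log entry, or None if unparseable."""
    m = LOG_PATTERN.match(line)
    if not m:
        return None
    d = m.groupdict()
    d["bytes_sent"] = int(d["bytes_sent"]) if d["bytes_sent"].isdigit() else 0
    return d

entry = parse_entry(
    'ownerid examplebucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 '
    'arn:aws:iam::123456789012:user/alice 3E57427F3EXAMPLE REST.GET.OBJECT customers.csv '
    '"GET /examplebucket/customers.csv HTTP/1.1" 200 - 4406583'
)
print(entry["remote_ip"], entry["key"], entry["bytes_sent"])
```

Summing `bytes_sent` per object key across such entries is exactly how an analyst would scope which files left the bucket, and how much data each transfer moved.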
Lesson learned
Treat S3 Server Access Logs as mandatory for all sensitive buckets, using CloudTrail data events only as a complementary source.
Scenario 3: Unmasking a rogue pod with EKS Audit Logs
An attacker, having gained initial access, moves laterally toward an EKS environment. The security team is alerted to suspicious outbound traffic from a pod but finds themselves in a difficult position. They have access to VPC Flow Logs, which show the anomalous network connections, and CloudTrail management events, which track AWS API activity. However, they soon discover a critical blind spot: EKS Kubernetes audit logs were never enabled. This missing log source prevents them from attributing control-plane actions, such as who created the rogue pod and how it was introduced into the cluster.
Investigation breakdown
Step 1: Network anomaly detected
- Question: “What is this suspicious outbound traffic?”
- Log/Data: VPC Flow Logs
- Where: AWS Console → CloudWatch → Log Groups
- Signal: High-volume, periodic outbound traffic from a pod-related ENI to a known malicious C2 IP address.
Step 2: ENI to node mapping
- Question: “What owns this network interface?”
- Log/Data: EC2 Network Interface Information
- Where: AWS Console → EC2 → Network Interfaces
- Signal: The ENI is identified as being attached to a specific EKS worker node.
Step 3: Node to pod mapping
- Question: “Which pod on this node is likely responsible for the traffic?”
- Log/Data: Kubernetes API / CloudWatch Container Insights
- Where: CLI: kubectl get pods or CloudWatch Container Insights
- Signal: An unfamiliar pod named rogue-cryptominer-pod is discovered running on the identified worker node. Attribution to the exact pod is possible in many environments, but can be challenging depending on network configuration and CNI setup.
Step 4: The control plane blind spot
- Question: “Who created this rogue pod and through what action?”
- Log/Data: EKS Audit Logs (Missing)
- Where: CloudWatch Log Groups for EKS control plane logs (not enabled)
- Signal: No audit log data is available. There is no record of the Kubernetes API call such as verb=create for the rogue-cryptominer-pod, leaving the originating Kubernetes identity and request context completely unknown.
Missing log highlight
The absence of EKS audit logs was the critical failure in this investigation. Had they been enabled, the team would have had records of Kubernetes API server activity, including fields such as:
- verb: create
- objectRef: name of the pod
- user.username: the Kubernetes user or service account that initiated the request
- source IP and request metadata
These logs would have identified which Kubernetes identity created the pod and when. While audit logs do not always map directly to an AWS IAM user, they provide the essential control-plane visibility required to trace activity back to a service account, automation system, or compromised cluster credential.
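EKS delivers audit events as JSON documents in CloudWatch Logs, so attribution is a matter of pulling a few fields out of each event. Here is a minimal Python sketch using the standard Kubernetes audit event shape; the sample event and service account name are invented for illustration:

```python
import json

def summarize_audit_event(raw):
    """Extract the attribution fields from a Kubernetes audit log event (JSON)."""
    event = json.loads(raw)
    obj = event.get("objectRef", {})
    return {
        "verb": event.get("verb"),
        "resource": obj.get("resource"),
        "name": obj.get("name"),
        "user": event.get("user", {}).get("username"),
        "source_ips": event.get("sourceIPs", []),
    }

sample = json.dumps({
    "kind": "Event",
    "verb": "create",
    "objectRef": {"resource": "pods", "namespace": "default", "name": "rogue-cryptominer-pod"},
    "user": {"username": "system:serviceaccount:default:ci-deployer"},
    "sourceIPs": ["10.0.3.44"],
})
print(summarize_audit_event(sample))
```

Had audit logging been on, one filtered query for `verb=create` against the rogue pod's name would have yielded the responsible identity in seconds.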
Lesson learned
Always enable EKS API server and audit logging for production clusters. These logs are indispensable for understanding who performed actions in the Kubernetes control plane and for reconstructing malicious activity during incident response.
Scenario 4: Tracing serverless abuse with Lambda Invoke Events
The security team receives an alert from an external partner reporting suspicious activity originating from one of the company’s public APIs. The partner claims they are seeing automated requests performing unexpected actions at high speed. Internal monitoring confirms that a specific Lambda function is executing far more frequently than normal.
CloudWatch Logs show unusual behavior inside the function, and API Gateway access logs explain some of the activity. However, the team soon realizes a critical gap: CloudTrail data events for Lambda Invoke were never enabled. As a result, they have no visibility into whether the function is being invoked directly through the Lambda API rather than through its intended triggers.
Investigation breakdown
Step 1: Anomalous activity detected
- Question: “Why is this Lambda function executing so frequently?”
- Log/Data: CloudWatch Metrics for Lambda
- Where: AWS Console → CloudWatch → Metrics → Lambda
- Signal: A sudden and sustained spike in invocation count and duration for a function that normally runs only a few times per minute.
Step 2: Function runtime analysis
- Question: “What is the function doing during these executions?”
- Log/Data: CloudWatch Logs (Lambda Function Output)
- Where: AWS Console → CloudWatch → Log Groups
- Signal: Runtime logs show the function making unexpected outbound connections and processing requests that do not match normal application behavior.
Step 3: Trigger analysis
- Question: “Which known services are triggering this function?”
- Log/Data: API Gateway Access Logs and other configured trigger logs
- Where: API Gateway → Access Logs / CloudWatch
- Signal: API Gateway logs explain some invocations, but the volume of requests does not match the total Lambda execution count. A large portion of invocations appear to be coming from an unknown source.
Step 4: The invocation blind spot
- Question: “Is the function being invoked directly through the Lambda API?”
- Log/Data: CloudTrail Data Events for Lambda (Missing)
- Where: CloudTrail Trail destination (S3 or CloudWatch Logs)
- Signal: No visibility into direct lambda:InvokeFunction API calls. The team cannot determine whether an IAM principal, compromised credentials, automation script, or an external actor is invoking the function outside of normal application pathways.
Missing log highlight
CloudTrail is enabled by default in AWS accounts, but only management events are captured automatically. Lambda invocations (InvokeFunction) are data-plane events, which are not logged unless CloudTrail data events are explicitly configured.
Because Lambda Invoke data events were not enabled, the team lacked the only native AWS mechanism for identifying direct API invocations of the function, including:
- The eventName: Invoke action
- The source IP address of the caller
- The IAM principal or service making the call
- The exact time and frequency of direct invocations
It is important to note that CloudTrail Lambda data events do not capture every possible execution. Invocations triggered by services such as API Gateway, SQS, EventBridge, or SNS rely on their own trigger-layer logs for attribution. However, data events are essential for detecting unexpected or unauthorized direct invocations that bypass those services.
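With data events enabled, isolating direct invocations is a straightforward filter over the delivered records. The following Python sketch uses the standard CloudTrail record fields (`eventSource`, `eventName`, `userIdentity`, `sourceIPAddress`); the sample records and ARNs are fabricated:

```python
def direct_invokes(records):
    """Filter CloudTrail records down to direct lambda:InvokeFunction calls
    and extract who made them, from where, and when."""
    hits = []
    for r in records:
        if r.get("eventSource") == "lambda.amazonaws.com" and r.get("eventName") == "Invoke":
            hits.append({
                "caller": r.get("userIdentity", {}).get("arn"),
                "source_ip": r.get("sourceIPAddress"),
                "time": r.get("eventTime"),
            })
    return hits

trail = [
    {"eventSource": "lambda.amazonaws.com", "eventName": "Invoke",
     "eventTime": "2024-05-01T03:12:44Z", "sourceIPAddress": "198.51.100.77",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/compromised-ci"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
]
print(direct_invokes(trail))
```

Comparing the count of such records against the trigger-layer logs is what closes the gap the team hit in Step 3: any invocations not explained by API Gateway must show up here as direct API calls.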
Lesson learned
Enable CloudTrail data events for critical or publicly exposed Lambda functions. Default CloudTrail logging does not include invocation activity. Trigger-specific logs (API Gateway, ALB, SQS, EventBridge) are necessary for normal attribution, but only CloudTrail Lambda Invoke data events can reveal direct API abuse. Relying solely on runtime logs and trigger logs leaves a dangerous blind spot.
Scenario 5: The compromised instance with no host-level footprints
A security operations center receives a high-severity alert from AWS GuardDuty for ‘UnauthorizedAccess:EC2/SSHBruteForce’ targeting a production EC2 instance. Shortly after, VPC Flow Logs show suspicious outbound traffic from the same instance to an unknown IP address over a non-standard port. The incident response team has access to GuardDuty findings, VPC Flow Logs, and CloudTrail management events. However, they quickly discover that the CloudWatch Agent was not configured to collect operating system (OS) or application logs from the instance, leaving them with almost no visibility into what occurred on the host after the suspected compromise.
Investigation breakdown
Step 1: Initial detection
- Question: “What triggered the investigation?”
- Log/Data: AWS GuardDuty Finding
- Where: AWS Console → GuardDuty → Findings
- Signal: GuardDuty finding UnauthorizedAccess:EC2/SSHBruteForce identifies multiple failed SSH login attempts from a malicious IP address.
Step 2: Network activity corroboration
- Question: “Can we see the suspicious network traffic?”
- Log/Data: VPC Flow Logs
- Where: CloudWatch Log Groups (Flow Log destination)
- Signal: Flow logs confirm a high volume of REJECT records for TCP port 22, followed by an ACCEPT record, and then new outbound ACCEPT records to a suspicious external IP.
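The REJECT-then-ACCEPT pattern on port 22 can be detected mechanically. Here is a hedged Python sketch operating on `(srcaddr, dstport, action)` tuples, assumed to be pre-extracted from flow log records; the threshold and sample addresses are illustrative only:

```python
from collections import defaultdict

def brute_force_sources(records, reject_threshold=10):
    """Flag sources with many REJECTed SSH attempts followed by an ACCEPT.
    records: (srcaddr, dstport, action) tuples distilled from flow log fields."""
    rejects = defaultdict(int)
    accepted = set()
    for srcaddr, dstport, action in records:
        if dstport != 22:
            continue
        if action == "REJECT":
            rejects[srcaddr] += 1
        elif action == "ACCEPT":
            accepted.add(srcaddr)
    return [ip for ip, n in rejects.items() if n >= reject_threshold and ip in accepted]

events = [("203.0.113.50", 22, "REJECT")] * 12 + [
    ("203.0.113.50", 22, "ACCEPT"),  # the brute force eventually succeeds
    ("198.51.100.9", 22, "REJECT"),  # a one-off probe, below threshold
]
print(brute_force_sources(events))
```

A flagged source whose rejects culminate in an ACCEPT is precisely the signal that the brute force likely succeeded, which is what escalates this finding from noise to incident.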
Step 3: Host-level activity investigation (the blind spot)
- Question: “What did the attacker do after gaining access to the instance?”
- Log/Data: OS Logs (e.g., Linux /var/log/auth.log, /var/log/syslog)
- Where: CloudWatch Log Groups (if CloudWatch Agent is configured)
- Signal: No data available. The investigation hits a dead end. The team has no visibility into the successful login event, the user account used, privilege escalation activity, service changes, or other system actions that followed the intrusion.
Missing log highlight
The absence of OS and application logs collected by the CloudWatch Agent created a major investigative blind spot. Without centralized host logs, the team could not determine:
- Which Linux user account successfully authenticated
- Whether sudo or other privilege escalation occurred
- What services were modified or started
- Whether new user accounts or SSH keys were added
- Basic system activity following the intrusion
It is important to note that the CloudWatch Agent primarily collects and centralizes existing logs. It does not generate detailed process execution or command history telemetry on its own. Deeper visibility into commands run and processes spawned would require additional tooling such as auditd, EDR solutions, osquery, or AWS Systems Manager Session Manager logging. However, even standard OS logs would have provided critical foundational evidence that was completely absent in this case.
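To make the missing evidence concrete, here is a minimal Python sketch of the kind of query the team could not run: extracting successful SSH logins from sshd entries in a centralized /var/log/auth.log stream. The log lines are fabricated samples in the standard sshd format:

```python
import re

# sshd success lines look like:
#   "Accepted password for <user> from <ip> port <port> ssh2"
ACCEPTED = re.compile(
    r"Accepted (?P<method>\S+) for (?P<user>\S+) from (?P<ip>\S+) port (?P<port>\d+)"
)

def successful_logins(auth_lines):
    """Extract successful SSH logins from sshd entries in auth.log."""
    return [m.groupdict() for line in auth_lines if (m := ACCEPTED.search(line))]

lines = [
    "May  1 03:14:02 ip-10-0-1-5 sshd[1187]: Failed password for root from 203.0.113.50 port 40022 ssh2",
    "May  1 03:14:09 ip-10-0-1-5 sshd[1187]: Accepted password for ubuntu from 203.0.113.50 port 40031 ssh2",
]
print(successful_logins(lines))
```

One such match would have answered the team's first dead-end question immediately: which account the attacker authenticated as, from which IP, and when.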
Lesson learned
Deploy the unified CloudWatch Agent to all production EC2 instances to collect essential OS and application logs (e.g., Linux auth.log and syslog, Windows Security logs). Centralizing these logs in CloudWatch provides the minimum host-level telemetry required for incident response. For advanced investigations, complement this with process-level monitoring solutions such as auditd, EDR tools, or SSM Session Manager logging to capture detailed attacker activity.
Scenario 6: Detecting C2 channels with Route 53 Resolver Query Logs
During a routine review of network traffic, security analysts notice an unusual pattern: small, periodic bursts of UDP traffic on port 53 egressing from a production VPC. The team has access to VPC Flow Logs, which confirm the source instance and destination IPs. However, they are missing a critical piece of the puzzle: Route 53 Resolver query logs were never enabled for the VPC. This means that while they can see DNS traffic is occurring, they have no visibility into what domains are being queried, leaving them blind to a potential DNS-based command-and-control (C2) channel.
Investigation breakdown
Step 1: Network anomaly detected
- Question: “What is causing the periodic outbound UDP/53 traffic?”
- Log/Data: VPC Flow Logs
- Where: CloudWatch Log Groups (Flow Log destination)
- Signal: Logs show an instance sending UDP packets to multiple external IPs on port 53 at regular intervals, behavior consistent with possible C2 beaconing.
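Beaconing regularity itself can be scored from flow log timestamps alone. The sketch below is a simplified heuristic, not a production detector: it flags a source whose inter-packet gaps are suspiciously uniform, with an illustrative jitter threshold:

```python
from statistics import mean, pstdev

def looks_like_beaconing(timestamps, max_jitter_ratio=0.1):
    """Heuristic: flag traffic whose inter-arrival times are suspiciously regular.
    timestamps: sorted epoch seconds of outbound packets from one source."""
    if len(timestamps) < 4:
        return False  # too few samples to judge periodicity
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    avg = mean(gaps)
    return avg > 0 and pstdev(gaps) / avg < max_jitter_ratio

# Packets roughly every 60 seconds with tiny jitter: consistent with C2 beaconing.
print(looks_like_beaconing([0, 60, 121, 180, 241, 300]))
```

Normal DNS traffic is bursty and user-driven, so a near-zero coefficient of variation across gaps is a useful first-pass discriminator before deeper analysis.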
Step 2: Source instance identification
- Question: “Which resource is generating this DNS traffic?”
- Log/Data: EC2 Network Interface (ENI) Information
- Where: AWS Console → EC2 → Network Interfaces
- Signal: The source IP is mapped to a specific ENI attached to an EC2 instance.
Step 3: DNS query analysis (the blind spot)
- Question: “What domains is the compromised instance querying?”
- Log/Data: Route 53 Resolver Query Logs
- Where: AWS Console → Route 53 → Resolver → Query Logging
- Signal: No data available. Because Resolver query logging was not enabled for the VPC, the team cannot see the actual DNS queries. They are unable to determine whether the instance is querying legitimate domains, known malicious domains, or algorithmically generated domains (DGAs).
Missing log highlight
The absence of Route 53 Resolver query logs created a critical visibility gap, obscuring the adversary’s DNS infrastructure and severely slowing down containment. The team could identify which instance was generating suspicious DNS traffic, but not the domains being contacted. Without this data, they also could not analyze for techniques such as DNS tunneling, where attackers embed exfiltrated data within DNS queries themselves.
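Had query logs existed, one common first pass is to score queried names for randomness, since DGA and tunneling labels tend to have much higher character entropy than human-chosen hostnames. A minimal Python sketch, with an illustrative (not tuned) threshold and made-up domains:

```python
import math
from collections import Counter

def label_entropy(domain):
    """Shannon entropy (bits/char) of the leftmost DNS label."""
    label = domain.split(".")[0]
    counts = Counter(label)
    total = len(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def suspicious_domains(queries, threshold=3.5):
    # Threshold is an illustrative starting point, not a tuned detection value.
    return [q for q in queries if label_entropy(q) > threshold]

queried = ["mail.example.com", "x9f3kq7zt2w8p1vb6.badnet.example", "intranet.example.com"]
print(suspicious_domains(queried))
```

Entropy alone produces false positives (CDN hostnames, for example), so in practice this is combined with query volume, label length, and threat intelligence lookups, all of which require the Resolver query logs that were missing here.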
Lesson learned
Enable Route 53 Resolver query logs on all production VPCs and centralize them for analysis. Integrating these logs with a SIEM and Route 53 Resolver DNS Firewall provides powerful capabilities to detect and block advanced threats such as DGA-based command-and-control and DNS tunneling.
Conclusion
Across every scenario, the pattern is the same: investigations stall when logs are missing.
Effective AWS forensics depends on answering a few core questions:
- Who did what? – CloudTrail
- Who talked to whom? – VPC Flow Logs
- Who queried what? – Resolver Query Logs
- Who accessed which objects? – S3 Server Access Logs
- What happened on the host? – OS Logs
- What happened in the cluster? – EKS Audit Logs
Don’t wait for a real incident to discover your blind spots. Enable comprehensive logging now, centralize it securely, and ensure immutability using tools such as S3 Object Lock.
A proactive logging strategy is one of the highest-impact investments you can make in cloud security. Your future incident response team will be grateful you did.