On October 20, 2025, Amazon Web Services (AWS) suffered a major outage that began around 3 a.m. ET and lasted roughly 15 hours, until it was resolved at about 6 p.m. ET. The disruption originated in US-EAST-1 in northern Virginia, AWS’s oldest and largest data center cluster. Although the fault was confined to that one region, it affected services worldwide.
Root Causes
The primary trigger was a failure in the Domain Name System (DNS), which acts as the internet’s “phonebook” by translating domain names into IP addresses. The failure prevented applications from finding the correct address for AWS’s DynamoDB API, a key cloud database service that stores user data and other essentials. Cybersecurity expert Ian Lin explained that without DNS, “services kind of just can’t talk to each other.” The underlying cause could in principle be a physical problem (e.g., hardware damage), a software glitch, or malware, though nothing indicated a cyberattack. Professor Shion Guha noted the complexity of recovery: fixing one issue requires checking the interconnected products around it to avoid cascading failures. The outage is at least the third major US-EAST-1-related meltdown in five years.
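To make the “phonebook” lookup concrete, here is a minimal Python sketch of what an application does before it can reach a service, and what it sees when DNS fails. The helper name `resolve` and its injectable `lookup` parameter are assumptions made for illustration; they are not part of any AWS tooling, and this is not a reconstruction of the actual fault.

```python
import socket


def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the sorted unique IP addresses for hostname, or [] if DNS fails.

    `lookup` defaults to the standard library resolver; it is a parameter only
    so the failure path can be demonstrated without a real outage (an
    illustrative assumption, not an AWS API).
    """
    try:
        results = lookup(hostname, 443)
    except socket.gaierror:
        # The name could not be translated into an address, so the caller
        # has nothing to connect to even if the service itself is healthy.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address.
    return sorted({info[4][0] for info in results})
```

Calling `resolve("dynamodb.us-east-1.amazonaws.com")` on a working network would return one or more IP addresses. An empty list corresponds to the situation described above, where applications could not find the address of the DynamoDB API.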
Global Impacts
The outage rippled globally, disrupting apps such as Snapchat, Reddit, Fortnite, Roblox, Netflix, Disney+, Zoom, and Venmo, as well as Amazon’s own services. In Canada, it affected phone bill payments, driver’s license renewals, and more, because many telecoms and governments rely on AWS. Downdetector logged over 13 million reports worldwide, including 351,000 from Canada. Expert Davi Ottenheimer called the outage “unique” for impacting so many major services.
Why Cloud Services Are Vulnerable
AWS dominates the cloud market with a 41% share, concentrating critical infrastructure for businesses and governments in a single provider. This creates a “critical point of failure,” as Guha warned, and leads to networked effects in which one company’s downtime cascades across ecosystems. The vulnerabilities stem from system complexity, over-reliance on a single provider, and the concentration of risk in key regions like US-EAST-1. Protocols exist for quick recovery, but events like this one show how even robust clouds can falter because of interconnected dependencies, and they make the case for diversification and resilience planning.
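One common form of the resilience planning mentioned above is regional failover: try a primary region and fall back to another when it is unreachable. The Python sketch below illustrates the idea only. The function `call_with_failover`, the exception `AllRegionsFailed`, and the `request` callable are hypothetical names, and a real deployment would also need its data replicated to the fallback region.

```python
from typing import Callable, Dict, Sequence


class AllRegionsFailed(Exception):
    """Raised when every region in the list was tried and none responded."""


def call_with_failover(regions: Sequence[str],
                       request: Callable[[str], str]) -> str:
    """Try `request(region)` for each region in order; return the first success.

    `request` is a caller-supplied function (hypothetical here) that contacts
    the service in the given region and raises OSError on DNS or connection
    failure.
    """
    errors: Dict[str, Exception] = {}
    for region in regions:
        try:
            return request(region)
        except OSError as exc:
            # socket.gaierror and ConnectionError are both OSError subclasses,
            # so DNS failures and refused connections are handled alike.
            errors[region] = exc
    raise AllRegionsFailed(errors)
```

With a region list such as `["us-east-1", "us-west-2"]`, a failure in the first region moves the request to the second instead of surfacing as an outage. This reduces dependence on a single region but not on a single provider.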

