Page Delivery And Publishing Issues Observed

Incident Report for Edge Delivery Services for Adobe Experience Manager

Postmortem

At 6:49 PM UTC we observed an issue that affected page delivery, authoring services and code deployments due to an outage with one of our cloud providers.

At 7:14 PM UTC, we switched delivery to another provider which resolved the delivery errors immediately. The franklin authoring services were still impacted at this time. Between 6:50pm UTC and 7:14pm UTC we experienced a slightly higher error rate (0.83%)

At 8:42 PM UTC, full service was recovered and publishing and code deployments could resume.

At 9:20 PM UTC the affected services from the provider experiencing the outage marked the incident as resolved.

At 9:34 PM UTC we switched the delivery stack back to the configuration before the incident occurred.

During the first 25 minutes of the outage, we experienced an elevated error rate of 0.83 percent, so no noticeable downtime was incurred on any customer site. The switch to another delivery infrastructure allowed AEM to remain fully operational, even while major parts of the internet were affected by the wider AWS outage.

To improve our quality of service in the future, following steps will be taken:

Improve monitoring of primary and secondary delivery infrastructure
Improve status messages and communication on status.hlx.page
Establish procedures for faster switchover from primary to secondary infrastructure

Posted Jun 14, 2023 - 21:48 UTC

Resolved

This incident has been resolved.

Posted Jun 13, 2023 - 20:53 UTC

Investigating

We are observing issues that is affecting our delivery services for AEM customers. The issue is under active investigation and we are working with full effort to reach a speedy resolution.

Posted Jun 13, 2023 - 18:54 UTC