Major Partial Outage – All Regions - FTP, SFTP, WebDAV, and the legacy ExaVault API
Incident Report for Files.com
Postmortem

From 3:08 AM PDT through 4:09 AM PDT, Files.com customers experienced elevated error rates when connecting via certain protocols. These included SFTP, FTP, and WebDAV, as well as our support for the legacy ExaVault API (which applies to only a small number of customers).

Although this incident may seem similar to the incident that occurred on September 5th, the root causes are unrelated.

The elevated error rates during this period were caused by the expiration of an internal SSL certificate, which disrupted internal communication between certain servers at Files.com.

Files.com has a sophisticated system for certificate management and this sort of failure is embarrassing and unacceptable.

Files.com constantly and automatically checks the certificate of every service we operate.  However, the internal service impacted by this issue was inadvertently left out of our Service Catalog.  We have reviewed and updated our procedures to ensure this does not happen in the future, and performed an audit to ensure there are no other unregistered services.

Files.com uses a service called Consul Template to ensure certificates are up to date. We identified a bug in which this particular certificate was not correctly configured for updates. We have initiated a project to standardize and improve the handling of certificates to prevent this issue in the future.

The root cause of this issue was that Files.com’s configuration management did not update a certificate on an internal service in a timely manner.  A secondary cause was a failure to properly monitor that service, which prevented us from detecting the expiring certificate in advance.

We promise a system that works perfectly, all of the time, and today we failed to deliver that to you. Our entire engineering team is working hard to prevent issues like this one from occurring in the future. If you need additional assistance or continue to experience issues, please contact our Customer Support team.

Posted Sep 10, 2024 - 15:35 PDT

Resolved
We have resolved a major partial outage of SFTP, FTP, WebDAV, and the legacy ExaVault API in all regions.

This outage only affected SFTP, FTP, WebDAV, and the minimally implemented ExaVault API. Other services were not impacted.

This incident occurred between 3:08 AM and 4:09 AM Pacific Time.

We are compiling a Root Cause Analysis that we will post here.
Posted Sep 10, 2024 - 04:19 PDT
Identified
We have identified the issue that is causing a major partial outage of SFTP, FTP, and WebDAV in all regions, and we are working to resolve it.
Posted Sep 10, 2024 - 04:11 PDT
Investigating
We are investigating a major partial outage of the Files.com service in all regions.
Posted Sep 10, 2024 - 04:00 PDT
This incident affected: SFTP, USA Region, Canada Region, Australia Region, EU (Germany) Region, UK Region, Japan Region, and Singapore Region.