Degraded performance affecting the tado° app

Incident Report for tado GmbH

Postmortem

Yesterday, between approximately 11:30 and 13:00 CEST, tado° backend systems experienced a major outage. As a result, the tado° app did not respond. Manual control on the devices themselves was not affected and continued to work for our customers.

At 11:30, a scheduled change was made to our main database to allow users to toggle one type of push notification in the app settings. Routine changes of this kind usually leave the database unresponsive for only a few seconds, which has no visible impact on our customers' app performance. For this specific change, the database stayed unresponsive beyond the planned time frame, which turned the routine change into an incident.

As soon as we noticed that the database remained unresponsive, our server development team began working to resolve the problem. Resolution took longer than expected mainly because restarting the database caused side effects that prevented our servers from running in a healthy state.

We want you to know that we take the performance and reliability of tado° seriously. We are currently taking steps to prevent this specific issue from happening again, and we are looking into ways to better communicate future scheduled database changes. We’re always striving to optimise the way we communicate with our customers, and we appreciate all the feedback you gave us during this incident.

You can find more information about how you can control your heating system on the tado° thermostats themselves in this article: https://support.tado.com/hc/en-gb/articles/207704943

We are very sorry for any inconvenience this may have caused.

Your tado° Team

Posted Oct 15, 2020 - 13:17 CEST

Resolved

This incident has been resolved.
Posted Oct 14, 2020 - 13:07 CEST

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Oct 14, 2020 - 13:02 CEST

Update

We are continuing to work on a fix for this issue.
Posted Oct 14, 2020 - 11:49 CEST

Identified

We have identified an incident affecting the performance of our app. Please subscribe to our status page for further updates and information on possible self-help.
Posted Oct 14, 2020 - 11:40 CEST
This incident affected: Smart Device Control.