Are you being served? Ensuring service reliability in trying times

By Stuart Templeton, Head of UK at Slack 

The world's workforce is more reliant on tech than ever before. Gartner notes that 69% of corporate boards are accelerating their digital transformation plans in the wake of the pandemic. This is good news for individuals, teams and entire organisations. 

However, behind every tool there's a hard-working team of engineers. Not only have these tech all-stars been navigating the same disruptions to work as the rest of us, they've been doing so whilst rapidly scaling their solutions to meet unprecedented demand. Whether it's collaboration platforms, video conferencing or e-commerce platforms, developers have been in high demand.

With that demand, though, comes added pressure. With greater use, the cost of downtime rises, and any issue can cause customers to second-guess their investments.

What can those valuable tech teams learn from the last year to boost service reliability under increased demand?

Understanding the impact of downtime 

Every second a service is down, the bill goes up. Estimates suggest that, for the average business, downtime costs £4,400 per minute, a sum no organisation can afford to burn through regularly. For larger companies, the hit is even more eye-watering: just look at Facebook, which lost an estimated £75 million due to a 14-hour outage.
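To put the per-minute figure in perspective, here is a minimal back-of-the-envelope sketch in Python. It assumes cost scales linearly with outage length at the £4,400-per-minute average estimate quoted above; the function name and the sample durations are illustrative assumptions, and real costs will vary widely from one business to the next.

```python
# Rough downtime cost estimate, assuming a flat per-minute rate.
# The £4,400 figure is the average estimate quoted in the article;
# the function name and sample durations are illustrative assumptions.
AVERAGE_COST_PER_MINUTE_GBP = 4_400


def estimate_downtime_cost(minutes, cost_per_minute=AVERAGE_COST_PER_MINUTE_GBP):
    """Return the estimated cost in pounds of an outage lasting `minutes`."""
    if minutes < 0:
        raise ValueError("outage duration cannot be negative")
    return minutes * cost_per_minute


for label, minutes in [("45-minute outage", 45), ("14-hour outage", 14 * 60)]:
    print(f"{label}: £{estimate_downtime_cost(minutes):,}")
```

At the average rate, a 45-minute outage comes to roughly £198,000 and a 14-hour outage to about £3.7 million.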

Large enterprises also face reputational challenges from downtime. Google services were down for 45 minutes in December 2020, and the news was covered in the pages of hundreds of publications worldwide. 

The stakes are high. Yet downtime is a fact of life when working with technology. And, with workers spread across different locations, including their own homes, mastering incident responses can be challenging. 

But it doesn't need to be. With the right tools, processes and mindsets, the challenges downtime creates can be minimised.  

Think agile and nimble 

With every minute critical (and costly), agility is essential for dealing with service issues. Teams need to identify issues rapidly, collaborate seamlessly and share findings in case an issue recurs. This agility at the back end also helps organisations stay close to customers and pivot quickly to serve them, a key driver of success for any business moving forward.

That doesn't mean, however, that every company needs to be a super-streamlined start-up. Even the largest and most established enterprises can become much more agile than they think. The key is optimising for faster decision-making, empowering teams to respond rapidly without waiting for unnecessary sign-offs, and keeping the whole company aligned on mission strategy.

What does that mean when it comes to service issues in particular? Giving tech teams what they need to build incident-ready workflows. 

Workflows that can handle strain 

You've seen it in movies. A red light is flashing, the alarm is going off, something's seriously wrong and people are looking stressed (and a little sweaty). 

Real life is not quite as dramatic, but when services go down, the pressure dials up. In these fast-moving situations, even those with the best intentions are liable to make mistakes. Throw in the communication barriers that can arise when people are working remotely, particularly if they are relying on email, and things get even tougher.

Moving quickly, though, only works if communication is effective. For tech teams in the hot seat, a collaboration platform that connects all their conversations, real-time integrations with key software, and relevant docs is essential. This enables them to bring all the critical context they need into one space where they can problem-solve and take action as an aligned team. It also cuts time lost switching between apps and searching for information. And, when it comes to writing up a report or handing the issue over to another global office, everything is logged transparently in one space.

Non-human teammates also have a role to play in managing tech challenges. Take Starling Bank, for example, which took an inventive approach to diagnosing and dealing with issues. Starbot, a custom-built Slack bot created in-house, is used to assign less senior developers temporary privileges to deploy releases when needed. This frees up hours of time for senior developers, while also keeping track of all the escalations so that they can stay on top of audit requirements.
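The general idea of time-limited privileges with an audit trail can be sketched in a few lines of Python. This is a hypothetical illustration only, not Starbot's actual code: the class, the method names and the notion of a named approver are all assumptions, and no Slack integration is shown.

```python
# Hypothetical sketch of time-limited deploy privileges with an audit trail.
# This is not Starbot's actual code; every name here is an assumption.
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class PrivilegeManager:
    grants: dict = field(default_factory=dict)     # developer -> expiry time
    audit_log: list = field(default_factory=list)  # record of every escalation

    def grant(self, developer, approver, minutes, now=None):
        """Give `developer` deploy rights for a limited number of minutes."""
        now = now or datetime.now(timezone.utc)
        expires = now + timedelta(minutes=minutes)
        self.grants[developer] = expires
        self.audit_log.append(
            {"developer": developer, "approver": approver,
             "granted_at": now, "expires_at": expires}
        )
        return expires

    def can_deploy(self, developer, now=None):
        """True only while the developer's temporary grant is still valid."""
        now = now or datetime.now(timezone.utc)
        expires = self.grants.get(developer)
        return expires is not None and now < expires


manager = PrivilegeManager()
start = datetime(2021, 1, 1, 9, 0, tzinfo=timezone.utc)
manager.grant("junior_dev", approver="senior_dev", minutes=30, now=start)
print(manager.can_deploy("junior_dev", now=start + timedelta(minutes=10)))
print(manager.can_deploy("junior_dev", now=start + timedelta(minutes=45)))
print(len(manager.audit_log))
```

Because every grant carries an expiry and is appended to the log, access lapses on its own and each escalation leaves a record that can be reviewed later.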

Ensuring reliability in the year ahead 

Every department has experienced an acceleration in the transformation of work over the past year. Ironically, however, it's the tech teams, those who already best understand the importance of digital tools, that have faced some of the greatest pressures.

Empowering those teams with technology that enables them to move rapidly, bring all the information they need into one place and take action is crucial for every business moving forward. By doing so, costly downtime can be cut, and customers can be served more effectively.