Why we can't afford Optus outages


We can’t afford to have a single point of failure in our telecoms system.

The recent Optus outage, which affected 10 million customers, cannot be considered a "rare occasion". Over the last few years, we have witnessed several major outages across telecom networks, making it imperative for us to prepare ourselves for such events.

Today, over 99 per cent of telecoms traffic comprises data. Virtually every organisation and nearly all Australians rely on data services through their phones and fixed-line connections. As we've observed, an outage of this magnitude can cause significant disruptions in the economy and people's private lives. In this case, even the 000 emergency service on landlines was disconnected.

These outages are of national interest and thus we require national solutions to mitigate the considerable fallout from such events.

What occurred at Optus was likely a software problem. Such issues occur fairly frequently, but most systems recover in seconds or minutes, resulting in minimal disruption. In some cases, however, as appears to have happened this time, a critical fault during a software update can cascade through the computer systems that underpin the network's operation. Unravelling the fault, fixing it and bringing all these different systems back online can take hours, and sometimes even days. Moreover, the systems are unlikely to all come back online simultaneously; they need to be restarted one by one, further extending the recovery time.

In the end, this is an infrastructure problem.

There are essentially two long-term solutions for such events.

The first one pertains to the individual networks of the operators. It is unacceptable for a single point of failure in a network to bring down an entire country or, as seen before, the entire East Coast. With over 100 years of telecom experience and a wealth of engineering knowledge and skills, networks can be designed to eliminate single points of failure. In the event of a disruption, traffic should be rerouted through other network systems. In other words, there should be duplicated, unconnected systems where one can take over from the other in emergencies.
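
As a rough illustration of that failover idea, here is a minimal Python sketch. The names `Path`, `choose_path`, `core-a` and `core-b` are invented for this example and do not describe any operator's actual systems; real networks make this decision in routing protocols and dedicated equipment rather than in a few lines of application code.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One independent route through the network (hypothetical model)."""
    name: str
    healthy: bool


def choose_path(paths):
    """Return the first healthy path in priority order, or None if all are down."""
    for path in paths:
        if path.healthy:
            return path
    return None


# Two duplicated systems that share no components, listed in priority order.
primary = Path("core-a", healthy=True)
standby = Path("core-b", healthy=True)
paths = [primary, standby]

print(choose_path(paths).name)  # normal operation: the primary carries traffic

primary.healthy = False         # simulate a fault in the primary system
print(choose_path(paths).name)  # traffic is rerouted to the standby
```

The sketch only works as protection if the two paths really are unconnected. If a single software update can mark both as unhealthy at once, `choose_path` returns `None` and the single point of failure is still there.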

The other solution involves the combined telecoms infrastructure in Australia. In an emergency, there should be a "gateway" facility connecting the networks, allowing them to take over traffic from one another. For mobile networks, this solution is called roaming, and this writer has advocated for it for over 20 years. After government pressure, an announcement was finally made last week that roaming between mobile networks is now possible in emergency situations such as bushfires or floods. It's technically feasible and we should explore its use in other emergency scenarios, like the one we experienced yesterday.

The reason for the delay in implementing this solution in Australia is the resistance from telecom companies. They view the size of their networks as a competitive advantage and question why they should allow others to use their network.

The issue is that these networks aren't just commercial operations; they are vital infrastructure for our society and economy. Protecting the national interest in the face of serious network failures is paramount. Implementing such solutions requires the government's commitment and the regulatory authority's influence.

However, there is also a responsibility on the part of users, both organisations and individuals, to acknowledge that such events will happen and to assess their vulnerability. For example, organisations whose sales systems, financial systems, transport systems or emergency operations would fail during an outage need to consider their own redundancy solutions.

For individuals, it's important to be prepared. Most people are familiar with communication apps such as WhatsApp, Skype and FaceTime. In emergencies such as the one that occurred yesterday, these apps still function over any working internet connection, such as Wi-Fi on another provider's network. Mobile phones are also increasingly software-based, using eSIMs (no physical SIM card needed), which allow you to switch between operators, for example from Optus to another operator in a situation like this.

Solutions need to encompass all these aspects. Networks must be more resilient and users must explore their options in such situations.

One thing is certain: more outages will occur, so preparedness is crucial.

Paul Budde is an IA columnist and managing director of independent telecommunications research and consultancy organisation, Paul Budde Consulting. Follow Paul on Twitter @PaulBudde.
