Natalie Bannerman
Following the well-publicised CDN outages of 2021, Capacity’s Natalie Bannerman explores how we can future-proof this infrastructure to avoid such events happening again
In the age of content streaming, we are all likely to have some kind of video streaming service such as Disney+, Prime Video or Netflix in our homes and – depending on our interests – some kind of gaming service in addition.
But how many of us actually think about the networks that deliver this content to us? The global video streaming market was valued at $50.11 billion in 2020 and is expected to increase at a compound annual growth rate of 21% from 2021 to 2028.
With such a large market, it is important that content delivery networks, sometimes referred to as content distribution networks (CDNs), are not only optimised and scalable, but secure.
As a series of geographically distributed network proxy servers and their data centres, CDNs are just as susceptible to outages and security threats as any other form of infrastructure.
A case in point occurred earlier this year, when Fastly, the cloud computing services and CDN provider, caused a widespread digital blackout as a result of a software bug that brought down its edge cloud platform. The hour-long outage affected the likes of Reddit, gov.uk, Twitch, Spotify, Amazon, the New York Times, The Guardian, CNN and the BBC.
At the time, Fastly’s SVP of engineering and infrastructure, Nick Rockwell, confirmed that the incident was triggered “by a valid customer configuration change” following a software deployment on 12 May.
Following this, on 8 June, a customer pushed a valid configuration change, which included “the specific circumstances that triggered the bug, which caused 85% of our network to return errors”, said Rockwell.
Though the Fastly incident was short-lived, a week or so later a similar disruption occurred in Australia, when web services company Akamai also faced a problem with its CDNs – this time taking down airlines and banks including ANZ, Westpac, St George, ME bank, Macquarie Bank, American Airlines, Southwest Airlines, United Airlines and Delta Air Lines.
At the time, Akamai confirmed that it had “experienced an outage for one of its Prolexic DDoS services (Routed 3.0)”, affecting roughly 500 of its customers. The company was keen to stress that: “The issue was not caused by a system update or a cyberattack. A routing table value used by this particular service was inadvertently exceeded. The effect was an unanticipated disruption of service.”
As a result of these disruptions, a wider conversation around CDN security and resiliency was sparked, starting with the most prevalent question: will these types of outages become the new normal?
The new normal
According to Ranjan Goel, vice president of product management at LogicMonitor, a cloud-based infrastructure monitoring and observability platform, as our infrastructure becomes more complicated, the chances of such blackouts also increase.
“As our IT infrastructures grow in complexity, sweeping outages are likely to increase in frequency and severity unless infrastructure monitoring capabilities keep pace with the rate of complexity,” he says.
“The only way to prevent issues resulting from IT infrastructure – such as CDNs – that may lead to widespread outages is through holistic visibility into the entire IT infrastructure to identify problems before they result in wider damage.”
His answer to this problem is leveraging automation and AI – because, as he puts it: “This is a task that humans cannot deal with alone: AIOps and machine learning software must be put to the task.
“AI solutions can pore over the billions of data points IT environments produce and quickly alert IT teams when there is an issue that needs looking at before it results in an outage event,” he explains.
“This may not prevent every outage, as there is no single silver bullet, but it will greatly mitigate the issue and shorten outage mitigation times if and when they do occur.”
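The alerting Goel describes can be illustrated with a far simpler stand-in than a full AIOps product: a rolling baseline that flags any sample deviating sharply from recent behaviour. The sketch below is only an assumption about how such a check might look. The function name `find_anomalies`, the window size, the threshold and the sample error rates are all hypothetical and are not taken from LogicMonitor or any other vendor.

```python
from collections import deque
from statistics import mean, stdev


def find_anomalies(error_rates, window=5, threshold=3.0, min_spread=0.5):
    """Return indices of samples that deviate sharply from the recent baseline.

    error_rates: error percentages sampled at regular intervals (hypothetical data).
    window:      number of recent normal samples used as the baseline.
    threshold:   how many standard deviations count as an anomaly.
    min_spread:  floor on the spread so a perfectly flat baseline does not
                 turn tiny fluctuations into alerts.
    """
    recent = deque(maxlen=window)  # sliding window of recent normal samples
    anomalies = []
    for i, rate in enumerate(error_rates):
        if len(recent) == window:
            baseline = mean(recent)
            spread = max(stdev(recent), min_spread)
            if abs(rate - baseline) > threshold * spread:
                anomalies.append(i)
                continue  # keep the outlier out of the baseline
        recent.append(rate)
    return anomalies


# Hypothetical error-rate samples (per cent) with one sudden spike.
samples = [1.0, 1.2, 0.9, 1.1, 1.0, 1.1, 85.0, 1.0]
flagged = find_anomalies(samples)
```

Real monitoring platforms correlate many signals at once; a single-metric check like this only shows the basic idea of alerting on a departure from normal behaviour.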
Interestingly, Andy Still, CTO of Netacea, a provider of AI-enabled bot detection and mitigation, takes a slightly different view, believing that “outages like these are very rare”.
“Platform resilience is generally getting better, and outages are much less common than they used to be. This is driven by improvements in technology and automation of high-availability systems – systems that are designed to be highly available so will automatically failover to replacements in the event of any issues,” he says.
If we resign ourselves to the fact that more infrastructure blackouts are likely to occur, the question then becomes how to minimise their effects – and perhaps increased competition in the space is the answer. Since the Fastly outage, many industry experts have said that overreliance on a small number of cloud/CDN providers means that when services go down, the disruption is much bigger in scale and impact.
“Companies should use a multi-CDN infrastructure from multiple vendors to minimise, or even avoid, the impact of catastrophic outages like the Fastly and Akamai events,” said Kris Beevers, CEO at NS1, an application traffic intelligence company.
“This also helps them avoid lock-in and gain leverage to keep CDN costs in check. But they must have high observability of their global application delivery performance as well as the ability to immediately take action if a CDN fails to perform as expected.”
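The multi-CDN approach Beevers describes, observing each provider and acting immediately when one underperforms, might be sketched as follows. This is a minimal illustration under stated assumptions, not NS1's method: the `CdnStatus` type, the `pick_cdn` function, the provider labels and the 5% error threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CdnStatus:
    name: str          # hypothetical provider label
    error_rate: float  # fraction of failed requests in the last interval (0.0-1.0)
    latency_ms: float  # median response time seen by the monitor


def pick_cdn(statuses, max_error_rate=0.05):
    """Return the name of the fastest healthy CDN, or None if none is healthy."""
    healthy = [s for s in statuses if s.error_rate <= max_error_rate]
    if not healthy:
        return None
    return min(healthy, key=lambda s: s.latency_ms).name


# Hypothetical snapshot: the fastest provider is returning errors,
# so traffic should be steered to the next-best healthy one.
snapshot = [
    CdnStatus("cdn-a", error_rate=0.85, latency_ms=40.0),
    CdnStatus("cdn-b", error_rate=0.01, latency_ms=55.0),
    CdnStatus("cdn-c", error_rate=0.02, latency_ms=70.0),
]
chosen = pick_cdn(snapshot)
```

In practice the decision would typically be applied through DNS or a traffic-steering layer and fed by live measurements; the sketch covers only the selection step.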
This sentiment is echoed by Magnus Bjornsson, CEO of Men&Mice, a provider of sustainable network management, who reminds us that as more businesses rely on a small circle of CDN providers, when problems do occur “the fallout is likely to be sweeping”.
“The only effective way of resolving this issue is to add redundancy,” he adds. “It is therefore critical that CDNs have a proper redundancy solution in place, but also for the users themselves to think about redundancy and build their product to not use a single CDN service.”
As important as redundancy is, Still says that the larger size of some CDNs is directly tied to their effective content delivery, and that relying on smaller providers could in fact lead to more outages overall.
“One of the key benefits of a large CDN is its size. In fact, size is one of the drivers for using a CDN. The more companies using this service, the better the underlying network can be – a large number of small CDNs would lose this benefit,” explains Still.
“The size of a bigger CDN means the impact appears larger, but that is simply because there are a lot of simultaneous outages. Smaller companies would likely have more outages, but they would be more distributed, so not as noticeable.”
The most important part of the solution is, of course, security. But first, how are CDNs currently being fortified? As with most other cloud-based infrastructure, this includes everything from DDoS mitigation, SSL certification and application firewalls to monitoring and visibility platforms.
“Leveraging multiple managed DNS services to ensure optimal DNS redundancy is quickly becoming the clear best practice for CDNs which are being used by businesses,” says Bjornsson.
“That means automatically taking care of the replication and synchronisation of data in a reliable and consistent way.”
On the topic of building high availability into the key layers in the infrastructure stack, Paul Speciale, chief product officer at Scality, says that software-based virtualisation and software-defined storage and networking have become “commonplace in the data centre, and they leverage commodity hardware” meaning that “high availability, security and manageability really need to be planned at all the key layers in the infrastructure stack”.
The key attributes required to achieve “gold standard” availability – commonly referred to as 99.999% availability – Speciale says, include solutions built on distributed systems, with redundancy in both software and hardware components to eliminate single points of failure.
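To put the 99.999% figure in context, it allows roughly 5.26 minutes of downtime a year. The short sketch below works through that arithmetic and shows why redundancy helps. The function names are hypothetical, and the redundancy calculation assumes the replicas fail independently of one another, which real systems only approximate.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(availability):
    """Expected yearly downtime, in minutes, for a given availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(single, replicas):
    """Availability of `replicas` copies when any one working copy is enough.

    Assumes failures are independent, so the service is down only when
    every replica is down at the same time.
    """
    return 1.0 - (1.0 - single) ** replicas


five_nines = downtime_minutes_per_year(0.99999)  # about 5.26 minutes
two_copies = combined_availability(0.99, 2)      # about 0.9999
three_copies = combined_availability(0.99, 3)    # about 0.999999
```

Under the independence assumption, three components that are each only 99% available can together do better than "five nines", which is the arithmetic behind eliminating single points of failure.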
“These systems should be designed to fully anticipate and expect failure events to occur: components fail, services fail – so modern system design is to expect failures to happen and have a design that can route around the failures through alternative paths,” adds Speciale.
Aside from the use of AI and automation to detect and correct problems, and cost-effective networking, he also says that “self-healing systems have become more common”, which means the ability to recover automatically from events such as server or disk drive failures by rebuilding data and storing it redundantly on other servers or disk drives to restore protection levels.
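The self-healing behaviour described here, rebuilding lost copies on other servers to restore protection levels, can be illustrated with a small sketch. It is an assumption about the general idea rather than a description of Scality's software: the `heal` function, the server names and the two-copy target are all hypothetical.

```python
def heal(placement, live_servers, copies=2):
    """Re-replicate objects so each has `copies` replicas on live servers.

    placement:    maps object id -> set of server names holding a replica.
    live_servers: set of servers that are currently reachable.
    Returns a new mapping. An object with no surviving replica cannot be
    rebuilt and is returned with an empty set.
    """
    candidates = sorted(live_servers)  # deterministic order for the sketch
    healed = {}
    for obj, servers in placement.items():
        survivors = {s for s in servers if s in live_servers}
        if survivors:
            # Copy from a surviving replica onto further live servers
            # until the target protection level is restored.
            for candidate in candidates:
                if len(survivors) >= copies:
                    break
                survivors.add(candidate)
        healed[obj] = survivors
    return healed


# Hypothetical cluster in which server "s2" has just failed.
before = {"a": {"s1", "s2"}, "b": {"s2", "s3"}, "c": {"s2"}}
after = heal(before, live_servers={"s1", "s3", "s4"})
```

Object `c` had its only copy on the failed server, so it cannot be rebuilt, which is why protection levels need to be restored promptly after each failure.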
Overall, Still says that “any good security approach will consider security of both the infrastructure and application”, pointing out that often security attacks occur via the business logic of an application.
“So, rather than exploiting a technical weakness, they undertake legitimate activity for illegitimate aims – for example, creating thousands of fake accounts to get a free bonus with each account.”
“Streaming content continues to surge, not just from video and gaming but also from the high-definition delivery networks by non-media companies,” says Goel.
This in turn means that these companies are recognising that CDNs are now part of enterprise architecture that “needs to be actively monitored for the overall availability of their business services.”
As such, “It’s important that CDNs are designed with redundancy in place to ensure that they can continue delivering content to users without failure,” adds Bjornsson.
Source of this news: https://www.capacitymedia.com/articles/3829483/cdns-down-but-not-out