Inside Azure Datacenters 2020: The Truth

Now here you can see the full spectrum of what I would consider Azure datacenters. It might not be what you were thinking of; you were probably thinking of what you see on the far right, which is our hyper-scale cloud regions. But all the way down to a tiny Azure Sphere device that is sitting outside in the wild on an MCU-type sensor, that, in some sense, is an Azure datacenter, because Azure services get pushed down onto that device and it connects to the hyper-scale cloud regions.

There's a spectrum of datacenters in between those two extremes. On the small end are Azure IoT devices, where you install the Azure IoT runtime onto a Raspberry Pi or a server and those become Azure-enabled. From there you move up to Azure Stack Hub and Azure Stack HCI, where Azure Stack Hub is, in essence, full Azure, from the portal to the control plane APIs, the Azure Resource Manager, the platform-as-a-service offerings, and infrastructure-as-a-service, that you can deploy on as little as four servers in your own datacenter or out on the edge.

In between are Azure Private Edge Zones and Azure Edge Zones, which are intermediate-class deployments of Azure services that I'll talk more about a little bit later. This full spectrum is Azure datacenters. But when you start to talk about the far right of that spectrum, which is the hyper-scale public cloud regions, you start to talk about our overall cloud footprint. And Azure has more regions than any other cloud provider, with more than 60 at this point.

The company has launched multiple regions since just the start of this year. At this point we're connecting all of these regions, 61 of them, with our dark fiber backbone, which consists of over 160,000 miles of connections between the regions, and growing. If your traffic enters one of our regions on one side of the planet and traverses to another region, it stays on our dark fiber backbone. We've got 170 edge sites so that traffic coming from outside of Azure, for example from the edge or from your customers using mobile apps or other client devices to interact with your services running in Azure, will enter Azure's dark fiber backbone through our edge sites, maybe even have SSL termination at those points, and then the rest of the traffic is carried directly into Azure regions for the highest SLAs. We've got peering agreements with over 500 network providers to make sure that we can honor strong SLAs even when the traffic is coming in from their networks into ours. And we continue to build this out with some of the largest subsea cables, for example MAREA, which I've talked about before in previous versions of this talk.

In terms of bandwidth connecting multiple continents together, we continue to make huge capital investments to build this network up. When we talk about those regions, they're in the larger context of our region strategy. Our region strategy starts with the definition of a geography, where a geography is a data sovereignty boundary: customers are comfortable having their data anywhere within that boundary, and in some cases are required to keep it there because of regulatory requirements placed on them by whatever regulators are regulating their businesses.

A geography is typically just a country, because regulations vary from country to country. So when we launch in Spain or in Norway or in Israel, those become geographies. Within a geography, we strive to have two regions. The reason we have two regions is so that you can build a disaster recovery solution on top of Azure: if a large-scale disaster, whether software, hardware, or environmental, impacts one region and takes it down, you can survive it by migrating your workloads or serving your traffic from the paired region within the same geography. Those paired regions are usually hundreds of miles apart so they can survive those kinds of large-scale hurricanes, earthquakes, or floods.
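To make that pairing idea concrete, here's a minimal Python sketch of how you might model geographies with paired regions and fail a workload over to the in-geography pair when the primary is unhealthy. The geography names, region names, and health check are hypothetical placeholders for illustration, not actual Azure APIs or region identifiers.

```python
# Minimal sketch of geography -> paired-region failover logic.
# Geography/region names and the health probe are illustrative only,
# not an actual Azure API.

REGION_PAIRS = {
    # geography: (primary region, paired DR region)
    "Norway": ("norway-primary", "norway-secondary"),
    "Israel": ("israel-primary", "israel-secondary"),
}

def is_healthy(region: str) -> bool:
    """Placeholder health probe; in practice this would check service
    health endpoints or your own monitoring."""
    return region != "norway-primary"   # pretend the primary is down

def active_region(geography: str) -> str:
    primary, secondary = REGION_PAIRS[geography]
    # Data stays inside the geography (the sovereignty boundary):
    # failover only ever targets the in-geography pair.
    return primary if is_healthy(primary) else secondary

print(active_region("Norway"))   # -> norway-secondary
```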

Within regions, we've increasingly got an availability zone architecture, where availability zones mean we've got discrete datacenters, three of them at a minimum, that have independent power, cooling, and networking. An availability zone can actually consist of more than one datacenter. That means one availability zone can be impacted by the kinds of local disasters that hit a datacenter, like a brownout, or the area around a datacenter, like a local flood, and that will only impact that one availability zone. Because there are two others that remain operational, you can continue to serve even durable, quorum-based storage traffic or data traffic out of those two zones, because they're close enough together that you can synchronously replicate.
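As a rough illustration of why three zones are enough to keep quorum-based storage serving traffic, here's a simplified Python sketch of a majority-quorum write across three zones: a write succeeds as long as two of the three acknowledge, so losing any single zone doesn't stop writes. This is a toy model for intuition, not the actual Azure Storage replication protocol.

```python
# Simplified majority-quorum write across three availability zones.
# Illustrative only; not Azure's actual replication implementation.

ZONES = ["zone-1", "zone-2", "zone-3"]

def write_to_zone(zone: str, value: bytes, down: set) -> bool:
    """Pretend replica write; it fails if the zone is impacted."""
    return zone not in down

def quorum_write(value: bytes, down_zones=frozenset()) -> bool:
    # Zones are close enough together (low single-digit milliseconds apart)
    # that these replica writes can be synchronous.
    acks = sum(write_to_zone(z, value, down_zones) for z in ZONES)
    return acks >= len(ZONES) // 2 + 1   # majority: 2 of 3

print(quorum_write(b"data"))                                    # True: all zones up
print(quorum_write(b"data", down_zones={"zone-2"}))             # True: 2 of 3 ack
print(quorum_write(b"data", down_zones={"zone-1", "zone-2"}))   # False: no quorum
```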

In the case of the disaster recovery pairs, the regions might be tens of milliseconds apart from each other to achieve that disaster recovery resilience. Within a region, all of the datacenters are within two milliseconds or less of each other, so that you can achieve that synchronous replication and serve online traffic. This large-scale footprint that we're deploying around the world, larger and larger regions with more and more datacenters, means we've got to be very careful and considerate about our impact on the environment. And this is something that Microsoft has been considering for many, many years now. In fact, we've been carbon neutral since 2012. But carbon neutrality doesn't get us far enough. You can still be impacting the environment and be carbon neutral. What we really want to do is go for carbon negativity.

Carbon negativity means that you're actually having zero real impact on the climate. If you take a look at Microsoft's investments toward this goal, it starts with clean, renewable energy. We aim to be 100% renewable by 2025. And to support that, we've been driving some of the largest clean energy deals on the planet. For example, we signed a 315-megawatt solar deal in the state of Virginia a couple of years ago that was the largest corporate solar deal in United States history. We also set a commitment, which you can see Satya Nadella, Amy Hood, our CFO, and Brad Smith, our president, made early this year, to go carbon negative by the year 2030, with the goal of being carbon negative for the history of Microsoft by 2050. What that means is that for all the carbon Microsoft has put into the atmosphere, directly or indirectly, through our supply chain, employee travel, or the shuttles on campus, by the year 2050 we'll have compensated for all of it, going back to Microsoft's founding in 1975. That means we're going to have zero Microsoft carbon footprint on the planet by that time.

To support that, we've created a $1 billion climate innovation fund that will fund startups and research into going carbon negative. As a result of all these investments, we've got some of the most efficient datacenters in the world. You can see our average PUE, or power usage effectiveness, which is the industry-standard way to measure datacenter efficiency, is just 1.189 across all of our datacenters, including the datacenters we created 15 years ago and are still operating. Our latest-generation datacenters have even lower PUEs. To take a look at that, I thought it would be interesting to see Microsoft datacenter evolution by going back in time and seeing how Microsoft has evolved to take advantage of newer technologies and to experiment with new ways to be efficient. You can see through this walk down memory lane just how far we've progressed. If we start back in generation one, that was the colo era for Microsoft, when we were deploying discrete servers of all different types into colo facilities that were mechanically cooled, operated at low temperature, and delivered five-nines availability, with redundant power to every server.

But with this kind of colo approach, with no focus on efficiency, our PUE was 2.0 or higher. This isn't far off what most enterprises are experiencing today in terms of their own efficiency, because most enterprises still continue to operate in this colo manner. In the mid-aughts, Microsoft started to get into cloud services, and so the focus became: how can we be more efficient as we deploy these large-scale services? So in 2007, in our generation two phase, we focused on density, where we would configure the racks, provision them as racks of servers with high-density provisioning of the servers within them, and then deploy those. This higher density allowed us to get the PUE down to between 1.5 and 1.8.
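For context, PUE is just total facility power divided by the power delivered to the IT equipment, so a PUE of 2.0 means you burn a full extra watt of cooling and power-distribution overhead for every watt of server load, while the 1.189 average quoted above means only about 0.19 extra watts. A quick worked example:

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness = total facility power / IT equipment power."""
    return total_facility_kw / it_equipment_kw

# For every 1,000 kW of IT load:
print(pue(2000, 1000))   # 2.0   -> 1,000 kW of overhead (colo era)
print(pue(1189, 1000))   # 1.189 -> 189 kW of overhead (current Azure average)
```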

With generation three, our container phase, we focused on deploying containers of servers. This allowed us to move a little more quickly. It also allowed us to segregate three-nines capacity from five-nines capacity. Five-nines capacity is kind of gold-plated, with dual redundant power cords to every server, and has higher PUEs than the three-nines containers, where we could run workloads that were more resilient to total failures. In fact, Office services today are designed to work in three-nines configurations, such that even a whole region can go down.

And in Office 365, from a customer-experience standpoint, there will be no impact at all. This allowed us to get down to PUEs of 1.4 to 1.6. In generation four, we moved away from containers, which we found were actually more expensive for long-term maintenance than the ITPAC approach, where we pre-provision a portion of the datacenter footprint for certain types of servers and have them go in with the network cabling and everything else pre-configured. We also switched at this point to adiabatic cooling, meaning we'd use the ambient air temperature around the datacenter to lower cooling costs, and we could also operate the servers at a higher temperature.

This allowed us to get the PUEs down to between 1.1 and 1.3. Generation five was Microsoft's hyper-scale phase. We were deploying so quickly back in the 2015 timeframe that we focused on how we could get big footprints out quickly and make sure we could grow them to meet hyper-scale demand. You can see here in the picture our DC 2015 design, which I showed in previous versions of this talk, which consisted of four eight-megawatt colo facilities with independent power, cooling, and networking so that each could serve as an independent availability zone. We made them in a three-nines configuration, realized that really we needed to deliver five nines, and so upgraded them to five nines. These could deliver PUEs of 1.17 to 1.25. Then came generation six, our scalable form factor generation.

Instead of creating these large 32-megawatt tranches of datacenter, we moved to creating smaller colos that we could connect together into a high-scale configuration. This also allowed us to save costs and to be more scalable and more agile in deploying. With these, we were able to get the PUE to between 1.17 and 1.19. Then there's generation seven, which we launched last year: datacenters with our Ballard design. In this one, we simplified the design and the electrical systems. We introduced support for flex capacity, meaning we could over-provision power in the datacenter but, by using intelligent software, monitor the usage of power and then throttle services, like the three-nines workloads that we've got, such that we would never trip power, even when a high energy-consuming workload would spin up on a bunch of CPUs. This allowed us to get to PUEs between 1.15 and 1.18. And then we've taken this design further and made a rapid-deployment version of it, where we use pre-built modular construction and can get it set up in a very short amount of time at smaller footprints than the full Ballard colo minimum footprint.
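The flex-capacity idea is essentially a software power-capping loop: compare current draw against the provisioned budget and throttle the lower-priority, three-nines workloads first so the facility never trips its power limit. Here's a minimal sketch of that kind of logic; the budget, margin, and workload names are made up purely for illustration.

```python
# Minimal sketch of a flex-capacity power-capping loop.
# Budgets, workloads, and throttling limits are made up for illustration.

POWER_BUDGET_KW = 8000          # provisioned power for the colo
SAFETY_MARGIN_KW = 400          # never get closer than this to the limit

workloads = [
    # (name, current draw in kW, can_throttle)
    ("customer-vms",    5200, False),   # five-nines: never throttled
    ("batch-analytics", 2100, True),    # three-nines: throttle first
    ("internal-batch",  1100, True),
]

def throttle_plan(workloads, budget_kw, margin_kw):
    """Return how many kW to shave off each throttleable workload."""
    total = sum(kw for _, kw, _ in workloads)
    excess = total - (budget_kw - margin_kw)
    plan = {}
    for name, kw, can_throttle in workloads:
        if excess <= 0 or not can_throttle:
            continue
        cut = min(kw * 0.5, excess)     # cap any single cut at 50% of the workload
        plan[name] = cut
        excess -= cut
    return plan

print(throttle_plan(workloads, POWER_BUDGET_KW, SAFETY_MARGIN_KW))
# e.g. {'batch-analytics': 800} -> shave 800 kW off the throttleable work
```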

What this gives us, at slightly more expense, is the ability to deploy new regions extremely rapidly. So this is a trade-off: faster deployment versus a little bit of extra cost for the initial landing. And then, once we initially land and start to grow, we can be building Ballard datacenters on site to continue to expand and meet customer demand as that region's demand grows. Again, because it's based on Ballard, it's 1.15 to 1.18 PUE. So you can see how the deployment of these new types of datacenters has brought down our PUE over time. But we're going to go even further. If you take a look at where we're exploring, it's to get to PUEs of 1.07 or less. And the path that we see to that is something called Project Natick.

Project Natick is a joint effort between Microsoft Research and Microsoft Azure. I've talked about it before. The idea here is that we take cylinders, seal them, evacuate the air out of them, refill them with inert nitrogen gas, and then drop them on the ocean floor. This has a number of different benefits. First of all, the majority of the world's population centers are near a coastline. That means we can deploy underwater datacenters and serve just about all of the world very efficiently. Because they're on the ocean floor, they can take advantage of the ambient water temperature to cool. And that means we can drive those PUEs down to levels that we couldn't get in land-based datacenters.

Being on the ocean floor means they're also resistant to hurricanes, solar storms, earthquakes, and other kinds of phenomena that impact land-based datacenters. And we can deploy them extremely quickly, much more quickly than even the rapidly deployable land-based datacenters. For example, you're looking at Natick V2, which is the second version of this experiment, where we took 12 racks of Azure servers, put them in the cylinder, and dropped them a kilometer off the coast of Scotland in about a hundred feet of water. And from the time that we had that cylinder made to the time we had the servers in it, on the ocean floor, powered on, and serving Microsoft workloads, was 30 days. So really rapid deployment.

A whole bunch of benefits. But one of the theories here is that if we're sealing these things up and putting them on the ocean floor, we can't really service them the way we do land-based datacenters, where if there's a server with a fault in it, we go and replace that server. We've got to go to a fail-in-place model, where the servers degrade as they fail over time. Once the number of servers in the container has dropped to a certain threshold, we need to replace that whole cylinder, that whole container. So we'll deploy a new one alongside the old one, migrate the data and workloads, and then pull up the old one, refurbish it, and get it ready to go again as a replacement. But that theory is based on the premise that with the inert gas, and with the constant temperature you have on the ocean floor, the servers will actually be much more reliable than the land-based versions. Because at the failure rates we see in land-based datacenters, we just wouldn't be able to do the fail-in-place model, at least not with the current approach.
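The fail-in-place math is easy to sketch: given an annual failure rate and a minimum acceptable fraction of healthy servers, you can estimate how long a sealed cylinder stays in service before it needs to be swapped out. Here's a back-of-the-envelope Python model; the server count, failure rates, and threshold are illustrative assumptions, not measured Natick values.

```python
# Back-of-the-envelope fail-in-place model for a sealed cylinder.
# Server count, failure rates, and the replacement threshold are
# illustrative assumptions, not measured values.

def years_until_replacement(servers: int,
                            annual_failure_rate: float,
                            min_fraction: float) -> float:
    """Years until the surviving-server fraction drops below the threshold,
    assuming independent failures at a constant annual rate."""
    healthy = float(servers)
    years = 0.0
    while healthy / servers >= min_fraction:
        healthy *= (1 - annual_failure_rate)
        years += 1
    return years

# A land-like ~6%/year failure rate vs. a much lower ~0.8%/year rate,
# replacing the cylinder once fewer than 80% of servers remain healthy:
print(years_until_replacement(800, 0.06, 0.80))    # 4.0  -> swap after ~4 years
print(years_until_replacement(800, 0.008, 0.80))   # 28.0 -> roughly 28 years
```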

So we ran Project Natick on the ocean floor for a little over a year and a half, pulled it up, and now we're doing evaluation of the servers. But along the way, when we started the Project Natick V2 test, we took a little over a thousand Azure servers and split them into three test groups plus a control group, about 280 in each one, and then ran these experiments, where the first set is nitrogen at the normal air temperature that we would see in a land-based datacenter.

The second one is nitrogen gas, but with cooling to bring them down to even cooler than normal land-based datacenters, on the theory that maybe running cool would preserve and extend server lifetime. And then there's nitrogen with constant temperature, which is what we'd experience in a standard Natick deployment on the bottom of the ocean floor. The control, of course, is 280 of those servers sitting in a land-based datacenter like we have today. So after 16 months, the results are in. And in fact, you can see that all three of the Natick approaches had one eighth the failure rate of the control on land, which had a failure rate of about 6%, while the other ones vary between 0.4% and 1.1%. So we don't really think that normal, cold, or constant temperature plays much of a role; it's really the inert nitrogen gas that preserves and extends the server lifetimes.
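The "one eighth" claim is easy to sanity-check from the numbers quoted: one eighth of the land control group's roughly 6% failure rate is 0.75%, which lands right inside the 0.4% to 1.1% range observed for the three nitrogen groups. For example:

```python
# Sanity-checking the quoted failure-rate comparison.
control_rate = 0.06                 # ~6% failures in the land-based control group
natick_rates = [0.004, 0.011]       # 0.4% to 1.1% across the nitrogen groups

print(control_rate / 8)             # 0.0075 -> one eighth of the control rate
print(min(natick_rates) <= control_rate / 8 <= max(natick_rates))   # True
```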
