Source: 5 principles for cloud-native architecture—what it is and how to master it from Google Cloud
At Google Cloud, we often throw around the term ‘cloud-native architecture’ as the desired end goal for applications that you migrate or build on Google Cloud Platform (GCP). But what exactly do we mean by cloud-native? More to the point, how do you go about designing such a system?
At a high level, cloud-native architecture means adapting to the many new possibilities—but very different set of architectural constraints—offered by the cloud compared to traditional on-premises infrastructure. Consider the high level elements that we as software architects are trained to consider:
While the functional aspects don’t change too much, the cloud offers, and sometimes requires, very different ways to meet non-functional requirements, and imposes very different architectural constraints. If architects fail to adapt their approach to these different constraints, the systems they architect are often fragile, expensive, and hard to maintain. A well-architected cloud native system, on the other hand, should be largely self-healing, cost efficient, and easily updated and maintained through Continuous Integration/Continuous Delivery (CI/CD).
The good news is that cloud is made of the same fabric of servers, disks and networks that makes up traditional infrastructure. This means that almost all of the principles of good architectural design still apply for cloud-native architecture. However, some of the fundamental assumptions about how that fabric performs change when you’re in the cloud. For instance, provisioning a replacement server can take weeks in traditional environments, whereas in the cloud, it takes seconds—your application architecture needs to take that into account.
In this post we set out five principles of cloud-native architecture that will help to ensure your designs take full advantage of the cloud while avoiding the pitfalls of shoe-horning old approaches into a new platform.
Principles for cloud-native architecture
The principle of architecting for the cloud, a.k.a. cloud-native architecture, focuses on how to optimize system architectures for the unique capabilities of the cloud. Traditional architecture tends to optimize for a fixed, high-cost infrastructure, which requires considerable manual effort to modify. Traditional architecture therefore focuses on the resilience and performance of a relatively small fixed number of components. In the cloud however, such a fixed infrastructure makes much less sense because cloud is charged based on usage (so you save money when you can reduce your footprint) and it’s also much easier to automate (so automatically scaling-up and down is much easier). Therefore, cloud-native architecture focuses on achieving resilience and scale though horizontal scaling, distributed processing, and automating the replacement of failed components. Let’s take a look.
Principle 1: Design for automation
Automation has always been a best practice for software systems, but cloud makes it easier than ever to automate the infrastructure as well as components that sit above it. Although the upfront investment is often higher, favouring an automated solution will almost always pay off in the medium term in terms of effort, but also in terms of the resilience and performance of your system. Automated processes can repair, scale, deploy your system far faster than people can. As we discuss later on, architecture in the cloud is not a one-shot deal, and automation is no exception—as you find new ways that your system needs to take action, so you will find new things to automate.
Some common areas for automating cloud-native systems are:
Principle 2: Be smart with state
Storing of ‘state’, be that user data (e.g., the items in the users shopping cart, or their employee number) or system state (e.g., how many instances of a job are running, what version of code is running in production), is the hardest aspect of architecting a distributed, cloud-native architecture. You should therefore architect your system to be intentional about when, and how, you store state, and design components to be stateless wherever you can.
Stateless components are easy to:
Principle 3: Favor managed services
Cloud is more than just infrastructure. Most cloud providers offer a rich set of managed services, providing all sorts of functionality that relieve you of the headache of managing the backend software or infrastructure. However, many organizations are cautious about taking advantage of these services because they are concerned about being ‘locked in’ to a given provider. This is a valid concern, but managed services can often save the organization hugely in time and operational overhead.
Broadly speaking, the decision of whether or not to adopt managed services comes down to portability vs. operational overhead, in terms of both money, but also skills. Crudely, the managed services that you might consider today fall into three broad categories:
However, practical experience has shown that most cloud-native architectures favor managed services; the potential risk of having to migrate off of them rarely outweighs the huge savings in time, effort, and operational risk of having the cloud provider manage the service, at scale, on your behalf.
Principle 4: Practice defense in depth
Traditional architectures place a lot of faith in perimeter security, crudely a hardened network perimeter with ‘trusted things’ inside and ‘untrusted things’ outside. Unfortunately, this approach has always been vulnerable to insider attacks, as well as external threats such as spear phishing. Moreover, the increasing pressure to provide flexible and mobile working has further undermined the network perimeter.
Cloud-native architectures have their origins in internet-facing services, and so have always needed to deal with external attacks. Therefore they adopt an approach of defense-in-depth by applying authentication between each component, and by minimizing the trust between those components (even if they are ‘internal’). As a result, there is no ‘inside’ and ‘outside’.
Cloud-native architectures should extend this idea beyond authentication to include things like rate limiting and script injection. Each component in a design should seek to protect itself from the other components. This not only makes the architecture very resilient, it also makes the resulting services easier to deploy in a cloud environment, where there may not be a trusted network between the service and its users.
Principle 5: Always be architecting
One of the core characteristics of a cloud-native system is that it’s always evolving, and that’s equally true of the architecture. As a cloud-native architect, you should always seek to refine, simplify and improve the architecture of the system, as the needs of the organization change, the landscape of your IT systems change, and the capabilities of your cloud provider itself change. While this undoubtedly requires constant investment, the lessons of the past are clear: to evolve, grow, and respond, IT systems need to live and breath and change. Dead, ossifying IT systems rapidly bring the organization to a standstill, unable to respond to new threats and opportunities.
The only constant is change
In the animal kingdom, survival favors those individuals who adapt to their environment. This is not a linear journey from ‘bad’ to ‘best’ or from ‘primitive’ to ‘evolved’, rather everything is in constant flux. As the environment changes, pressure is applied to species to evolve and adapt. Similarly, cloud-native architectures do not replace traditional architectures, but they are better adapted to the very different environment of cloud. Cloud is increasingly the environment in which most of us find ourselves working, and failure to evolve and adapt, as many species can attest, is not a long term option.
The principles described above are not a magic formula for creating a cloud-native architecture, but hopefully provide strong guidelines on how to get the most out of the cloud. As an added benefit, moving and adapting architectures for cloud gives you the opportunity to improve and adapt them in other ways, and make them better able to adapt to the next environmental shift. Change can be hard, but as evolution has shown for billions of years, you don’t have to be the best to survive—you just need to be able to adapt.
If you would like to learn more about the topics in this post, check out the following resources: