This paper describes the AWS Well-Architected Framework, which enables customers to review and improve their cloud-based architectures and better understand the business impact of their design decisions. We address general design principles as well as specific best practices and guidance in five conceptual areas that we define as the pillars of the Well-Architected Framework.
The AWS Well-Architected Framework is based on five pillars: security,
reliability, performance efficiency, cost optimization, and operational
excellence.
Security: The ability to protect information, systems, and assets while
delivering business value through risk assessments and mitigation strategies.
Reliability: The ability of a system to recover from infrastructure or
service failures, dynamically acquire computing resources to
meet demand, and mitigate disruptions such as
misconfigurations or transient network issues.
Performance Efficiency: The ability to use computing resources efficiently to meet
system requirements, and to maintain that efficiency as
demand changes and technologies evolve.
Cost Optimization: The ability to avoid or eliminate unneeded cost or
suboptimal resources.
Operational Excellence: The ability to run and monitor systems to deliver business
value and to continually improve supporting processes and
procedures.
General Design Principles
The Well-Architected Framework identifies a set of general design principles to
facilitate good design in the cloud:
Stop guessing your capacity needs: Eliminate guesswork about your
infrastructure capacity needs. When you make a capacity decision before
you deploy a system, you might end up sitting on expensive idle
resources or dealing with the performance implications of limited
capacity. With cloud computing, these problems can go away. You can
use as much or as little capacity as you need, and scale up and down
automatically.
Test systems at production scale: In the cloud, you can create a
production-scale test environment on demand, complete your testing,
and then decommission the resources. Because you only pay for the test
environment when it is running, you can simulate your live environment
for a fraction of the cost of testing on premises.
Automate to make architectural experimentation easier:
Automation allows you to create and replicate your systems at low cost
and avoid the expense of manual effort. You can track changes to your
automation, audit the impact, and revert to previous parameters when
necessary.
Allow for evolutionary architectures: In a traditional environment,
architectural decisions are often implemented as static, one-time
events, with a few major versions of a system during its lifetime. As a
business and its context continue to change, these initial decisions might
hinder the system’s ability to deliver changing business requirements. In
the cloud, the capability to automate and test on demand lowers the risk
of impact from design changes. This allows systems to evolve over time
so that businesses can take advantage of innovations as a standard
practice.
Data-driven architectures: In the cloud, you can collect data on how
your architectural choices affect the behavior of your workload. This lets
you make fact-based decisions on how to improve your workload. Your
cloud infrastructure is code, so you can use that data to inform your
architecture choices and improvements over time.
Improve through game days: Test how your architecture and
processes perform by regularly scheduling game days to simulate events
in production. This will help you understand where improvements can be
made and can help develop organizational experience in dealing with
events.
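The first principle above, using as much or as little capacity as you need, can be illustrated with a small calculation. The following is a minimal sketch of a proportional scaling rule, not the algorithm of any particular AWS service. The function name `desired_capacity`, the 60 percent target, and the instance limits are all assumptions chosen for illustration.

```python
def desired_capacity(current_instances, observed_pct, target_pct=60,
                     min_instances=1, max_instances=20):
    """Return an instance count that moves average utilization toward target_pct.

    Utilization values are whole-number percentages so the arithmetic stays
    in integers and avoids floating-point rounding surprises.
    All defaults are hypothetical values chosen for illustration.
    """
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    if not 0 < target_pct <= 100:
        raise ValueError("target_pct must be in the range 1..100")
    if observed_pct < 0:
        raise ValueError("observed_pct must not be negative")
    # Proportional rule: total load is current * observed. Divide it by the
    # target per-instance load and round up (ceiling division on integers).
    total_load = current_instances * observed_pct
    needed = -(-total_load // target_pct)
    # Clamp to the configured bounds so the fleet never scales without limit.
    return max(min_instances, min(max_instances, needed))


# Hypothetical readings: (current instances, observed utilization in percent).
for current, observed in [(4, 90), (4, 30), (15, 100)]:
    print(current, observed, "->", desired_capacity(current, observed))
```

With these assumed inputs, a fleet of 4 instances at 90 percent utilization grows to 6, the same fleet at 30 percent shrinks to 2, and a request that would exceed the ceiling is capped at 20. In practice the decision is made by the scaling service rather than by application code, so treat this only as a way to reason about the rule.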
In the cloud, there are a number of principles that can help you achieve
performance efficiency:
Democratize advanced technologies: Technologies that are difficult
to implement can become easier to consume by pushing that knowledge
and complexity into the cloud vendor’s domain. Rather than having your
IT team learn how to host and run a new technology, they can simply
consume it as a service. For example, NoSQL databases, media
transcoding, and machine learning are all technologies that require
expertise that is not evenly dispersed across the technical community. In
the cloud, these technologies become services that your team can
consume while focusing on product development rather than resource
provisioning and management.
Go global in minutes: Easily deploy your system in multiple regions
around the world with just a few clicks. This allows you to provide lower
latency and a better experience for your customers at minimal cost.
Use serverless architectures: In the cloud, serverless architectures
remove the need for you to run and maintain servers to carry out
traditional compute activities. For example, storage services can act as
static websites, removing the need for web servers; and event services can
host your code for you. This not only removes the operational burden of
managing these servers, but also can lower transactional costs because
these managed services operate at cloud scale.
Experiment more often: With virtual and automatable resources, you
can quickly carry out comparative testing using different types of
instances, storage, or configurations.
Mechanical sympathy: Use the technology approach that aligns best
to what you are trying to achieve. For example, consider data access
patterns when selecting database or storage approaches.
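The last two principles, comparative testing and matching the technology to the access pattern, can be tried out on a small scale. The sketch below times one workload against two in-memory data structures using Python's standard `timeit` module. The helper `compare_candidates`, the workload, and the data sizes are hypothetical stand-ins for real instance types, storage options, or database engines.

```python
import timeit


def compare_candidates(candidates, workload, repeats=3):
    """Time each candidate on the same workload; return {name: best_seconds}."""
    results = {}
    for name, build in candidates.items():
        store = build()  # construct the candidate outside the timed region
        timer = timeit.Timer(lambda: workload(store))
        # Keep the best of several runs to reduce noise from other processes.
        results[name] = min(timer.repeat(repeat=repeats, number=1))
    return results


KEYS = range(5000)


def lookup_workload(store):
    """Access pattern under test: many membership checks by key."""
    return sum(1 for key in range(0, 10000, 50) if key in store)


# Two hypothetical candidates holding the same data.
candidates = {
    "list": lambda: list(KEYS),
    "set": lambda: set(KEYS),
}

timings = compare_candidates(candidates, lookup_workload)
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.6f} s")
```

Both candidates return the same answers, so the choice comes down to measured cost for this access pattern. Key-based lookups usually favor the hash-based structure, but the measurement on your own workload should drive the decision.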
There are four best practice areas for Performance Efficiency in the cloud:
1. Selection (compute, storage, database, network)
2. Review
3. Monitoring
4. Tradeoffs
Take a data-driven approach to selecting a high-performance architecture.
Gather data on all aspects of the architecture, from the high-level design to the
selection and configuration of resource types. By reviewing your choices on a
cyclical basis, you will ensure that you are taking advantage of the continually
evolving AWS platform. Monitoring will ensure that you are aware of any
deviation from expected performance and can take action on it. Finally, your
architecture can make tradeoffs to improve performance, such as using
compression or caching, or relaxing consistency requirements.
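The caching tradeoff mentioned above can be made concrete with a small read-through cache. This is a minimal sketch under stated assumptions: the class `PriceService`, its methods, and the dictionary standing in for a slower backing store are all hypothetical. It shows that a cache reduces backend reads at the price of serving stale data until an entry is invalidated, which is one form of relaxed consistency.

```python
class PriceService:
    """Read-through cache in front of a slower backing store (a dict here)."""

    def __init__(self, backend):
        self._backend = backend   # stands in for a database or remote API
        self._cache = {}
        self.backend_reads = 0    # counts how often the slow path is taken

    def get_price(self, sku):
        """Return the cached value, reading the backend only on a miss."""
        if sku not in self._cache:
            self.backend_reads += 1
            self._cache[sku] = self._backend[sku]
        return self._cache[sku]

    def invalidate(self, sku):
        """Drop a cached entry so the next read goes back to the backend."""
        self._cache.pop(sku, None)


catalog = {"sku-1": 100}          # hypothetical backing data
service = PriceService(catalog)

for _ in range(3):
    service.get_price("sku-1")
print("backend reads after 3 lookups:", service.backend_reads)

catalog["sku-1"] = 120            # the source of truth changes
print("cached (stale) price:", service.get_price("sku-1"))

service.invalidate("sku-1")
print("price after invalidation:", service.get_price("sku-1"))
```

Three lookups cost one backend read, but the price change is invisible until the entry is invalidated. Whether that staleness is acceptable depends on the workload, which is why the tradeoff should be made deliberately and backed by measurement.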