Amazon EKS Deep Dive is a series of blog posts that describes, with hands-on examples, the essential parts required to create an EKS cluster and manage your container environment successfully.
Kubernetes + AWS = ❤️
Over the past few years, Kubernetes has revolutionized application development and become the standard for running containerized workloads. Amazon Web Services, the biggest cloud player, has facilitated this change by creating Amazon Elastic Kubernetes Service (EKS), a managed service that simplifies the lifecycle of Kubernetes clusters.
This exercise aims to build a self-service container environment that enables rapid delivery of business features. Achieving this takes more than creating an EKS cluster and sending a kubeconfig to developers. Below is the list of six essential components that are required:
- Container orchestration
- Continuous delivery
- Observability stack — Log management
- Observability stack — Monitoring
- Edge stack
- Infrastructure as code
Let’s dive deeper into each component.
Container orchestration
The ability to deploy, run, and manage containers at scale. This is the core feature of Kubernetes, EKS, or any other cloud-native container platform. However, as practice shows, orchestration alone is not enough to create an effective environment.
Continuous delivery
Continuous delivery gives developers a convenient, automated, and ideally self-service way to release new functionality as fast as the business requires without jeopardizing application stability. The continuous delivery stack consists of three major parts.
- A version control system. Records and tracks changes in application code; it is the main source of truth and the integration point. For example, GitHub or AWS CodeCommit.
- A CI/CD engine. An automation framework that pulls changes from version control, builds artifacts, runs tests, and deploys to Kubernetes. For example, Jenkins or AWS CodePipeline.
- A templating engine. This tool is required because there is no native way to parameterize Kubernetes YAML manifests. For example, Helm or Kustomize.
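To illustrate what a templating engine does, here is a minimal sketch of a Helm deployment template; the chart layout and value names (`replicaCount`, `image.repository`, `image.tag`) are illustrative assumptions, not part of any real chart:

```yaml
# templates/deployment.yaml -- Helm renders the {{ ... }} placeholders
# from an environment-specific values file at install time.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-web
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}-web
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}-web
    spec:
      containers:
        - name: web
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
```

Running something like `helm install myapp ./chart -f values-prod.yaml` would then produce a concrete manifest per environment from a single template.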
Observability stack — Log management
Kubernetes workloads are ephemeral by nature, most often represented as short-lived containers that are deleted and recreated frequently. In this environment, the data essential for troubleshooting (i.e., logs) disappears with the container on every restart, which makes debugging extremely challenging. To solve the problem, the logs must be shipped to centralized storage. However, persisting the logs is only half the job: they also need to be indexed and made searchable. For example, developers might want to know which exact container, pod, or deployment generated an error, so appropriate search capabilities should be in place.
The log management stack consists of:
- A log collector. Captures logs from containers and sends them to the storage layer. For example, Fluentd.
- A storage and indexing layer. Accepts data from the log collector and stores it in a searchable, indexed form. For example, Elasticsearch.
- A dashboard. Visualizes the indexed data and provides querying capabilities. For example, Kibana.
Amazon simplifies the maintenance of Elasticsearch domains with Amazon Elasticsearch Service, a managed Elasticsearch/Kibana offering. Our only responsibility is to configure the log collector and make sure it can reach the Elasticsearch domain.
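As a sketch of that configuration, a Fluentd pipeline tailing container logs and shipping them to an Elasticsearch domain could look like the fragment below; the file paths and the endpoint hostname are placeholders, and the `elasticsearch` output requires the `fluent-plugin-elasticsearch` plugin:

```
# fluent.conf -- tail container logs and ship them to Elasticsearch
<source>
  @type tail
  path /var/log/containers/*.log           # where kubelet symlinks container logs
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  <parse>
    @type json
  </parse>
</source>

<match kubernetes.**>
  @type elasticsearch                      # fluent-plugin-elasticsearch
  host my-domain.us-east-1.es.amazonaws.com  # hypothetical domain endpoint
  port 443
  scheme https
  logstash_format true                     # write to time-based indices
</match>
```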
Observability stack — Monitoring
Running more than one container on a single virtual machine creates a demand for granular insights that go beyond traditional node-level metrics. In addition, the growing number of application components running separately from each other has created a need for more advanced reporting and visualization.
The monitoring stack consists of:
- A metric collection tool. For example, Prometheus.
- A metric visualization tool. For example, Grafana.
There are also out-of-the-box solutions that combine both functions; some are paid, and some are free. For example, Datadog, New Relic, and Amazon CloudWatch.
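To show how metric collection fits a dynamic cluster, here is a minimal sketch of a Prometheus scrape job that discovers pods through the Kubernetes API; the job name and the opt-in annotation convention are common practice but still assumptions about your setup:

```yaml
# prometheus.yml excerpt -- scrape pods that opt in via annotation
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod                # discover every pod via the Kubernetes API
    relabel_configs:
      # keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
```

Service discovery is what makes this work with short-lived pods: Prometheus re-resolves the target list continuously instead of relying on a static host inventory.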
Edge stack
The edge stack controls how traffic enters the cluster. This capability serves several primary purposes:
- The ability to perform controlled releases with minimal downtime using canary releases, rolling updates, or blue-green deployments.
- Security features: SSL termination, authentication, and DDoS protection.
- Advanced flow control: rate limiting, circuit breaking, and timeouts.
Edge stack consists of:
- Ingress controller. For example, Nginx Ingress Controller or ALB Ingress Controller.
- Service mesh. For example, AWS AppMesh or Istio.
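As an example of the controlled-release capability mentioned above, the NGINX Ingress Controller can split traffic between a stable and a canary release using annotations; the hostname and service names below are hypothetical:

```yaml
# A sketch of a canary Ingress: NGINX routes ~10% of requests for
# myapp.example.com to the canary Service instead of the stable one.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"
spec:
  ingressClassName: nginx
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp-canary   # hypothetical canary Service
                port:
                  number: 80
```

Raising `canary-weight` step by step lets you shift traffic gradually and roll back by deleting the canary Ingress.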
Infrastructure as code
By now, we have drafted a very complex and dynamic setup with many dependencies and parts interacting with each other on multiple layers. This complexity needs to be managed with an infrastructure-as-code solution, which allows you to:
- Create reusable configurations that can be shared and applied to multiple environments.
- Version control infrastructure and audit changes.
- Create a consistent and reliable workflow to apply changes to infrastructure safely and predictably.
I prefer HashiCorp Terraform for a few reasons; however, there are other ways to automate infrastructure, and they will be covered as well.
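To give a feel for the Terraform approach, here is a minimal sketch of an EKS control plane plus a managed node group; the cluster name, IAM role resources, and the `subnet_ids` variable are placeholders for resources you would define elsewhere:

```hcl
# Sketch: EKS control plane and a managed node group.
resource "aws_eks_cluster" "this" {
  name     = "demo-cluster"
  role_arn = aws_iam_role.cluster.arn   # hypothetical cluster IAM role

  vpc_config {
    subnet_ids = var.subnet_ids         # hypothetical input variable
  }
}

resource "aws_eks_node_group" "workers" {
  cluster_name    = aws_eks_cluster.this.name
  node_group_name = "workers"
  node_role_arn   = aws_iam_role.workers.arn  # hypothetical node IAM role
  subnet_ids      = var.subnet_ids

  scaling_config {
    desired_size = 2
    min_size     = 1
    max_size     = 4
  }
}
```

Because the node group references the cluster resource, Terraform infers the dependency graph and creates the pieces in the right order.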
Let us summarize what we have learned so far on a diagram:
But that’s not all
In addition to the points mentioned earlier, there are other smaller yet important challenges that come with using Amazon EKS as a container environment:
- How to build a data layer when persistent storage is required?
- How to scale worker nodes correctly in Kubernetes environments?
- How to handle IAM roles so that containers run with the least privileges?
- How to manage secrets?
- How to integrate services with DNS?
In this series of blog posts, I will provide answers to these questions as well.
As a next step, I will discuss how to automate the creation of an Amazon EKS control plane with a worker group that will serve as the base for our container environment. Stay tuned for the first part, which is coming soon!