1. Installation
1.1. Google Kubernetes Engine (GKE)
Create an account on GCP and follow any tutorial, for example this video workshop.
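If you prefer the command line, here is a minimal sketch using the gcloud CLI (cluster name and zone are placeholders) :
gcloud container clusters create my-cluster --zone europe-west1-b --num-nodes 3
gcloud container clusters get-credentials my-cluster --zone europe-west1-b
kubectl get nodes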
1.2. AWS / EKS
Have an admin account on AWS.
1.2.1. Manual installation
Follow the Getting Started with Amazon EKS guide of the official documentation (a CLI sketch follows this list) to :
-
Create your Amazon EKS Service Role
-
Create your Amazon EKS Cluster VPC
-
Install and Configure kubectl for Amazon EKS
-
Download and Install the Latest AWS CLI
-
Create Your Amazon EKS Cluster
-
Configure kubectl for Amazon EKS
-
Launch and Configure Amazon EKS Worker Nodes
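The same steps can be sketched with the AWS CLI; this is only an outline, where the role ARN, subnet IDs and security group ID are placeholders to replace with the resources created above :
aws eks create-cluster --name my-sweet-cluster --role-arn arn:aws:iam::<AWS_ACCOUNT>:role/<EKS_SERVICE_ROLE> --resources-vpc-config subnetIds=<subnet-1>,<subnet-2>,securityGroupIds=<sg-id>
aws eks describe-cluster --name my-sweet-cluster --query cluster.status
aws eks update-kubeconfig --name my-sweet-cluster
kubectl get svc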
1.2.2. Terraform installation
To destroy the cluster : terraform destroy --force
If needed, install Terraform using this guide :
cd /tmp/
wget https://releases.hashicorp.com/terraform/0.12.6/terraform_0.12.6_linux_amd64.zip
unzip terraform_0.12.6_linux_amd64.zip
sudo mv terraform /usr/local/bin/
terraform --version
Official repository
Using the official terraform/eks repo as a module, without cloning it.
-
Create the main config file
main.tf
terraform {
required_version = ">= 0.11.8"
}
provider "aws" {
version = ">= 2.0.0"
region = "${local.region}"
}
# data "aws_region" "current" {}
data "aws_availability_zones" "available" {}
locals {
cluster_name = "my-sweet-cluster"
region = "eu-north-1"
worker_groups = [
{
instance_type = "t3.small" # 2 vCPU, 2 GB RAM. t2 instances are not available in Stockholm (eu-north-1)
asg_desired_capacity = "1" # Desired worker capacity in the autoscaling group.
asg_max_size = "5" # Maximum worker capacity in the autoscaling group.
asg_min_size = "1" # Minimum worker capacity in the autoscaling group.
autoscaling_enabled = true # Sets whether policy and matching tags will be added to allow autoscaling.
# spot_price = "" # "0.01" or any value to use "spot" (cheap but can leave) instances
},
]
map_users = [
{
user_arn = "arn:aws:iam::<AWS_ACCOUNT>:user/my.user.one"
username = "my.user.one"
group = "system:masters"
},
{
user_arn = "arn:aws:iam::<AWS_ACCOUNT>:user/my.user.two"
username = "my.user.two"
group = "system:masters"
},
]
map_users_count = 2
tags = {
Environment = "POC"
creation-date = "${timestamp()}"
}
}
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
name = "my-sweet-cluster-vpc"
cidr = "10.0.0.0/16"
# azs = ["${local.region}a", "${local.region}b", "${local.region}c"]
azs = ["${data.aws_availability_zones.available.names[0]}", "${data.aws_availability_zones.available.names[1]}", "${data.aws_availability_zones.available.names[2]}"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
tags = "${merge(local.tags, map("kubernetes.io/cluster/${local.cluster_name}", "shared"))}"
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
cluster_name = "${local.cluster_name}"
subnets = ["${module.vpc.private_subnets}"]
tags = "${local.tags}"
vpc_id = "${module.vpc.vpc_id}"
worker_groups = "${local.worker_groups}"
map_users = "${local.map_users}"
map_users_count = "${local.map_users_count}"
worker_sg_ingress_from_port = "0" # default is 1025, which means no pod port below 1025 would be reachable
}
-
Create a minimal output file
outputs.tf
output "cluster_endpoint" {
description = "Endpoint for EKS control plane."
value = "${module.eks.cluster_endpoint}"
}
output "cluster_security_group_id" {
description = "Security group ids attached to the cluster control plane."
value = "${module.eks.cluster_security_group_id}"
}
output "kubectl_config" {
description = "kubectl config as generated by the module."
value = "${module.eks.kubeconfig}"
}
output "config_map_aws_auth" {
description = "The aws-auth ConfigMap generated by the module."
value = "${module.eks.config_map_aws_auth}"
}
-
If you enabled autoscaling, create the associated Helm values file
autoscaler.yml
#
# Config values specific to AWS/EKS
# see https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/autoscaling.md
#
rbac:
create: true
sslCertPath: /etc/ssl/certs/ca-bundle.crt
cloudProvider: aws
awsRegion: eu-north-1
autoDiscovery:
clusterName: my-sweet-cluster
enabled: true
-
If not already done, configure AWS CLI
aws configure
-
Create the cluster. It should take about 10min.
terraform init
terraform apply
-
Configure kubeconfig
-
Either with terraform output
-
terraform output kubeconfig > ~/.kube/my-sweet-cluster
export KUBECONFIG=~/.kube/my-sweet-cluster
-
Or with
aws eks
aws eks update-kubeconfig --name my-sweet-cluster
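-
Either way, a quick sanity check that kubectl now reaches the new cluster :
kubectl get nodes
kubectl get svc --all-namespaces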
-
If you configured it to use the autoscaler
-
Install and initialize helm as described in Helm.
-
Apply the helm chart
-
helm install stable/cluster-autoscaler --values=autoscaler.yml --name cas --namespace kube-system
-
Test the autoscaler
-
Scale up
-
kubectl run example --image=nginx --port=80 --replicas=50
kubectl logs -l "app.kubernetes.io/instance=cas" -f
kubectl get nodes -w
-
Scale down
kubectl delete deployment example
kubectl logs -l "app.kubernetes.io/instance=cas" -f
After 10 minutes (by default), the cluster should scale down and you should see
I0423 12:18:52.539729 1 scale_down.go:600] ip-10-0-3-163.eu-north-1.compute.internal was unneeded for 10m8.928762095s
I0423 12:18:52.539815 1 scale_down.go:600] ip-10-0-3-149.eu-north-1.compute.internal was unneeded for 10m8.928762095s
I0423 12:18:52.539884 1 scale_down.go:600] ip-10-0-1-206.eu-north-1.compute.internal was unneeded for 10m8.928762095s
I0423 12:18:52.539947 1 scale_down.go:600] ip-10-0-1-222.eu-north-1.compute.internal was unneeded for 10m8.928762095s
I0423 12:18:52.540077 1 scale_down.go:819] Scale-down: removing empty node ip-10-0-3-163.eu-north-1.compute.internal
I0423 12:18:52.540190 1 scale_down.go:819] Scale-down: removing empty node ip-10-0-3-149.eu-north-1.compute.internal
I0423 12:18:52.540261 1 scale_down.go:819] Scale-down: removing empty node ip-10-0-1-206.eu-north-1.compute.internal
I0423 12:18:52.540331 1 scale_down.go:819] Scale-down: removing empty node ip-10-0-1-222.eu-north-1.compute.internal
Alternative repository
This repository has a smaller option list and is easier to understand.
-
clone this working repo
git clone https://github.com/WesleyCharlesBlake/terraform-aws-eks.git
-
check that this issue is merged, else apply the changes locally
-
create a configuration file at root and change values if needed. See default values in
variables.tf
or on Github page.
cluster-name = "my-sweet-cluster"
k8s-version = "1.12"
aws-region = "eu-west-1"
node-instance-type = "t2.medium"
desired-capacity = "1"
max-size = "5"
min-size = "0"
-
Configure AWS to the right account
pip install --upgrade awscli
aws configure
-
Create the cluster. It should take about 10min.
terraform apply
-
Configure kubeconfig
terraform output kubeconfig > ~/.kube/trekea-cluster
export KUBECONFIG=~/.kube/trekea-cluster
or
aws eks update-kubeconfig --name my-sweet-cluster
-
Make the workers join the cluster and watch
terraform output config-map > config-map-aws-auth.yaml
kubectl apply -f config-map-aws-auth.yaml
kubectl get nodes --watch
1.2.3. Administration
Initial admin
For a user / maintainer of the cluster, here are the pre-requisites :
-
Install kubectl
-
Install aws-iam-authenticator
-
Install AWS CLI
-
Check installation
kubectl version
aws-iam-authenticator help
aws --version
-
Configure AWS and kubectl
aws iam create-access-key --user-name <my-user>
aws configure
aws eks --region <EKS_REGION> update-kubeconfig --name <CLUSTER>
Additional admins
To grant cluster rights to other admins than the cluster creator, do the following.
IAM
Go to AWS console / IAM to give EKS/ECR admin rights to the user(s).
Default policies do not cover EKS and ECR admin usage (!), so we create some custom policies (a CLI equivalent is sketched after the console steps below).
-
Create EKS policies with Policies → Create policy
-
Service =
EKS
-
Action =
All EKS actions
-
Resources =
All resources
-
-
Click Review policy
-
Name =
EKS-admin
-
-
Create EKS policies with Policies → Create policy
-
Service =
ECR
-
Action =
All ECR actions
-
Resources =
All resources
-
Name =
ECR-admin
-
-
Create a group with Groups → Create New Group
-
Group Name =
MyDevTeam
-
Choose policies :
-
IAMSelfManageServiceSpecificCredentials
-
IAMFullAccess
-
IAMUserChangePassword
-
IAMUserSSHKeys
-
EKS-admin
-
ECR-admin
-
-
Click Next then Create Group
-
If needed, create the user in AWS console / IAM. Copy the userarn for the next step.
Attach the group to the user by clicking on the user → Groups → Add user to groups → select MyDevTeam → Add to Groups.
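The same policies, group and membership can also be created from the AWS CLI; here is a minimal sketch (the policy documents are simplified, adjust them to your security requirements) :
aws iam create-policy --policy-name EKS-admin --policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Action":"eks:*","Resource":"*"}]}'
aws iam create-policy --policy-name ECR-admin --policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Action":"ecr:*","Resource":"*"}]}'
aws iam create-group --group-name MyDevTeam
aws iam attach-group-policy --group-name MyDevTeam --policy-arn arn:aws:iam::<AWS_ACCOUNT>:policy/EKS-admin
aws iam attach-group-policy --group-name MyDevTeam --policy-arn arn:aws:iam::<AWS_ACCOUNT>:policy/ECR-admin
aws iam add-user-to-group --group-name MyDevTeam --user-name another-user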
Configmap
The current admin has to update the configmap with the new user(s).
kubectl edit -n kube-system configmap/aws-auth
apiVersion: v1
data:
mapRoles: |
- rolearn: arn:aws:iam::<AWS_ACCOUNT>:role/adx-worker-nodes-NodeInstanceRole-XXXXX
username: system:node:{{EC2PrivateDNSName}}
groups:
- system:bootstrappers
- system:nodes
mapUsers: |
- userarn: arn:aws:iam::<AWS_ACCOUNT>:user/my-user
username: my-user
groups:
- system:masters
- userarn: arn:aws:iam::<AWS_ACCOUNT>:user/another-user (1)
username: another-user
groups:
- system:masters
kind: ConfigMap
[...]
(1) Add the new user(s) to the list in mapUsers
The modification is asynchronous; you may have to wait a few minutes for it to be taken into account.
Tools configuration
You now have the rights; you can then perform the operations described in Initial admin.
1.2.4. Kubernetes dashboard
Follow official guide :
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/heapster.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/influxdb.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/rbac/heapster-rbac.yaml
kubectl apply -f https://raw.githubusercontent.com/ericdahl/hello-eks/master/k8s/dashboard/eks-admin-service-account.yaml
kubectl apply -f https://raw.githubusercontent.com/ericdahl/hello-eks/master/k8s/dashboard/eks-admin-cluster-role-binding.yaml
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep eks-admin | awk '{print $1}')
This prints a token you can use to log in to the dashboard through a local proxy, started with
kubectl proxy
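With the proxy running, the dashboard of this version is usually reachable at the URL below (assuming the default kubernetes-dashboard service in kube-system); log in with the token obtained above :
http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/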
CPU / RAM
The dashboard may not show CPU and RAM metrics on EKS as of late 2018 / early 2019; see this issue for an explanation.
-
check heapster logs
kubectl get all
kubectl logs heapster-***
E1228 12:13:05.074233 1 manager.go:101] Error in scraping containers from kubelet:10.0.30.39:10255: failed to get all container stats from Kubelet URL "http://10.0.30.39:10255/stats/container/": Post http://10.0.30.39:10255/stats/container/: dial tcp 10.0.30.39:10255: getsockopt: connection refused
-
Fix the deployment by editing the yml to add some extra parameters to
--source
kubectl edit deployment heapster
spec:
  template:
    spec:
      containers:
      - command:
        - /heapster
        - --source=kubernetes:https://kubernetes.default?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250&insecure=true
        - --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
        name: heapster
-
Add a ClusterRole and ClusterRoleBinding by creating this file (heapster-node-stats.yml)
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: node-stats-full
rules:
- apiGroups: [""]
resources: ["nodes/stats"]
verbs: ["get", "watch", "list", "create"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: heapster-node-stats
subjects:
- kind: ServiceAccount
name: heapster
namespace: kube-system
roleRef:
kind: ClusterRole
name: node-stats-full
apiGroup: rbac.authorization.k8s.io
-
And applying it
kubectl apply -f heapster-node-stats.yml
Error logs should stop. After a few minutes, once enough data has been collected, the dashboard should show CPU / RAM in the All namespaces list. Otherwise, delete the kubernetes-dashboard-* pod to restart it.
1.2.5. CI pipelines
Full automation requires a technical CI user able to interact with the cluster. For this :
-
Create a technical user in IAM
-
Install, configure and use
AWS CLI
,aws-iam-authenticator
andkubectl
in your pipeline
Bitbucket
For Bitbucket, we use an AWS docker image. First add some configuration by going to your repository → Settings → Repository variables
AWS_ACCOUNT_ID        777777777777
AWS_DEFAULT_REGION    eu-west-3
AWS_ACCESS_KEY_ID     XXXXXXX
AWS_SECRET_ACCESS_KEY tPbfRc/wx3JmPp6XXXXXXXty2yFJ6wl4rZ0B/Q
Then define the pipeline
- step: &deploy-to-develop-k8s
name: Deploy to Develop on Kubernetes cluster
image: atlassian/pipelines-awscli
max-time: 5
services:
- docker
script:
- export REPOSITORY_URL=${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com
# Download the necessary tools to deploy to kubernetes
- apk add --no-cache curl
- curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
- chmod +x ./kubectl
- mv ./kubectl /usr/local/bin/kubectl
# Download aws-iam-authenticator
- curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/bin/linux/amd64/aws-iam-authenticator
- chmod +x ./aws-iam-authenticator
- mkdir $HOME/bin && cp ./aws-iam-authenticator $HOME/bin/aws-iam-authenticator && export PATH=$HOME/bin:$PATH
- echo 'export PATH=$HOME/bin:$PATH' >> ~/.bashrc
- aws eks update-kubeconfig --name ${KUBERNETES_CLUSTER_NAME}
- kubectl set image deployment/api-dpl api=${REPOSITORY_URL}/adx/adx-api:develop
- kubectl set image deployment/client-dpl client=${REPOSITORY_URL}/adx/adx-client:develop
1.3. AWS / Kubespray / Terraform
1.3.1. Prerequisites
-
Install pip3 (Python 3) or pip (Python 2)
sudo apt update
sudo apt install python3-pip
pip3 --version
-
clone kubespray repository
git clone https://github.com/kubernetes-sigs/kubespray.git
-
Install the requirements using pip3
pip3 install -r requirements.txt
-
This installs Ansible but not Terraform, which we will use to generate the
hosts.ini
file used by Ansible. Let’s install it following this guide.
cd /tmp/
wget https://releases.hashicorp.com/terraform/0.11.11/terraform_0.11.11_linux_amd64.zip
unzip terraform_0.11.11_linux_amd64.zip
sudo mv terraform /usr/local/bin/
terraform --version
-
Create a key pair in AWS Console
-
Services → Network & Security → Key Pairs → Create Key Pair
-
Save it and change rights
-
cp my-private-key.pem ~/.ssh
chmod 700 ~/.ssh/my-private-key.pem
1.3.2. Terraform : EC2 servers and host.ini creation
We will use Terraform to generate the hosts.ini file.
-
Go to your cloned kubespray project
-
Create the file
credentials.tfvars
#AWS Access Key
AWS_ACCESS_KEY_ID = "XXXXXXXXX"
#AWS Secret Key
AWS_SECRET_ACCESS_KEY = "YYYYYYYYYYYYYYYYYYY"
#EC2 SSH Key Name
AWS_SSH_KEY_NAME = "my-key-pair-name"
#AWS Region
AWS_DEFAULT_REGION = "eu-west-3"
-
Copy the file
terraform.tfvars.example
to create one that suits your needs
#Global Vars
aws_cluster_name = "mycluster"
#VPC Vars
aws_vpc_cidr_block = "10.250.192.0/18"
aws_cidr_subnets_private = ["10.250.192.0/20","10.250.208.0/20"]
aws_cidr_subnets_public = ["10.250.224.0/20","10.250.240.0/20"]
#Bastion Host
aws_bastion_size = "t2.medium"
#Kubernetes Cluster
aws_kube_master_num = 1
aws_kube_master_size = "t2.medium"
aws_etcd_num = 1
aws_etcd_size = "t2.medium"
aws_kube_worker_num = 1
aws_kube_worker_size = "t2.medium"
#Settings AWS ELB
aws_elb_api_port = 6443
k8s_secure_api_port = 6443
kube_insecure_apiserver_address = "0.0.0.0"
default_tags = {
Env = "mycluster"
App = "mycluster-my-app"
Product = "kubernetes"
}
inventory_file = "../../../inventory/mycluster/hosts.ini"
-
Optional : if you want Ubuntu/Debian images for your cluster (instead of CoreOS), change this in
variables.tf
data "aws_ami" "distro" {
most_recent = true
filter {
name = "name"
values = ["ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-*"]
# values = ["debian-stretch-hvm-x86_64-gp2-2018-11-10-63975-572488bb-fc09-4638-8628-e1e1d26436f4-ami-0f4768a55eaaac3d7.4"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
owners = ["099720109477"] #ubuntu
# owners = ["679593333241"] #debian
}
-
Initialize the inventory folder (where hosts.ini will be written in the next step)
mv inventory/sample inventory/mycluster
-
Initialize Terraform and apply
cd contrib/terraform/aws
terraform init
terraform plan --var-file=credentials.tfvars
terraform apply --var-file=credentials.tfvars
-
Edit the generated hosts.ini file to put the internal DNS names instead of the generated names (see the names in the AWS console)
[all]
ip-10-250-206-126.eu-west-3.compute.internal ansible_host=10.250.206.126
ip-10-250-201-250.eu-west-3.compute.internal ansible_host=10.250.201.250
ip-10-250-204-239.eu-west-3.compute.internal ansible_host=10.250.204.239
bastion ansible_host=35.180.250.230
bastion ansible_host=35.180.55.194
[bastion]
bastion ansible_host=35.180.250.230
bastion ansible_host=35.180.55.194
[kube-master]
ip-10-250-206-126.eu-west-3.compute.internal
[kube-node]
ip-10-250-201-250.eu-west-3.compute.internal
[etcd]
ip-10-250-204-239.eu-west-3.compute.internal
[k8s-cluster:children]
kube-node
kube-master
[k8s-cluster:vars]
apiserver_loadbalancer_domain_name="kubernetes-elb-mycluster-*****.eu-west-3.elb.amazonaws.com"
Yes, there are two bastion entries with the same name; you can rename them if you want both to be configured.
-
Optional : to restart from scratch (helpful when you have messed around a lot with Ansible)
terraform destroy --var-file=credentials.tfvars
terraform apply --var-file=credentials.tfvars
1.3.3. Ansible : Cluster creation and configuration
-
Change some configuration values (in the files under inventory/mycluster/group_vars/)
cloud_provider: aws
cluster_name: mycluster
# Make a copy of kubeconfig on the host that runs Ansible in {{ inventory_dir }}/artifacts
kubeconfig_localhost: true
# Download kubectl onto the host that runs Ansible in {{ bin_dir }}
kubectl_localhost: true
-
Go to the project root directory and launch Ansible
cd [..]/kubespray
ansible-playbook -vvvv -i ./inventory/mycluster/hosts.ini ./cluster.yml -e ansible_user=core -b --become-user=root --flush-cache --private-key=~/.ssh/my-private-key.pem -e ansible_ssh_private_key_file=~/.ssh/my-private-key.pem 2>&1 | tee "ansible_$(date +"%Y-%m-%d_%I-%M-%p").log"
-
With the minimal configuration used above, the playbook should take about 10 minutes. Use this time to watch what is happening on the master
ssh -F ./ssh-bastion.conf core@<master-ip> -i ~/.ssh/my-private-key.pem
journalctl -f
-
When finished, check the cluster by connecting to the master
ssh -F ./ssh-bastion.conf core@<master-ip> -i ~/.ssh/my-private-key.pem
sudo kubectl --kubeconfig=/etc/kubernetes/admin.conf get nodes
1.3.4. Kubectl configuration behind a bastion
-
Get the master and bastion IPs from ./inventory/my-cluster/hosts.ini or ./ssh-bastion.conf
-
Forward cluster API port to your local machine
ssh -L 6443:<MASTER-IP>:6443 ubuntu@<BASTION-IP> -N -i ~/.ssh/traffic_staging_k8s_fr.pem
-
Use the admin.conf generated by the Ansible playbook as a kubeconfig file, replacing the master IP with localhost. Suggested approach : copy it and symlink it to the default kubeconfig path, so it will not be overwritten by a later playbook run (see the sketch after the commands below).
cp ./inventory/my-cluster/artifacts/admin.conf ./inventory/my-cluster/artifacts/kubeconfig
mv ~/.kube/config ~/.kube/config.old
ln -s "$(pwd)/inventory/my-cluster/artifacts/kubeconfig" ~/.kube/config
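The IP replacement itself can be scripted; a minimal sed sketch, assuming the server entry in the copied kubeconfig uses port 6443 :
sed -i 's#server: https://.*:6443#server: https://localhost:6443#' ./inventory/my-cluster/artifacts/kubeconfig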
A non-updated kubeconfig can lead to the following error on kubectl usage : Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
1.3.5. Prepare bastion for CI usage
You can use the bastion as a server for the CI to interact with the cluster.
-
connect to bastion
ssh-add ~/.ssh/cluster-ssh-key.pem
ssh BASTION_USER@BASTION_IP
-
install docker
sudo apt-get remove docker docker-engine docker.io containerd runc
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg2 software-properties-common
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
-
install kubectl
sudo apt-get update && sudo apt-get install -y apt-transport-https
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubectl
-
create kubeconfig with content of admin.conf
mkdir .kube
vi .kube/config
-
add CI runner public ssh key to authorized_keys
vi ~/.ssh/authorized_keys
-
download/create the key file to be used to connect to a cluster node
vi ~/.ssh/cluster-key.pem
chmod 400 ~/.ssh/cluster-key.pem
-
test the ssh connection to the node (and then accept new host)
ssh admin@$MASTER_IP -i ~/.ssh/cluster-key.pem
2. Tips
2.1. Kubectl autocompletion
You can add autocompletion to kubectl commands.
sudo apt-get install bash-completion
kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl >/dev/null
2.2. Kubectl aliases
This article mentions this repo for an alias list. Let's use this fork, which has more aliases.
-
Clone the alias file
curl -o ~/.kubectl_aliases https://raw.githubusercontent.com/jessestuart/kubectl-aliases/master/.kubectl_aliases
-
Replace kubectl with kctl, so the aliases go through the print function defined below
sed -i 's/kubectl/kctl/g' ~/.kubectl_aliases
-
Add it to .bashrc, together with the command-print function and some custom aliases
cat <<EOT >> ~/.bashrc
[ -f ~/.kubectl_aliases ] && source ~/.kubectl_aliases
# k8s change namespace
alias kn='kctl config set-context $(kubectl config current-context) --namespace'
# k8s get events sorted by last time
alias kgel='kubectl get events --sort-by=.lastTimestamp'
# k8s get events sorted by creation time
alias kgec='kubectl get events --sort-by=.metadata.creationTimestamp'
# print kubectl command and then execute it
function kctl() {
echo "+ kubectl $@";
command kubectl $@;
}
EOT
2.3. Latest kubectl patch
To get the latest kubectl patch for a Kubernetes major.minor version, visit the release page of Kubectl Github repo.
2.4. Tools
2.4.1. K9S : terminal UI
-
Installation
cd /tmp/
curl -L https://github.com/derailed/k9s/releases/download/0.5.2/k9s_0.5.2_Linux_x86_64.tar.gz | tar zx
sudo mv k9s /usr/bin/
2.4.2. Kubectx : context switching
-
Demo
-
Installation
sudo curl -o /usr/bin/kubectx https://raw.githubusercontent.com/ahmetb/kubectx/master/kubectx
sudo chmod +x /usr/bin/kubectx
-
Autocompletion
sudo apt-get install bash-completion
sudo curl -o /etc/bash_completion.d/kubectx https://raw.githubusercontent.com/ahmetb/kubectx/master/completion/kubectx.bash
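-
A short usage sketch (context names come from your kubeconfig; my-sweet-cluster is just an example) :
kubectx                      # list available contexts
kubectx my-sweet-cluster     # switch to a given context
kubectx -                    # switch back to the previous context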
2.4.3. Kubens : namespace switching
-
Demo
-
Installation
sudo curl -o /usr/bin/kubens https://raw.githubusercontent.com/ahmetb/kubectx/master/kubens
sudo chmod +x /usr/bin/kubens
-
Autocompletion
Autocompletion is not working under AWS.
sudo apt-get install bash-completion
sudo curl -o /etc/bash_completion.d/kubens https://raw.githubusercontent.com/ahmetb/kubectx/master/completion/kubens.bash
2.4.4. Kubeval : manifest validator
kubeval is a tool for validating a Kubernetes YAML or JSON configuration file.
cd /tmp
wget https://github.com/instrumenta/kubeval/releases/latest/download/kubeval-linux-amd64.tar.gz
tar xf kubeval-linux-amd64.tar.gz
sudo cp kubeval /usr/bin
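A quick usage sketch; the manifest path is just an example (any Kubernetes YAML or JSON file works, e.g. the deployment files from the project example later in this document) :
kubeval db.svc-dpl.yml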
2.4.5. Stern : multi-pods tailing
-
Installation
sudo curl -L -o /usr/bin/stern https://github.com/wercker/stern/releases/download/1.10.0/stern_linux_amd64
sudo chmod +x /usr/bin/stern
-
Example
stern --all-namespaces . --since 1m
2.4.6. Kubectl-debug : sidecar debugging pod
-
Demo
kubectl-debug is an out-of-tree solution for troubleshooting running pods, which allows you to run a new container in running pods for debugging purposes. Follow the official instructions to install it.
2.4.7. Microk8s
Running multiple nodes on a single host is possible, yet a bit tricky; see this issue.
-
Installation
sudo snap install microk8s --classic
sudo usermod -a -G microk8s $USER
su - $USER
-
To use it like a regular cluster among others, add this to your
.bashrc
:
export PATH=/snap/bin:$PATH
microk8s.kubectl config view --raw > $HOME/.kube/microk8s.config
export KUBECONFIG=$HOME/.kube/config
export KUBECONFIG=$KUBECONFIG:$HOME/.kube/microk8s.config
-
To enable dns, storage, ingress and dashboard, and get the dashboard token :
microk8s.enable dns storage ingress dashboard
token=$(microk8s.kubectl -n kube-system get secret | grep default-token | cut -d " " -f1)
microk8s.kubectl -n kube-system describe secret $token
kubectl proxy
Now you can serve the dashboard with kubectl proxy and then browse it using the token.
-
To list enabled add-ons
microk8s.status
-
To delete everything and reset MicroK8s
microk8s.reset
2.5. Useful commands
-
Switch namespace
kubectl config set-context $(kubectl config current-context) --namespace=kube-system
-
View an applied yaml
kubectl apply view-last-applied svc/client
-
Get pod’s name
kubectl get pods --show-labels
kubectl get pod -l app=sonarqube -o jsonpath="{.items[0].metadata.name}"
-
Decode a secret
kubectl get secrets/my-secret --template='{{.data.password}}' | base64 --decode
2.6. ClusterIP vs NodePort vs Ingress
Great article on the subject.
2.7. Restart deployment pods
To get fresh pods, you can kill them, but you can also scale the deployment down and back up.
This induces a service interruption.
kubectl scale deployment kubernetes-dashboard --replicas=0
deployment.extensions "kubernetes-dashboard" scaled
kubectl scale deployment kubernetes-dashboard --replicas=1
deployment.extensions "kubernetes-dashboard" scaled
2.8. Getting logged in to ECR for image pulling
aws ecr get-login
docker login -u AWS -p <PASSWORD> -e none https://<AWS_ACCOUNT>.dkr.ecr.<EKS_REGION>.amazonaws.com
docker login -u AWS -p <PASSWORD> https://<AWS_ACCOUNT>.dkr.ecr.<EKS_REGION>.amazonaws.com
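To avoid copying the password by hand, the printed login command can be executed directly; a sketch assuming AWS CLI v1, whose --no-include-email flag drops the deprecated -e option :
eval $(aws ecr get-login --no-include-email --region <EKS_REGION>)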
2.9. Force updating image with same tag
2.9.1. Manually
The ECR secret has to have been updated within the last 12 hours.
To force an image update when the deployment is configured with imagePullPolicy: Always, deleting the pod will make the replacement pod pull the fresh image.
kubectl get pods -w
NAME                          READY   STATUS    RESTARTS   AGE
api-dpl-6b5698b848-vddc8      1/1     Running   0          2m
client-dpl-69786bdd8f-5zcnd   1/1     Running   0          13d
db-dpl-6874657d-w6mzb         1/1     Running   0          27m
kibana-dpl-55fdf8776f-k45pm   1/1     Running   0          26m
kubectl delete pod api-dpl-6b5698b848-vddc8
pod "api-dpl-6b5698b848-vddc8" deleted
kubectl get pods -w
NAME                          READY   STATUS              RESTARTS   AGE
api-dpl-6b5698b848-bthbs      1/1     Running             0          13d
client-dpl-69786bdd8f-5zcnd   1/1     Running             0          13d
db-dpl-6874657d-w6mzb         1/1     Running             0          54s
kibana-dpl-55fdf8776f-k45pm   0/1     ContainerCreating   0          3s
kibana-dpl-8d76c6dd8-cmrvz    0/1     Terminating         0          14d
kibana-dpl-8d76c6dd8-cmrvz    0/1     Terminating         0          14d
kibana-dpl-8d76c6dd8-cmrvz    0/1     Terminating         0          14d
kibana-dpl-55fdf8776f-k45pm   1/1     Running             0          13s
kubectl describe pod api-dpl-6b5698b848-vddc8
Events:
  Type    Reason     Age   From                           Message
  ----    ------     ----  ----                           -------
  Normal  Scheduled  38s   default-scheduler              Successfully assigned develop/api-dpl-6b5698b848-vddc8 to ***.compute.internal
  Normal  Pulling    37s   kubelet, ***.compute.internal  pulling image "694430786501.dkr.ecr.eu-west-3.amazonaws.com/adx/adx-api:develop"
  Normal  Pulled     14s   kubelet, ***.compute.internal  Successfully pulled image "694430786501.dkr.ecr.eu-west-3.amazonaws.com/adx/adx-api:develop"
  Normal  Created    13s   kubelet, ***.compute.internal  Created container
  Normal  Started    13s   kubelet, ***.compute.internal  Started container
kubectl logs api-dpl-6b5698b848-vddc8
2019-02-08 11:04:50.939  INFO 1 --- [ main] o.s.s.concurrent.ThreadPoolTaskExecutor : Initializing ExecutorService 'applicationTaskExecutor'
2019-02-08 11:04:51.578  INFO 1 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): XXX (http) with context path ''
2019-02-08 11:04:51.585  INFO 1 --- [ main] com.biomerieux.adxapi.AdxApiApplication : Started AdxApiApplication in 9.462 seconds (JVM running for 10.442)
2.9.2. In CI with deployment file
-
Add a variable in deployment file
spec:
replicas: 1
selector:
matchLabels:
app: sltback
template:
metadata:
labels:
app: sltback
# to force rollout on apply
commit: ${CI_COMMIT_SHA} (1)
(1) Unique value tied to the CI run, for example the commit ID, here using a GitLab CI variable
-
During CI, replace the value
apk --update add gettext # or apt-get, depending on the OS
envsubst < "my-app-with-variable.svc-dpl.yml" > "my-app.svc-dpl.yml"
-
Apply the deployment file
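Assuming the substituted file name from the previous step, applying it is a single command :
kubectl apply -f my-app.svc-dpl.yml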
2.10. Example application
Here is a sample application, k8s-workshop, backed by a youtube video.
It’s a scalable webapp with a Redis cluster. To have some replicas at start, edit auto scaling files.
git clone https://github.com/reactiveops/k8s-workshop.git
kubectl create namespace k8s-workshop
kubectl label namespace k8s-workshop istio-injection=enabled
cd k8s-workshop/complete
kubectl apply -f 01_redis/
kubectl apply -f 02_webapp/
kubectl get pods --watch
NAME                             READY   STATUS    RESTARTS   AGE
redis-primary-7566957b9c-6rzb6   1/1     Running   0          4h10m
redis-replica-7fd949b9d-db2rf    1/1     Running   0          4h10m
redis-replica-7fd949b9d-rz7nb    1/1     Running   0          108m
webapp-5498668448-8hcgq          1/1     Running   0          4h10m
webapp-5498668448-ghv22          1/1     Running   0          107m
webapp-5498668448-jdx9j          1/1     Running   0          107m
3. Helm
Helm helps you manage Kubernetes applications — Helm Charts help you define, install, and upgrade even the most complex Kubernetes application.
3.1. Installation
curl -s https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
3.2. Initialization for current cluster
kubectl create serviceaccount tiller -n kube-system
kubectl create clusterrolebinding tiller-is-admin --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller --upgrade
If Helm was already initialized by mistake, the --upgrade flag in the command above re-initializes Tiller.
3.2.1. Incubator repository
helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
helm repo update
3.3. Chart installation example
helm install stable/my-chart --name mc
If no name is provided, an auto-generated name is used, like wonderful-rabbit. It is prefixed to all objects, so it is better to keep it short by providing one.
3.4. Chart uninstallation
helm delete --purge mc
3.5. Multi-tenant
By default, Helm cannot install multiple charts with the same release name, which can become a problem when hosting multiple environments in the same cluster. The solution is to have multiple Tillers, one in each namespace/environment, as sketched below.
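A minimal sketch of a second, namespace-scoped Tiller dedicated to one environment (the develop namespace and release names are examples; stable/my-chart is the placeholder chart used above) :
kubectl create namespace develop
kubectl create serviceaccount tiller-develop -n develop
kubectl create rolebinding tiller-develop-admin --clusterrole=admin --serviceaccount=develop:tiller-develop -n develop
helm init --service-account tiller-develop --tiller-namespace develop
helm install stable/my-chart --name mc-develop --namespace develop --tiller-namespace develop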
4. NGINX Ingress Controller
helm install stable/nginx-ingress --name ngc --namespace kube-system
Then get the public DNS with :
kubectl get svc
This works out of the box with AWS/EKS.
4.1. Cert manager
For TLS, a certificate manager is needed. The steps are taken from the official install documentation.
-
Install the manager
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.7/deploy/manifests/00-crds.yaml
kubectl create namespace cm
kubectl label namespace cm certmanager.k8s.io/disable-validation=true
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install --name cm --namespace cm --version v0.7.0 jetstack/cert-manager
-
Check installation using official procedure.
5. Istio
Connect, secure, control, and observe services in a Kubernetes Cluster.
5.1. Prerequisites
Have Helm installed, both locally and on the cluster.
5.2. Installation
-
Download and initialise Istio using Helm
curl -L https://git.io/getLatestIstio | ISTIO_VERSION=1.1.0 sh -
cd istio-1.1.0
export PATH=$PWD/bin:$PATH
helm install install/kubernetes/helm/istio-init --name istio-init --namespace istio
-
Check that the command below returns 53 (since cert-manager is not enabled; otherwise it would be 58)
kubectl get crds | grep 'istio.io\|certmanager.k8s.io' | wc -l
-
Finish installation with Helm
helm template install/kubernetes/helm/istio --name istio --namespace istio --values install/kubernetes/helm/istio/values-istio-demo.yaml | kubectl apply -f -
-
Check Istio pods startup
kubectl get pods -n istio -w
NAME                                      READY   STATUS      RESTARTS   AGE
grafana-57586c685b-m67t6                  1/1     Running     0          2d19h
istio-citadel-645ffc4999-7g9rl            1/1     Running     0          2d19h
istio-cleanup-secrets-1.1.0-ks4fb         0/1     Completed   0          2d19h
istio-egressgateway-5c7fd57fdb-spwlp      1/1     Running     0          2d19h
istio-galley-978f9447f-zj8pd              1/1     Running     0          2d19h
istio-grafana-post-install-1.1.0-wjn4m    0/1     Completed   0          2d19h
istio-ingressgateway-8ccdc79bc-p67np      1/1     Running     0          2d19h
istio-init-crd-10-t6pwq                   0/1     Completed   0          2d19h
istio-init-crd-11-j788x                   0/1     Completed   0          2d19h
istio-pilot-679c6b45b8-5fbw7              2/2     Running     0          2d19h
istio-policy-fccd56fd-8qtlb               2/2     Running     2          2d19h
istio-security-post-install-1.1.0-p9k29   0/1     Completed   0          2d19h
istio-sidecar-injector-6dcc9d5c64-szrcw   1/1     Running     0          2d19h
istio-telemetry-9bcfc78bd-mfwsh           2/2     Running     1          2d19h
istio-tracing-656f9fc99c-r9n7d            1/1     Running     0          2d19h
kiali-69d6978b45-t7tdn                    1/1     Running     0          2d19h
prometheus-66c9f5694-c548z                1/1     Running     0          2d19h
5.3. Activation for a namespace
Namespaces have to be explicitly labelled to be monitored by Istio
kubectl label namespace k8s-workshop istio-injection=enabled
5.4. UIs
If your cluster is behind a bastion, forward cluster API port to your local machine
ssh -L 6443:10.250.202.142:6443 ubuntu@35.180.231.227 -N -i ~/.ssh/traffic_staging_k8s_fr.pem
5.4.1. Kiali
The Kiali project answers the questions: what microservices are part of my Istio service mesh, and how are they connected?
-
forward the port
kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=kiali -o jsonpath='{.items[0].metadata.name}') 20001:20001 &
-
access UI at http://localhost:20001
5.4.2. Jaeger / OpenTracing
Open source, end-to-end distributed tracing. Monitor and troubleshoot transactions in complex distributed systems.
-
forward the port
kubectl port-forward -n istio-system $(kubectl get pod -n istio-system -l app=jaeger -o jsonpath='{.items[0].metadata.name}') 16686:16686 &
-
access UI at http://localhost:16686
5.4.3. Grafana & Prometheus
Grafana in front of Prometheus, for analytics and monitoring the cluster.
-
forward the port
kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=grafana -o jsonpath='{.items[0].metadata.name}') 3000:3000 &
6. Powerfulseal
PowerfulSeal adds chaos to your Kubernetes clusters, so that you can detect problems in your systems as early as possible. It kills targeted pods and takes VMs up and down.
PowerfulSeal uses the collected cluster IPs, so if the cluster is behind a bastion, installation and usage must happen on the bastion.
-
Install Powerfulseal
sudo pip install powerfulseal
-
Create a policy file. Here we focus on pod failures
config:
minSecondsBetweenRuns: 1
maxSecondsBetweenRuns: 30
nodeScenarios: []
podScenarios:
- name: "delete random pods in default namespace"
match:
- namespace:
name: "k8s-workshop"
filters:
- randomSample:
size: 1
actions:
- kill:
probability: 0.77
force: true
-
Launch powerfulseal
seal autonomous --kubeconfig ~/.kube/config --no-cloud --inventory-kubernetes --ssh-allow-missing-host-keys --remote-user ubuntu --ssh-path-to-private-key ~/.ssh/traffic_staging_k8s_fr.pem --policy-file ~/seal-policy.yml --host 0.0.0.0 --port 30100
-
watch your pods failing and restarting
kubectl get pods --watch
NAME                             READY   STATUS    RESTARTS   AGE
redis-primary-7566957b9c-6rzb6   1/1     Running   1          4h10m
redis-replica-7fd949b9d-db2rf    1/1     Running   0          4h10m
redis-replica-7fd949b9d-rz7nb    1/1     Running   0          108m
webapp-5498668448-8hcgq          1/1     Running   3          4h10m
webapp-5498668448-ghv22          1/1     Running   1          107m
webapp-5498668448-jdx9j          1/1     Running   1          107m
7. Troubleshooting
7.1. User "system:anonymous" cannot get resource
With some scripts under AWS, you can get this error :
User \"system:anonymous\" cannot get resource
This command grants admin rights to anonymous users :
kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous
Do this only temporarily during installation, and only if your cluster is not accessible from the internet.
8. Full project example
8.2. Environments
The AWS cluster will host multiple environments, so we first create and use a develop
namespace :
kubectl create namespace develop
kubectl config current-context
arn:aws:eks:<EKS_REGION>:<AWS_ACCOUNT>:cluster/<CLUSTER>
kubectl config set-context arn:aws:eks:<EKS_REGION>:<AWS_ACCOUNT>:cluster/<CLUSTER> --namespace=develop
8.3. Deployments
Kubernetes deployments and services are stored in the same file for each module.
8.3.1. Elasticsearch
We start with the elasticsearch database.
-
This is the OSS image, simpler, no need for X-Pack
-
Note the system command in
initContainers
section
Deployment file
db.svc-dpl.yml
#
# database (elasticsearch) service and deployment
#
apiVersion: v1
kind: Service
metadata:
name: elasticsearch
labels:
app: db
tier: backend
group: adx
spec:
ports:
- port: 9200 # External port
targetPort: http # Port exposed by the pod/container from the deployment
selector:
app: db
group: adx
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: db-dpl
labels:
app: db
tier: backend
group: adx
spec:
replicas: 1
template:
metadata:
labels:
app: db
tier: backend
group: adx
spec:
initContainers:
- name: "sysctl"
image: "busybox"
imagePullPolicy: "IfNotPresent"
command: ["sysctl", "-w", "vm.max_map_count=262144"]
securityContext:
privileged: true
containers:
- name: elasticsearch
image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.6.0
imagePullPolicy: "IfNotPresent"
ports:
- containerPort: 9200
name: http
env:
- name: ES_JAVA_OPTS
value: "-Xms512m -Xmx512m"
- name: discovery.type
value: single-node
resources:
limits:
memory: 1024Mi
requests:
memory: 512Mi
Commands
Launch (or update) the deployment :
kubectl apply -f db.svc-dpl.yml
service/elasticsearch created
deployment.extensions/db-dpl created
kubectl get rs
NAME                DESIRED   CURRENT   READY   AGE
db-dpl-5c767f46c7   1         1         1       32m
kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
db-dpl-5c767f46c7-tkqkv   1/1     Running   0          32m
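To quickly check that Elasticsearch answers, a port-forward plus curl sketch (run the port-forward in a separate shell) :
kubectl port-forward svc/elasticsearch 9200:9200
curl http://localhost:9200/_cluster/health?pretty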
8.3.2. Kibana
Kibana is included, only for elasticsearch administration in test environments.
-
This is the OSS image, simpler, no need for X-Pack
-
This will not be accessible from the external network, for security reasons
Deployment file
kibana.svc-dpl.yml
#
# kibana (elastic admin) service and deployment
#
apiVersion: v1
kind: Service
metadata:
name: kibana
labels:
app: kibana
tier: backend
group: adx
spec:
# for protection, no service type and no external port; access goes through proxy/port-forward
# type: NodePort # Make the service externally visible via the node
ports:
- port: 5601 # External port
targetPort: http # Port exposed by the pod/container from the deployment
selector:
app: kibana
group: adx
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: kibana-dpl
labels:
app: kibana
tier: backend
group: adx
spec:
replicas: 1
template:
metadata:
labels:
app: kibana
tier: backend
group: adx
spec:
containers:
- name: kibana
image: docker.elastic.co/kibana/kibana-oss:6.6.0
env:
- name: ELASTICSEARCH_URL
value: http://elasticsearch:9200
resources:
limits:
memory: 512Mi
requests:
memory: 256Mi
ports:
- containerPort: 5601
name: http
Commands
Launch (or update) the deployment :
kubectl apply -f kibana.svc-dpl.yml
To access the UI, we use port forwarding in a dedicated shell :
kubectl port-forward svc/kibana 5601:5601
The Kibana UI is now available at http://localhost:5601
8.3.3. Api / backend
-
The backend is pulled from AWS/ECR registry
Prerequisites
-
Get the full image name in ECR
-
Go to the AWS admin UI
-
Choose the zone containing your registry
-
Services → ECR → api repository
-
Get the
Image URI
-
-
get the registry password
aws ecr get-login
docker login -u AWS -p <PASSWORD> -e none https://<AWS_ACCOUNT>.dkr.ecr.<EKS_REGION>.amazonaws.com
-
create/update the secret using it
kubectl delete secret ecr-registry-secret
kubectl create secret docker-registry ecr-registry-secret --docker-username="AWS" --docker-password="<PASSWORD>" --docker-server="<AWS_ACCOUNT>.dkr.ecr.<EKS_REGION>.amazonaws.com" --docker-email="my.email@my-provider.com"
It is valid for 12 hours.
Now we can update the file and deploy it.
Deployment file
api.svc-dpl.yml
#
# api (back-end) service and deployment
#
apiVersion: v1
kind: Service
metadata:
name: api
labels:
app: api
tier: backend
group: adx
spec:
ports:
- port: 8080 # External port
targetPort: http # Port exposed by the pod/container from the deployment
selector:
app: api
group: adx
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: api-dpl
labels:
app: api
tier: backend
group: adx
spec:
replicas: 1
template:
metadata:
labels:
app: api
tier: backend
group: adx
spec:
# initContainers:
# - name: "sysctl"
# image: "busybox"
# imagePullPolicy: "IfNotPresent"
# command: ["curl", "-XGET", "http://elasticsearch:9200/_cluster/health?pretty=true"]
containers:
- name: api
image: ***.dkr.ecr.eu-west-3.amazonaws.com/adx/adx-api:develop
ports:
- containerPort: 8080
name: http
env:
- name: ELASTICSEARCH_REST_URIS
value: http://elasticsearch:9200
imagePullPolicy: Always
# imagePullSecrets:
# - name: ecr-registry-secret
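A minimal sketch of applying and checking this deployment, mirroring the database commands above (the label selector comes from the manifest) :
kubectl apply -f api.svc-dpl.yml
kubectl get pods -l app=api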
8.3.4. Client / frontend
Prerequisites
Same as Api module.
Deployment file
client.ing-svc-dpl.yml
#
# client (front-end) ingress, service and deployment
#
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: client-demo
spec:
# tls:
# - hosts:
# - cafe.example.com
# secretName: cafe-secret
rules:
- host: demo.afq.link
http:
paths:
- path: /
backend:
serviceName: client
servicePort: 10080
---
apiVersion: v1
kind: Service
metadata:
name: client
labels:
app: client
tier: frontend
group: adx
spec:
type: LoadBalancer # Make the service visible to the world
ports:
- port: 10080 # External port
targetPort: http # Port exposed by the pod/container from the deployment
selector:
app: client
group: adx
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: client-dpl
labels:
app: client
tier: frontend
group: adx
spec:
replicas: 1
template:
metadata:
labels:
app: client
tier: frontend
group: adx
spec:
containers:
- name: client
image: 694430786501.dkr.ecr.eu-west-3.amazonaws.com/adx/adx-client:develop
ports:
- containerPort: 80
name: http
imagePullPolicy: Always
# imagePullSecrets:
# - name: ecr-registry-secret
Access the frontend in a browser
-
Get the host/port
kubectl get services -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
adx-api ClusterIP 10.100.78.159 <none> 8080/TCP 2h app=api,group=adx
client LoadBalancer 10.100.145.183 <host>.us-east-2.elb.amazonaws.com 10080:30587/TCP 2h app=client,group=adx
elasticsearch ClusterIP 10.100.15.82 <none> 9200/TCP 23h app=db,group=adx
kibana ClusterIP 10.100.114.147 <none> 5601/TCP 23h app=kibana,group=adx
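Once the EXTERNAL-IP column shows the ELB host name, the frontend should answer on the service port; a quick check with curl (the host name is the one reported above) :
curl -I http://<host>.us-east-2.elb.amazonaws.com:10080/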