Kubernetes v1.10 (stable)
kubeadm init and kubeadm join together provide a nice user experience for creating a best-practice but bare Kubernetes cluster from scratch.
However, it might not be obvious how kubeadm does that.
This document provides additional details on what happens under the hood, with the aim of sharing knowledge on Kubernetes cluster best practices.
The cluster that kubeadm init and kubeadm join set up should be:
- kubeadm init
- export KUBECONFIG=/etc/kubernetes/admin.conf
- kubectl apply -f <network-of-choice.yaml>
- kubeadm join --token <token> <master-ip>:<master-port>

In order to reduce complexity and to simplify development of deployment solutions built on top of kubeadm, kubeadm uses a limited set of constant values for well-known paths and file names.
The Kubernetes directory /etc/kubernetes is a constant in the application, since it is clearly the given path in a majority of cases, and the most intuitive location; other constant paths and file names are:
- /etc/kubernetes/manifests as the path where the kubelet should look for static Pod manifests. Names of static Pod manifests are:
  - etcd.yaml
  - kube-apiserver.yaml
  - kube-controller-manager.yaml
  - kube-scheduler.yaml
- /etc/kubernetes/ as the path where kubeconfig files with identities for control plane components are stored. Names of kubeconfig files are:
  - kubelet.conf (bootstrap-kubelet.conf during TLS bootstrap)
  - controller-manager.conf
  - scheduler.conf
  - admin.conf for the cluster admin and kubeadm itself
- Names of certificate and key files:
  - ca.crt, ca.key for the Kubernetes certificate authority
  - apiserver.crt, apiserver.key for the API server certificate
  - apiserver-kubelet-client.crt, apiserver-kubelet-client.key for the client certificate used by the API server to connect to the kubelets securely
  - sa.pub, sa.key for the key used by the controller manager when signing ServiceAccount tokens
  - front-proxy-ca.crt, front-proxy-ca.key for the front proxy certificate authority
  - front-proxy-client.crt, front-proxy-client.key for the front proxy client

The kubeadm init internal workflow consists of a sequence of atomic work tasks to perform, as described in kubeadm init.
The kubeadm init phase command allows users to invoke each task individually, and ultimately offers a reusable and composable API/toolbox that can be used by other Kubernetes bootstrap tools, by any IT automation tool, or by advanced users for creating custom clusters.
Kubeadm executes a set of preflight checks before starting the init, with the aim of verifying preconditions and avoiding common cluster startup problems.
The user can skip specific preflight checks (or, if needed, all of them) with the --ignore-preflight-errors option.
The conditions checked include:
- The Kubernetes version to use (specified with the --kubernetes-version flag) is at least one minor version higher than the kubeadm CLI version.
- The /etc/kubernetes/manifests folder already exists and is not empty.
- The /proc/sys/net/bridge/bridge-nf-call-iptables file does not exist or does not contain 1.
- The /proc/sys/net/bridge/bridge-nf-call-ip6tables file does not exist or does not contain 1.
- The ip, iptables, mount, nsenter commands are not present in the command path.
- The ebtables, ethtool, socat, tc, touch, crictl commands are not present in the command path.

Please note that:
- Preflight checks can be invoked individually with the kubeadm init phase preflight command

Kubeadm generates certificate and private key pairs for different purposes:
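- A certificate authority for the Kubernetes cluster, saved into the ca.crt file and the ca.key private key file
- A serving certificate for the API server, generated using ca.crt as the CA, and saved into the apiserver.crt file with its private key apiserver.key. This certificate should contain the following alternative names:
  - The Kubernetes service's internal clusterIP (the first address in the service CIDR, e.g. 10.96.0.1 if the service subnet is 10.96.0.0/12)
  - Kubernetes DNS names, e.g. kubernetes.default.svc.cluster.local if the --service-dns-domain flag value is cluster.local, plus the default DNS names kubernetes.default.svc, kubernetes.default, kubernetes
  - The --apiserver-advertise-address
- A client certificate for the API server to connect to the kubelets securely, generated using ca.crt as the CA and saved into the apiserver-kubelet-client.crt file with its private key apiserver-kubelet-client.key. This certificate should be in the system:masters organization
- A private key for signing ServiceAccount tokens, saved into the sa.key file along with its public key sa.pub
- A certificate authority for the front proxy, saved into the front-proxy-ca.crt file with its key front-proxy-ca.key
- A client certificate for the front proxy client, generated using front-proxy-ca.crt as the CA and saved into the front-proxy-client.crt file with its private key front-proxy-client.key

Certificates are stored by default in /etc/kubernetes/pki, but this directory is configurable using the --cert-dir flag.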
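As a rough illustration of the alternative names listed above for the API server certificate, the following sketch derives them from the service subnet, the DNS domain, and the advertise address. The function name apiserver_cert_sans and the sample address 192.168.0.10 are hypothetical, and kubeadm's own implementation may add further names.

```python
import ipaddress


def apiserver_cert_sans(service_subnet, dns_domain, advertise_address):
    """Collect DNS names and IPs for the API server serving certificate."""
    # The service's internal clusterIP is the first address after the
    # network address of the service CIDR.
    network = ipaddress.ip_network(service_subnet)
    cluster_ip = str(network[1])
    dns_names = [
        "kubernetes",
        "kubernetes.default",
        "kubernetes.default.svc",
        "kubernetes.default.svc." + dns_domain,
    ]
    return {"dns": dns_names, "ips": [cluster_ip, advertise_address]}


# Hypothetical advertise address, used only for illustration.
sans = apiserver_cert_sans("10.96.0.0/12", "cluster.local", "192.168.0.10")
print(sans["ips"])  # ['10.96.0.1', '192.168.0.10']
```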
Please note that:
- The user can copy an existing CA to /etc/kubernetes/pki/ca.{crt,key}, and then kubeadm will use those files for signing the rest of the certs. See also using custom certificates
- Only for the CA, it is possible to provide the ca.crt file but not the ca.key file; if all other certificates and kubeconfig files are already in place, kubeadm recognizes this condition and activates the External CA mode, which also implies that the csrsigner controller in the controller-manager won’t be started
- If kubeadm is running in --dry-run mode, certificate files are written in a temporary folder
- Certificate generation can be invoked individually with the kubeadm init phase certs all command

Kubeadm generates kubeconfig files with identities for control plane components:
- A kubeconfig file for the kubelet to use, /etc/kubernetes/kubelet.conf; inside this file is embedded a client certificate with the kubelet identity. This client cert should:
  - Be in the system:nodes organization, as required by the Node Authorization module
  - Have the CN system:node:<hostname-lowercased>
- A kubeconfig file for the controller-manager, /etc/kubernetes/controller-manager.conf; inside this file is embedded a client certificate with the controller-manager identity. This client cert should have the CN system:kube-controller-manager, as defined by the default RBAC core components roles
- A kubeconfig file for the scheduler, /etc/kubernetes/scheduler.conf; inside this file is embedded a client certificate with the scheduler identity. This client cert should have the CN system:kube-scheduler, as defined by the default RBAC core components roles

Additionally, a kubeconfig file for kubeadm itself and the admin is generated and saved into the /etc/kubernetes/admin.conf file.
The “admin” here is defined as the actual person(s) administering the cluster and wanting full control (root) over the cluster.
The embedded client certificate for admin should:
- Be in the system:masters organization, as defined by default RBAC user facing role bindings
- Include a CN, but that can be anything. Kubeadm uses the kubernetes-admin CN
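The identity rules above can be summarized in a small lookup. This is only a sketch: the function name client_cert_subject and the component keys are hypothetical, and an empty organization list simply means the text above does not require one.

```python
def client_cert_subject(component, hostname=None):
    """Return (common_name, organizations) for a kubeconfig client certificate."""
    if component == "kubelet":
        # The node name is the lower-cased hostname.
        return ("system:node:" + hostname.lower(), ["system:nodes"])
    if component == "controller-manager":
        return ("system:kube-controller-manager", [])
    if component == "scheduler":
        return ("system:kube-scheduler", [])
    if component == "admin":
        return ("kubernetes-admin", ["system:masters"])
    raise ValueError("unknown component: " + component)


print(client_cert_subject("kubelet", "Node-01"))
# ('system:node:node-01', ['system:nodes'])
```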
Please note that:
- The ca.crt certificate is embedded in all the kubeconfig files.
- If kubeadm is running in --dry-run mode, kubeconfig files are written in a temporary folder
- Kubeconfig file generation can be invoked individually with the kubeadm init phase kubeconfig all command

Kubeadm writes static Pod manifest files for control plane components to /etc/kubernetes/manifests; the kubelet watches this directory for Pods to create on startup.
Static Pod manifests share a set of common properties:
- All static Pods are deployed in the kube-system namespace
- All static Pods get the tier:control-plane and component:{component-name} labels
- All static Pods get the scheduler.alpha.kubernetes.io/critical-pod annotation (this will be moved over to the proper solution of using Pod Priority and Preemption when ready)
- hostNetwork: true is set on all static Pods to allow control plane startup before a network is configured; as a consequence:
  - The address that the controller-manager and the scheduler use to refer to the API server is 127.0.0.1
  - With a local etcd server, the etcd-servers address will be set to 127.0.0.1:2379

Please note that:
- All images, for the given --kubernetes-version/current architecture, will be pulled from k8s.gcr.io; if an alternative image repository or CI image repository is specified, that one will be used; if a specific container image should be used for all control plane components, that one will be used. See using custom images for more details
- If kubeadm is running in --dry-run mode, static Pod files are written in a temporary folder
- Static Pod manifest generation can be invoked individually with the kubeadm init phase control-plane all command

The static Pod manifest for the API server is affected by the following parameters provided by the users:
- The apiserver-advertise-address and apiserver-bind-port to bind to; if not provided, those values default to the IP address of the default network interface on the machine and port 6443
- The service-cluster-ip-range to use for services
- If an external etcd server is specified, the etcd-servers address and related TLS settings (etcd-cafile, etcd-certfile, etcd-keyfile); if an external etcd server is not provided, a local etcd will be used (via host network)
- If a cloud provider is specified, the corresponding --cloud-provider is configured, together with the --cloud-config path if such a file exists (this is experimental, alpha and will be removed in a future version)

Other API server flags that are set unconditionally are:
- --insecure-port=0 to avoid insecure connections to the API server
- --enable-bootstrap-token-auth=true to enable the BootstrapTokenAuthenticator authentication module. See TLS Bootstrapping for more details
- --allow-privileged to true (required e.g. by kube-proxy)
- --requestheader-client-ca-file to front-proxy-ca.crt
- --enable-admission-plugins to:
  - NamespaceLifecycle, e.g. to avoid deletion of system reserved namespaces
  - LimitRanger and ResourceQuota to enforce limits on namespaces
  - ServiceAccount to enforce service account automation
  - PersistentVolumeLabel to attach region or zone labels to PersistentVolumes as defined by the cloud provider (this admission controller is deprecated and will be removed in a future version; from v1.9 onwards kubeadm does not deploy it by default unless gce or aws is explicitly selected as the cloud provider)
  - DefaultStorageClass to enforce a default storage class on PersistentVolumeClaim objects
  - DefaultTolerationSeconds
  - NodeRestriction to limit what a kubelet can modify (e.g. only pods on this node)
- --kubelet-preferred-address-types to InternalIP,ExternalIP,Hostname; this makes kubectl logs and other API server-kubelet communication work in environments where the hostnames of the nodes aren’t resolvable
- --client-ca-file to ca.crt
- --tls-cert-file to apiserver.crt
- --tls-private-key-file to apiserver.key
- --kubelet-client-certificate to apiserver-kubelet-client.crt
- --kubelet-client-key to apiserver-kubelet-client.key
- --service-account-key-file to sa.pub
- --proxy-client-cert-file to front-proxy-client.crt
- --proxy-client-key-file to front-proxy-client.key
- --requestheader-username-headers=X-Remote-User
- --requestheader-group-headers=X-Remote-Group
- --requestheader-extra-headers-prefix=X-Remote-Extra-
- --requestheader-allowed-names=front-proxy-client

The static Pod manifest for the controller manager is affected by the following parameters provided by the users:
- If kubeadm is invoked with --pod-network-cidr, the subnet manager feature required for some CNI network plugins is enabled by setting:
  - --allocate-node-cidrs=true
  - --cluster-cidr and --node-cidr-mask-size flags according to the given CIDR
- If a cloud provider is specified, the corresponding --cloud-provider is specified, together with the --cloud-config path if such a configuration file exists (this is experimental, alpha and will be removed in a future version)

Other flags that are set unconditionally are:
- --controllers enabling all the default controllers plus the BootstrapSigner and TokenCleaner controllers for TLS bootstrap. See TLS Bootstrapping for more details
- --use-service-account-credentials to true
- --root-ca-file to ca.crt
- --cluster-signing-cert-file to ca.crt, if External CA mode is disabled, otherwise to ""
- --cluster-signing-key-file to ca.key, if External CA mode is disabled, otherwise to ""
- --service-account-private-key-file to sa.key

The static Pod manifest for the scheduler is not affected by parameters provided by the users.
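To make the controller manager settings derived from --pod-network-cidr more concrete, here is a minimal sketch of how the three flags could be assembled. The function name pod_network_flags is hypothetical, and the per-node mask sizes (24 for IPv4, 64 for IPv6) are an assumption for illustration, not something stated in this document.

```python
import ipaddress


def pod_network_flags(pod_network_cidr):
    """Build controller-manager flags for the subnet manager feature."""
    if not pod_network_cidr:
        # No pod network CIDR given: the feature stays disabled.
        return []
    network = ipaddress.ip_network(pod_network_cidr)
    # Assumption: one /24 per node for IPv4, one /64 per node for IPv6.
    mask_size = 24 if network.version == 4 else 64
    return [
        "--allocate-node-cidrs=true",
        "--cluster-cidr=" + str(network),
        "--node-cidr-mask-size=" + str(mask_size),
    ]


for flag in pod_network_flags("10.244.0.0/16"):
    print(flag)
```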
If the user specified an external etcd, this step will be skipped; otherwise kubeadm generates a static Pod manifest file for creating a local etcd instance running in a Pod with the following attributes:
- Listen on localhost:2379 and use HostNetwork=true
- Make a hostPath mount out from the dataDir to the host’s filesystem

Please note that:
- The etcd image will be pulled from k8s.gcr.io. If an alternative image repository is specified, that one will be used; if an alternative image name is specified, that one will be used. See using custom images for more details
- If kubeadm is running in --dry-run mode, the etcd static Pod manifest is written in a temporary folder
- Static Pod manifest generation for local etcd can be invoked individually with the kubeadm init phase etcd local command

To use the dynamic kubelet configuration functionality, call kubeadm alpha kubelet config enable-dynamic. It writes the kubelet init configuration into the /var/lib/kubelet/config/init/kubelet file.
The init configuration is used for starting the kubelet on this specific node, providing an alternative to the kubelet drop-in file; this configuration will be replaced by the kubelet base configuration as described in the following steps. See set Kubelet parameters via a config file for additional info.
Please note that:
- To make dynamic kubelet configuration work, the flag --dynamic-config-dir=/var/lib/kubelet/config/dynamic should be specified in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
- The kubelet configuration can be changed by passing a KubeletConfiguration object to kubeadm init or kubeadm join by using a configuration file, --config some-file.yaml. The KubeletConfiguration object can be separated from other objects such as InitConfiguration using the --- separator. For more details have a look at the kubeadm config print-default command.

This is a critical moment in time for kubeadm clusters.
kubeadm waits until localhost:6443/healthz returns ok; however, in order to detect deadlock conditions, kubeadm fails fast if localhost:10255/healthz (kubelet liveness) or localhost:10255/healthz/syncloop (kubelet readiness) don’t return ok after 40 and 60 seconds, respectively.
kubeadm relies on the kubelet to pull the control plane images and run them properly as static Pods. After the control plane is up, kubeadm completes the tasks described in the following paragraphs.
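The wait-and-fail-fast behaviour described above can be sketched as a polling loop. This is a simplified illustration, not kubeadm's implementation: the function name, the injected probe callables, and the polling interval are all hypothetical, and the real HTTP probes against the healthz endpoints are left to the caller.

```python
import time


def wait_for_control_plane(api_ok, kubelet_ok, syncloop_ok,
                           now=time.monotonic, sleep=time.sleep, interval=0.5):
    """Poll until api_ok() is true; fail fast if the kubelet looks stuck.

    api_ok, kubelet_ok and syncloop_ok are caller-supplied callables that
    return True when the corresponding healthz endpoint answers ok.
    """
    start = now()
    while True:
        if api_ok():
            return True
        elapsed = now() - start
        # Kubelet liveness is only enforced after a 40 second grace period.
        if elapsed >= 40 and not kubelet_ok():
            raise RuntimeError("kubelet is not healthy")
        # Kubelet readiness (sync loop) is enforced after 60 seconds.
        if elapsed >= 60 and not syncloop_ok():
            raise RuntimeError("kubelet sync loop is not healthy")
        sleep(interval)


# Stub probes that always succeed, used only for illustration.
print(wait_for_control_plane(lambda: True, lambda: True, lambda: True))  # True
```

Injecting the clock and the sleep function keeps the loop easy to exercise without waiting in real time.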
If kubeadm is invoked with --feature-gates=DynamicKubeletConfig, it:
- Writes the kubelet base configuration into the kubelet-base-config-v1.9 ConfigMap in the kube-system namespace
- Creates RBAC rules granting read access to that ConfigMap to all bootstrap tokens and all kubelet instances (that is, the system:bootstrappers:kubeadm:default-node-token and system:nodes groups)
- Enables the dynamic kubelet configuration feature for the node by pointing Node.spec.configSource to the newly-created ConfigMap

kubeadm saves the configuration passed to kubeadm init, either via flags or the config file, in a ConfigMap named kubeadm-config under the kube-system namespace.
This ensures that kubeadm actions executed in the future (e.g. kubeadm upgrade) will be able to determine the actual/current cluster state and make new decisions based on that data.
Please note that:
- Uploading the configuration can be invoked individually with the kubeadm init phase upload-config command
- To facilitate creating this ConfigMap before running kubeadm upgrade to v1.8, the kubeadm config upload (from-flags|from-file) command was implemented

As soon as the control plane is available, kubeadm executes the following actions:
- Labels the master with node-role.kubernetes.io/master=""
- Taints the master with node-role.kubernetes.io/master:NoSchedule

Please note that:
- This step can be invoked individually with the kubeadm init phase mark-control-plane command

Kubeadm uses Authenticating with Bootstrap Tokens for joining new nodes to an existing cluster; for more details see also the design proposal.
kubeadm init ensures that everything is properly configured for this process, and this includes the following steps as well as setting API server and controller flags as already described in previous paragraphs.
Please note that:
- TLS bootstrapping for nodes can be configured with the kubeadm init phase bootstrap-token command, executing all the configuration steps described in the following paragraphs; alternatively, each step can be invoked individually

kubeadm init creates a first bootstrap token, either generated automatically or provided by the user with the --token flag; as documented in the bootstrap token specification, the token should be saved as a secret with the name bootstrap-token-<token-id> under the kube-system namespace.
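As an illustration of the naming rule above, the sketch below generates a token in the id.secret form used by bootstrap tokens (six characters, a dot, sixteen characters, all lower-case letters or digits) and derives the corresponding secret name. The helper names generate_token and secret_name_for are hypothetical.

```python
import re
import secrets
import string

# Bootstrap tokens have the form <token-id>.<token-secret>.
TOKEN_PATTERN = re.compile(r"^([a-z0-9]{6})\.([a-z0-9]{16})$")
ALPHABET = string.ascii_lowercase + string.digits


def generate_token():
    """Generate a random token of the form [a-z0-9]{6}.[a-z0-9]{16}."""
    token_id = "".join(secrets.choice(ALPHABET) for _ in range(6))
    token_secret = "".join(secrets.choice(ALPHABET) for _ in range(16))
    return token_id + "." + token_secret


def secret_name_for(token):
    """Return the name of the kube-system secret that stores the token."""
    match = TOKEN_PATTERN.match(token)
    if match is None:
        raise ValueError("token must match [a-z0-9]{6}.[a-z0-9]{16}")
    return "bootstrap-token-" + match.group(1)


print(secret_name_for("abcdef.0123456789abcdef"))  # bootstrap-token-abcdef
```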
Please note that:
- The default token created by kubeadm init will be used to validate temporary users during the TLS bootstrap process; those users will be members of the system:bootstrappers:kubeadm:default-node-token group
- The token has a limited validity (the interval may be changed with the --token-ttl flag)
- Additional tokens can be created with the kubeadm token command, which also provides other useful functions for token management

Kubeadm ensures that users in the system:bootstrappers:kubeadm:default-node-token group are able to access the certificate signing API.
This is implemented by creating a ClusterRoleBinding named kubeadm:kubelet-bootstrap between the group above and the default RBAC role system:node-bootstrapper.
Kubeadm ensures that CSRs submitted with the Bootstrap Token are automatically approved by the csrapprover controller.
This is implemented by creating a ClusterRoleBinding named kubeadm:node-autoapprove-bootstrap between the system:bootstrappers:kubeadm:default-node-token group and the default role system:certificates.k8s.io:certificatesigningrequests:nodeclient.
The role system:certificates.k8s.io:certificatesigningrequests:nodeclient should be created as well, granting POST permission to /apis/certificates.k8s.io/certificatesigningrequests/nodeclient.
Kubeadm ensures that certificate rotation is enabled for nodes, and that new certificate requests for nodes are automatically approved by the csrapprover controller.
This is implemented by creating a ClusterRoleBinding named kubeadm:node-autoapprove-certificate-rotation between the system:nodes group and the default role system:certificates.k8s.io:certificatesigningrequests:selfnodeclient.
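The three ClusterRoleBindings described above can be written down as plain data. The sketch below builds them as Python dictionaries; the helper name cluster_role_binding is hypothetical, and the rbac.authorization.k8s.io/v1 API version is an assumption about the cluster, while the binding, role, and group names come from the text.

```python
import json

RBAC_GROUP = "rbac.authorization.k8s.io"
BOOTSTRAPPERS = "system:bootstrappers:kubeadm:default-node-token"


def cluster_role_binding(name, role, group):
    """Describe a ClusterRoleBinding between a group and a ClusterRole."""
    return {
        "apiVersion": RBAC_GROUP + "/v1",  # assumed API version
        "kind": "ClusterRoleBinding",
        "metadata": {"name": name},
        "roleRef": {"apiGroup": RBAC_GROUP, "kind": "ClusterRole", "name": role},
        "subjects": [{"apiGroup": RBAC_GROUP, "kind": "Group", "name": group}],
    }


BINDINGS = [
    # Lets bootstrap token users access the certificate signing API.
    cluster_role_binding(
        "kubeadm:kubelet-bootstrap",
        "system:node-bootstrapper",
        BOOTSTRAPPERS,
    ),
    # Lets the csrapprover controller auto-approve bootstrap CSRs.
    cluster_role_binding(
        "kubeadm:node-autoapprove-bootstrap",
        "system:certificates.k8s.io:certificatesigningrequests:nodeclient",
        BOOTSTRAPPERS,
    ),
    # Lets existing nodes get rotated certificates auto-approved.
    cluster_role_binding(
        "kubeadm:node-autoapprove-certificate-rotation",
        "system:certificates.k8s.io:certificatesigningrequests:selfnodeclient",
        "system:nodes",
    ),
]

print(json.dumps(BINDINGS[0], indent=2))
```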
This phase creates the cluster-info ConfigMap in the kube-public namespace.
Additionally, a Role and a RoleBinding are created, granting access to the ConfigMap for unauthenticated users (i.e. users in the RBAC group system:unauthenticated).
Please note that:
- Access to the cluster-info ConfigMap is not rate-limited. This may or may not be a problem if you expose your master to the internet; the worst-case scenario here is a DoS attack where an attacker uses all the in-flight requests the kube-apiserver can handle to serve the cluster-info ConfigMap.

Kubeadm installs the internal DNS server and the kube-proxy addon components via the API server. Please note that:
- This phase can be invoked individually with the kubeadm init phase addon all command.

A ServiceAccount for kube-proxy is created in the kube-system namespace; then kube-proxy is deployed as a DaemonSet:
- The credentials (ca.crt and token) to the master come from the ServiceAccount
- The kube-proxy ServiceAccount is bound to the privileges in the system:node-proxier ClusterRole

Note that:
- The CoreDNS service is named kube-dns. This is done to prevent any interruption in service when the user is switching the cluster DNS from kube-dns to CoreDNS or vice-versa
- Depending on the Kubernetes version, CoreDNS is enabled with --feature-gates=CoreDNS=true
- In versions where CoreDNS is the default, kubeadm is invoked with --feature-gates=CoreDNS=false to install kube-dns instead
- In versions where the CoreDNS feature gate is no longer available, kube-dns can be installed using the --config method described here

A ServiceAccount for CoreDNS/kube-dns is created in the kube-system namespace.
Deploy the kube-dns Deployment and Service:
- The kube-dns ServiceAccount is bound to the privileges in the system:kube-dns ClusterRole

Similarly to kubeadm init, the kubeadm join internal workflow consists of a sequence of atomic work tasks to perform.
This is split into discovery (having the Node trust the Kubernetes Master) and TLS bootstrap (having the Kubernetes Master trust the Node).
See Authenticating with Bootstrap Tokens or the corresponding design proposal.
kubeadm executes a set of preflight checks before starting the join, with the aim of verifying preconditions and avoiding common cluster startup problems.
Please note that:
- kubeadm join preflight checks are basically a subset of the kubeadm init preflight checks
- The user can skip specific preflight checks (or, if needed, all of them) with the --ignore-preflight-errors option.

There are 2 main schemes for discovery. The first is to use a shared token along with the IP address of the API server. The second is to provide a file (that is a subset of the standard kubeconfig file).
If kubeadm join is invoked with --discovery-token, token discovery is used; in this case the node basically retrieves the cluster CA certificates from the cluster-info ConfigMap in the kube-public namespace.
In order to prevent “man in the middle” attacks, several steps are taken:
- The cluster-info ConfigMap can be read without credentials (kubeadm init granted access to cluster-info to users in system:unauthenticated)
- The CA public key is validated using the provided --discovery-token-ca-cert-hash. This value is available in the output of kubeadm init or can be calculated using standard tools (the hash is calculated over the bytes of the Subject Public Key Info (SPKI) object as in RFC7469). The --discovery-token-ca-cert-hash flag may be repeated multiple times to allow more than one public key.

Please note that:
- Public key validation can be skipped with the --discovery-token-unsafe-skip-ca-verification flag; this weakens the kubeadm security model since others can potentially impersonate the Kubernetes Master.

If kubeadm join is invoked with --discovery-file, file discovery is used; this file can be a local file or downloaded via an HTTPS URL; in case of HTTPS, the host installed CA bundle is used to verify the connection.
With file discovery, the cluster CA certificate is provided in the file itself; in fact, the discovery file is a kubeconfig file with only the server and certificate-authority-data attributes set, as described in the kubeadm join reference doc;
when the connection with the cluster is established, kubeadm tries to access the cluster-info ConfigMap, and if available, uses it.
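For illustration, a discovery file of the kind described above could be produced as follows. This is a sketch under stated assumptions: the function name discovery_kubeconfig, the cluster name, the server URL, and the placeholder certificate text are hypothetical, and the output is emitted as JSON on the assumption that the consumer accepts it as a YAML-compatible kubeconfig.

```python
import base64
import json


def discovery_kubeconfig(server, ca_pem, cluster_name="kubernetes"):
    """Build a kubeconfig with only server and certificate-authority-data set."""
    # certificate-authority-data holds the base64-encoded PEM certificate.
    ca_data = base64.b64encode(ca_pem.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [
            {
                "name": cluster_name,
                "cluster": {
                    "server": server,
                    "certificate-authority-data": ca_data,
                },
            }
        ],
    }


# Placeholder values, used only for illustration.
config = discovery_kubeconfig("https://10.0.0.1:6443", "<PEM-encoded ca.crt>")
print(json.dumps(config, indent=2))
```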
Once the cluster info is known, the file bootstrap-kubelet.conf is written, thus allowing the kubelet to do TLS Bootstrapping (conversely, until v1.7 TLS bootstrapping was managed by kubeadm).
The TLS bootstrap mechanism uses the shared token to temporarily authenticate with the Kubernetes Master to submit a certificate signing request (CSR) for a locally created key pair.
The request is then automatically approved and the operation completes by saving the ca.crt file and the kubelet.conf file to be used by the kubelet for joining the cluster, while bootstrap-kubelet.conf is deleted.
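Please note that:
- The temporary authentication is validated against the token saved during the kubeadm init process (or with additional tokens created with kubeadm token)
- The temporary authentication resolves to a user that is a member of the system:bootstrappers:kubeadm:default-node-token group, which was granted access to the CSR API during the kubeadm init process
- The automatic CSR approval is managed by the csrapprover controller, according to the configuration done during the kubeadm init process

If kubeadm is invoked with --feature-gates=DynamicKubeletConfig, it:
- Reads the kubelet base configuration from the kubelet-base-config-v1.9 ConfigMap in the kube-system namespace using the Bootstrap Token credentials, and writes it to disk as the kubelet init configuration file /var/lib/kubelet/config/init/kubelet
- As soon as the kubelet starts with the Node's own credentials (/etc/kubernetes/kubelet.conf), updates the current node configuration, specifying that the source for the node/kubelet configuration is the above ConfigMap.

Please note that:
- To make dynamic kubelet configuration work, the flag --dynamic-config-dir=/var/lib/kubelet/config/dynamic should be specified in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf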