Compare commits

...

527 Commits

Author SHA1 Message Date
c91a05f330 debian: Fix failed test after bullseye release (#7888)
(cherry picked from commit 79166496f3)
2021-10-29 07:46:51 -07:00
a583a2d9aa Implement drain fallback with --disable-eviction to ignore PDBs
Signed-off-by: Utku Ozdemir <uoz@protonmail.com>
2021-10-29 07:46:51 -07:00
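A minimal sketch of what such a drain fallback can look like as an Ansible task, assuming the upstream kubectl drain flags (--ignore-daemonsets, --delete-local-data, --disable-eviction); the task name and kubectl path handling are illustrative, not Kubespray's actual task:

```
# Sketch only: deletes pods directly instead of evicting them, bypassing PodDisruptionBudgets.
- name: Drain node, ignoring PodDisruptionBudgets (fallback)
  command: >-
    {{ bin_dir | default('/usr/local/bin') }}/kubectl drain {{ inventory_hostname }}
    --ignore-daemonsets
    --delete-local-data
    --disable-eviction
```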
713abf29ca Update vSphere CPI (#7840)
Backport of #7838

Changes:
  * ClusterRole updated according to the latest manifests from
    https://github.com/kubernetes/cloud-provider-vsphere
  * vSphere CPI/CSI default versions bumped and
    tested successfully on K8S 1.21.1
  * vSphere documentation updated

Signed-off-by: Vitaliy D <vi7alya@gmail.com>
2021-07-30 06:03:37 -07:00
247d062c02 [2.16] Fix how to get image ID on offline deployment (#7829)
* Add error handling for registering images (#7787)

When running the script, I faced the following error, but it was
difficult to identify the root cause due to the lack of error handling.

  docker tag" requires exactly 2 arguments.
  See 'docker tag --help'.

  Usage:  docker tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]

  Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE

To investigate such errors easily, this adds error handling.

* Fix how to get image ID on offline deployment (#7808)

Previously the IDs of container images were taken from the tar files of the
container images, but that approach was wrong. If a tar file contains multiple
json files, the script picked up multiple IDs and passed them all to the
`docker tag` command, and the command failed.

This updates the script to get image IDs from the `docker image inspect` command
to fix this issue.
In addition, this adds a check for whether a registry container already exists
before deploying the registry container, to avoid a container conflict failure.
2021-07-28 00:01:35 -07:00
9fa051780e [2.16] Disable OVH CI until voucher situation is cleared up (#7824) (#7831)
* Disable OVH CI until voucher situation is cleared up (#7824)

* Allow failure on tf-elax_ubuntu18-calico (#7814)

tf-elax_ubuntu18-calico is very flaky today. The test job fails
due to an SSH connectivity check error after deploying the virtual machines
which are used as Kubernetes nodes.
This allows failure on the job so we can watch the test situation without
blocking pull request merges.

Co-authored-by: Maxime Guyot <Miouge1@users.noreply.github.com>
2021-07-27 06:16:45 -07:00
bcf695913f Fix Oracle yum disabled repository file after EPEL install (#7639) 2021-05-25 08:30:23 -07:00
23cd1d41fb Minor spelling edits (#7640)
Minor spelling edits. Was reading your documentation.
2021-05-24 23:48:22 -07:00
62f5369237 Remove warning about Docker-only support (#7626) 2021-05-20 00:01:05 -07:00
59fc17f4e3 Override the default value of containerd's root, state, and oom_score (#7622)
* Override the default value of containerd's root, state, and oom_score configurations

* Add tests data for containerd_storage_dir, containerd_state_dir and containerd_oom_score variables
2021-05-19 08:24:53 -07:00
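The variables named in the commit above can be overridden in group_vars; a minimal sketch with illustrative values (these are examples, not Kubespray's defaults):

```
# group_vars/all/containerd.yml (example values)
containerd_storage_dir: "/data/containerd"   # containerd's root directory
containerd_state_dir: "/run/containerd"      # containerd's state directory
containerd_oom_score: -999                   # OOM score applied to the containerd daemon
```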
26c1d42fff add note on download_localhost (#7623)
It can be counterintuitive for the downloaded files to show up on the local host when download_localhost is false; this explains the reason.
2021-05-19 05:04:51 -07:00
c1aa755a3c Fix missing broken_etcd filter in recover control plane task (#7619) 2021-05-18 10:29:04 -07:00
b3d9f2b4a2 Add contrib playbook to disable service firewall (#7431)
Basically we need to open the necessary TCP/UDP ports.
However, there are many such ports, and when facing deployment issues it is
sometimes difficult to figure out whether they are caused by the firewall
or not.
To isolate the root cause in such situations, this adds a contrib
playbook that disables the service firewall for Kubespray development
and testing.
2021-05-18 06:45:30 -07:00
29c2fbdbc1 Fix cloud_resolver type from str to list (issue #7605) (#7606) 2021-05-18 06:41:30 -07:00
4b9f98f933 Fix pull_by_digest variable type to boolean instead of str (#7612) 2021-05-18 06:29:31 -07:00
e9870b8d25 add support for using ansible 2.10.x for deploying kubespray (#7600)
* add support for using ansible 2.10.x for deploying kubespray

* move dns-autoscaler-clusterrole{binding}.yml to files/ folder

* note that ansible 2.10 is now experimentally supported

* coredns: move files to templates like before #4341
2021-05-18 05:39:31 -07:00
e0c74fa082 Update nerdctl version to 0.8.1 (#7617) 2021-05-17 11:07:30 -07:00
5b93a97281 remove experimental note about CentOS 8 and derivatives (#7615) 2021-05-16 12:07:59 -07:00
bdf74c6749 Set default version to 1.20.7 2021-05-14 09:48:06 -07:00
d6f9a8d752 Update hashes with 1.21.1/1.20.7/1.19.11 2021-05-14 09:48:06 -07:00
e357d8678c update README about supported OSes (#7608) 2021-05-14 00:06:05 -07:00
b1b407a0b4 Replace map in Terraform scripts with tomap (#7576) (#7578)
* Replace map in Terraform scripts with tomap (#7576)

* Fix Terraform linter warnings (#7576)
2021-05-12 07:34:17 -07:00
6c3d1649a6 fixed MarkupSafe version to 1.1.1 (#7607) 2021-05-12 06:52:17 -07:00
14cf3e138b Support Calico advertisement of MetalLB LoadBalancer IPs (#7593)
* add initial MetalLB docs

* metallb allow disabling the deployment of the metallb speaker

* calico>=3.18 allow using calico to advertise service loadbalancer IPs

* Document the use of MetalLB and Calico

* clean MetalLB docs
2021-05-12 05:22:17 -07:00
afbabebfd5 Enables Calico serviceAccount token monitoring and update of /etc/cni/net.d/calico-kubeconfig if need be. (#7586)
Since K8S 1.21, the BoundServiceAccountTokenVolume feature gate is in beta stage and thus activated by default (anyone who follows CSI guidelines has enabled AllAlpha and faced the issue before 1.21).
With this feature, SA tokens are regenerated every hour.
As a consequence for Calico CNI, the token in /etc/cni/net.d/calico-kubeconfig copied from /var/run/secrets/kubernetes.io/serviceaccount in the install-cni initContainer expires after one hour and any pod creation then fails as unauthorized.
Calico pods need to be restarted so that /etc/cni/net.d/calico-kubeconfig is updated with the new SA token.
2021-05-11 08:47:36 -07:00
8c0a2741ae allow overriding calico peers names and avoid ipv6 naming issues (#7591) 2021-05-11 07:05:36 -07:00
1d078e1119 Add script for generate download files and images list (#7561)
Fix coredns image repo and tag typo for #7570
2021-05-11 00:39:36 -07:00
d90baa8601 add containerd support for Amazon Linux 2 (#7595) 2021-05-10 19:25:36 -07:00
d5660cd37c Fix reset cluster task failed (#7597) 2021-05-10 17:25:36 -07:00
63cec45597 Add Amazon to the check for supported distributions (#7589) 2021-05-10 16:17:36 -07:00
f07e24db8f Cleanup duplicate task in etcd role (#7598)
* Remove the duplicate task in etcd role

* Remove inessential delegate_to
2021-05-10 16:11:36 -07:00
5d5be3e96a bump calico 3.18 to v3.18.3 (#7592) 2021-05-10 00:34:51 -07:00
6e7649360f Ignore error when ipvsadm utility not found on node (#7587) 2021-05-07 13:37:04 -07:00
1dd38721b3 Add external_openstack_enable_ingress_hostname option for openstack (#7572)
Signed-off-by: Cedric Hnyda <cedric.hnyda@itera.io>
2021-05-04 00:33:11 -07:00
6a001e4971 Add suport of Vsphere CSI driver 2.X versions (#7480) 2021-05-04 00:05:11 -07:00
96e6a6ac3f Add krew support (#7464)
* Add krew support

* Add reset for krew

* Update install krew(local)

* ansible lint

* yamllint

* fix krew default vars

* fix kubectl_localhost mode

* replace include

* fix e206
2021-05-03 07:16:03 -07:00
2556eb2733 Upgrade cilium role (#7521)
* Upgrade cilium roles

* Del old test result

* Add hubble ui examples

* Refactor hubble metrics

* Markdown fix pipeline errors

* yamllint check and fix

* refactor install from https://github.com/kubernetes-sigs/kubespray/pull/7520

* Docs syntax change (fix)

* Cilium set default 1.8.9

* Update cilium version in Readme
2021-04-30 08:09:59 -07:00
d29ea386d6 Fix issue with api token wait check not working (#7566) 2021-04-30 07:47:59 -07:00
a0ee569091 change coredns image name to coredns/coredns and prefix v to tag (#7570)
follow new naming conventions for gcr's coredns image.
starting from 1.21 kubeadm assumes it to be `coredns/coredns`:
this causes the kubeadm deployment to be unable to pull the image, because `v`
was also added to the image tag, until the role `kubernetes-apps` overrides
it with the old name, which is only compatible with <=1.7.

Backward compatibility with kubeadm <=1.20 is maintained by checking the
kubernetes version and falling back to the old names (`coredns:1.xx`) when
the version is less than 1.21
2021-04-30 07:43:58 -07:00
3f4eb9be08 Fixes issue #7573 - Made Calico permissions compatible with v3.18.x (see https://github.com/projectcalico/calico/issues/4557). Specifically, granted watch to custom resources blockaffinities, ipamblocks & ipamhandles (#7575) 2021-04-30 07:25:59 -07:00
5ea2d1eb67 Add image_arch in flannel image tag (#7560)
* Add image_arch variable when download flannel image

* Fix flannel image tag typo with image arch
2021-04-29 17:51:57 -07:00
ffc38a2237 Fix busybox for tests to reduce dockerhub calls (#7571) 2021-04-29 17:39:57 -07:00
360aff4a57 Rename ansible groups to use _ instead of - (#7552)
* rename ansible groups to use _ instead of -

k8s-cluster -> k8s_cluster
k8s-node -> k8s_node
calico-rr -> calico_rr
no-floating -> no_floating

Note: kube-node,k8s-cluster groups in upgrade CI
      need clean-up after v2.16 is tagged

* ensure old groups are mapped to the new ones
2021-04-29 05:20:50 -07:00
d26191373a add default empty value for etc_hosts_localhosts_dict_target (#7567) 2021-04-28 11:34:50 -07:00
4c06aa98b5 crio: add supported versions 1.20 and 1.21 and align default with k8s version (#7562)
* crio: add supported versions 1.20 and 1.21 and align default with k8s version

* cri-o: drop versions 1.17 and 1.18 from version matrix

* update note on cri-o version alignment
2021-04-28 11:30:51 -07:00
1b267b6599 Fix calico-kube-controller becomes Error for canal (#7564) 2021-04-28 11:26:52 -07:00
dd6efb73f7 Calico new versions v3.17.4 and v3.18.2 (#7563)
* calico: upgrade from v3.17.3 to v3.17.4

* calico: upgrade from v3.18.1 to v3.18.2
2021-04-28 08:22:50 -07:00
dfeed1c1a4 Modify the commented config info (#7558) 2021-04-27 15:45:28 -07:00
0071e3c99c Update main.yml (#7557) 2021-04-27 15:41:27 -07:00
0feec14b15 Update Dockerfile for reduce image size (#7556)
* Update Dockerfile for reduce image size

* Remove KUBE_VERSION form Dockerfile
2021-04-26 23:33:37 -07:00
975f84494c Fix calico-kube-controller becomes Error (#7548)
Change the mode so that calico-kube-controllers can read it, because it was changed to run as non-root
https://github.com/projectcalico/kube-controllers/pull/566
2021-04-26 15:37:03 -07:00
7c86734d2e Add cri-o 1.20/1.21 (#7544) 2021-04-26 09:21:16 -07:00
8665e1de87 Fix cri-o support for Oracle and AlmaLinux (#7541) 2021-04-26 09:11:02 -07:00
c16efc9ab8 Fix Opensuse not working with ansible_distribution (#7551) 2021-04-26 08:37:02 -07:00
324c95d37f Fix some docs.ansible.com url typo (#7550) 2021-04-26 08:33:02 -07:00
69806e0a46 Add nerdctl cli tool for containerd user (#7500)
* Add nerdctl cli tool for containerd user

* Add nerdctl enable option

* Add nerdctl enable option and update nerdctl version to 0.8.0
2021-04-25 23:47:01 -07:00
ad15a4b755 Bump calico versions (#7543)
* add calico 3.16.10 hashes

* drop old calico version 3.16.9
2021-04-24 12:37:01 -07:00
002a4b03a4 Drop calico 3.15 (#7545)
* calico: drop support for version 3.15

* drop check for calico version >= 3.3, we are at 3.16 minimum now

* we moved to calico 3.16+ so we can default to /opt/cni/bin/install
2021-04-23 23:43:14 -07:00
96476430a3 Update cni-plugins and kubernetes version in README.md (#7540) 2021-04-22 23:54:02 -07:00
73db44b00c Initial AlmaLinux support (#7538)
* AlmaLinux: ansible>2.9.19 is needed to know about AlmaLinux

* AlmaLinux: identify as a centos derivative

* AlmaLinux: add AlmaLinux to checks for CentOS

* Use ansible_os_family to compare family and not distribution
2021-04-22 23:50:03 -07:00
b32d25942d Minor update to cni-plugins and kube-router 2021-04-22 06:47:42 -07:00
fce705a92b Helm minor update to 3.5.4 2021-04-22 06:47:42 -07:00
6164c90f70 Update kube-ovn to 1.6.2 2021-04-22 06:47:42 -07:00
e036b899a3 update calico default version in README.md (#7537) 2021-04-22 06:41:41 -07:00
8c7b90ebbf add ingress controller class (#7522) 2021-04-22 00:22:38 -07:00
38d9d2ea0e Ambassador can watch multiple namespaces (#7516)
* Ambassador can watch multiple namespaces

* update variable name per PR review
2021-04-22 00:22:31 -07:00
384d30b675 add support for configuring cri-o pids_limit (#7525) 2021-04-21 10:55:51 -07:00
add61868c6 Add Calico v3.17.3 and v3.18.1 (#7524)
* add hashes for calico v3.17.3

* add hashes for calico v3.18.1

* bump default calico version to v3.17.3

* calico crds are missing yaml separator breaking kdd
2021-04-21 10:45:51 -07:00
b599f3084f Fix OpenStack StyleGuide rule H216 (On by default in latest version) (#7535)
ref: b921c4de51
2021-04-21 09:04:11 -07:00
a7493e26e1 add enablerepo: amzn2extra-docker for docker install on aws 2 (#7507) 2021-04-21 07:24:10 -07:00
ae3a1d7c01 Fix keepcache values of yum_repository (#7506)
According to the official documentation[1], the parameter keepcache should be
'0' or '1' as a string. To avoid the following warning message,
this fixes the parameter value:

  [WARNING]: The value False (type bool) in a string field was
  converted to u'False' (type string). If this does not look
  like what you expect, quote the entire value to ensure it
  does not change.

https://docs.ansible.com/ansible/latest/collections/ansible/builtin/yum_repository_module.html
2021-04-21 07:20:11 -07:00
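A sketch of a yum_repository task with keepcache quoted as a string, as the commit above describes; the repo name and baseurl are placeholders:

```
- name: Add an example yum repo with keepcache quoted as a string
  yum_repository:
    name: example-repo                          # placeholder repo id
    description: Example repository
    baseurl: https://repo.example.com/el/$releasever/$basearch/
    gpgcheck: true
    keepcache: '0'                              # quoted to avoid the bool-to-string warning
```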
e39e3d5c26 Fix OpenId Connect example prefixes (#7527)
Fixes "mapping values are not allowed in this context
2021-04-20 17:32:10 -07:00
1e7d48846a Fixes issue #7528 - allow configuring CALICO_STARTUP_LOGLEVEL via a new variable: calico_node_startup_loglevel (#7530)
Signed-off-by: Brendan Holmes <5072156+holmesb@users.noreply.github.com>
2021-04-20 15:37:42 -07:00
6001edeecd Cleanup hashes and 1.18 hooks (#7534) 2021-04-20 15:34:33 -07:00
ce0b7834ff Refactor cilium_ipsec_enabled check (#7520)
This is a followup to

https://github.com/kubernetes-sigs/kubespray/pull/7413

Although the code worked there was a desire for a better solution.
Hopefully people will be happy with this alternative.
2021-04-19 02:06:36 -07:00
3ac92689f0 exoscale: Rework EIP access from workers (#7337)
Context: Load-balancing in Exoscale is performed by associating many
workers with the same EIP. This works, however, the workers cannot access
themselves via the EIP, which is needed at least for cert-manager's
"self-test".

Problem: The old iptables based workaround felt fragile and disappointed
me at least once.

New solution: Add the EIP to a loopback interface on each worker.
2021-04-16 03:22:22 -07:00
1c0836946f Update default Kubernetes version to 1.20.6 2021-04-15 22:26:22 -07:00
bccbe323b7 Add new kubernetes hashes (1.19.10, 1.20.6) 2021-04-15 22:26:22 -07:00
d73249a793 Add bash-completion package (#7510) 2021-04-15 08:33:50 -07:00
cd9a03f86c Update some docker defaults (#7499) 2021-04-14 15:13:07 -07:00
b47c21c683 Remove some bash completion file when reset cluster (#7502) 2021-04-14 11:07:09 -07:00
6de5303e3f Fix sample inventory (offline template) (#7498) 2021-04-14 03:28:43 -07:00
2a2fb68b2f Add missing proxy environment in crio_repo.yml (#7492) 2021-04-13 01:20:51 -07:00
844ebb7838 fix offline mode (#7493)
* fix offline mode

* add offline messages
2021-04-13 00:46:50 -07:00
332cc1cd58 Check if python netaddr and recent enough jinja are installed (#7486)
CentOS 7 provides an up-to-date Ansible with a really old Jinja version

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-04-13 00:43:01 -07:00
e7ce83016e correct a wrong word (#7484)
* correct a wrong word

* correct a wrong word
2021-04-13 00:42:50 -07:00
bf6a39eb84 Add auto_renew_certificates_systemd_calendar (#7490)
This allows configuring when the K8S certificate renewal runs

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-04-12 09:47:45 -07:00
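A sketch of how these renewal variables might be set together in group_vars; the calendar expression is just an example systemd OnCalendar value, and auto_renew_certificates is the toggle referenced later in this log (#7358):

```
auto_renew_certificates: true
# Run the renewal timer every Monday at 03:00 (example systemd calendar expression)
auto_renew_certificates_systemd_calendar: "Mon *-*-* 03:00:00"
```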
42382e2cde Update Terraform/Vagrant + increase tf_ovh retries (#7477) 2021-04-12 09:47:39 -07:00
f8e4650791 Fix typo (#7489) 2021-04-12 09:43:38 -07:00
e444b3c140 Regenerate apiserver.crt on all control-plane nodes (#7463)
We were regenerating only the cert of the first node.
While at it, speed up the check step.

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-04-12 09:17:38 -07:00
d56ac216f4 Use kubeadm_feature_gates instead of kube_feature_gates to leverage kubeadm feature gates and not to interfere with k8s components feature gates (#7447) 2021-04-12 01:05:59 -07:00
420a412234 Add containerd_extra_args (#7461)
* Add containerd_extra_args

This is useful for custom containerd config, e.g. auth

Signed-off-by: Zhong Jianxin <azuwis@gmail.com>

* Make containerd config.toml mode 0640

It may contain sensitive information like password

Signed-off-by: Zhong Jianxin <azuwis@gmail.com>
2021-04-12 01:02:00 -07:00
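A sketch of containerd_extra_args carrying registry auth, assuming containerd 1.4's CRI plugin TOML layout; the registry host and credentials are placeholders:

```
containerd_extra_args: |
  [plugins."io.containerd.grpc.v1.cri".registry.configs."registry.example.com".auth]
    username = "exampleuser"
    password = "examplepassword"
```

Because this can hold credentials, the same change restricts config.toml to mode 0640.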
90c643f3ab format ansible output (#7482) 2021-04-11 00:37:59 -07:00
1d4e380231 Remove containerd_runtimes var in k8s-cluster.yml (#7476)
Also set in all/containerd.yml
2021-04-09 10:25:17 -07:00
6d293ba899 Update hashes with 1.21.0 (#7478) 2021-04-09 08:05:05 -07:00
aa086e5407 Remove dead code from kubeadm-etcd (#7470) 2021-04-09 01:10:47 -07:00
cce0940e1f add CI test for auto_renew_certificates (#7472)
* add CI test for auto_renew_certificates

* change timer value

fix typo error in rotate cert script
2021-04-09 00:42:47 -07:00
daed3e5b6a Use v2.15.1 as base image for CI (#7466) 2021-04-08 12:28:02 -07:00
e2a7f3e2ab remove-node roles: fix kubectl absolute path (#7469)
* kubelet absolute path

* kubelet absolute path
2021-04-08 12:24:02 -07:00
5a351b4b00 Add condition for audit_webhook_mode batch (#7444)
According to the documentation[1], audit-webhook-batch-max-size and
audit-webhook-batch-max-wait are used only in batch mode.
This adds a condition to avoid unnecessary writes to the config.

[1]: https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#batching
2021-04-08 07:52:56 -07:00
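A sketch of the batch-mode settings these flags map to; audit_webhook_batch_max_size is an assumed variable name by analogy with the documented audit_webhook_batch_max_wait, and all values are examples:

```
audit_webhook_server_url: "https://audit.example.com/webhook"
audit_webhook_mode: batch
audit_webhook_batch_max_size: 100        # only rendered when mode is "batch"
audit_webhook_batch_max_wait: 1s         # only rendered when mode is "batch"
```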
6f2abbf79c Move cilium kvstore settings to configmap (#7462)
This PR is to move the cilium kvstore options to the configmap
rather than specifying them in the deployment as args. This
is not technically necessary but keeping all the options in
one place is probably not a bad idea.

Tested with cilium 1.9.5.
2021-04-08 07:32:56 -07:00
bef1e628ac Fix issue with 'latest' in containerd version (#7459) 2021-04-07 08:33:53 -07:00
7340a163a4 fix scale (#7449) 2021-04-07 01:35:53 -07:00
a6622b176b Update cilium_ipsec_enabled check (#7413)
When attempting a fresh install without cilium_ipsec_enabled I ran
into the following error:

failed: [k8m01] (item={'name': 'cilium', 'file': 'cilium-secret.yml', 'type': 'secret', 'when': 'cilium_ipsec_enabled'}) =>
{"ansible_loop_var": "item", "changed": false, "item": {"file": "cilium-secret.yml", "name": "cilium", "type": "secret",
"when": "cilium_ipsec_enabled"},"msg": "AnsibleUndefinedVariable: 'cilium_ipsec_key' is undefined"}

Moving the when condition from the item level to the task level solved
the issue.
2021-04-06 06:17:33 -07:00
771a5e26bb Add KubeSchedulerConfiguration for k8s 1.19 and up (#7351)
* Add KubeSchedulerConfiguration for k8s 1.19 and up

With release of version 1.19.0 of kubernetes KubeSchedulerConfiguration
was graduated to beta. It allows to extend different stages of
scheduling with profiles. Such effect is achieved by using plugins and
extensions.

This patch adds KubeSchedulerConfiguration for versions 1.19 and later.
Configuration is set to k8s defaults or to kubespray vars. Moving those
defaults to new vars will be done in following patch.

Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>

* KubeSchedulerConfiguration: add defaults

Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
2021-04-06 00:35:35 -07:00
be278f9dba Add documentation for audit webhook variables (#7434)
* Add documentation for audit webhook variables

* Enclose the value of audit_webhook_server_url in a codeblock

* Add default value for audit_webhook_batch_max_wait
2021-04-05 13:51:19 -07:00
6479e26904 Replace deprecated 'with_dict' with 'loop' (#7442) 2021-04-05 13:45:19 -07:00
1c7053c9d8 Fix CI template for etcd recover jobs (kube-master rename) (#7441) 2021-04-05 13:41:19 -07:00
596d0289f8 Remove calico-rr from local inventory hosts file (#7439) 2021-04-05 05:24:12 -07:00
7df7054bdc remove local lb privileged (#7437) 2021-04-05 05:22:14 -07:00
5377aac936 fix typo (#7436) 2021-04-05 01:20:19 -07:00
ceb6c172ad Crun v0.19 support (#7433)
* Add support for crun v0.19

* Change default crun version to v0.19
2021-04-05 01:20:13 -07:00
7f52c1d3a2 reset roles need flush iptables:raw (#7426) 2021-04-05 01:16:13 -07:00
af1e16b934 Remove old note related to offline installation (#7429)
The PR https://github.com/kubernetes-sigs/kubespray/pull/6927 has been
merged and the issue https://github.com/kubernetes-sigs/kubespray/issues/6233
was fixed.
This removes the note that is no longer necessary after the above PR.
2021-04-02 09:48:11 -07:00
2257181ca8 Set containerd version to 1.4.4 (#7398)
* Set containerd version to 1.4.3

* Set containerd version to 1.4.4

Co-authored-by: Barry Melbourne <9964974+bmelbourne@users.noreply.github.com>
2021-04-01 23:20:11 -07:00
7e75d48cc4 local provisioner 'useNodeNameOnly' option can be configured (#7421) 2021-04-01 16:54:11 -07:00
6330db89a7 Update KataContainers to 1.12.1 (#7427) 2021-04-01 08:55:21 -07:00
f05d6b3711 Add cilium_ipam_mode variable (#7418)
Starting with Cilium v1.9 the default ipam mode has changed to "Cluster
Scope". See:

https://docs.cilium.io/en/v1.9/concepts/networking/ipam/

With this ipam mode Cilium handles assigning subnets to nodes to use
for pod ip addresses. The default Kubespray deploy uses the Kube
Controller Manager for this (the --allocate-node-cidrs
kube-controller-manager flag is set). This makes "kubernetes" the proper
ipam mode for kubespray with cilium v1.9+.

Tested with Cilium 1.9.5.

This PR also mounts the cilium-config ConfigMap for this variable
to be read properly.

In the future we can probably remove the kvstore and kvstore-opt
Cilium Operator args since they can be in the ConfigMap. I will tackle
that after this merges.
2021-04-01 07:33:22 -07:00
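Per the commit above, a deployment driven by the Kube Controller Manager would set the new variable as follows (a group_vars sketch):

```
kube_network_plugin: cilium
cilium_ipam_mode: kubernetes
```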
cce9d3125d Update k8s-certs-renew.sh.j2 (#7422)
fix undefinedElse
2021-03-31 00:00:58 -07:00
e381ce57e2 Remove left over nodes_to_drain (#7412)
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-29 16:17:56 -07:00
5dbce6a2bd add support for custom calico port (#7419) 2021-03-29 08:38:45 -07:00
5b0e88339a Update cilium-operator clusterrole (#7416)
When upgrading cilium from 1.8.8 to 1.9.5 I ran into the following
error:

level=error msg="Unable to update CRD" error="customresourcedefinitions.apiextensions.k8s.io
\"ciliumnodes.cilium.io\" is forbidden: User \"system:serviceaccount:kube-system:cilium-operator\"
cannot update resource \"customresourcedefinitions\" in API group \"apiextensions.k8s.io\" at the
cluster scope" name=CiliumNode/v2 subsys=k8s

The fix was to add the update verb to the clusterrole. I also added
create to match the clusterrole created by the cilium helm chart.
2021-03-29 00:04:51 -07:00
db43891f2b remove unused handlers in cilium roles (#7414) 2021-03-29 00:04:44 -07:00
f72063e7c2 Remove DNSSEC config management in bootstrap-debian.yml (#7408)
DNSSEC is off by default on ubuntu/bionic64 (18.04) as per resolved.conf(5).
These tasks are artefacts of obsolete infra configuration and are no longer needed.

Furthermore, removing these tasks resolves the issue that the tasks always report
'changed' and bounce systemd-resolved unnecessarily, even if there was no
actual modification of /etc/systemd/resolved.conf.
2021-03-29 00:00:45 -07:00
36a3a78952 Fix remove-node by removing jq usage (#7405)
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-26 08:48:43 -07:00
2d1597bf10 Fix k8s-certs-renew for k8s < 1.20 (#7410)
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-26 08:44:44 -07:00
edfa3e9b14 Correct Jinja Syntax for etcd-unsupported-arch (#6919)
`-%` causes `etcd-unsupported-arch: arm64` to print on COL 1 instead of
COL 6.

Signed-off-by: anthr76 <hello@anthonyrabbito.com>
2021-03-26 02:10:43 -07:00
6fa3565dac Allow connecting to bastion via non-standard SSH port (#7396)
* Allow connecting to bastion via non-standard port

* Fix bastion connection when ansible_port is not provided
2021-03-26 00:48:43 -07:00
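A sketch of a YAML inventory entry for a bastion on a non-standard port; the host name follows Kubespray's bastion convention, while the address and port are placeholders:

```
all:
  hosts:
    bastion:
      ansible_host: 203.0.113.10
      ansible_port: 2222    # non-standard SSH port; omitting it falls back to the default
```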
7dec8e5caa specify runAsGroup, allow safe sysctls by default (#7399) 2021-03-25 08:03:30 -07:00
49abf6007a Add cryptography installation (#7404)
To avoid ModuleNotFoundError due to no module named 'setuptools_rust',
this adds cryptography installation to requirements.txt.

Created by jfc-evs originally as https://github.com/kubernetes-sigs/kubespray/pull/7264
2021-03-25 05:15:29 -07:00
f0cdf71ccb Remove vault (#7400)
* Remove contrib/vault

This has been marked as broken since 2018 / 3dcb914607
This still references apiserver.pem, which has not been used since ddffdb63bf

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>

* Finish nuking vault from the codebase

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-24 09:26:08 -07:00
8655b92e93 Set Kube-router version to 1.2.0 (#7402)
See: `https://github.com/cloudnativelabs/kube-router/releases/tag/v1.2.0`
2021-03-24 09:22:07 -07:00
e1c6992c55 fix: correct hardcoded macvlan template, use var macvlan_interface. (#7401) 2021-03-24 01:46:06 -07:00
486b223e01 Replace kube-master with kube_control_plane (#7256)
This replaces kube-master with kube_control_plane because of [1]:

  The Kubernetes project is moving away from wording that is
  considered offensive. A new working group WG Naming was created
  to track this work, and the word "master" was declared as offensive.
  A proposal was formalized for replacing the word "master" with
  "control plane". This means it should be removed from source code,
  documentation, and user-facing configuration from Kubernetes and
  its sub-projects.

NOTE: The reason why this changes it to kube_control_plane and not
      kube-control-plane is to keep the group names valid in Ansible.

[1]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint/README.md#motivation
2021-03-23 17:26:05 -07:00
d53fd29e34 Add support for cilium ipsec (#7342)
* Add support for cilium ipsec

* Fix typo for bpffs
2021-03-23 13:46:06 -07:00
4f89bfac48 MetalLB: bump to v0.9.6 (#7397)
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
2021-03-23 13:42:06 -07:00
5fee96b404 Fix cinder cert permissions (#7384)
* Fix permissions of cinder cert

* Change runuser for external_cloud_controller to kube user with id 999, part of 999 - kube-cert group
2021-03-23 11:03:37 -07:00
12873f916b download_file for kata (#7393) 2021-03-23 01:39:36 -07:00
efa180392b Auto renew control plane certificates (#7358)
While at it remove force_certificate_regeneration
This boolean only forced the renewal of the apiserver certs
Either manually use k8s-certs-renew.sh or set auto_renew_certificates

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-22 11:22:48 -07:00
6d9ed398e3 Set default k8s version to 1.20.5 2021-03-19 10:04:34 -07:00
6d3dbb43a4 Update hashes for 1.20.5/1.19.9/1.18.17 2021-03-19 10:04:34 -07:00
811f546ea6 Download crun using download_file.yml (#7370)
* Add crun download_url and checksum

* Change versioning format to crun native versioning

* Download crun using download_file.yml

* Get crun version from download defaults

* Delegate crun binary copy task to crun role
2021-03-19 08:40:33 -07:00
ead8a4e4de Fix calico crds missing 3.16.9 (#7386) 2021-03-19 06:58:34 -07:00
05f132c136 Update CNI (calico, kubeovn, multus) and Helm 2021-03-18 17:20:36 -07:00
5f2c8ac38f Update nodelocaldns to 1.17.1 2021-03-18 17:20:36 -07:00
14511053aa Update docker to 20.10.5 2021-03-18 17:20:36 -07:00
8353532a09 Added experimental cri-o support for Amazon Linux 2 (#7353)
* Added experimental cri-o support for Amazon Linux 2

* Fixed dependencies order
2021-03-18 17:16:37 -07:00
1c62af0c95 Download Calico KDD CRDs (#7372)
* Download Calico KDD CRDs

* Replace kustomize with lineinfile and use ansible assemble module

* Replace find+lineinfile by sed in shell module to avoid nested loop

* add condition on sed

* use block for kdd tasks + remove supernumerary kdd manifest apply in start "Start Calico resources"
2021-03-18 17:06:36 -07:00
f103ac7640 Change default OCCM internal and public networks variables to empty lists (#7380)
Signed-off-by: Mikael Johansson <mik.json@gmail.com>
2021-03-18 16:52:36 -07:00
274e06a48d add etcd max snapshot and wals (#7382) 2021-03-18 16:48:36 -07:00
a39f306184 correct a wrong word (#7383) 2021-03-18 00:55:19 -07:00
69d11daef6 Upgrade openSUSE Leap to 15.2 (#7331)
15.1 has reached EOL on 2021-02-02.

Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
2021-03-17 09:12:56 -07:00
057e8b4358 Fixup one more missing kubespray-defaults (#7375)
"The error was: 'proxy_disable_env' is undefined\n\nThe error appears to
be in '<censored>scale.yml': line 72, column 7"

Fixes 067db686f6

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-15 07:09:05 -07:00
18c0e54e4f Add most_recent = true while retrieving the latest image (#7376) 2021-03-15 07:05:06 -07:00
85007fa9a7 Update upgrades.md (#7361)
upgrades.md explains how to upgrade from v1.4.3 to v1.4.6 as an
example. The versions are a little old, and readers of the doc may
be concerned whether the upgrade works fine or not.
This updates the versions after verifying by hand that the procedure works fine.
2021-03-15 03:59:05 -07:00
5c5bf41afe Terraform support for UpCloud (#7360)
* terraform support for UpCloud

* terraform support for UpCloud

* terraform support for UpCloud

* terraform support for UpCloud

* terraform support for UpCloud

* terraform support for UpCloud

* terraform support for UpCloud

* Updates to README.md and main.tf files

* formatting and updating readme

* added a .terraform_validate CI job

* fixed format issue

* added sample inventory

* added symbolic link to group_vars

* added missing tf variables and minor fixes

* added text formatting

* minor formatting fixes
2021-03-15 01:41:04 -07:00
5dba53a223 Fix dynamic inventory link (#7367) 2021-03-11 06:46:22 -08:00
2bcd9eb9e9 Bump crun to 0.18 version (#7364) 2021-03-11 00:00:24 -08:00
5a54db2f3c Check for dummy kernel module (#7348)
The dummy module is needed for nodelocaldns.
2021-03-09 08:07:00 -08:00
b47542b003 disable gather_facts for correct work via bastion (#7265) 2021-03-09 01:47:00 -08:00
14b63ede8c Fixup kubelet.conf to point to kubelet-client-current.pem (#7347)
c9c0c01de0 only fixes the problem for new clusters

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-08 23:55:00 -08:00
b07c5966a6 ansible and jinja2 updates (#7357)
* Update ansible to v2.9.18

Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>

* Update jinja2 to v2.11.3

Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
2021-03-08 11:42:59 -08:00
c7db72e1da Add nodeselector and tolerations for metallb (#7334)
* add nodeselector and tolerations for metallb

* remove unnecessary commented lines in metallb template

* set default speaker toleration to match original manifest
2021-03-08 07:57:42 -08:00
dc5df57c26 Add privileged_without_host_devices support (#7343)
When privileged is enabled for a container, all the `/dev/*` block
devices from the host are mounted into the guest. The
`privileged_without_host_devices` flag prevents host devices from
being passed to privileged containers.

More information:
* https://github.com/containerd/cri/pull/1225
* 1d0f68156b
2021-03-08 00:17:44 -08:00
a9c97e5253 Delete misnamed kubeadm-version.yml
The important action in kubeadm-version.yml is the templating of the configuration,
not finding / setting the version

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-04 23:42:22 -08:00
53e5ef6b4e Always backup both certs and kubeconfig
There is no reason not to back up during an upgrade

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-04 23:42:22 -08:00
8800b5c01d Remove rotate_tokens logic
kubeadm never rotates sa.key/sa.pub, so there is no need to delete tokens/restart pods

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-04 23:42:22 -08:00
280036fad6 Remove admin.conf removal
kubeadm has been the default for a long time now,
and admin.conf is created by it, so let kubeadm handle it

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-04 23:42:22 -08:00
a6e1f5ece9 Remove useless call to 'kubeadm version'
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-04 23:42:22 -08:00
fedd671d68 Remove pre kubeadm cert migration tasks
apiserver.pem is not used since ddffdb63bf

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-04 23:42:22 -08:00
b7c22659e3 kubeadm-config.v1beta2.yaml.j2: etcd log level arg (#7339)
According to [etcd's docs](https://etcd.io/docs/v3.4.0/op-guide/configuration/#--log-package-levels), argument 'log-package-levels' should not contain underscores.
2021-03-03 11:39:50 -08:00
c9c0c01de0 Stop using kubeadm to update server in kubeconfigs (#7338)
Using `kubeadm init phase kubeconfig all` breaks kubelet client certificate rotation
as we are missing `kubeadm init phase kubelet-finalize all` to point to `kubelet-client-current.pem`

kubeconfig format is stable so let's just use lineinfile,
this will avoid other future breakage

This revert to the logic before 6fe2248314

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-03 09:39:20 -08:00
e442b1d2b9 Add kube-ipvs0/nodelocaldns to NetworkManager unmanaged-devices (#7315)
On CentOS 8 they seem to be ignored by default, but it's better to be extra safe
This also make it easy to exclude other network plugin interfaces

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-03 07:27:20 -08:00
e9f4ff227e fix master node taint removal bug (#7336)
code improvement
2021-03-03 05:35:20 -08:00
668bbe0528 Update Kubernetes dashboard and metrics-server 2021-03-02 08:33:19 -08:00
e045a45e48 Update docker & docker-cli to 20.10.4 2021-03-02 08:33:19 -08:00
2c9fc18903 template crun manifest (#7305)
add missing else to if inline
2021-03-02 01:57:19 -08:00
d4eecac108 add option to use calico with azure when using calico in vxlan (#7300) 2021-03-02 01:03:19 -08:00
ef351e0234 Update dashboard_enabled on sample (#7316)
Since https://github.com/kubernetes-sigs/kubespray/pull/6804
dashboard_enabled has been false by default.
However, we forgot to update it in the sample inventory, which caused
confusion.
This updates the sample inventory.
2021-03-02 00:59:19 -08:00
05adeed1fa Fix recover-control-plane undefined 'proxy_disable_env' variable (#7326) 2021-03-01 13:38:16 -08:00
15f1b19136 Fix: added string to bool conversion for use_localhost_as_kube api load balancer (#7324) 2021-03-01 11:53:36 -08:00
154fa45422 fix: the filename </etc/vault> is Duplicate in the reset role. (#7313) 2021-03-01 11:53:25 -08:00
e35becebf8 Move centos7-crio CI job to centos8 (#7327) 2021-03-01 09:57:26 -08:00
bdd36c2d34 Update default exoscale master with more RAM (#7328)
The default master size for exoscale is 2cpu and 2GB ram.
I have found this to be too low, so this increases it to
2cpu and 4GB ram.
2021-03-01 09:41:25 -08:00
0a0156c946 Vsphere (#7306)
* Add terraform scripts for vSphere

* Fixup: Add terraform scripts for vSphere

* Add inventory generation

* Use machines var to provide IPs

* Add README file

* Add default.tfvars file

* Fix newlines at the end of files

* Remove master.count and worker.count variables

* Fixup cloud-init formatting

* Fixes after initial review

* Add warning about disabled DHCP

* Fixes after second review

* Add sample-inventory
2021-02-26 04:20:15 -08:00
100d9333ca Add configmaps to local-path-provisioner CR (#7323) 2021-02-25 16:22:17 -08:00
a4cc416511 use external_openstack_lbaas_use_octavia for template openstack-cloud… (#7298)
* use external_openstack_lbaas_use_octavia for template openstack-cloud-config

* Delete external_openstack_lbaas_use_octavia from default values. Added description and default values of variables to docs

* markdown fix

* make this simple

* set external_openstack_lbaas_use_octavia in default values

* duplicated variable in doc
2021-02-25 11:25:25 -08:00
2ea5793782 Replace KUBE_MASTERS with KUBE_CONTROL_HOSTS (#7257)
This replaces KUBE_MASTERS with KUBE_CONTROL_HOSTS because of [1]:

```
  The Kubernetes project is moving away from wording that is
  considered offensive. A new working group WG Naming was created
  to track this work, and the word "master" was declared as offensive.
  A proposal was formalized for replacing the word "master" with
  "control plane". This means it should be removed from source code,
  documentation, and user-facing configuration from Kubernetes and
  its sub-projects.
```

[1]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint/README.md#motivation
2021-02-23 10:00:03 -08:00
0ddf915027 Update Ansible to v2.9.17 (#7291)
This updates Ansible version to the latest stable version 2.9.17.
2021-02-23 09:54:03 -08:00
067db686f6 Fix proxy usage when *_PROXY are present in environment (#7309)
Since a790935d02 all proxy users
should be properly configured

Now when you have *_PROXY vars in your environment it can lead to failure
if NO_PROXY is not correct, or to persistent configuration changes
as seen with kubeadm in 1c5391dda7

Instead of playing constant whack-a-bug, inject empty *_PROXY vars everywhere
at the play level, and override at the task level when needed

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-02-23 09:44:02 -08:00
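For proxy users, the supported way is to configure the proxy in group_vars rather than relying on ambient *_PROXY environment variables; a sketch with placeholder values, assuming the usual http_proxy/https_proxy/no_proxy variables:

```
http_proxy: "http://proxy.example.com:3128"
https_proxy: "http://proxy.example.com:3128"
no_proxy: "localhost,127.0.0.1,10.233.0.0/18,.svc,.cluster.local"
```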
ed2b4b805e Fix reset when using containerd (#7308)
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-02-22 12:44:03 -08:00
8375aa72e2 [Openstack] Update Cinder CSI driver to v1.20.0 (#7280)
* update Cinder CSI to v1.19.0

* Update Cinder CSI to v1.20
2021-02-22 10:09:42 -08:00
6334e4bd84 Set Kubernetes default version to 1.20.4 2021-02-22 08:45:42 -08:00
86ce8aac85 Add hashes for Kubernetes 1.18.16/1.19.8/1.20.4 2021-02-22 08:45:42 -08:00
de46f86137 Minor update to cilium and calico 2021-02-22 08:45:42 -08:00
5616b08229 Adding else in the inline if-expression (#7292)
Fix "AnsibleUndefinedVariable: the inline if-expression on line xx evaluated to false and no else section was defined."
2021-02-20 02:05:41 -08:00
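The failure mode and fix can be illustrated with a Jinja inline if inside a templated variable; the variable names here are purely illustrative:

```
# Fails with "evaluated to false and no else section was defined" when crun_enabled is false:
bad_example: "{{ '/usr/local/bin/crun' if crun_enabled }}"
# Always evaluates to a defined value:
good_example: "{{ '/usr/local/bin/crun' if crun_enabled else '/usr/bin/crun' }}"
```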
8682a57ea3 use image id instead of name (#7293) 2021-02-19 09:16:25 -08:00
662a37ab4f Fix "api is up" check (#7295)
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-02-19 09:12:25 -08:00
42947c9840 return the ability to update calico from 3.x.x version (#7290)
version check fixed
2021-02-17 00:07:06 -08:00
3749729d5a Remove calico-upgrade leftovers (#7282)
This is dead code since 28073c76ac

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-02-16 11:24:58 -08:00
fb8b075110 facts.yaml: reduce the number of setup calls by ~7x (#7286)
Before this commit, we were gathering:
1 !all
7 network
7 hardware

After we are gathering:
1 !all
1 network
1 hardware

ansible_distribution_major_version is gathered by '!all'

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-02-16 09:34:58 -08:00
1c5391dda7 Ensure kubeadm doesn't use proxy (#7275)
* Move proxy_env to kubespray-defaults/defaults

There is no reasons to use set_facts here

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>

* Ensure kubeadm doesn't use proxy

*_proxy variables might be present in the environment (/etc/environment, bash profile, ...)
When this is the case we end up with that proxy configuration in the /etc/kubernetes/manifests/kube-*.yaml manifests

We cannot unset env variables, but kubeadm is nice enough to ignore empty vars
93d288e2a4/cmd/kubeadm/app/util/env.go (L27)

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-02-16 08:44:58 -08:00
f2d10e9465 allow users to set image_uuid instead of name, this allows the use of openstack community images (#7283) 2021-02-16 07:05:06 -08:00
796d3fb975 Improving PR 6473 (#7259) 2021-02-16 05:19:05 -08:00
5c04bdd52b Fixup cri-o metacopy mount options (#7287)
Ubuntu 18.04 crio package ships with 'mountopt = "nodev,metacopy=on"'
even if GA kernel is 4.15 (HWE Kernel can be more recent)

Fedora package ships without metacopy=on

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-02-15 20:51:07 -08:00
17143dbc51 write openstack controller manifests with correct perms (#7284) 2021-02-15 00:53:05 -08:00
1c8bba36db make sure worker rules is applied on workers (#7279) 2021-02-12 12:43:05 -08:00
95b329b64d bootstrap-os: match on os-release ID / VARIANT_ID (#7269)
This fixes deployment with CentOS 8 Stream and makes detection more reliable

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-02-11 08:14:16 -08:00
de1d9df787 Only use stat get_checksum: yes when needed (#7270)
By default, the Ansible stat module computes a checksum, lists extended attributes and finds the mime type.
To find all stat invocations that really use one of those:
git grep -F stat. | grep -vE 'stat.(islnk|exists|lnk_source|writeable)'

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-02-10 05:36:59 -08:00
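A sketch of a stat call that only needs existence, with the expensive lookups turned off; the path and register name are placeholders:

```
- name: Check whether the file exists (no checksum/xattr/mime needed)
  stat:
    path: /etc/kubernetes/kubeadm-config.yaml
    get_checksum: false
    get_attributes: false
    get_mime: false
  register: kubeadm_config_stat
```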
6450207713 add containerd.io to dpkg_selection (#7273)
`containerd.io` is the companion package of `docker-ce` and is the
proper package name. This is needed to prevent apt upgrade/dist-upgrade
from breaking kubernetes.
2021-02-10 04:48:59 -08:00
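A sketch of the corresponding hold, using Ansible's dpkg_selections module:

```
- name: Hold containerd.io so apt upgrade/dist-upgrade cannot break kubernetes
  dpkg_selections:
    name: containerd.io
    selection: hold
```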
edc4bb4a49 Update kube-ovn to 1.6.0 (#7240) 2021-02-10 02:25:01 -08:00
a21ee33180 fix typo error in role ingress-nginx (#7272) 2021-02-09 07:53:13 -08:00
bcaa31ae33 fix: Restart network doesn't work on Fedora CoreOS (#7271)
When running remove-node.yml tasks to clean up a cluster on Fedora CoreOS,
the task failed to restart the network daemon (task name: "reset | Restart network").
Fedora CoreOS essentially uses NetworkManager, but this task targets the 'network' service.

Signed-off-by: Takashi IIGUNI <iiguni.tks@gmail.com>
2021-02-09 06:35:04 -08:00
0cc1726781 Remove deletion of coredns deployment. (#7211)
* Add unique annotation on coredns deployment and only remove existing deployment if annotation is missing.

* Ignore errors when gathering coredns deployment details to handle case where it doesn't exist yet

* Remove run_once, delegate_to and add to when statement
2021-02-09 06:02:40 -08:00
aad78840a0 Updated etcd cert check tasks to detect when new cert gen is required (#7219)
* Added force_etcd_cert_refresh var to maintain existing functionality. Broke out etcd node cert syncing from member and admin cert sync logic. Now first etcd will sync node certs to other etcd members on every run to keep all etcds up to date after adding additional worker nodes to the cluster

* Updated etcd cert check tasks to better detect when new certificates need to be generated

* Move usage of force_etcd_cert_refresh var to gen_certs fact set

* Force etcd cert generation per server if force_etcd_cert_refresh is set to true

* Include gathering of node certs even if the host is a k8s-cluster member and in the etcd group.

* Removed run_once due to when statement
2021-02-09 01:53:22 -08:00
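To force regeneration regardless of the detection logic, the new variable can be flipped in group_vars (sketch):

```
# Set to true to force per-server etcd cert generation on the next run
force_etcd_cert_refresh: true
```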
e3ab665e90 Update main.yml (#7267)
```
TASK [bootstrap-os : Enable RHEL 8 repos] ***************************************************************************************************************************************************************************************************
fatal: [node6]: FAILED! => {"changed": false, "msg": "This system has no repositories available through subscriptions"}
fatal: [node7]: FAILED! => {"changed": false, "msg": "This system has no repositories available through subscriptions"}
fatal: [node1]: FAILED! => {"changed": false, "msg": "This system has no repositories available through subscriptions"}


root@node1:/kubespray# cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
root@node1:/kubespray#
```
2021-02-08 10:25:37 -08:00
1a91792e7c Change the owner of /etc/crictl.yaml to root (#7254) 2021-02-05 09:28:53 -08:00
670c37b428 Update Helm version to 3.5.2 (#7248)
Helm v3.5.2 is a security (patch) release. Users are strongly
recommended to update to this release. It fixes two security issues in
upstream dependencies and one security issue in the Helm codebase.

See https://github.com/helm/helm/releases/tag/v3.5.2
2021-02-05 08:16:52 -08:00
040dacd5cd roles/docker: Make repokey fingerprint overrideable (#7247)
This makes the docker role work the same as the containerd role.
Being able to override this is needed when you have your own debian
repository. E.g. when performing an airgapped installation
2021-02-05 07:44:52 -08:00
59541de437 Vagrantfile: always recreate inventory symlink (#7245)
Fixes 7244

Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
2021-02-05 00:50:52 -08:00
fc8551bcba Run containerd related tasks on OracleLinux. (#7250) 2021-02-05 00:46:52 -08:00
c2c97c36bc Add in tests for Calico with dual-stack networking 2021-02-05 00:04:52 -08:00
211fdde742 Add IPv6 libvirt details to the Vagrantfile 2021-02-05 00:04:52 -08:00
366cbb3e6f Ensure we gather IPv6 facts 2021-02-05 00:04:52 -08:00
a318624fad Auto-add IPv6DualStack featureGate
When enable_dual_stack_networks is set, we need to make sure
IPv6DualStack=true is set too, otherwise we end up with
a broken cluster.
2021-02-05 00:04:52 -08:00
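Per the commit above, enabling dual stack implies the feature gate; a group_vars sketch (with the change, the gate is added automatically and is shown explicitly here only for clarity):

```
enable_dual_stack_networks: true
kube_feature_gates:
  - "IPv6DualStack=true"   # auto-added when enable_dual_stack_networks is true
```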
3cf5981146 Switch to use upstream kube_feature_gates logic 2021-02-05 00:04:52 -08:00
4cc065e66d Changes to support Dual Stack networking 2021-02-05 00:04:52 -08:00
ba731ed145 Update docker packages to 19.03.15 and 20.10.3 (#7243) 2021-02-04 13:20:53 -08:00
b77460ec34 contrib/terraform/exoscale: Rework SSH public keys (#7242)
* contrib/terraform/exoscale: Rework SSH public keys

Exoscale has a few limitations with `exoscale_ssh_keypair` resources.
Creating several clusters with these scripts may lead to an error like:

```
Error: API error ParamError 431 (InvalidParameterValueException 4350): The key pair "lj-sc-ssh-key" already has this fingerprint
```

This patch reworks handling of SSH public keys. Specifically, we rely on
the more cloud-agnostic way of configuring SSH public keys via
`cloud-init`.

* contrib/terraform/exoscale: terraform fmt

* contrib/terraform/exoscale: Add terraform validate

* contrib/terraform/exoscale: Inline public SSH keys

The Terraform scripts need to install some SSH key, so that Kubespray
(i.e., the "Ansible part") can take over. Initially, we pointed the
Terraform scripts to `~/.ssh/id_rsa.pub`. This proved to be suboptimal:
Operators sharing responsibility for a cluster risk unnecessarily replacing resources.

Therefore, it has been determined that it's best to inline the public
SSH keys. The chosen variable `ssh_public_keys` provides some uniformity
with `contrib/azurerm`.

* Fix Terraform Exoscale test

* Fix Terraform 0.14 test
2021-02-03 07:32:28 -08:00
88bee6c68e Fix ansible calico route reflector tasks in calico role (#7224)
* Fix calico-rr tasks

* revert stdin only when it's already a string
2021-02-03 07:22:29 -08:00
1f84d6344b local-path-provisioner change default version to v0.0.19 and update config template (#7238)
* update local-path-storage config template to version v0.0.19

* changes local_path_provisioner image tag to v0.0.19

* removes copy paste example from rancher local-path-provisioner repo
2021-02-03 06:50:28 -08:00
699fbd64ab Move recover_control_plane/master to control-plane (#7236)
According to the following recommendation, this moves the directory
to control-plane:

The Kubernetes project is moving away from wording that is considered
offensive. A new working group WG Naming was created to track this work,
and the word "master" was declared as offensive. A proposal was formalized
for replacing the word "master" with "control plane".
2021-02-03 02:06:29 -08:00
b42bf39fb7 MetalLB: bump to v0.9.5 (#7241)
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
2021-02-03 01:02:28 -08:00
5368d51d63 Mention docker image in readme (#7239) 2021-02-02 09:16:28 -08:00
c5db012c9a Move kubernetes/master to kubernetes/control-plane (#7218)
This is a small step to replace "master" with "control-plane" in
Kubespray project.
2021-02-01 07:15:49 -08:00
b70d986bfa Ensure when use_oracle_public_repo is set to false the public Oracle Linux yum repos are not set (#7228) 2021-01-29 03:59:41 -08:00
973628fc1b FIX: Bastion undefined variable (#7227)
Fixes the following error when using Bastion Node with the sample config.
```
fatal: [bastion]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'bastion'\n\nThe error appears to be in '/home/felix/inovex/kubespray/roles/bastion-ssh-config/tasks/main.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: set bastion host IP\n  ^ here\n"}
```
2021-01-28 10:07:37 -08:00
91fea7c956 Fix unintended SIGPIPEs. (#7214) 2021-01-27 01:07:40 -08:00
d378d789cf Add retries to drain during upgrade. Allow leaving nodes cordoned after drain failure. Allow continuing upgrade if drain fails. (#7206) 2021-01-26 11:10:31 -08:00
9007d6621a Update nginx, minor weave and misc CI tools (vagrant/terraform) (#7215) 2021-01-26 08:22:34 -08:00
774ec49396 Update azure cloud config (#7208)
* Allow configureable vni and port for flannel overlay

* additional options for azure cloud config
2021-01-26 07:24:35 -08:00
bba55faae8 calico: fix NetworkManager check (#7169)
Previous check for presence of NM assumed "systemctl show
NetworkManager" would exit with a nonzero status code, which seems not
the case anymore with recent Flatcar Container Linux.

This new check also checks the activeness of network manager, as
`is-active` implies presence.

Signed-off-by: Jorik Jonker <jorik@kippendief.biz>
2021-01-25 23:52:34 -08:00
8f2b0772f9 containerd,docker: stop installing extras repo on CentOS/RHEL (#7203)
This was introduced in 143e2272ff
The extras repo is enabled by default on CentOS, and is not the right repo for EL8
Instead of adding a CentOS repo to RHEL, enable the needed RHEL repos with rhsm_repository

For RHEL 7, we need the "extras" repo for container-selinux
For RHEL 8, we need the "appstream" repo for container-selinux, ipvsadm and socat

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-25 03:12:54 -08:00
1a409dc7ae Add download bin tasks (#7131)
* Add download bin tasks

* Add tags never and etcd

* yamllint
2021-01-22 20:41:39 -08:00
404ea0270e Added terraform support for Exoscale (#7141)
* Added terraform support for Exoscale

* Fixed markdown lint error on exoscale terraform
2021-01-22 20:37:39 -08:00
ef939dee74 Add missing 'ingress-controller' tag to alb (#7204) 2021-01-22 19:11:39 -08:00
f1576eabb1 Calico: fixup check when ipipMode / vxlanMode is not present (#7195)
calicoctl.sh get ipPool default-pool -o json
{
  "kind": "IPPool",
  "apiVersion": "projectcalico.org/v3",
  "metadata": {
    "name": "default-pool",
...
  },
  "spec": {
    "cidr": "10.233.64.0/18",
    "ipipMode": "Always",
    "natOutgoing": true,
    "blockSize": 24,
    "nodeSelector": "all()"
  }
}

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-21 20:39:26 -08:00
49c4345c9a preinstall: etcd group might not exists (#7202)
fixes 8c1821228d

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-21 14:31:02 -08:00
f94182f77d Add cleanup of openstack network ports in CI (#7201) 2021-01-21 12:51:02 -08:00
222a77dfe7 Change node-role.kubernetes.io from master to control-plane (#7183) 2021-01-21 08:13:03 -08:00
24ceee134e Document the terraform option master_allowed_ports (#7196)
Implemented in #6547
2021-01-21 07:55:06 -08:00
04c8a73889 Check kube-apiserver up on all masters before upgrade (#7193)
Only checking the kubernetes api on the first master when upgrading is not enough.
Each master needs to be checked before its upgrade.

Signed-off-by: Rick Haan <rickhaan94@gmail.com>
2021-01-20 01:42:03 -08:00
9a75501152 Promote node.k8s.io API groups from v1beta1 to v1 2021-01-19 08:57:45 -08:00
f6fbbc17a4 Cleanup old checks for k8s 1.18 (#7192) 2021-01-19 08:43:45 -08:00
15dc3868c3 Update Weave to 2.8.0 (#7181) 2021-01-19 08:35:48 -08:00
2525d7aff8 Update main.yml (#7175)
Fix issue #7129. Calico image tags support multiarch on quay.io.
2021-01-19 05:59:46 -08:00
a5d2137ed9 containerd: ensure containerd is really started and enabled
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-19 05:55:45 -08:00
a8e51e686e containerd,docker: use apt_repository instead of action
yum_repository expects really different params, so there is nothing to factor here
Ubuntu is not an ansible_os_family; the OS family for Ubuntu is Debian
Check for ansible_pkg_mgr == apt

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-19 05:55:45 -08:00
a2429ef64d containerd,docker: use apt_key instead of action
we don't need rpm_key, so there is nothing to factor here
Ubuntu is not an ansible_os_family; the OS family for Ubuntu is Debian
Check for ansible_pkg_mgr == apt

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-19 05:55:45 -08:00
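A sketch of the pattern these two commits describe, using Docker's Ubuntu repository as the example; Kubespray's actual task names and repo templating differ:

```
- name: Add Docker apt key
  apt_key:
    url: https://download.docker.com/linux/ubuntu/gpg
    state: present
  when: ansible_pkg_mgr == 'apt'

- name: Add Docker apt repository
  apt_repository:
    repo: "deb https://download.docker.com/linux/ubuntu {{ ansible_distribution_release }} stable"
    state: present
  when: ansible_pkg_mgr == 'apt'
```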
1b88678cf3 containerd: use package instead of action
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-19 05:55:45 -08:00
0e96852159 docker: use package instead of action, cleanup
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-19 05:55:45 -08:00
19a61d838f containerd: use copy to set apt pin
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-19 05:55:45 -08:00
4eec302e86 preinstall: use package instead of action, use state: present
Before this commit we were upgrading base os packages on each run

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-19 05:55:45 -08:00
f3885aa589 docker: stop using apt force
Here is the description from the Ansible docs:
Corresponds to the --force-yes to apt-get and implies allow_unauthenticated: yes
This option will disable checking both the packages' signatures and the certificates of the web servers they are downloaded from.
This option *is not* the equivalent of passing the -f flag to apt-get on the command line
**This is a destructive operation with the potential to destroy your system, and it should almost never be used.** Please also see man apt-get for more information.

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-19 05:55:45 -08:00
b493c81ce8 Update metrics-server to 0.4.1 (#7188) 2021-01-19 05:45:44 -08:00
9ef62194c3 Update bunch of dependencies (#7187) 2021-01-19 05:41:45 -08:00
91ee4aa542 Decrease docker dependency (#7172) 2021-01-18 01:41:44 -08:00
e3caff833c Add prompt to upgrade node or delay before upgrade (#7168)
* Add prompt to upgrade node or delay before upgrade

* add docs
2021-01-17 23:53:43 -08:00
b2995e4ec4 Adding other masters sequentially, not in parallel (#7166) 2021-01-15 17:19:43 -08:00
ccd3aeebbc Remove ignore_errors from drain tasks and enable retries (#7151)
* Remove ignore_errors from drain tasks and enable retries

* Fix lint error by checking if stdout length is not 0, i.e. the string is not empty.
2021-01-15 13:17:43 -08:00
7a033a1d55 Add hashes and update default K8S version to 1.20.2 (#7171) 2021-01-15 12:43:09 -08:00
1652d8bf4b Use Kubespray v2.15.0 as base image for CI (#7165) 2021-01-15 08:25:52 -08:00
c85f275bdb Fix typo (#7164)
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2021-01-15 02:19:52 -08:00
a923f4e7c0 Update kube_version_min_required and cleanup hashes for release (#7160) 2021-01-15 00:33:51 -08:00
82af8e455e docker: remove old versions
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 09:39:05 -08:00
1baee488ab containerd: remove duplicate package pining task
Leave it with the install instead of the repo config

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 09:39:05 -08:00
7433b70d95 docker: remove kernel check
Only CentOS 7 uses Linux 3.10, all other OSs have more recent kernels

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 09:39:05 -08:00
de6c71a426 docker: remove dockerproject repo reference
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 09:39:05 -08:00
16a34548ea docker: remove checks for docker 1.12
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 09:39:05 -08:00
b2f3ab77cd docker: remove some old debug code
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 09:39:05 -08:00
b2f6ed7dee docker: remove obsoletes=0 in yum.conf
This was introduced in ef7f5edbb3
obsoletes=0 is not present in the official repo config
https://download.docker.com/linux/centos/docker-ce.repo
so it has probably not been needed for some time

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 09:39:05 -08:00
09e34d29cd containerd: remove docker_yum_conf / yum_conf
leftover from 1945499e2f

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 09:39:05 -08:00
667a6981ea preinstall: remove credentials folder move
This was introduced in 3004791c64,
so since 2018 everyone should be upgraded ;)

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 09:39:05 -08:00
cf1d9f5612 preinstall: remove old Fedora task
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 09:39:05 -08:00
55b03a41b2 containerd-common,containerd,docker: remove ubuntu arch specific vars
By removing ancient versions we no longer need arch-specific vars

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 09:39:05 -08:00
81b4ffa6b4 Add Fedora 33 CI, remove Fedora 31 (#7072) 2021-01-14 08:27:05 -08:00
8c1821228d preinstall: fixup etcd_deployment_type check (#7152)
fixes 8331939aed
Thanks to Tomas Vanderka / karlism / LuckySB

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 06:53:05 -08:00
9c5c1a09a1 test-infra: update CentOS images (#7134)
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-14 05:25:04 -08:00
09fa99fdc6 Update hashes and set default version to 1.19.7 (#7150) 2021-01-13 14:57:02 -08:00
8331939aed preinstall: check etcd_deployment_type (#7149)
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-13 14:19:03 -08:00
02213d6e07 change nodeSelector label from deprecated beta.kubernetes.io/os and arch to kubernetes.io prefix (#7138) 2021-01-13 13:55:02 -08:00
387df0ee1f Remove unnecessary condition check when updating server field in kube-proxy kubeconfig (#7145) 2021-01-13 09:50:37 -08:00
b59035df06 change nginx default HTTPS protocol from "SSLv2" to "TLSv1.2 TLSv1.3" (#7144) 2021-01-13 08:34:36 -08:00
5517e62c86 Fix and document environment variable KUBE_MASTERS (#7127)
This variable was added as KUBE_MASTERS_MASTERS. That's probably a typo.
Remove the redundant `_MASTERS` suffix. Also, document the variable in the
help message.
2021-01-11 11:34:24 -08:00
5dca5225dc update docs main menu with CRI section (#7132) 2021-01-11 09:07:05 -08:00
c005c90746 Remove unnecessary failed_when (#7120)
The TASK [Generate a list of information about the images on a node]
registers the list of container images in docker_images.
The next TASK [Set pull_required if the desired image is not
yet loaded] then assumes those images have been registered.
However, the first TASK sometimes fails as in [1], and the failure is
ignored due to failed_when: false, which causes another issue later.
This removes the unnecessary failed_when so the failure is detected
right at that point.
In addition, this also removes no_log: true, because the output doesn't
contain any sensitive data and it only makes debugging difficult.

[1]: https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/jobs/934714534#L2953
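A rough sketch of the resulting task shape, with failed_when and no_log simply dropped; the command and the image_reponame variable are illustrative, not the exact Kubespray ones:
```
- name: Generate a list of information about the images on a node
  shell: docker images -q --no-trunc
  register: docker_images
  changed_when: false
  # no 'failed_when: false' and no 'no_log: true' any more: a failure here
  # should stop the play, and its output is safe to print

- name: Set pull_required if the desired image is not yet loaded
  set_fact:
    pull_required: "{{ image_reponame | default('example/image') not in docker_images.stdout }}"
```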
2021-01-11 08:49:10 -08:00
8bdd0bb82f Require 2.9.0 <= Ansible version < 2.10.0 (#7130)
We have multiple breakage reports with Ansible 2.10+ in https://github.com/kubernetes-sigs/kubespray/issues/6762
README.md already recommended 2.9+

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-11 07:49:11 -08:00
a790935d02 Only setup *_PROXY env variables where needed (#7095)
no_proxy is a pain to get right, and having proxy variables present causes issues
(k8s components get proxy configuration after upgrade, see #7100)

It's better to configure the proxy only for what actually requires it, for example:
- the runtime (containerd/docker/crio)
- the package manager + apt_key
- the download tasks
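A sketch of the idea, assuming the usual http_proxy/https_proxy/no_proxy variables: the proxy is attached to an individual download task via `environment:` instead of being exported host-wide (URL and destination are illustrative):
```
- name: Download a binary through the proxy
  get_url:
    url: "{{ download_url | default('https://example.com/some-binary') }}"
    dest: /usr/local/bin/some-binary
    mode: "0755"
  environment:
    http_proxy: "{{ http_proxy | default('') }}"
    https_proxy: "{{ https_proxy | default('') }}"
    no_proxy: "{{ no_proxy | default('') }}"
```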

Tested with the following clusters
- 4 CentOS 8 nodes
- 1 Ubuntu 20.04 node

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-11 07:21:08 -08:00
1fcbbd3b9d Update containerd documentation with etcd change (#7126)
* update containerd documentation with etcd change

* update containerd docs
2021-01-11 06:39:08 -08:00
b9077d3ea2 Add ping_access_ip; allows to disable ping test (#7020)
In some environments, it might not be possible to ping the IP address
of the nodes, e.g., because ICMP echo is blocked.

This commit allows kubespray to be configured to disable the ping
check, while performing all other checks.
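Assumed inventory usage (the variable name is taken from the commit title):
```
# group_vars/all/all.yml
ping_access_ip: false   # skip the ICMP reachability check when echo is filtered
```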
2021-01-11 06:15:08 -08:00
1d7d84540f update ansible dependency (#7128)
This solves the error "Service is in unknown state" when creating a new Kubernetes installation.
See: https://github.com/ansible/ansible/issues/71528
2021-01-11 01:39:06 -08:00
6f471d1c5e Typo fix: kuberntes -> kubernetes (#7125) 2021-01-10 12:19:06 -08:00
ff95292435 calico: fix warnings (#7121)
TASK [network_plugin/calico : Calico | Configure calico network pool] **********
task path: /builds/kargo-ci/kubernetes-sigs-kubespray/roles/network_plugin/calico/tasks/install.yml:138
Friday 08 January 2021  17:10:12 +0000 (0:00:01.521)       0:11:36.885 ********
[WARNING]: The value {'kind': 'IPPool', 'apiVersion': 'projectcalico.org/v3',
'metadata': {'name': 'default-pool'}, 'spec': {'blockSize': 24, 'cidr':
'10.233.64.0/18', 'ipipMode': 'Always', 'vxlanMode': 'Never', 'natOutgoing':
True}} (type dict) in a string field was converted to "{'kind': 'IPPool',
'apiVersion': 'projectcalico.org/v3', 'metadata': {'name': 'default-pool'},
'spec': {'blockSize': 24, 'cidr': '10.233.64.0/18', 'ipipMode': 'Always',
'vxlanMode': 'Never', 'natOutgoing': True}}" (type string). If this does not
look like what you expect, quote the entire value to ensure it does not change.

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-08 16:01:05 -08:00
e8a8a7b8cc Update sample to 3 master node (#7117) 2021-01-08 09:14:54 -08:00
b0ad8ec023 Fixed issue #7112.  Created new API Server vars that replace defunct Controller Manager one (#7114)
Signed-off-by: Brendan Holmes <5072156+holmesb@users.noreply.github.com>
2021-01-08 07:20:53 -08:00
ab2bfd7f8c Proxy small fixes (#7102)
* Improve how we set 'proxy=' in yum.conf or dnf.conf

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* Fixup spaces in no_proxy

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* Add svc,svc.{{ dns_domain }} to no_proxy

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-07 10:50:53 -08:00
29f1c40580 Ignore all .git* for markdownlint (#7109)
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-07 10:16:53 -08:00
2585e72a30 Fix markdownlint failures of offline (#7108)
This fixes the following failures:

./contrib/offline/README.md:14:1 MD014/commands-show-output Dollar signs used before commands without showing output [Context: "$ ./manage-offline-container-i..."]
./contrib/offline/README.md:20:1 MD014/commands-show-output Dollar signs used before commands without showing output [Context: "$ ./manage-offline-container-i..."]
2021-01-06 23:45:45 -08:00
837fca1368 Add docker 20.10 to available packages (#7106) 2021-01-06 09:23:51 -08:00
0c995c1ea7 Remove last 1.19.5 references (#7107) 2021-01-06 08:43:51 -08:00
ad244ab744 Add manage-offline-container-images.sh (#7024)
One challenge of offline deployment was how to collect the necessary
container images beforehand. This adds a script to solve it.
2021-01-06 08:05:52 -08:00
308ceee46c Evaluating the conditional (need_https_proxy.rc != 0) fails if http_proxy is set and skip_http_proxy_on_os_packages is true (#7078)
* Remove the condition, because need_http_proxy.rc is empty when http/https_proxy and skip_http_proxy_on_os_packages=true are set

* Modify sample for debian and centos skip_http_proxy

* Modify sample for debian and centos skip_http_proxy
2021-01-05 18:49:51 -08:00
e0195da80d Allow containerd root and state path to be configured (#7098) 2021-01-05 07:13:58 -08:00
b02f40b392 Improve reset.yml (#7094)
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-05 07:09:59 -08:00
c0fe32c4ec Add repo name for Fedora (#7093)
This fixes 1945499e2f

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-04 10:39:57 -08:00
e9f93a1de9 Remove libseccomp install tasks (#7074)
All packages have proper dependencies in latest versions

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-04 09:17:57 -08:00
c14388629a calico: check if inventory settings match cluster settings (#6969)
If some settings were changed from the default but not committed into an inventory repo,
we risk breaking the cluster / causing downtime, so add some extra checks

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2021-01-04 09:07:56 -08:00
3c1f84a9e9 [fix] change result conditions (#6973) 2020-12-30 07:15:49 -08:00
398a995798 Fix markdownlint failures under ./roles/ (#7089)
This fixes markdownlint failures under roles/
2020-12-30 05:07:49 -08:00
dc86b2063a Fix markdown failures on contrib/terraform (#7082)
This fixes markdown failures on contrib/terraform.
2020-12-25 12:10:27 -08:00
bbab1013c5 Added gcp terraform support (#6974)
* Added gcp terraform support

* Added http/https firewall rule

* Ignoring lifecycle changes for attached disks on the google_compute_instance
2020-12-24 09:16:26 -08:00
1945499e2f Disable docker-ce yum repo by default / cleanups (#7080)
Upgrading docker / containerd without adapting the configuration might break the node,
so disable docker-ce repo by default.
We are already using dpkg hold for Debian.

All containerd.io packages provide /usr/bin/runc, so no need to check

yum_conf was never used for containerd

module_hotfixes should not be needed with the EL8 repo

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2020-12-23 13:12:26 -08:00
c971debd15 Fix crictl with Docker (#7081) 2020-12-23 08:28:26 -08:00
161c7e9fce Blacklist Calico's VXLAN interface from NetworkManager (#7037)
See https://github.com/projectcalico/calico/issues/3271

Otherwise Calico can get into a fight with NM about who "owns" the vxlan.calico
interface, breaking all pod traffic.
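A minimal sketch of the blacklist, assuming the NetworkManager keyfile syntax documented by the Calico project; the destination path is illustrative:
```
- name: Prevent NetworkManager from managing Calico interfaces
  copy:
    dest: /etc/NetworkManager/conf.d/calico.conf
    mode: "0644"
    content: |
      [keyfile]
      unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico
```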
2020-12-23 08:24:27 -08:00
fd3ebc13f7 Fix terraform0.13 errors (#7077)
* [terraform/aws] Fix Terraform >=0.13 warnings

Terraform >=0.13 gives the following warning:

```
Warning: Interpolation-only expressions are deprecated
```

The fix was tested as follows:
```
rm -rf .terraform && terraform0.12.26 init && terraform0.12.26 validate
rm -rf .terraform && terraform0.13.5 init && terraform0.13.5 validate
rm -rf .terraform && terraform0.14.3 init && terraform0.14.3 validate
```
which gave no errors nor warnings.

* [terraform/openstack] Fixes for Terraform >=0.13

Terraform >=0.13 gives the following error:
```
Error: Failed to install providers
Could not find required providers, but found possible alternatives:
  hashicorp/openstack -> terraform-provider-openstack/openstack
```

This patch fixes these errors.

This fix was tested as follows:
```
rm -rf .terraform && terraform0.12.26 init && terraform0.12.26 validate
rm -rf .terraform && terraform0.13.5 init && terraform0.13.5 validate
rm -rf .terraform && terraform0.14.3 init && terraform0.14.3 validate
```
which gave no errors nor warnings for Terraform 0.13.5 and Terraform
0.14.3. Unfortunately, 0.12.x gives a harmless warning, but
with 0.14.3 out the door, I guess we need to move on.

* [terraform/packet] Fixes for Terraform >=0.13

This fix was tested as follows:
```
export PACKET_AUTH_TOKEN=blah-blah
rm -rf .terraform && terraform0.12.26 init && terraform0.12.26 validate
rm -rf .terraform && terraform0.13.5 init && terraform0.13.5 validate
rm -rf .terraform && terraform0.14.3 init && terraform0.14.3 validate
```

Errors are gone, but warnings still remain. It is impossible to please
all three versions of Terraform.

* Add tests for Terraform >=0.13
2020-12-23 05:08:26 -08:00
9db4b949f2 Fedora CoreOS fixes (#7010)
* Fedora CoreOS: Fix for ethtool pre-installed

Fix error in rpm-ostree when ethtool is already installed (FCOS >= 32.20201104.3.0)

* Fedora CoreOS: Fix connection lost

Fedora CoreOS: Ignore connection lost due to reboot and continue the playbook
2020-12-23 00:22:25 -08:00
5b5726bdd4 Improve markdownlint for contrib/network-storage (#7079)
This fixes markdownlint failures under contrib/network-storage and
contrib/vault.
2020-12-23 00:00:26 -08:00
1347bb2e4b Improve markdownlint coverage (#7075)
Now markdownlint covers ./README.md and md files under ./docs only.
However we have a lot of md files under different directories also.
This enables markdownlint for other md files also.
2020-12-22 04:44:26 -08:00
286191ecb7 Update nginx & cilium version (#7073) 2020-12-21 07:22:25 -08:00
096bcdd078 Download once for crio (#6998)
* download run once feature for CRI-O

* fix typo

* fix test
2020-12-21 01:54:25 -08:00
7d7739e031 Calico: fix node ip subnet detection (#7065)
We currently set the IP variable to hostIP.
Before https://github.com/projectcalico/node/pull/593 (not yet released),
Calico interprets that as hostIP/32.
Using 'can-reach' we get the future behavior.
This fixes the vxlan and IPIP CrossSubnet modes
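Illustrative calico-node environment fragment (not the literal Kubespray template); any address on the node's routed network can serve as the can-reach target:
```
env:
  - name: IP
    value: autodetect
  - name: IP_AUTODETECTION_METHOD
    value: "can-reach=8.8.8.8"   # picks the interface/subnet used to reach this address
```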

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2020-12-21 00:58:25 -08:00
3470810709 Remove kube_version in downloads role (#7066) 2020-12-19 14:38:26 -08:00
98b43bb24f Removes apps tags from apps meta dependencies (#7041)
Signed-off-by: François Travais <francois@travais.fr>
2020-12-19 08:14:25 -08:00
275c54e810 Wait for available API token in a new namespace (#7045)
Just after creating a namespace, the corresponding token may not have been
created yet and pod creation can sometimes fail.
This adds a check for the token in the new namespace to make this test
case stable.
2020-12-19 04:54:24 -08:00
9a05037352 SHAs for 1.19.6, 1.18.14, 1.18.13, 1.17.16 and 1.17.15 (#7063)
* SHAs for 1.19.6, 1.18.14, 1.18.13, 1.17.16 and 1.17.15

* Fix etcd version in README

* Bump kube_version to 1.19.6
2020-12-18 15:42:24 -08:00
143f9c78be fix MASSIVE_SCALE_THRESHOLD env parameter (#7054) 2020-12-18 08:50:25 -08:00
75f0aaf4a1 Fixed waiting for scheduler and controller manager (#6893) 2020-12-18 07:38:25 -08:00
c36df6a78b fix typo in containerd doc (#7057) 2020-12-18 00:34:24 -08:00
10a6bd67de Calico: update files to handle multi-asn bgp peering conditions. (#6971)
* update files to handle multi-asn bgp peering conditions.

* put back in the serviceClusterIPs.  Bad merge.

* remove extraneous environment var.

* update files as discussed with mirwan

* update titles.

* add not in.

* add a conditional for using bgp to advertise cluster ips.

Co-authored-by: marlow-h <mweston@habana.ai>
2020-12-17 22:54:25 -08:00
db17ba54b4 Add cluster-name to external-openstack-cloud-controller-manager (#7055)
If cluster-name is not set, the default value "kubernetes" is used.
The load balancers created by Kubernetes follow the format:
  kube_service_clusterName_serviceNamespace_serviceName
If 2 clusters create a loadbalancer for the same service in the same
namespace, they will share the same non-working loadbalancer.

Signed-off-by: Cedric Hnyda <cedric.hnyda@itera.io>
2020-12-17 08:23:09 -08:00
c2f64a52da Update dashboard to 2.1.0 and metrics-scraper to 1.0.6 (#7050) 2020-12-17 07:29:09 -08:00
0b81c6a6c4 Fix to use ansible-lint instead of ansible-lint.sh (#7047)
tests/scripts/ansible-lint.sh was mentioned in the doc, but no such file
actually exists. We can use the ansible-lint command to check
Ansible yml files without any options.
This updates the doc to use the command.
2020-12-17 07:21:09 -08:00
36bd4cdc43 Update cni plugin to 0.9.0 (#7049) 2020-12-17 07:17:09 -08:00
87eea16d7b Fix config containerd template (#7051) 2020-12-17 07:13:09 -08:00
0aa6d3d4bc Replace non-ascii with ascii (#7044)
When opening main.yaml, vi cannot display the string correctly
because it contains non-ASCII characters. This replaces it.
2020-12-16 23:43:09 -08:00
43dbff938e Exclude .git/ from shellcheck (#7048)
If a branch name contains '.sh', the current shellcheck run checks the branch
file under .git/ and reports an error because its format is not that of a
shell script.
This makes shellcheck exclude files under .git/ to avoid this issue.
2020-12-16 15:51:09 -08:00
54aebb92fd Set Kube-Router version to v1.1.1 (#7022) 2020-12-16 13:58:31 -08:00
f0c7649158 Update ambassador.md (#7023)
Typo
2020-12-16 07:04:21 -08:00
93445b4dbc Update hashes and set default version to 1.19.5 (#7012)
* Update hashes and set default version to 1.19.5

Signed-off-by: anthr76 <hello@anthonyrabbito.com>

* Reorder hashes

1.19.5 hashes should be near 1.19.x

* Added back blank line
2020-12-16 01:42:20 -08:00
aeaa876d57 Move some approvers to emeritus status (#6966) 2020-12-10 01:40:54 -08:00
9c1e08249d change | to is (#6991)
Since Ansible 2.9, `search` cannot be used as a filter after a pipe; it must be used after `is`

Signed-off-by: Sylvain Desbureaux <sylvain.desbureaux@orange.com>
2020-12-09 07:26:50 -08:00
33a60fe919 Fix warning of mkdir usage (#6951)
This fixes the following warning:

  [kubernetes/client : Generate admin kubeconfig with external api endpoint]
  [WARNING]: Consider using the file module with state=directory rather than
  running 'mkdir'.  If you need to use command because file is insufficient
  you can ...
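For reference, a sketch of the module-based form the warning suggests (the path is illustrative, not necessarily the fix that was merged):
```
- name: Ensure the artifacts directory exists
  file:
    path: "{{ artifacts_dir | default('./artifacts') }}"
    state: directory
    mode: "0750"
  delegate_to: localhost
```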
2020-12-09 07:14:51 -08:00
85982dc8e9 add support crio version for various k8s vers (#7003)
* add support crio version for various k8s vers

* regexp in pkg versions
2020-12-09 01:22:50 -08:00
dbe02d398a etcd: Fix permissions of /etc/ssl/etcd/ssl (#6908) 2020-12-09 00:48:49 -08:00
e022e2e13c Fix URL of offline container images (#7005)
When clicking the link, we got a NotFound error page from github.com.
This fixes the link to avoid that.
2020-12-09 00:16:50 -08:00
7084d38767 Fix ETCD_CIPHER_SUITES shell var assignment (#7002) 2020-12-08 13:23:34 -08:00
00e0f3bd2b Fix nf_conntrack_ipv4 modprobe (#6988)
RedHat 8.3 merged nf_conntrack_ipv4 into nf_conntrack but still advertises kernel 4.18,
so just try to modprobe it and decide depending on the result.
Also, nf_conntrack is a dependency of ip_vs, so there is no need to handle it separately
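A hedged sketch of the probe-and-fallback logic; the fact name kube_proxy_conntrack_module is made up for illustration:
```
- name: Try to load nf_conntrack_ipv4 (merged into nf_conntrack on RHEL 8.3+)
  command: modprobe nf_conntrack_ipv4
  register: modprobe_nf_conntrack_ipv4
  failed_when: false
  changed_when: false

- name: Remember which module to persist for kube-proxy
  set_fact:
    kube_proxy_conntrack_module: >-
      {{ 'nf_conntrack_ipv4' if modprobe_nf_conntrack_ipv4.rc == 0 else 'nf_conntrack' }}
```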

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2020-12-07 18:17:11 -08:00
cd7212453e Add etcd tls cipher suites (#7001)
* Add etcd tls cipher suites

* yamllint
2020-12-07 18:13:10 -08:00
a69f2b09da download run once feature for containerd (#6997) 2020-12-07 01:09:25 -08:00
878fe80ca3 add and use common crictl role (#6978) 2020-12-05 09:43:25 -08:00
8331c1f858 Hold the docker-ce-cli (#6995)
This will make sure an upgrade doesn't upgrade the docker cli.
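A minimal sketch of the hold, using the dpkg_selections module:
```
- name: Hold docker-ce-cli at its installed version
  dpkg_selections:
    name: docker-ce-cli
    selection: hold
```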
2020-12-04 18:21:25 -08:00
f4a69d2827 Update docker to 19.03.14 and containerd to 1.3.9 (#6980) 2020-12-03 16:33:25 -08:00
ed6cef85d8 add crio registry mirror support (#6977)
* add crio registry mirror support

* mdlint fix
2020-12-03 13:57:25 -08:00
d315f73080 Ensure libseccomp is installed before starting containerd on CentOS 8 (#6922)
* Ensure libseccomp is installed before starting containerd on CentOS 8

* Simplify libseccomp install on CentOS 8

- Uses `package` module
- Replaces the complex version check with 'state: latest'; the version must
  be > 2.3 when used with cri-o (see the sketch after this list).
- Removes unnecessary `not is_ostree` condition as CentOS 8 does not use
  ostree
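A minimal sketch of the simplified install, assuming standard Ansible facts (not the literal merged task):
```
- name: Ensure libseccomp is recent enough on CentOS 8
  package:
    name: libseccomp
    state: latest   # needs to end up > 2.3 when used with cri-o
  when: ansible_distribution_major_version == "8"
```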
2020-12-03 13:43:26 -08:00
06ec5393d7 up vagrant box to fedora/33-cloud-base in cri-o molecule tests (#6992) 2020-12-03 11:25:26 -08:00
1a491fc10c Update hashes and set default to 1.19.4 (#6903) 2020-12-03 06:34:59 -08:00
488db81e36 Add pasqualet to approvers (#6976) 2020-12-03 00:58:59 -08:00
f377d9f057 Set etcd_.*_addresses to use etcd_[events_]access_address instead of access_ip (#6936) 2020-12-02 13:55:00 -08:00
db4e942b0d Remove hyperkube from codebase (#6965) 2020-12-02 13:50:59 -08:00
68b96bdf1a Helm v3 only (#6846)
* Fix etcd download dest

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>

* Only support Helm v3, cleanup install

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2020-12-02 00:20:50 -08:00
4f7a760a94 Add crun support (#6864)
Signed-off-by: Victor Morales <v.morales@samsung.com>
2020-12-01 11:00:50 -08:00
da5077fa5f Vagrantfile: Fix incorrect references to 'rhel' variable as 'redhat' (#6967) 2020-12-01 01:22:50 -08:00
f1231bb97d Add molecule for Kata Containers with Containerd (#6905) 2020-11-30 23:34:49 -08:00
80eb1ad936 fix ansible password authentication (#6907)
* copying the ssh key is no longer required; this works with password auth
* use the copy module instead of synchronize (which requires sshpass)
* fewer tasks, and no tasks that always report 'changed'
2020-11-30 15:12:50 -08:00
cc5303e1c8 Add test for Fedora CoreOS before creating Docker service file (#6940) 2020-11-30 09:20:49 -08:00
f6a5948f58 Upgrade Jetstack Cert-Manager v1.0.4 (#6937) 2020-11-30 06:52:50 -08:00
f6eed8091e Remove contiv related files (#6964) 2020-11-30 06:48:50 -08:00
4a8a52bad9 containerd docker hub registry mirror support (#6962)
* containerd docker hub registry mirror support

* add docs

* fix typo

* fix yamllint

* fix indent in sample
and ansible-playbook param in testcases_run

* fix md

* mv common vars to tests/common/_docker_hub_registry_mirror.yml

* checkout vars to upgrade tests
2020-11-30 00:22:49 -08:00
c09aabab0c Remove executable bit from yaml and j2 files (#6894) 2020-11-29 20:18:48 -08:00
d47ba2b2ef Disable CRI-O restart by Multus (#6930) 2020-11-28 08:52:47 -08:00
17fb1ceed8 Allow airgapped CRI-O installation (#6927) 2020-11-28 08:38:47 -08:00
97ff67e54a Fix yaml syntax error when use multilines in dns_etchosts (#6960) 2020-11-28 08:32:47 -08:00
d4204a42fd Fix crictl paths and some of docker paths (#6961)
If the crictl (and docker) binaries are deployed to directories that are
not in the standard PATH (e.g. /usr/local/bin), the full path to the
binaries must be specified.
2020-11-28 08:30:47 -08:00
c6f6940459 Fix warning of "Enable ip forwarding" (#6953)
The task outputs the following warning:

  TASK [kubernetes/preinstall : Enable ip forwarding]
  [WARNING]: The value 1 (type int) in a string field was converted
  to u'1' (type string). If this does not look like what you expect,
  quote the entire value to ensure it does not change.
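For illustration, the quoted form with the sysctl module; the exact task in the role may differ:
```
- name: Enable ip forwarding
  sysctl:
    name: net.ipv4.ip_forward
    value: "1"          # quoted so the type stays a string and no warning is emitted
    sysctl_set: true
    state: present
```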
2020-11-27 03:54:49 -08:00
d739a6bb2f add Google proxy-mirror-cache for docker hub to CI tests (#6957) 2020-11-27 03:24:48 -08:00
0982c66051 fix: added boto3 as dependency required by kubespray-aws-inventory.py (#6890)
Added "boto3" as dependency in "requirements.txt" which is required by "kubespray-aws-inventory.py".

Signed-off-by: Pratik raj <rajpratik71@gmail.com>
2020-11-26 15:06:19 -08:00
d40701463f Update kube-ovn to 1.5.2 (#6610) 2020-11-26 09:34:19 -08:00
405692d793 Switch some image from dockerhub to k8s.gcr (also increase pkg retries) (#6955) 2020-11-26 08:46:19 -08:00
7938748d77 Allow configuring container log limits for Kubelet (#6933) 2020-11-26 00:32:19 -08:00
e909f84966 Bump nodelocaldns to 1.16.0 (#6916)
This new version uses the same base image as kube-proxy
(k8s.gcr.io/build-image/debian-iptables)
This allows automatically picking iptables-legacy or iptables-nft,
and being compatible with RHEL/CentOS 8
https://github.com/kubernetes/dns/pull/367

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2020-11-25 17:10:19 -08:00
8a153ed38e Add serviceExternalIPs option for calico installation (#6928) 2020-11-25 05:34:39 -08:00
eb16986f32 Add RHEL support subscription registration (#6572) 2020-11-24 08:33:00 -08:00
bd801de236 bump calico version to 3.16.5 (#6944) 2020-11-24 02:49:01 -08:00
9c3bcd48ee Some typos in docs (#6923)
Signed-off-by: zouyu <zouy.fnst@cn.fujitsu.com>
2020-11-23 12:49:00 -08:00
ee23b947aa fix flake8 errors in Kubespray CI - tox-inventory-builder (#6910)
* fix flake8 errors in Kubespray CI - tox-inventory-builder

* Invalidate CRI-O kubic repo's cache

Signed-off-by: Victor Morales <v.morales@samsung.com>

* add support to configure pkg install retries

and use in CI job tf-ovh_ubuntu18-calico (due to it failing often)

* Switch Calico, Cilium and MetalLB image repos to Quay.io

Co-authored-by: Victor Morales <v.morales@samsung.com>
Co-authored-by: Barry Melbourne <9964974+bmelbourne@users.noreply.github.com>
2020-11-22 23:47:35 -08:00
0f7341bdde Update kubevirt Centos7 from 1809 to 2003 (#6823) 2020-11-14 12:25:04 -08:00
602b5aaf01 add warning about current state of heketi (#6888) 2020-11-13 00:06:23 -08:00
70bbb3e280 calico: avoid POD restart during initial deploy (#6886)
Calico pods are first started and then killed and restarted by a handler
for no reason; nothing has changed.

By using the existing variable 'calico_cni_config' (only defined when
Calico has already started) the restart can be skipped.
2020-11-13 00:02:23 -08:00
a27eebb225 Fix hash of pypy3.6-v7.3.2-linux64 archive. (#6897)
The previous hash was still that of v7.3.1, see https://www.pypy.org/download.html for the hash of the current release.
2020-11-11 09:20:27 -08:00
1b0326f773 do not apply floating IP's before router port is created (#6887) 2020-11-06 00:16:50 -08:00
93a1693040 Update BGPPeer CRD to match v3.16 of Calico (#6881) 2020-11-05 11:14:51 -08:00
df7ed24389 [Openstack] Add security groups not managed by terraform (#6865)
* add custom sec groups

* make sure groups are applied only when created

* fix spacing
2020-11-05 05:30:54 -08:00
544aa00c17 install etcdctl to host when etcd deployment type is kubeadm (#6857)
* create a wrapper script with pki options
* supports all kubespray managed container engines

Co-authored-by: Hans Feldt <hafe@users.noreply.github.com>
2020-11-04 00:20:04 -08:00
fc22453618 crio: avoid extra restart after install and upgrade (#6882)
Package upgrade restarts crio. By creating/updating config first,
an extra restart can be avoided.
2020-11-03 08:54:03 -08:00
fefcb8c9f8 Allow the eventRecordQPS setting to be set. (#6880)
* Allow the eventRecordQPS setting to be set.

The eventRecordQPS parameter controls rate limiting for event recording. When zero, unlimited events can cause denial-of-service situations. For my situation, I don't need more than a setting of "5". This change allows me to configure the setting before creating the cluster.

* Allow the eventRecordQPS setting to be set.

The default setting (see types.go) is five, so this change does not affect cluster provisioning. However, it does allow the setting to be changed.
2020-11-03 00:42:15 -08:00
9cf5dd0291 Use cgroup v1 in Fedora +31 (#6862)
Fedora 31 uses cgroups v2 by default. This change passes the kernel
parameter systemd.unified_cgroup_hierarchy=0 to fall back to cgroup v1.

Signed-off-by: Victor Morales <v.morales@samsung.com>
2020-11-02 06:32:53 -08:00
7a1f033c1d Update helm stable repo (#6867)
As described in https://helm.sh/blog/new-location-stable-incubator-charts/,
the helm stable repo has moved to https://charts.helm.sh/stable.
In addition, with helm v3.4.0+ installation from the old stable repo
fails.
So this updates the stable repo to avoid such errors.
2020-10-31 09:54:51 -07:00
4a5acad414 Fix missing spaces in section heading. (#6868)
When https://kubespray.io/#/docs/comparisons is generated, having the link in the heading creates the following HTML. When displayed there is no space between "vs" and the link. I simply moved the link into the following paragraph.

```
<h2 id="kubespray-vs-kops"><a href="#/docs/comparisons?id=kubespray-vs-kops" data-id="kubespray-vs-kops" class="anchor"><span>Kubespray vs </span></a><a href="https://github.com/kubernetes/kops" target="_blank" rel="noopener">Kops</a></h2>
```
2020-10-29 10:29:54 -07:00
227e96469c Minor update Calico and Cilium (#6871) 2020-10-29 07:14:59 -07:00
c93fa6effe Handle dns_mode set to 'none' in generate nameservers task (#6825)
When dns_mode was set to 'none', coredns_server became an empty
string and the invalid operation of adding a string to a list was executed.
2020-10-29 01:04:58 -07:00
102fb94524 Notes About Server In admin.conf (#6854)
* Add note about changing private IP in admin.conf.

When I run kubespray, a load balancer is created which should be used instead of the ip of the controller node.

* Procedure to find load balancer and update admin.conf

When I run kubespray, a load balancer is used instead of the private ip of the controller.
2020-10-28 18:30:59 -07:00
c25d624524 Register missing outputs in role "remove-node" (#6856) 2020-10-28 12:55:56 -07:00
12ab8b7af3 update version of ingress-nginx controller in docs. (#6855)
* update version of ingress-nginx controller.

Change tag from controller-v0.34.0 to controller-v0.40.2 to use newest tag.

* Update docs about aws deploy templates.

In the yaml templates, there is no mention of idle timeouts. This is why I removed the documentation about it. This might be a mistake. Please verify this. I don't know enough to verify it myself.

* Change label when checking version.

When checking for `app.kubernetes.io/name=ingress-nginx`, a completed pod was selected which is not helpful when trying to `exec`. Changing the label selects the running controller pod.

* put back the information about ELB Idle Timeouts.

When I removed the information, I had overlooked that it was mentioned in the L7 yaml file. Thanks.
2020-10-28 11:05:57 -07:00
097bec473c fixed bug in etcd retention where backups are not sorted by date (#6860)
* fixed bug in etcd retention where backups are not sorted by date

* added directory filter to find command
2020-10-28 09:09:57 -07:00
d36b5d7d55 Install cri-o with package version (#6853)
and thereby support upgrade from e.g. 1.18.x to 1.19.y

Included OSes:
- Centos7/8
- Ubuntu18/20

New variables for overriding by default installed packages:
- centos_crio_packages
- ubuntu_crio_packages
2020-10-26 08:35:02 -07:00
4b858b6466 Fixes 6621 etcd backup directory is consuming much rootfs disk space (#6836)
* added an ansible var to manage retention of etcd backups

* refactored ls/grep into find in the etcd backup removal command
2020-10-23 07:09:57 -07:00
e03e3c4582 Add Kata Containers support to CRI-O runtime (#6830)
* Enable Kata Containers for CRI-O runtime

Kata Containers is an OCI runtime where containers are run inside
lightweight VMs. This runtime has been enabled for the containerd runtime
through the kata_containers_enabled variable. This change enables Kata
Containers for the CRI-O container runtime.

Signed-off-by: Victor Morales <v.morales@samsung.com>

* Set appropriate conmon_cgroup when crio_cgroup_manager is 'cgroupfs'

* Set manage_ns_lifecycle=true when KataContainers is enabled

* Add preinstall check for katacontainers

Signed-off-by: Victor Morales <v.morales@samsung.com>

Co-authored-by: Pasquale Toscano <pasqualetoscano90@gmail.com>
2020-10-23 03:07:46 -07:00
91f1edbdd4 Update k8s-dns-node-cache to 1.15.16 (#6852) 2020-10-22 10:29:36 -07:00
c6e2a4ebd8 Set feature gates in kube-proxy ConfigMap (#6851)
Command-line flags aren't added to kube-proxy, which results in missing
feature gates in this component. Add the appropriate setting to the
ConfigMap instead.
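Assumed shape of the setting inside the kube-proxy ConfigMap; the gate name is only an example, not taken from the commit:
```
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  EndpointSliceProxying: true   # example gate
```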

Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
2020-10-22 03:39:34 -07:00
3eefb5f2ad fix scaling in kubeadm etcd mode (#6822)
'ansible.vars.hostvars.HostVarsVars object' has no attribute 'kubeadm_upload_cert'

kubeadm_upload_cert will never be found as a hostvar for the first
master since the task is executed for a worker.

Fix by executing the upload task for the first master and registering
the needed key. After that, workers can read the hostvars of the master.

The variable kubeadm_etcd_refresh_cert_key has been removed since it no
longer has any use.
2020-10-21 07:32:32 -07:00
04b19359cb allow non existing etcd group (#6797)
When using kubeadm managed etcd, configuring an etcd group can now
be skipped.
2020-10-21 07:32:20 -07:00
f2ef781efd Add tag for test-infra images and docker logout (#6848) 2020-10-21 04:08:20 -07:00
60b0fb3e88 Update hashes and set default version to 1.19.3 (#6841) 2020-10-21 00:58:20 -07:00
f323d70c0f Adding option to disable globally applying a proxy to etc/yum.conf (#6828)
* Adding option to disable globally applying a proxy to /etc/yum.conf

* Change made to proxy_yum_globaly based on reviewer feedback

* fix trailing spaces for yamllint
2020-10-20 23:22:19 -07:00
03f316e7a2 Fix proxy and module_hotfixes (#6837)
This fixes the Containerd + EL8 case that was missed in 7d1ab3374e

On CentOS 8 with a proxy, Ansible renders the `proxy` and `module_hotfixes` options inline.

For example:
```
proxy=http://127.0.0.1:3128module_hotfixes=True
```

But expected result:
```
proxy=http://127.0.0.1:3128
module_hotfixes=True
```

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
2020-10-19 23:06:07 -07:00
79b7f0d592 Use existing variable for tiller service account name (#6829)
* Use existing variable for tiller service account name

* keep crb as tiller
2020-10-19 03:04:13 -07:00
d25aebdaf5 Upgrade Flannel to 0.13.0 (#6826) 2020-10-15 10:50:22 -07:00
1454ba4a9e Disable audio device mapping for VirtualBox VMs (#6811)
I can't see any reason why audio devices would be needed, and it can cause issues with the host audio
2020-10-13 10:30:26 -07:00
4781df587c bump crio version to 1.19 (#6758)
* bump crio version to 1.19

* crio package name has changed for debian/ubuntu
* crio upgrade does not work, see #6757

* update crio info in docs
2020-10-13 02:08:26 -07:00
e49330d6ee change owner to root for bin_dir directory (#6814) 2020-10-12 18:13:22 -07:00
dbe6eb20c8 Modify imagepullpolicy (#6816) 2020-10-12 17:45:22 -07:00
8bec5beb4b fix: add tags for set facts nodelocaldns (#6813) 2020-10-12 16:47:21 -07:00
e6effb8245 Make reset work for crio (#6812)
crio refuses to delete pods when CNI is unavailable, which is the case
e.g. when using calico with the kdd datastore. See:

https://github.com/cri-o/cri-o/issues/4084

Fix by deleting the storage associated with containers. Stop and disable
the crio service so that switching the container runtime can be done.
2020-10-12 15:47:22 -07:00
5e32655830 Added option to force apiserver and respective client certificate to … (#6403)
* Added option to force apiserver and respective client certificate to be regenerated without necessarily needing to bump the K8S cluster version

* Removed extra blank line
2020-10-12 06:02:48 -07:00
270f91e577 cleanup kubelet_deployment_type (#6815)
No longer used/supported
2020-10-12 00:04:47 -07:00
07858e8f71 allow pre-existing floating IPs to be specified with k8s_master_fips (#6755)
k8s_master_no_etcd_fips should not be an input var
2020-10-11 23:54:47 -07:00
4cb5a4f609 Fix line-spacing in no_proxy.yml (#6810)
Signed-off-by: holmesb <5072156+holmesb@users.noreply.github.com>
2020-10-11 08:50:47 -07:00
cb57c3c916 Fix handler naming issue for Kubeadm | kubelet (#6803)
Handlers with the same name (Kubeadm | restart kubelet) lead to incorrect playbook execution. As a result, after completing the tasks, kubelet does not restart. This PR fixes this behavior.
2020-10-11 08:26:47 -07:00
92b1166dd0 Disable dashboard by default (#6804)
Users should opt in for features and not opt out.
2020-10-11 08:06:47 -07:00
e6c28982dd Chmod kubeconfig to avoid group-readable (#6800)
After upgrading to a newer Kubernetes (v1.17 at least), the kubectl command
shows the following warning message:

  WARNING: Kubernetes configuration file is group-readable.
  This is insecure. Location: /home/foo/.kube/config

The kubeconfig was copied from {{ artifacts_dir }}/admin.conf with the
kubeconfig_localhost feature. It is better to set a valid file mode
when Kubespray fetches it.
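A sketch of fetching the kubeconfig and tightening its mode, using the artifacts_dir variable mentioned above (task names are illustrative):
```
- name: Copy admin.conf to the Ansible host
  fetch:
    src: /etc/kubernetes/admin.conf
    dest: "{{ artifacts_dir }}/admin.conf"
    flat: true

- name: Make the local kubeconfig owner-only
  file:
    path: "{{ artifacts_dir }}/admin.conf"
    mode: "0600"
  delegate_to: localhost
```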
2020-10-09 01:39:08 -07:00
64f69718fb Update bunch of dependencies (#6801) 2020-10-09 01:35:06 -07:00
1301e69c7d If no_proxy_exclude_workers is true, workers will be excluded from the no_proxy variable. This prevents the docker engine from restarting when scaling workers. (#6520)
Signed-off-by: holmesb <5072156+holmesb@users.noreply.github.com>
2020-10-09 01:15:07 -07:00
99b8f0902e crio: ensure service is started and enabled (#6753) 2020-10-07 00:10:42 -07:00
6a4d322a7c Do not install etcd and etcdctl on master with scale.yml playbook. (#6798)
Remove the task that installs etcdctl from the etcd role when etcd_kubeadm_enabled=true
2020-10-06 07:04:20 -07:00
9d7f358d4b Fix csi-snapshotter timeout option. Fix ebs-external-attacher-role ClusterRole. (#6776) 2020-10-06 06:44:21 -07:00
b1bb5a4796 Fix cinder & external_openstack cacert deployment (#6745)
The CA cert was only deployed on master nodes
2020-10-06 05:34:21 -07:00
f8ae086334 Added Comment line above checksum section to add clarification about Kubespray's version support and testing (#6785) 2020-10-06 05:30:21 -07:00
c49bda7319 Update nginx ingress controller to 0.40.1 (#6786) 2020-10-06 05:10:21 -07:00
aa9022d638 Use v2.14.1 as base image for CI (#6773) 2020-10-06 04:44:20 -07:00
2994ba33ac Add oomichi to reviewers (#6796) 2020-10-06 00:12:19 -07:00
87c0f135dc Update triage/support label references to kind/support (#6792)
The label triage/support has been reclassified as kind/support. The
kind/* family of labels makes more logical sense, as they describe the
"kind" of thing an issue or PR is.

For more information, see the announcement email:
https://groups.google.com/g/kubernetes-dev/c/YcaJpsjjLKw/m/i15cLLx5CAAJ
2020-10-05 14:38:20 -07:00
a687013fbe Update kube-router to 1.1.0 (#6793) 2020-10-05 13:46:20 -07:00
b0097fd0c1 harden reset to work in more cases (#6781)
The reset playbook fails and does not continue cleanup after, for
example, a host reboot with kubelet stopped/disabled.
2020-10-05 12:55:21 -07:00
9729b6b75a Add extra arguments variables for openstack and vsphere cloud controller manager daemonsets (#6783) 2020-10-02 10:14:48 -07:00
58959ae82f Update cilium with minor fix for CVE (#6784) 2020-10-02 10:02:48 -07:00
4ffc106c58 Add plugins/mitogen to .gitignore (#6774)
If the `mitogen.yml` playbook is run, it installs Mitogen in this path, causing Git to believe there are 500+ changes. This simply excludes that external module from Git.
2020-10-01 16:03:21 -07:00
a374301570 Remove arch from flannel image tag (#6765)
The 0d0cc8cf9c change creates several
DaemonSets to cover the Flannel CNI installation for different CPU
architectures. This change removes the unnecessary architecture value
from the docker tag value.

Signed-off-by: Victor Morales <v.morales@samsung.com>
2020-09-30 14:16:54 -07:00
bc8e16fc69 nginx ingress: fix yaml for multiple nodeselectors (#6768)
In case multiple nodeselectors are specified in ingress_nginx_nodeselector, the generated daemonset yaml template for nginx is invalid due to missing indentation from the second nodeselector onward.
2020-09-30 07:23:26 -07:00
947162452d Forgotten debian10 test during nightly tests (#6769) 2020-09-30 07:19:26 -07:00
7a730d42dd Add bin_dir to PATH environment. (#6764) 2020-09-29 06:35:27 -07:00
109391031b Add error msg for check of local ip (#6761)
When stopping at the check "Stop if ip var does not match local ips",
the error message looks like:

  fatal: [single-k8s]: FAILED! => {
      "assertion": "ip in ansible_all_ipv4_addresses",
      "changed": false,
      "evaluated_to": false,
      "msg": "Assertion failed"
  }

That doesn't contain the actual IP addresses, so it is difficult to understand
what went wrong. This adds an error message containing the actual IP addresses
to help investigate the issue when it happens.
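A sketch of the improved assertion; the variable names are the ones shown in the failure output above:
```
- name: Stop if ip var does not match local ips
  assert:
    that: ip in ansible_all_ipv4_addresses
    msg: "IP '{{ ip }}' is not one of this node's addresses: {{ ansible_all_ipv4_addresses }}"
```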
2020-09-29 06:29:27 -07:00
aba63f0f9a Added support for dynamic tags in AWS and Azure. (#6752)
* Added support for dynamic tags in AWS and Azure.

* Added examples of dynamic tags configuration.
2020-09-26 10:50:48 -07:00
e67886bf9d add leader election timeouts and durations to available parameters (#6691) 2020-09-25 08:21:11 -07:00
c2ac3b51c1 Update containerd to 1.3.7 - add fedora32/centos8 containerd packages (#6749) 2020-09-25 08:15:11 -07:00
081a9e7bd8 /opt/cni/bin/install not before calico 3.16 (#6738) 2020-09-25 06:15:11 -07:00
55d8ed093a Add centos8 docker repo (#6747) 2020-09-25 06:11:11 -07:00
77149e5d89 Fixes #6740: Allow disabling reverse DNS lookups in coredns (#6741)
* created variable to enable/disable reverse dns lookups in coredns

* fixed linting-error in dns-stack.md
2020-09-25 02:33:11 -07:00
28839f6b71 remove duplicate audit-policy-file argument in kubeadm configuration (#6734) 2020-09-24 09:26:06 -07:00
49bcf91aaf Allow period ci jobs to fail (#6737) 2020-09-24 09:22:06 -07:00
28073c76ac Calico upgrade path validation and old version cleanup (#6733)
* calico: add constant calico_min_version_required

and verify current deployed version against it.

* calico: remove upgrade support with data migration

The tool was used pre v3.0.0 and is no longer needed.

* calico: remove old version support from tasks

* calico: remove old ver support from policy ctrl

* calico: remove old ver support from node

* canal: remove old ver support

* remove unused calicoctl download checksums

calico_min_version_required is the oldest version that can be installed
Older versions can be removed.
2020-09-24 09:04:06 -07:00
50e8a52c74 Handle calico-rr nodes as workers so they get upgraded too (#6447)
* Handle calico-rr nodes as workers so they get upgraded too

* calico-rr nodes run 'calico and external cloud provider' too
2020-09-24 04:38:05 -07:00
5c448b6896 Add retries to update calico-rr data in etcd through calicoctl (#6505)
* Add retries to update calico-rr data in etcd through calicoctl

* Update update-node yaml syntax

* Add comment to clarify ansible block loop

* Remove trailing space
2020-09-24 03:24:05 -07:00
c0fd5b2e84 remove variable 'etcd_ionice', because ionice removed from container image etcd:v3.4.x (#6735) 2020-09-23 12:34:05 -07:00
6141b98bf8 calico: default to using kdd datastore (#6693)
If already deployed, get current datastore from CNI config file
2020-09-23 08:38:09 -07:00
2eae207435 Update docker packages to 19.03.13 + add docker f32 (#6712) 2020-09-23 08:32:19 -07:00
9a8e4381be Fix snapshot.storage apiVersion (#6711) 2020-09-23 08:32:10 -07:00
5f034330c5 properly generate extravolumes in kubeadmconfig for centos (#6708) 2020-09-23 01:20:09 -07:00
edea63511d Fix reserved memory unit in kubelet configuration (#6725)
* Fix reserved memory unit in kubelet configuration

Signed-off-by: Wang Zhen <lazybetrayer@gmail.com>

* Move systemReserved default values from template
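Illustrative KubeletConfiguration fragment showing the expected resource-quantity units (values are examples only):
```
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
systemReserved:
  cpu: 500m
  memory: 256Mi    # a quantity with a unit, not a bare number
```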

Signed-off-by: Wang Zhen <lazybetrayer@gmail.com>
2020-09-22 15:20:09 -07:00
80df4f8b01 Fix unintended SIGPIPE (#6721) 2020-09-22 11:14:42 -07:00
68118c2653 Expose offline install overrides in inventory (#6728)
* Expose offline install overrides in inventory

* Remove not recommended warning
2020-09-22 07:14:48 -07:00
1e79dcfcaa Added ability to set calico vxlan vni and port. defaults to calico's … (#6678)
* Added ability to set the calico vxlan vni and port. Defaults to Calico's documented defaults.

* Check if calico_network_backend is defined prior to checking value

* Removed calico hidden defaults for vxlan port and vni

* Fixed FELIX_VXLANVNI typo
2020-09-22 01:04:48 -07:00
1805e95b69 Change health check from TCP to HTTPS (#6487)
I kept seeing `TLS handshake error from 10.250.250.158:63770: EOF` from two IP addresses that correlate to my ELB. Changing the health check from TCP to HTTPS stopped the errors from being generated.
2020-09-22 00:56:47 -07:00
0d0cc8cf9c Add multi architeture support to flannel (#6166)
Signed-off-by: Victor Morales <v.morales@samsung.com>
2020-09-22 00:44:47 -07:00
5bd937ece0 Remove pypi repo and pip extra flags (#6729) 2020-09-21 13:27:51 -07:00
8908a70c19 Fails if kubeadm_version do not matches kubernetes version (#6302) 2020-09-21 07:20:32 -07:00
5ec2467268 Add external_openstack_lbaas_provider setting for occm (#6566)
* Add external_openstack_lbaas_provider setting for occm

* Integrate with existing lbaas_provider block

* Refactor lbaas_provider config template block

* Remove external_openstack_lbaas_use_octavia from sample inventory
2020-09-21 07:04:32 -07:00
e489e70031 add new variable allowing additionnal audit webhook server options (#6726) 2020-09-21 06:44:32 -07:00
05c9169c70 Fix example value for etcd_quota_backend_bytes (#6724) 2020-09-21 05:42:31 -07:00
bd49c993de Added support for setting tiller_service_account and tiller_replicas (#6696)
* Added support for setting tiller_service_account and tiller_replicas

* Specify helm 2 version to ensure we have a test path that still hits helm 2 code

* Moved tiller_service_account to defaults.yml. Fixed is tiller_replicas defined check.
2020-09-20 23:52:30 -07:00
5989680967 Make sure node_ip is set if node is in etcd group (#6719) 2020-09-18 17:14:27 -07:00
e1265b2e7b Fix order of OS CI cleanup (#6714) 2020-09-18 16:20:28 -07:00
1721460dcd Remove vagrant.deb from docker image (#6717) 2020-09-18 14:48:27 -07:00
861bf967a7 Move floruyt to approver (#6713) 2020-09-18 11:24:46 -07:00
09b8314057 Add support for periodic CI (#6715) 2020-09-18 08:08:46 -07:00
151b142d30 Ignore pause from kubeadm config images list (#6689) 2020-09-18 07:32:46 -07:00
b7c4136702 Ignore error in check mode when disabling swap (#6703) 2020-09-18 07:26:46 -07:00
e666fe5a8d flannel image arch specific tag (#6685) 2020-09-18 02:12:54 -07:00
9ce34be217 Added missing permissions for operator. (#6683)
Related commit: 976337b750
2020-09-18 02:12:45 -07:00
79226d0870 Add Kubernetes hashes 1.19.2/1.18.9/1.17.12 and set default (#6698) 2020-09-17 11:12:45 -07:00
686316b390 Cleanup virsh volumes in Vagrant CI (#6688) 2020-09-17 08:04:45 -07:00
6da385de9d Use "kubeadm join" to join masters to control plane (#6661)
Remove configuration variable kubeadm_control_plane
2020-09-17 04:34:45 -07:00
0cc5e3ef03 Remove workaround with kube_proxy_remove (#6512)
* kube-proxy never gets deployed, so we need to remove it
2020-09-17 04:30:45 -07:00
47194c1fe4 fix incorrect documentation of use_access_ip (#6674)
It was documented as if it were an Ansible variable, but it is a Terraform variable.
This also means the colon syntax was incorrect. TF variables are assigned with an equals sign.

Co-authored-by: rptaylor <rptaylor@uvic.ca>
2020-09-17 02:48:45 -07:00
3bf40d5db9 make metallb image repos configurable (#6671) (#6672)
* Make metallb image repos configurable

* Moved metallb image repo definitions to download role defaults

* Removed comment. These are set in download defaults
2020-09-17 02:45:13 -07:00
a9e11623cd fix remove node (#6666) 2020-09-17 02:45:05 -07:00
a870dd368e Allow configuration of nodelabels in local_volume_provisioner (#6620) 2020-09-17 02:44:58 -07:00
b6b26c710f Add support for Calico CNI host-local IPAM plugin (#6580) 2020-09-17 02:44:46 -07:00
705ad84ce7 Update third party librairies and tools (#6669) 2020-09-17 02:36:46 -07:00
04932f496f Updated KataContainers version to 1.11.3 (#6694) 2020-09-17 02:32:45 -07:00
dffbd58671 Move from widehat.opensuse to download.opensuse for crio centos (#6682) 2020-09-15 06:28:07 -07:00
152e0162a9 Update api version, deprecated in 1.19 (#6656) 2020-09-11 15:12:09 -07:00
2fa7faa75a Update etcd to 3.4.13 (#6658) 2020-09-11 12:32:09 -07:00
12f514f752 Update dockerfile for v1.19.1 (#6668) 2020-09-11 05:48:14 -07:00
e2886f37a2 yamllint: ignore .git dir (#6667) 2020-09-11 02:06:14 -07:00
03dff09b8a fix kubelet_flexvolumes_plugins_dir undefined (#6645) 2020-09-11 00:34:14 -07:00
a556f8f2bf Remove deprecated (and removed in 1.19) flag and function --basic-auth-file (#6655) 2020-09-11 00:30:14 -07:00
1765c9125a Update CoreDNS to 1.7.0 (#6657) 2020-09-10 15:48:14 -07:00
ab28192d50 Update various dependencies following 1.19 release (#6660) 2020-09-10 11:07:45 -07:00
ad15721677 Add Kubernetes 1.19.1 hashes and set default (#6654) 2020-09-10 10:43:46 -07:00
a2d4dbeee4 crio: use system default for storage driver by default (#6637)
After a host reboot, kubelet and crio go into a loop and no container is started.

storage_driver in crio.conf overrides the system default in /etc/containers/storage.conf

/etc/containers/storage.conf is installed by the containers-common package, a dependency
installed with cri-o (centos7), and contains "overlay".

Hosts already configured with overlay2 should be reconfigured and the
/var/lib/containers content removed.
2020-09-10 05:29:45 -07:00
1712ba1198 Add iptables_backend to weave options (#6639) 2020-09-10 03:49:52 -07:00
040dda37ed Add comment clarifying network allocation and sizes (#6607)
* Add comment from roles/kubespray-defaults/defaults/main.yaml clarifying network allocation and sizes

Signed-off-by: Mikael Johansson <mik.json@gmail.com>

* Rewrite of the comment and added new examples
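An example allocation, assuming the stock variable names from kubespray-defaults:
```
kube_service_addresses: 10.233.0.0/18   # Service ClusterIPs
kube_pods_subnet: 10.233.64.0/18        # pod IPs for the whole cluster
kube_network_node_prefix: 24            # /24 per node -> at most 2^(24-18) = 64 nodes,
                                        # each with ~254 usable pod addresses
```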

Signed-off-by: Mikael Johansson <mik.json@gmail.com>
2020-09-10 03:49:44 -07:00
a99ba3bb16 Allowing resource management of the metrics-server container. This will allow fine-tuning of resource allocation and solving throttling issues. Defaults are set as per the current request & limit allocation: cpu: 43m, memory: 55Mi for both limits & requests. (#6652)
Signed-off-by: Brendan Holmes <holmesb@users.noreply.github.com>

Co-authored-by: Brendan Holmes <holmesb@users.noreply.github.com>
2020-09-10 03:46:02 -07:00
05ff4a527d Fix a bunch of failed quality rules (#6646) 2020-09-10 03:45:54 -07:00
ae5328c500 Update calico to 3.16.1 (#6644) 2020-09-10 03:45:46 -07:00
34ff39e654 NetworkManager lists must be separated by , (#6643) 2020-09-10 03:41:44 -07:00
8e3915f5bf Set ansible_python_interpreter to python3 on debian (fix error with mitogen) (#6633) 2020-09-08 15:37:52 -07:00
6019a1006c Use v2.14.0 as base image for CI (#6636) 2020-09-08 11:31:03 -07:00
739 changed files with 21479 additions and 14827 deletions

View File

@ -1,7 +1,7 @@
---
name: Support Request
about: Support request or question relating to Kubespray
labels: triage/support
labels: kind/support
---

.gitignore vendored
View File

@ -15,6 +15,7 @@ contrib/terraform/aws/credentials.tfvars
**/*.sw[pon]
*~
vagrant/
plugins/mitogen
# Ansible inventory
inventory/*

View File

@ -8,13 +8,14 @@ stages:
- deploy-special
variables:
KUBESPRAY_VERSION: v2.13.3
KUBESPRAY_VERSION: v2.15.1
FAILFASTCI_NAMESPACE: 'kargo-ci'
GITLAB_REPOSITORY: 'kargo-ci/kubernetes-sigs-kubespray'
ANSIBLE_FORCE_COLOR: "true"
MAGIC: "ci check this"
TEST_ID: "$CI_PIPELINE_ID-$CI_BUILD_ID"
CI_TEST_VARS: "./tests/files/${CI_JOB_NAME}.yml"
CI_TEST_REGISTRY_MIRROR: "./tests/common/_docker_hub_registry_mirror.yml"
GS_ACCESS_KEY_ID: $GS_KEY
GS_SECRET_ACCESS_KEY: $GS_SECRET
CONTAINER_ENGINE: docker
@ -29,7 +30,9 @@ variables:
MITOGEN_ENABLE: "false"
ANSIBLE_LOG_LEVEL: "-vv"
RECOVER_CONTROL_PLANE_TEST: "false"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[2:],kube-master[1:]"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[2:],kube_control_plane[1:]"
TERRAFORM_14_VERSION: 0.14.10
TERRAFORM_13_VERSION: 0.13.6
before_script:
- ./tests/scripts/rebase.sh

View File

@ -14,7 +14,7 @@ vagrant-validate:
stage: unit-tests
tags: [light]
variables:
VAGRANT_VERSION: 2.2.4
VAGRANT_VERSION: 2.2.15
script:
- ./tests/scripts/vagrant-validate.sh
except: ['triggers', 'master']
@ -64,9 +64,9 @@ markdownlint:
tags: [light]
image: node
before_script:
- npm install -g markdownlint-cli
- npm install -g markdownlint-cli@0.22.0
script:
- markdownlint README.md docs --ignore docs/_sidebar.md
- markdownlint $(find . -name '*.md' | grep -vF './.git') --ignore docs/_sidebar.md --ignore contrib/dind/README.md
ci-matrix:
stage: unit-tests

View File

@ -1,43 +1,54 @@
---
.packet: &packet
.packet:
extends: .testcases
variables:
CI_PLATFORM: "packet"
SSH_USER: "kubespray"
CI_PLATFORM: packet
SSH_USER: kubespray
tags:
- packet
except: [triggers]
# CI template for PRs
.packet_pr:
only: [/^pr-.*$/]
except: ['triggers']
extends: .packet
# CI template for periodic CI jobs
# Enabled when PERIODIC_CI_ENABLED var is set
.packet_periodic:
only:
variables:
- $PERIODIC_CI_ENABLED
allow_failure: true
extends: .packet
packet_ubuntu18-calico-aio:
stage: deploy-part1
extends: .packet
extends: .packet_pr
when: on_success
# Future AIO job
packet_ubuntu20-calico-aio:
stage: deploy-part1
extends: .packet
extends: .packet_pr
when: on_success
# ### PR JOBS PART2
packet_centos7-flannel-containerd-addons-ha:
extends: .packet
extends: .packet_pr
stage: deploy-part2
when: on_success
variables:
MITOGEN_ENABLE: "true"
packet_centos7-crio:
extends: .packet
packet_centos8-crio:
extends: .packet_pr
stage: deploy-part2
when: on_success
variables:
MITOGEN_ENABLE: "true"
packet_ubuntu18-crio:
extends: .packet
extends: .packet_pr
stage: deploy-part2
when: manual
variables:
@ -45,44 +56,44 @@ packet_ubuntu18-crio:
packet_ubuntu16-canal-kubeadm-ha:
stage: deploy-part2
extends: .packet
extends: .packet_periodic
when: on_success
packet_ubuntu16-canal-sep:
stage: deploy-special
extends: .packet
extends: .packet_pr
when: manual
packet_ubuntu16-flannel-ha:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_ubuntu16-kube-router-sep:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_ubuntu16-kube-router-svc-proxy:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_debian10-cilium-svc-proxy:
stage: deploy-part2
extends: .packet
when: manual
extends: .packet_periodic
when: on_success
packet_debian10-containerd:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: on_success
variables:
MITOGEN_ENABLE: "true"
packet_centos7-calico-ha-once-localhost:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: on_success
variables:
# This will instruct Docker not to start over TLS.
@ -92,97 +103,91 @@ packet_centos7-calico-ha-once-localhost:
packet_centos8-kube-ovn:
stage: deploy-part2
extends: .packet
extends: .packet_periodic
when: on_success
packet_centos8-calico:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: on_success
packet_fedora32-weave:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: on_success
packet_opensuse-canal:
stage: deploy-part2
extends: .packet
extends: .packet_periodic
when: on_success
packet_ubuntu18-ovn4nfv:
stage: deploy-part2
extends: .packet
extends: .packet_periodic
when: on_success
# Contiv does not work in k8s v1.16
# packet_ubuntu16-contiv-sep:
# stage: deploy-part2
# extends: .packet
# when: on_success
# ### MANUAL JOBS
packet_ubuntu16-weave-sep:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_ubuntu18-cilium-sep:
stage: deploy-special
extends: .packet
extends: .packet_pr
when: manual
packet_ubuntu18-flannel-containerd-ha:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_ubuntu18-flannel-containerd-ha-once:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_debian9-macvlan:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_centos7-calico-ha:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_centos7-kube-router:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_centos7-multus-calico:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_oracle7-canal-ha:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_fedora31-flannel:
packet_fedora33-calico:
stage: deploy-part2
extends: .packet
extends: .packet_periodic
when: on_success
variables:
MITOGEN_ENABLE: "true"
packet_amazon-linux-2-aio:
stage: deploy-part2
extends: .packet
extends: .packet_pr
when: manual
packet_fedora32-kube-ovn-containerd:
stage: deploy-part2
extends: .packet
extends: .packet_periodic
when: on_success
# ### PR JOBS PART3
@ -190,7 +195,7 @@ packet_fedora32-kube-ovn-containerd:
packet_centos7-weave-upgrade-ha:
stage: deploy-part3
extends: .packet
extends: .packet_periodic
when: on_success
variables:
UPGRADE_TEST: basic
@ -198,7 +203,7 @@ packet_centos7-weave-upgrade-ha:
packet_debian9-calico-upgrade:
stage: deploy-part3
extends: .packet
extends: .packet_pr
when: on_success
variables:
UPGRADE_TEST: graceful
@ -206,7 +211,7 @@ packet_debian9-calico-upgrade:
packet_debian9-calico-upgrade-once:
stage: deploy-part3
extends: .packet
extends: .packet_periodic
when: on_success
variables:
UPGRADE_TEST: graceful
@ -214,16 +219,16 @@ packet_debian9-calico-upgrade-once:
packet_ubuntu18-calico-ha-recover:
stage: deploy-part3
extends: .packet
extends: .packet_periodic
when: on_success
variables:
RECOVER_CONTROL_PLANE_TEST: "true"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[2:],kube-master[1:]"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[2:],kube_control_plane[1:]"
packet_ubuntu18-calico-ha-recover-noquorum:
stage: deploy-part3
extends: .packet
extends: .packet_periodic
when: on_success
variables:
RECOVER_CONTROL_PLANE_TEST: "true"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[1:],kube-master[1:]"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[1:],kube_control_plane[1:]"

View File

@ -4,7 +4,7 @@ shellcheck:
stage: unit-tests
tags: [light]
variables:
SHELLCHECK_VERSION: v0.6.0
SHELLCHECK_VERSION: v0.7.1
before_script:
- ./tests/scripts/rebase.sh
- curl --silent --location "https://github.com/koalaman/shellcheck/releases/download/"${SHELLCHECK_VERSION}"/shellcheck-"${SHELLCHECK_VERSION}".linux.x86_64.tar.xz" | tar -xJv
@ -12,5 +12,5 @@ shellcheck:
- shellcheck --version
script:
# Run shellcheck for all *.sh except contrib/
- find . -name '*.sh' -not -path './contrib/*' | xargs shellcheck --severity error
- find . -name '*.sh' -not -path './contrib/*' -not -path './.git/*' | xargs shellcheck --severity error
except: ['triggers', 'master']


@ -53,31 +53,92 @@
# Cleanup regardless of exit code
- chronic ./tests/scripts/testcases_cleanup.sh
tf-validate-openstack:
tf-0.13.x-validate-openstack:
extends: .terraform_validate
variables:
TF_VERSION: 0.12.24
TF_VERSION: $TERRAFORM_13_VERSION
PROVIDER: openstack
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-packet:
tf-0.13.x-validate-packet:
extends: .terraform_validate
variables:
TF_VERSION: 0.12.24
TF_VERSION: $TERRAFORM_13_VERSION
PROVIDER: packet
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-aws:
tf-0.13.x-validate-aws:
extends: .terraform_validate
variables:
TF_VERSION: 0.12.24
TF_VERSION: $TERRAFORM_13_VERSION
PROVIDER: aws
CLUSTER: $CI_COMMIT_REF_NAME
tf-0.13.x-validate-exoscale:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_13_VERSION
PROVIDER: exoscale
tf-0.13.x-validate-vsphere:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_13_VERSION
PROVIDER: vsphere
CLUSTER: $CI_COMMIT_REF_NAME
tf-0.13.x-validate-upcloud:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_13_VERSION
PROVIDER: upcloud
CLUSTER: $CI_COMMIT_REF_NAME
tf-0.14.x-validate-openstack:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_14_VERSION
PROVIDER: openstack
CLUSTER: $CI_COMMIT_REF_NAME
tf-0.14.x-validate-packet:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_14_VERSION
PROVIDER: packet
CLUSTER: $CI_COMMIT_REF_NAME
tf-0.14.x-validate-aws:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_14_VERSION
PROVIDER: aws
CLUSTER: $CI_COMMIT_REF_NAME
tf-0.14.x-validate-exoscale:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_14_VERSION
PROVIDER: exoscale
tf-0.14.x-validate-vsphere:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_14_VERSION
PROVIDER: vsphere
CLUSTER: $CI_COMMIT_REF_NAME
tf-0.14.x-validate-upcloud:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_14_VERSION
PROVIDER: upcloud
CLUSTER: $CI_COMMIT_REF_NAME
# tf-packet-ubuntu16-default:
# extends: .terraform_apply
# variables:
# TF_VERSION: 0.12.24
# TF_VERSION: $TERRAFORM_14_VERSION
# PROVIDER: packet
# CLUSTER: $CI_COMMIT_REF_NAME
# TF_VAR_number_of_k8s_masters: "1"
@ -91,7 +152,7 @@ tf-validate-aws:
# tf-packet-ubuntu18-default:
# extends: .terraform_apply
# variables:
# TF_VERSION: 0.12.24
# TF_VERSION: $TERRAFORM_14_VERSION
# PROVIDER: packet
# CLUSTER: $CI_COMMIT_REF_NAME
# TF_VAR_number_of_k8s_masters: "1"
@ -146,9 +207,10 @@ tf-elastx_ubuntu18-calico:
extends: .terraform_apply
stage: deploy-part3
when: on_success
allow_failure: true
variables:
<<: *elastx_variables
TF_VERSION: 0.12.24
TF_VERSION: $TERRAFORM_14_VERSION
PROVIDER: openstack
CLUSTER: $CI_COMMIT_REF_NAME
ANSIBLE_TIMEOUT: "60"
@ -174,44 +236,45 @@ tf-elastx_ubuntu18-calico:
TF_VAR_image: ubuntu-18.04-server-latest
TF_VAR_k8s_allowed_remote_ips: '["0.0.0.0/0"]'
# OVH voucher expired, commenting job until things are sorted out
tf-ovh_cleanup:
stage: unit-tests
tags: [light]
image: python
environment: ovh
variables:
<<: *ovh_variables
before_script:
- pip install -r scripts/openstack-cleanup/requirements.txt
script:
- ./scripts/openstack-cleanup/main.py
# tf-ovh_cleanup:
# stage: unit-tests
# tags: [light]
# image: python
# environment: ovh
# variables:
# <<: *ovh_variables
# before_script:
# - pip install -r scripts/openstack-cleanup/requirements.txt
# script:
# - ./scripts/openstack-cleanup/main.py
tf-ovh_ubuntu18-calico:
extends: .terraform_apply
when: on_success
environment: ovh
variables:
<<: *ovh_variables
TF_VERSION: 0.12.24
PROVIDER: openstack
CLUSTER: $CI_COMMIT_REF_NAME
ANSIBLE_TIMEOUT: "60"
SSH_USER: ubuntu
TF_VAR_number_of_k8s_masters: "0"
TF_VAR_number_of_k8s_masters_no_floating_ip: "1"
TF_VAR_number_of_k8s_masters_no_floating_ip_no_etcd: "0"
TF_VAR_number_of_etcd: "0"
TF_VAR_number_of_k8s_nodes: "0"
TF_VAR_number_of_k8s_nodes_no_floating_ip: "1"
TF_VAR_number_of_gfs_nodes_no_floating_ip: "0"
TF_VAR_number_of_bastions: "0"
TF_VAR_number_of_k8s_masters_no_etcd: "0"
TF_VAR_use_neutron: "0"
TF_VAR_floatingip_pool: "Ext-Net"
TF_VAR_external_net: "6011fbc9-4cbf-46a4-8452-6890a340b60b"
TF_VAR_network_name: "Ext-Net"
TF_VAR_flavor_k8s_master: "defa64c3-bd46-43b4-858a-d93bbae0a229" # s1-8
TF_VAR_flavor_k8s_node: "defa64c3-bd46-43b4-858a-d93bbae0a229" # s1-8
TF_VAR_image: "Ubuntu 18.04"
TF_VAR_k8s_allowed_remote_ips: '["0.0.0.0/0"]'
# tf-ovh_ubuntu18-calico:
# extends: .terraform_apply
# when: on_success
# environment: ovh
# variables:
# <<: *ovh_variables
# TF_VERSION: $TERRAFORM_14_VERSION
# PROVIDER: openstack
# CLUSTER: $CI_COMMIT_REF_NAME
# ANSIBLE_TIMEOUT: "60"
# SSH_USER: ubuntu
# TF_VAR_number_of_k8s_masters: "0"
# TF_VAR_number_of_k8s_masters_no_floating_ip: "1"
# TF_VAR_number_of_k8s_masters_no_floating_ip_no_etcd: "0"
# TF_VAR_number_of_etcd: "0"
# TF_VAR_number_of_k8s_nodes: "0"
# TF_VAR_number_of_k8s_nodes_no_floating_ip: "1"
# TF_VAR_number_of_gfs_nodes_no_floating_ip: "0"
# TF_VAR_number_of_bastions: "0"
# TF_VAR_number_of_k8s_masters_no_etcd: "0"
# TF_VAR_use_neutron: "0"
# TF_VAR_floatingip_pool: "Ext-Net"
# TF_VAR_external_net: "6011fbc9-4cbf-46a4-8452-6890a340b60b"
# TF_VAR_network_name: "Ext-Net"
# TF_VAR_flavor_k8s_master: "defa64c3-bd46-43b4-858a-d93bbae0a229" # s1-8
# TF_VAR_flavor_k8s_node: "defa64c3-bd46-43b4-858a-d93bbae0a229" # s1-8
# TF_VAR_image: "Ubuntu 18.04"
# TF_VAR_k8s_allowed_remote_ips: '["0.0.0.0/0"]'


@ -38,6 +38,11 @@ molecule_tests:
after_script:
- chronic ./tests/scripts/testcases_cleanup.sh
vagrant_ubuntu18-calico-dual-stack:
stage: deploy-part2
extends: .vagrant
when: on_success
vagrant_ubuntu18-flannel:
stage: deploy-part2
extends: .vagrant


@ -1,6 +1,9 @@
---
extends: default
ignore: |
.git/
rules:
braces:
min-spaces-inside: 0


@ -10,7 +10,7 @@ To install development dependencies you can use `pip install -r tests/requiremen
#### Linting
Kubespray uses `yamllint` and `ansible-lint`. To run them locally use `yamllint .` and `./tests/scripts/ansible-lint.sh`
Kubespray uses `yamllint` and `ansible-lint`. To run them locally use `yamllint .` and `ansible-lint`
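A minimal sketch of a local lint run, assuming the development dependencies from `tests/requirements.txt` are already installed:

```shell
# run both linters from the repository root
yamllint .
ansible-lint
```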
#### Molecule


@ -1,21 +1,30 @@
FROM ubuntu:18.04
# Use immutable image tags rather than mutable tags (like ubuntu:18.04)
FROM ubuntu:bionic-20200807
RUN mkdir /kubespray
WORKDIR /kubespray
RUN apt update -y && \
apt install -y \
RUN apt update -y \
&& apt install -y \
libssl-dev python3-dev sshpass apt-transport-https jq moreutils \
ca-certificates curl gnupg2 software-properties-common python3-pip rsync
RUN curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add - && \
add-apt-repository \
ca-certificates curl gnupg2 software-properties-common python3-pip rsync \
&& rm -rf /var/lib/apt/lists/*
RUN curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add - \
&& add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable" \
&& apt update -y && apt-get install docker-ce -y
&& apt update -y && apt-get install --no-install-recommends -y docker-ce \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /kubespray
COPY . .
RUN /usr/bin/python3 -m pip install pip -U && /usr/bin/python3 -m pip install -r tests/requirements.txt && python3 -m pip install -r requirements.txt && update-alternatives --install /usr/bin/python python /usr/bin/python3 1
RUN curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.17.5/bin/linux/amd64/kubectl \
&& chmod a+x kubectl && cp kubectl /usr/local/bin/kubectl
RUN /usr/bin/python3 -m pip install pip -U \
&& /usr/bin/python3 -m pip install -r tests/requirements.txt \
&& python3 -m pip install -r requirements.txt \
&& update-alternatives --install /usr/bin/python python /usr/bin/python3 1
RUN KUBE_VERSION=$(sed -n 's/^kube_version: //p' roles/kubespray-defaults/defaults/main.yaml) \
&& curl -LO https://storage.googleapis.com/kubernetes-release/release/$KUBE_VERSION/bin/linux/amd64/kubectl \
&& chmod a+x kubectl \
&& mv kubectl /usr/local/bin/kubectl
# Some tools like yamllint need this
ENV LANG=C.UTF-8

OWNERS

@ -4,3 +4,5 @@ approvers:
- kubespray-approvers
reviewers:
- kubespray-reviewers
emeritus_approvers:
- kubespray-emeritus_approvers


@ -1,19 +1,18 @@
aliases:
kubespray-approvers:
- ant31
- mattymo
- atoms
- chadswen
- mirwan
- miouge1
- riverzhang
- verwilst
- woopstar
- luckysb
- floryut
kubespray-reviewers:
- jjungnickel
- archifleks
- holmsten
- bozzo
- floryut
- eppo
- oomichi
kubespray-emeritus_approvers:
- riverzhang
- atoms
- ant31


@ -32,7 +32,7 @@ CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inv
# Review and change parameters under ``inventory/mycluster/group_vars``
cat inventory/mycluster/group_vars/all/all.yml
cat inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
cat inventory/mycluster/group_vars/k8s_cluster/k8s_cluster.yml
# Deploy Kubespray with Ansible Playbook - run the playbook as root
# The option `--become` is required, as for example writing SSL keys in /etc/,
@ -48,11 +48,23 @@ As a consequence, `ansible-playbook` command will fail with:
ERROR! no action detected in task. This often indicates a misspelled module name, or incorrect module path.
```
probably pointing on a task depending on a module present in requirements.txt (i.e. "unseal vault").
probably pointing to a task that depends on a module present in requirements.txt.
One way of solving this would be to uninstall the Ansible package and then install it via pip, but that is not always possible.
A workaround consists of setting the `ANSIBLE_LIBRARY` and `ANSIBLE_MODULE_UTILS` environment variables to the `ansible/modules` and `ansible/module_utils` subdirectories of the pip package installation location (shown in the Location field of `pip show [package]`) before executing `ansible-playbook`.
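For illustration, a minimal sketch of that workaround, assuming the pip package is named `ansible` and using a hypothetical inventory at `inventory/mycluster/hosts.yaml`:

```ShellSession
# Locate the pip installation of Ansible and point the module paths at it
ANSIBLE_LOCATION=$(pip show ansible | awk '/^Location:/ {print $2}')
export ANSIBLE_LIBRARY="${ANSIBLE_LOCATION}/ansible/modules"
export ANSIBLE_MODULE_UTILS="${ANSIBLE_LOCATION}/ansible/module_utils"
ansible-playbook -i inventory/mycluster/hosts.yaml --become cluster.yml
```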
A simple way to ensure you get the correct version of Ansible is to use the [pre-built docker image from Quay](https://quay.io/repository/kubespray/kubespray?tab=tags).
You will then need to use [bind mounts](https://docs.docker.com/storage/bind-mounts/) to get the inventory and ssh key into the container, like this:
```ShellSession
docker pull quay.io/kubespray/kubespray:v2.15.1
docker run --rm -it --mount type=bind,source="$(pwd)"/inventory/sample,dst=/inventory \
--mount type=bind,source="${HOME}"/.ssh/id_rsa,dst=/root/.ssh/id_rsa \
quay.io/kubespray/kubespray:v2.15.1 bash
# Inside the container you may now run the kubespray playbooks:
ansible-playbook -i /inventory/inventory.ini --private-key /root/.ssh/id_rsa cluster.yml
```
### Vagrant
For Vagrant we need to install python dependencies for provisioning tasks.
@ -105,51 +117,55 @@ vagrant up
- **Flatcar Container Linux by Kinvolk**
- **Debian** Buster, Jessie, Stretch, Wheezy
- **Ubuntu** 16.04, 18.04, 20.04
- **CentOS/RHEL** 7, 8 (experimental: see [centos 8 notes](docs/centos8.md))
- **Fedora** 31, 32
- **CentOS/RHEL** 7, [8](docs/centos8.md)
- **Fedora** 32, 33
- **Fedora CoreOS** (experimental: see [fcos Note](docs/fcos.md))
- **openSUSE** Leap 42.3/Tumbleweed
- **Oracle Linux** 7, 8 (experimental: [centos 8 notes](docs/centos8.md) apply)
- **openSUSE** Leap 15.x/Tumbleweed
- **Oracle Linux** 7, [8](docs/centos8.md)
- **Alma Linux** [8](docs/centos8.md)
- **Amazon Linux 2** (experimental: see [amazon linux notes](docs/amazonlinux.md))
Note: Upstart/SysV init based OS types are not supported.
## Supported Components
- Core
- [kubernetes](https://github.com/kubernetes/kubernetes) v1.18.8
- [etcd](https://github.com/coreos/etcd) v3.4.3
- [kubernetes](https://github.com/kubernetes/kubernetes) v1.20.7
- [etcd](https://github.com/coreos/etcd) v3.4.13
- [docker](https://www.docker.com/) v19.03 (see note)
- [containerd](https://containerd.io/) v1.2.13
- [cri-o](http://cri-o.io/) v1.17 (experimental: see [CRI-O Note](docs/cri-o.md). Only on fedora, ubuntu and centos based OS)
- [containerd](https://containerd.io/) v1.4.4
- [cri-o](http://cri-o.io/) v1.20 (experimental: see [CRI-O Note](docs/cri-o.md). Only on fedora, ubuntu and centos based OS)
- Network Plugin
- [cni-plugins](https://github.com/containernetworking/plugins) v0.8.6
- [calico](https://github.com/projectcalico/calico) v3.15.2
- [cni-plugins](https://github.com/containernetworking/plugins) v0.9.1
- [calico](https://github.com/projectcalico/calico) v3.17.4
- [canal](https://github.com/projectcalico/canal) (given calico/flannel versions)
- [cilium](https://github.com/cilium/cilium) v1.8.3
- [contiv](https://github.com/contiv/install) v1.2.1
- [flanneld](https://github.com/coreos/flannel) v0.12.0
- [kube-ovn](https://github.com/alauda/kube-ovn) v1.3.0
- [kube-router](https://github.com/cloudnativelabs/kube-router) v1.0.1
- [multus](https://github.com/intel/multus-cni) v3.6.0
- [cilium](https://github.com/cilium/cilium) v1.8.9
- [flanneld](https://github.com/coreos/flannel) v0.13.0
- [kube-ovn](https://github.com/alauda/kube-ovn) v1.6.2
- [kube-router](https://github.com/cloudnativelabs/kube-router) v1.2.2
- [multus](https://github.com/intel/multus-cni) v3.7.0
- [ovn4nfv](https://github.com/opnfv/ovn4nfv-k8s-plugin) v1.1.0
- [weave](https://github.com/weaveworks/weave) v2.7.0
- [weave](https://github.com/weaveworks/weave) v2.8.1
- Application
- [ambassador](https://github.com/datawire/ambassador): v1.5
- [cephfs-provisioner](https://github.com/kubernetes-incubator/external-storage) v2.1.0-k8s1.11
- [rbd-provisioner](https://github.com/kubernetes-incubator/external-storage) v2.1.1-k8s1.11
- [cert-manager](https://github.com/jetstack/cert-manager) v0.16.1
- [coredns](https://github.com/coredns/coredns) v1.6.7
- [ingress-nginx](https://github.com/kubernetes/ingress-nginx) v0.35.0
- [coredns](https://github.com/coredns/coredns) v1.7.0
- [ingress-nginx](https://github.com/kubernetes/ingress-nginx) v0.43.0
Note: The list of validated [docker versions](https://kubernetes.io/docs/setup/production-environment/container-runtimes/#docker) is 1.13.1, 17.03, 17.06, 17.09, 18.06, 18.09 and 19.03. The recommended docker version is 19.03. The kubelet might break on docker's non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don't break your cluster look into e.g. yum versionlock plugin or apt pin).
## Container Runtime Notes
- The list of available docker versions is 18.09, 19.03 and 20.10. The recommended docker version is 19.03. The kubelet might break on docker's non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don't break your cluster, look into e.g. the yum versionlock plugin or apt pinning (see the sketch after this list).
- The cri-o version should be aligned with the respective kubernetes version (i.e. kube_version=1.20.x, crio_version=1.20)
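As a hedged illustration only (package names and plugin availability vary per distro), pinning the container runtime packages could look like:

```ShellSession
# RHEL/CentOS: lock the installed docker-ce version (assumes the yum versionlock plugin)
yum install -y yum-plugin-versionlock
yum versionlock add docker-ce docker-ce-cli

# Debian/Ubuntu: hold the packages so automatic upgrades skip them
apt-mark hold docker-ce docker-ce-cli containerd.io
```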
## Requirements
- **Minimum required version of Kubernetes is v1.17**
- **Ansible v2.9+, Jinja 2.11+ and python-netaddr is installed on the machine that will run Ansible commands**
- **Minimum required version of Kubernetes is v1.19**
- **Ansible v2.9.x, Jinja 2.11+ and python-netaddr is installed on the machine that will run Ansible commands, Ansible 2.10.x is experimentally supported for now**
- The target servers must have **access to the Internet** in order to pull docker images. Otherwise, additional configuration is required (See [Offline Environment](docs/offline-environment.md))
- The target servers are configured to allow **IPv4 forwarding**.
- **Your ssh key must be copied** to all the servers part of your inventory.
- If using IPv6 for pods and services, the target servers are configured to allow **IPv6 forwarding**.
- The **firewalls are not managed**: you'll need to implement your own rules the way you used to.
In order to avoid any issues during deployment, you should disable your firewall.
- If kubespray is run from a non-root user account, correct privilege escalation method
@ -179,9 +195,6 @@ You can choose between 10 network plugins. (default: `calico`, except Vagrant us
- [cilium](http://docs.cilium.io/en/latest/): layer 3/4 networking (as well as layer 7 to protect and secure application protocols), supports dynamic insertion of BPF bytecode into the Linux kernel to implement security services, networking and visibility logic.
- [contiv](docs/contiv.md): supports vlan, vxlan, bgp and Cisco SDN networking. This plugin is able to
apply firewall policies, segregate containers in multiple networks and bridge pods onto physical networks.
- [ovn4nfv](docs/ovn4nfv.md): [ovn4nfv-k8s-plugins](https://github.com/opnfv/ovn4nfv-k8s-plugin) is the network controller, OVS agent and CNI server to offer basic SFC and OVN overlay networking.
- [weave](docs/weave.md): Weave is a lightweight container overlay network that doesn't require an external K/V database cluster.
@ -208,6 +221,8 @@ See also [Network checker](docs/netcheck.md).
- [nginx](https://kubernetes.github.io/ingress-nginx): the NGINX Ingress Controller.
- [metallb](docs/metallb.md): the MetalLB bare-metal service LoadBalancer provider.
## Community docs and resources
- [kubernetes.io/docs/setup/production-environment/tools/kubespray/](https://kubernetes.io/docs/setup/production-environment/tools/kubespray/)

Vagrantfile

@ -26,12 +26,14 @@ SUPPORTED_OS = {
"centos-bento" => {box: "bento/centos-7.6", user: "vagrant"},
"centos8" => {box: "centos/8", user: "vagrant"},
"centos8-bento" => {box: "bento/centos-8", user: "vagrant"},
"fedora31" => {box: "fedora/31-cloud-base", user: "vagrant"},
"fedora32" => {box: "fedora/32-cloud-base", user: "vagrant"},
"opensuse" => {box: "bento/opensuse-leap-15.1", user: "vagrant"},
"fedora33" => {box: "fedora/33-cloud-base", user: "vagrant"},
"opensuse" => {box: "bento/opensuse-leap-15.2", user: "vagrant"},
"opensuse-tumbleweed" => {box: "opensuse/Tumbleweed.x86_64", user: "vagrant"},
"oraclelinux" => {box: "generic/oracle7", user: "vagrant"},
"oraclelinux8" => {box: "generic/oracle8", user: "vagrant"},
"rhel7" => {box: "generic/rhel7", user: "vagrant"},
"rhel8" => {box: "generic/rhel8", user: "vagrant"},
}
if File.exist?(CONFIG)
@ -47,6 +49,7 @@ $vm_cpus ||= 2
$shared_folders ||= {}
$forwarded_ports ||= {}
$subnet ||= "172.18.8"
$subnet_ipv6 ||= "fd3c:b398:0698:0756"
$os ||= "ubuntu1804"
$network_plugin ||= "flannel"
# Setting multi_networking to true will install Multus: https://github.com/intel/multus-cni
@ -83,16 +86,16 @@ $inventory = File.absolute_path($inventory, File.dirname(__FILE__))
if ! File.exist?(File.join(File.dirname($inventory), "hosts.ini"))
$vagrant_ansible = File.join(File.dirname(__FILE__), ".vagrant", "provisioners", "ansible")
FileUtils.mkdir_p($vagrant_ansible) if ! File.exist?($vagrant_ansible)
if ! File.exist?(File.join($vagrant_ansible,"inventory"))
FileUtils.ln_s($inventory, File.join($vagrant_ansible,"inventory"))
end
$vagrant_inventory = File.join($vagrant_ansible,"inventory")
FileUtils.rm_f($vagrant_inventory)
FileUtils.ln_s($inventory, $vagrant_inventory)
end
if Vagrant.has_plugin?("vagrant-proxyconf")
$no_proxy = ENV['NO_PROXY'] || ENV['no_proxy'] || "127.0.0.1,localhost"
(1..$num_instances).each do |i|
$no_proxy += ",#{$subnet}.#{i+100}"
end
$no_proxy = ENV['NO_PROXY'] || ENV['no_proxy'] || "127.0.0.1,localhost"
(1..$num_instances).each do |i|
$no_proxy += ",#{$subnet}.#{i+100}"
end
end
Vagrant.configure("2") do |config|
@ -142,6 +145,7 @@ Vagrant.configure("2") do |config|
vb.gui = $vm_gui
vb.linked_clone = true
vb.customize ["modifyvm", :id, "--vram", "8"] # ubuntu defaults to 256 MB which is a waste of precious RAM
vb.customize ["modifyvm", :id, "--audio", "none"]
end
node.vm.provider :libvirt do |lv|
@ -176,19 +180,39 @@ Vagrant.configure("2") do |config|
node.vm.network "forwarded_port", guest: guest, host: host, auto_correct: true
end
node.vm.synced_folder ".", "/vagrant", disabled: false, type: "rsync", rsync__args: ['--verbose', '--archive', '--delete', '-z'] , rsync__exclude: ['.git','venv']
$shared_folders.each do |src, dst|
node.vm.synced_folder src, dst, type: "rsync", rsync__args: ['--verbose', '--archive', '--delete', '-z']
if ["rhel7","rhel8"].include? $os
# Vagrant synced_folder rsync options cannot be used for RHEL boxes as Rsync package cannot
# be installed until the host is registered with a valid Red Hat support subscription
node.vm.synced_folder ".", "/vagrant", disabled: false
$shared_folders.each do |src, dst|
node.vm.synced_folder src, dst
end
else
node.vm.synced_folder ".", "/vagrant", disabled: false, type: "rsync", rsync__args: ['--verbose', '--archive', '--delete', '-z'] , rsync__exclude: ['.git','venv']
$shared_folders.each do |src, dst|
node.vm.synced_folder src, dst, type: "rsync", rsync__args: ['--verbose', '--archive', '--delete', '-z']
end
end
ip = "#{$subnet}.#{i+100}"
node.vm.network :private_network, ip: ip
node.vm.network :private_network, ip: ip,
:libvirt__guest_ipv6 => 'yes',
:libvirt__ipv6_address => "#{$subnet_ipv6}::#{i+100}",
:libvirt__ipv6_prefix => "64",
:libvirt__forward_mode => "none",
:libvirt__dhcp_enabled => false
# Disable swap for each vm
node.vm.provision "shell", inline: "swapoff -a"
# Disable firewalld on oraclelinux vms
if ["oraclelinux","oraclelinux8"].include? $os
# ubuntu1804 and ubuntu2004 have IPv6 explicitly disabled. This undoes that.
if ["ubuntu1804", "ubuntu2004"].include? $os
node.vm.provision "shell", inline: "rm -f /etc/modprobe.d/local.conf"
node.vm.provision "shell", inline: "sed -i '/net.ipv6.conf.all.disable_ipv6/d' /etc/sysctl.d/99-sysctl.conf /etc/sysctl.conf"
end
# Disable firewalld on oraclelinux/redhat vms
if ["oraclelinux","oraclelinux8","rhel7","rhel8"].include? $os
node.vm.provision "shell", inline: "systemctl stop firewalld; systemctl disable firewalld"
end
@ -229,9 +253,9 @@ Vagrant.configure("2") do |config|
#ansible.tags = ['download']
ansible.groups = {
"etcd" => ["#{$instance_name_prefix}-[1:#{$etcd_instances}]"],
"kube-master" => ["#{$instance_name_prefix}-[1:#{$kube_master_instances}]"],
"kube-node" => ["#{$instance_name_prefix}-[1:#{$kube_node_instances}]"],
"k8s-cluster:children" => ["kube-master", "kube-node"],
"kube_control_plane" => ["#{$instance_name_prefix}-[1:#{$kube_master_instances}]"],
"kube_node" => ["#{$instance_name_prefix}-[1:#{$kube_node_instances}]"],
"k8s_cluster:children" => ["kube_control_plane", "kube_node"],
}
end
end


@ -3,13 +3,30 @@
gather_facts: false
become: no
vars:
minimal_ansible_version: 2.8.0
minimal_ansible_version: 2.9.0
maximal_ansible_version: 2.11.0
ansible_connection: local
tasks:
- name: "Check ansible version >={{ minimal_ansible_version }}"
- name: "Check {{ minimal_ansible_version }} <= Ansible version < {{ maximal_ansible_version }}"
assert:
msg: "Ansible must be {{ minimal_ansible_version }} or higher"
msg: "Ansible must be between {{ minimal_ansible_version }} and {{ maximal_ansible_version }}"
that:
- ansible_version.string is version(minimal_ansible_version, ">=")
- ansible_version.string is version(maximal_ansible_version, "<")
tags:
- check
- name: "Check that python netaddr is installed"
assert:
msg: "Python netaddr is not present"
that: "'127.0.0.1' | ipaddr"
tags:
- check
# CentOS 7 provides too old jinja version
- name: "Check that jinja is not too old (install via pip)"
assert:
msg: "Your Jinja version is too old, install via pip"
that: "{% set test %}It works{% endset %}{{ test == 'It works' }}"
tags:
- check


@ -2,31 +2,21 @@
- name: Check ansible version
import_playbook: ansible_version.yml
- hosts: all
gather_facts: false
tags: always
tasks:
- name: "Set up proxy environment"
set_fact:
proxy_env:
http_proxy: "{{ http_proxy | default ('') }}"
HTTP_PROXY: "{{ http_proxy | default ('') }}"
https_proxy: "{{ https_proxy | default ('') }}"
HTTPS_PROXY: "{{ https_proxy | default ('') }}"
no_proxy: "{{ no_proxy | default ('') }}"
NO_PROXY: "{{ no_proxy | default ('') }}"
no_log: true
- name: Ensure compatibility with old groups
import_playbook: legacy_groups.yml
- hosts: bastion[0]
gather_facts: False
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- { role: bastion-ssh-config, tags: ["localhost", "bastion"] }
- hosts: k8s-cluster:etcd
- hosts: k8s_cluster:etcd
strategy: linear
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
gather_facts: false
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- { role: bootstrap-os, tags: bootstrap-os}
@ -35,19 +25,20 @@
tags: always
import_playbook: facts.yml
- hosts: k8s-cluster:etcd
- hosts: k8s_cluster:etcd
gather_facts: False
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- { role: kubernetes/preinstall, tags: preinstall }
- { role: "container-engine", tags: "container-engine", when: deploy_container_engine|default(true) }
- { role: download, tags: download, when: "not skip_downloads" }
environment: "{{ proxy_env }}"
- hosts: etcd
gather_facts: False
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- role: etcd
@ -57,9 +48,10 @@
etcd_events_cluster_setup: "{{ etcd_events_cluster_enabled }}"
when: not etcd_kubeadm_enabled| default(false)
- hosts: k8s-cluster
- hosts: k8s_cluster
gather_facts: False
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- role: etcd
@ -69,50 +61,54 @@
etcd_events_cluster_setup: false
when: not etcd_kubeadm_enabled| default(false)
- hosts: k8s-cluster
- hosts: k8s_cluster
gather_facts: False
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- { role: kubernetes/node, tags: node }
environment: "{{ proxy_env }}"
- hosts: kube-master
- hosts: kube_control_plane
gather_facts: False
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- { role: kubernetes/master, tags: master }
- { role: kubernetes/control-plane, tags: master }
- { role: kubernetes/client, tags: client }
- { role: kubernetes-apps/cluster_roles, tags: cluster-roles }
- hosts: k8s-cluster
- hosts: k8s_cluster
gather_facts: False
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- { role: kubernetes/kubeadm, tags: kubeadm}
- { role: network_plugin, tags: network }
- { role: kubernetes/node-label, tags: node-label }
- hosts: calico-rr
- hosts: calico_rr
gather_facts: False
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- { role: network_plugin/calico/rr, tags: ['network', 'calico_rr'] }
- hosts: kube-master[0]
- hosts: kube_control_plane[0]
gather_facts: False
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- { role: kubernetes-apps/rotate_tokens, tags: rotate_tokens, when: "secret_changed|default(false)" }
- { role: win_nodes/kubernetes_patch, tags: ["master", "win_nodes"] }
- hosts: kube-master
- hosts: kube_control_plane
gather_facts: False
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- { role: kubernetes-apps/external_cloud_controller, tags: external-cloud-controller }
@ -121,17 +117,18 @@
- { role: kubernetes-apps/ingress_controller, tags: ingress-controller }
- { role: kubernetes-apps/external_provisioner, tags: external-provisioner }
- hosts: kube-master
- hosts: kube_control_plane
gather_facts: False
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- { role: kubernetes-apps, tags: apps }
environment: "{{ proxy_env }}"
- hosts: k8s-cluster
- hosts: k8s_cluster
gather_facts: False
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
environment: "{{ proxy_disable_env }}"
roles:
- { role: kubespray-defaults }
- { role: kubernetes/preinstall, when: "dns_mode != 'none' and resolvconf_mode == 'host_resolvconf'", tags: resolvconf, dns_late: true }


@ -35,7 +35,7 @@ class SearchEC2Tags(object):
hosts['_meta'] = { 'hostvars': {} }
##Search ec2 three times to find nodes of each group type. Relies on kubespray-role key/value.
for group in ["kube-master", "kube-node", "etcd"]:
for group in ["kube_control_plane", "kube_node", "etcd"]:
hosts[group] = []
tag_key = "kubespray-role"
tag_value = ["*"+group+"*"]
@ -70,7 +70,7 @@ class SearchEC2Tags(object):
hosts[group].append(dns_name)
hosts['_meta']['hostvars'][dns_name] = ansible_host
hosts['k8s-cluster'] = {'children':['kube-master', 'kube-node']}
hosts['k8s_cluster'] = {'children':['kube_control_plane', 'kube_node']}
print(json.dumps(hosts, sort_keys=True, indent=2))
SearchEC2Tags()


@ -0,0 +1 @@
boto3 # Apache-2.0


@ -24,14 +24,14 @@ experience.
You can enable the use of a Bastion Host by changing **use_bastion** in group_vars/all to **true**. The generated
templates will then include an additional bastion VM which can then be used to connect to the masters and nodes. The option
also removes all public IPs from all other VMs.
also removes all public IPs from all other VMs.
## Generating and applying
To generate and apply the templates, call:
```shell
$ ./apply-rg.sh <resource_group_name>
./apply-rg.sh <resource_group_name>
```
If you change something in the configuration (e.g. number of nodes) later, you can call this again and Azure will
@ -42,25 +42,23 @@ take care about creating/modifying whatever is needed.
If you need to delete all resources from a resource group, simply call:
```shell
$ ./clear-rg.sh <resource_group_name>
./clear-rg.sh <resource_group_name>
```
**WARNING** this really deletes everything from your resource group, including everything that was later created by you!
## Generating an inventory for kubespray
After you have applied the templates, you can generate an inventory with this call:
```shell
$ ./generate-inventory.sh <resource_group_name>
./generate-inventory.sh <resource_group_name>
```
It will create the file ./inventory which can then be used with kubespray, e.g.:
```shell
$ cd kubespray-root-dir
$ sudo pip3 install -r requirements.txt
$ ansible-playbook -i contrib/azurerm/inventory -u devops --become -e "@inventory/sample/group_vars/all/all.yml" cluster.yml
cd kubespray-root-dir
sudo pip3 install -r requirements.txt
ansible-playbook -i contrib/azurerm/inventory -u devops --become -e "@inventory/sample/group_vars/all/all.yml" cluster.yml
```


@ -7,9 +7,9 @@
{% endif %}
{% endfor %}
[kube-master]
[kube_control_plane]
{% for vm in vm_list %}
{% if 'kube-master' in vm.tags.roles %}
{% if 'kube_control_plane' in vm.tags.roles %}
{{ vm.name }}
{% endif %}
{% endfor %}
@ -21,13 +21,13 @@
{% endif %}
{% endfor %}
[kube-node]
[kube_node]
{% for vm in vm_list %}
{% if 'kube-node' in vm.tags.roles %}
{% if 'kube_node' in vm.tags.roles %}
{{ vm.name }}
{% endif %}
{% endfor %}
[k8s-cluster:children]
kube-node
kube-master
[k8s_cluster:children]
kube_node
kube_control_plane


@ -7,9 +7,9 @@
{% endif %}
{% endfor %}
[kube-master]
[kube_control_plane]
{% for vm in vm_roles_list %}
{% if 'kube-master' in vm.tags.roles %}
{% if 'kube_control_plane' in vm.tags.roles %}
{{ vm.name }}
{% endif %}
{% endfor %}
@ -21,14 +21,14 @@
{% endif %}
{% endfor %}
[kube-node]
[kube_node]
{% for vm in vm_roles_list %}
{% if 'kube-node' in vm.tags.roles %}
{% if 'kube_node' in vm.tags.roles %}
{{ vm.name }}
{% endif %}
{% endfor %}
[k8s-cluster:children]
kube-node
kube-master
[k8s_cluster:children]
kube_node
kube_control_plane


@ -144,7 +144,7 @@
"[concat('Microsoft.Network/networkInterfaces/', 'master-{{i}}-nic')]"
],
"tags": {
"roles": "kube-master,etcd"
"roles": "kube_control_plane,etcd"
},
"apiVersion": "{{apiVersion}}",
"properties": {


@ -61,7 +61,7 @@
"[concat('Microsoft.Network/networkInterfaces/', 'minion-{{i}}-nic')]"
],
"tags": {
"roles": "kube-node"
"roles": "kube_node"
},
"apiVersion": "{{apiVersion}}",
"properties": {
@ -112,4 +112,4 @@
} {% if not loop.last %},{% endif %}
{% endfor %}
]
}
}


@ -6,6 +6,7 @@ to serve as Kubernetes "nodes", which in turn will run
called DIND (Docker-IN-Docker).
The playbook has two roles:
- dind-host: creates the "nodes" as containers in localhost, with
appropriate settings for DIND (privileged, volume mapping for dind
storage, etc).
@ -27,7 +28,7 @@ See below for a complete successful run:
1. Create the node containers
~~~~
```shell
# From the kubespray root dir
cd contrib/dind
pip install -r requirements.txt
@ -36,15 +37,15 @@ ansible-playbook -i hosts dind-cluster.yaml
# Back to kubespray root
cd ../..
~~~~
```
NOTE: if the playbook run fails with something like below error
message, you may need to specifically set `ansible_python_interpreter`,
see `./hosts` file for an example expanded localhost entry.
~~~
```shell
failed: [localhost] (item=kube-node1) => {"changed": false, "item": "kube-node1", "msg": "Failed to import docker or docker-py - No module named requests.exceptions. Try `pip install docker` or `pip install docker-py` (Python 2.6)"}
~~~
```
2. Customize kubespray-dind.yaml
@ -52,33 +53,33 @@ Note that there's coupling between above created node containers
and `kubespray-dind.yaml` settings, in particular regarding selected `node_distro`
(as set in `group_vars/all/all.yaml`), and docker settings.
~~~
```shell
$EDITOR contrib/dind/kubespray-dind.yaml
~~~
```
3. Prepare the inventory and run the playbook
~~~
```shell
INVENTORY_DIR=inventory/local-dind
mkdir -p ${INVENTORY_DIR}
rm -f ${INVENTORY_DIR}/hosts.ini
CONFIG_FILE=${INVENTORY_DIR}/hosts.ini /tmp/kubespray.dind.inventory_builder.sh
ansible-playbook --become -e ansible_ssh_user=debian -i ${INVENTORY_DIR}/hosts.ini cluster.yml --extra-vars @contrib/dind/kubespray-dind.yaml
~~~
```
NOTE: You could also test other distros without editing files by
passing `--extra-vars` as per below commandline,
replacing `DISTRO` by either `debian`, `ubuntu`, `centos`, `fedora`:
~~~
```shell
cd contrib/dind
ansible-playbook -i hosts dind-cluster.yaml --extra-vars node_distro=DISTRO
cd ../..
CONFIG_FILE=inventory/local-dind/hosts.ini /tmp/kubespray.dind.inventory_builder.sh
ansible-playbook --become -e ansible_ssh_user=DISTRO -i inventory/local-dind/hosts.ini cluster.yml --extra-vars @contrib/dind/kubespray-dind.yaml --extra-vars bootstrap_os=DISTRO
~~~
```
## Resulting deployment
@ -89,7 +90,7 @@ from the host where you ran kubespray playbooks.
Running from an Ubuntu Xenial host:
~~~
```shell
$ uname -a
Linux ip-xx-xx-xx-xx 4.4.0-1069-aws #79-Ubuntu SMP Mon Sep 24
15:01:41 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
@ -149,14 +150,14 @@ kube-system weave-net-xr46t 2/2 Running 0
$ docker exec kube-node1 curl -s http://localhost:31081/api/v1/connectivity_check
{"Message":"All 10 pods successfully reported back to the server","Absent":null,"Outdated":null}
~~~
```
## Using ./run-test-distros.sh
You can use `./run-test-distros.sh` to run a set of tests via DIND.
Here is an excerpt from this script, to give an idea:
~~~
```shell
# The SPEC file(s) must have two arrays as e.g.
# DISTROS=(debian centos)
# EXTRAS=(
@ -169,7 +170,7 @@ and excerpt from this script, to get an idea:
#
# Each $EXTRAS element will be whitespace split, and passed as --extra-vars
# to main kubespray ansible-playbook run.
~~~
```
See e.g. `test-some_distros-most_CNIs.env` and
`test-some_distros-kube_router_combo.env` in particular for a richer


@ -46,7 +46,7 @@ test_distro() {
pass_or_fail "$prefix: netcheck" || return 1
}
NODES=($(egrep ^kube-node hosts))
NODES=($(egrep ^kube_node hosts))
NETCHECKER_HOST=localhost
: ${OUTPUT_DIR:=./out}


@ -44,8 +44,8 @@ import re
import subprocess
import sys
ROLES = ['all', 'kube-master', 'kube-node', 'etcd', 'k8s-cluster',
'calico-rr']
ROLES = ['all', 'kube_control_plane', 'kube_node', 'etcd', 'k8s_cluster',
'calico_rr']
PROTECTED_NAMES = ROLES
AVAILABLE_COMMANDS = ['help', 'print_cfg', 'print_ips', 'print_hostnames',
'load']
@ -63,10 +63,12 @@ def get_var_as_bool(name, default):
CONFIG_FILE = os.environ.get("CONFIG_FILE", "./inventory/sample/hosts.yaml")
KUBE_MASTERS = int(os.environ.get("KUBE_MASTERS_MASTERS", 2))
# Remove the reference of KUBE_MASTERS after some deprecation cycles.
KUBE_CONTROL_HOSTS = int(os.environ.get("KUBE_CONTROL_HOSTS",
os.environ.get("KUBE_MASTERS", 2)))
# Reconfigures cluster distribution at scale
SCALE_THRESHOLD = int(os.environ.get("SCALE_THRESHOLD", 50))
MASSIVE_SCALE_THRESHOLD = int(os.environ.get("SCALE_THRESHOLD", 200))
MASSIVE_SCALE_THRESHOLD = int(os.environ.get("MASSIVE_SCALE_THRESHOLD", 200))
DEBUG = get_var_as_bool("DEBUG", True)
HOST_PREFIX = os.environ.get("HOST_PREFIX", "node")
@ -102,10 +104,11 @@ class KubesprayInventory(object):
etcd_hosts_count = 3 if len(self.hosts.keys()) >= 3 else 1
self.set_etcd(list(self.hosts.keys())[:etcd_hosts_count])
if len(self.hosts) >= SCALE_THRESHOLD:
self.set_kube_master(list(self.hosts.keys())[
etcd_hosts_count:(etcd_hosts_count + KUBE_MASTERS)])
self.set_kube_control_plane(list(self.hosts.keys())[
etcd_hosts_count:(etcd_hosts_count + KUBE_CONTROL_HOSTS)])
else:
self.set_kube_master(list(self.hosts.keys())[:KUBE_MASTERS])
self.set_kube_control_plane(
list(self.hosts.keys())[:KUBE_CONTROL_HOSTS])
self.set_kube_node(self.hosts.keys())
if len(self.hosts) >= SCALE_THRESHOLD:
self.set_calico_rr(list(self.hosts.keys())[:etcd_hosts_count])
@ -266,7 +269,7 @@ class KubesprayInventory(object):
def purge_invalid_hosts(self, hostnames, protected_names=[]):
for role in self.yaml_config['all']['children']:
if role != 'k8s-cluster' and self.yaml_config['all']['children'][role]['hosts']: # noqa
if role != 'k8s_cluster' and self.yaml_config['all']['children'][role]['hosts']: # noqa
all_hosts = self.yaml_config['all']['children'][role]['hosts'].copy() # noqa
for host in all_hosts.keys():
if host not in hostnames and host not in protected_names:
@ -287,52 +290,54 @@ class KubesprayInventory(object):
if self.yaml_config['all']['hosts'] is None:
self.yaml_config['all']['hosts'] = {host: None}
self.yaml_config['all']['hosts'][host] = opts
elif group != 'k8s-cluster:children':
elif group != 'k8s_cluster:children':
if self.yaml_config['all']['children'][group]['hosts'] is None:
self.yaml_config['all']['children'][group]['hosts'] = {
host: None}
else:
self.yaml_config['all']['children'][group]['hosts'][host] = None # noqa
def set_kube_master(self, hosts):
def set_kube_control_plane(self, hosts):
for host in hosts:
self.add_host_to_group('kube-master', host)
self.add_host_to_group('kube_control_plane', host)
def set_all(self, hosts):
for host, opts in hosts.items():
self.add_host_to_group('all', host, opts)
def set_k8s_cluster(self):
k8s_cluster = {'children': {'kube-master': None, 'kube-node': None}}
self.yaml_config['all']['children']['k8s-cluster'] = k8s_cluster
k8s_cluster = {'children': {'kube_control_plane': None,
'kube_node': None}}
self.yaml_config['all']['children']['k8s_cluster'] = k8s_cluster
def set_calico_rr(self, hosts):
for host in hosts:
if host in self.yaml_config['all']['children']['kube-master']:
self.debug("Not adding {0} to calico-rr group because it "
"conflicts with kube-master group".format(host))
if host in self.yaml_config['all']['children']['kube_control_plane']: # noqa
self.debug("Not adding {0} to calico_rr group because it "
"conflicts with kube_control_plane "
"group".format(host))
continue
if host in self.yaml_config['all']['children']['kube-node']:
self.debug("Not adding {0} to calico-rr group because it "
"conflicts with kube-node group".format(host))
if host in self.yaml_config['all']['children']['kube_node']:
self.debug("Not adding {0} to calico_rr group because it "
"conflicts with kube_node group".format(host))
continue
self.add_host_to_group('calico-rr', host)
self.add_host_to_group('calico_rr', host)
def set_kube_node(self, hosts):
for host in hosts:
if len(self.yaml_config['all']['hosts']) >= SCALE_THRESHOLD:
if host in self.yaml_config['all']['children']['etcd']['hosts']: # noqa
self.debug("Not adding {0} to kube-node group because of "
self.debug("Not adding {0} to kube_node group because of "
"scale deployment and host is in etcd "
"group.".format(host))
continue
if len(self.yaml_config['all']['hosts']) >= MASSIVE_SCALE_THRESHOLD: # noqa
if host in self.yaml_config['all']['children']['kube-master']['hosts']: # noqa
self.debug("Not adding {0} to kube-node group because of "
"scale deployment and host is in kube-master "
"group.".format(host))
if host in self.yaml_config['all']['children']['kube_control_plane']['hosts']: # noqa
self.debug("Not adding {0} to kube_node group because of "
"scale deployment and host is in "
"kube_control_plane group.".format(host))
continue
self.add_host_to_group('kube-node', host)
self.add_host_to_group('kube_node', host)
def set_etcd(self, hosts):
for host in hosts:
@ -402,8 +407,9 @@ Configurable env vars:
DEBUG Enable debug printing. Default: True
CONFIG_FILE File to write config to Default: ./inventory/sample/hosts.yaml
HOST_PREFIX Host prefix for generated hosts. Default: node
KUBE_CONTROL_HOSTS Set the number of kube-control-planes. Default: 2
SCALE_THRESHOLD Separate ETCD role if # of nodes >= 50
MASSIVE_SCALE_THRESHOLD Separate K8s master and ETCD if # of nodes >= 200
MASSIVE_SCALE_THRESHOLD Separate K8s control-plane and ETCD if # of nodes >= 200
''' # noqa
print(help_text)


@ -13,8 +13,8 @@
# under the License.
import inventory
import mock
import unittest
from unittest import mock
from collections import OrderedDict
import sys
@ -51,7 +51,7 @@ class TestInventory(unittest.TestCase):
groups = ['group1', 'group2']
self.inv.ensure_required_groups(groups)
for group in groups:
self.assertTrue(group in self.inv.yaml_config['all']['children'])
self.assertIn(group, self.inv.yaml_config['all']['children'])
def test_get_host_id(self):
hostnames = ['node99', 'no99de01', '01node01', 'node1.domain',
@ -209,8 +209,8 @@ class TestInventory(unittest.TestCase):
('doesnotbelong2', {'whateveropts=ilike'})])
self.inv.yaml_config['all']['hosts'] = existing_hosts
self.inv.purge_invalid_hosts(proper_hostnames)
self.assertTrue(
bad_host not in self.inv.yaml_config['all']['hosts'].keys())
self.assertNotIn(
bad_host, self.inv.yaml_config['all']['hosts'].keys())
def test_add_host_to_group(self):
group = 'etcd'
@ -222,13 +222,13 @@ class TestInventory(unittest.TestCase):
self.inv.yaml_config['all']['children'][group]['hosts'].get(host),
None)
def test_set_kube_master(self):
group = 'kube-master'
def test_set_kube_control_plane(self):
group = 'kube_control_plane'
host = 'node1'
self.inv.set_kube_master([host])
self.assertTrue(
host in self.inv.yaml_config['all']['children'][group]['hosts'])
self.inv.set_kube_control_plane([host])
self.assertIn(
host, self.inv.yaml_config['all']['children'][group]['hosts'])
def test_set_all(self):
hosts = OrderedDict([
@ -241,30 +241,30 @@ class TestInventory(unittest.TestCase):
self.inv.yaml_config['all']['hosts'].get(host), opt)
def test_set_k8s_cluster(self):
group = 'k8s-cluster'
expected_hosts = ['kube-node', 'kube-master']
group = 'k8s_cluster'
expected_hosts = ['kube_node', 'kube_control_plane']
self.inv.set_k8s_cluster()
for host in expected_hosts:
self.assertTrue(
host in
self.assertIn(
host,
self.inv.yaml_config['all']['children'][group]['children'])
def test_set_kube_node(self):
group = 'kube-node'
group = 'kube_node'
host = 'node1'
self.inv.set_kube_node([host])
self.assertTrue(
host in self.inv.yaml_config['all']['children'][group]['hosts'])
self.assertIn(
host, self.inv.yaml_config['all']['children'][group]['hosts'])
def test_set_etcd(self):
group = 'etcd'
host = 'node1'
self.inv.set_etcd([host])
self.assertTrue(
host in self.inv.yaml_config['all']['children'][group]['hosts'])
self.assertIn(
host, self.inv.yaml_config['all']['children'][group]['hosts'])
def test_scale_scenario_one(self):
num_nodes = 50
@ -275,12 +275,12 @@ class TestInventory(unittest.TestCase):
self.inv.set_all(hosts)
self.inv.set_etcd(list(hosts.keys())[0:3])
self.inv.set_kube_master(list(hosts.keys())[0:2])
self.inv.set_kube_control_plane(list(hosts.keys())[0:2])
self.inv.set_kube_node(hosts.keys())
for h in range(3):
self.assertFalse(
list(hosts.keys())[h] in
self.inv.yaml_config['all']['children']['kube-node']['hosts'])
self.inv.yaml_config['all']['children']['kube_node']['hosts'])
def test_scale_scenario_two(self):
num_nodes = 500
@ -291,12 +291,12 @@ class TestInventory(unittest.TestCase):
self.inv.set_all(hosts)
self.inv.set_etcd(list(hosts.keys())[0:3])
self.inv.set_kube_master(list(hosts.keys())[3:5])
self.inv.set_kube_control_plane(list(hosts.keys())[3:5])
self.inv.set_kube_node(hosts.keys())
for h in range(5):
self.assertFalse(
list(hosts.keys())[h] in
self.inv.yaml_config['all']['children']['kube-node']['hosts'])
self.inv.yaml_config['all']['children']['kube_node']['hosts'])
def test_range2ips_range(self):
changed_hosts = ['10.90.0.2', '10.90.0.4-10.90.0.6', '10.90.0.8']


@ -5,7 +5,7 @@ deployment on VMs.
This playbook does not create Virtual Machines, nor does it run Kubespray itself.
### User creation
## User creation
If you want to create a user for running Kubespray deployment, you should specify
both `k8s_deployment_user` and `k8s_deployment_user_pkey_path`.


@ -8,19 +8,19 @@ In the same directory of this ReadMe file you should find a file named `inventor
Change that file to reflect your local setup (adding more machines or removing them and setting the appropriate IP addresses), and save it to `inventory/sample/k8s_gfs_inventory`. Make sure that the settings in `inventory/sample/group_vars/all.yml` make sense for your deployment. Then change to the kubespray root folder and execute (supposing that the machines are all using ubuntu):
```
```shell
ansible-playbook -b --become-user=root -i inventory/sample/k8s_gfs_inventory --user=ubuntu ./cluster.yml
```
This will provision your Kubernetes cluster. Then, to provision and configure the GlusterFS cluster, from the same directory execute:
```
```shell
ansible-playbook -b --become-user=root -i inventory/sample/k8s_gfs_inventory --user=ubuntu ./contrib/network-storage/glusterfs/glusterfs.yml
```
If your machines are not using Ubuntu, you need to change the `--user=ubuntu` to the correct user. Alternatively, if your Kubernetes machines are using one OS and your GlusterFS a different one, you can instead specify the `ansible_ssh_user=<correct-user>` variable in the inventory file that you just created, for each machine/VM:
```
```shell
k8s-master-1 ansible_ssh_host=192.168.0.147 ip=192.168.0.147 ansible_ssh_user=core
k8s-master-node-1 ansible_ssh_host=192.168.0.148 ip=192.168.0.148 ansible_ssh_user=core
k8s-master-node-2 ansible_ssh_host=192.168.0.146 ip=192.168.0.146 ansible_ssh_user=core
@ -30,7 +30,7 @@ k8s-master-node-2 ansible_ssh_host=192.168.0.146 ip=192.168.0.146 ansible_ssh_us
First step is to fill in a `my-kubespray-gluster-cluster.tfvars` file with the specification desired for your cluster. An example with all required variables would look like:
```
```ini
cluster_name = "cluster1"
number_of_k8s_masters = "1"
number_of_k8s_masters_no_floating_ip = "2"
@ -39,7 +39,7 @@ number_of_k8s_nodes = "0"
public_key_path = "~/.ssh/my-desired-key.pub"
image = "Ubuntu 16.04"
ssh_user = "ubuntu"
flavor_k8s_node = "node-flavor-id-in-your-openstack"
flavor_k8s_node = "node-flavor-id-in-your-openstack"
flavor_k8s_master = "master-flavor-id-in-your-openstack"
network_name = "k8s-network"
floatingip_pool = "net_external"
@ -54,7 +54,7 @@ ssh_user_gfs = "ubuntu"
As explained in the general terraform/openstack guide, you need to source your OpenStack credentials file, add your ssh-key to the ssh-agent and set up environment variables for terraform:
```
```shell
$ source ~/.stackrc
$ eval $(ssh-agent -s)
$ ssh-add ~/.ssh/my-desired-key
@ -67,7 +67,7 @@ $ echo Setting up Terraform creds && \
Then, standing on the kubespray directory (root base of the Git checkout), issue the following terraform command to create the VMs for the cluster:
```
```shell
terraform apply -state=contrib/terraform/openstack/terraform.tfstate -var-file=my-kubespray-gluster-cluster.tfvars contrib/terraform/openstack
```
@ -75,18 +75,18 @@ This will create both your Kubernetes and Gluster VMs. Make sure that the ansibl
Then, provision your Kubernetes (kubespray) cluster with the following ansible call:
```
```shell
ansible-playbook -b --become-user=root -i contrib/terraform/openstack/hosts ./cluster.yml
```
Finally, provision the glusterfs nodes and add the Persistent Volume setup for GlusterFS in Kubernetes through the following ansible call:
```
```shell
ansible-playbook -b --become-user=root -i contrib/terraform/openstack/hosts ./contrib/network-storage/glusterfs/glusterfs.yml
```
If you need to destroy the cluster, you can run:
```
```shell
terraform destroy -state=contrib/terraform/openstack/terraform.tfstate -var-file=my-kubespray-gluster-cluster.tfvars contrib/terraform/openstack
```


@ -15,10 +15,10 @@
roles:
- { role: glusterfs/server }
- hosts: k8s-cluster
- hosts: k8s_cluster
roles:
- { role: glusterfs/client }
- hosts: kube-master[0]
- hosts: kube_control_plane[0]
roles:
- { role: kubernetes-pv }


@ -14,7 +14,7 @@
# gfs_node2 ansible_ssh_host=95.54.0.19 # disk_volume_device_1=/dev/vdc ip=10.3.0.8
# gfs_node3 ansible_ssh_host=95.54.0.20 # disk_volume_device_1=/dev/vdc ip=10.3.0.9
# [kube-master]
# [kube_control_plane]
# node1
# node2
@ -23,16 +23,16 @@
# node2
# node3
# [kube-node]
# [kube_node]
# node2
# node3
# node4
# node5
# node6
# [k8s-cluster:children]
# kube-node
# kube-master
# [k8s_cluster:children]
# kube_node
# kube_control_plane
# [gfs-cluster]
# gfs_node1


@ -8,7 +8,7 @@ Installs and configures GlusterFS on Linux.
For GlusterFS to connect between servers, TCP ports `24007`, `24008`, and `24009`/`49152`+ (that port, plus an additional incremented port for each additional server in the cluster; the latter if GlusterFS is version 3.4+), and TCP/UDP port `111` must be open. You can open these using whatever firewall you wish (this can easily be configured using the `geerlingguy.firewall` role).
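As a hedged sketch only (assuming a firewalld-based host; adapt the commands to whatever firewall you actually use), opening those ports could look like:

```shell
# GlusterFS management/brick ports plus portmapper (firewalld assumed)
firewall-cmd --permanent --add-port=24007-24009/tcp
firewall-cmd --permanent --add-port=49152-49251/tcp   # one brick port per additional server; widen as needed
firewall-cmd --permanent --add-port=111/tcp
firewall-cmd --permanent --add-port=111/udp
firewall-cmd --reload
```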
This role performs basic installation and setup of Gluster, but it does not configure or mount bricks (volumes), since that step is easier to do in a series of plays in your own playbook. Ansible 1.9+ includes the [`gluster_volume`](https://docs.ansible.com/gluster_volume_module.html) module to ease the management of Gluster volumes.
This role performs basic installation and setup of Gluster, but it does not configure or mount bricks (volumes), since that step is easier to do in a series of plays in your own playbook. Ansible 1.9+ includes the [`gluster_volume`](https://docs.ansible.com/ansible/latest/collections/gluster/gluster/gluster_volume_module.html) module to ease the management of Gluster volumes.
## Role Variables


@ -8,7 +8,7 @@
- { file: glusterfs-kubernetes-pv.yml.j2, type: pv, dest: glusterfs-kubernetes-pv.yml}
- { file: glusterfs-kubernetes-endpoint-svc.json.j2, type: svc, dest: glusterfs-kubernetes-endpoint-svc.json}
register: gluster_pv
when: inventory_hostname == groups['kube-master'][0] and groups['gfs-cluster'] is defined and hostvars[groups['gfs-cluster'][0]].gluster_disk_size_gb is defined
when: inventory_hostname == groups['kube_control_plane'][0] and groups['gfs-cluster'] is defined and hostvars[groups['gfs-cluster'][0]].gluster_disk_size_gb is defined
- name: Kubernetes Apps | Set GlusterFS endpoint and PV
kube:
@ -19,4 +19,4 @@
filename: "{{ kube_config_dir }}/{{ item.item.dest }}"
state: "{{ item.changed | ternary('latest','present') }}"
with_items: "{{ gluster_pv.results }}"
when: inventory_hostname == groups['kube-master'][0] and groups['gfs-cluster'] is defined
when: inventory_hostname == groups['kube_control_plane'][0] and groups['gfs-cluster'] is defined


@ -1,17 +1,26 @@
# Deploy Heketi/Glusterfs into Kubespray/Kubernetes
This playbook aims to automate [this](https://github.com/heketi/heketi/blob/master/docs/admin/install-kubernetes.md) tutorial. It deploys heketi/glusterfs into kubernetes and sets up a storageclass.
## Important notice
> Due to resource limits on the current project maintainers and general lack of contributions we are considering placing Heketi into a [near-maintenance mode](https://github.com/heketi/heketi#important-notice)
## Client Setup
Heketi provides a CLI that gives users a means to administer the deployment and configuration of GlusterFS in Kubernetes. [Download and install the heketi-cli](https://github.com/heketi/heketi/releases) on your client machine.
## Install
Copy the inventory.yml.sample over to inventory/sample/k8s_heketi_inventory.yml and change it according to your setup.
```
```shell
ansible-playbook --ask-become -i inventory/sample/k8s_heketi_inventory.yml contrib/network-storage/heketi/heketi.yml
```
## Tear down
```
```shell
ansible-playbook --ask-become -i inventory/sample/k8s_heketi_inventory.yml contrib/network-storage/heketi/heketi-tear-down.yml
```


@ -1,5 +1,5 @@
---
- hosts: kube-master[0]
- hosts: kube_control_plane[0]
roles:
- { role: tear-down }


@ -3,7 +3,7 @@
roles:
- { role: prepare }
- hosts: kube-master[0]
- hosts: kube_control_plane[0]
tags:
- "provision"
roles:


@ -3,17 +3,17 @@ all:
heketi_admin_key: "11elfeinhundertundelf"
heketi_user_key: "!!einseinseins"
children:
k8s-cluster:
k8s_cluster:
vars:
kubelet_fail_swap_on: false
children:
kube-master:
kube_control_plane:
hosts:
node1:
etcd:
hosts:
node2:
kube-node:
kube_node:
hosts: &kube_nodes
node1:
node2:

contrib/offline/README.md

@ -0,0 +1,43 @@
# Offline deployment
## manage-offline-container-images.sh
Container image collecting script for offline deployment
This script has two features:
(1) Get container images from an environment which is deployed online.
(2) Deploy a local container registry and register the container images to the registry.
Step (1) should be done on an online site as preparation, then the collected images are brought
to the target offline environment.
Step (2) is then run there to register the images in the local registry.
Step(1) can be operated with:
```shell
manage-offline-container-images.sh create
```
Step(2) can be operated with:
```shell
manage-offline-container-images.sh register
```
## generate_list.sh
This script generates the list of downloaded files and the list of container images by `roles/download/defaults/main.yml` file.
Running this script generates three files: all downloaded file URLs in files.list, all container images in images.list, and all component versions in generate.sh.
```shell
bash generate_list.sh
tree temp
temp
├── files.list
├── generate.sh
└── images.list
0 directories, 3 files
```
In some cases you may want to update a component version; you can edit the `generate.sh` file, then run `bash generate.sh | grep 'https' > files.list` to update files.list, or run `bash generate.sh | grep -v 'https' > images.list` to update images.list.
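For example, after editing a version in `temp/generate.sh`, the lists can be refreshed like this (assuming you run from the `temp` directory created by `generate_list.sh`):

```shell
# regenerate both lists after editing versions in generate.sh
bash generate.sh | grep 'https' > files.list
bash generate.sh | grep -v 'https' > images.list
```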


@ -0,0 +1 @@
{ "insecure-registries":["HOSTNAME:5000"] }


@ -0,0 +1,57 @@
#!/bin/bash
set -eo pipefail
CURRENT_DIR=$(cd $(dirname $0); pwd)
TEMP_DIR="${CURRENT_DIR}/temp"
REPO_ROOT_DIR="${CURRENT_DIR%/contrib/offline}"
: ${IMAGE_ARCH:="amd64"}
: ${ANSIBLE_SYSTEM:="linux"}
: ${ANSIBLE_ARCHITECTURE:="x86_64"}
: ${DOWNLOAD_YML:="roles/download/defaults/main.yml"}
: ${KUBE_VERSION_YAML:="roles/kubespray-defaults/defaults/main.yaml"}
mkdir -p ${TEMP_DIR}
# ARCH used in convert {%- if image_arch != 'amd64' -%}-{{ image_arch }}{%- endif -%} to {{arch}}
if [ "${IMAGE_ARCH}" != "amd64" ]; then ARCH="${IMAGE_ARCH}"; fi
cat > ${TEMP_DIR}/generate.sh << EOF
arch=${ARCH}
image_arch=${IMAGE_ARCH}
ansible_system=${ANSIBLE_SYSTEM}
ansible_architecture=${ANSIBLE_ARCHITECTURE}
EOF
# generate all component version by $DOWNLOAD_YML
grep 'kube_version:' ${REPO_ROOT_DIR}/${KUBE_VERSION_YAML} \
| sed 's/: /=/g' >> ${TEMP_DIR}/generate.sh
grep '_version:' ${REPO_ROOT_DIR}/${DOWNLOAD_YML} \
| sed 's/: /=/g;s/{{/${/g;s/}}/}/g' | tr -d ' ' >> ${TEMP_DIR}/generate.sh
sed -i 's/kube_major_version=.*/kube_major_version=${kube_version%.*}/g' ${TEMP_DIR}/generate.sh
sed -i 's/crictl_version=.*/crictl_version=${kube_version%.*}.0/g' ${TEMP_DIR}/generate.sh
# generate all download files url
grep 'download_url:' ${REPO_ROOT_DIR}/${DOWNLOAD_YML} \
| sed 's/: /=/g;s/ //g;s/{{/${/g;s/}}/}/g;s/|lower//g;s/^.*_url=/echo /g' >> ${TEMP_DIR}/generate.sh
# generate all images list
grep -E '_repo:|_tag:' ${REPO_ROOT_DIR}/${DOWNLOAD_YML} \
| sed "s#{%- if image_arch != 'amd64' -%}-{{ image_arch }}{%- endif -%}#{{arch}}#g" \
| sed 's/: /=/g;s/{{/${/g;s/}}/}/g' | tr -d ' ' >> ${TEMP_DIR}/generate.sh
sed -n '/^downloads:/,/download_defaults:/p' ${REPO_ROOT_DIR}/${DOWNLOAD_YML} \
| sed -n "s/repo: //p;s/tag: //p" | tr -d ' ' | sed 's/{{/${/g;s/}}/}/g' \
| sed 'N;s#\n# #g' | tr ' ' ':' | sed 's/^/echo /g' >> ${TEMP_DIR}/generate.sh
# special handling for https://github.com/kubernetes-sigs/kubespray/pull/7570
sed -i 's#^coredns_image_repo=.*#coredns_image_repo=${kube_image_repo}$(if printf "%s\\n%s\\n" v1.21 ${kube_version%.*} | sort --check=quiet --version-sort; then echo -n /coredns/coredns;else echo -n /coredns; fi)#' ${TEMP_DIR}/generate.sh
sed -i 's#^coredns_image_tag=.*#coredns_image_tag=$(if printf "%s\\n%s\\n" v1.21 ${kube_version%.*} | sort --check=quiet --version-sort; then echo -n ${coredns_version};else echo -n ${coredns_version/v/}; fi)#' ${TEMP_DIR}/generate.sh
# add kube-* images to images list
KUBE_IMAGES="kube-apiserver kube-controller-manager kube-scheduler kube-proxy"
echo "${KUBE_IMAGES}" | tr ' ' '\n' | xargs -L1 -I {} \
echo 'echo ${kube_image_repo}/{}:${kube_version}' >> ${TEMP_DIR}/generate.sh
# print files.list and images.list
bash ${TEMP_DIR}/generate.sh | grep 'https' | sort > ${TEMP_DIR}/files.list
bash ${TEMP_DIR}/generate.sh | grep -v 'https' | sort > ${TEMP_DIR}/images.list

View File

@ -0,0 +1,170 @@
#!/bin/bash
OPTION=$1
CURRENT_DIR=$(cd $(dirname $0); pwd)
TEMP_DIR="${CURRENT_DIR}/temp"
IMAGE_TAR_FILE="${CURRENT_DIR}/container-images.tar.gz"
IMAGE_DIR="${CURRENT_DIR}/container-images"
IMAGE_LIST="${IMAGE_DIR}/container-images.txt"
RETRY_COUNT=5
function create_container_image_tar() {
set -e
IMAGES=$(kubectl describe pods --all-namespaces | grep " Image:" | awk '{print $2}' | sort | uniq)
# NOTE: etcd and pause cannot be seen as pods.
# The pause image is used for --pod-infra-container-image option of kubelet.
EXT_IMAGES=$(kubectl cluster-info dump | egrep "quay.io/coreos/etcd:|k8s.gcr.io/pause:" | sed s@\"@@g)
IMAGES="${IMAGES} ${EXT_IMAGES}"
rm -f ${IMAGE_TAR_FILE}
rm -rf ${IMAGE_DIR}
mkdir ${IMAGE_DIR}
cd ${IMAGE_DIR}
sudo docker pull registry:latest
sudo docker save -o registry-latest.tar registry:latest
for image in ${IMAGES}
do
FILE_NAME="$(echo ${image} | sed s@"/"@"-"@g | sed s/":"/"-"/g)".tar
set +e
for step in $(seq 1 ${RETRY_COUNT})
do
sudo docker pull ${image}
if [ $? -eq 0 ]; then
break
fi
echo "Failed to pull ${image} at step ${step}"
if [ ${step} -eq ${RETRY_COUNT} ]; then
exit 1
fi
done
set -e
sudo docker save -o ${FILE_NAME} ${image}
# NOTE: Here removes the following repo parts from each image
# so that these parts will be replaced with Kubespray.
# - kube_image_repo: "k8s.gcr.io"
# - gcr_image_repo: "gcr.io"
# - docker_image_repo: "docker.io"
# - quay_image_repo: "quay.io"
FIRST_PART=$(echo ${image} | awk -F"/" '{print $1}')
if [ "${FIRST_PART}" = "k8s.gcr.io" ] ||
[ "${FIRST_PART}" = "gcr.io" ] ||
[ "${FIRST_PART}" = "docker.io" ] ||
[ "${FIRST_PART}" = "quay.io" ]; then
image=$(echo ${image} | sed s@"${FIRST_PART}/"@@)
fi
echo "${FILE_NAME} ${image}" >> ${IMAGE_LIST}
done
cd ..
sudo chown ${USER} ${IMAGE_DIR}/*
tar -zcvf ${IMAGE_TAR_FILE} ./container-images
rm -rf ${IMAGE_DIR}
echo ""
echo "${IMAGE_TAR_FILE} is created to contain your container images."
echo "Please keep this file and bring it to your offline environment."
}
function register_container_images() {
if [ ! -f ${IMAGE_TAR_FILE} ]; then
echo "${IMAGE_TAR_FILE} should exist."
exit 1
fi
if [ ! -d ${TEMP_DIR} ]; then
mkdir ${TEMP_DIR}
fi
# To avoid "http: server gave http response to https client" error.
LOCALHOST_NAME=$(hostname)
if [ -d /etc/docker/ ]; then
set -e
# Ubuntu18.04, RHEL7/CentOS7
cp ${CURRENT_DIR}/docker-daemon.json ${TEMP_DIR}/docker-daemon.json
sed -i s@"HOSTNAME"@"${LOCALHOST_NAME}"@ ${TEMP_DIR}/docker-daemon.json
sudo cp ${TEMP_DIR}/docker-daemon.json /etc/docker/daemon.json
elif [ -d /etc/containers/ ]; then
set -e
# RHEL8/CentOS8
cp ${CURRENT_DIR}/registries.conf ${TEMP_DIR}/registries.conf
sed -i s@"HOSTNAME"@"${LOCALHOST_NAME}"@ ${TEMP_DIR}/registries.conf
sudo cp ${TEMP_DIR}/registries.conf /etc/containers/registries.conf
else
echo "docker package(docker-ce, etc.) should be installed"
exit 1
fi
tar -zxvf ${IMAGE_TAR_FILE}
sudo docker load -i ${IMAGE_DIR}/registry-latest.tar
set +e
sudo docker container inspect registry >/dev/null 2>&1
if [ $? -ne 0 ]; then
sudo docker run --restart=always -d -p 5000:5000 --name registry registry:latest
fi
set -e
while read -r line; do
file_name=$(echo ${line} | awk '{print $1}')
raw_image=$(echo ${line} | awk '{print $2}')
new_image="${LOCALHOST_NAME}:5000/${raw_image}"
org_image=$(sudo docker load -i ${IMAGE_DIR}/${file_name} | head -n1 | awk '{print $3}')
image_id=$(sudo docker image inspect ${org_image} | grep "\"Id\":" | awk -F: '{print $3}'| sed s/'\",'//)
if [ -z "${file_name}" ]; then
echo "Failed to get file_name for line ${line}"
exit 1
fi
if [ -z "${raw_image}" ]; then
echo "Failed to get raw_image for line ${line}"
exit 1
fi
if [ -z "${org_image}" ]; then
echo "Failed to get org_image for line ${line}"
exit 1
fi
if [ -z "${image_id}" ]; then
echo "Failed to get image_id for file ${file_name}"
exit 1
fi
sudo docker load -i ${IMAGE_DIR}/${file_name}
sudo docker tag ${image_id} ${new_image}
sudo docker push ${new_image}
done <<< "$(cat ${IMAGE_LIST})"
echo "Succeeded to register container images to local registry."
echo "Please specify ${LOCALHOST_NAME}:5000 for the following options in your inventry:"
echo "- kube_image_repo"
echo "- gcr_image_repo"
echo "- docker_image_repo"
echo "- quay_image_repo"
}
if [ "${OPTION}" == "create" ]; then
create_container_image_tar
elif [ "${OPTION}" == "register" ]; then
register_container_images
else
echo "This script has two features:"
echo "(1) Get container images from an environment which is deployed online."
echo "(2) Deploy local container registry and register the container images to the registry."
echo ""
echo "Step(1) should be done online site as a preparation, then we bring"
echo "the gotten images to the target offline environment."
echo "Then we will run step(2) for registering the images to local registry."
echo ""
echo "${IMAGE_TAR_FILE} is created to contain your container images."
echo "Please keep this file and bring it to your offline environment."
echo ""
echo "Step(1) can be operated with:"
echo " $ ./manage-offline-container-images.sh create"
echo ""
echo "Step(2) can be operated with:"
echo " $ ./manage-offline-container-images.sh register"
echo ""
echo "Please specify 'create' or 'register'."
echo ""
exit 1
fi

View File

@ -0,0 +1,8 @@
[registries.search]
registries = ['registry.access.redhat.com', 'registry.redhat.io', 'docker.io']
[registries.insecure]
registries = ['HOSTNAME:5000']
[registries.block]
registries = []

View File

@ -0,0 +1,4 @@
---
- hosts: all
roles:
- { role: prepare }

View File

@ -0,0 +1,2 @@
---
disable_service_firewall: false

View File

@ -0,0 +1,23 @@
---
- block:
- name: List services
service_facts:
- name: Disable service firewalld
systemd:
name: firewalld
state: stopped
enabled: no
when:
"'firewalld.service' in services"
- name: Disable service ufw
systemd:
name: ufw
state: stopped
enabled: no
when:
"'ufw.service' in services"
when:
- disable_service_firewall

View File

@ -51,7 +51,7 @@ export SKIP_PIP_INSTALL=1
%doc %{_docdir}/%{name}/inventory/sample/hosts.ini
%config %{_sysconfdir}/%{name}/ansible.cfg
%config %{_sysconfdir}/%{name}/inventory/sample/group_vars/all.yml
%config %{_sysconfdir}/%{name}/inventory/sample/group_vars/k8s-cluster.yml
%config %{_sysconfdir}/%{name}/inventory/sample/group_vars/k8s_cluster.yml
%license %{_docdir}/%{name}/LICENSE
%{python2_sitelib}/%{srcname}-%{release}-py%{python2_version}.egg-info
%{_datarootdir}/%{name}/roles/

View File

@ -1,40 +1,44 @@
## Kubernetes on AWS with Terraform
# Kubernetes on AWS with Terraform
**Overview:**
## Overview
This project will create:
* VPC with Public and Private Subnets in # Availability Zones
* Bastion Hosts and NAT Gateways in the Public Subnet
* A dynamic number of masters, etcd, and worker nodes in the Private Subnet
* even distributed over the # of Availability Zones
* AWS ELB in the Public Subnet for accessing the Kubernetes API from the internet
**Requirements**
- VPC with Public and Private Subnets in # Availability Zones
- Bastion Hosts and NAT Gateways in the Public Subnet
- A dynamic number of masters, etcd, and worker nodes in the Private Subnet
- even distributed over the # of Availability Zones
- AWS ELB in the Public Subnet for accessing the Kubernetes API from the internet
## Requirements
- Terraform 0.12.0 or newer
**How to Use:**
## How to Use
- Export the variables for your AWS credentials or edit `credentials.tfvars`:
```
```commandline
export TF_VAR_AWS_ACCESS_KEY_ID="www"
export TF_VAR_AWS_SECRET_ACCESS_KEY ="xxx"
export TF_VAR_AWS_SSH_KEY_NAME="yyy"
export TF_VAR_AWS_DEFAULT_REGION="zzz"
```
- Update `contrib/terraform/aws/terraform.tfvars` with your data. By default, the Terraform scripts use Ubuntu 18.04 LTS (Bionic) as the base image. If you want to change this behaviour, see the note "Using a distro other than Ubuntu" below.
- Create an AWS EC2 SSH Key
- Run with `terraform apply --var-file="credentials.tfvars"` or `terraform apply` depending if you exported your AWS credentials
Example:
```commandline
terraform apply -var-file=credentials.tfvars
```
- Terraform automatically creates an Ansible Inventory file called `hosts` with the created infrastructure in the directory `inventory`
- Ansible will automatically generate an SSH config file for your bastion hosts. To connect to hosts with SSH through the bastion host, use the generated `ssh-bastion.conf`.
Ansible automatically detects the bastion and changes `ssh_args`:
```commandline
ssh -F ./ssh-bastion.conf user@$ip
```
@ -42,14 +46,19 @@ ssh -F ./ssh-bastion.conf user@$ip
- Once the infrastructure is created, you can run the kubespray playbooks and supply inventory/hosts with the `-i` flag.
Example (this one assumes you are using Ubuntu)
```commandline
ansible-playbook -i ./inventory/hosts ./cluster.yml -e ansible_user=ubuntu -b --become-user=root --flush-cache
```
***Using a distro other than Ubuntu***
If you want to use a distribution other than Ubuntu 18.04 LTS (Bionic), you can modify the search filters of the 'data "aws_ami" "distro"' in variables.tf.
For example, to use:
- Debian Jessie, replace 'data "aws_ami" "distro"' in variables.tf with
```ini
data "aws_ami" "distro" {
most_recent = true
@ -65,8 +74,11 @@ data "aws_ami" "distro" {
owners = ["379101102735"]
}
```
- Ubuntu 16.04, replace 'data "aws_ami" "distro"' in variables.tf with
```ini
data "aws_ami" "distro" {
most_recent = true
@ -82,8 +94,11 @@ data "aws_ami" "distro" {
owners = ["099720109477"]
}
```
- Centos 7, replace 'data "aws_ami" "distro"' in variables.tf with
```ini
data "aws_ami" "distro" {
most_recent = true
@ -99,23 +114,49 @@ data "aws_ami" "distro" {
owners = ["688023202711"]
}
```
**Troubleshooting**
## Connecting to Kubernetes
***Remaining AWS IAM Instance Profile***:
You can use the following set of commands to get the kubeconfig file from your newly created cluster. Before running the commands, make sure you are in the project's root folder.
```commandline
# Get the controller's IP address.
CONTROLLER_HOST_NAME=$(cat ./inventory/hosts | grep "\[kube_control_plane\]" -A 1 | tail -n 1)
CONTROLLER_IP=$(cat ./inventory/hosts | grep $CONTROLLER_HOST_NAME | grep ansible_host | cut -d'=' -f2)
# Get the hostname of the load balancer.
LB_HOST=$(cat inventory/hosts | grep apiserver_loadbalancer_domain_name | cut -d'"' -f2)
# Get the controller's SSH fingerprint.
ssh-keygen -R $CONTROLLER_IP > /dev/null 2>&1
ssh-keyscan -H $CONTROLLER_IP >> ~/.ssh/known_hosts 2>/dev/null
# Get the kubeconfig from the controller.
mkdir -p ~/.kube
ssh -F ssh-bastion.conf centos@$CONTROLLER_IP "sudo chmod 644 /etc/kubernetes/admin.conf"
scp -F ssh-bastion.conf centos@$CONTROLLER_IP:/etc/kubernetes/admin.conf ~/.kube/config
sed -i "s^server:.*^server: https://$LB_HOST:6443^" ~/.kube/config
kubectl get nodes
```
## Troubleshooting
### Remaining AWS IAM Instance Profile
If the cluster was destroyed without using Terraform it is possible that
the AWS IAM Instance Profiles still remain. To delete them you can use
the `AWS CLI` with the following command:
```
```commandline
aws iam delete-instance-profile --region <region_name> --instance-profile-name <profile_name>
```
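If you are not sure which profiles were left behind, you can list them first (this assumes your AWS CLI credentials are already configured):
```commandline
aws iam list-instance-profiles --query 'InstanceProfiles[].InstanceProfileName' --output text
```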
***Ansible Inventory doesn't get created:***
### Ansible Inventory doesn't get created
It can happen that Terraform doesn't create an Ansible inventory file automatically. If this is the case, copy the output after `inventory=`, create a file named `hosts` in the `inventory` directory, and paste the inventory into that file.
**Architecture**
## Architecture
Pictured is an AWS infrastructure created with this Terraform project, distributed over two Availability Zones.

View File

@ -61,11 +61,11 @@ resource "aws_instance" "bastion-server" {
key_name = var.AWS_SSH_KEY_NAME
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-bastion-${count.index}",
"Cluster", "${var.aws_cluster_name}",
"Role", "bastion-${var.aws_cluster_name}-${count.index}"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-bastion-${count.index}"
Cluster = var.aws_cluster_name
Role = "bastion-${var.aws_cluster_name}-${count.index}"
}))
}
/*
@ -84,14 +84,14 @@ resource "aws_instance" "k8s-master" {
vpc_security_group_ids = module.aws-vpc.aws_security_group
iam_instance_profile = module.aws-iam.kube-master-profile
iam_instance_profile = module.aws-iam.kube_control_plane-profile
key_name = var.AWS_SSH_KEY_NAME
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-master${count.index}",
"kubernetes.io/cluster/${var.aws_cluster_name}", "member",
"Role", "master"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-master${count.index}"
"kubernetes.io/cluster/${var.aws_cluster_name}" = "member"
Role = "master"
}))
}
resource "aws_elb_attachment" "attach_master_nodes" {
@ -113,11 +113,11 @@ resource "aws_instance" "k8s-etcd" {
key_name = var.AWS_SSH_KEY_NAME
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-etcd${count.index}",
"kubernetes.io/cluster/${var.aws_cluster_name}", "member",
"Role", "etcd"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-etcd${count.index}"
"kubernetes.io/cluster/${var.aws_cluster_name}" = "member"
Role = "etcd"
}))
}
resource "aws_instance" "k8s-worker" {
@ -134,11 +134,11 @@ resource "aws_instance" "k8s-worker" {
iam_instance_profile = module.aws-iam.kube-worker-profile
key_name = var.AWS_SSH_KEY_NAME
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-worker${count.index}",
"kubernetes.io/cluster/${var.aws_cluster_name}", "member",
"Role", "worker"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-worker${count.index}"
"kubernetes.io/cluster/${var.aws_cluster_name}" = "member"
Role = "worker"
}))
}
/*

View File

@ -2,9 +2,9 @@ resource "aws_security_group" "aws-elb" {
name = "kubernetes-${var.aws_cluster_name}-securitygroup-elb"
vpc_id = var.aws_vpc_id
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-securitygroup-elb"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-securitygroup-elb"
}))
}
resource "aws_security_group_rule" "aws-allow-api-access" {
@ -42,7 +42,7 @@ resource "aws_elb" "aws-elb-api" {
healthy_threshold = 2
unhealthy_threshold = 2
timeout = 3
target = "TCP:${var.k8s_secure_api_port}"
target = "HTTPS:${var.k8s_secure_api_port}/healthz"
interval = 30
}
@ -51,7 +51,7 @@ resource "aws_elb" "aws-elb-api" {
connection_draining = true
connection_draining_timeout = 400
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-elb-api"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-elb-api"
}))
}

View File

@ -16,15 +16,15 @@ variable "k8s_secure_api_port" {
variable "aws_avail_zones" {
description = "Availability Zones Used"
type = "list"
type = list(string)
}
variable "aws_subnet_ids_public" {
description = "IDs of Public Subnets"
type = "list"
type = list(string)
}
variable "default_tags" {
description = "Tags for all resources"
type = "map"
type = map(string)
}

View File

@ -1,6 +1,6 @@
#Add AWS Roles for Kubernetes
resource "aws_iam_role" "kube-master" {
resource "aws_iam_role" "kube_control_plane" {
name = "kubernetes-${var.aws_cluster_name}-master"
assume_role_policy = <<EOF
@ -40,9 +40,9 @@ EOF
#Add AWS Policies for Kubernetes
resource "aws_iam_role_policy" "kube-master" {
resource "aws_iam_role_policy" "kube_control_plane" {
name = "kubernetes-${var.aws_cluster_name}-master"
role = aws_iam_role.kube-master.id
role = aws_iam_role.kube_control_plane.id
policy = <<EOF
{
@ -130,9 +130,9 @@ EOF
#Create AWS Instance Profiles
resource "aws_iam_instance_profile" "kube-master" {
resource "aws_iam_instance_profile" "kube_control_plane" {
name = "kube_${var.aws_cluster_name}_master_profile"
role = aws_iam_role.kube-master.name
role = aws_iam_role.kube_control_plane.name
}
resource "aws_iam_instance_profile" "kube-worker" {

View File

@ -1,5 +1,5 @@
output "kube-master-profile" {
value = aws_iam_instance_profile.kube-master.name
output "kube_control_plane-profile" {
value = aws_iam_instance_profile.kube_control_plane.name
}
output "kube-worker-profile" {

View File

@ -5,9 +5,9 @@ resource "aws_vpc" "cluster-vpc" {
enable_dns_support = true
enable_dns_hostnames = true
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-vpc"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-vpc"
}))
}
resource "aws_eip" "cluster-nat-eip" {
@ -18,9 +18,9 @@ resource "aws_eip" "cluster-nat-eip" {
resource "aws_internet_gateway" "cluster-vpc-internetgw" {
vpc_id = aws_vpc.cluster-vpc.id
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-internetgw"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-internetgw"
}))
}
resource "aws_subnet" "cluster-vpc-subnets-public" {
@ -29,10 +29,10 @@ resource "aws_subnet" "cluster-vpc-subnets-public" {
availability_zone = element(var.aws_avail_zones, count.index)
cidr_block = element(var.aws_cidr_subnets_public, count.index)
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-${element(var.aws_avail_zones, count.index)}-public",
"kubernetes.io/cluster/${var.aws_cluster_name}", "member"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-${element(var.aws_avail_zones, count.index)}-public"
"kubernetes.io/cluster/${var.aws_cluster_name}" = "member"
}))
}
resource "aws_nat_gateway" "cluster-nat-gateway" {
@ -47,9 +47,9 @@ resource "aws_subnet" "cluster-vpc-subnets-private" {
availability_zone = element(var.aws_avail_zones, count.index)
cidr_block = element(var.aws_cidr_subnets_private, count.index)
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-${element(var.aws_avail_zones, count.index)}-private"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-${element(var.aws_avail_zones, count.index)}-private"
}))
}
#Routing in VPC
@ -64,9 +64,9 @@ resource "aws_route_table" "kubernetes-public" {
gateway_id = aws_internet_gateway.cluster-vpc-internetgw.id
}
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-routetable-public"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-routetable-public"
}))
}
resource "aws_route_table" "kubernetes-private" {
@ -78,9 +78,9 @@ resource "aws_route_table" "kubernetes-private" {
nat_gateway_id = element(aws_nat_gateway.cluster-nat-gateway.*.id, count.index)
}
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-routetable-private-${count.index}"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-routetable-private-${count.index}"
}))
}
resource "aws_route_table_association" "kubernetes-public" {
@ -101,9 +101,9 @@ resource "aws_security_group" "kubernetes" {
name = "kubernetes-${var.aws_cluster_name}-securitygroup"
vpc_id = aws_vpc.cluster-vpc.id
tags = merge(var.default_tags, map(
"Name", "kubernetes-${var.aws_cluster_name}-securitygroup"
))
tags = merge(var.default_tags, tomap({
Name = "kubernetes-${var.aws_cluster_name}-securitygroup"
}))
}
resource "aws_security_group_rule" "allow-all-ingress" {

View File

@ -8,20 +8,20 @@ variable "aws_cluster_name" {
variable "aws_avail_zones" {
description = "AWS Availability Zones Used"
type = "list"
type = list(string)
}
variable "aws_cidr_subnets_private" {
description = "CIDR Blocks for private subnets in Availability zones"
type = "list"
type = list(string)
}
variable "aws_cidr_subnets_public" {
description = "CIDR Blocks for public subnets in Availability zones"
type = "list"
type = list(string)
}
variable "default_tags" {
description = "Default tags for all resources"
type = "map"
type = map(string)
}

View File

@ -7,11 +7,11 @@ ${public_ip_address_bastion}
[bastion]
${public_ip_address_bastion}
[kube-master]
[kube_control_plane]
${list_master}
[kube-node]
[kube_node]
${list_node}
@ -19,10 +19,10 @@ ${list_node}
${list_etcd}
[k8s-cluster:children]
kube-node
kube-master
[k8s_cluster:children]
kube_node
kube_control_plane
[k8s-cluster:vars]
[k8s_cluster:vars]
${elb_api_fqdn}

View File

@ -44,12 +44,12 @@ variable "aws_vpc_cidr_block" {
variable "aws_cidr_subnets_private" {
description = "CIDR Blocks for private subnets in Availability Zones"
type = "list"
type = list(string)
}
variable "aws_cidr_subnets_public" {
description = "CIDR Blocks for public subnets in Availability Zones"
type = "list"
type = list(string)
}
//AWS EC2 Settings
@ -101,7 +101,7 @@ variable "k8s_secure_api_port" {
variable "default_tags" {
description = "Default tags for all resources"
type = "map"
type = map(string)
}
variable "inventory_file" {

View File

@ -0,0 +1,154 @@
# Kubernetes on Exoscale with Terraform
Provision a Kubernetes cluster on [Exoscale](https://www.exoscale.com/) using Terraform and Kubespray
## Overview
The setup looks like the following:
```text
Kubernetes cluster
+-----------------------+
+---------------+ | +--------------+ |
| | | | +--------------+ |
| API server LB +---------> | | | |
| | | | | Master/etcd | |
+---------------+ | | | node(s) | |
| +-+ | |
| +--------------+ |
| ^ |
| | |
| v |
+---------------+ | +--------------+ |
| | | | +--------------+ |
| Ingress LB +---------> | | | |
| | | | | Worker | |
+---------------+ | | | node(s) | |
| +-+ | |
| +--------------+ |
+-----------------------+
```
## Requirements
* Terraform 0.13.0 or newer
*0.12 also works if you modify the provider block to include version and remove all `versions.tf` files*
## Quickstart
NOTE: *Assumes you are at the root of the kubespray repo*
Copy the sample inventory for your cluster and copy the default terraform variables.
```bash
CLUSTER=my-exoscale-cluster
cp -r inventory/sample inventory/$CLUSTER
cp contrib/terraform/exoscale/default.tfvars inventory/$CLUSTER/
cd inventory/$CLUSTER
```
Edit `default.tfvars` to match your setup. You MUST, at the very least, change `ssh_public_keys`.
```bash
# Ensure $EDITOR points to your favorite editor, e.g., vim, emacs, VS Code, etc.
$EDITOR default.tfvars
```
For authentication you can use the credentials file `~/.cloudstack.ini` or `./cloudstack.ini`.
The file should look something like this:
```ini
[cloudstack]
key = <API key>
secret = <API secret>
```
Follow the [Exoscale IAM Quick-start](https://community.exoscale.com/documentation/iam/quick-start/) to learn how to generate API keys.
### Encrypted credentials
To have the credentials encrypted at rest, you can use [sops](https://github.com/mozilla/sops) and only decrypt the credentials at runtime.
```bash
cat << EOF > cloudstack.ini
[cloudstack]
key =
secret =
EOF
sops --encrypt --in-place --pgp <PGP key fingerprint> cloudstack.ini
sops cloudstack.ini
```
Run terraform to create the infrastructure
```bash
terraform init ../../contrib/terraform/exoscale
terraform apply -var-file default.tfvars ../../contrib/terraform/exoscale
```
If your cloudstack credentials file is encrypted using sops, run the following:
```bash
terraform init ../../contrib/terraform/exoscale
sops exec-file -no-fifo cloudstack.ini 'CLOUDSTACK_CONFIG={} terraform apply -var-file default.tfvars ../../contrib/terraform/exoscale'
```
You should now have an inventory file named `inventory.ini` that you can use with kubespray.
You can copy this inventory file and use it with kubespray to set up a cluster.
You can run `terraform output` to find out the IP addresses of the nodes, as well as the control-plane and data-plane load-balancers.
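For example, run from the same inventory directory where you applied (the output names are defined in the module's `output.tf`):
```bash
terraform output
# or a single value, e.g. the control-plane load-balancer address:
terraform output control_plane_lb_ip_address
```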
It is a good idea to check that you have basic SSH connectivity to the nodes. You can do that by:
```bash
ansible -i inventory.ini -m ping all
```
Example to use this with the default sample inventory:
```bash
ansible-playbook -i inventory.ini ../../cluster.yml -b -v
```
## Teardown
Since the Kubernetes cluster cannot create any load-balancers or disks itself, teardown is as simple as running Terraform destroy:
```bash
terraform destroy -var-file default.tfvars ../../contrib/terraform/exoscale
```
## Variables
### Required
* `ssh_public_keys`: List of public SSH keys to install on all machines
* `zone`: The zone where to run the cluster
* `machines`: Machines to provision. Key of this object will be used as the name of the machine
* `node_type`: The role of this node *(master|worker)*
* `size`: The size to use
* `boot_disk`: The boot disk to use
* `image_name`: Name of the image
* `root_partition_size`: Size *(in GB)* for the root partition
* `ceph_partition_size`: Size *(in GB)* for the partition for rook to use as ceph storage. *(Set to 0 to disable)*
* `node_local_partition_size`: Size *(in GB)* for the partition for node-local-storage. *(Set to 0 to disable)*
* `ssh_whitelist`: List of IP ranges (CIDR) that will be allowed to ssh to the nodes
* `api_server_whitelist`: List of IP ranges (CIDR) that will be allowed to connect to the API server
* `nodeport_whitelist`: List of IP ranges (CIDR) that will be allowed to connect to the kubernetes nodes on port 30000-32767 (kubernetes nodeports)
### Optional
* `prefix`: Prefix to use for all resources, required to be unique for all clusters in the same project *(Defaults to `default`)*
An example variables file can be found in `default.tfvars`
## Known limitations
### Only single disk
Since Exoscale doesn't support mounting additional disks onto an instance, this script can instead create partitions on the boot disk for [Rook](https://rook.io/) and [node-local-storage](https://kubernetes.io/docs/concepts/storage/volumes/#local).
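As a quick sanity check after provisioning with a non-zero `node_local_partition_size`, you could verify the extra partition on a node; the device and mount point below follow the cloud-init template used by this script:
```bash
lsblk /dev/vda
df -h /mnt/disks/node-local-storage
```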
### No Kubernetes API
The current solution doesn't use the [Exoscale Kubernetes cloud controller](https://github.com/exoscale/exoscale-cloud-controller-manager).
This means that we need to set up an HTTP(S) load balancer in front of all workers and run the Ingress controller in DaemonSet mode.
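A rough smoke test once an Ingress controller is running on the workers (replace the placeholder with the `ingress_controller_lb_ip_address` output):
```bash
curl -I http://<ingress_controller_lb_ip_address>/
curl -kI https://<ingress_controller_lb_ip_address>/
```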

View File

@ -0,0 +1,65 @@
prefix = "default"
zone = "ch-gva-2"
inventory_file = "inventory.ini"
ssh_public_keys = [
# Put your public SSH key here
"ssh-rsa I-did-not-read-the-docs",
"ssh-rsa I-did-not-read-the-docs 2",
]
machines = {
"master-0" : {
"node_type" : "master",
"size" : "Medium",
"boot_disk" : {
"image_name" : "Linux Ubuntu 20.04 LTS 64-bit",
"root_partition_size" : 50,
"node_local_partition_size" : 0,
"ceph_partition_size" : 0
}
},
"worker-0" : {
"node_type" : "worker",
"size" : "Large",
"boot_disk" : {
"image_name" : "Linux Ubuntu 20.04 LTS 64-bit",
"root_partition_size" : 50,
"node_local_partition_size" : 0,
"ceph_partition_size" : 0
}
},
"worker-1" : {
"node_type" : "worker",
"size" : "Large",
"boot_disk" : {
"image_name" : "Linux Ubuntu 20.04 LTS 64-bit",
"root_partition_size" : 50,
"node_local_partition_size" : 0,
"ceph_partition_size" : 0
}
},
"worker-2" : {
"node_type" : "worker",
"size" : "Large",
"boot_disk" : {
"image_name" : "Linux Ubuntu 20.04 LTS 64-bit",
"root_partition_size" : 50,
"node_local_partition_size" : 0,
"ceph_partition_size" : 0
}
}
}
nodeport_whitelist = [
"0.0.0.0/0"
]
ssh_whitelist = [
"0.0.0.0/0"
]
api_server_whitelist = [
"0.0.0.0/0"
]

View File

@ -0,0 +1,49 @@
provider "exoscale" {}
module "kubernetes" {
source = "./modules/kubernetes-cluster"
prefix = var.prefix
machines = var.machines
ssh_public_keys = var.ssh_public_keys
ssh_whitelist = var.ssh_whitelist
api_server_whitelist = var.api_server_whitelist
nodeport_whitelist = var.nodeport_whitelist
}
#
# Generate ansible inventory
#
data "template_file" "inventory" {
template = file("${path.module}/templates/inventory.tpl")
vars = {
connection_strings_master = join("\n", formatlist("%s ansible_user=ubuntu ansible_host=%s ip=%s etcd_member_name=etcd%d",
keys(module.kubernetes.master_ip_addresses),
values(module.kubernetes.master_ip_addresses).*.public_ip,
values(module.kubernetes.master_ip_addresses).*.private_ip,
range(1, length(module.kubernetes.master_ip_addresses) + 1)))
connection_strings_worker = join("\n", formatlist("%s ansible_user=ubuntu ansible_host=%s ip=%s",
keys(module.kubernetes.worker_ip_addresses),
values(module.kubernetes.worker_ip_addresses).*.public_ip,
values(module.kubernetes.worker_ip_addresses).*.private_ip))
list_master = join("\n", keys(module.kubernetes.master_ip_addresses))
list_worker = join("\n", keys(module.kubernetes.worker_ip_addresses))
api_lb_ip_address = module.kubernetes.control_plane_lb_ip_address
}
}
resource "null_resource" "inventories" {
provisioner "local-exec" {
command = "echo '${data.template_file.inventory.rendered}' > ${var.inventory_file}"
}
triggers = {
template = data.template_file.inventory.rendered
}
}

View File

@ -0,0 +1,193 @@
data "exoscale_compute_template" "os_image" {
for_each = var.machines
zone = var.zone
name = each.value.boot_disk.image_name
}
data "exoscale_compute" "master_nodes" {
for_each = exoscale_compute.master
id = each.value.id
# Since private IP address is not assigned until the nics are created we need this
depends_on = [exoscale_nic.master_private_network_nic]
}
data "exoscale_compute" "worker_nodes" {
for_each = exoscale_compute.worker
id = each.value.id
# Since private IP address is not assigned until the nics are created we need this
depends_on = [exoscale_nic.worker_private_network_nic]
}
resource "exoscale_network" "private_network" {
zone = var.zone
name = "${var.prefix}-network"
start_ip = cidrhost(var.private_network_cidr, 1)
# cidr -1 = Broadcast address
# cidr -2 = DHCP server address (exoscale specific)
end_ip = cidrhost(var.private_network_cidr, -3)
netmask = cidrnetmask(var.private_network_cidr)
}
resource "exoscale_compute" "master" {
for_each = {
for name, machine in var.machines :
name => machine
if machine.node_type == "master"
}
display_name = "${var.prefix}-${each.key}"
template_id = data.exoscale_compute_template.os_image[each.key].id
size = each.value.size
disk_size = each.value.boot_disk.root_partition_size + each.value.boot_disk.node_local_partition_size + each.value.boot_disk.ceph_partition_size
state = "Running"
zone = var.zone
security_groups = [exoscale_security_group.master_sg.name]
user_data = templatefile(
"${path.module}/templates/cloud-init.tmpl",
{
eip_ip_address = exoscale_ipaddress.ingress_controller_lb.ip_address
node_local_partition_size = each.value.boot_disk.node_local_partition_size
ceph_partition_size = each.value.boot_disk.ceph_partition_size
root_partition_size = each.value.boot_disk.root_partition_size
node_type = "master"
ssh_public_keys = var.ssh_public_keys
}
)
}
resource "exoscale_compute" "worker" {
for_each = {
for name, machine in var.machines :
name => machine
if machine.node_type == "worker"
}
display_name = "${var.prefix}-${each.key}"
template_id = data.exoscale_compute_template.os_image[each.key].id
size = each.value.size
disk_size = each.value.boot_disk.root_partition_size + each.value.boot_disk.node_local_partition_size + each.value.boot_disk.ceph_partition_size
state = "Running"
zone = var.zone
security_groups = [exoscale_security_group.worker_sg.name]
user_data = templatefile(
"${path.module}/templates/cloud-init.tmpl",
{
eip_ip_address = exoscale_ipaddress.ingress_controller_lb.ip_address
node_local_partition_size = each.value.boot_disk.node_local_partition_size
ceph_partition_size = each.value.boot_disk.ceph_partition_size
root_partition_size = each.value.boot_disk.root_partition_size
node_type = "worker"
ssh_public_keys = var.ssh_public_keys
}
)
}
resource "exoscale_nic" "master_private_network_nic" {
for_each = exoscale_compute.master
compute_id = each.value.id
network_id = exoscale_network.private_network.id
}
resource "exoscale_nic" "worker_private_network_nic" {
for_each = exoscale_compute.worker
compute_id = each.value.id
network_id = exoscale_network.private_network.id
}
resource "exoscale_security_group" "master_sg" {
name = "${var.prefix}-master-sg"
description = "Security group for Kubernetes masters"
}
resource "exoscale_security_group_rules" "master_sg_rules" {
security_group_id = exoscale_security_group.master_sg.id
# SSH
ingress {
protocol = "TCP"
cidr_list = var.ssh_whitelist
ports = ["22"]
}
# Kubernetes API
ingress {
protocol = "TCP"
cidr_list = var.api_server_whitelist
ports = ["6443"]
}
}
resource "exoscale_security_group" "worker_sg" {
name = "${var.prefix}-worker-sg"
description = "security group for kubernetes worker nodes"
}
resource "exoscale_security_group_rules" "worker_sg_rules" {
security_group_id = exoscale_security_group.worker_sg.id
# SSH
ingress {
protocol = "TCP"
cidr_list = var.ssh_whitelist
ports = ["22"]
}
# HTTP(S)
ingress {
protocol = "TCP"
cidr_list = ["0.0.0.0/0"]
ports = ["80", "443"]
}
# Kubernetes Nodeport
ingress {
protocol = "TCP"
cidr_list = var.nodeport_whitelist
ports = ["30000-32767"]
}
}
resource "exoscale_ipaddress" "ingress_controller_lb" {
zone = var.zone
healthcheck_mode = "http"
healthcheck_port = 80
healthcheck_path = "/healthz"
healthcheck_interval = 10
healthcheck_timeout = 2
healthcheck_strikes_ok = 2
healthcheck_strikes_fail = 3
}
resource "exoscale_secondary_ipaddress" "ingress_controller_lb" {
for_each = exoscale_compute.worker
compute_id = each.value.id
ip_address = exoscale_ipaddress.ingress_controller_lb.ip_address
}
resource "exoscale_ipaddress" "control_plane_lb" {
zone = var.zone
healthcheck_mode = "tcp"
healthcheck_port = 6443
healthcheck_interval = 10
healthcheck_timeout = 2
healthcheck_strikes_ok = 2
healthcheck_strikes_fail = 3
}
resource "exoscale_secondary_ipaddress" "control_plane_lb" {
for_each = exoscale_compute.master
compute_id = each.value.id
ip_address = exoscale_ipaddress.control_plane_lb.ip_address
}

View File

@ -0,0 +1,31 @@
output "master_ip_addresses" {
value = {
for key, instance in exoscale_compute.master :
instance.name => {
"private_ip" = contains(keys(data.exoscale_compute.master_nodes), key) ? data.exoscale_compute.master_nodes[key].private_network_ip_addresses[0] : ""
"public_ip" = exoscale_compute.master[key].ip_address
}
}
}
output "worker_ip_addresses" {
value = {
for key, instance in exoscale_compute.worker :
instance.name => {
"private_ip" = contains(keys(data.exoscale_compute.worker_nodes), key) ? data.exoscale_compute.worker_nodes[key].private_network_ip_addresses[0] : ""
"public_ip" = exoscale_compute.worker[key].ip_address
}
}
}
output "cluster_private_network_cidr" {
value = var.private_network_cidr
}
output "ingress_controller_lb_ip_address" {
value = exoscale_ipaddress.ingress_controller_lb.ip_address
}
output "control_plane_lb_ip_address" {
value = exoscale_ipaddress.control_plane_lb.ip_address
}

View File

@ -0,0 +1,52 @@
#cloud-config
%{ if ceph_partition_size > 0 || node_local_partition_size > 0}
bootcmd:
- [ cloud-init-per, once, move-second-header, sgdisk, --move-second-header, /dev/vda ]
%{ if node_local_partition_size > 0 }
# Create partition for node local storage
- [ cloud-init-per, once, create-node-local-part, parted, --script, /dev/vda, 'mkpart extended ext4 ${root_partition_size}GB %{ if ceph_partition_size == 0 }-1%{ else }${root_partition_size + node_local_partition_size}GB%{ endif }' ]
- [ cloud-init-per, once, create-fs-node-local-part, mkfs.ext4, /dev/vda2 ]
%{ endif }
%{ if ceph_partition_size > 0 }
# Create partition for rook to use for ceph
- [ cloud-init-per, once, create-ceph-part, parted, --script, /dev/vda, 'mkpart extended ${root_partition_size + node_local_partition_size}GB -1' ]
%{ endif }
%{ endif }
ssh_authorized_keys:
%{ for ssh_public_key in ssh_public_keys ~}
- ${ssh_public_key}
%{ endfor ~}
write_files:
- path: /etc/netplan/eth1.yaml
content: |
network:
version: 2
ethernets:
eth1:
dhcp4: true
%{ if node_type == "worker" }
# TODO: When a VM is seen as healthy and is added to the EIP loadbalancer
# pool it no longer can send traffic back to itself via the EIP IP
# address.
# Remove this if it ever gets solved.
- path: /etc/netplan/20-eip-fix.yaml
content: |
network:
version: 2
ethernets:
"lo:0":
match:
name: lo
dhcp4: false
addresses:
- ${eip_ip_address}/32
%{ endif }
runcmd:
- netplan apply
%{ if node_local_partition_size > 0 }
- mkdir -p /mnt/disks/node-local-storage
- chown nobody:nogroup /mnt/disks/node-local-storage
- mount /dev/vda2 /mnt/disks/node-local-storage
%{ endif }

View File

@ -0,0 +1,42 @@
variable "zone" {
type = string
# This is currently the only zone that is supposed to be supporting
# so called "managed private networks".
# See: https://www.exoscale.com/syslog/introducing-managed-private-networks
default = "ch-gva-2"
}
variable "prefix" {}
variable "machines" {
type = map(object({
node_type = string
size = string
boot_disk = object({
image_name = string
root_partition_size = number
ceph_partition_size = number
node_local_partition_size = number
})
}))
}
variable "ssh_public_keys" {
type = list(string)
}
variable "ssh_whitelist" {
type = list(string)
}
variable "api_server_whitelist" {
type = list(string)
}
variable "nodeport_whitelist" {
type = list(string)
}
variable "private_network_cidr" {
default = "172.0.10.0/24"
}

View File

@ -0,0 +1,9 @@
terraform {
required_providers {
exoscale = {
source = "exoscale/exoscale"
version = ">= 0.21"
}
}
required_version = ">= 0.13"
}

View File

@ -0,0 +1,15 @@
output "master_ips" {
value = module.kubernetes.master_ip_addresses
}
output "worker_ips" {
value = module.kubernetes.worker_ip_addresses
}
output "ingress_controller_lb_ip_address" {
value = module.kubernetes.ingress_controller_lb_ip_address
}
output "control_plane_lb_ip_address" {
value = module.kubernetes.control_plane_lb_ip_address
}

View File

@ -0,0 +1,65 @@
prefix = "default"
zone = "ch-gva-2"
inventory_file = "inventory.ini"
ssh_public_keys = [
# Put your public SSH key here
"ssh-rsa I-did-not-read-the-docs",
"ssh-rsa I-did-not-read-the-docs 2",
]
machines = {
"master-0" : {
"node_type" : "master",
"size" : "Small",
"boot_disk" : {
"image_name" : "Linux Ubuntu 20.04 LTS 64-bit",
"root_partition_size" : 50,
"node_local_partition_size" : 0,
"ceph_partition_size" : 0
}
},
"worker-0" : {
"node_type" : "worker",
"size" : "Large",
"boot_disk" : {
"image_name" : "Linux Ubuntu 20.04 LTS 64-bit",
"root_partition_size" : 50,
"node_local_partition_size" : 0,
"ceph_partition_size" : 0
}
},
"worker-1" : {
"node_type" : "worker",
"size" : "Large",
"boot_disk" : {
"image_name" : "Linux Ubuntu 20.04 LTS 64-bit",
"root_partition_size" : 50,
"node_local_partition_size" : 0,
"ceph_partition_size" : 0
}
},
"worker-2" : {
"node_type" : "worker",
"size" : "Large",
"boot_disk" : {
"image_name" : "Linux Ubuntu 20.04 LTS 64-bit",
"root_partition_size" : 50,
"node_local_partition_size" : 0,
"ceph_partition_size" : 0
}
}
}
nodeport_whitelist = [
"0.0.0.0/0"
]
ssh_whitelist = [
"0.0.0.0/0"
]
api_server_whitelist = [
"0.0.0.0/0"
]

View File

@ -0,0 +1 @@
../../../../inventory/sample/group_vars

View File

@ -0,0 +1,19 @@
[all]
${connection_strings_master}
${connection_strings_worker}
[kube_control_plane]
${list_master}
[kube_control_plane:vars]
supplementary_addresses_in_ssl_keys = [ "${api_lb_ip_address}" ]
[etcd]
${list_master}
[kube_node]
${list_worker}
[k8s_cluster:children]
kube_control_plane
kube_node

View File

@ -0,0 +1,46 @@
variable "zone" {
description = "The zone where to run the cluster"
}
variable "prefix" {
description = "Prefix for resource names"
default = "default"
}
variable "machines" {
description = "Cluster machines"
type = map(object({
node_type = string
size = string
boot_disk = object({
image_name = string
root_partition_size = number
ceph_partition_size = number
node_local_partition_size = number
})
}))
}
variable "ssh_public_keys" {
description = "List of public SSH keys which are injected into the VMs."
type = list(string)
}
variable "ssh_whitelist" {
description = "List of IP ranges (CIDR) to whitelist for ssh"
type = list(string)
}
variable "api_server_whitelist" {
description = "List of IP ranges (CIDR) to whitelist for kubernetes api server"
type = list(string)
}
variable "nodeport_whitelist" {
description = "List of IP ranges (CIDR) to whitelist for kubernetes nodeports"
type = list(string)
}
variable "inventory_file" {
description = "Where to store the generated inventory file"
}

View File

@ -0,0 +1,15 @@
terraform {
required_providers {
exoscale = {
source = "exoscale/exoscale"
version = ">= 0.21"
}
null = {
source = "hashicorp/null"
}
template = {
source = "hashicorp/template"
}
}
required_version = ">= 0.13"
}

View File

@ -0,0 +1,90 @@
# Kubernetes on GCP with Terraform
Provision a Kubernetes cluster on GCP using Terraform and Kubespray
## Overview
The setup looks like the following:
```text
Kubernetes cluster
+-----------------------+
+---------------+ | +--------------+ |
| | | | +--------------+ |
| API server LB +---------> | | | |
| | | | | Master/etcd | |
+---------------+ | | | node(s) | |
| +-+ | |
| +--------------+ |
| ^ |
| | |
| v |
+---------------+ | +--------------+ |
| | | | +--------------+ |
| Ingress LB +---------> | | | |
| | | | | Worker | |
+---------------+ | | | node(s) | |
| +-+ | |
| +--------------+ |
+-----------------------+
```
## Requirements
* Terraform 0.12.0 or newer
## Quickstart
To get a cluster up and running you'll need a JSON keyfile.
Set the path to the file in the `tfvars.json` file and run the following:
```bash
terraform apply -var-file tfvars.json -state dev-cluster.tfstate -var gcp_project_id=<ID of your GCP project> -var keyfile_location=<location of the json keyfile>
```
To generate kubespray inventory based on the terraform state file you can run the following:
```bash
./generate-inventory.sh dev-cluster.tfstate > inventory.ini
```
You should now have an inventory file named `inventory.ini` that you can use with kubespray, e.g.
```bash
ansible-playbook -i contrib/terraform/gcs/inventory.ini cluster.yml -b -v
```
## Variables
### Required
* `keyfile_location`: Location to the keyfile to use as credentials for the google terraform provider
* `gcp_project_id`: ID of the GCP project to deploy the cluster in
* `ssh_pub_key`: Path to public ssh key to use for all machines
* `region`: The region where to run the cluster
* `machines`: Machines to provision. Key of this object will be used as the name of the machine
* `node_type`: The role of this node *(master|worker)*
* `size`: The size to use
* `zone`: The zone the machine should run in
* `additional_disks`: Extra disks to add to the machine. Key of this object will be used as the disk name
* `size`: Size of the disk (in GB)
* `boot_disk`: The boot disk to use
* `image_name`: Name of the image
* `size`: Size of the boot disk (in GB)
* `ssh_whitelist`: List of IP ranges (CIDR) that will be allowed to ssh to the nodes
* `api_server_whitelist`: List of IP ranges (CIDR) that will be allowed to connect to the API server
* `nodeport_whitelist`: List of IP ranges (CIDR) that will be allowed to connect to the kubernetes nodes on port 30000-32767 (kubernetes nodeports)
### Optional
* `prefix`: Prefix to use for all resources, required to be unique for all clusters in the same project *(Defaults to `default`)*
* `master_sa_email`: Service account email to use for the master nodes *(Defaults to `""`, auto generate one)*
* `master_sa_scopes`: Service account scopes to use for the master nodes *(Defaults to `["https://www.googleapis.com/auth/cloud-platform"]`)*
* `worker_sa_email`: Service account email to use for the worker nodes *(Defaults to `""`, auto generate one)*
* `worker_sa_scopes`: Service account scopes to use for the worker nodes *(Defaults to `["https://www.googleapis.com/auth/cloud-platform"]`)*
An example variables file can be found in `tfvars.json`
## Known limitations
This solution does not provide a bastion host, so all nodes must expose a public IP for kubespray to work.

View File

@ -0,0 +1,76 @@
#!/bin/bash
#
# Generates an inventory file based on the terraform output.
# After provisioning a cluster, simply run this command and supply the terraform state file
# Default state file is terraform.tfstate
#
set -e
usage () {
echo "Usage: $0 <state file>" >&2
exit 1
}
if [[ $# -ne 1 ]]; then
usage
fi
TF_STATE_FILE=${1}
if [[ ! -f "${TF_STATE_FILE}" ]]; then
echo "ERROR: state file ${TF_STATE_FILE} doesn't exist" >&2
usage
fi
TF_OUT=$(terraform output -state "${TF_STATE_FILE}" -json)
MASTERS=$(jq -r '.master_ips.value | to_entries[]' <(echo "${TF_OUT}"))
WORKERS=$(jq -r '.worker_ips.value | to_entries[]' <(echo "${TF_OUT}"))
mapfile -t MASTER_NAMES < <(jq -r '.key' <(echo "${MASTERS}"))
mapfile -t WORKER_NAMES < <(jq -r '.key' <(echo "${WORKERS}"))
API_LB=$(jq -r '.control_plane_lb_ip_address.value' <(echo "${TF_OUT}"))
# Generate master hosts
i=1
for name in "${MASTER_NAMES[@]}"; do
private_ip=$(jq -r '. | select( .key=='"\"${name}\""' ) | .value.private_ip' <(echo "${MASTERS}"))
public_ip=$(jq -r '. | select( .key=='"\"${name}\""' ) | .value.public_ip' <(echo "${MASTERS}"))
echo "${name} ansible_user=ubuntu ansible_host=${public_ip} ip=${private_ip} etcd_member_name=etcd${i}"
i=$(( i + 1 ))
done
# Generate worker hosts
for name in "${WORKER_NAMES[@]}"; do
private_ip=$(jq -r '. | select( .key=='"\"${name}\""' ) | .value.private_ip' <(echo "${WORKERS}"))
public_ip=$(jq -r '. | select( .key=='"\"${name}\""' ) | .value.public_ip' <(echo "${WORKERS}"))
echo "${name} ansible_user=ubuntu ansible_host=${public_ip} ip=${private_ip}"
done
echo ""
echo "[kube_control_plane]"
for name in "${MASTER_NAMES[@]}"; do
echo "${name}"
done
echo ""
echo "[kube_control_plane:vars]"
echo "supplementary_addresses_in_ssl_keys = [ '${API_LB}' ]" # Add LB address to API server certificate
echo ""
echo "[etcd]"
for name in "${MASTER_NAMES[@]}"; do
echo "${name}"
done
echo ""
echo "[kube_node]"
for name in "${WORKER_NAMES[@]}"; do
echo "${name}"
done
echo ""
echo "[k8s_cluster:children]"
echo "kube_control_plane"
echo "kube_node"

View File

@ -0,0 +1,24 @@
provider "google" {
credentials = file(var.keyfile_location)
region = var.region
project = var.gcp_project_id
version = "~> 3.48"
}
module "kubernetes" {
source = "./modules/kubernetes-cluster"
region = var.region
prefix = var.prefix
machines = var.machines
ssh_pub_key = var.ssh_pub_key
master_sa_email = var.master_sa_email
master_sa_scopes = var.master_sa_scopes
worker_sa_email = var.worker_sa_email
worker_sa_scopes = var.worker_sa_scopes
ssh_whitelist = var.ssh_whitelist
api_server_whitelist = var.api_server_whitelist
nodeport_whitelist = var.nodeport_whitelist
}

View File

@ -0,0 +1,360 @@
#################################################
##
## General
##
resource "google_compute_network" "main" {
name = "${var.prefix}-network"
}
resource "google_compute_subnetwork" "main" {
name = "${var.prefix}-subnet"
network = google_compute_network.main.name
ip_cidr_range = var.private_network_cidr
region = var.region
}
resource "google_compute_firewall" "deny_all" {
name = "${var.prefix}-default-firewall"
network = google_compute_network.main.name
priority = 1000
deny {
protocol = "all"
}
}
resource "google_compute_firewall" "allow_internal" {
name = "${var.prefix}-internal-firewall"
network = google_compute_network.main.name
priority = 500
source_ranges = [var.private_network_cidr]
allow {
protocol = "all"
}
}
resource "google_compute_firewall" "ssh" {
name = "${var.prefix}-ssh-firewall"
network = google_compute_network.main.name
priority = 100
source_ranges = var.ssh_whitelist
allow {
protocol = "tcp"
ports = ["22"]
}
}
resource "google_compute_firewall" "api_server" {
name = "${var.prefix}-api-server-firewall"
network = google_compute_network.main.name
priority = 100
source_ranges = var.api_server_whitelist
allow {
protocol = "tcp"
ports = ["6443"]
}
}
resource "google_compute_firewall" "nodeport" {
name = "${var.prefix}-nodeport-firewall"
network = google_compute_network.main.name
priority = 100
source_ranges = var.nodeport_whitelist
allow {
protocol = "tcp"
ports = ["30000-32767"]
}
}
resource "google_compute_firewall" "ingress_http" {
name = "${var.prefix}-http-ingress-firewall"
network = google_compute_network.main.name
priority = 100
allow {
protocol = "tcp"
ports = ["80"]
}
}
resource "google_compute_firewall" "ingress_https" {
name = "${var.prefix}-https-ingress-firewall"
network = google_compute_network.main.name
priority = 100
allow {
protocol = "tcp"
ports = ["443"]
}
}
#################################################
##
## Local variables
##
locals {
master_target_list = [
for name, machine in google_compute_instance.master :
"${machine.zone}/${machine.name}"
]
worker_target_list = [
for name, machine in google_compute_instance.worker :
"${machine.zone}/${machine.name}"
]
master_disks = flatten([
for machine_name, machine in var.machines : [
for disk_name, disk in machine.additional_disks : {
"${machine_name}-${disk_name}" = {
"machine_name": machine_name,
"machine": machine,
"disk_size": disk.size,
"disk_name": disk_name
}
}
]
if machine.node_type == "master"
])
worker_disks = flatten([
for machine_name, machine in var.machines : [
for disk_name, disk in machine.additional_disks : {
"${machine_name}-${disk_name}" = {
"machine_name": machine_name,
"machine": machine,
"disk_size": disk.size,
"disk_name": disk_name
}
}
]
if machine.node_type == "worker"
])
}
#################################################
##
## Master
##
resource "google_compute_address" "master" {
for_each = {
for name, machine in var.machines :
name => machine
if machine.node_type == "master"
}
name = "${var.prefix}-${each.key}-pip"
address_type = "EXTERNAL"
region = var.region
}
resource "google_compute_disk" "master" {
for_each = {
for item in local.master_disks :
keys(item)[0] => values(item)[0]
}
name = "${var.prefix}-${each.key}"
type = "pd-ssd"
zone = each.value.machine.zone
size = each.value.disk_size
physical_block_size_bytes = 4096
}
resource "google_compute_attached_disk" "master" {
for_each = {
for item in local.master_disks :
keys(item)[0] => values(item)[0]
}
disk = google_compute_disk.master[each.key].id
instance = google_compute_instance.master[each.value.machine_name].id
}
resource "google_compute_instance" "master" {
for_each = {
for name, machine in var.machines :
name => machine
if machine.node_type == "master"
}
name = "${var.prefix}-${each.key}"
machine_type = each.value.size
zone = each.value.zone
tags = ["master"]
boot_disk {
initialize_params {
image = each.value.boot_disk.image_name
size = each.value.boot_disk.size
}
}
network_interface {
subnetwork = google_compute_subnetwork.main.name
access_config {
nat_ip = google_compute_address.master[each.key].address
}
}
metadata = {
ssh-keys = "ubuntu:${trimspace(file(pathexpand(var.ssh_pub_key)))}"
}
service_account {
email = var.master_sa_email
scopes = var.master_sa_scopes
}
# Since we use google_compute_attached_disk we need to ignore this
lifecycle {
ignore_changes = ["attached_disk"]
}
}
resource "google_compute_forwarding_rule" "master_lb" {
name = "${var.prefix}-master-lb-forward-rule"
port_range = "6443"
target = google_compute_target_pool.master_lb.id
}
resource "google_compute_target_pool" "master_lb" {
name = "${var.prefix}-master-lb-pool"
instances = local.master_target_list
}
#################################################
##
## Worker
##
resource "google_compute_disk" "worker" {
for_each = {
for item in local.worker_disks :
keys(item)[0] => values(item)[0]
}
name = "${var.prefix}-${each.key}"
type = "pd-ssd"
zone = each.value.machine.zone
size = each.value.disk_size
physical_block_size_bytes = 4096
}
resource "google_compute_attached_disk" "worker" {
for_each = {
for item in local.worker_disks :
keys(item)[0] => values(item)[0]
}
disk = google_compute_disk.worker[each.key].id
instance = google_compute_instance.worker[each.value.machine_name].id
}
resource "google_compute_address" "worker" {
for_each = {
for name, machine in var.machines :
name => machine
if machine.node_type == "worker"
}
name = "${var.prefix}-${each.key}-pip"
address_type = "EXTERNAL"
region = var.region
}
resource "google_compute_instance" "worker" {
for_each = {
for name, machine in var.machines :
name => machine
if machine.node_type == "worker"
}
name = "${var.prefix}-${each.key}"
machine_type = each.value.size
zone = each.value.zone
tags = ["worker"]
boot_disk {
initialize_params {
image = each.value.boot_disk.image_name
size = each.value.boot_disk.size
}
}
network_interface {
subnetwork = google_compute_subnetwork.main.name
access_config {
nat_ip = google_compute_address.worker[each.key].address
}
}
metadata = {
ssh-keys = "ubuntu:${trimspace(file(pathexpand(var.ssh_pub_key)))}"
}
service_account {
email = var.worker_sa_email
scopes = var.worker_sa_scopes
}
# Since we use google_compute_attached_disk we need to ignore this
lifecycle {
ignore_changes = ["attached_disk"]
}
}
resource "google_compute_address" "worker_lb" {
name = "${var.prefix}-worker-lb-address"
address_type = "EXTERNAL"
region = var.region
}
resource "google_compute_forwarding_rule" "worker_http_lb" {
name = "${var.prefix}-worker-http-lb-forward-rule"
ip_address = google_compute_address.worker_lb.address
port_range = "80"
target = google_compute_target_pool.worker_lb.id
}
resource "google_compute_forwarding_rule" "worker_https_lb" {
name = "${var.prefix}-worker-https-lb-forward-rule"
ip_address = google_compute_address.worker_lb.address
port_range = "443"
target = google_compute_target_pool.worker_lb.id
}
resource "google_compute_target_pool" "worker_lb" {
name = "${var.prefix}-worker-lb-pool"
instances = local.worker_target_list
}

View File

@ -0,0 +1,27 @@
output "master_ip_addresses" {
value = {
for key, instance in google_compute_instance.master :
instance.name => {
"private_ip" = instance.network_interface.0.network_ip
"public_ip" = instance.network_interface.0.access_config.0.nat_ip
}
}
}
output "worker_ip_addresses" {
value = {
for key, instance in google_compute_instance.worker :
instance.name => {
"private_ip" = instance.network_interface.0.network_ip
"public_ip" = instance.network_interface.0.access_config.0.nat_ip
}
}
}
output "ingress_controller_lb_ip_address" {
value = google_compute_address.worker_lb.address
}
output "control_plane_lb_ip_address" {
value = google_compute_forwarding_rule.master_lb.ip_address
}

View File

@ -0,0 +1,54 @@
variable "region" {
type = string
}
variable "prefix" {}
variable "machines" {
type = map(object({
node_type = string
size = string
zone = string
additional_disks = map(object({
size = number
}))
boot_disk = object({
image_name = string
size = number
})
}))
}
variable "master_sa_email" {
type = string
}
variable "master_sa_scopes" {
type = list(string)
}
variable "worker_sa_email" {
type = string
}
variable "worker_sa_scopes" {
type = list(string)
}
variable "ssh_pub_key" {}
variable "ssh_whitelist" {
type = list(string)
}
variable "api_server_whitelist" {
type = list(string)
}
variable "nodeport_whitelist" {
type = list(string)
}
variable "private_network_cidr" {
default = "10.0.10.0/24"
}

View File

@ -0,0 +1,15 @@
output "master_ips" {
value = module.kubernetes.master_ip_addresses
}
output "worker_ips" {
value = module.kubernetes.worker_ip_addresses
}
output "ingress_controller_lb_ip_address" {
value = module.kubernetes.ingress_controller_lb_ip_address
}
output "control_plane_lb_ip_address" {
value = module.kubernetes.control_plane_lb_ip_address
}

View File

@ -0,0 +1,60 @@
{
  "gcp_project_id": "GCP_PROJECT_ID",
  "region": "us-central1",
  "ssh_pub_key": "~/.ssh/id_rsa.pub",
  "keyfile_location": "service-account.json",
  "prefix": "development",
  "ssh_whitelist": [
    "1.2.3.4/32"
  ],
  "api_server_whitelist": [
    "1.2.3.4/32"
  ],
  "nodeport_whitelist": [
    "1.2.3.4/32"
  ],
  "machines": {
    "master-0": {
      "node_type": "master",
      "size": "n1-standard-2",
      "zone": "us-central1-a",
      "additional_disks": {},
      "boot_disk": {
        "image_name": "ubuntu-os-cloud/ubuntu-1804-bionic-v20201116",
        "size": 50
      }
    },
    "worker-0": {
      "node_type": "worker",
      "size": "n1-standard-8",
      "zone": "us-central1-a",
      "additional_disks": {
        "extra-disk-1": {
          "size": 100
        }
      },
      "boot_disk": {
        "image_name": "ubuntu-os-cloud/ubuntu-1804-bionic-v20201116",
        "size": 50
      }
    },
    "worker-1": {
      "node_type": "worker",
      "size": "n1-standard-8",
      "zone": "us-central1-a",
      "additional_disks": {
        "extra-disk-1": {
          "size": 100
        }
      },
      "boot_disk": {
        "image_name": "ubuntu-os-cloud/ubuntu-1804-bionic-v20201116",
        "size": 50
      }
    }
  }
}

View File

@ -0,0 +1,72 @@
variable keyfile_location {
  description = "Location of the json keyfile to use with the google provider"
  type        = string
}

variable region {
  description = "Region of all resources"
  type        = string
}

variable gcp_project_id {
  description = "ID of the project"
  type        = string
}

variable prefix {
  description = "Prefix for resource names"
  default     = "default"
}

variable machines {
  description = "Cluster machines"
  type = map(object({
    node_type = string
    size      = string
    zone      = string
    additional_disks = map(object({
      size = number
    }))
    boot_disk = object({
      image_name = string
      size       = number
    })
  }))
}

variable "master_sa_email" {
  type    = string
  default = ""
}

variable "master_sa_scopes" {
  type    = list(string)
  default = ["https://www.googleapis.com/auth/cloud-platform"]
}

variable "worker_sa_email" {
  type    = string
  default = ""
}

variable "worker_sa_scopes" {
  type    = list(string)
  default = ["https://www.googleapis.com/auth/cloud-platform"]
}

variable ssh_pub_key {
  description = "Path to public SSH key file which is injected into the VMs."
  type        = string
}

variable ssh_whitelist {
  type = list(string)
}

variable api_server_whitelist {
  type = list(string)
}

variable nodeport_whitelist {
  type = list(string)
}

View File

@ -9,6 +9,7 @@ This will install a Kubernetes cluster on an OpenStack Cloud. It should work on
most modern installs of OpenStack that support the basic services.
### Known compatible public clouds
- [Auro](https://auro.io/)
- [Betacloud](https://www.betacloud.io/)
- [CityCloud](https://www.citycloud.com/)
@ -23,8 +24,8 @@ most modern installs of OpenStack that support the basic services.
- [VexxHost](https://vexxhost.com/)
- [Zetta](https://www.zetta.io/)
## Approach
The terraform configuration inspects variables found in
[variables.tf](variables.tf) to create resources in your OpenStack cluster.
There is a [python script](../terraform.py) that reads the generated `.tfstate`
@ -32,6 +33,7 @@ file to generate a dynamic inventory that is consumed by the main ansible script
to actually install kubernetes and stand up the cluster.
### Networking
The configuration includes creating a private subnet with a router to the
external net. It will allocate floating IPs from a pool and assign them to the
hosts where that makes sense. You have the option of creating bastion hosts
@ -39,19 +41,23 @@ inside the private subnet to access the nodes there. Alternatively, a node with
a floating IP can be used as a jump host to nodes without.
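For example, to reach a node without a floating IP manually, you could jump through one that has one. This is only an illustrative sketch; the `ubuntu` user and both addresses below are placeholders, not values taken from this configuration:

```ShellSession
# ProxyJump through the node that has a floating IP
ssh -J ubuntu@203.0.113.10 ubuntu@10.0.0.5
```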
#### Using an existing router
It is possible to use an existing router instead of creating one. To use an
existing router set the router\_id variable to the uuid of the router you wish
to use.
For example:
```
```ShellSession
router_id = "00c542e7-6f46-4535-ae95-984c7f0391a3"
```
### Kubernetes Nodes
You can create many different kubernetes topologies by setting the number of
different classes of hosts. For each class there are options for allocating
floating IP addresses or not.
- Master nodes with etcd
- Master nodes without etcd
- Standalone etcd hosts
@ -64,10 +70,12 @@ master nodes with etcd replicas. As an example, if you have three master nodes w
etcd replicas and three standalone etcd nodes, the script will fail since there are
now six total etcd replicas.
### GlusterFS
### GlusterFS shared file system
The Terraform configuration supports provisioning of an optional GlusterFS
shared file system based on a separate set of VMs. To enable this, you need to
specify:
- the number of Gluster hosts (minimum 2)
- Size of the non-ephemeral volumes to be attached to store the GlusterFS bricks
- Other properties related to provisioning the hosts
@ -87,7 +95,9 @@ binaries available on hyperkube v1.4.3_coreos.0 or higher.
- you have a pair of keys generated that can be used to secure the new hosts
## Module Architecture
The configuration is divided into three modules:
- Network
- IPs
- Compute
@ -100,12 +110,13 @@ to be updated.
You can force your existing IPs by modifying the compute variables in
`kubespray.tf` as follows:
```
```ini
k8s_master_fips = ["151.101.129.67"]
k8s_node_fips = ["151.101.129.68"]
```
## Terraform
Terraform will be used to provision all of the OpenStack resources with base software as appropriate.
### Configuration
@ -115,10 +126,10 @@ Terraform will be used to provision all of the OpenStack resources with base sof
Create an inventory directory for your cluster by copying the existing sample and linking the `hosts` script (used to build the inventory based on Terraform state):
```ShellSession
$ cp -LRp contrib/terraform/openstack/sample-inventory inventory/$CLUSTER
$ cd inventory/$CLUSTER
$ ln -s ../../contrib/terraform/openstack/hosts
$ ln -s ../../contrib
cp -LRp contrib/terraform/openstack/sample-inventory inventory/$CLUSTER
cd inventory/$CLUSTER
ln -s ../../contrib/terraform/openstack/hosts
ln -s ../../contrib
```
This will be the base for subsequent Terraform commands.
@ -138,13 +149,13 @@ please read the [OpenStack provider documentation](https://www.terraform.io/docs
The recommended authentication method is to describe credentials in a YAML file `clouds.yaml` that can be stored in:
* the current directory
* `~/.config/openstack`
* `/etc/openstack`
- the current directory
- `~/.config/openstack`
- `/etc/openstack`
`clouds.yaml`:
```
```yaml
clouds:
mycloud:
auth:
@ -162,7 +173,7 @@ clouds:
If you have multiple clouds defined in your `clouds.yaml` file you can choose
the one you want to use with the environment variable `OS_CLOUD`:
```
```ShellSession
export OS_CLOUD=mycloud
```
@ -174,7 +185,7 @@ from Horizon under *Project* -> *Compute* -> *Access & Security* -> *API Access*
With identity v2:
```
```ShellSession
source openrc
env | grep OS
@ -191,7 +202,7 @@ OS_IDENTITY_API_VERSION=2
With identity v3:
```
```ShellSession
source openrc
env | grep OS
@ -208,24 +219,24 @@ OS_IDENTITY_API_VERSION=3
OS_USER_DOMAIN_NAME=Default
```
Terraform does not support a mix of DomainName and DomainID, choose one or the
other:
Terraform does not support a mix of DomainName and DomainID, choose one or the other:
```
* provider.openstack: You must provide exactly one of DomainID or DomainName to authenticate by Username
```
- provider.openstack: You must provide exactly one of DomainID or DomainName to authenticate by Username
```
```ShellSession
unset OS_USER_DOMAIN_NAME
export OS_USER_DOMAIN_ID=default
```
or
```ShellSession
unset OS_PROJECT_DOMAIN_ID
set OS_PROJECT_DOMAIN_NAME=Default
```
#### Cluster variables
The construction of the cluster is driven by values found in
[variables.tf](variables.tf).
@ -239,6 +250,7 @@ For your cluster, edit `inventory/$CLUSTER/cluster.tfvars`.
|`network_dns_domain` | (Optional) The dns_domain for the internal network that will be generated |
|`dns_nameservers`| An array of DNS name server names to be used by hosts in the internal subnet. |
|`floatingip_pool` | Name of the pool from which floating IPs will be allocated |
|`k8s_master_fips` | A list of floating IPs that you have already pre-allocated; they will be attached to master nodes instead of creating new random floating IPs. |
|`external_net` | UUID of the external network that will be routed to |
|`flavor_k8s_master`,`flavor_k8s_node`,`flavor_etcd`, `flavor_bastion`,`flavor_gfs_node` | Flavor depends on your openstack installation, you can get available flavor IDs through `openstack flavor list` |
|`image`,`image_gfs` | Name of the image to use in provisioning the compute resources. Should already be loaded into glance. |
@ -251,12 +263,13 @@ For your cluster, edit `inventory/$CLUSTER/cluster.tfvars`.
|`number_of_bastions` | Number of bastion hosts to create. Scripts assume this is really just zero or one |
|`number_of_gfs_nodes_no_floating_ip` | Number of gluster servers to provision. |
| `gfs_volume_size_in_gb` | Size of the non-ephemeral volumes to be attached to store the GlusterFS bricks |
|`supplementary_master_groups` | To add ansible groups to the masters, such as `kube-node` for tainting them as nodes, empty by default. |
|`supplementary_node_groups` | To add ansible groups to the nodes, such as `kube-ingress` for running ingress controller pods, empty by default. |
|`supplementary_master_groups` | To add ansible groups to the masters, such as `kube_node` for tainting them as nodes, empty by default. |
|`supplementary_node_groups` | To add ansible groups to the nodes, such as `kube_ingress` for running ingress controller pods, empty by default. |
|`bastion_allowed_remote_ips` | List of CIDR allowed to initiate a SSH connection, `["0.0.0.0/0"]` by default |
|`master_allowed_remote_ips` | List of CIDR blocks allowed to initiate an API connection, `["0.0.0.0/0"]` by default |
|`k8s_allowed_remote_ips` | List of CIDR allowed to initiate a SSH connection, empty by default |
|`worker_allowed_ports` | List of ports to open on worker nodes, `[{ "protocol" = "tcp", "port_range_min" = 30000, "port_range_max" = 32767, "remote_ip_prefix" = "0.0.0.0/0"}]` by default |
|`master_allowed_ports` | List of ports to open on master nodes, expected format is `[{ "protocol" = "tcp", "port_range_min" = 443, "port_range_max" = 443, "remote_ip_prefix" = "0.0.0.0/0"}]`, empty by default |
|`wait_for_floatingip` | Let Terraform poll the instance until the floating IP has been associated, `false` by default. |
|`node_root_volume_size_in_gb` | Size of the root volume for nodes, 0 to use ephemeral storage |
|`master_root_volume_size_in_gb` | Size of the root volume for masters, 0 to use ephemeral storage |
@ -264,16 +277,19 @@ For your cluster, edit `inventory/$CLUSTER/cluster.tfvars`.
|`etcd_root_volume_size_in_gb` | Size of the root volume for etcd nodes, 0 to use ephemeral storage |
|`bastion_root_volume_size_in_gb` | Size of the root volume for bastions, 0 to use ephemeral storage |
|`use_server_group` | Create and use openstack nova servergroups, default: false |
|`use_access_ip` | If 1, nodes with floating IPs will transmit internal cluster traffic via floating IPs; if 0 private IPs will be used instead. Default value is 1. |
|`k8s_nodes` | Map containing worker node definition, see explanation below |
##### k8s_nodes
Allows a custom defintion of worker nodes giving the operator full control over individual node flavor and
Allows a custom definition of worker nodes giving the operator full control over individual node flavor and
availability zone placement. To enable the use of this mode set the `number_of_k8s_nodes` and
`number_of_k8s_nodes_no_floating_ip` variables to 0. Then define your desired worker node configuration
using the `k8s_nodes` variable.
For example:
```
```ini
k8s_nodes = {
"1" = {
"az" = "sto1"
@ -294,14 +310,16 @@ k8s_nodes = {
```
Would result in the same configuration as:
```
```ini
number_of_k8s_nodes = 3
flavor_k8s_node = "83d8b44a-26a0-4f02-a981-079446926445"
az_list = ["sto1", "sto2", "sto3"]
```
And:
```
```ini
k8s_nodes = {
"ing-1" = {
"az" = "sto1"
@ -355,7 +373,8 @@ Would result in three nodes in each availability zone each with their own separa
flavor and floating ip configuration.
The "schema":
```
```ini
k8s_nodes = {
"key | node name suffix, must be unique" = {
"az" = string
@ -364,6 +383,7 @@ k8s_nodes = {
},
}
```
All values are required.
#### Terraform state files
@ -372,10 +392,10 @@ In the cluster's inventory folder, the following files might be created (either
or manually), to prevent you from pushing them accidentally they are in a
`.gitignore` file in the `terraform/openstack` directory :
* `.terraform`
* `.tfvars`
* `.tfstate`
* `.tfstate.backup`
- `.terraform`
- `.tfvars`
- `.tfstate`
- `.tfstate.backup`
You can still add them manually if you want to.
@ -385,39 +405,43 @@ Before Terraform can operate on your cluster you need to install the required
plugins. This is accomplished as follows:
```ShellSession
$ cd inventory/$CLUSTER
$ terraform init ../../contrib/terraform/openstack
cd inventory/$CLUSTER
terraform init ../../contrib/terraform/openstack
```
This should finish fairly quickly, telling you that Terraform has successfully initialized and loaded the necessary modules.
### Provisioning cluster
You can apply the Terraform configuration to your cluster with the following command
issued from your cluster's inventory directory (`inventory/$CLUSTER`):
```ShellSession
$ terraform apply -var-file=cluster.tfvars ../../contrib/terraform/openstack
terraform apply -var-file=cluster.tfvars ../../contrib/terraform/openstack
```
if you chose to create a bastion host, this script will create
`contrib/terraform/openstack/k8s-cluster.yml` with an ssh command for Ansible to
`contrib/terraform/openstack/k8s_cluster.yml` with an ssh command for Ansible to
be able to access your machines tunneling through the bastion's IP address. If
you want to manually handle the ssh tunneling to these machines, please delete
or move that file. If you want to use this, just leave it there, as ansible will
pick it up automatically.
### Destroying cluster
You can destroy your new cluster with the following command issued from the cluster's inventory directory:
```ShellSession
$ terraform destroy -var-file=cluster.tfvars ../../contrib/terraform/openstack
terraform destroy -var-file=cluster.tfvars ../../contrib/terraform/openstack
```
If you've started the Ansible run, it may also be a good idea to do some manual cleanup:
* remove SSH keys from the destroyed cluster from your `~/.ssh/known_hosts` file
* clean up any temporary cache files: `rm /tmp/$CLUSTER-*`
- remove SSH keys from the destroyed cluster from your `~/.ssh/known_hosts` file
- clean up any temporary cache files: `rm /tmp/$CLUSTER-*`
### Debugging
You can enable debugging output from Terraform by setting
`OS_DEBUG` to 1 and `TF_LOG` to `DEBUG` before running the Terraform command.
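For example, a minimal sketch before re-running the apply command shown above:

```ShellSession
export OS_DEBUG=1
export TF_LOG=DEBUG
terraform apply -var-file=cluster.tfvars ../../contrib/terraform/openstack
```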
@ -425,8 +449,8 @@ You can enable debugging output from Terraform by setting
Terraform can output values that are useful for configuring Neutron/Octavia LBaaS or Cinder persistent volume provisioning as part of your Kubernetes deployment (see the example after this list):
- `private_subnet_id`: the subnet where your instances are running is used for `openstack_lbaas_subnet_id`
- `floating_network_id`: the network_id where the floating IP are provisioned is used for `openstack_lbaas_floating_network_id`
- `private_subnet_id`: the subnet where your instances are running is used for `openstack_lbaas_subnet_id`
- `floating_network_id`: the network_id where the floating IP are provisioned is used for `openstack_lbaas_floating_network_id`
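Assuming you run Terraform from the cluster's inventory directory (where the state file lives), one way to read these values is with `terraform output`:

```ShellSession
terraform output private_subnet_id
terraform output floating_network_id
```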
## Ansible
@ -437,9 +461,9 @@ Terraform can output values that are useful for configure Neutron/Octavia LBaaS
Ensure your local ssh-agent is running and your ssh key has been added. This
step is required by the terraform provisioner:
```
$ eval $(ssh-agent -s)
$ ssh-add ~/.ssh/id_rsa
```ShellSession
eval $(ssh-agent -s)
ssh-add ~/.ssh/id_rsa
```
If you have deployed and destroyed a previous iteration of your cluster, you will need to clear out any stale keys from your SSH "known hosts" file ( `~/.ssh/known_hosts`).
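One way to drop a stale entry is `ssh-keygen -R`; the address below is just a placeholder for a floating IP of the old cluster:

```ShellSession
ssh-keygen -R 203.0.113.10
```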
@ -451,13 +475,13 @@ generated`.tfstate` file to generate a dynamic inventory recognizes
some variables within a "metadata" block, defined in a "resource"
block (example):
```
```ini
resource "openstack_compute_instance_v2" "example" {
...
metadata {
ssh_user = "ubuntu"
prefer_ipv6 = true
python_bin = "/usr/bin/python3"
python_bin = "/usr/bin/python3"
}
...
}
@ -472,8 +496,8 @@ instance should be preferred over IPv4.
Bastion access will be determined by:
- Your choice on the amount of bastion hosts (set by `number_of_bastions` terraform variable).
- The existence of nodes/masters with floating IPs (set by `number_of_k8s_masters`, `number_of_k8s_nodes`, `number_of_k8s_masters_no_etcd` terraform variables).
- Your choice on the amount of bastion hosts (set by `number_of_bastions` terraform variable).
- The existence of nodes/masters with floating IPs (set by `number_of_k8s_masters`, `number_of_k8s_nodes`, `number_of_k8s_masters_no_etcd` terraform variables).
If you have a bastion host, your ssh traffic will be directly routed through it. This is regardless of whether you have masters/nodes with a floating IP assigned.
If you don't have a bastion host, but at least one of your masters/nodes has a floating IP, then ssh traffic will be tunneled by one of these machines.
@ -484,7 +508,7 @@ So, either a bastion host, or at least master/node with a floating IP are requir
Make sure you can connect to the hosts. Note that Flatcar Container Linux by Kinvolk will have a state `FAILED` due to Python not being present. This is okay, because Python will be installed during bootstrapping, so long as the hosts are not `UNREACHABLE`.
```
```ShellSession
$ ansible -i inventory/$CLUSTER/hosts -m ping all
example-k8s_node-1 | SUCCESS => {
"changed": false,
@ -505,48 +529,55 @@ If it fails try to connect manually via SSH. It could be something as simple as
### Configure cluster variables
Edit `inventory/$CLUSTER/group_vars/all/all.yml`:
- **bin_dir**:
```
```yml
# Directory where the binaries will be installed
# Default:
# bin_dir: /usr/local/bin
# For Flatcar Container Linux by Kinvolk:
bin_dir: /opt/bin
```
- and **cloud_provider**:
```
```yml
cloud_provider: openstack
```
Edit `inventory/$CLUSTER/group_vars/k8s-cluster/k8s-cluster.yml`:
Edit `inventory/$CLUSTER/group_vars/k8s_cluster/k8s_cluster.yml`:
- Set variable **kube_network_plugin** to your desired networking plugin.
- **flannel** works out-of-the-box
- **calico** requires [configuring OpenStack Neutron ports](/docs/openstack.md) to allow service and pod subnets
```
```yml
# Choose network plugin (calico, weave or flannel)
# Can also be set to 'cloud', which lets the cloud provider setup appropriate routing
kube_network_plugin: flannel
```
- Set variable **resolvconf_mode**
```
```yml
# Can be docker_dns, host_resolvconf or none
# Default:
# resolvconf_mode: docker_dns
# For Flatcar Container Linux by Kinvolk:
resolvconf_mode: host_resolvconf
```
- Set the maximum number of attached Cinder volumes per host (default is 256)
```
```yml
node_volume_attach_limit: 26
```
- Disable access_ip; this will make all internal cluster traffic go over the local network when a floating IP is attached (by default this value is set to 1)
```
use_access_ip: 0
```
### Deploy Kubernetes
```
$ ansible-playbook --become -i inventory/$CLUSTER/hosts cluster.yml
```ShellSession
ansible-playbook --become -i inventory/$CLUSTER/hosts cluster.yml
```
This will take some time as there are many tasks to run.
@ -554,26 +585,36 @@ This will take some time as there are many tasks to run.
## Kubernetes
### Set up kubectl
1. [Install kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) on your workstation
2. Add a route to the internal IP of a master node (if needed):
```
```ShellSession
sudo route add [master-internal-ip] gw [router-ip]
```
or
```
```ShellSession
sudo route add -net [internal-subnet]/24 gw [router-ip]
```
3. List Kubernetes certificates & keys:
```
1. List Kubernetes certificates & keys:
```ShellSession
ssh [os-user]@[master-ip] sudo ls /etc/kubernetes/ssl/
```
4. Get `admin`'s certificates and keys:
```
1. Get `admin`'s certificates and keys:
```ShellSession
ssh [os-user]@[master-ip] sudo cat /etc/kubernetes/ssl/admin-kube-master-1-key.pem > admin-key.pem
ssh [os-user]@[master-ip] sudo cat /etc/kubernetes/ssl/admin-kube-master-1.pem > admin.pem
ssh [os-user]@[master-ip] sudo cat /etc/kubernetes/ssl/ca.pem > ca.pem
```
5. Configure kubectl:
1. Configure kubectl:
```ShellSession
$ kubectl config set-cluster default-cluster --server=https://[master-internal-ip]:6443 \
--certificate-authority=ca.pem
@ -586,21 +627,24 @@ $ kubectl config set-credentials default-admin \
$ kubectl config set-context default-system --cluster=default-cluster --user=default-admin
$ kubectl config use-context default-system
```
7. Check it:
```
1. Check it:
```ShellSession
kubectl version
```
## GlusterFS
GlusterFS is not deployed by the standard`cluster.yml` playbook, see the
GlusterFS is not deployed by the standard `cluster.yml` playbook, see the
[GlusterFS playbook documentation](../../network-storage/glusterfs/README.md)
for instructions.
Basically you will install Gluster as follows:
```ShellSession
$ ansible-playbook --become -i inventory/$CLUSTER/hosts ./contrib/network-storage/glusterfs/glusterfs.yml
```
```ShellSession
ansible-playbook --become -i inventory/$CLUSTER/hosts ./contrib/network-storage/glusterfs/glusterfs.yml
```
## What's next
@ -609,6 +653,7 @@ Try out your new Kubernetes cluster with the [Hello Kubernetes service](https://
## Appendix
### Migration from `number_of_k8s_nodes*` to `k8s_nodes`
If you currently have a cluster defined using the `number_of_k8s_nodes*` variables and wish
to migrate to the `k8s_nodes` style you can do it like so:

View File

@ -1,7 +1,3 @@
provider "openstack" {
version = "~> 1.17"
}
module "network" {
source = "./modules/network"
@ -27,6 +23,8 @@ module "ips" {
network_name = var.network_name
router_id = module.network.router_id
k8s_nodes = var.k8s_nodes
k8s_master_fips = var.k8s_master_fips
router_internal_port_id = module.network.router_internal_port_id
}
module "compute" {
@ -54,7 +52,11 @@ module "compute" {
master_volume_type = var.master_volume_type
public_key_path = var.public_key_path
image = var.image
image_uuid = var.image_uuid
image_gfs = var.image_gfs
image_master = var.image_master
image_master_uuid = var.image_master_uuid
image_gfs_uuid = var.image_gfs_uuid
ssh_user = var.ssh_user
ssh_user_gfs = var.ssh_user_gfs
flavor_k8s_master = var.flavor_k8s_master
@ -79,6 +81,8 @@ module "compute" {
wait_for_floatingip = var.wait_for_floatingip
use_access_ip = var.use_access_ip
use_server_groups = var.use_server_groups
extra_sec_groups = var.extra_sec_groups
extra_sec_groups_name = var.extra_sec_groups_name
network_id = module.network.router_id
}

View File

@ -1,11 +1,20 @@
data "openstack_images_image_v2" "vm_image" {
count = var.image_uuid == "" ? 1 : 0
most_recent = true
name = var.image
}
data "openstack_images_image_v2" "gfs_image" {
count = var.image_gfs_uuid == "" ? var.image_uuid == "" ? 1 : 0 : 0
most_recent = true
name = var.image_gfs == "" ? var.image : var.image_gfs
}
data "openstack_images_image_v2" "image_master" {
count = var.image_master_uuid == "" ? var.image_uuid == "" ? 1 : 0 : 0
name = var.image_master == "" ? var.image : var.image_master
}
resource "openstack_compute_keypair_v2" "k8s" {
name = "kubernetes-${var.cluster_name}"
public_key = chomp(file(var.public_key_path))
@ -17,6 +26,13 @@ resource "openstack_networking_secgroup_v2" "k8s_master" {
delete_default_rules = true
}
resource "openstack_networking_secgroup_v2" "k8s_master_extra" {
count = "%{if var.extra_sec_groups}1%{else}0%{endif}"
name = "${var.cluster_name}-k8s-master-${var.extra_sec_groups_name}"
description = "${var.cluster_name} - Kubernetes Master nodes - rules not managed by terraform"
delete_default_rules = true
}
resource "openstack_networking_secgroup_rule_v2" "k8s_master" {
count = length(var.master_allowed_remote_ips)
direction = "ingress"
@ -95,6 +111,13 @@ resource "openstack_networking_secgroup_v2" "worker" {
delete_default_rules = true
}
resource "openstack_networking_secgroup_v2" "worker_extra" {
count = "%{if var.extra_sec_groups}1%{else}0%{endif}"
name = "${var.cluster_name}-k8s-worker-${var.extra_sec_groups_name}"
description = "${var.cluster_name} - Kubernetes worker nodes - rules not managed by terraform"
delete_default_rules = true
}
resource "openstack_networking_secgroup_rule_v2" "worker" {
count = length(var.worker_allowed_ports)
direction = "ingress"
@ -124,17 +147,39 @@ resource "openstack_compute_servergroup_v2" "k8s_etcd" {
policies = ["anti-affinity"]
}
locals {
# master groups
master_sec_groups = compact([
openstack_networking_secgroup_v2.k8s_master.name,
openstack_networking_secgroup_v2.k8s.name,
var.extra_sec_groups ? openstack_networking_secgroup_v2.k8s_master_extra[0].name : "",
])
# worker groups
worker_sec_groups = compact([
openstack_networking_secgroup_v2.k8s.name,
openstack_networking_secgroup_v2.worker.name,
var.extra_sec_groups ? openstack_networking_secgroup_v2.worker_extra[0].name : "",
])
# Image uuid
image_to_use_node = var.image_uuid != "" ? var.image_uuid : data.openstack_images_image_v2.vm_image[0].id
# Image_gfs uuid
image_to_use_gfs = var.image_gfs_uuid != "" ? var.image_gfs_uuid : var.image_uuid != "" ? var.image_uuid : data.openstack_images_image_v2.gfs_image[0].id
# image_master uuid
image_to_use_master = var.image_master_uuid != "" ? var.image_master_uuid : var.image_uuid != "" ? var.image_uuid : data.openstack_images_image_v2.image_master[0].id
}
resource "openstack_compute_instance_v2" "bastion" {
name = "${var.cluster_name}-bastion-${count.index + 1}"
count = var.number_of_bastions
image_name = var.image
image_id = var.bastion_root_volume_size_in_gb == 0 ? local.image_to_use_node : null
flavor_id = var.flavor_bastion
key_pair = openstack_compute_keypair_v2.k8s.name
dynamic "block_device" {
for_each = var.bastion_root_volume_size_in_gb > 0 ? [var.image] : []
for_each = var.bastion_root_volume_size_in_gb > 0 ? [local.image_to_use_node] : []
content {
uuid = data.openstack_images_image_v2.vm_image.id
uuid = local.image_to_use_node
source_type = "image"
volume_size = var.bastion_root_volume_size_in_gb
boot_index = 0
@ -159,7 +204,7 @@ resource "openstack_compute_instance_v2" "bastion" {
}
provisioner "local-exec" {
command = "sed s/USER/${var.ssh_user}/ ../../contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${var.bastion_fips[0]}/ > group_vars/no-floating.yml"
command = "sed s/USER/${var.ssh_user}/ ../../contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${var.bastion_fips[0]}/ > group_vars/no_floating.yml"
}
}
@ -167,15 +212,15 @@ resource "openstack_compute_instance_v2" "k8s_master" {
name = "${var.cluster_name}-k8s-master-${count.index + 1}"
count = var.number_of_k8s_masters
availability_zone = element(var.az_list, count.index)
image_name = var.image
image_id = var.master_root_volume_size_in_gb == 0 ? local.image_to_use_master : null
flavor_id = var.flavor_k8s_master
key_pair = openstack_compute_keypair_v2.k8s.name
dynamic "block_device" {
for_each = var.master_root_volume_size_in_gb > 0 ? [var.image] : []
for_each = var.master_root_volume_size_in_gb > 0 ? [local.image_to_use_master] : []
content {
uuid = data.openstack_images_image_v2.vm_image.id
uuid = local.image_to_use_master
source_type = "image"
volume_size = var.master_root_volume_size_in_gb
volume_type = var.master_volume_type
@ -189,9 +234,7 @@ resource "openstack_compute_instance_v2" "k8s_master" {
name = var.network_name
}
security_groups = [openstack_networking_secgroup_v2.k8s_master.name,
openstack_networking_secgroup_v2.k8s.name,
]
security_groups = local.master_sec_groups
dynamic "scheduler_hints" {
for_each = var.use_server_groups ? [openstack_compute_servergroup_v2.k8s_master[0]] : []
@ -202,13 +245,13 @@ resource "openstack_compute_instance_v2" "k8s_master" {
metadata = {
ssh_user = var.ssh_user
kubespray_groups = "etcd,kube-master,${var.supplementary_master_groups},k8s-cluster,vault"
kubespray_groups = "etcd,kube_control_plane,${var.supplementary_master_groups},k8s_cluster"
depends_on = var.network_id
use_access_ip = var.use_access_ip
}
provisioner "local-exec" {
command = "sed s/USER/${var.ssh_user}/ ../../contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(concat(var.bastion_fips, var.k8s_master_fips), 0)}/ > group_vars/no-floating.yml"
command = "sed s/USER/${var.ssh_user}/ ../../contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(concat(var.bastion_fips, var.k8s_master_fips), 0)}/ > group_vars/no_floating.yml"
}
}
@ -216,15 +259,15 @@ resource "openstack_compute_instance_v2" "k8s_master_no_etcd" {
name = "${var.cluster_name}-k8s-master-ne-${count.index + 1}"
count = var.number_of_k8s_masters_no_etcd
availability_zone = element(var.az_list, count.index)
image_name = var.image
image_id = var.master_root_volume_size_in_gb == 0 ? local.image_to_use_master : null
flavor_id = var.flavor_k8s_master
key_pair = openstack_compute_keypair_v2.k8s.name
dynamic "block_device" {
for_each = var.master_root_volume_size_in_gb > 0 ? [var.image] : []
for_each = var.master_root_volume_size_in_gb > 0 ? [local.image_to_use_master] : []
content {
uuid = data.openstack_images_image_v2.vm_image.id
uuid = local.image_to_use_master
source_type = "image"
volume_size = var.master_root_volume_size_in_gb
volume_type = var.master_volume_type
@ -238,9 +281,7 @@ resource "openstack_compute_instance_v2" "k8s_master_no_etcd" {
name = var.network_name
}
security_groups = [openstack_networking_secgroup_v2.k8s_master.name,
openstack_networking_secgroup_v2.k8s.name,
]
security_groups = local.master_sec_groups
dynamic "scheduler_hints" {
for_each = var.use_server_groups ? [openstack_compute_servergroup_v2.k8s_master[0]] : []
@ -251,13 +292,13 @@ resource "openstack_compute_instance_v2" "k8s_master_no_etcd" {
metadata = {
ssh_user = var.ssh_user
kubespray_groups = "kube-master,${var.supplementary_master_groups},k8s-cluster,vault"
kubespray_groups = "kube_control_plane,${var.supplementary_master_groups},k8s_cluster"
depends_on = var.network_id
use_access_ip = var.use_access_ip
}
provisioner "local-exec" {
command = "sed s/USER/${var.ssh_user}/ ../../contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(concat(var.bastion_fips, var.k8s_master_fips), 0)}/ > group_vars/no-floating.yml"
command = "sed s/USER/${var.ssh_user}/ ../../contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(concat(var.bastion_fips, var.k8s_master_fips), 0)}/ > group_vars/no_floating.yml"
}
}
@ -265,14 +306,14 @@ resource "openstack_compute_instance_v2" "etcd" {
name = "${var.cluster_name}-etcd-${count.index + 1}"
count = var.number_of_etcd
availability_zone = element(var.az_list, count.index)
image_name = var.image
image_id = var.etcd_root_volume_size_in_gb == 0 ? local.image_to_use_master : null
flavor_id = var.flavor_etcd
key_pair = openstack_compute_keypair_v2.k8s.name
dynamic "block_device" {
for_each = var.etcd_root_volume_size_in_gb > 0 ? [var.image] : []
for_each = var.etcd_root_volume_size_in_gb > 0 ? [local.image_to_use_master] : []
content {
uuid = data.openstack_images_image_v2.vm_image.id
uuid = local.image_to_use_master
source_type = "image"
volume_size = var.etcd_root_volume_size_in_gb
boot_index = 0
@ -296,7 +337,7 @@ resource "openstack_compute_instance_v2" "etcd" {
metadata = {
ssh_user = var.ssh_user
kubespray_groups = "etcd,vault,no-floating"
kubespray_groups = "etcd,no_floating"
depends_on = var.network_id
use_access_ip = var.use_access_ip
}
@ -306,14 +347,14 @@ resource "openstack_compute_instance_v2" "k8s_master_no_floating_ip" {
name = "${var.cluster_name}-k8s-master-nf-${count.index + 1}"
count = var.number_of_k8s_masters_no_floating_ip
availability_zone = element(var.az_list, count.index)
image_name = var.image
image_id = var.master_root_volume_size_in_gb == 0 ? local.image_to_use_master : null
flavor_id = var.flavor_k8s_master
key_pair = openstack_compute_keypair_v2.k8s.name
dynamic "block_device" {
for_each = var.master_root_volume_size_in_gb > 0 ? [var.image] : []
for_each = var.master_root_volume_size_in_gb > 0 ? [local.image_to_use_master] : []
content {
uuid = data.openstack_images_image_v2.vm_image.id
uuid = local.image_to_use_master
source_type = "image"
volume_size = var.master_root_volume_size_in_gb
volume_type = var.master_volume_type
@ -327,9 +368,7 @@ resource "openstack_compute_instance_v2" "k8s_master_no_floating_ip" {
name = var.network_name
}
security_groups = [openstack_networking_secgroup_v2.k8s_master.name,
openstack_networking_secgroup_v2.k8s.name,
]
security_groups = local.master_sec_groups
dynamic "scheduler_hints" {
for_each = var.use_server_groups ? [openstack_compute_servergroup_v2.k8s_master[0]] : []
@ -340,7 +379,7 @@ resource "openstack_compute_instance_v2" "k8s_master_no_floating_ip" {
metadata = {
ssh_user = var.ssh_user
kubespray_groups = "etcd,kube-master,${var.supplementary_master_groups},k8s-cluster,vault,no-floating"
kubespray_groups = "etcd,kube_control_plane,${var.supplementary_master_groups},k8s_cluster,no_floating"
depends_on = var.network_id
use_access_ip = var.use_access_ip
}
@ -350,14 +389,14 @@ resource "openstack_compute_instance_v2" "k8s_master_no_floating_ip_no_etcd" {
name = "${var.cluster_name}-k8s-master-ne-nf-${count.index + 1}"
count = var.number_of_k8s_masters_no_floating_ip_no_etcd
availability_zone = element(var.az_list, count.index)
image_name = var.image
image_id = var.master_root_volume_size_in_gb == 0 ? local.image_to_use_master : null
flavor_id = var.flavor_k8s_master
key_pair = openstack_compute_keypair_v2.k8s.name
dynamic "block_device" {
for_each = var.master_root_volume_size_in_gb > 0 ? [var.image] : []
for_each = var.master_root_volume_size_in_gb > 0 ? [local.image_to_use_master] : []
content {
uuid = data.openstack_images_image_v2.vm_image.id
uuid = local.image_to_use_master
source_type = "image"
volume_size = var.master_root_volume_size_in_gb
volume_type = var.master_volume_type
@ -371,9 +410,7 @@ resource "openstack_compute_instance_v2" "k8s_master_no_floating_ip_no_etcd" {
name = var.network_name
}
security_groups = [openstack_networking_secgroup_v2.k8s_master.name,
openstack_networking_secgroup_v2.k8s.name,
]
security_groups = local.master_sec_groups
dynamic "scheduler_hints" {
for_each = var.use_server_groups ? [openstack_compute_servergroup_v2.k8s_master[0]] : []
@ -384,7 +421,7 @@ resource "openstack_compute_instance_v2" "k8s_master_no_floating_ip_no_etcd" {
metadata = {
ssh_user = var.ssh_user
kubespray_groups = "kube-master,${var.supplementary_master_groups},k8s-cluster,vault,no-floating"
kubespray_groups = "kube_control_plane,${var.supplementary_master_groups},k8s_cluster,no_floating"
depends_on = var.network_id
use_access_ip = var.use_access_ip
}
@ -394,14 +431,14 @@ resource "openstack_compute_instance_v2" "k8s_node" {
name = "${var.cluster_name}-k8s-node-${count.index + 1}"
count = var.number_of_k8s_nodes
availability_zone = element(var.az_list_node, count.index)
image_name = var.image
image_id = var.node_root_volume_size_in_gb == 0 ? local.image_to_use_node : null
flavor_id = var.flavor_k8s_node
key_pair = openstack_compute_keypair_v2.k8s.name
dynamic "block_device" {
for_each = var.node_root_volume_size_in_gb > 0 ? [var.image] : []
for_each = var.node_root_volume_size_in_gb > 0 ? [local.image_to_use_node] : []
content {
uuid = data.openstack_images_image_v2.vm_image.id
uuid = local.image_to_use_node
source_type = "image"
volume_size = var.node_root_volume_size_in_gb
boot_index = 0
@ -414,9 +451,7 @@ resource "openstack_compute_instance_v2" "k8s_node" {
name = var.network_name
}
security_groups = [openstack_networking_secgroup_v2.k8s.name,
openstack_networking_secgroup_v2.worker.name,
]
security_groups = local.worker_sec_groups
dynamic "scheduler_hints" {
for_each = var.use_server_groups ? [openstack_compute_servergroup_v2.k8s_node[0]] : []
@ -427,13 +462,13 @@ resource "openstack_compute_instance_v2" "k8s_node" {
metadata = {
ssh_user = var.ssh_user
kubespray_groups = "kube-node,k8s-cluster,${var.supplementary_node_groups}"
kubespray_groups = "kube_node,k8s_cluster,${var.supplementary_node_groups}"
depends_on = var.network_id
use_access_ip = var.use_access_ip
}
provisioner "local-exec" {
command = "sed s/USER/${var.ssh_user}/ ../../contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(concat(var.bastion_fips, var.k8s_node_fips), 0)}/ > group_vars/no-floating.yml"
command = "sed s/USER/${var.ssh_user}/ ../../contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(concat(var.bastion_fips, var.k8s_node_fips), 0)}/ > group_vars/no_floating.yml"
}
}
@ -441,14 +476,14 @@ resource "openstack_compute_instance_v2" "k8s_node_no_floating_ip" {
name = "${var.cluster_name}-k8s-node-nf-${count.index + 1}"
count = var.number_of_k8s_nodes_no_floating_ip
availability_zone = element(var.az_list_node, count.index)
image_name = var.image
image_id = var.node_root_volume_size_in_gb == 0 ? local.image_to_use_node : null
flavor_id = var.flavor_k8s_node
key_pair = openstack_compute_keypair_v2.k8s.name
dynamic "block_device" {
for_each = var.node_root_volume_size_in_gb > 0 ? [var.image] : []
for_each = var.node_root_volume_size_in_gb > 0 ? [local.image_to_use_node] : []
content {
uuid = data.openstack_images_image_v2.vm_image.id
uuid = local.image_to_use_node
source_type = "image"
volume_size = var.node_root_volume_size_in_gb
boot_index = 0
@ -461,9 +496,7 @@ resource "openstack_compute_instance_v2" "k8s_node_no_floating_ip" {
name = var.network_name
}
security_groups = [openstack_networking_secgroup_v2.k8s.name,
openstack_networking_secgroup_v2.worker.name,
]
security_groups = local.worker_sec_groups
dynamic "scheduler_hints" {
for_each = var.use_server_groups ? [openstack_compute_servergroup_v2.k8s_node[0]] : []
@ -474,7 +507,7 @@ resource "openstack_compute_instance_v2" "k8s_node_no_floating_ip" {
metadata = {
ssh_user = var.ssh_user
kubespray_groups = "kube-node,k8s-cluster,no-floating,${var.supplementary_node_groups}"
kubespray_groups = "kube_node,k8s_cluster,no_floating,${var.supplementary_node_groups}"
depends_on = var.network_id
use_access_ip = var.use_access_ip
}
@ -484,14 +517,14 @@ resource "openstack_compute_instance_v2" "k8s_nodes" {
for_each = var.number_of_k8s_nodes == 0 && var.number_of_k8s_nodes_no_floating_ip == 0 ? var.k8s_nodes : {}
name = "${var.cluster_name}-k8s-node-${each.key}"
availability_zone = each.value.az
image_name = var.image
image_id = var.node_root_volume_size_in_gb == 0 ? local.image_to_use_node : null
flavor_id = each.value.flavor
key_pair = openstack_compute_keypair_v2.k8s.name
dynamic "block_device" {
for_each = var.node_root_volume_size_in_gb > 0 ? [var.image] : []
for_each = var.node_root_volume_size_in_gb > 0 ? [local.image_to_use_node] : []
content {
uuid = data.openstack_images_image_v2.vm_image.id
uuid = local.image_to_use_node
source_type = "image"
volume_size = var.node_root_volume_size_in_gb
boot_index = 0
@ -504,9 +537,7 @@ resource "openstack_compute_instance_v2" "k8s_nodes" {
name = var.network_name
}
security_groups = [openstack_networking_secgroup_v2.k8s.name,
openstack_networking_secgroup_v2.worker.name,
]
security_groups = local.worker_sec_groups
dynamic "scheduler_hints" {
for_each = var.use_server_groups ? [openstack_compute_servergroup_v2.k8s_node[0]] : []
@ -517,13 +548,13 @@ resource "openstack_compute_instance_v2" "k8s_nodes" {
metadata = {
ssh_user = var.ssh_user
kubespray_groups = "kube-node,k8s-cluster,%{if each.value.floating_ip == false}no-floating,%{endif}${var.supplementary_node_groups}"
kubespray_groups = "kube_node,k8s_cluster,%{if each.value.floating_ip == false}no_floating,%{endif}${var.supplementary_node_groups}"
depends_on = var.network_id
use_access_ip = var.use_access_ip
}
provisioner "local-exec" {
command = "%{if each.value.floating_ip}sed s/USER/${var.ssh_user}/ ../../contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(concat(var.bastion_fips, [for key, value in var.k8s_nodes_fips : value.address]), 0)}/ > group_vars/no-floating.yml%{else}true%{endif}"
command = "%{if each.value.floating_ip}sed s/USER/${var.ssh_user}/ ../../contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(concat(var.bastion_fips, [for key, value in var.k8s_nodes_fips : value.address]), 0)}/ > group_vars/no_floating.yml%{else}true%{endif}"
}
}
@ -531,14 +562,14 @@ resource "openstack_compute_instance_v2" "glusterfs_node_no_floating_ip" {
name = "${var.cluster_name}-gfs-node-nf-${count.index + 1}"
count = var.number_of_gfs_nodes_no_floating_ip
availability_zone = element(var.az_list, count.index)
image_name = var.image_gfs
image_name = var.gfs_root_volume_size_in_gb == 0 ? local.image_to_use_gfs : null
flavor_id = var.flavor_gfs_node
key_pair = openstack_compute_keypair_v2.k8s.name
dynamic "block_device" {
for_each = var.gfs_root_volume_size_in_gb > 0 ? [var.image] : []
for_each = var.gfs_root_volume_size_in_gb > 0 ? [local.image_to_use_gfs] : []
content {
uuid = data.openstack_images_image_v2.vm_image.id
uuid = local.image_to_use_gfs
source_type = "image"
volume_size = var.gfs_root_volume_size_in_gb
boot_index = 0
@ -562,7 +593,7 @@ resource "openstack_compute_instance_v2" "glusterfs_node_no_floating_ip" {
metadata = {
ssh_user = var.ssh_user_gfs
kubespray_groups = "gfs-cluster,network-storage,no-floating"
kubespray_groups = "gfs-cluster,network-storage,no_floating"
depends_on = var.network_id
use_access_ip = var.use_access_ip
}

View File

@ -127,3 +127,27 @@ variable "use_access_ip" {}
variable "use_server_groups" {
type = bool
}
variable "extra_sec_groups" {
type = bool
}
variable "extra_sec_groups_name" {
type = string
}
variable "image_uuid" {
type = string
}
variable "image_gfs_uuid" {
type = string
}
variable "image_master" {
type = string
}
variable "image_master_uuid" {
type = string
}

View File

@ -0,0 +1,8 @@
terraform {
  required_providers {
    openstack = {
      source = "terraform-provider-openstack/openstack"
    }
  }
  required_version = ">= 0.12.26"
}

View File

@ -2,16 +2,21 @@ resource "null_resource" "dummy_dependency" {
triggers = {
dependency_id = var.router_id
}
depends_on = [
var.router_internal_port_id
]
}
# If user specifies pre-existing IPs to use in k8s_master_fips, do not create new ones.
resource "openstack_networking_floatingip_v2" "k8s_master" {
count = var.number_of_k8s_masters
count = length(var.k8s_master_fips) > 0 ? 0 : var.number_of_k8s_masters
pool = var.floatingip_pool
depends_on = [null_resource.dummy_dependency]
}
# If user specifies pre-existing IPs to use in k8s_master_fips, do not create new ones.
resource "openstack_networking_floatingip_v2" "k8s_master_no_etcd" {
count = var.number_of_k8s_masters_no_etcd
count = length(var.k8s_master_fips) > 0 ? 0 : var.number_of_k8s_masters_no_etcd
pool = var.floatingip_pool
depends_on = [null_resource.dummy_dependency]
}

View File

@ -1,9 +1,11 @@
# If k8s_master_fips is already defined as input, keep the same value since new FIPs have not been created.
output "k8s_master_fips" {
value = openstack_networking_floatingip_v2.k8s_master[*].address
value = length(var.k8s_master_fips) > 0 ? var.k8s_master_fips : openstack_networking_floatingip_v2.k8s_master[*].address
}
# If k8s_master_fips is already defined as input, keep the same value since new FIPs have not been created.
output "k8s_master_no_etcd_fips" {
value = openstack_networking_floatingip_v2.k8s_master_no_etcd[*].address
value = length(var.k8s_master_fips) > 0 ? var.k8s_master_fips : openstack_networking_floatingip_v2.k8s_master_no_etcd[*].address
}
output "k8s_node_fips" {

View File

@ -17,3 +17,7 @@ variable "router_id" {
}
variable "k8s_nodes" {}
variable "k8s_master_fips" {}
variable "router_internal_port_id" {}

View File

@ -0,0 +1,11 @@
terraform {
  required_providers {
    null = {
      source = "hashicorp/null"
    }
    openstack = {
      source = "terraform-provider-openstack/openstack"
    }
  }
  required_version = ">= 0.12.26"
}

View File

@ -0,0 +1,8 @@
terraform {
  required_providers {
    openstack = {
      source = "terraform-provider-openstack/openstack"
    }
  }
  required_version = ">= 0.12.26"
}

View File

@ -152,7 +152,13 @@ variable "subnet_cidr" {
variable "dns_nameservers" {
description = "An array of DNS name server names used by hosts in this subnet."
type = list
type = list(string)
default = []
}
variable "k8s_master_fips" {
description = "specific pre-existing floating IPs to use for master nodes"
type = list(string)
default = []
}
@ -171,12 +177,12 @@ variable "external_net" {
}
variable "supplementary_master_groups" {
description = "supplementary kubespray ansible groups for masters, such kube-node"
description = "supplementary kubespray ansible groups for masters, such kube_node"
default = ""
}
variable "supplementary_node_groups" {
description = "supplementary kubespray ansible groups for worker nodes, such as kube-ingress"
description = "supplementary kubespray ansible groups for worker nodes, such as kube_ingress"
default = ""
}
@ -205,13 +211,13 @@ variable "k8s_allowed_egress_ips" {
}
variable "master_allowed_ports" {
type = list
type = list(any)
default = []
}
variable "worker_allowed_ports" {
type = list
type = list(any)
default = [
{
@ -236,7 +242,39 @@ variable "router_id" {
default = null
}
variable "router_internal_port_id" {
description = "uuid of the port connection our router to our network"
default = null
}
variable "k8s_nodes" {
default = {}
}
variable "extra_sec_groups" {
default = false
}
variable "extra_sec_groups_name" {
default = "custom"
}
variable "image_uuid" {
description = "uuid of image inside openstack to use"
default = ""
}
variable "image_gfs_uuid" {
description = "uuid of image to be used on gluster fs nodes. If empty defaults to image_uuid"
default = ""
}
variable "image_master" {
description = "uuid of image inside openstack to use"
default = ""
}
variable "image_master_uuid" {
description = "uuid of image to be used on master nodes. If empty defaults to image_uuid"
default = ""
}

View File

@ -0,0 +1,9 @@
terraform {
  required_providers {
    openstack = {
      source  = "terraform-provider-openstack/openstack"
      version = "~> 1.17"
    }
  }
  required_version = ">= 0.12.26"
}

View File

@ -8,6 +8,7 @@ Provision a Kubernetes cluster with [Terraform](https://www.terraform.io) on
This will install a Kubernetes cluster on Packet bare metal. It should work in all locations and on most server types.
## Approach
The terraform configuration inspects variables found in
[variables.tf](variables.tf) to create resources in your Packet project.
There is a [python script](../terraform.py) that reads the generated `.tfstate`
@ -15,8 +16,10 @@ file to generate a dynamic inventory that is consumed by [cluster.yml](../../../
to actually install Kubernetes with Kubespray.
### Kubernetes Nodes
You can create many different kubernetes topologies by setting the number of
different classes of hosts.
- Master nodes with etcd: `number_of_k8s_masters` variable
- Master nodes without etcd: `number_of_k8s_masters_no_etcd` variable
- Standalone etcd hosts: `number_of_etcd` variable
@ -47,6 +50,7 @@ ssh-keygen -f ~/.ssh/id_rsa
```
## Terraform
Terraform will be used to provision all of the Packet resources with base software as appropriate.
### Configuration
@ -56,9 +60,9 @@ Terraform will be used to provision all of the Packet resources with base softwa
Create an inventory directory for your cluster by copying the existing sample and linking the `hosts` script (used to build the inventory based on Terraform state):
```ShellSession
$ cp -LRp contrib/terraform/packet/sample-inventory inventory/$CLUSTER
$ cd inventory/$CLUSTER
$ ln -s ../../contrib/terraform/packet/hosts
cp -LRp contrib/terraform/packet/sample-inventory inventory/$CLUSTER
cd inventory/$CLUSTER
ln -s ../../contrib/terraform/packet/hosts
```
This will be the base for subsequent Terraform commands.
@ -69,22 +73,23 @@ Your Packet API key must be available in the `PACKET_AUTH_TOKEN` environment var
This key is typically stored outside of the code repo since it is considered secret.
If someone gets this key, they can start up or shut down hosts in your project!
For more information on how to generate an API key or find your project ID, please see:
https://support.packet.com/kb/articles/api-integrations
For more information on how to generate an API key or find your project ID, please see
[API Integrations](https://support.packet.com/kb/articles/api-integrations)
The Packet Project ID associated with the key will be set later in cluster.tfvars.
For more information about the API, please see:
https://www.packet.com/developers/api/
For more information about the API, please see [Packet API](https://www.packet.com/developers/api/)
Example:
```ShellSession
$ export PACKET_AUTH_TOKEN="Example-API-Token"
export PACKET_AUTH_TOKEN="Example-API-Token"
```
Note that to deploy several clusters within the same project you need to use [terraform workspace](https://www.terraform.io/docs/state/workspaces.html#using-workspaces).
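A minimal sketch of switching between two clusters in the same project (the workspace names below are arbitrary examples):

```ShellSession
terraform workspace new cluster-a      # create and switch to a new workspace
terraform workspace select cluster-a   # switch back to it later
terraform workspace list               # show all workspaces
```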
#### Cluster variables
The construction of the cluster is driven by values found in
[variables.tf](variables.tf).
@ -95,14 +100,15 @@ This helps when identifying which hosts are associated with each cluster.
While the defaults in variables.tf will successfully deploy a cluster, it is recommended to set the following values:
* cluster_name = the name of the inventory directory created above as $CLUSTER
* packet_project_id = the Packet Project ID associated with the Packet API token above
- cluster_name = the name of the inventory directory created above as $CLUSTER
- packet_project_id = the Packet Project ID associated with the Packet API token above
#### Enable localhost access
Kubespray will pull down a Kubernetes configuration file to access this cluster by enabling the
Kubespray will pull down a Kubernetes configuration file to access this cluster by enabling the
`kubeconfig_localhost: true` in the Kubespray configuration.
Edit `inventory/$CLUSTER/group_vars/k8s-cluster/k8s-cluster.yml` and comment back in the following line and change from `false` to `true`:
Edit `inventory/$CLUSTER/group_vars/k8s_cluster/k8s_cluster.yml` and comment back in the following line and change from `false` to `true`:
`\# kubeconfig_localhost: false`
becomes:
`kubeconfig_localhost: true`
@ -115,10 +121,10 @@ In the cluster's inventory folder, the following files might be created (either
or manually), to prevent you from pushing them accidentally they are in a
`.gitignore` file in the `terraform/packet` directory :
* `.terraform`
* `.tfvars`
* `.tfstate`
* `.tfstate.backup`
- `.terraform`
- `.tfvars`
- `.tfstate`
- `.tfstate.backup`
You can still add them manually if you want to.
@ -128,34 +134,38 @@ Before Terraform can operate on your cluster you need to install the required
plugins. This is accomplished as follows:
```ShellSession
$ cd inventory/$CLUSTER
$ terraform init ../../contrib/terraform/packet
cd inventory/$CLUSTER
terraform init ../../contrib/terraform/packet
```
This should finish fairly quickly, telling you that Terraform has successfully initialized and loaded the necessary modules.
### Provisioning cluster
You can apply the Terraform configuration to your cluster with the following command
issued from your cluster's inventory directory (`inventory/$CLUSTER`):
```ShellSession
$ terraform apply -var-file=cluster.tfvars ../../contrib/terraform/packet
$ export ANSIBLE_HOST_KEY_CHECKING=False
$ ansible-playbook -i hosts ../../cluster.yml
terraform apply -var-file=cluster.tfvars ../../contrib/terraform/packet
export ANSIBLE_HOST_KEY_CHECKING=False
ansible-playbook -i hosts ../../cluster.yml
```
### Destroying cluster
You can destroy your new cluster with the following command issued from the cluster's inventory directory:
```ShellSession
$ terraform destroy -var-file=cluster.tfvars ../../contrib/terraform/packet
terraform destroy -var-file=cluster.tfvars ../../contrib/terraform/packet
```
If you've started the Ansible run, it may also be a good idea to do some manual cleanup:
* remove SSH keys from the destroyed cluster from your `~/.ssh/known_hosts` file
* clean up any temporary cache files: `rm /tmp/$CLUSTER-*`
- Remove SSH keys from the destroyed cluster from your `~/.ssh/known_hosts` file
- Clean up any temporary cache files: `rm /tmp/$CLUSTER-*`
### Debugging
You can enable debugging output from Terraform by setting `TF_LOG` to `DEBUG` before running the Terraform command.
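For example, a sketch that wraps the apply command shown above:

```ShellSession
TF_LOG=DEBUG terraform apply -var-file=cluster.tfvars ../../contrib/terraform/packet
```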
## Ansible
@ -167,9 +177,9 @@ You can enable debugging output from Terraform by setting `TF_LOG` to `DEBUG` be
Ensure your local ssh-agent is running and your ssh key has been added. This
step is required by the terraform provisioner:
```
$ eval $(ssh-agent -s)
$ ssh-add ~/.ssh/id_rsa
```ShellSession
eval $(ssh-agent -s)
ssh-add ~/.ssh/id_rsa
```
If you have deployed and destroyed a previous iteration of your cluster, you will need to clear out any stale keys from your SSH "known hosts" file ( `~/.ssh/known_hosts`).
@ -178,7 +188,7 @@ If you have deployed and destroyed a previous iteration of your cluster, you wil
Make sure you can connect to the hosts. Note that Flatcar Container Linux by Kinvolk will have a state `FAILED` due to Python not being present. This is okay, because Python will be installed during bootstrapping, so long as the hosts are not `UNREACHABLE`.
```
```ShellSession
$ ansible -i inventory/$CLUSTER/hosts -m ping all
example-k8s_node-1 | SUCCESS => {
"changed": false,
@ -198,8 +208,8 @@ If it fails try to connect manually via SSH. It could be something as simple as
### Deploy Kubernetes
```
$ ansible-playbook --become -i inventory/$CLUSTER/hosts cluster.yml
```ShellSession
ansible-playbook --become -i inventory/$CLUSTER/hosts cluster.yml
```
This will take some time as there are many tasks to run.
@ -208,20 +218,22 @@ This will take some time as there are many tasks to run.
### Set up kubectl
* [Install kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) on the localhost.
- [Install kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) on the localhost.
- Verify that Kubectl runs correctly
* Verify that Kubectl runs correctly
```
```ShellSession
kubectl version
```
* Verify that the Kubernetes configuration file has been copied over
```
- Verify that the Kubernetes configuration file has been copied over
```ShellSession
cat inventory/alpha/$CLUSTER/admin.conf
```
* Verify that all the nodes are running correctly.
```
- Verify that all the nodes are running correctly.
```ShellSession
kubectl version
kubectl --kubeconfig=inventory/$CLUSTER/artifacts/admin.conf get nodes
```

View File

@ -19,7 +19,7 @@ resource "packet_device" "k8s_master" {
operating_system = var.operating_system
billing_cycle = var.billing_cycle
project_id = var.packet_project_id
tags = ["cluster-${var.cluster_name}", "k8s-cluster", "kube-master", "etcd", "kube-node"]
tags = ["cluster-${var.cluster_name}", "k8s_cluster", "kube_control_plane", "etcd", "kube_node"]
}
resource "packet_device" "k8s_master_no_etcd" {
@ -32,7 +32,7 @@ resource "packet_device" "k8s_master_no_etcd" {
operating_system = var.operating_system
billing_cycle = var.billing_cycle
project_id = var.packet_project_id
tags = ["cluster-${var.cluster_name}", "k8s-cluster", "kube-master"]
tags = ["cluster-${var.cluster_name}", "k8s_cluster", "kube_control_plane"]
}
resource "packet_device" "k8s_etcd" {
@ -58,6 +58,6 @@ resource "packet_device" "k8s_node" {
operating_system = var.operating_system
billing_cycle = var.billing_cycle
project_id = var.packet_project_id
tags = ["cluster-${var.cluster_name}", "k8s-cluster", "kube-node"]
tags = ["cluster-${var.cluster_name}", "k8s_cluster", "kube_node"]
}

View File

@ -1,4 +1,9 @@
terraform {
  required_version = ">= 0.12"
  required_providers {
    packet = {
      source = "terraform-providers/packet"
    }
  }
}

View File

@ -323,7 +323,7 @@ def openstack_host(resource, module_name):
})
# add groups based on attrs
groups.append('os_image=' + attrs['image']['name'])
groups.append('os_image=' + attrs['image']['id'])
groups.append('os_flavor=' + attrs['flavor']['name'])
groups.extend('os_metadata_%s=%s' % item
for item in list(attrs['metadata'].items()))

View File

@ -0,0 +1,106 @@
# Kubernetes on UpCloud with Terraform
Provision a Kubernetes cluster on [UpCloud](https://upcloud.com/) using Terraform and Kubespray
## Overview
The setup looks like the following:
```text
   Kubernetes cluster
+------------------------+
|   +----------------+   |
|   |  Master/etcd   |   |
|   |    node(s)     |   |
|   +----------------+   |
|           ^            |
|           |            |
|           v            |
|   +----------------+   |
|   |     Worker     |   |
|   |    node(s)     |   |
|   +----------------+   |
+------------------------+
```
## Requirements
* Terraform 0.13.0 or newer
## Quickstart
NOTE: Assumes you are at the root of the kubespray repo.
To authenticate with your UpCloud account, you can use the following environment variables.
```bash
export TF_VAR_UPCLOUD_USERNAME=username
export TF_VAR_UPCLOUD_PASSWORD=password
```
API access must also be enabled for your UpCloud account; you can allow API connections on the [Account page](https://hub.upcloud.com/account) in the UpCloud Hub.
Copy the sample inventory and the cluster configuration file.
```bash
CLUSTER=my-upcloud-cluster
cp -r inventory/sample inventory/$CLUSTER
cp contrib/terraform/upcloud/cluster-settings.tfvars inventory/$CLUSTER/
export ANSIBLE_CONFIG=ansible.cfg
cd inventory/$CLUSTER
```
Edit `cluster-settings.tfvars` to match your requirements.
Run Terraform to create the infrastructure.
```bash
terraform init ../../contrib/terraform/upcloud
terraform apply --var-file cluster-settings.tfvars \
-state=tfstate-$CLUSTER.tfstate \
../../contrib/terraform/upcloud/
```
You should now have an inventory file named `inventory.ini` that you can use with kubespray to set up a cluster.
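Before running the playbook it can be worth a quick look at what was generated; the exact host and group names depend on your `machines` map and on `templates/inventory.tpl`:

```bash
cat inventory.ini
ansible-inventory -i inventory.ini --graph
```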
It is a good idea to check that you have basic SSH connectivity to the nodes. You can do that by:
```bash
ansible -i inventory.ini -m ping all
```
You can set up Kubernetes with kubespray using the generated inventory:
```bash
ansible-playbook -i inventory.ini ../../cluster.yml -b -v
```
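Once the playbook finishes, one way to confirm the control plane is healthy is to query it over SSH on a master node. A sketch, assuming the `ubuntu` login user used by the generated inventory and the standard kubeadm location of the admin kubeconfig; `<master-ip>` is a placeholder:

```bash
ssh ubuntu@<master-ip> sudo kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes
```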
## Teardown
You can tear down your infrastructure using the following Terraform command:
```bash
terraform destroy --var-file cluster-settings.tfvars \
-state=tfstate-$CLUSTER.tfstate \
../../contrib/terraform/upcloud/
```
## Variables
* `hostname`: A valid domain name, e.g. example.com. The maximum length is 128 characters.
* `template_name`: The name or UUID of a base image
* `username`: A user to access the nodes
* `ssh_public_keys`: List of public SSH keys to install on all machines
* `zone`: The zone where to run the cluster
* `machines`: Machines to provision. The key of this object is used as the name of the machine
  * `node_type`: The role of this node *(master|worker)*
  * `cpu`: Number of CPU cores
  * `mem`: Memory size in MB
  * `disk_size`: The size of the storage in GB


@@ -0,0 +1,58 @@
# See: https://developers.upcloud.com/1.3/5-zones/
zone = "fi-hel1"
username = "ubuntu"
inventory_file = "inventory.ini"
# A valid domain name, e.g. host.example.com. The maximum length is 128 characters.
hostname = "example.com"
# Set the operating system using UUID or exact name
template_name = "Ubuntu Server 20.04 LTS (Focal Fossa)"
ssh_public_keys = [
# Put your public SSH key here
"ssh-rsa public key 1",
"ssh-rsa public key 2",
]
# Check the list of available plans: https://developers.upcloud.com/1.3/7-plans/
machines = {
"master-0" : {
"node_type" : "master",
#number of cpu cores
"cpu" : "2",
#memory size in MB
"mem" : "4096"
# The size of the storage in GB
"disk_size" : 250
},
"worker-0" : {
"node_type" : "worker",
#number of cpu cores
"cpu" : "2",
#memory size in MB
"mem" : "4096"
# The size of the storage in GB
"disk_size" : 250
},
"worker-1" : {
"node_type" : "worker",
#number of cpu cores
"cpu" : "2",
#memory size in MB
"mem" : "4096"
# The size of the storage in GB
"disk_size" : 250
},
"worker-2" : {
"node_type" : "worker",
#number of cpu cores
"cpu" : "2",
#memory size in MB
"mem" : "4096"
# The size of the storage in GB
"disk_size" : 250
}
}


@@ -0,0 +1,55 @@
terraform {
required_version = ">= 0.13.0"
}
provider "upcloud" {
# Your UpCloud credentials are read from environment variables:
username = var.UPCLOUD_USERNAME
password = var.UPCLOUD_PASSWORD
}
module "kubernetes" {
source = "./modules/kubernetes-cluster"
zone = var.zone
hostname = var.hostname
template_name = var.template_name
username = var.username
machines = var.machines
ssh_public_keys = var.ssh_public_keys
}
#
# Generate ansible inventory
#
data "template_file" "inventory" {
template = file("${path.module}/templates/inventory.tpl")
vars = {
connection_strings_master = join("\n", formatlist("%s ansible_user=ubuntu ansible_host=%s etcd_member_name=etcd%d",
keys(module.kubernetes.master_ip),
values(module.kubernetes.master_ip),
range(1, length(module.kubernetes.master_ip) + 1)))
connection_strings_worker = join("\n", formatlist("%s ansible_user=ubuntu ansible_host=%s",
keys(module.kubernetes.worker_ip),
values(module.kubernetes.worker_ip)))
list_master = join("\n", formatlist("%s",
keys(module.kubernetes.master_ip)))
list_worker = join("\n", formatlist("%s",
keys(module.kubernetes.worker_ip)))
}
}
resource "null_resource" "inventories" {
provisioner "local-exec" {
command = "echo '${data.template_file.inventory.rendered}' > ${var.inventory_file}"
}
triggers = {
template = data.template_file.inventory.rendered
}
}
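Because the inventory file is written by a `local-exec` provisioner, it is only regenerated when the rendered template changes (see the `triggers` block). If `inventory.ini` is deleted or edited by hand, one hedged way to force a rewrite is to taint the resource and re-apply, reusing the state and var-file arguments from the quickstart above:

```bash
terraform taint -state=tfstate-$CLUSTER.tfstate null_resource.inventories
terraform apply --var-file cluster-settings.tfvars \
    -state=tfstate-$CLUSTER.tfstate \
    ../../contrib/terraform/upcloud/
```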


@@ -0,0 +1,66 @@
resource "upcloud_server" "master" {
for_each = {
for name, machine in var.machines :
name => machine
if machine.node_type == "master"
}
hostname = "${each.key}.${var.hostname}"
cpu = each.value.cpu
mem = each.value.mem
zone = var.zone
template {
storage = var.template_name
size = each.value.disk_size
}
# Network interfaces
network_interface {
type = "public"
}
network_interface {
type = "utility"
}
# Include at least one public SSH key
login {
user = var.username
keys = var.ssh_public_keys
create_password = false
}
}
resource "upcloud_server" "worker" {
for_each = {
for name, machine in var.machines :
name => machine
if machine.node_type == "worker"
}
hostname = "${each.key}.${var.hostname}"
cpu = each.value.cpu
mem = each.value.mem
zone = var.zone
template {
storage = var.template_name
size = each.value.disk_size
}
# Network interfaces
network_interface {
type = "public"
}
# Include at least one public SSH key
login {
user = var.username
keys = var.ssh_public_keys
create_password = false
}
}
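Since both resources iterate over the `machines` map with `for_each`, every server appears in state under its machine key, which makes it easy to see what was created per role. A hedged sketch, reusing the state file name from the quickstart:

```bash
# Entries look like module.kubernetes.upcloud_server.worker["worker-0"]
terraform state list -state=tfstate-$CLUSTER.tfstate | grep upcloud_server
```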

Some files were not shown because too many files have changed in this diff.