Kubernetes

gridscale Managed Kubernetes (GSK) is a secure and fully-managed Kubernetes solution. All you need to do is to configure how powerful you wish your cluster to be. We take care of upgrades and OS maintenance.

gridscale Managed Kubernetes (GSK) fully integrates into our products, offering easy configuration, monitoring, release management and security enabling you to explicitly focus on your business applications. GSK easily integrates with our Load Balancer, Certificates and Storage IaaS for Ingress and persistent volumes respectively.

The GSK cluster is managed by our provisioning stack. Any changes made directly to the worker nodes are transient, as worker nodes can and will be recycled under certain circumstances. Because of this, data that needs to persist should be stored using persistent volume claims through Kubernetes.

If you are new to Kubernetes or containers in general, we’d recommend you get familiar with commonly used terminology and go through our line-up of content for you to get started:

  1. gridscale presents: Managed Kubernetes
  2. Kubernetes - All about clusters, pods and kubelets
  3. How to: connect gridscale Kubernetes Cluster and PaaS
  4. Release notes

You also may want to take a look at known issues.

Release Support

The GSK offering supports the three latest stable Kubernetes releases. Keep in mind that support for newer Kubernetes versions requires time to adapt the components that integrate the gridscale platform into Kubernetes and to provide a migration path from previous releases.

Releases other than the three latest supported GSK releases are deprecated. You will be notified of a GSK release deprecation via a Cloud Panel notification four weeks in advance.

Please upgrade your clusters ahead of deprecation (see below on how to perform upgrades). Once deprecated, your cluster is subject to an auto-upgrade. With auto-upgrades, the correct functioning of your workloads cannot be guaranteed, because minor Kubernetes versions can introduce breaking changes to your workloads.

Release notes:

1.21.14-gs1, 1.20.15-gs2, 1.19.16-gs2 (released: 2022-09-12)

Bug fixes:

  • Fix the issue of the surge node upgrade, where the new configuration was not saved for further operations such as scale-out/in.

1.21.14-gs0, 1.20.15-gs1, 1.19.16-gs1 (released: 2022-07-19)

Improvements:

  • Upgrade k8s: v1.21.14, v1.20.15, 1.19.16.
  • Support PVC volume usage metrics
  • Support PVC volume health
  • Support surge upgrades to avoid resource shortage during the upgrade

1.21.11-gs0, 1.20.15-gs0, 1.19.16-gs0

Improvements:

  • Upgrade k8s: v1.21.11, v1.20.15, 1.19.16.
  • Avoid Warning FailedScheduling in pods with PVC.
  • Spread pods across nodes evenly via PodTopologySpread.

Bug fixes:

  • Fix an issue where storage could not be deleted when the storageclass has reclaimPolicy: Retain.
  • Fix scale-out/in failing if one of the nodes is down.
  • Fix k8s not recursively changing ownership and permissions for the contents of each volume to match the fsGroup specified in a Pod’s securityContext when that volume is mounted.

GSK Updates and Upgrades

Patch Updates

Patch updates contain either a new Kubernetes patch release or GSK specific changes (such as the CSI plugin) or both.

The availability of new patch updates is announced via notifications in your Cloud Panel.

Upon availability, you can update your cluster via the Cloud Panel or the API at a time of your choosing.

To guarantee that your cluster is running the latest stable patch update, unpatched clusters will be auto-updated after 3 weeks of patch availability.

Please consult the upgrade considerations section below.

Release Upgrades

Release upgrades contain a new Kubernetes minor or major release and (optionally) GSK specific changes. Release upgrades are not performed automatically for you.

You can perform release upgrades via the Cloud Panel or the API at a time of your choosing.

Please consult the upgrade considerations section below for compatibility information between Kubernetes releases.

Performing Patch Updates and Release Upgrades via the API

  1. Get your GSK service:
curl 'https://api.gridscale.io/objects/paas/services/<CLUSTER_UUID>' -H 'X-Auth-UserId: <AUTH_USER_UUID>' -H 'X-Auth-Token: <AUTH_TOKEN>'
  2. Get the available Service Templates:
curl 'https://api.gridscale.io/objects/paas/service_templates' -H 'X-Auth-UserId: <AUTH_USER_UUID>' -H 'X-Auth-Token: <AUTH_TOKEN>'
  3. In the list from Step 2, find the service template whose UUID matches the service_template_uuid of your GSK cluster from Step 1.
  4. Find the target service template in the patch_updates attribute of the template from Step 3.
  5. Initiate the GSK update via a service PATCH using the UUID from Step 4:
curl 'https://api.gridscale.io/objects/paas/services/<CLUSTER_UUID>' -X PATCH -H 'Content-Type: application/json' -H 'X-Auth-UserId: <AUTH_USER_UUID>' -H 'X-Auth-Token: <AUTH_TOKEN>' --data-raw '{"service_template_uuid":"<PATCH_UPDATE_SERVICE_TEMPLATE_UUID>"}'
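The calls above can be wrapped in a few shell functions. This is a minimal sketch, not an official client: the function names and the environment variables GS_USER_UUID and GS_TOKEN are assumptions made for this illustration, and picking the target template from the patch_updates attribute remains a manual step.

```shell
#!/bin/sh
# Sketch only: function and variable names are illustrative,
# they are not part of the gridscale API.
GS_API="https://api.gridscale.io/objects/paas"

# Perform an authenticated request; extra curl arguments are passed through.
gsk_curl() {
  curl -sS -H "X-Auth-UserId: ${GS_USER_UUID}" -H "X-Auth-Token: ${GS_TOKEN}" "$@"
}

# Show the GSK service ($1 = cluster UUID), including its service_template_uuid.
gsk_get_service() {
  gsk_curl "${GS_API}/services/$1"
}

# List the available service templates.
gsk_get_templates() {
  gsk_curl "${GS_API}/service_templates"
}

# Build the JSON body for the update request ($1 = target template UUID).
gsk_update_payload() {
  printf '{"service_template_uuid":"%s"}' "$1"
}

# Patch the cluster ($1) to the target service template ($2).
gsk_update() {
  gsk_curl -X PATCH -H 'Content-Type: application/json' \
    --data-raw "$(gsk_update_payload "$2")" "${GS_API}/services/$1"
}
```

After exporting GS_USER_UUID and GS_TOKEN, gsk_get_service and gsk_get_templates print the raw API responses, and gsk_update with the cluster UUID and the chosen template UUID starts the update.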

Effect of Updates and Upgrades on Nodes and Workloads

Nodes are considered volatile in the Kubernetes cluster. During updates, upgrades or node recoveries, nodes are not modified - they are replaced.

The process starts by upgrading the master node. The Kubernetes API will experience a short interruption during which you won’t be able to change cluster resources. Existing pods will continue to run uninterrupted. New pods can be scheduled once the master node upgrade has completed.

The next step is upgrading all worker nodes. This is a sequential process, where nodes are upgraded one at a time. To avoid resource shortage, surge upgrades are performed by default.

Worker node upgrades drain workloads of the node before taking it down, to allow your pods to be rescheduled gracefully. In case pod disruption policies prevent your workloads from being drained, the process will continue to ensure cluster integrity. Once the node has been drained, it is replaced and joins the cluster again.

Be sure to configure your workloads with redundancies in place, so that they remain available during an upgrade, if continuous operation is a priority for your workload.

Surge Upgrades

With surge upgrades, resource shortage during upgrades is counteracted by adding worker nodes for the time of the upgrade.

If enabled (default is 1 surge node), the configured number of nodes is added to your cluster before the first node is taken down. They are temporary in nature and are removed once the upgrade has succeeded.

Additional costs are generated during surge node lifetime. You can disable surge upgrades in your Cloud Panel or via the API by setting parameter k8s_surge_node_count to 0.

Note: Surge node count is currently limited to either 0 or 1. Support for counts >1 will be added in the future.

Impact on Node Labels

Node labels are not persisted when nodes are replaced. In case you rely on node labels to control where deployments run in your cluster, please look into Affinity and anti-affinity as the preferred approach.
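As an illustration of the affinity approach, the following sketch prints a Deployment whose replicas avoid sharing a node. It relies on the well-known kubernetes.io/hostname label, which Kubernetes sets on every node by itself. The function name, the app name and the image are hypothetical.

```shell
# Print a Deployment whose replicas are never placed on the same node.
# $1 = app name (hypothetical), $2 = container image (hypothetical)
render_spread_deployment() {
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: $1
  template:
    metadata:
      labels:
        app: $1
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: $1
              topologyKey: kubernetes.io/hostname
      containers:
        - name: $1
          image: $2
EOF
}

# Usage (requires cluster access):
#   render_spread_deployment my-app <IMAGE> | kubectl apply -f -
```

With the required variant used here, a replica stays pending if no node without a matching pod is available. preferredDuringSchedulingIgnoredDuringExecution is the softer alternative.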

Considerations for Upgrades

Patch updates (for example 1.19.10 → 1.19.11) are considered safe by the Kubernetes project.

Release upgrades (for example 1.16.x → 1.17.x) can introduce breaking changes. Read on to find out how to check whether your workloads (deployments, services, daemonsets, etc.) are still compatible with the Kubernetes release you want to upgrade to.

Official Kubernetes Documentation

You can find the official Kubernetes release notes in the Changelog.

Another helpful resource is the Deprecated API Migration Guide, which lists all API removals by release.

Example:

The extensions/v1beta1 API version of NetworkPolicy is no longer served as of v1.16.

  • Migrate manifests and API clients to use the networking.k8s.io/v1 API version, available since v1.8.
  • All existing persisted objects are accessible via the new API
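For manifests kept in a local directory, a quick first check is to search for the API version that is being removed. The sketch below is a plain grep over files. The function name is hypothetical, and the check only sees manifests on disk, not objects already stored in the cluster or templates that are rendered at deploy time.

```shell
# List manifest files under directory $1 that declare API version $2.
find_api_version() {
  grep -rlE "^apiVersion:[[:space:]]*[\"']?$2[\"']?[[:space:]]*\$" "$1"
}

# Usage:
#   find_api_version ./manifests extensions/v1beta1
```

The function prints one file path per line and returns a non-zero exit status when nothing matches, so it can also serve as a simple gate in a pre-upgrade script.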

Deploy and Test Workload on a Temporary Cluster

The easiest way is just to provision a test cluster with the new release.

Deploy your workloads to the test cluster and check if everything is working as expected.

This way you can make sure your workloads are compatible with the Kubernetes release you want to upgrade to without impact on live workloads.

Third Party Cluster Linting Tool

There are some third party tools which could make the transition easier.

Pluto is a tool which helps users find deprecated Kubernetes APIs.

In this example we see two files in our directory that have deprecated apiVersions. Deployment extensions/v1beta1 is no longer available and needs to be replaced with apps/v1. This will need to be fixed prior to a 1.16 upgrade:

pluto detect-files -d kubernetes/testdata

NAME        KIND         VERSION              REPLACEMENT   REMOVED   DEPRECATED
utilities   Deployment   extensions/v1beta1   apps/v1       true      true
utilities   Deployment   extensions/v1beta1   apps/v1       true      true

Head over to the Pluto Documentation to read more about in-depth usage.

Connect a Kubernetes Cluster to a PaaS service

Requires 1.19.16-gs0, 1.20.15-gs0, 1.21.11-gs0 or higher.

We recently released support for private networks with IPv4 for PaaS services. This feature allows you to access a PaaS service from a Kubernetes cluster as a Kubernetes service, so your application can access the PaaS service without a proxy. Follow these steps:

  • First, you need a GSK cluster. The worker nodes of the GSK cluster will be connected to a private network with IPv4.
  • Determine the private network that the worker nodes are connected to. The name of the private network always consists of the cluster name and the suffix private. For example, if you have a cluster named my-first-gridscale-k8s, the name of the cluster’s private network is my-first-gridscale-k8s-private.
  • Connect your PaaS service to the cluster’s private network that you looked up in the previous step. For both new and existing services you can do so:
    • via the API, where you specify network_uuid in the create or update request’s payload.
    • via the panel, where you can check the “Relate custom private Network” box during creation of the PaaS service or with the edit-icon in the Connections-pane for existing PaaS services. Then select the corresponding network from the dropdown.
  • Create a Kubernetes service by mapping a hostname to the PaaS service private IP
    • Determine the PaaS service private IP from the Service Access. For example, a Postgres database might have the following Service Access:

      • connection-string format:
        postgres://postgres:XXpasswordXX@10.244.0.43:5432
        
      • connection-parameters format:
        username = postgres
        password = XXpasswordXX
        host = 10.244.0.43
        port = 5432
        
    • Create a Kubernetes service as in this example

      kind: "Service"
      apiVersion: "v1"
      metadata:
        name: "paas-postgres"
      spec:
        ports:
          - name: "paas-postgres"
            protocol: "TCP"
            port: 5432
            targetPort: 5432
      
    • After applying the above YAML manifest, you can get the paas-postgres service as follows

      $ kubectl get services paas-postgres
      NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
      paas-postgres   ClusterIP   10.244.69.82   <none>        5432/TCP   2d17h
      
    • Create a Kubernetes Endpoints object for the Kubernetes service. The IP address should be the one from the Service Access (connection-string or connection-parameters). In this example, the IP address is 10.244.0.43

      kind: "Endpoints"
      apiVersion: "v1"
      metadata:
        name: "paas-postgres"
      subsets:
        - addresses:
          - ip: "10.244.0.43"
          ports:
            - port: 5432
              name: "paas-postgres"
      
    • After applying the above YAML manifest, you can get the paas-postgres endpoints as follows

      $ kubectl get endpoints paas-postgres
      NAME            ENDPOINTS          AGE
      paas-postgres   10.244.0.43:5432   2d17h
      
    • Create the secret for database access, using the postgres database, username, and password.

      $ kubectl create secret generic paas-postgres \
          --from-literal=database=postgres \
          --from-literal=username=postgres \
          --from-literal=password=XXpasswordXX
      
    • As the service, endpoint, and secrets were created, the application now can access the postgres database as a Kubernetes service. Here is an example on how to configure your application to access the postgres database.

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: my-app
        labels:
          app: my-app
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: my-app
        template:
          metadata:
            labels:
              app: my-app
          spec:
            containers:
            - name: my-app
              image: postgres:12-alpine
              imagePullPolicy: Always
              env:
                - name: DATABASE_HOST
                  value: "paas-postgres"
                - name: DATABASE_NAME
                  valueFrom:
                    secretKeyRef:
                      name: paas-postgres
                      key: database
                - name: DATABASE_USER
                  valueFrom:
                    secretKeyRef:
                      name: paas-postgres
                      key: username
                - name: DATABASE_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: paas-postgres
                      key: password
                - name: POSTGRES_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: paas-postgres
                      key: password
              ports:
              - containerPort: 8080
      
    • Show the pods

      $ kubectl get pods
      NAME                      READY   STATUS    RESTARTS   AGE
      my-app-6559f7f88c-fjqtq   1/1     Running   0          10s
      
    • You can access the database from one of the pods

      $ kubectl exec -it my-app-6559f7f88c-fjqtq -- bash
      
    • Connect, describe and list the database

      bash-5.1# PGPASSWORD=$POSTGRES_PASSWORD psql -U $DATABASE_USER -h $DATABASE_HOST
      psql (12.10, server 13.0 (Debian 13.0-1.pgdg100+1))
      WARNING: psql major version 12, server major version 13.
               Some psql features might not work.
      Type "help" for help.
      
      postgres=# \d
                              List of relations
       Schema |               Name                |   Type   |  Owner
      --------+-----------------------------------+----------+----------
       public | auth_group                        | table    | postgres
       public | auth_group_id_seq                 | sequence | postgres
       public | auth_group_permissions            | table    | postgres
       public | auth_group_permissions_id_seq     | sequence | postgres
       public | auth_permission                   | table    | postgres
       public | auth_permission_id_seq            | sequence | postgres
       public | auth_user                         | table    | postgres
       public | auth_user_groups                  | table    | postgres
       public | auth_user_groups_id_seq           | sequence | postgres
       public | auth_user_id_seq                  | sequence | postgres
      
      postgres=# \l
                                       List of databases
         Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges
      -----------+----------+----------+------------+------------+-----------------------
       postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
       template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
                 |          |          |            |            | postgres=CTc/postgres
       template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
                 |          |          |            |            | postgres=CTc/postgres
      (3 rows)
      
      postgres=#
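The Service and Endpoints manifests from the steps above differ only in name, IP address and port, so they can be generated for any PaaS service. The following is a minimal sketch. The function name is hypothetical, and the values are the ones from this example.

```shell
# Print a Service and a matching Endpoints object for a PaaS private IP.
# $1 = service name, $2 = PaaS private IP, $3 = port
render_paas_service() {
  cat <<EOF
kind: Service
apiVersion: v1
metadata:
  name: $1
spec:
  ports:
    - name: $1
      protocol: TCP
      port: $3
      targetPort: $3
---
kind: Endpoints
apiVersion: v1
metadata:
  name: $1
subsets:
  - addresses:
      - ip: $2
    ports:
      - port: $3
        name: $1
EOF
}

# Usage (requires cluster access):
#   render_paas_service paas-postgres 10.244.0.43 5432 | kubectl apply -f -
```

Both objects must carry the same name. That is how Kubernetes associates manually managed Endpoints with a Service that has no selector.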
      

Resource Protection

Resources like servers, storages, networks, IP addresses or load balancers, which make up the cluster, are visible to you via the API or within the Cloud Panel for transparency and billing reasons. They are, however, protected from being altered. This not only makes sure that they are not deleted accidentally, but is also vital to stable cluster operations.

Protected Resources:

  • Master Nodes (server, storage, ips)
  • Worker Nodes (server, storage, ips)
  • Kubernetes network
    • You can still attach your own servers or platform services to it, e.g. to access them from inside your cluster.
  • Storages created by Kubernetes (like Persistent Volumes)
  • LoadBalancers created by Kubernetes (like Ingress-Controllers)

If you want to change your worker config you can still do this in the Kubernetes configuration.

Horizontal Pod Autoscaler (HPA)

In order to use the horizontal pod autoscaler (HPA) you need to install the Metrics Server. You can bring your own or just follow the example.

Install Metrics Server

You can install the Metrics Server via Helm. There is a ready-to-use Metrics Server Helm Chart by Bitnami.

Add the Bitnami Metrics Server repository to your Helm installation:

helm repo add bitnami https://charts.bitnami.com/bitnami

Create a values.yaml with this content to configure your Metrics Server:

apiService:
  create: true
extraArgs:
  - --kubelet-insecure-tls=true
  - --kubelet-preferred-address-types=InternalIP

Install the Metrics Server Helm Chart:

helm install metrics-server bitnami/metrics-server -f values.yaml

Wait for the Metrics Server to be ready. It might take a minute or two before the first metrics are collected.
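Instead of checking by hand, you can poll until metrics are served. The sketch below is a generic retry helper. Its name and the chosen attempt count and delay are assumptions, while kubectl top nodes is the standard command that fails as long as no metrics are available.

```shell
# Retry a command until it succeeds.
# $1 = maximum attempts, $2 = seconds between attempts, rest = command to run
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # The command's output is discarded; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (requires cluster access): wait up to about 150 seconds, then show metrics.
#   retry 30 5 kubectl top nodes && kubectl top nodes
```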

Run HPA

In order to run the HPA you need to create a deployment and generate some load against it.

Keep in mind that you must define resource limits and requests in order to use the HPA. The service only exists so that the load generator can reach the deployment.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: k8s.gcr.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache

Save the example deployment and service as php-apache.yaml and deploy them with:

kubectl apply -f php-apache.yaml

Create the HPA for the deployment:

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10

Check the current status of the HPA:

kubectl get hpa

This should look like this:

NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         10        1          2d22h

Generate Test Load

Now you create an infinite loop which will generate a load:

kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

Open a second terminal and check the HPA status:

kubectl get hpa -w

After some time you should see the pods scale:

NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         10        1          2d22h
php-apache   Deployment/php-apache   91%/50%   1         10        1          2d22h
php-apache   Deployment/php-apache   91%/50%   1         10        2          2d22h
php-apache   Deployment/php-apache   253%/50%  1         10        2          2d22h
php-apache   Deployment/php-apache   253%/50%  1         10        4          2d22h
php-apache   Deployment/php-apache   253%/50%  1         10        6          2d22h
php-apache   Deployment/php-apache   101%/50%  1         10        6          2d22h
php-apache   Deployment/php-apache   71%/50%   1         10        6          2d22h
php-apache   Deployment/php-apache   71%/50%   1         10        9          2d22h
php-apache   Deployment/php-apache   51%/50%   1         10        9          2d22h
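The replica counts follow from the scaling rule documented for the Kubernetes HPA: the desired number of replicas is the current number multiplied by the ratio of current to target utilization, rounded up and kept between the minimum and maximum. The sketch below reproduces only this core arithmetic. The function name is hypothetical, and the real controller additionally applies a tolerance and limits how fast it scales, which is why the output above grows in stages.

```shell
# desired = ceil(current_replicas * current_percent / target_percent),
# clamped to the range [min, max].
# $1 = current replicas, $2 = current CPU %, $3 = target CPU %, $4 = min, $5 = max
hpa_desired_replicas() {
  desired=$(( ($1 * $2 + $3 - 1) / $3 ))
  [ "$desired" -lt "$4" ] && desired=$4
  [ "$desired" -gt "$5" ] && desired=$5
  printf '%s\n' "$desired"
}

# One replica at 91% utilization with a 50% target, as in the output above:
hpa_desired_replicas 1 91 50 1 10   # prints 2
```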

You can also check the deployment itself:

kubectl get deployment php-apache

NAME         READY   UP-TO-DATE   AVAILABLE   AGE
php-apache   9/9     9            9           2d22h

Stop Load and Clean Up

In order to stop the load, hit CTRL+C in the terminal where you started the load generator.

You can verify the scale down with the commands from above:

kubectl get deployment php-apache -w

Delete the example deployment and service:

kubectl delete -f php-apache.yaml

Vertical Scaling

GSK supports vertical scaling, which can be enabled by simply editing the worker node configuration of your Kubernetes cluster in the Cloud Panel or via the API. Scaling the cluster will recycle all nodes sequentially.

The following node resources can be changed:

  • Cores per worker node via parameter k8s_worker_node_cores
  • RAM per worker node via parameter k8s_worker_node_ram
  • Storage per worker node via parameter k8s_worker_node_storage
  • Storage type per worker node via parameter k8s_worker_node_storage_type

You can either change these in your Cloud Panel in the Configuration section, or via API.

To do so via API, you need to patch your cluster’s parameters. Always include all the parameters in the patch, not just the ones you want to change.

For example:

{
  "parameters": {
    "k8s_worker_node_ram": 4,
    "k8s_worker_node_cores": 2,
    "k8s_worker_node_count": 3,
    "k8s_worker_node_storage": 40,
    "k8s_worker_node_storage_type": "storage"
  }
}
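Because the patch must always carry all parameters, it can help to build the payload in one place. The sketch below covers the five parameters from the example. The function name is hypothetical, and sending the payload with a PATCH request to the service endpoint that is used for updates is an assumption based on that section.

```shell
# Build the complete "parameters" payload; all five values must be given.
# $1 = RAM, $2 = cores, $3 = node count, $4 = storage, $5 = storage type
gsk_parameters_payload() {
  if [ "$#" -ne 5 ]; then
    echo "usage: gsk_parameters_payload RAM CORES COUNT STORAGE STORAGE_TYPE" >&2
    return 1
  fi
  printf '{"parameters":{"k8s_worker_node_ram":%s,"k8s_worker_node_cores":%s,"k8s_worker_node_count":%s,"k8s_worker_node_storage":%s,"k8s_worker_node_storage_type":"%s"}}' \
    "$1" "$2" "$3" "$4" "$5"
}

# Usage (assumed endpoint, requires valid credentials):
#   curl 'https://api.gridscale.io/objects/paas/services/<CLUSTER_UUID>' -X PATCH \
#     -H 'Content-Type: application/json' \
#     -H 'X-Auth-UserId: <AUTH_USER_UUID>' -H 'X-Auth-Token: <AUTH_TOKEN>' \
#     --data-raw "$(gsk_parameters_payload 8 2 3 40 storage)"
```

Refusing to run with fewer than five arguments guards against accidentally sending a partial parameter set.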

Worker Node Storage Performance Classes

Worker nodes in your cluster use a distributed storage for their operating system. On cluster creation, you choose the performance class for this storage with the parameter k8s_worker_node_storage_type.

The performance class of your worker nodes' storage is independent of your PersistentVolumes and only affects the OS, kubelet and potential hostPath mounts. A higher performance class can help the node stay responsive when under increased memory pressure.

The performance class of your worker nodes can be changed at any time by editing your cluster. You can do so either in your Cloud Panel in the Configuration section, or via API. Changing the performance class will recycle all nodes sequentially.

To do so via API, you need to patch your cluster’s parameters to update the parameter k8s_worker_node_storage_type. Always include all the parameters in the patch, not just the ones you want to change.

For example:

{
  "parameters": {
    "k8s_worker_node_ram": 4,
    "k8s_worker_node_cores": 2,
    "k8s_worker_node_count": 3,
    "k8s_worker_node_storage": 40,
    "k8s_worker_node_storage_type": "storage"
  }
}

Logging

Container logs can be obtained via kubectl. While this is certainly feasible for ad-hoc debugging of single containers, it doesn’t give you the full picture of your application or even the whole cluster.

It is therefore a common practice to ship logs to a centralized log management platform, where they can be transformed and analyzed in one place - giving you that full picture and the means to act on events or trends.

There are multiple ways to get your logs into the log management platform:

  • Your application can directly implement the format your log management platform accepts the logs in, and send them there.
  • Your application can log to stdout and stderr, leaving it to the container engine to store the logs.

It is good practice to use the latter approach. This approach decouples the application from runtime environment specifics. It is non-blocking for the application and provides a general approach to reliably and securely transfer logs, even when the log management platform is temporarily unavailable.

Log Shipping

While the container engine technically might be able to ship the logs directly to your log management platform, having the container engine store them locally instead and a third-party component read and ship them has proven to be the more reliable and portable solution.

This third-party component is called a log shipper. In general it can run anywhere, has inputs to read logs from locally and outputs to ship logs to remotely. The log shipper is an application agnostic approach - in the sense that it doesn’t need to be integrated into the applications you run on your cluster in any way. It just needs to support the format the logs are stored in as an input and the format the log management platform accepts the logs in as an output.

Accessing Container Logs

The log driver used by the Docker container engine on our managed Kubernetes platform is journald.

journald is part of systemd and designed to store logs safely and handle rotation gracefully to prevent node disks from filling up. journald makes it easy for the shipper to reliably transfer logs, since the shipper only needs to keep track of one event stream.

journald stores logs in /var/log/journal. Among the log shippers that support journald as an input are:

Note: The log shipper needs to keep track of where it left off, so that after a restart or redeployment log shipping does not start from the beginning and transfer all logs again. Since the position is node-specific, a local hostPath mount to store the position in is recommended.
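The following sketch shows how such a hostPath mount could look in a DaemonSet that runs one log shipper per node. Everything specific in it is a placeholder: the function name, the object name log-shipper, the image, and the container path /var/lib/log-shipper are assumptions, and the setting that tells a shipper where to keep its position depends on the shipper you choose.

```shell
# Print a DaemonSet skeleton for a log shipper.
# $1 = log shipper image (hypothetical)
# $2 = journal directory on the node, mounted read-only at the same path
# $3 = node directory in which the shipper keeps its read position
render_log_shipper() {
  cat <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-shipper
spec:
  selector:
    matchLabels:
      app: log-shipper
  template:
    metadata:
      labels:
        app: log-shipper
    spec:
      containers:
        - name: log-shipper
          image: $1
          volumeMounts:
            - name: journal
              mountPath: $2
              readOnly: true
            - name: position
              mountPath: /var/lib/log-shipper
      volumes:
        - name: journal
          hostPath:
            path: $2
        - name: position
          hostPath:
            path: $3
            type: DirectoryOrCreate
EOF
}

# Usage (requires cluster access):
#   render_log_shipper <IMAGE> <JOURNAL_DIR> <POSITION_DIR> | kubectl apply -f -
```

Because the position is kept on the node, it is lost when the node is replaced. This is harmless here, since the replacement node starts with a fresh journal as well.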

Load Balancing

Applying a service of type LoadBalancer will provision a gridscale Load Balancer. Below are some helpful tips on integrating with our Load Balancer as a Service (LBaaS):

IP Address Forwarding

The Load Balancer needs to be set to HTTP mode. The client’s IP address is then available in the X-Forwarded-For HTTP header.

Note: When in HTTP mode, HTTPS-termination happens at the Load Balancer level. For the HTTP mode alone, certificates will be obtained via Let’s Encrypt or you can upload your own custom certificate.

Configuring Load Balancer Modes

The cloud controller manager (CCM) uses service annotations to configure the LBaaS for a GSK cluster. If the annotation for a specific parameter is not set, the default value for that parameter is used. This feature is supported in GSK versions 1.16.15-gs2, 1.17.14-gs1, 1.18.12-gs1, 1.19.4-gs1, and later.

Annotation                                                            Default value
service.beta.kubernetes.io/gs-loadbalancer-mode                       tcp
service.beta.kubernetes.io/gs-loadbalancer-redirect-http-to-https     "false"
service.beta.kubernetes.io/gs-loadbalancer-ssl-domains                nil
service.beta.kubernetes.io/gs-loadbalancer-algorithm                  leastconn
service.beta.kubernetes.io/gs-loadbalancer-https-ports                443
service.beta.kubernetes.io/gs-loadbalancer-custom-certificate-uuids   nil

Examples

  • The following annotations configure the LBaaS with HTTP mode, Round Robin Algorithm, redirect HTTP to HTTPS, and multiple SSL Domains wherein domains are separated by a comma. The service.beta.kubernetes.io/gs-loadbalancer-ssl-domains annotation allows you to add multiple SSL Domains to the loadbalancer.
annotations:
    service.beta.kubernetes.io/gs-loadbalancer-mode: http
    service.beta.kubernetes.io/gs-loadbalancer-redirect-http-to-https: "true"
    service.beta.kubernetes.io/gs-loadbalancer-ssl-domains: demo1.test.com,demo2.test.com
    service.beta.kubernetes.io/gs-loadbalancer-algorithm: roundrobin
  • The following annotations configure the LBaaS with HTTP mode, Round Robin Algorithm, redirect HTTP to HTTPS, a non-standard SSL port 4443, and a custom certificate. Multiple certificate UUIDs are separated by a comma. The service.beta.kubernetes.io/gs-loadbalancer-custom-certificate-uuids annotation allows you to attach already uploaded custom certificates to the loadbalancer. First upload the custom certificate via the panel or the API. Then use the UUID of the uploaded custom certificate, for example c8b786e7-53ee-427b-8ff6-498f59f58b14, with the service.beta.kubernetes.io/gs-loadbalancer-custom-certificate-uuids annotation.
annotations:
    service.beta.kubernetes.io/gs-loadbalancer-mode: http
    service.beta.kubernetes.io/gs-loadbalancer-redirect-http-to-https: "true"
    service.beta.kubernetes.io/gs-loadbalancer-custom-certificate-uuids: c8b786e7-53ee-427b-8ff6-498f59f58b14
    service.beta.kubernetes.io/gs-loadbalancer-algorithm: roundrobin
    service.beta.kubernetes.io/gs-loadbalancer-https-ports: "4443"
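For context, the annotations belong in the metadata of a service of type LoadBalancer. The sketch below prints such a service with the annotations from the first example. The function name, the app label and the port layout (80 and 443 both forwarded to container port 80) are assumptions for illustration and need to be adapted to your workload.

```shell
# Print a LoadBalancer service carrying the HTTP-mode annotations.
# $1 = service name, $2 = app label of the target pods,
# $3 = comma-separated SSL domains
render_lb_service() {
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: $1
  annotations:
    service.beta.kubernetes.io/gs-loadbalancer-mode: http
    service.beta.kubernetes.io/gs-loadbalancer-redirect-http-to-https: "true"
    service.beta.kubernetes.io/gs-loadbalancer-ssl-domains: $3
    service.beta.kubernetes.io/gs-loadbalancer-algorithm: roundrobin
spec:
  type: LoadBalancer
  selector:
    app: $2
  ports:
    - name: http
      port: 80
      targetPort: 80
    - name: https
      port: 443
      targetPort: 80
EOF
}

# Usage (requires cluster access):
#   render_lb_service my-lb my-app demo1.test.com,demo2.test.com | kubectl apply -f -
```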

Adding Annotations to an Existing Ingress

You can customize the behaviour of specific Ingress objects using annotations:

kubectl annotate --overwrite svc <INGRESS_NAME> \
"service.beta.kubernetes.io/gs-loadbalancer-mode=http" \
"service.beta.kubernetes.io/gs-loadbalancer-algorithm=roundrobin"

Networking

We use Flannel out-of-the-box, which currently cannot be changed.

Network Policies

Due to Flannel being used as the network overlay, our clusters do not support network policies.

Persistent Volumes

We differentiate between Persistent Volumes that are based on block devices and those that are based on network filesystems.

Block Device Persistent Volumes

Block device based Persistent Volumes use distributed storages that are directly attached to your GSK nodes.

Since they are block devices with plain, non-clustered filesystems (ext4 by default), they can only be attached to a single node at a time and can therefore only be used by pods that run on that node. This corresponds to the ReadWriteOnce (RWO) access mode.

Their strength is performance.

Storage Classes

Block device based Persistent Volumes give you the raw performance of the Distributed Storage. You can find a storage class for each of its performance classes.

NAME                      PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
block-storage (default)   bs.csi.gridscale.io   Delete          Immediate           true                   68d
block-storage-high        bs.csi.gridscale.io   Delete          Immediate           true                   68d
block-storage-insane      bs.csi.gridscale.io   Delete          Immediate           true                   68d

Reclaim Policy

Reclaim policy Delete makes sure that deleting Persistent Volumes (PV) will also delete the corresponding Distributed Storage.

Deleting and changing preconfigured storage classes to modify this behaviour is not recommended. Your changes will be reverted with every upgrade.

Instead, create your own storage classes that use the same provisioner.
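A custom storage class could be generated as sketched below. The function name is hypothetical. The provisioner, binding mode and volume expansion setting are taken from the listing above. Any parameters that the preconfigured classes use to select a performance class are not shown in this document, so compare the result with the output of kubectl get storageclass block-storage -o yaml before relying on it.

```shell
# Print a storage class that uses the gridscale block storage provisioner.
# $1 = name of the new storage class, $2 = reclaim policy (Delete or Retain)
render_storage_class() {
  cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: $1
provisioner: bs.csi.gridscale.io
reclaimPolicy: $2
volumeBindingMode: Immediate
allowVolumeExpansion: true
EOF
}

# Usage (requires cluster access):
#   render_storage_class my-retained-storage Retain | kubectl apply -f -
```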

Limitations

Block device based Persistent Volumes are subject to Distributed Storage and Server limitations. Currently, up to 15 storages, and therefore up to 15 Persistent Volumes, can be attached to a single GSK node at a time. The attach process takes a few seconds per Storage/PV.

Network Filesystem Persistent Volumes via GridFs

Requires 1.19.16-gs0, 1.20.15-gs0, 1.21.11-gs0 or higher.

Network Filesystem based Persistent Volumes use GridFs to store data. GridFs is an NFS-compatible network filesystem. It grows with your data, you only pay for the volume you actually use, and your data can be accessed read-write by any number of GSK nodes at a time. This corresponds to the ReadWriteMany (RWX) and ReadOnlyMany (ROX) access modes.

Its strengths are scalability and being read-write accessible from all your GSK nodes.

Set up GridFs based Persistent Volumes

GridFs is an NFS compatible network filesystem. As such, access is achieved through the NFS CSI driver for Kubernetes.

  1. Create a new GridFs instance or use an existing one.
  2. Follow the first three steps of Connect a Kubernetes Cluster to a PaaS service to make sure your GridFs is connected to your GSK cluster.
  3. Install the NFS CSI driver for Kubernetes as described here.
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs --namespace kube-system --version v4.0.0
  4. Create a storage class that uses the NFS CSI driver as the provisioner and your GridFs as the NFS server.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gridfs-<PAAS_SERVICE_UUID OF YOUR GRIDFS>
provisioner: nfs.csi.k8s.io
parameters:
  server: <IP ADDRESS OF YOUR GRIDFS>
  share: /
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - nfsvers=4.1
  5. Use that storage class for your PVCs.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-first-gridfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: gridfs-<PAAS_SERVICE_UUID OF YOUR GRIDFS>
  6. The NFS CSI driver creates a directory for this PVC under the share path configured in the storage class and makes it available as a new PersistentVolume.
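
To round off the steps above, a pod could mount the claim as sketched here. The pod name, image, command and mount path are assumptions made for illustration; only the claim name my-first-gridfs-pvc comes from the example above.

```yaml
# Sketch of a pod that mounts the GridFs-backed claim.
apiVersion: v1
kind: Pod
metadata:
  name: gridfs-example-pod          # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36           # any image works; chosen for brevity
      # Write a file to the shared volume, then keep the pod alive.
      command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data          # placeholder mount path
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-first-gridfs-pvc
```

Because the claim uses ReadWriteMany, further pods on other nodes can reference the same claimName and see the same files.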

Limitations

Network Filesystem based Persistent Volumes via GridFs can hold any number of PVCs in a single GridFs instance.

Host Path Persistent Volumes

Aside from block device based and network filesystem based Persistent Volumes, hostPath Persistent Volumes can be used for node-local storage.

Please note:

  • Due to the transient nature of the Kubernetes nodes, hostPath Persistent Volumes are lost whenever the node is recycled (e.g. during updates, upgrades or node recovery).
  • Use of hostPath Persistent Volumes can fill up node-local storage and affect health of the node.

Persistent Volumes are not automatically deleted

A PersistentVolume is created automatically when a PersistentVolumeClaim is requested, but it is not automatically deleted when you delete the GSK cluster. This behaviour protects the data on your persistent volumes from accidental loss.

There are two ways to delete the persistent volumes:

  1. Before deleting the cluster, delete the deployments that use the PersistentVolume, and then delete the PersistentVolumeClaim from the cluster.
  2. After deleting the cluster, delete the persistent volumes from the Cloud Panel.

Ingress Controller

Your cluster does not come with an ingress controller preinstalled. You can install the ingress controller of your choice as described in ingress-controllers.

Access and Security

All users with write access (or higher) to the project will be able to download the Kubernetes certificate.

PKI Certificate Access

Authentication against the Kubernetes master is based on X.509 client certificates. These certificates are generated on request and expire after three days. If you use gscloud, it renews the certificate for you automatically.

After installation, you can set up gscloud with your API Keys, which can be found here: gscloud make-config

After setup, register gscloud to fetch and update the certificates automatically using the following command: gscloud kubernetes cluster save-kubeconfig --credential-plugin --cluster <cluster uuid>

Encryption

Data is encrypted at rest, and network traffic is TLS encrypted on the application layer.

Role-based Access Control (RBAC)

GSK supports standard Kubernetes RBAC and can be configured using an access server like Microsoft Active Directory.
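
Since this is standard Kubernetes RBAC, a read-only role and its binding can be sketched as below. The namespace, role name and user name are hypothetical, and how user names map to the identities in your GSK client certificates or access server is not covered here.

```yaml
# Sketch: allow one user to read pods in a single namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader                  # hypothetical role name
rules:
  - apiGroups: [""]                 # "" refers to the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods                   # hypothetical binding name
  namespace: default
subjects:
  - kind: User
    name: jane                      # hypothetical user name
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

A Role is namespaced; for cluster-wide permissions the equivalent kinds are ClusterRole and ClusterRoleBinding.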

Firewall

GSK control plane and worker nodes use the firewall of the OS to protect cluster internals from the public network.

This does not restrict you from exposing your workloads to the public network.
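
One common way to expose a workload is a Service of type LoadBalancer, sketched below. The service name, pod label and ports are hypothetical, and the sketch assumes that the cluster's Load Balancer integration provisions a load balancer for Services of this type.

```yaml
# Sketch: expose pods labelled app=web on port 80.
apiVersion: v1
kind: Service
metadata:
  name: web-public                  # hypothetical name
spec:
  type: LoadBalancer
  selector:
    app: web                        # hypothetical pod label
  ports:
    - port: 80                      # port exposed by the load balancer
      targetPort: 8080              # assumed container port
      protocol: TCP
```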

Backups

Data that belongs to the control plane of the cluster (such as etcd) is backed up by gridscale.

Data that comes from within the application needs to be backed up by the user. gridscale Storage snapshots and backups are not supported by GSK at this point. They cannot be used for backing up persisted data.

Please employ a solution that runs in the cluster.

Node Pools

Currently, we only support one node pool.

Kubernetes Dashboard

The official Kubernetes dashboard is not deployed by default and can be installed with a single command that is mentioned in the Official Kubernetes Documentation.

Known issues

Storage instances are not deleted from gridscale panel

To prevent this issue, please do NOT delete the PVs (Persistent Volumes) before the storage instances are deleted completely from the panel. If you already have some storage instances dangling in the panel, please contact us to remove them.

Cannot delete k8s cluster when there are other PaaS/servers connected to the cluster’s private network

The issue can be solved by either attaching the PaaS/servers to other networks or removing the PaaS/servers.

Node labels do not persist

Nodes in a Kubernetes cluster are volatile and can be replaced at any time, e.g. during updates, upgrades or node recovery. When they are, the replacement nodes do not inherit the labels of the nodes they replace.

If you control scheduling of your pods with nodeSelector and node labels, please consider migrating to Affinity and anti-affinity.
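
As one possible shape of such a migration, the sketch below spreads replicas across nodes with pod anti-affinity. It relies on pod labels and on kubernetes.io/hostname, a well-known label that the kubelet sets on every node, rather than on labels applied to nodes by hand. The deployment name, labels and image are hypothetical.

```yaml
# Sketch: prefer placing replicas of the same app on different nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                         # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          # "preferred" keeps pods schedulable even if only one node is available.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: web
                topologyKey: kubernetes.io/hostname
      containers:
        - name: web
          image: nginx:1.25         # placeholder image
```

Whether this fits depends on why the node labels were used in the first place; it covers the case of keeping replicas apart, not pinning workloads to specific hardware.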

FAQ

Does gridscale monitor the cluster?

We monitor the overall health of the cluster. We ensure that the cluster is healthy and functional, and we are paged about abnormal cluster conditions. gridscale does not monitor the applications that are deployed within the cluster. Since we don’t know anything about your workloads, performance and resource monitoring is not part of the standardised gridscale Managed Kubernetes (GSK).

Do cluster components communicate on the Public or the Private Network?

Cluster communication is strictly private. This includes communication between Kubernetes components as well as communication between pods and/or services. However, as a user you can contact external services. It is therefore technically possible, though unusual, for a workload to reach another service in the same cluster through the Public Network and a Load Balancer, provided that service is exposed to the outside and the traffic is explicitly directed to its public connection details.

Terms and Abbreviations

  • GSK: gridscale Managed Kubernetes.
  • K8s: Kubernetes (K-ubernete-s: the 8 stands for the eight letters between K and s).
  • kubectl: A command line tool which functions as a management interface for a K8s cluster.
  • Node: A K8s cluster is made of a few virtual machines that talk to each other. In this context, a virtual machine is a node. A cluster consists of a master (we have one master at the moment) and one or more workers.
  • Control Plane: A fancy way of saying “masters of the cluster”. Technically, all programs that run on the master that make the cluster a cluster. For instance, a specialized database or a program that decides which worker should run which software.
  • Deployment: In most cases an app running on K8s. Technically a collection of containers based on a set of templates (images).
  • PV: Persistent Volume. A persistent storage for Kubernetes deployments.
  • PVC: Persistent Volume Claim. When a client (user, customer, an application) needs a PV, they send a PVC to the K8s cluster.
  • Service: A way of accessing your deployment from outside of the cluster, tightly related to Load Balancers and Ingresses.
  • Ingress: A special way of exposing a deployment to the outside of the cluster. Think of it as a kind of Load Balancer.
  • IngressController: This component runs inside the cluster and is responsible for handling requests for an Ingress.
  • RBAC: Role-based Access Control. Allows you to selectively give different people different access rights to the cluster.
  • Dashboard: A graphical frontend for the cluster API. The user can see their deployments, nodes and a few metrics without using the command line. This is not enabled by default, but can be easily installed into the GSK.