bug: The condition to check if a node was drained is not sufficient enough #326

@bajo

Summary

During a rolling upgrade of the nodes, each node is drained. The playbook checks the output of the drain command for the keyword drained and continues to run if the keyword appears anywhere in the output string.
In some situations, e.g. when a pod was started with

kubectl run tmp-netshoot --rm -i --tty --image nicolaka/netshoot -n nondefault

kubectl drain fails with an error message like

There are pending nodes to be drained:

This message also contains the word drained, so the implemented check treats the task as successful and the playbook continues execution.
A simple fix is to change

- drain.stdout is search('drained')

to

    - '"drained" in drain_result.stdout'
    - '"There are pending nodes to be drained" not in drain_result.stdout'

I will send a pull request with a more informative implementation, so that we actually get the direct output of kubectl drain as well. That should help with further debugging the state of the cluster.
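
For illustration only (this is not the content of the pull request): a minimal sketch of how the stricter condition could be combined with printing the raw output of kubectl drain. The task names, the node_name variable, the --ignore-daemonsets flag and the additional return-code check are assumptions and are not taken from the role; the sketch also assumes that kubectl exits with a non-zero code when the drain fails.

```yaml
# Hypothetical tasks, not copied from lablabs.rke2.
# node_name is an assumed variable holding the node to drain.
- name: Drain the node (sketch)
  ansible.builtin.command: "kubectl drain {{ node_name }} --ignore-daemonsets"
  register: drain_result
  # Never fail here, so the next task can still print the output.
  failed_when: false

- name: Show the direct output of kubectl drain
  ansible.builtin.debug:
    msg:
      - "rc: {{ drain_result.rc }}"
      - "stdout: {{ drain_result.stdout }}"
      - "stderr: {{ drain_result.stderr }}"

- name: Stop the play unless the node was really drained
  ansible.builtin.assert:
    that:
      # Assumption: a failed drain returns a non-zero exit code.
      - drain_result.rc == 0
      - '"drained" in drain_result.stdout'
      - '"There are pending nodes to be drained" not in drain_result.stdout'
    fail_msg: "kubectl drain did not complete, see the output above."
```

Separating the command from the assertion keeps the full stdout and stderr visible in the play output even when the drain fails, which is the debugging aid described above.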

Issue Type

Bug Report

Ansible Version

ansible [core 2.15.13]
  config file = /root/ansible/ansible.cfg
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3/dist-packages/ansible
  ansible collection location = /root/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.10.12 (main, Feb  4 2025, 14:57:36) [GCC 11.4.0] (/usr/bin/python3)
  jinja version = 3.0.3
  libyaml = True

Steps to Reproduce

  • Set the variable rke2_version to a value that differs from the currently active version, so that an upgrade is triggered.
  • Add a pod that will block draining a node:
    kubectl run tmp-netshoot --rm -i --tty --image nicolaka/netshoot -n nondefault
  • Run the playbook:
    tasks:
      - name: load | lablabs.rke2
        import_role:
          name: lablabs.rke2

Expected Results

The rolling upgrade fails because the node cannot be drained.

Actual Results

The rolling upgrade continues to run and fails at a later stage because not all nodes could be upgraded.
