Skip to content

Conversation

jjaswanson4
Copy link
Contributor

What does this PR do?

This PR prevents failure of async file cleanup when a jid file is not defined when setting up nodes in a controller workflow

How should this be tested?

Attempt to create a workflow:

---
controller_workflows:
  - name: Setup AD Forest
    organization: Example
    simplified_workflow_nodes:
      - identifier: Wait for Connectivity
        unified_job_template: Wait for Connectivity
        success_nodes:
          - Set Base Configurations
      - identifier: Set Base Configurations
        unified_job_template: Set Base Configurations

Run simple playbook:

---
- name: Configure Ansible Controller
  hosts:
    - all
  gather_facts: false
  connection: local
  vars:
    controller_configuration_secure_logging: false
    aap_configuration_secure_logging: false
  pre_tasks:
    - name: Import variables from /runner/variables
      ansible.builtin.include_vars:
        dir: /runner/variables
  roles:
    - infra.aap_configuration.controller_license
    - infra.aap_configuration.dispatch

Without this commit, the playbook will fail:

fatal: [it_automation_service]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'ansible_job_id'. 'dict object' has no attribute 'ansible_job_id'\n\nThe error appears to be in '/usr/share/ansible/collections/ansible_collections/infra/aap_configuration/roles/controller_workflow_job_templates/tasks/add_workflows_schema.yml': line 126, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n  always:\n    - name: add_workflows_schema | Cleanup async results files\n      ^ here\n"}

This mirrors logic introduced in the main task file of the role.

Is there a relevant Issue open for this?

N/A

Other Relevant info, PRs, etc

N/A

@jjaswanson4 jjaswanson4 requested a review from a team as a code owner September 3, 2025 21:37
@jjaswanson4
Copy link
Contributor Author

Confirmed fix:

# Used to fail here
skipping: [it_automation_service] => (item=Cleaning up job results file: Unknown)  => {"__workflows_schema_results_item": {"__workflow_loop_node_item": {"identifier": "Link 62443-3-3 GPO to OU", "unified_job_template": "Link 62443-3-3 GPO to OU"}, "ansible_loop_var": "__workflow_loop_node_item", "changed": false, "false_condition": "((__workflow_loop_node_item.always_nodes is defined and __workflow_loop_node_item.always_nodes | length > 0) or (__workflow_loop_node_item.success_nodes is defined and __workflow_loop_node_item.success_nodes | length > 0) or (__workflow_loop_node_item.failure_nodes is defined and __workflow_loop_node_item.failure_nodes | length > 0))", "skip_reason": "Conditional result was False", "skipped": true}, "ansible_loop_var": "__workflows_schema_results_item", "changed": false, "false_condition": "__workflows_schema_results_item.ansible_job_id is defined", "skip_reason": "Conditional result was False"}

# Now succeeds
PLAY RECAP *********************************************************************
it_automation_service      : ok=78   changed=4    unreachable=0    failed=0    skipped=58   rescued=0    ignored=0   

Copy link
Contributor

@ivarmu ivarmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about that fix... I agree that the error should be clearer or more concise, but I don't think skipping it should be a good idea. The error is raised at the first task of the block:

    # Creating Workflow Node
    - name: add_workflows_schema | Create the Workflow Nodes
      ansible.controller.workflow_job_template_node:

This is telling that something is wrong with the current node so... for the given workflow, it will be incomplete if we only skip that error.

I'd prefer to add a rescue section and to show the original error with the current node, making the playbook to fail instead of continuing ignoring the error.

@ivarmu
Copy link
Contributor

ivarmu commented Sep 4, 2025

I'm not sure about that fix... I agree that the error should be clearer or more concise, but I don't think skipping it should be a good idea. The error is raised at the first task of the block:

    # Creating Workflow Node
    - name: add_workflows_schema | Create the Workflow Nodes
      ansible.controller.workflow_job_template_node:

This is telling that something is wrong with the current node so... for the given workflow, it will be incomplete if we only skip that error.

I'd prefer to add a rescue section and to show the original error with the current node, making the playbook to fail instead of continuing ignoring the error.

@Tompage1994 @djdanielsson @sean-m-sullivan What do you think?

Copy link
Collaborator

@Tompage1994 Tompage1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I've just been through and checked that this logic isn't missing elsewhere and we seem to be fine

@Tompage1994
Copy link
Collaborator

I'm not sure about that fix... I agree that the error should be clearer or more concise, but I don't think skipping it should be a good idea. The error is raised at the first task of the block:

    # Creating Workflow Node
    - name: add_workflows_schema | Create the Workflow Nodes
      ansible.controller.workflow_job_template_node:

This is telling that something is wrong with the current node so... for the given workflow, it will be incomplete if we only skip that error.
I'd prefer to add a rescue section and to show the original error with the current node, making the playbook to fail instead of continuing ignoring the error.

@Tompage1994 @djdanielsson @sean-m-sullivan What do you think?

So, to me this is just implementing the same logic used in the rest of the collection. This was obviously missed before because it wasn't in a main.yml file like the rest.

I think it is the right thing to be doing though as it's there to do cleanup of the async results files. The lack of a job id implies that there will be nothing to cleanup, therefore should appropriately be skipped.

@jjaswanson4
Copy link
Contributor Author

This is telling that something is wrong with the current node so... for the given workflow, it will be incomplete if we only skip that error.

In my example, what happens is the last node has nothing to be connected to, so no async job to manage the links is created. Then, because no async job is created, there's no async job to cleanup, causing the failure without this added conditional.

Copy link
Contributor

@ivarmu ivarmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@djdanielsson djdanielsson enabled auto-merge (squash) September 4, 2025 20:35
@djdanielsson djdanielsson merged commit 0d759fb into redhat-cop:devel Sep 4, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants