I have an ansible playbook to kill running processes and works great most of the time!, however, from time to time we find processes that just can't be killed so, "wait_for" gets to the timeout, throws an error and it stops the process.
The current workaround is to manually go into the box, use "kill -9" and run the ansible playbook again so I was wondering if there is any way to handle this scenario from ansible itself?, I mean, I don't want to use kill -9 from the beginning but I maybe a way to handle the timeout?, even to use kill -9 only if process hasn't been killed in 300 seconds? but what would be the best way to do it?
These are the tasks I currently have:
- name: Get running processes
shell: "ps -ef | grep -v grep | grep -w {{ PROCESS }} | awk '{print $2}'"
register: running_processes
- name: Kill running processes
shell: "kill {{ item }}"
with_items: "{{ running_processes.stdout_lines }}"
- name: Waiting until all running processes are killed
wait_for:
path: "/proc/{{ item }}/status"
state: absent
with_items: "{{ running_processes.stdout_lines }}"
Thanks!
You could ignore errors on wait_for
and register the result to force kill failed items:
- name: Get running processes
shell: "ps -ef | grep -v grep | grep -w {{ PROCESS }} | awk '{print $2}'"
register: running_processes
- name: Kill running processes
shell: "kill {{ item }}"
with_items: "{{ running_processes.stdout_lines }}"
- wait_for:
path: "/proc/{{ item }}/status"
state: absent
with_items: "{{ running_processes.stdout_lines }}"
ignore_errors: yes
register: killed_processes
- name: Force kill stuck processes
shell: "kill -9 {{ item }}"
with_items: "{{ killed_processes.results | select('failed') | map(attribute='item') | list }}"