Is there an elegant way to check file integrity with md5 in ansible using md5 files fetched from server?

Vlad Manuel Mureșan · Apr 6, 2016

I have several files on a server that I need to download from an ansible playbook, but because the connection is likely to be interrupted, I would like to check their integrity after download.

I'm considering two approaches:

  1. Store the md5 of those files in ansible as vars
  2. Store the md5 of those files on the server as files with the extension .md5. Such a pair would look like: file.extension and file.extension.md5.

The first approach adds the overhead of maintaining the md5s in ansible: every time someone adds a new file, they need to make sure the md5 is added in the right place.

Its advantage is that it is already supported, through the built-in check of the get_url module's checksum parameter. E.g.:

- get_url: url=http://example.com/path/file.conf dest=/etc/foo.conf checksum=md5:66dffb5228a211e61d6d7ef4a86f5758
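
With several files, the first approach could be sketched as a list of variables consumed by a single looping task. The variable name `downloads` and the play layout are invented here for illustration; only the URL, destination and sum come from the example above.

```yaml
# Sketch only: "downloads" is a hypothetical variable name.
- hosts: all
  vars:
    downloads:
      - url: http://example.com/path/file.conf
        dest: /etc/foo.conf
        md5: 66dffb5228a211e61d6d7ef4a86f5758
  tasks:
    - name: Download each file and verify it against the md5 stored in vars
      get_url:
        url: "{{ item.url }}"
        dest: "{{ item.dest }}"
        checksum: "md5:{{ item.md5 }}"
      with_items: "{{ downloads }}"
```

Every new file then means one more entry in `downloads`, which is exactly the maintenance overhead described above.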

The second approach is more elegant and narrows the responsibility. When someone adds a new file on the server, they just need to add the .md5 alongside it and won't need to touch the ansible playbooks at all.

Is there a way to use the checksum approach to match the md5 from a file?

Answer

barnesm999 · May 23, 2016

If you wish to go with your method of storing the checksum in files on the server, you can definitely use the get_url checksum argument to validate it.

Download the .md5 file and read it into a var. Note that the file lookup reads from the machine running Ansible, not from the remote host:

- set_fact:
    md5_value: "{{ lookup('file', '/etc/myfile.md5') }}"
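
Because `lookup('file', ...)` reads from the machine running Ansible rather than from the managed host, one possible variant fetches the .md5 file on the target and reads it back with `slurp`. This is only a sketch: the URL and the `/tmp/file.md5` path are placeholders, and the `split` assumes the file may use the `md5sum` layout of a hash followed by a file name.

```yaml
- name: Fetch the .md5 file onto the target host (URL is a placeholder)
  get_url:
    url: http://example.com/path/file.md5
    dest: /tmp/file.md5
    force: true

- name: Read the file back from the target host (content comes back base64-encoded)
  slurp:
    src: /tmp/file.md5
  register: md5_file

- name: Keep only the hash, dropping any trailing file name
  set_fact:
    md5_value: "{{ (md5_file['content'] | b64decode).split()[0] }}"
```

If the .md5 file holds nothing but the hash, the `split()[0]` still returns it unchanged, minus any trailing newline.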

And then when you download the file, pass the contents of md5_value to get_url:

- get_url:
    url: http://example.com
    dest: /my/dest/file
    checksum: "md5:{{ md5_value }}"
    force: true

Note that it is vital to specify a path to a file in dest; if you set this to a directory (and have a filename in url), the behavior changes significantly.

Note also that you probably need force: true, which causes the file to be downloaded again on every run. The checksum is only verified when a file is downloaded; if the file already exists on your host, get_url won't validate the sum of the existing file, which might not be desirable.

To avoid downloading every time, you could use stat to check whether the file already exists, see what its sum is, and set the force param conditionally.

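- stat:
    path: /my/dest/file
    # ask for MD5 explicitly; the sum is returned in stat.checksum
    # (stat.md5 relies on the deprecated get_md5 option)
    checksum_algorithm: md5
  register: existing_file

- set_fact:
    force_new_download: "{{ existing_file.stat.checksum != md5_value }}"
  when: existing_file.stat.exists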

- get_url:
    url: http://example.com
    dest: /my/dest/file
    checksum: "md5:{{ md5_value }}"
    force: "{{ force_new_download | default(false) }}"

Also, if you are pulling the sums/artifacts from a web server, you can read the value of the sum directly from the URL without first downloading the .md5 file to the host. Here is an example using a Nexus server that hosts the artifacts and their sums:

- set_fact:
    md5_value: "{{ item }}"
  with_url: http://my_nexus_server.com:8081/nexus/service/local/artifact/maven/content?g=log4j&a=log4j&v=1.2.9&r=central&e=jar.md5

This could be used in place of using get_url to download the md5 file and then using lookup to read from it.
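
Newer Ansible releases also document a URL form for the checksum parameter of get_url, which would let the module fetch the sum itself. The sketch below assumes such a release (the URL form postdates this answer), that the checksum file is in a layout the module understands, and placeholder URLs following the file.extension / file.extension.md5 pairing from the question.

```yaml
- name: Download the file and let get_url fetch the sum from the .md5 URL
  get_url:
    url: http://example.com/path/file.conf
    dest: /etc/foo.conf
    # <algorithm>:<url> form; both URLs here are placeholders
    checksum: "md5:http://example.com/path/file.conf.md5"
```

On releases without this form, reading the sum into md5_value first, as shown earlier, remains the way to go.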