I'm trying to figure out how to copy CAD drawings (".dwg", ".dxf") from a source directory with subfolders to a destination directory while maintaining the original directory and subfolder structure.
I found the following answer from @martineau within the following post: Python Factory Function
    from fnmatch import fnmatch, filter
    from os.path import isdir, join
    from shutil import copytree

    def include_patterns(*patterns):
        """Factory function that can be used with copytree() ignore parameter.

        Arguments define a sequence of glob-style patterns
        that are used to specify what files to NOT ignore.
        Creates and returns a function that determines this for each directory
        in the file hierarchy rooted at the source directory when used with
        shutil.copytree().
        """
        def _ignore_patterns(path, names):
            keep = set(name for pattern in patterns
                                for name in filter(names, pattern))
            ignore = set(name for name in names
                            if name not in keep and not isdir(join(path, name)))
            return ignore
        return _ignore_patterns

    # sample usage
    copytree(src_directory, dst_directory,
             ignore=include_patterns('*.dwg', '*.dxf'))
Updated: 18:21. This code works as expected, except that I'd like to ignore folders that don't contain any files matching include_patterns('*.dwg', '*.dxf').
shutil already contains a function ignore_patterns, so you don't have to provide your own. Straight from the documentation:
    from shutil import copytree, ignore_patterns
    copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))
This will copy everything except .pyc files and files or directories whose name starts with tmp.
It's a bit tricky (and not strictly necessary) to explain what's going on: ignore_patterns returns a function _ignore_patterns as its return value, this function gets passed to copytree as a parameter, and copytree calls it as needed, so you don't have to know or care how to call _ignore_patterns. It just means that you can exclude certain unneeded cruft files (like *.pyc) from being copied. The fact that the name of the function _ignore_patterns starts with an underscore is a hint that this function is an implementation detail you may ignore.
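To make the "function that returns a function" idea concrete, here is a small sketch that calls the returned function by hand, the way copytree would for each directory. The directory path and the file names are made up for illustration; nothing is read from disk.

```python
from shutil import ignore_patterns

# ignore_patterns() is a factory: it does no copying, it only returns a callable.
ignore = ignore_patterns('*.pyc', 'tmp*')

# copytree() calls that callable once per directory with the directory path
# and the list of entry names found in it. These values are hypothetical.
names = ['main.py', 'main.pyc', 'tmp_cache', 'drawing.dwg']
ignored = ignore('some/dir', names)

# The result is the set of names that copytree() should skip.
print(sorted(ignored))  # ['main.pyc', 'tmp_cache']
```

You normally never write the last few lines yourself; copytree does the equivalent internally.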
copytree expects that the folder destination doesn't exist yet. It is not a problem that this folder and its subfolders come into existence once copytree starts to work; copytree knows how to handle that.
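A quick way to see the "destination must not exist" rule in action is the sketch below. It builds a throwaway tree in a temporary directory (the folder and file names are invented for the example), copies it once, and then tries to copy it again into the same place.

```python
import tempfile
from pathlib import Path
from shutil import copytree

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical source tree: source/sub/plan.dwg
    source = Path(tmp) / 'source'
    (source / 'sub').mkdir(parents=True)
    (source / 'sub' / 'plan.dwg').write_text('dummy')

    destination = Path(tmp) / 'destination'   # does not exist yet
    copytree(source, destination)             # creates destination and destination/sub
    first_copy_ok = (destination / 'sub' / 'plan.dwg').is_file()

    try:
        copytree(source, destination)         # destination exists now
        second_copy_failed = False
    except FileExistsError:
        second_copy_failed = True

print(first_copy_ok, second_copy_failed)
```

On Python 3.8 and later, passing dirs_exist_ok=True to copytree allows copying into an existing destination.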
Now include_patterns is written to do the opposite: ignore every file that's not explicitly included (directories are never ignored, so that copytree can still descend into them). But it works the same way: you just call it, it returns a function under the hood, and copytree knows what to do with that function:
copytree(source, destination, ignore=include_patterns('*.dwg', '*.dxf'))
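Because include_patterns never ignores directories, folders without any matching drawings still get created, just empty. If that is unwanted, one possible alternative (not the approach from the quoted answer) is to walk the source tree yourself and create a destination folder only when there is something to put in it. The sketch below assumes a hypothetical helper name, copy_matching, and uses only os.walk, fnmatch and shutil.copy2.

```python
import os
import shutil
from fnmatch import fnmatch

def copy_matching(src, dst, patterns):
    """Copy files matching any glob pattern, creating only the folders needed.

    Hypothetical helper for illustration; returns the list of copied paths.
    """
    copied = []
    for root, _dirs, files in os.walk(src):
        matches = [f for f in files if any(fnmatch(f, p) for p in patterns)]
        if not matches:
            continue  # no matching files directly in this folder, so skip it
        # Mirror the folder's position relative to src underneath dst.
        target_dir = os.path.normpath(os.path.join(dst, os.path.relpath(root, src)))
        os.makedirs(target_dir, exist_ok=True)
        for name in matches:
            target = os.path.join(target_dir, name)
            shutil.copy2(os.path.join(root, name), target)
            copied.append(target)
    return copied
```

Usage would look like copy_matching(source, destination, ['*.dwg', '*.dxf']). Intermediate folders are still created when a deeper subfolder contains a match, since the path to that file has to exist.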