Catalog
24 Feb 2023
A previous post demonstrated a way to use generic website HTML source to create an inventory of web-accessible files from discovered URL links. A similar methodology can be applied to search through a filesystem directory tree and build a catalog of files matching given filetypes.
Use glob to recursively globstar-match filepaths
In this case I use Python's builtin glob and regular expression modules to list files and match extensions in their names. I used the os.path collection of utility functions to pull directory names from the full paths, but a more modern way would probably be to use the builtin pathlib. Pandas is the only non-builtin package used, and it could be removed if a DataFrame is not the desired output.
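As a rough illustration of the approach described above, here is a minimal sketch using only the builtin modules. The function name build_catalog, the default extensions, and the record keys are hypothetical choices for this example, not taken from the original code.

```python
import glob
import os
import re


def build_catalog(root, extensions=("csv", "txt")):
    """Return one record per file under `root` whose extension matches.

    `build_catalog` and its defaults are illustrative assumptions.
    """
    # Case-insensitive match on a trailing ".ext" for any listed extension
    ext_pattern = re.compile(
        r"\.(" + "|".join(re.escape(e) for e in extensions) + r")$",
        re.IGNORECASE,
    )
    records = []
    # With recursive=True, "**" matches zero or more nested directories,
    # so files directly in `root` are included too. By default glob
    # skips names that start with a dot.
    pattern = os.path.join(root, "**", "*")
    for path in sorted(glob.glob(pattern, recursive=True)):
        if not os.path.isfile(path):
            continue  # skip directories
        match = ext_pattern.search(path)
        if match:
            records.append(
                {
                    "directory": os.path.dirname(path),
                    "filename": os.path.basename(path),
                    "extension": match.group(1).lower(),
                }
            )
    return records
```

If a DataFrame is the desired output and pandas is installed, the list of records can be passed straight to the constructor, for example pd.DataFrame(build_catalog("some/root")). A pathlib version could swap the glob call for Path(root).rglob("*") and read the directory and extension from each path's parent and suffix attributes.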