Glob
24 Feb 2023
This previous post demonstrated a way to use generic website HTML source to create an inventory of web-accessible files from discovered URL links. A similar methodology can be applied to search through a filesystem directory tree and build a catalog of files with matching types.
Use glob to recursively globstar-match filepaths
In this case I use Python’s built-in glob and re (regular expression) modules to list files and match extensions in their names. I use the os.path collection of utility functions to pull directory names from the full paths, but a more modern way would probably be to use the built-in pathlib. Pandas is the only non-built-in package used, and it could be removed if a DataFrame is not the desired output.
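The approach described above can be sketched roughly as follows. This is not the original post's code: the function name catalog_files, the record keys, and the example extensions are assumptions made for illustration, and pandas is left out so the sketch runs with the standard library alone.

```python
import glob
import os
import re
import tempfile


def catalog_files(root, extensions=("nc", "csv")):
    """Return one record per file under root whose name ends in a listed extension.

    catalog_files and the default extensions are hypothetical, chosen for this sketch.
    """
    # Build a pattern such as r"\.(nc|csv)$" and match it case-insensitively.
    alternatives = "|".join(re.escape(ext) for ext in extensions)
    ext_pattern = re.compile(r"\.(" + alternatives + r")$", re.IGNORECASE)

    records = []
    # With recursive=True, "**" matches root itself and every directory below it.
    for path in glob.glob(os.path.join(root, "**", "*"), recursive=True):
        match = ext_pattern.search(path)
        if match and os.path.isfile(path):
            records.append(
                {
                    "directory": os.path.dirname(path),
                    "filename": os.path.basename(path),
                    "extension": match.group(1).lower(),
                }
            )
    return sorted(records, key=lambda r: (r["directory"], r["filename"]))


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "2022", "run1"))
    for parts in (("2022", "a.nc"), ("2022", "run1", "b.nc"), ("2022", "run1", "notes.txt")):
        open(os.path.join(demo_root, *parts), "w").close()
    for record in catalog_files(demo_root, extensions=("nc",)):
        print(record["extension"], record["filename"])
```

If a DataFrame is the desired output, the returned list of dictionaries can be passed directly to pandas.DataFrame. Note that glob skips names starting with a dot by default, so hidden files and folders are not cataloged by this sketch.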
14 Sep 2022
In recent versions of bash (4.0 and later), the ** expression can be used to match files in a particular directory and in all of its sub-directories. This can be a big help when dealing with messy filesystem structures.
This example would match .nc files in $year and the folders below $year. Depending on the shell configuration, or when running within a script, shopt -s globstar may be necessary to enable the capability.