Reproducible
9 May 2023
Python Package Management with Rye
“I built various iterations of this over the last three years out of personal interest, but I was cautious about publishing anything out of fear of adding yet another tool into the mix. Right now I’m reaching out to other people in the packaging ecosystem to see if there is a way to work together towards something, and I’m trying to get a better public description out of how I wish packaging in the Python ecosystem could work to align people on a vision.
3 Apr 2023
Reproducible Scientific Python Using Containers
Use Microsoft’s VSCode editor (code), Docker Containers, and other open-source tools for scientific Python software collaboration, development, and use on Linux and Windows. Securely connect offices, remote workers, storage resources, compute resources, and the cloud with Tailscale as a replacement for traditional VPN.
System Packages
VSCode Editor
Download Visual Studio Code from Microsoft.
For a Debian-based GNU/Linux distribution like Ubuntu or Pop OS, the .deb can be installed with sudo dpkg -i code_$version_amd64.deb.
29 Mar 2023
The Mesh Data Abstraction Library (MDAL) out of OSGeo is a “translator library” for many common conventions found in meteorology and hydrology. The library supports data found in GRIB and NetCDF encoded as NetCDF Climate and Forecast Metadata Conventions (CF) or Unstructured Grid Conventions (CF/UGRID) and represented as geospatial mesh data. In addition to supporting mature and well-defined spatial data encodings in self-describing files, the library also supports numerous model-specific formats including Telemac, HEC-RAS, and TUFLOW.
17 Feb 2021
I started my Python package management journey years ago using pip, then more recently I embraced Anaconda and conda more fully (particularly with the “conda-forge” repository) to resolve complex dependencies along with system/binary dependencies. Recently, when attempting to update our team’s standard Python docker image with the latest versions of the packages we use, and include some new ones, it appears that relying on conda and conda-forge is untenable: I have been unable to resolve the appropriate set of versions for the scientific Python packages our team requires for our work. I have moved back to pip for packages which are not provided in the default Anaconda repository. pip has made, and continues to make, a number of improvements, and had no problem providing our extra dependencies.
23 Aug 2018
Reproducing Conda Environments
This short summary is based on the Anaconda blog post at https://www.anaconda.com/moving-conda-environments/. The original blog post is a great high-level summary of the various methods in conda for reproducing environments.
- OS and platform specific (pulls from repos)
  # On source environment:
  conda list --explicit > spec-list.txt
  # New conda environment:
  conda create --name new_env_name --file spec-list.txt
- Different platforms and OS (pulls from repos, also includes pip installed packages)
  # On source environment:
  conda env export > env.yml
  # New conda environment:
  conda env create -f env.yml
- Platform and OS specific, no internet on target
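The first two approaches above could be scripted from Python along these lines. This is only a minimal sketch: the names EXPORT_ARGS, create_args, and export_env are hypothetical helpers made up for illustration, and only the conda commands themselves come from the summary above. It assumes conda is on the PATH when export_env is actually called.

```python
import subprocess

# Hypothetical helper names; only the conda commands come from the text above.
EXPORT_ARGS = {
    # Same OS and platform: exact package list
    "explicit": ["conda", "list", "--explicit"],
    # Different platforms and OS: also records pip-installed packages
    "yaml": ["conda", "env", "export"],
}


def create_args(method, spec_file, env_name="new_env_name"):
    """Return the conda command that rebuilds an environment from spec_file."""
    if method == "explicit":
        return ["conda", "create", "--name", env_name, "--file", spec_file]
    if method == "yaml":
        # The environment name is read from the YAML file itself.
        return ["conda", "env", "create", "-f", spec_file]
    raise ValueError(f"unknown method: {method!r}")


def export_env(method, spec_file):
    """Write the active environment's spec to spec_file (requires conda)."""
    with open(spec_file, "w") as fh:
        # Equivalent to the shell redirection "> spec_file" used above.
        subprocess.run(EXPORT_ARGS[method], stdout=fh, check=True)


if __name__ == "__main__":
    # Only print the commands; nothing is executed here.
    print(" ".join(create_args("explicit", "spec-list.txt")))
    print(" ".join(create_args("yaml", "env.yml")))
```

Building the commands as argument lists rather than shell strings avoids quoting problems with unusual environment names or file paths.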
17 Mar 2018
Installing NetCDF Python Packages
I was trying to remember how I have installed netCDF4 and related libraries for Python, and what I need to do differently for Windows systems vs. the Linux systems I usually use.
On Linux, sometimes I use the system netCDF C libraries, but often I compile and install specific versions of HDF5 and netCDF4 from scratch. Here is how I have built netCDF for various Docker container images.
netcdf4-python
I typically try to use pip to install Python libraries if I can. Use pip if you need to, but I am using conda and conda-forge as much as possible now; in fact, with conda, the above compilation steps are usually not necessary as far as I know. See below.
10 Mar 2018
Using the Blockchain for Open-access Journals?
One thing that excites me about the current buzz around blockchain technology is its use for open science. I can’t speak to the feasibility, but it seems to me that a distributed ledger could be an ideal place to publish and provide open-access to scientific research papers and articles.
If it included a way to store, deliver, and update supporting data, a blockchain could deliver research products that link directly to the data and analysis, providing an unparalleled level of provenance and context for research and results. Citations and work building on similar pieces of data could be connected, allowing for straightforward literature searches and discovery.