Installation
web-monitoring-diff requires Python 3.10 or newer. Before anything else, make sure you’re using a supported version of Python. If you need to support different local versions of Python on your computer, we recommend using pyenv or Conda.
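To confirm the interpreter on your PATH meets the version requirement above, a small check like the following can help. It is only a sketch: `version_at_least` is a hypothetical helper written for this example, and it assumes the interpreter is available as `python3`.

```shell
# Hypothetical helper: succeed when version "$1" (e.g. "3.12.4") is at
# least version "$2" (e.g. "3.10"), comparing major and minor numbers only.
version_at_least() {
    have_major=${1%%.*}
    have_rest=${1#*.}
    have_minor=${have_rest%%.*}
    need_major=${2%%.*}
    need_rest=${2#*.}
    need_minor=${need_rest%%.*}
    [ "$have_major" -gt "$need_major" ] ||
        { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

# Assumes the interpreter is called python3; adjust for pyenv or Conda setups.
if command -v python3 >/dev/null 2>&1; then
    found=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if version_at_least "$found" 3.10; then
        echo "Python $found is supported"
    else
        echo "Python $found is older than 3.10" >&2
    fi
else
    echo "python3 was not found on PATH" >&2
fi
```

If you manage several Python versions, run the check inside the environment you plan to install into, since `python3` may resolve differently there.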
System-level dependencies: web-monitoring-diff depends on several system-level, non-Python libraries that you may need to install first. Specifically, you’ll need:
libxml2, libxslt, openssl, and libcurl.
On macOS, we recommend installing these with Homebrew:
brew install libxml2
brew install libxslt
brew install openssl
# libcurl is built-in, so you generally don't need to install it
On Debian Linux, use apt:
apt-get install libxml2-dev libxslt-dev libssl-dev openssl libcurl4-openssl-dev
Other systems may have different package managers or names for the packages, so you may need to look them up.
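After installing the system-level libraries, one rough way to check that they are visible is to look for the helper programs that usually ship alongside them. This is a sketch under assumptions: `missing_commands` is a hypothetical helper, and the program names (`xml2-config`, `xslt-config`, `openssl`, `curl-config`) are the ones commonly provided by these packages, which may differ or be absent on your system or library version.

```shell
# Hypothetical helper: print every command name from the arguments that
# is not found on PATH.
missing_commands() {
    for name in "$@"; do
        command -v "$name" >/dev/null 2>&1 || echo "$name"
    done
}

# Helper programs that commonly ship with the development packages
# (an assumption; names can vary between systems and versions).
missing=$(missing_commands xml2-config xslt-config openssl curl-config)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing >&2
else
    echo "All expected helper programs were found"
fi
```

A missing helper does not always mean the library itself is missing, so treat the output as a hint about where to look rather than a definitive result.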
Install this package with pip. Be sure to include the --no-binary lxml option:
pip install web-monitoring-diff --no-binary lxml
Or, to also install the web server for generating diffs on demand, install the server extras (the quotes keep shells such as zsh from interpreting the square brackets):
pip install "web-monitoring-diff[server]" --no-binary lxml
The --no-binary flag ensures that pip downloads and builds a fresh copy of lxml (one of web-monitoring-diff’s dependencies) rather than using a pre-built version. It’s slower to install, but is required for all the dependencies to work correctly together. If you publish a package that depends on web-monitoring-diff, your package will need to be installed with this flag, too.
On macOS, you may need additional configuration to get pycurl to use the Homebrew openssl. Try the following:
PYCURL_SSL_LIBRARY=openssl \
LDFLAGS="-L/usr/local/opt/openssl/lib" \
CPPFLAGS="-I/usr/local/opt/openssl/include" \
pip install web-monitoring-diff --no-binary lxml --no-cache-dir
The --no-cache-dir flag tells pip to re-build the dependencies instead of using versions it’s built already. If you tried to install once before but had problems with pycurl, this will make sure pip actually builds it again instead of re-using the version it built last time around.
For local development, clone the git repository and then make sure to do an editable installation instead:
pip install -e ".[server,dev]" --no-binary lxml
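A note on the macOS pycurl configuration described in this step: the /usr/local/opt/openssl paths match Homebrew's default location on Intel Macs, while on Apple Silicon Homebrew usually installs under /opt/homebrew. The sketch below asks Homebrew for the actual location with `brew --prefix openssl` and prints the matching settings. `pycurl_build_env` is a hypothetical helper written for this example, not part of web-monitoring-diff.

```shell
# Hypothetical helper: print the build settings from the macOS command
# above for an OpenSSL installed under the prefix given as $1.
pycurl_build_env() {
    printf 'PYCURL_SSL_LIBRARY=openssl\n'
    printf 'LDFLAGS=-L%s/lib\n' "$1"
    printf 'CPPFLAGS=-I%s/include\n' "$1"
}

# Ask Homebrew where openssl lives instead of assuming /usr/local.
if command -v brew >/dev/null 2>&1; then
    pycurl_build_env "$(brew --prefix openssl)"
else
    echo "Homebrew (brew) was not found on PATH" >&2
fi
```

Each printed line can be placed in front of the pip install command, as in the macOS example, or exported in your shell before running it.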
(Optional) Install experimental diffs. Some additional types of diffs are considered “experimental” — they may be new and still have lots of edge cases, may not be publicly available via PyPI or another package server, or may have any number of other issues. Because they are not available via public package indexes you’ll need to install them manually.
To install via pip, run:
curl -O 'https://raw.githubusercontent.com/edgi-govdata-archiving/web-monitoring-diff/refs/heads/main/requirements-experimental.txt'
pip install -r requirements-experimental.txt
If using a different package manager, check the requirements-experimental.txt file and install each of the listed packages.
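If you are installing the experimental packages with another package manager, it can help to list the entries in the requirements file without its comments and blank lines. The following is a minimal sketch: `list_requirements` is a hypothetical helper, and it assumes the file follows pip's requirements format, where a comment starts with `#` at the beginning of a line or after whitespace.

```shell
# Hypothetical helper: print the requirement entries in the file given as
# $1, dropping whole-line comments, trailing comments, and blank lines.
# A "#" that is not preceded by whitespace (as in a URL fragment) is kept.
list_requirements() {
    sed -e 's/^[[:space:]]*#.*$//' \
        -e 's/[[:space:]][[:space:]]*#.*$//' \
        -e '/^[[:space:]]*$/d' "$1"
}

# Only runs if the file has already been downloaded to this directory.
if [ -f requirements-experimental.txt ]; then
    list_requirements requirements-experimental.txt
fi
```

Each line of output is one package specification to pass to your package manager of choice.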
(Optional) Install cChardet. If you are using the diff server and want high-performance character encoding detection, install cchardet. Note that, at the time of this writing, its stable release only supports Python 3.10, while an alpha release supports up to Python 3.12. Its current maintenance status is unclear.
In Python 3.10:
pip install cchardet
In Python 3.11 or 3.12:
pip install cchardet==2.2.0a2
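The version rules above can be captured in a small script that picks the right requirement for the running interpreter. This is only a sketch: `cchardet_requirement` is a hypothetical helper, it encodes just the versions named in this section, and it assumes the interpreter is available as `python3`.

```shell
# Hypothetical helper: print the cchardet requirement for the Python
# "major.minor" version given as $1, or fail when this section names no
# release for that version.
cchardet_requirement() {
    case "$1" in
        3.10) echo "cchardet" ;;
        3.11|3.12) echo "cchardet==2.2.0a2" ;;
        *) return 1 ;;
    esac
}

# Print (rather than run) the suggested command for the current interpreter.
if command -v python3 >/dev/null 2>&1; then
    py_version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if requirement=$(cchardet_requirement "$py_version"); then
        echo "pip install $requirement"
    else
        echo "No cchardet release is listed here for Python $py_version" >&2
    fi
fi
```

The script prints the command instead of running it, so you can review it before installing.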