One of the things that has really bugged me over the years is the trend of writing programs in such a way that they download a large number[1] of third-party libraries, written mostly by unknown randos on the internet, and then run them as part of your program. It's an obvious security risk[2], and even if you go to the trouble of verifying that something is trustworthy, there's no guarantee that it will remain so over the life of your program[3].
With the rise of containerized applications, these dependencies are downloaded over and over again, each time an image is rebuilt, which is a problem if your internet access isn't great. Regardless, the engineer in me detests this kind of waste and inefficiency on principle.
One way to solve these problems is to run a local proxy for the online repos, i.e. something that downloads the dependencies for you and caches them. This doesn't address the issue of trusting a third-party library in the first place, but one thing that helps establish that trust is time: after you've been using a library for a while, and nothing has been reported on Hacker News or Reddit about problems with it, you can be more confident that it's OK[4].
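To make the caching idea concrete, here is a minimal sketch of a pull-through cache that treats stored artifacts as immutable. It is a toy illustration, not how Nexus is implemented; the class name `CachingProxy` and the `fetch_upstream` callback are made up for this example.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy pull-through cache: fetch each artifact once, then serve the stored copy."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        # fetch_upstream is a hypothetical stand-in for "download from the online repo".
        self._fetch_upstream = fetch_upstream
        self._cache: Dict[str, bytes] = {}

    def get(self, name: str) -> bytes:
        # Only contact the online repo on a cache miss.
        if name not in self._cache:
            self._cache[name] = self._fetch_upstream(name)
        # Cached artifacts are never refreshed, so later upstream changes are ignored.
        return self._cache[name]
```

Once an artifact is in the cache, a compromised copy in the online repo never reaches you, which is what lets the passage of time build confidence in the version you already have.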
There are a few programs out there that provide this kind of service[5], but most are specific to one particular repository[6]. Sonatype's Nexus Repository Manager, however, supports a large number of online repositories, providing a one-stop solution. This tutorial goes through the process of setting it up and configuring it for some of the more popular online repos.
Before we start
- I run my own DNS, and have set it up so that the name nexus3[7] resolves to the server running Nexus[8]. This can cause problems when trying to access Nexus from inside a Docker container[9], which is discussed here. However, in all cases, you can also reference the server by its IP address[10].
- Nexus stores artifacts in the file system, and while it's not essential to do so, it's possible to keep each repository's artifacts in a separate sub-directory by creating a new blob store for each one.
- Nexus doesn't seem to have an option to force downloaded artifacts to be immutable, but it's possible to configure repositories to never check back with the online repo to see if an artifact has changed (by setting the Maximum Component Age to -1), which is close enough.
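
The last two points can also be expressed as repository configuration. Below is a rough sketch of a request body for creating a PyPI proxy repository through the Nexus 3 REST API (I believe the endpoint is `/service/rest/v1/repositories/pypi/proxy`, but check the API documentation for your version). The field names reflect my understanding of that API, with `contentMaxAge` corresponding to Maximum Component Age. The repository name `pypi-proxy` and the blob store name `pypi` are made up for this example, and the blob store would need to exist already. The code only builds the JSON; it doesn't send anything.

```python
import json
from typing import Any, Dict


def pypi_proxy_payload(name: str, blob_store: str = "default") -> Dict[str, Any]:
    """Build a request body for a PyPI proxy repository (field names are assumptions)."""
    return {
        "name": name,
        "online": True,
        # Keep this repository's artifacts in their own blob store.
        "storage": {"blobStoreName": blob_store, "strictContentTypeValidation": True},
        "proxy": {
            "remoteUrl": "https://pypi.org/",
            # -1 = never check back with the online repo for changed artifacts.
            "contentMaxAge": -1,
            # Metadata (e.g. package indexes) is still refreshed, here every 1440 minutes.
            "metadataMaxAge": 1440,
        },
        "negativeCache": {"enabled": True, "timeToLive": 1440},
        "httpClient": {"blocked": False, "autoBlock": True},
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(json.dumps(pypi_proxy_payload("pypi-proxy", blob_store="pypi"), indent=2))
```

The rest of the tutorial configures these settings through the admin interface, so treat this as an alternative way of looking at the same options.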
Tutorial index
- Installing Nexus
In which we install the base software.
- Proxy'ing for PyPI
In which we configure Nexus to proxy for PyPI.
- Proxy'ing for npm
In which we configure Nexus to proxy for npm.
- Proxy'ing for NuGet
In which we configure Nexus to proxy for NuGet.
- Proxy'ing for Docker Hub
In which we configure Nexus to proxy for Docker Hub.
- Proxy'ing for RPM-based package managers
In which we configure Nexus to proxy for RPM-based package managers[11]For example, those used by Red Hat, Fedora and Rocky Linux..
- Using Nexus when building Docker images
In which we consider how to build Docker images using Nexus.
- Running your own private Docker registry
In which we set up a private Docker registry.
- Running Nexus on a Raspberry Pi 4
In which we get Nexus running on ARM.
- Running Nexus behind an nginx reverse proxy
In which we configure nginx for Nexus.
References
[1] Sometimes hundreds, or even thousands!
[2] There have been quite a few of these so-called "supply chain attacks" in recent times.
[3] For example, hackers could take control of a repository, or perhaps someone takes over an expired domain name, and thus its email addresses, which might give them control of a library's repository.
[4] This requires that the artifacts that have been cached in your local proxy are immutable, to prevent attacks by a malicious party on the version stored in the online repo.
[5] For example, devpi, local-npm, or a Docker pull-through cache.
[6] Pulp offers support for different types of repository, although the list doesn't seem quite as extensive as Nexus.
[7] I originally used nexus, but this causes problems when trying to open the admin interface in a browser, because that name is on the HSTS preload list, which means that the browser will force the use of HTTPS.
[8] Actually, a Docker container fronted by nginx.
[9] Since DNS often operates differently there.
[10] Unless, of course, you're running Nexus in a Docker container fronted by nginx.
[11] For example, those used by Red Hat, Fedora and Rocky Linux.