One of the things that has really bugged me over the years is the trend of writing programs in such a way that they download a large number[1]Sometimes hundreds, or even thousands! of third-party libraries, written mostly by unknown randos on the internet, and then run them as part of your program. It's an obvious security risk[2]There have been quite a few of these so-called "supply chain attacks" in recent times., and even if you go to the trouble of verifying that something is trustworthy, there's no guarantee that it will remain so over the life of your program[3]For example, hackers could take control of a repository, or perhaps someone takes over an expired domain name, and thus its email addresses, which might give them control of a library's repository..
With the rise of containerized applications, these dependencies are downloaded over and over and over again, each time the image is rebuilt, which is a problem if your internet access isn't great. And regardless, the engineer in me just detests this kind of waste and inefficiency, on principle.
One way to solve these problems is to run a local proxy for the online repos, i.e. something that downloads the dependencies for you, and caches them. This doesn't address the issue of trusting a third-party library in the first place, but one thing that will help establish that trust is time, i.e. after you've been using it for a while, and nothing has been reported on Hacker News or Reddit about problems with it, your confidence that it's OK will increase[4]This requires that the artifacts that have been cached in your local proxy are immutable, to prevent attacks by a malicious party on the version stored in the online repo..
There are a few programs out there that provide this kind of service[5]For example, devpi, local-npm, or a Docker pull-through cache., but they are specific to their particular repository[6]Pulp offers support for different types of repository, although the list doesn't seem quite as extensive as Nexus.. However, Sonatype's Nexus Repository Manager supports a large number of online repositories, providing a one-stop solution, and this tutorial will go through the process of setting it up and configuring it for some of the more popular online repos.
Before we start
I run my own DNS, and have set it up so that the name nexus3[7]I originally used nexus, but this causes problems when trying to open the admin interface in a browser, because that name is on the HSTS preload list, which means that the browser will force the use of HTTPS. resolves to the server running Nexus[8]Actually, a Docker container fronted by nginx.. This can cause problems when trying to access Nexus from inside a Docker container[9]Since DNS often operates differently there., which is discussed here. However, in all cases, you can also reference the server by its IP address[10]Unless, of course, you're running Nexus in a Docker container fronted by nginx..
Nexus stores artifacts in the file system, and while it's not essential to do so, it's possible to keep each repository's files in a separate sub-directory by creating a new blob store for each one.
Nexus doesn't seem to have an option to force downloaded artifacts to be immutable, but it's possible to configure repositories to never check back with the online repo to see if an artifact has changed (by setting the Maximum Component Age to -1), which is close enough.
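For what it's worth, the same setting appears to be exposed through Nexus 3's REST API as contentMaxAge (a sketch: the repository and blob store names below are just examples, not defaults). A PyPI proxy repository whose cached components never expire would be described by something like:

```json
{
  "name": "pypi-proxy",
  "online": true,
  "storage": { "blobStoreName": "pypi", "strictContentTypeValidation": true },
  "proxy": {
    "remoteUrl": "https://pypi.org",
    "contentMaxAge": -1,
    "metadataMaxAge": 1440
  },
  "negativeCache": { "enabled": true, "timeToLive": 1440 },
  "httpClient": { "blocked": false, "autoBlock": true }
}
```

Setting it through the admin UI works just as well, of course; this is only useful if you want to script the setup.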
I've always been fond of the phrase "cattle, not pets", which refers to the idea that computer servers should be treated as cattle (i.e. you should have no problem killing them), as opposed to pets. One of the most important changes over the 35+ years I've been a professional developer is the rise of automation. Back in the day, if you wanted to set up a new server, you did it manually, carefully installing all the software and other dependencies, then even more carefully configuring them, and since you didn't want to have to do that work again[1]And it was often the case that you couldn't re-create them, even if you wanted to, because of all the minor tweaks and changes that were invariably made over time, that didn't get documented., these servers were treated as precious pets. But today, with the rise of technologies such as Ansible and containers, servers are disposable - if one fails, just throw it away and run a script to create a new one.
This approach introduces some new considerations (e.g. managing a fleet of servers, re-creating them when they fail, etc.), giving rise to a new class of software known as container orchestration. The king of these is Kubernetes, and since I recently did a bit of work with it, I wanted to set up a local instance for testing. While there are things like minikube that let you set up a local cluster on your PC, there's nothing like a proper test environment that mirrors a real production environment as closely as possible.
The trend these days is, of course, to do everything in the cloud, so there's no shortage of information on how to set things up using e.g. AWS or GCP, but rather less on how to set up a bare-metal local cluster, so we'll remedy that here with a set of instructions on how to set up a local Kubernetes instance that has:
a VM that will manage the control plane
It's possible to have a single server manage the control plane and act as a node (i.e. run containers), but in the interests of making this cluster as "real" as possible, we'll separate them out.
two more VM's that will act as nodes[2]These will actually run the containers.
We want two of these, so that we can test things like distributing workloads over multiple servers, automatic failover if a server goes down, etc.
Yeah, it's been a while! I've been quietly chugging away in the background on Awasu client work, as well as other non-Awasu projects[1]And to be honest, I haven't felt much like writing., but it's been some time since my last mega-tutorial, so let's remedy that with a deep dive into the internals of everyone's favorite source control system, git.
This tutorial assumes that you are familiar with using git (e.g. commits, branches, tags); we'll take a look at the internals of git and how it works, in particular its file formats.
A few years ago, I wrote a long series of tutorials showing how to embed Python into a C/C++ program, and periodically threatened to write another series showing how to go the other way i.e. extend Python by calling your own C/C++ code[1]Typically because you want better performance, or because you want to run it multi-threaded, which Python is known to not handle very well..
Well, I've finally made good on that promise and written some tutorials on how to write a Python extension module:
Many moons ago, I wrote a tutorial on how to set up an internet gateway on a Banana Pi, complete with DHCP, DNS, VPN, firewall and ad-blocking. It works well, I still use one today, and have even taken it with me on a few long backpacking trips. However, I worry about it being a bit fragile, and fear the day when an over-zealous customs officer decides it looks like something that could trigger a bomb, so I was overjoyed when I finally found my holy grail: something that does all of the above, in the form factor of a USB thumb drive.
GL-iNet's GL-USB-150 costs around USD 30, and comes with almost everything I need to get online when I'm on the road. This tutorial will be much shorter than the previous one, because nearly everything is already set up and ready to go.
Getting started
Plug it in, give it 30 or 40 seconds to start up, then open a browser and go to http://192.168.8.1. To log in, the default password is goodlife; once you're in, change it under More Settings/Admin Password.
It runs a DHCP server, and your computer will have already been assigned an IP address in the 192.168.8.xxx range.
Go to the Internet page, click on the Scan link, then connect to a WiFi network.
Open another browser window, and confirm that you're online.
Configuring the VPN
Go to the Management tab in the VPN/OpenVPN Client page, and upload your VPN configurations. This will typically be a ZIP of a bunch of .ovpn files, but if you have .crt and/or .pem files, you will need to include them as well.
Unfortunately, the stock firmware has a bug that prevents the ZIP file from being processed correctly, so you will need to upgrade the firmware first. Get the latest version from here[1]Version 3.026 worked for me., then install it via the Upgrade page.
Once the VPN configurations have been installed, you will be able to select which one to use from the VPN/OpenVPN Client page. Check your IP address to confirm that you are going through the VPN.
Installing software
To install additional software, go to the More Settings/Advanced page, and in the new browser window that opens, go to System/Software and update the package lists[2]This doesn't seem to persist after a reboot, so you have to remember to do this every time..
I installed the following packages:
bash
tmux
openssh-sftp-server (so that I can scp files in and out)
openssh-client (for a version of ssh that allows forwarding)
coreutils (for GNU tools)
bind-dig (for dig)
mtr (a handy network monitoring tool)
To change your default shell to bash, update /etc/passwd.
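As a sketch, the change is just swapping ash (OpenWrt's default shell) for bash in the last field of the relevant line. This is demonstrated on a sample line below, so it's safe to play with; on the device, you'd point the sed at /etc/passwd itself:

```shell
# Demonstrate the edit on a sample passwd line that mimics the stock entry;
# on the device, run the sed against /etc/passwd instead.
echo 'root:x:0:0:root:/root:/bin/ash' > /tmp/passwd.sample
sed -i 's#:/bin/ash$#:/bin/bash#' /tmp/passwd.sample
cat /tmp/passwd.sample
# → root:x:0:0:root:/root:/bin/bash
```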
The only downside to this device: while you can just about install a minimal version of Python, the disk is so small that there won't be any room for anything else.
Ad-blocking
The only thing missing from this device is an ad-blocker. Since it uses dnsmasq for DNS, rather than bind as the Banana Pi does, the process is slightly different, but not much. Here's the script that I use:
#!/bin/sh
# This script downloads blacklisted ad servers and updates dnsmasq to block them.
#
# The following line needs to be added to /etc/dnsmasq.conf:
# conf-file=/root/dns-blacklist
BLACKLIST_URL="http://pgl.yoyo.org/as/serverlist.php?hostformat=dnsmasq&mimetype=plaintext"
BLACKLIST_FNAME=/root/dns-blacklist
echo "Downloading the DNS blacklist..."
TMP_FNAME=/tmp/dns-blacklist
wget -O "$TMP_FNAME" "$BLACKLIST_URL"
if [ $? -ne 0 ] ; then exit 1 ; fi
echo
# fixup the entries so that they return "NX Domain"
echo "Updating the DNS blacklist..."
sed -i 's/address/server/g;s/127.0.0.1//g' "$TMP_FNAME"
echo
# install the new DNS blacklist
echo "Installing the DNS blacklist..."
echo " $TMP_FNAME => $BLACKLIST_FNAME"
mv "$TMP_FNAME" "$BLACKLIST_FNAME"
echo "Restarting dnsmasq..."
/etc/init.d/dnsmasq restart
echo
echo "All done."
The DNS blacklist is downloaded to a temp file, fixed up and then transferred to /root/dns-blacklist. You will need to tell dnsmasq to load this file by adding the following line to /etc/dnsmasq.conf:
conf-file=/root/dns-blacklist
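To see what the fix-up step does, here's a single entry before and after (ads.example.com is just a placeholder; yoyo.org's list uses the same format):

```shell
# The list comes down as address=/domain/127.0.0.1 lines; rewriting them to
# the empty server=/domain/ form makes dnsmasq answer NXDOMAIN for the domain.
echo 'address=/ads.example.com/127.0.0.1' | sed 's/address/server/g;s/127.0.0.1//g'
# → server=/ads.example.com/
```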
This script can be configured to run periodically, or just run it manually every now and then.
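For example, to run it weekly via cron (a sketch: OpenWrt keeps root's crontab in /etc/crontabs/root, and /root/update-dns-blacklist.sh is just an example name for wherever you saved the script; a scratch file is used below so the snippet is safe to experiment with):

```shell
# Add a weekly entry (4am Sunday) to root's crontab.
CRONTAB=/tmp/crontab.sample        # on the device: /etc/crontabs/root
echo '0 4 * * 0 /root/update-dns-blacklist.sh' >> "$CRONTAB"
cat "$CRONTAB"
# afterwards, on the device: /etc/init.d/cron restart
```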
Shutting down
There doesn't seem to be any way to shut down the device cleanly. I'm guessing it's been designed so that people can just pull the thing out of the USB port, but this really irks the sysadmin in me, so to shut down cleanly, type the following in the console:
halt
The green LED stays on, but the device will shut down.
An Awasu user recently asked for some help in getting their Python code to work with Awasu, and while the problem turned out to be related to text encoding (which is not, strictly speaking, anything to do with Awasu), since this is such a common issue, I thought I'd write up some notes on how it all works.
Note that while this tutorial has separate sections on Python 2 and Python 3, even if you're only using one version of Python, you should read both sections if you want to really understand how things work.
This stuff is tricky to get your head around at first, but once you figure it out, it's actually not too bad. The problem is that even when you've got your code right, you start receiving content from elsewhere that is wrong, which breaks your code, so you change it to handle that content, but then your code breaks when you receive content from somewhere else that is doing things correctly[2]Or also doing things incorrectly, but in a different way., and you get stuck in a cycle where your code never works properly. Hopefully, these notes will help you know when your code is right, so you can stick to your guns and start yelling at the other guy to fix their code...
A gateway lets you isolate computers in your home network from the internet. To reach the internet, a computer has to go through the gateway, which means that if you put a firewall or virus checker or ad-blocker there, all your computers will benefit from them.
As before, this series of tutorials will walk you through the whole process of setting up a gateway, including a lot of not-essential-but-nice-to-have stuff. We start off by setting up a bare-bones gateway:
These days, a firewall is pretty much a necessity, and it's quite eye-opening to watch the logs and see the constant stream of attacks, as people try to break into your computer. And even if you run an ad-blocker like uBlock or AdBlock, a DNS-based ad-blocker can be run along-side it, without affecting browser performance at all[1]Browser plugins tend to slow the browser down noticeably, and can use huge amounts of memory..
Just a quick follow-up on my recent epic tome on setting up a Banana Pi as a file server. I mentioned that I configured my disks to use the ext3 file system, and while it's generally fine, it does have one weakness: it is very slow deleting large files[1]Since I use my NAS for backups, some of my files are well over 100GB.. Even worse, it locks up the file system, impacting other activity and causing stalls, which kinda sucks if you're watching a movie at the time.
depesz took a very detailed look at the problem and some possible solutions, the TL;DR being that it's better to progressively shrink the file until it's all gone, rather than asking the operating system to delete it in one go.
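The core idea can be sketched in a few lines (a demo against a scratch file; stat and truncate here are the GNU coreutils versions, and the sizes are scaled down so it runs quickly):

```shell
# Repeatedly chop a fixed-size chunk off the end of the file, pausing
# between chunks so other I/O can get a look in, then delete what's left.
f=/tmp/bigfile
chunk=$((10 * 1024 * 1024))                 # 10MB chunks for the demo
dd if=/dev/zero of="$f" bs=1M count=25 2>/dev/null
while [ "$(stat -c%s "$f")" -ge "$chunk" ]; do
    truncate -s -"$chunk" "$f"
    sleep 0.25
done
rm -f "$f"
```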
For the benefit of anyone having problems with this, here's a script that I wrote that implements this idea:
#!/bin/bash
# parse the command-line arguments
if [ $# -lt 1 ] ; then
echo "$(basename $0) file1 file2 file3 ..."
echo " Delete file(s) slowly."
exit 1
fi
chunk_size=100000000
for fname in "$@"; do
# check if we were given a directory
if [ -d "$fname" ]; then
# yup - process each file, then remove the directory
#echo "Deleting directory: $fname"
find "$fname" -type f -exec "$(readlink -e "$0")" {} \;
rm -rf "$fname"
continue
fi
# check if we were given a file
if [ ! -f "$fname" ]; then
# nope - ignore it
continue
fi
# yup - delete the file slowly
#echo "Deleting file: $fname"
while true; do
# get the current size of the file
fsize=$(stat -c%s "$fname")
if [ "$fsize" -lt "$chunk_size" ] ; then
# the file is small enough to just delete
#echo "- Deleting file."
rm -f "$fname"
break
fi
# truncate the file, then loop back
#echo "- Truncating file: $fsize"
truncate -c -s -$chunk_size "$fname" || exit
sleep 0.25
done
done
Pass in a list of files and/or directories and it will slow-delete them. It will take longer to run, but will have far less impact on the rest of the system.
Nothing to do with Awasu, but hopefully someone out there in Internet-land will find it useful...
I've been a big fan of NAS's for many years - that is, a small file server that sits on my network and serves up music and movies, provides space for backups, etc. In the past, I've had Synology and QNAP units, and while they were both nice, they were relatively expensive and loaded with features I never used. They also only lasted a few years, and rebuilding a NAS with 5-6 TB of data is a painfully long process.
So for the next one, the plan was to grab an old laptop, load it up with FreeNAS, and then just hang a few disks off it. If and when the laptop dies, I can just set up a new one and the external disks, with all the data on them, should just plug straight in.
However, this is a bit of a clunky solution, so when the Raspberry Pi came out, I got very interested in the idea of using that. Unfortunately, the rPi has one big drawback that makes it unsuitable for use as a file server: it only has 10/100 Mbps ethernet. All the computers on my network have gigabit ethernet, and since I'm moving hundreds of GB's of data every night for backups, my file server also needs to have gigabit ethernet.
Enter the Banana Pi. Released in late 2014 by LeMaker in China, it's slightly more expensive but significantly more powerful, notably with gigabit ethernet and a SATA port. Add in a case, and I'll be able to build my own future-proof NAS for well under a hundred bucks, plus the cost of the disks.
There are quite a few tutorials floating around that explain how to set up a Banana Pi as a NAS, but they invariably only talk about how to set up the factory image of Open Media Vault[1]This is the successor to FreeNAS, written by one of the FreeNAS guys, that runs on Linux instead of FreeBSD. (which is relatively easy to do); this series of tutorials will also talk about the many things you need to do after that to get a usable system.
A while back, I posted a tutorial that showed how easy it is to extend Awasu through the use of plugins and channel hooks, and continuing on from that, here's another series that shows how you can control your Awasu via its API.
Whether you just want to find out what state your channels or reports are in, or if you want to programmatically create, update and delete them, the Python and PHP libraries available make it a breeze.
Have a play with them, hope you find them useful and, as always, feel free to ask questions in the forums.
Awasu and the stylized Japanese character in the orange box are trademarks of Awasu Pty. Ltd. Other brands and product names are trademarks of their respective owners. Awasu Pty. Ltd. believes the information in this publication is accurate as of its publication date. Such information is subject to change without notice. Awasu Pty. Ltd. is not responsible for inadvertent errors.