go-apt-cacher is a caching reverse proxy designed specifically for Debian/Ubuntu repositories. Because it is written in Go, go-apt-cacher handles thousands of concurrent client connections and is very fast.
This article describes our background and motivation as well as the design of go-apt-cacher and its companion tool, go-apt-mirror. Both are available at https://github.com/cybozu-go/aptutil.
When distributing deb packages among our geographically distributed data centers, we want to avoid direct access to the central repository for latency and bandwidth reasons. Instead, we relied on apt-cacher-ng to proxy access to the central repository server and cache deb files.
For security reasons, we need to apply patches to our servers regularly. Although the patches come from the official Ubuntu repository, we take a snapshot of it so that we can test patches before applying them. We used apt-mirror for this purpose.
apt-cacher-ng is not very stable when used as a reverse proxy for HTTPS servers, especially when there are a lot of clients. It crashes at random. We tried to debug it, but its internals are so complicated that tracking down the bug would have taken days.
apt-mirror sometimes ends up with a broken mirror. Broken here means that some files do not match the checksums provided by the APT indices. We consider this a design flaw of apt-mirror.
Instead of fixing these tools, we implemented our own in Go. The result is two new tools, go-apt-cacher and go-apt-mirror. Thanks to Go, our tools are portable (they run even on non-Linux machines), fast, terse, and able to handle a very large number of clients.
Both tools are designed and implemented to test downloaded files strictly with checksums listed in indices such as
Features of go-apt-cacher include:
- Checksum awareness
go-apt-cacher recognizes APT indices and extracts checksum information.
Cached files are invalidated and dropped when checksums are updated.
- Non-HTTP cache semantics
go-apt-cacher ignores cache-related HTTP headers.
This is because updates to an APT repository can and should be detected through checksums.
- LRU eviction
Cached files are evicted in least-recently-used (LRU) fashion.
- Massive clients
go-apt-cacher can accept thousands of concurrent connections from clients.
The number of connections to the upstream servers can be limited.
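
The checksum-aware caching and LRU eviction described above can be illustrated with a small sketch. This is not the go-apt-cacher implementation: the `Cache` type, its methods, the in-memory storage, and the choice of SHA-256 are all assumptions made for illustration. It only shows the idea that a cached file is served while its checksum matches the one in the current index and is dropped otherwise.

```go
package main

import (
	"container/list"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// entry is one cached file. All names in this sketch are hypothetical.
type entry struct {
	path string
	data []byte
	sum  string // hex-encoded SHA-256 of data
}

// Cache is a tiny in-memory LRU cache keyed by repository path.
type Cache struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // path -> element holding *entry
}

func NewCache(capacity int) *Cache {
	return &Cache{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

func checksum(data []byte) string {
	s := sha256.Sum256(data)
	return hex.EncodeToString(s[:])
}

// Put stores data and evicts least recently used entries when over capacity.
func (c *Cache) Put(path string, data []byte) {
	if el, ok := c.items[path]; ok {
		c.order.Remove(el)
		delete(c.items, path)
	}
	c.items[path] = c.order.PushFront(&entry{path: path, data: data, sum: checksum(data)})
	for c.order.Len() > c.capacity {
		last := c.order.Back()
		c.order.Remove(last)
		delete(c.items, last.Value.(*entry).path)
	}
}

// Get returns cached data only if its checksum equals indexSum, the
// checksum currently listed in the index. A stale entry is dropped.
func (c *Cache) Get(path, indexSum string) ([]byte, bool) {
	el, ok := c.items[path]
	if !ok {
		return nil, false
	}
	e := el.Value.(*entry)
	if e.sum != indexSum {
		c.order.Remove(el)
		delete(c.items, path)
		return nil, false
	}
	c.order.MoveToFront(el)
	return e.data, true
}

func main() {
	c := NewCache(2)
	c.Put("pool/a.deb", []byte("old contents"))

	// The index still lists the old checksum, so the entry is served.
	_, hit := c.Get("pool/a.deb", checksum([]byte("old contents")))
	fmt.Println("index unchanged, hit:", hit)

	// The index now lists a new checksum, so the entry is invalidated.
	_, hit = c.Get("pool/a.deb", checksum([]byte("new contents")))
	fmt.Println("index updated, hit:", hit)
}
```

Note that no HTTP cache header is consulted anywhere in this sketch; validity is decided by the checksum alone.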
Features of go-apt-mirror include:
- Checksum awareness
go-apt-mirror tests all downloaded files with checksums.
It rolls back changes when a file does not match its checksum.
- Atomic update
Mirrors are updated atomically by using rename(2) with symbolic links.
Unchanged files are reused as hard links for space- and time-efficiency.
- Ultra-fast update
go-apt-mirror detects updated files by comparing them against checksums saved with the mirror.
This is much faster than
- Parallel download
go-apt-mirror downloads files in parallel.
- Partial mirror
go-apt-mirror can mirror repositories partially for specific distributions and/or architectures.
go-apt-cacher and go-apt-mirror are not just re-implementations of apt-cacher-ng and apt-mirror in Go. They are designed for robustness and reliability.
They have already successfully replaced apt-cacher-ng and apt-mirror in our data centers. We have open-sourced them on GitHub in the hope that someone finds them useful.