Size matters: how Fedora approaches minimization

by | Feb 24, 2020 | Edge Computing

In modern IT environments, Linux distributions are looking to become better suited for container use. One way to get there is to reduce the size of the distribution, a process known as minimization. A new tool is being put together that will enable developers and operators to create minimal images of the appropriate size for the container use cases they need.


Graphical representation of Fedora repository relationships. Image by: Adam Šamalík

There are solid reasons for distro minimization, and they counter the usual objection that disk space is cheap. The larger the distribution’s userspace in a container, the more resources a container environment will need at scale.

Edge computing devices also need smaller distributions, both to reduce the memory used on each device and to lower the bandwidth needed to keep an edge-based Linux distribution updated. Another strong reason for distribution minimization goes back to good OpSec sense: the smaller the platform, the smaller its attack surface.

These needs are exactly why Adam Šamalík, a Senior Software Engineer on the RHEL engineering team at Red Hat, took on the task of Fedora minimization. In a January 30 presentation at DevConf.CZ, Šamalík outlined that, prior to the release of Fedora 30, the uncompressed base image size of Fedora had risen steadily from 200 MB in Fedora 22 to about 300 MB in Fedora 29, a 50% increase.

Šamalík didn’t initially want to jump in and start slashing the size of the Fedora base image. Following the Fedora project objectives guidelines, Šamalík wanted to ensure that he had the right goals in place. For example, did the size of the Fedora base images actually need reducing, or did the apps and dependencies running on top of the images need to be scaled back?

As it turns out, a little bit of both. Šamalík created a Feedback Pipeline tool specifically designed to let developers and operators define the base image they want to monitor and examine (such as Fedora 31 vs. Fedora Rawhide) as well as a use case for that image. The requirements for an image running Apache HTTPD, for example, would differ from those for an image running PostgreSQL.
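To make the idea of a use case concrete, here is a minimal Python sketch of two workloads defined against the same base image. It is an illustration only: the `UseCase` class, the `extra_packages` helper, and the package sets are all hypothetical, and none of this reflects Feedback Pipeline’s actual configuration format.

```python
from dataclasses import dataclass


# Hypothetical data model for illustration; NOT Feedback Pipeline's real format.
@dataclass(frozen=True)
class UseCase:
    name: str
    base_image: str       # label for the image being monitored
    required: frozenset   # top-level packages the workload needs


def extra_packages(use_case, base_packages):
    """Return the packages a use case needs beyond what the base image ships."""
    return use_case.required - base_packages


# Illustrative package sets only; these are not real Fedora image contents.
BASE = frozenset({"bash", "glibc", "coreutils"})
httpd = UseCase("web server", "fedora:31", frozenset({"httpd", "glibc", "bash"}))
postgres = UseCase("database", "fedora:31", frozenset({"postgresql-server", "glibc"}))

httpd_extra = extra_packages(httpd, BASE)
postgres_extra = extra_packages(postgres, BASE)
```

Even in this toy form, the two use cases add different packages on top of the same base, which is why it helps to examine each workload on its own.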

Thus far, the project has been a success. Through continued use of the tool, the Fedora development team was able to pare the container base image of Fedora back down to under 200 MB. With the Feedback Pipeline’s graphing tools, developers can visually examine the dependency and size relationships within a given image and determine if any packages in that image can be reduced in size or removed altogether. And when something gets removed, Feedback Pipeline helps ensure it won’t come back.
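The kind of analysis described above can be sketched in a few lines of Python: walk a dependency graph to find everything an image pulls in, add up the installed size, and flag any package that was deliberately removed but has crept back. This is a rough sketch under stated assumptions, not how Feedback Pipeline is implemented; the graph, the sizes, and the function names are all made up for illustration.

```python
# Hypothetical dependency graph and installed sizes (MB); not real Fedora data.
DEPENDS = {
    "httpd": ["apr", "glibc"],
    "apr": ["glibc"],
    "glibc": [],
    "python3": ["glibc"],
}
SIZE_MB = {"httpd": 5, "apr": 1, "glibc": 30, "python3": 25}


def closure(roots, depends):
    """Return every package reachable from roots via the dependency graph."""
    seen = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(depends.get(pkg, []))
    return seen


def total_size(packages, sizes):
    """Sum the installed size of a set of packages."""
    return sum(sizes[pkg] for pkg in packages)


def reintroduced(packages, unwanted):
    """List packages that were meant to stay out of the image but are present."""
    return sorted(set(packages) & set(unwanted))


installed = closure(["httpd"], DEPENDS)
size = total_size(installed, SIZE_MB)
regressions = reintroduced(installed, {"python3"})
```

Running a check like `reintroduced` on every new build is one simple way to notice when a removed package returns through a new dependency.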

Work on the Feedback Pipeline continues, Šamalík explained, and the whole project is open for contributions and examination on GitHub.