dkg's blog - Tools should be distinct from Services

Modern daemon implementations can be run in a variety of ways, in a range of contexts. The daemon software itself can be a useful tool in environments where the associated traditional system service is neither needed nor desired. Unfortunately, common debian packaging practice has a tendency to conflate the two ideas, leading to some potentially nasty problems where only the tool itself is needed, but the system service is set up anyway.

How would i fix it? i'd suggest that we make a distinction between packages that provide tools and packages that provide system services. A package that provides the system service foo would need to depend on a package that provides the tool foo. But the tool foo should be available through the package manager without setting up a system service automatically.

Bad Examples

Here are some examples of this class of problem i've seen recently:

akonadi-server depends on mysql-server

akonadi is a project to provide extensible, cross-desktop storage for PIM data. It is a dependency of many pieces of the modern KDE4 desktop. Its current implementation relies on a private instantiation of mysqld, executed directly by the end user whose desktop is running.

This means that a sysadmin who installs a graphical calendar application suddenly now has (in addition to the user-instantiated local mysqld running as the akonadi backend) a full-blown system RDBMS service running and potentially consuming resources on her machine.

Wouldn't it be better if the /usr/sbin/mysqld tool itself was distinct from the system service?

puppetmaster depends on puppet

Puppet is a powerful framework for configuration management. A managed host installs the puppet package, which invokes a puppetd service to reconfigure the managed host by talking to a centralized server on the network. The central host installs the the puppetmaster package, which sets up a system puppetmasterd service. puppetmaster depends on puppet to make use of some of the functionality available in the package.

But this means that the central host now has puppetd running, and is being configured through the system itself! While some people may prefer to configure their all-powerful central host through the same configuration management system, this presents a nasty potential failure mode: if the configuration management goes awry and makes the managed nodes inaccessible, it could potentially take itself out too.

Shouldn't the puppet tools be distinct from the puppetd system service?

Update: puppet 0.25.4-1 resolves this problem with a package re-factoring; the tools are in puppet-common, and the puppet package retains its old semantics of running the system service. Looks good to me!

monkeysphere Build-Depends: openssh-server

The Monkeysphere is a framework for managing SSH authentication through the OpenPGP Web-of-Trust (i'm one of the authors). To ensure that the package interacts properly with the OpenSSH implementation, the monkeysphere source ships with a series of test suites that exercise both sshd and ssh.

This means that anyone trying to build the monkeysphere package must pull in openssh-server to satisfy the build-depends, thereby inadvertently starting up a potentially powerful network service on their build machine and maybe exposing it to remote access that they didn't intend.

Wouldn't it be better if the /usr/sbin/sshd tool was available without starting up the ssh system service?

Good Examples

Here are some examples of debian packaging that already understand and implement this distinction in some way:

apache2.2-bin is distinct from apache2-mpm-foo: Debian's apache packaging recently transitioned to split the apache tool into a separate package (apache2.2-bin) from the packages that provide an apache system service (apache2-mpm-foo). So apache can now be run by a regular user, for example as part of gnome-user-share.
git-core is distinct from git-daemon-run: git-core provides the git daemon subcommand, which is a tool capable of providing network access to a git repo. However, it does not set up a system service by default. The git-daemon-run package provides a way for an admin to quickly set up a "typical" system service to offer networked git access.
vblade is distinct from vblade-persist: vblade offers a simple, powerful utility to export a single file or block device as an AoE device. vblade-persist (disclosure: i wrote vblade-persist) provides a system service to configure exported devices, supervise them, keep them in service across system reboots, etc.

Tools are not Services: a Proposal

Let's consider an existing foo package which currently provides:

the tool itself (say, /usr/sbin/foo) and
sets up a system service running foo, including stock foo daemon configuration in /etc/foo/*, startup scripts at /etc/init.d/foo, service configs at /etc/default/foo, etc.

I suggest that this should be split into two packages. foo-bin would contain the tool itself, and foo (which Depends: foo-bin) would contain the service configuration information, postinst scripts to set it up and start it, etc. This would mean that every instance of apt-get install foo in someone's notes would retain identical semantics to the past, but packages which need the tool (but not the service) can now depend on foo-bin instead, leaving the system with fewer resources consumed, and fewer unmanaged services running.

For brand new packages (which don't have to account for a legacy of documentation which says apt-get install foo), i prefer the naming convention that the foo package includes the tool itself (after all, it does provide /usr/sbin/foo), and foo-service sets up a standard system service configured and running (and depends on foo).

Side Effects

This proposal would fix the problems noted above, but it would also offer some nice additional benefits. For example, it would make it easier to introduce an alternate system initialization process by creating alternate service packages (while leaving the tool package alone). Things like runit-services could become actual contenders for managing the running services, without colliding with the "stock" services installed by the bundled tool+service package.

The ability to create alternate service packages would also mean that maintainers who prefer radically different configuration defaults could offer their service instantiations as distinct packages. For example, one system-wide foo daemon (foo-service) versus a separate instance of the foo daemon per user (foo-peruser-service) or triggering a daemon via inetd (foo-inetd-service).

One negative side effect is that it adds some level of increased load on the package maintainers — if the service package and the tool package both come from the same source package, then some work needs to be done to figure out how to split them. If the tool and service packages have separate sources (like vblade and vblade-persist) then some coordinating footwork needs to be done between the two packages when any incompatible changes happen.

Questions? Disagreements? Next Steps?

Do you disagree with this proposal? If so, why? Unfortunately (to my mind), Debian has a long history of packages which conflate tools with services. Policy section 9.3.2 can even be read as deliberately blurring the line (though i'd argue that a cautious reading suggests that my proposal is not in opposition to policy):

Packages that include daemons for system services should place scripts in /etc/init.d to start or stop services at boot time or during a change of runlevel.

I feel like this particular conflict must have been hashed out before at some level — are there links to definitive discussion that i'm missing? Is there any reason we shouldn't push in general for this kind of distinction in debian daemon packages?

Tags: daemons, packaging, policy