Skip to content

DontPanicO/svlopp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

139 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

svlopp

svlopp is a minimal, Linux-specific process supervisor built around a single epoll loop and modern kernel primitives.

Alpha Software.

svlopp is under active development and is not production ready. It currently targets contributors and experimentations rather than practical deployments.

What is this?

Right now, svlopp is a Linux process supervisor that runs as a regular process (i.e. it is not PID 1). You define a set of processes and svlopp starts them, tracks them, and reaps them when they exit. The long term goal is to evolve svlopp into a minimal init system for Linux, focused on the essential responsibilities of init.

Design Principles

While svlopp is a recent project, it was preceded by a lot of reading about init systems - source code, docs, articles - to the point where I had already formed strong opinions on the topic before writing any code. svlopp largely reflects those opinions, with the single exception of portability: while I generally consider it important, I found the Unix ecosystem already well served in that regard and chose instead to optimize svlopp for a single target platform - Linux. That decision drove some of the other principles. svlopp uses a signalfd for signal handling, receiving signals as events, which makes it convenient to have a single event loop rather than having a per process watcher approach.

Another important aspect is clarity. While I was reading the source code of other init systems / process supervisors, I noticed a clear distinction between designs that were self explanatory and easy to follow locally - most of them - and those whose functionality is hidden behind multiple layers of abstraction - a few -, requiring me to keep track of long call chains and data flows before even stumbling across a syscall, to the point where I just gave up. svlopp tries to stay in the former category. That's not to scoff at the latter: I have deep respect for any software that's running - and working - in production. Something else I took away from my readings is the idea that - quoting the s6 author - system software should stay the heck out of the way. That means to me that system software should not pollute the user space from both the perspective of other system software it might interact with, and the user's point of view by not trying to reduce the spectrum of software they are able to run on their machines.

This naturally leads to the last point: avoiding feature creep. Features that are not clearly in scope are better left out, and the burden - or the pleasure - of addressing those concerns should be left to other software, built specifically for that purpose, and better suited to solving that particular problem. That's something I see people - including myself - intuitively understanding even before knowing that thing has a name and it's an idea well defined by the Unix philosophy.

High-level architecture

                     ┌───────────────┐
                     │     epoll     │
                     └───────┬───────┘
                             │
             ┌───────────────┼───────────────┐
             │               │               │
        ┌──────────┐   ┌──────────┐   ┌────────────┐
        │ signalfd │   │ timerfd  │   │ control    │
        │ (signals)│   │(periodic)│   │ FIFO       │
        └────┬─────┘   └────┬─────┘   └─────┬──────┘
             │              │               │
             v              v               v
     SIGCHLD / HUP /   enforce timeouts    start /
     TERM / INT        and scheduled       stop /
     reap children,    actions             restart services
     reload,
     shutdown

svlopp runs a single epoll loop that currently watches a signalfd, timerfd and a control FIFO.

Signals are handled via signalfd, allowing them to be treated as regular events rather than asynchronous interruptions. This makes it possible to have all the supervision logic driven by that loop, instead of creating per service monitoring threads or processes. The following signals are currently handled:

  • SIGCHLD: one or more child processes exited, svlopp reaps all exited children
  • SIGHUP: configuration reload, svlopp re-reads the configuration, diffs it against the current state and reconciles
  • SIGTERM / SIGINT: graceful shutdown, svlopp requests services to stop by sending their configured stop_signal (defaults to SIGTERM) and waits for them to exit

On reload (SIGHUP), svlopp reads the configuration and reconciles it with the current runtime state: new services get added and started, removed services get stopped and removed, and changed services get restarted with their updated definition. Some of those actions can be performed immediately (for example, starting a new service or removing a stopped one), while others have to be deferred until the service process has exited. This preserves a single reaping path through SIGCHLD and keeps process lifecycle handling centralized and predictable.

The timerfd fires periodically and is currently used to apply scheduled actions (such as those deferred during a reload after SIGHUP) and to enforce shutdown deadlines, allowing svlopp to forcefully terminate child processes that ignore the stop signal. Deadlines are computed when the configured stop signal is sent by adding stop_timeout_ms to the current time (see Configuration for more details). On each timerfd tick, svlopp checks for overdue deadlines and sends SIGKILL. This provides two guarantees:

  1. SIGKILL is never sent before stop_timeout_ms has elapsed since the stop signal.
  2. SIGKILL is sent on the first timerfd tick after the deadline. As a consequence, stop_timeout_ms is not the exact time between the stop signal and SIGKILL, but the minimum amount of time that will elapse between them.

svlopp maintains a runtime directory (by default /run/svlopp) that is expected to reside on a tmpfs and contains only runtime generated state.

Status file

svlopp maintains a status file in the runtime directory, which contains a snapshot of the current runtime state, one line per service. For running services: <name> <id> <state> <pid>

For stopped services: <name> <id> <state> <stop_reason>

The file is rewritten whenever the runtime state changes which makes it important for the runtime directory to reside on a tmpfs.

Control FIFO

The control FIFO is a named pipe that accepts binary commands from external sources, to start, stop and restart individual services. The protocol uses fixed-size frames of 9 bytes: the first byte encodes the operation, and the remaining 8 bytes carry the service id as a little-endian u64. Service ids are published in the status file. Writers are expected to resolve service names to ids by reading it.

Quick Start

Build svlopp with cargo:

cargo build --release

Create a configuration file:

[services.sleep_forever]
command = "sleep"
args = ["infinity"]

[services.my_daemon]
command = "/usr/local/bin/my_service"
args = ["--config", "/etc/my_service.conf"]

Run svlopp:

sudo ./target/release/svlopp services.toml

By default svlopp uses /run/svlopp as its runtime directory. Since that location typically requires root privileges, you may need to run it with sudo.

Alternatively, you can specify a custom runtime directory:

./target/release/svlopp --run-dir /tmp/svlopp services.toml

At startup, svlopp removes the runtime directory if exists and recreates it. The parent directory (/tmp in the above example) is expected to exist.

To reload configuration, send SIGHUP:

kill -HUP $(pidof svlopp)

To shutdown gracefully, send SIGTERM or SIGINT:

kill -TERM $(pidof svlopp)

Configuration

Configuration is required to define services. As of now svlopp is configured via a single TOML file, and the service definition format consists of:

  • A service name
  • A command (the path to the binary)
  • An optional array for command arguments
  • An optional termination reaction
  • An optional working directory
  • Optional environment variables
  • An optional log file
[services.service_name]
command = "service_bin"
args = ["service", "options"] # optional
on_exit = "Restart" # optional
working_directory = "/home/myuser" # optional
log_file_path = "/var/log/service_name.log" # optional
stop_signal = "SIGQUIT" # optional
stop_timeout_ms = 5000 # optional

[services.service_name.env] # optional
FOO = "BAR"
BAZ = "QUX"

Services are expected to run in the foreground. svlopp supervises the processes it starts and reaps them directly; services that daemonize themselves, double-fork, or are explicitly backgrounded (e.g. using &) will break supervision and are not supported.

Many traditional daemons support a foreground mode. For example, to supervise apache2 you could write a small wrapper script:

#!/usr/bin/env bash

. /etc/apache2/envvars
exec /usr/sbin/apache2 -DFOREGROUND

and make it executable, then add it to your configuration:

[services.apache2]
command = "/path/to/your/wrapper_script.sh"

The optional on_exit field defines what svlopp should do after a service process exits. It is a fallback action, taken only when no other explicit action is pending (for example after a configuration reload triggered by SIGHUP).

Supported values are:

  • None (default): do nothing. The service will not be restarted automatically, but svlopp will keep it in memory, so it can be started again via an explicit command
  • Restart: restart the service after it exits
  • Remove: remove the service from supervision after it exits. A configuration reload (SIGHUP) is required to start the service again

If a service is stopped by svlopp itself (e.g during shutdown, configuration reload or - once supported - via an explicit command) the fallback action is not taken.

The optional working_directory field sets the working directory for the service process. If not specified, the service inherits svlopp's current working directory.

The optional env table defines the environment for the service process. If env is not specified the service inherits svlopp's current environment; otherwise, the inherited environment is completely replaced by the variables defined in env.

This means that specifying an empty env results in a (mostly) empty environment and users are responsible for defining variables such as PATH if the command lookup relies on it.

# this will fail (no `PATH` defined)
[services.service_name]
command = "sleep"
args = ["infinity"]

[services.service_name.env]

# this is fine (absolute path, no `PATH` needed)
[services.service_name]
command = "/usr/bin/sleep"
args = ["infinity"]

[services.service_name.env]

# this is also fine (`PATH` explicitly defined)
[services.service_name]
command = "sleep"
args = ["infinity"]

[services.service_name.env]
PATH = "/usr/bin"

The optional log_file_path field specifies a file to which both stdout and stderr of the service are redirected. If not set, they are redirected to /dev/null. The file is opened in append mode and svlopp does not perform any kind of log rotation or size management.

The optional stop_signal field specifies which signal svlopp sends to request a graceful shutdown of the service. Valid values are:

  • SIGTERM (default)
  • SIGINT
  • SIGQUIT
  • SIGHUP
  • SIGUSR1
  • SIGUSR2

The optional stop_timeout_ms field specifies how long svlopp waits after sending the configured stop signal before forcefully terminating the service with SIGKILL. SIGKILL is sent at the earliest opportunity, which corresponds to the first timerfd tick after the configured timeout has elapsed.

svlopp in still in its early stages, and the configuration format should be expected to evolve. Service definitions will likely expand beyond what is currently available, and the overall configuration structure may change as new features are introduced.

Project Status

Working

  • Basic service definition
  • Configuration parsing
  • Process spawning and lifecycle tracking
  • Event-driven supervision
  • Child processes reaping
  • Subreaping of orhpaned descendant processes (PR_SET_CHILD_SUBREAPER)
  • Graceful shutdown
  • Static configuration reload
  • Runtime control commands (stop/start/restart) via control FIFO
  • Service status reporting with status file

Not yet implemented / still thinking about

  • Useful logging
  • pidfd based process management
  • Extend service definition

Limitations & Known Issues

svlopp is still in an early stage, and several important pieces are either missing or incomplete:

  • Running svlopp as PID 1 is not currently supported
  • Subreaping support is minimal. Orphaned descendants are reaped, but no additional semantics (such as attribution to services) are currently implemented
  • Logging is very limited and poorly structured, if at all

These limitations are known and sometimes intentional at this stage. The focus so far has been on process supervision before expanding other capabilities.

Building

svlopp only builds on Linux. Make sure you have a recent Rust toolchain installed:

cargo build

For a release build:

cargo build --release

Testing

Tests spawn svlopp with one or more services and interact with it via signals and the control FIFO to trigger specific events. Behavior is then verified by checking that the reported state (status file) matches the expected outcome and is consistent with the kernel state (e.g. pid existence in /proc).

Tests are written in Python and run with pytest:

pip install -r requirements-test.txt
PYTHONPATH=. pytest tests/

Contributing

svlopp is in early development and I'm happy to have people look at it, poke at it, and share their thoughts. If you find a bug or have an idea, open an issue. If you want to contribute code, feel free to open a PR, but for anything non trivial it might still be worth opening an issue.

This project uses the Developer Certificate of Origin (DCO). By signing off your commits, you certify that you have the right to submit the code under the project's license. Sign off your commits with:

git commit -s

License

This project is licensed under the Mozilla Public License, version 2.0 (MPL-2.0).

About

A lightweight, single threaded, epoll based linux process supervisor, aiming to evolve into a minimal init system

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors