systemd

There is a feature request from 2019 which is surprisingly[1] still open but not really going anywhere. There are scattered efforts to build pieces of it, so for clarity let's write down what I actually want.

What even is "cron-like behaviour"

The basic idea is that the output of a job gets dropped in my mailbox. This isn't because mail is suitable for this, just that it's a well established workflow and I don't need to build any new filtering or routing, and it's showing up at the right "attention level" - not interrupting me unless I'm already paused to check mail, easily "deferred" as unread, can handle long content.

Most uses fall into one of two buckets.

  • Long-timeline jobs (weekly backups, monthly letsencrypt runs) where I want to be reminded that they exist, so I want to see successful output (possibly with different subject lines.)
  • Jobs that run often but I don't want the reminder, only the failure reports (because I have a higher level way of noticing that they're still behaving - a monthly summary, or just "things are still working".)

The primary tools for this are

  • a working mail CLI
  • systemd timer files
  • systemd "parameterized service" files that get triggered when the timed job fails (or succeeds.)
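As a rough sketch of how those three pieces could fit together, the script below writes a timer, the job's service, and a parameterized mailer service into a scratch directory for inspection; nothing gets installed. The unit names (weekly-job, mail-report@), the /usr/local/bin paths, and the UNIT_DIR variable are all made up for illustration. OnFailure= and OnSuccess= are real [Unit] settings (OnSuccess= needs a reasonably recent systemd, 249 or later as far as I know), and %n and %i are specifiers from systemd.unit(5).

```shell
#!/bin/sh
# Write example units into a scratch directory; nothing is installed.
# UNIT_DIR and every unit name below are hypothetical.
UNIT_DIR=${UNIT_DIR:-./units-sketch}
mkdir -p "$UNIT_DIR"

# The timer: activates weekly-job.service (same basename) on schedule.
cat >"$UNIT_DIR/weekly-job.timer" <<'EOF'
[Timer]
OnCalendar=Monday *-*-* 10:00

[Install]
WantedBy=timers.target
EOF

# The job itself. %n expands to this unit's full name, so the mailer
# instance is told which unit it is reporting on.
cat >"$UNIT_DIR/weekly-job.service" <<'EOF'
[Unit]
OnFailure=mail-report@%n.service
OnSuccess=mail-report@%n.service

[Service]
Type=exec
ExecStart=/usr/local/bin/weekly-job
EOF

# The parameterized mailer. %i is the instance name, i.e. the unit
# that triggered it. /usr/local/bin/mail-report is a hypothetical helper.
cat >"$UNIT_DIR/mail-report@.service" <<'EOF'
[Service]
Type=oneshot
ExecStart=/usr/local/bin/mail-report %i
EOF
```

Dropping the OnSuccess= line gives the "only tell me about failures" bucket; keeping both gives the "remind me this exists" bucket.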

The missing piece is how to actually collect the output.

Journal scraping?

We could just trust the journal - we can use journalctl --unit or --user-unit to pry out "the recent stuff" but if we can pass the PID of the job around, we can use _SYSTEMD_UNIT=xx _PID=yyy to get the relevant content.

(Hmm, we can pass %n into the mailing service (systemd.unit(5)), but not the pid?)
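To make the journal-scraping idea concrete, here is a small sketch of a helper that builds the journalctl query: always matching on _SYSTEMD_UNIT, and adding _PID only when one is known. The function name job_log and the JOURNALCTL override are my own inventions for illustration; the options used (--no-pager, --boot, --output=cat) are ordinary journalctl flags. For user units the match field would be _SYSTEMD_USER_UNIT instead.

```shell
# job_log UNIT [PID]: print the journal messages for one unit from the
# current boot, narrowed to a single process when a PID is known.
# JOURNALCTL is a hypothetical override, handy for dry runs.
job_log() {
    unit=$1
    pid=${2:-}
    # Matches on different fields are ANDed together by journalctl.
    set -- --no-pager --boot --output=cat "_SYSTEMD_UNIT=$unit"
    if [ -n "$pid" ]; then
        set -- "$@" "_PID=$pid"
    fi
    ${JOURNALCTL:-journalctl} "$@"
}
```

A mailing service could then do something like job_log weekly-job.service | mail -s "weekly-job output" root, where the unit name and recipient are placeholders.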

Separate capture?

Just run the program under script or chronic, pointing the log at a file under %t or %T whose name is built from things we know, and then OnFailure and OnSuccess can mail it and/or clean it up.

While it would be nice to do everything with systemd mechanisms, if we have to we can have the wrapper do all of the work so we have enough control.[2]
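A minimal sketch of the "wrapper does all the work" version, in plain shell rather than script or chronic: one function captures combined output to a log file and passes the exit status through, and a second mails the log and cleans it up. The function names, the MAILER override, and the idea of passing the log path as an argument are assumptions of mine, not anything systemd provides.

```shell
# capture_run LOGFILE COMMAND [ARGS...]
# Run a command, keep stdout and stderr together in LOGFILE, and return
# the command's exit status so systemd still sees success or failure.
capture_run() {
    log=$1
    shift
    "$@" >"$log" 2>&1
}

# mail_log SUBJECT LOGFILE RECIPIENT
# Mail the log, and remove it only if the mailer succeeded.
# MAILER is a hypothetical override for trying this without a real MTA.
mail_log() {
    subject=$1
    log=$2
    rcpt=$3
    ${MAILER:-mail} -s "$subject" "$rcpt" <"$log" && rm -f "$log"
}
```

The job's service would call capture_run with a log path under the runtime or temp directory, and the OnFailure or OnSuccess service would call mail_log on the same path with an appropriate subject.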

In the end

Once I started poking at the live system, I realized that I was getting ahead of myself - I didn't have working mail delivery.[3] Setting up postfix took enough time that I decided against anything more clever for the services - so instead, I just went with a minimal .service file that did

WorkingDirectory=...
Type=exec
ExecStart=bash -c '(time ...) 2>&1 | mail -s "Weekly ..." ...'

and a matching .timer file with a variant on

[Timer]
OnCalendar=Monday *-*-* 10:00

The systemd.time(7) man page has a hugely detailed set of syntax examples, and if that's not enough, systemd-analyze calendar --iterations=3 ... shows you the next few actual times (displayed in local time, in UTC, and as a human-readable relative time expression) so you can be confident about when your jobs will really happen.

For the initial services like "run an apt upgrade in the nginx container" I actually want to see all of the output, since Weekly isn't that noisy; for other services I'll mix in chronic and ifne so that it doesn't bother me as much, but for now, the confidence that things actually ran is more pleasing than the repetition is distracting.
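For the quieter services, the chronic and ifne mix might look roughly like the sketch below: chronic swallows the output unless the command fails, and ifne only starts mail when something actually arrives on stdin. The function name report_failures and its argument order are hypothetical; chronic and ifne are the moreutils tools, used here as I understand their documented behaviour.

```shell
# report_failures SUBJECT RECIPIENT COMMAND [ARGS...]
# Send mail only when the command fails and has produced output.
# Note: the pipeline's exit status is mail's, not the job's, so systemd
# will not see the job's failure through this wrapper.
report_failures() {
    subject=$1
    rcpt=$2
    shift 2
    chronic "$@" 2>&1 | ifne mail -s "$subject" "$rcpt"
}
```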

I do want a cleaner-to-use tool at some point - not a more sophisticated tool, just something like "cronrun ..." that automatically does the capture and mail, maybe picks up the message subject from the .service file directly - so these are more readable. But for now, the swamp I'm supposed to be draining is "decommissioning two machines running an AFS cell" so I'm closing the timebox on this for now.


  1. but not unreasonably: "converting log output to mails should be outside of systemd's focus." 

  2. moreutils gives us chronic, ifne, and lckdo, and possibly mispipe and ts if we're doing the capturing. cronutils also has a few bits. 

  3. This was a surprise because this is the machine I'd been using as my primary mail client, specifically Gnus in emacs. Turns out I'd configured smtpmail-send-it so that emacs would directly talk to port 587 on fastmail's customer servers with authenticated SMTP... but I'd never gotten around to actually configuring the machine itself.