Mastering systemd Timers — Time to Graduate from cron

Introduction — Why Timers, Why Now

A post titled "You don't love systemd timers enough" recently hit the Hacker News front page and collected hundreds of comments. The title is provocative on purpose. systemd debates are the eternal flamewar of the Linux community, but the argument here was surprisingly sober: "Most of the pain you experience scheduling things with cron was solved by systemd timers ten years ago — so why are you still opening crontab?"

The comment section was just as interesting. One camp argued that "writing two unit files is more hassle than one crontab line," and the counterpunch was "that one line has been silently failing, and you will find out six months later." In 2026, with AI coding agents routinely generating server configuration files, the cost of "writing two unit files" has effectively collapsed to zero — what remains is the difference in operational quality.

This post covers the structural limits of cron, the timer unit architecture, OnCalendar syntax in detail, catch-up of missed runs, failure notifications, user timers, and a practical cron-to-timer migration mapping — all from an operations point of view.

The Structural Limits of cron

cron was born in 1975. Surviving fifty years is an achievement in itself, but in modern operations the following problems keep tripping people up.

1. No Logging

Output from cron jobs is sent as local mail by default. On servers without a configured MTA — which is most cloud instances — the output simply vanishes. So everyone ends up writing this:

# A familiar sight in crontab — manually shoving logs into a file
0 3 * * * /opt/backup.sh >> /var/log/backup.log 2>&1

The problems are obvious. You have to manage log rotation yourself, your script has to print its own timestamps, and answering "did the last run succeed?" means parsing a log file.

2. No Way to Express Dependencies

"After the network is up," "after the mount is complete," "only while another service is running" — cron cannot express any of these. So sleep loops and retry logic creep into your scripts, and the shell script ends up doing the scheduler's job.

3. Missed Runs Are Gone Forever

You scheduled a backup for 3 a.m., but the server was powered off at that moment? cron does nothing. anacron exists as a patch, but it only has daily resolution and is yet another tool to manage.

4. The Environment Trap

cron runs jobs with a nearly empty environment. The "works in my terminal, fails in cron" debugging spiral caused by a different PATH is a rite of passage for every sysadmin.
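
One way to catch this before deployment is to run the job under a stripped-down environment yourself. The sketch below is only an approximation: the variable set and the /usr/bin:/bin PATH mirror common cron implementations, but the exact defaults depend on your cron daemon and distribution, and the run_like_cron helper name is made up for illustration.

```shell
#!/bin/sh
# run_like_cron 'COMMAND LINE'
# Run a command line roughly the way cron would: empty environment plus a
# handful of variables, executed by /bin/sh.
# Assumption: PATH=/usr/bin:/bin is a typical cron default, not a guarantee.
run_like_cron() {
  env -i HOME="${HOME:-/}" LOGNAME="$(id -un)" SHELL=/bin/sh PATH=/usr/bin:/bin \
    /bin/sh -c "$1"
}

# Example: show the PATH the job would actually see
run_like_cron 'echo "PATH seen by the job: $PATH"'
```

If a script works in your terminal but fails under run_like_cron, the difference is almost certainly something your interactive shell exports.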

5. No Concurrency Control

If the previous run has not finished when the next schedule fires, cron just runs it again. You need the flock wrapper idiom to protect yourself.
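
For reference, the flock idiom looks roughly like this. It assumes the flock utility from util-linux is installed; the with_lock helper and the lock file paths are hypothetical.

```shell
#!/bin/sh
# with_lock LOCKFILE CMD [ARGS...]
# Run CMD only if no other process currently holds a lock on LOCKFILE.
with_lock() {
  lockfile="$1"
  shift
  # -n: exit non-zero immediately instead of queueing behind the previous run
  flock -n "$lockfile" "$@"
}

# The same idea written inline in a crontab entry (hypothetical paths):
#   */5 * * * * flock -n /run/lock/poll.lock /opt/scripts/poll.sh
```

Every job that can overrun needs its own lock file and its own wrapper, which is exactly the kind of bookkeeping that ends up scattered across crontabs.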

Here are the five issues side by side, along with two related gaps (failure alerts and resource limits):

| Problem | cron | systemd timer |
| --- | --- | --- |
| Logging | manual redirection | collected by journald automatically |
| Dependencies | impossible | unit dependencies via After, Requires |
| Missed runs | lost | caught up with Persistent=true |
| Environment | empty env, PATH trap | declared explicitly in the unit file |
| Concurrency | manual flock | service units are single-instance by default |
| Failure alerts | mail (usually unconfigured) | OnFailure handler |
| Resource limits | impossible | CPUQuota, MemoryMax, and friends |

Anatomy of a Timer — the .timer/.service Pair

A systemd timer consists of two unit files: a .timer unit that defines the "when" and a .service unit that defines the "what." If they share a name, they are paired automatically.

  backup.timer  ──────triggers──────▶  backup.service
  (when to run)                        (what to run)
       │                                  │
       │ OnCalendar=...                   │ ExecStart=/opt/backup.sh
       │ Persistent=true                  │ User=backup
       │ RandomizedDelaySec=...           │ MemoryMax=1G
       ▼                                  ▼
  systemctl list-timers              journalctl -u backup.service

A concrete example: run a backup every night at 3 a.m.

# /etc/systemd/system/backup.service
[Unit]
Description=Nightly backup job
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
ExecStart=/opt/scripts/backup.sh
User=backup
Nice=10
IOSchedulingClass=idle
MemoryMax=1G
TimeoutStartSec=2h

# /etc/systemd/system/backup.timer
[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true
RandomizedDelaySec=15m

[Install]
WantedBy=timers.target

You only enable the timer side. The service is triggered by the timer, so it is not enabled itself.

sudo systemctl daemon-reload
sudo systemctl enable --now backup.timer

# Want to run it once right now? Start the service directly
sudo systemctl start backup.service

The reason to use Type=oneshot is to tell systemd this is a "run and exit" job. And as the example shows, sprinkling on resource controls like Nice, IOSchedulingClass, and MemoryMax keeps the backup job from interfering with production traffic — something cron could never dream of.

OnCalendar Syntax in Detail

OnCalendar is both easier to read and more expressive than cron expressions. The base format is:

DayOfWeek Year-Month-Day Hour:Minute:Second
(every field is optional, * is a wildcard, / is a step, , is a list, .. is a range)

The fastest way to learn is by example.

# Every day at 3 a.m.
OnCalendar=*-*-* 03:00:00

# Top of every hour
OnCalendar=hourly
# Explicit equivalent of the above
OnCalendar=*-*-* *:00:00

# Every 15 minutes
OnCalendar=*:0/15

# Weekdays (Mon-Fri) at 9 a.m.
OnCalendar=Mon..Fri *-*-* 09:00:00

# Every Monday and Thursday at 06:30
OnCalendar=Mon,Thu *-*-* 06:30:00

# Midnight on the 1st of every month
OnCalendar=*-*-01 00:00:00

# 11 p.m. on the last day of the month (~ prefix counts back from month end)
OnCalendar=*-*~01 23:00:00

# First day of each quarter (Jan, Apr, Jul, Oct 1st) at 2 a.m.
OnCalendar=*-01,04,07,10-01 02:00:00

# Every 2 hours, at half past (00:30, 02:30, ...)
OnCalendar=00/2:30:00

# Multiple schedules in one timer — multiple OnCalendar lines are ORed together
OnCalendar=Mon..Fri 09:00
OnCalendar=Sat,Sun 11:00

Always verify that your expression parses the way you intended. systemd-analyze calendar even computes the next trigger time for you.

$ systemd-analyze calendar "Mon..Fri *-*-* 09:00:00"
  Original form: Mon..Fri *-*-* 09:00:00
Normalized form: Mon..Fri *-*-* 09:00:00
    Next elapse: Fri 2026-06-12 09:00:00 KST
       (in UTC): Fri 2026-06-12 00:00:00 UTC
       From now: 14h left

# Check the next 5 trigger times at once
$ systemd-analyze calendar --iterations=5 "*-*-01 02:00"

Monotonic Timers — Relative to Boot or Activation

There are also timers defined not by the calendar, but as "this long after some reference point."

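[Timer]
# First run 15 minutes after boot, then again 1 hour after each activation of the service
OnBootSec=15min
OnUnitActiveSec=1h

OnUnitActiveSec is measured from when the service was last activated. If you want the interval counted from when the previous run ended, the documented setting for that is OnUnitInactiveSec. Either way, systemd does not start a second instance of a service that is still running, so runs do not overlap even when job duration fluctuates. This is especially handy for polling-style jobs.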

Persistent=true — Catching Up on Missed Runs

This is the timer's killer feature. With Persistent=true, systemd records the last trigger time of the timer on disk. If the system was powered off and missed the schedule, it runs the job once immediately after the next boot.

[Timer]
OnCalendar=daily
Persistent=true

If you want "daily backup" guarantees on a laptop or an intermittently powered dev box, this solves it without anacron. The records live here:

$ ls /var/lib/systemd/timers/
stamp-backup.timer  stamp-fstrim.timer  stamp-logrotate.timer

# The mtime of the stamp file is the last trigger time
$ stat -c '%y' /var/lib/systemd/timers/stamp-backup.timer
2026-06-11 03:07:42.000000000 +0900
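
Because the stamp file's mtime is the last trigger time, a simple staleness check can be built on top of it. This is a sketch under assumptions: it relies on GNU stat (the -c %Y format), the 26-hour threshold for a daily job is an arbitrary choice, and the stamp_age_hours name is hypothetical. Stamp files only exist for timers that set Persistent=true.

```shell
#!/bin/sh
# stamp_age_hours FILE
# Print the number of whole hours since FILE was last modified.
# Assumption: GNU stat, which prints the mtime as epoch seconds with -c %Y.
stamp_age_hours() {
  now="$(date +%s)"
  mtime="$(stat -c %Y "$1")" || return 1
  echo $(( (now - mtime) / 3600 ))
}

# Hypothetical check: warn if backup.timer has not fired for more than 26 hours
stamp=/var/lib/systemd/timers/stamp-backup.timer
if [ -e "$stamp" ] && [ "$(stamp_age_hours "$stamp")" -gt 26 ]; then
  echo "backup.timer last fired more than 26h ago" >&2
fi
```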

One caveat: right after boot, all catch-up timers can pile up and fire at once. That is why pairing it with RandomizedDelaySec, described next, is the standard practice.

RandomizedDelaySec and AccuracySec

To avoid the "thundering herd" of many servers hammering an external API or backup storage at the exact same moment, randomize the delay.

[Timer]
OnCalendar=daily
# Random delay between 0 and 4 hours — spreads your fleet across the window
RandomizedDelaySec=4h
# Want the same offset on the same machine every time? (systemd 247+)
FixedRandomDelay=true

AccuracySec is the opposite knob: "how precisely should I be woken up?" If you do not know the default is one minute, it can surprise you. For second-level precision, lower it explicitly.

[Timer]
OnCalendar=*-*-* 09:00:00
# Default 1min → second-level precision
AccuracySec=1s

Conversely, in power-sensitive environments (laptops, embedded), you can raise AccuracySec to batch CPU wakeups together.

Monitoring — list-timers and journalctl

Observability is the biggest win over cron. First, see the state of every timer on one screen:

$ systemctl list-timers --all
NEXT                         LEFT     LAST                         PASSED  UNIT            ACTIVATES
Fri 2026-06-12 00:00:00 KST  5h left  Thu 2026-06-11 00:00:01 KST  19h ago logrotate.timer logrotate.service
Fri 2026-06-12 03:00:00 KST  8h left  Thu 2026-06-11 03:11:02 KST  15h ago backup.timer    backup.service

NEXT and LAST are right there. With crontab you had to compute that yourself. Per-job logs accumulate in journald automatically:

# All logs for the backup service (stdout/stderr included)
journalctl -u backup.service

# Only logs since yesterday, newest first
journalctl -u backup.service --since yesterday -r

# Check the exit code of the last run
systemctl status backup.service

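# Curious about run durations? Print the start and finish lines with timestamps
# (recent systemd logs "Starting ..." and "Finished ..."; older releases may word this differently)
journalctl -u backup.service -o short-iso | grep -Ei 'starting|finished'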

Timestamps, log rotation, and exit-code tracking all come for free.

Failure Notifications — the OnFailure Handler

This is the pattern that prevents the classic "the backup silently failed for three months" incident. Attach OnFailure to the service unit, and when the unit ends in a failed state, the specified unit gets started.

# Add to /etc/systemd/system/backup.service
[Unit]
Description=Nightly backup job
OnFailure=notify-failure@%n.service

Build the notification handler as a template unit and reuse it for every job. %n is substituted with the name of the failed unit.

# /etc/systemd/system/notify-failure@.service
[Unit]
Description=Send failure notification for %i

[Service]
Type=oneshot
# SLACK_WEBHOOK_URL must reach the script, for example via Environment= or EnvironmentFile= here
ExecStart=/opt/scripts/notify-failure.sh %i

#!/usr/bin/env bash
# /opt/scripts/notify-failure.sh: failure alert via Slack webhook
set -euo pipefail
UNIT="$1"
# Abort with a clear message if the unit did not pass the webhook URL in
: "${SLACK_WEBHOOK_URL:?SLACK_WEBHOOK_URL is not set}"
HOST="$(hostname -f)"
# Attach the last 20 log lines before the failure
LOG="$(journalctl -u "$UNIT" -n 20 --no-pager --output=cat)"
PAYLOAD="$(jq -n --arg t "[FAIL] $UNIT on $HOST" --arg l "$LOG" \
  '{text: ($t + "\n" + $l)}')"
curl -sf -X POST -H 'Content-Type: application/json' \
  -d "$PAYLOAD" "$SLACK_WEBHOOK_URL"

Testing is simple: trigger a service that fails on purpose.

sudo systemd-run --unit=failtest --property=OnFailure=notify-failure@failtest.service /bin/false
journalctl -u notify-failure@failtest.service

If you also want success notifications, OnSuccess= (added in systemd 249) works with the exact same pattern.

User Timers — Running Your Jobs Without Root

Personal jobs (dotfiles backup, mail sync, dev DB cleanup) do not need to touch the system scope — run them as user timers.

# Unit file location
mkdir -p ~/.config/systemd/user

# After writing ~/.config/systemd/user/mail-sync.service and mail-sync.timer
systemctl --user daemon-reload
systemctl --user enable --now mail-sync.timer
systemctl --user list-timers
journalctl --user -u mail-sync.service

The trap that catches almost everyone here is lingering. By default, user timers run only while that user has a live session. Log out of SSH and your timers stop with you. To keep them running regardless of login state, enable lingering:

# Keep the user systemd instance alive without a login
sudo loginctl enable-linger "$USER"

# Verify
loginctl show-user "$USER" --property=Linger

If you are replacing a "personal cron under my account" on a server, this is the one line you must remember.
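
The two mail-sync unit files mentioned above are never shown, so here is a minimal sketch of what they might contain, written out by a small helper. Everything specific is an assumption for illustration: the write_mail_sync_units name, the choice of mbsync as the sync command, and the 15-minute schedule.

```shell
#!/bin/sh
# write_mail_sync_units DIR
# Create mail-sync.service and mail-sync.timer inside DIR.
write_mail_sync_units() {
  dir="$1"
  mkdir -p "$dir"

  cat > "$dir/mail-sync.service" <<'EOF'
[Unit]
Description=Sync mail (example)

[Service]
Type=oneshot
# Assumption: mbsync is the sync tool; substitute your own command
ExecStart=/usr/bin/mbsync -a
EOF

  cat > "$dir/mail-sync.timer" <<'EOF'
[Timer]
OnCalendar=*:0/15
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

# Typical call, followed by the daemon-reload and enable commands shown earlier:
#   write_mail_sync_units "$HOME/.config/systemd/user"
```

Note that a user unit has no User= line: it already runs as the account that owns the user manager.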

From cron to Timers — the Migration Mapping Table

Here is the correspondence table for porting existing crontab entries to OnCalendar.

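| cron expression | OnCalendar expression | Meaning |
| --- | --- | --- |
| 0 3 * * * | *-*-* 03:00:00 | daily at 03:00 |
| */15 * * * * | *:0/15 | every 15 minutes |
| 0 * * * * | hourly | top of every hour |
| 0 9 * * 1-5 | Mon..Fri 09:00 | weekdays at 09:00 |
| 0 0 1 * * | *-*-01 00:00:00 | midnight on the 1st |
| 0 2 1 1,4,7,10 * | *-01,04,07,10-01 02:00 | quarter start at 02:00 |
| @reboot | OnBootSec=1min (a monotonic setting, not OnCalendar) | 1 minute after boot |
| @daily | daily | daily at midnight |
| @weekly | weekly | Mondays at midnight (cron's @weekly fires on Sundays; use Sun 00:00 for an exact match) |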

The same content in copy-paste-friendly form:

cron:  0 3 * * *          →  OnCalendar=*-*-* 03:00:00
cron:  */15 * * * *       →  OnCalendar=*:0/15
cron:  0 * * * *          →  OnCalendar=hourly
cron:  0 9 * * 1-5        →  OnCalendar=Mon..Fri 09:00
cron:  0 0 1 * *          →  OnCalendar=*-*-01 00:00:00
cron:  0 2 1 1,4,7,10 *   →  OnCalendar=*-01,04,07,10-01 02:00
cron:  @reboot            →  OnBootSec=1min
cron:  @daily             →  OnCalendar=daily
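
For larger crontabs the mapping above can be partly mechanized. The following is a deliberately limited sketch, not a full cron parser: it handles only "*", plain numbers, comma lists, "*/N" steps, and N-M ranges in the weekday field, and it refuses entries that restrict both day-of-month and day-of-week, because cron ORs those two fields while OnCalendar ANDs them. All function names are hypothetical. Always confirm the result with systemd-analyze calendar.

```shell
#!/bin/sh
# cron_to_oncalendar 'MIN HOUR DOM MON DOW' -> prints an OnCalendar value

dow_name() {  # cron weekday number -> systemd weekday name (0 and 7 are Sunday)
  case "$1" in
    0|7) echo Sun ;; 1) echo Mon ;; 2) echo Tue ;; 3) echo Wed ;;
    4) echo Thu ;; 5) echo Fri ;; 6) echo Sat ;; *) return 1 ;;
  esac
}

conv_dow() {  # one weekday item: N or N-M
  case "$1" in
    [0-7]-[0-7]) printf '%s..%s' "$(dow_name "${1%-*}")" "$(dow_name "${1#*-}")" ;;
    [0-7])       dow_name "$1" ;;
    *)           return 1 ;;
  esac
}

conv_num() {  # $1 = item, $2 = first value of the field (0 for min/hour, 1 for day/month)
  n_item="$1"
  case "$n_item" in
    '*')         printf '*' ;;
    '*/'*)       printf '%s/%s' "$2" "${n_item#*/}" ;;
    ''|*[!0-9]*) return 1 ;;                      # ranges etc. are out of scope
    *)
      case "$n_item" in 0[0-9]) n_item="${n_item#0}" ;; esac
      printf '%02d' "$n_item" ;;
  esac
}

conv_list() {  # $1 = cron field, $2 = converter function, $3 = extra argument
  l_rest="$1"
  l_out=""
  while :; do
    l_item="${l_rest%%,*}"
    l_piece="$("$2" "$l_item" "$3")" || return 1
    l_out="${l_out:+$l_out,}$l_piece"
    case "$l_rest" in *,*) l_rest="${l_rest#*,}" ;; *) break ;; esac
  done
  printf '%s' "$l_out"
}

cron_to_oncalendar() {
  read -r c_min c_hour c_dom c_mon c_dow c_extra <<EOF
$1
EOF
  [ -n "$c_dow" ] && [ -z "$c_extra" ] || return 1
  # cron ORs day-of-month with day-of-week; OnCalendar ANDs them, so refuse
  if [ "$c_dom" != '*' ] && [ "$c_dow" != '*' ]; then return 1; fi
  o_min="$(conv_list "$c_min" conv_num 0)"   || return 1
  o_hour="$(conv_list "$c_hour" conv_num 0)" || return 1
  o_dom="$(conv_list "$c_dom" conv_num 1)"   || return 1
  o_mon="$(conv_list "$c_mon" conv_num 1)"   || return 1
  o_dow=""
  if [ "$c_dow" != '*' ]; then
    o_dow="$(conv_list "$c_dow" conv_dow x)" || return 1
    o_dow="$o_dow "
  fi
  printf '%s*-%s-%s %s:%s:00\n' "$o_dow" "$o_mon" "$o_dom" "$o_hour" "$o_min"
}

cron_to_oncalendar '0 9 * * 1-5'
```

The output is always the fully spelled-out form, so "*/15 * * * *" comes back as "*-*-* *:0/15:00", which is equivalent to the shorter *:0/15 in the table.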

For the migration itself, I recommend this order:

  1. Dump the full list with crontab -l and categorize the jobs.
  2. For each job, write the .service first and confirm a manual systemctl start succeeds. This is where you flush out every PATH and environment issue.
  3. Write the .timer and validate the expression with systemd-analyze calendar.
  4. Enable the timer and comment out the corresponding crontab line in the same change, so the job does not run twice, then observe the timer for one full cycle.
  5. Confirm healthy behavior via journalctl, then delete the crontab line.
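
Step 1 can be sketched as a small filter that turns crontab text into a schedule/command inventory. This is an illustration only: the list_cron_jobs name is made up, runs of whitespace inside the command are collapsed to single spaces, and system crontabs with an extra user column (/etc/crontab, /etc/cron.d) are not handled.

```shell
#!/bin/sh
# list_cron_jobs
# Read crontab text on stdin and print one "schedule<TAB>command" line per job.
list_cron_jobs() {
  awk '
    /^[ \t]*#/ { next }                                # comment lines
    /^[ \t]*$/ { next }                                # blank lines
    /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ { next }    # VAR=value lines
    {
      n = ($1 ~ /^@/) ? 1 : 5     # @daily and friends use a single schedule field
      sched = $1
      for (i = 2; i <= n; i++) sched = sched " " $i
      cmd = $(n + 1)
      for (i = n + 2; i <= NF; i++) cmd = cmd " " $i
      print sched "\t" cmd
    }'
}

# Usage:
#   crontab -l | list_cron_jobs
```

Sorting that output by command makes it easy to group jobs that belong in the same .service.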

One-Off Jobs with systemd-run — Transient Timers

For one-off jobs not worth a unit file, create an ad-hoc timer with systemd-run. It is the modern replacement for the at command.

# Run once, 30 minutes from now
systemd-run --on-active=30m /opt/scripts/cleanup.sh

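# Run at the next 23:00. A bare "23:00" matches every day, so stop the timer
# afterwards or spell out the full date if you need a strict one-off
systemd-run --on-calendar="23:00" /usr/bin/systemctl restart myapp

# As your user: first run 1 minute from now, then again 2 hours after each activation
# (--on-unit-active on its own has no reference point until the service has run once)
systemd-run --user --on-active=1min --on-unit-active=2h --unit=poll-feed ~/bin/poll-feed.sh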

# Inspect and cancel transient timers
systemctl list-timers
systemctl stop run-r1a2b3c4.timer

Transient units disappear on reboot — which matches the meaning of "temporarily, starting now" exactly.

Timezone Pitfalls

Schedulers plus timezones are a perennial source of incidents. Three things to know.

First, OnCalendar is interpreted in the system local timezone by default. If your servers have different timezones, the same unit file fires at different moments. You can pin the timezone explicitly:

[Timer]
# Pin to UTC — recommended for multi-region fleets
OnCalendar=*-*-* 03:00:00 UTC

# Or a specific region (tzdata names)
OnCalendar=*-*-* 09:00:00 Asia/Seoul

Second, on DST transition days there are "times that do not exist" and "times that exist twice." A 02:30 schedule on a European server, for instance, can be skipped on the spring-forward day. For critical jobs, pinning to UTC sidesteps DST entirely.

Third, Persistent catch-up combined with a timezone change can produce unintuitive behavior — after changing the timezone, always confirm NEXT looks right with systemctl list-timers.

# Instantly verify which timezone an expression resolves in, and when
systemd-analyze calendar "*-*-* 03:00:00 UTC"
timedatectl   # check the system timezone

Real-World Recipes

Recipe 1 — Certificate Renewal (certbot)

The certbot.timer shipped by distro packages is already best practice. If you build your own, this is the skeleton:

# certbot-renew.timer
[Timer]
OnCalendar=*-*-* 00,12:00:00
RandomizedDelaySec=12h
Persistent=true

[Install]
WantedBy=timers.target

Twice-daily attempts plus up to 12 hours of random delay is the load-spreading pattern recommended by Let's Encrypt. Accumulated renewal failures become a certificate-expiry incident, so the OnFailure alert is mandatory here.

Recipe 2 — Separating DB Dump and Upload

Split backup creation and remote upload into separate units tied by dependency, and pinpointing failures becomes easy.

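# db-dump.service: create the dump
[Service]
Type=oneshot
ExecStart=/opt/scripts/pg-dump.sh
# Queue the upload only if the dump succeeded. --no-block returns without waiting,
# so an upload failure is reported on db-upload.service instead of failing the dump.
# On systemd 249+, OnSuccess=db-upload.service in [Unit] is a declarative alternative.
ExecStartPost=/usr/bin/systemctl start --no-block db-upload.service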

Recipe 3 — Nightly Batch with Resource Limits

# nightly-batch.service
[Service]
Type=oneshot
ExecStart=/opt/batch/run-nightly.sh
CPUQuota=50%
MemoryMax=2G
IOSchedulingClass=idle
# Kill and mark failed if it runs past 4 hours → triggers the OnFailure alert
TimeoutStartSec=4h

Even if the batch runs wild, cgroup isolation keeps it from stealing CPU, memory, or IO from production services.

Anti-Patterns — Do Not Do These

  1. Enabling the service unit. If a timer-triggered job has its service enabled too, it runs at every boot. Enable only the .timer.
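  2. Running long batches as Type=simple. Without oneshot, systemd treats the unit as started the moment the process is spawned, so systemctl start returns immediately and anything ordered After= the job does not wait for it to finish. A later failure still marks the unit failed, but nothing that was waiting on the start will see it.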
  3. Doing everything in one shell line inside the timer. Cramming a bash -c one-liner pipeline into ExecStart reproduces exactly the readability problems of the cron era. Split it into a script file.
  4. Overusing Persistent=true. Apply it only to jobs that are safe to run immediately after boot. If the job needs the network, also declare After=network-online.target on the service.
  5. Expecting precise scheduling without knowing the AccuracySec default is one minute.
  6. Forgetting lingering for user timers. Your timers die the moment your SSH session ends.
  7. Migrating without monitoring. Half the value of moving to timers comes from OnFailure and journald. A migration without alert handlers is half a migration.

A Critical View — When cron Is Still the Right Call

To be fair, here are the opposing arguments that held up well in the HN thread.

  • Portability. crontab is the same on BSD, macOS, and inside containers. If your org mixes in systemd-less environments (Alpine-based containers, say), standardizing on cron can be the simpler call.
  • The beauty of one line. For a genuinely trivial job, two files plus a daemon reload does feel heavy. Though systemd-run gives you a one-line compromise.
  • Learning cost. If the whole team knows cron syntax, switching has a cost. The counterargument — "let an AI agent generate the unit files" — is real, but humans still need enough knowledge to review them.
  • Neither does distributed scheduling. Orchestrating jobs across multiple nodes is beyond both cron and timers. That territory belongs to Kubernetes CronJob or workflow engines.

The reasonable conclusion: for recurring jobs on a single host, timers are superior on nearly every axis, and cron survives only where portability is the dominant requirement.

Closing Thoughts

Moving from cron to systemd timers is less a "scheduler swap" and more a "promotion of recurring jobs to first-class citizens." Logs go to journald, failures go to OnFailure, dependencies go to the unit graph, resources go to cgroups — you are folding your scheduled jobs into infrastructure the system already has.

Here is a first step you can take today: open crontab -l, pick just one job that matters most, and port it to a .service + .timer pair. Validate the expression with systemd-analyze calendar, wire up the OnFailure alert, and you are done. Once you have done it once, migrating the rest is only a matter of time.
