www.jdmz.net troy.jdmz.net

Cron on Linux
Scheduled command execution for your Linux server or workstation
This document explains how to configure the cron server to run programs automatically for you every year, month, week, day, hour, minute, and many combinations of these using Red Hat Fedora Core 3.
Cron has many uses. I have used cron to create nightly backups, update server room temperature web pages at various intervals, generate new email '.signature' files every five minutes, and synchronize files nightly via SSH over the internet. I am sure there are many more.
Cron Documentation
Cron documentation is not bad (not bad at all), but I may have to know what to look up and how. On a Red Hat-ish system like Fedora Core 3, I can use rpm to look up the available man page files. First I find the main cron package:
  $ rpm -qa | grep cron
  vixie-cron-4.1-33_FC3
  anacron-2.3-32
  crontabs-1.10-7
which turns out to be vixie-cron. To see the package description and the list of files it installs, I use rpm again:
  $ rpm -qil vixie-cron
  Name        : vixie-cron                   Relocations: (not relocatable)
  Version     : 4.1                               Vendor: Red Hat, Inc.
  Release     : 33_FC3                        Build Date: Thu 14 Apr 2005 06:40:32 PM CDT
  Install Date: Mon 18 Apr 2005 04:26:48 AM CDT      Build Host: tweety.build.redhat.com
  Group       : System Environment/Base       Source RPM: vixie-cron-4.1-33_FC3.src.rpm
  Size        : 114302                           License: distributable
  Signature   : DSA/SHA1, Thu 14 Apr 2005 09:36:49 PM CDT, Key ID b44269d04f2a6fd2
  Packager    : Red Hat, Inc. 
  Summary     : The Vixie cron daemon for executing specified programs at set times.
  Description :
  The vixie-cron package contains the Vixie version of cron. Cron is a
  standard UNIX daemon that runs specified programs at scheduled times.
  Vixie cron adds better security and more powerful configuration
  options to the standard version of cron.
  /etc/cron.d
  /etc/pam.d/crond
  /etc/rc.d/init.d/crond
  /etc/sysconfig/crond
  /usr/bin/crontab
  /usr/sbin/crond
  /usr/share/man/man1/crontab.1.gz
  /usr/share/man/man5/crontab.5.gz
  /usr/share/man/man8/cron.8.gz
  /usr/share/man/man8/crond.8.gz
  /var/spool/cron
From this output I can see that there are a few man pages I can view for more insight into cron:
  $ man crontab
  $ man -s 5 crontab
  $ man cron
  $ man crond
Use the command 'man man' to see why the '-s 5' needed to be used to view the second crontab man page.
Using Cron
To use cron, it is easiest to start with a blank slate. The crontab command will do that for me, but by default I will need to know how to edit and save files with vi. If I am not comfortable with vi, and do not want to learn to use it, I can use another editor. To use nano (after making sure it is installed with 'which nano'), I could execute this command:
  $ export EDITOR=nano
before using the crontab command:
  $ crontab -e
which will start the editor (or vi) and load the current cron table file for this user, or a blank file if none exists.
If I don't want to set a MAILTO email address (after reading about it in the documentation above), the first line I want in my crontab is a comment. More specifically, this comment:
  #mh hd dm my dw command
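(every line starting with a pound symbol (#) is a comment) because it reminds me that the space-separated fields are, in order: minute of the hour (mh), hour of the day (hd), day of the month (dm), month of the year (my), day of the week (dw), and the command to run. The command can contain spaces (and need not be quoted), but the other fields must not contain spaces.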
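To see how one line breaks into those six fields, here is a minimal sketch. The variable names simply follow the comment line above, and the sample line is only an illustration; this shows the splitting idea, not how cron itself is written.

```shell
#!/bin/sh
# A sample crontab line: the first five fields hold no spaces,
# the sixth (the command) may.
LINE='0 12 * * * /usr/bin/top -n 1 -b -S'

# read splits on whitespace and gives everything left over to the last
# variable, so the command keeps its spaces.
read -r mh hd dm my dw command <<EOF
$LINE
EOF

echo "minute of hour : $mh"
echo "hour of day    : $hd"
echo "day of month   : $dm"
echo "month of year  : $my"
echo "day of week    : $dw"
echo "command        : $command"
```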
Since I like the output and format of the top command, I'll research it a bit (using the 'man top' and 'top --help' commands) and put together a command line I can use in a cron job:
  /usr/bin/top -n 1 -b -S
Notice that I used the whole path to the top executable. Since cron jobs may not provide the same working environment (path, aliases, ...) I get at the command prompt, it is wise to use full paths to all commands and scripts I use. I test the command a couple times to make sure I get the output I thought I was going to get. The output of this command (by default) will be sent to my user via email whenever this cron job executes.
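One way to get a feel for that reduced environment before a job ever runs from cron is to start the command from an almost empty environment. This is only a rough sketch: 'run_like_cron' is a name I made up, and the PATH and SHELL values are assumptions, so check 'man -s 5 crontab' for what the local cron really sets.

```shell
#!/bin/sh
# run_like_cron COMMAND
# Run COMMAND through /bin/sh with nearly everything stripped from the
# environment (no aliases, short PATH), roughly as a cron job would see it.
# The PATH and SHELL values here are assumptions for this sketch.
run_like_cron() {
  env -i HOME="$HOME" LOGNAME="${LOGNAME:-$(id -un)}" \
      SHELL=/bin/sh PATH=/usr/bin:/bin \
      /bin/sh -c "$1"
}

# Show the search path the command would get.
run_like_cron 'echo "PATH is $PATH"'
```

A command that works at my prompt but fails inside 'run_like_cron' is probably leaning on something from my login environment.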
Crontab Examples
To use this command on a cron job line using 'crontab -e', I need to fill in all of the fields:
  0 12 * * * /usr/bin/top -n 1 -b -S
This command will execute every day at noon. The first five fields can be values, lists, ranges, or ranges with step values.
The first and second fields in the example are values. They are just simple integers that exist in the range of the field. The '0' in the first field could be any integer between '0' and '59' (minutes of the hour). The '12' in the second field could be any integer between '0' and '23' (hours of the day).
A list is more than one value separated by commas (with no spaces). An example using lists would be:
  0 0,6,12,18 * * * /usr/bin/top -n 1 -b -S
which adds midnight (0), 6 am (6), and 6 pm (18) to the times this job will execute, in addition to the original time of noon (12).
A range is just a lower number (in the range of possible values) and a higher number (also in the range of possible values) separated by a dash ('-'). A cron job using a range might be:
  0 0,6,12,18 * * 1-5 /usr/bin/top -n 1 -b -S
making the 'top' cron job execute four times a day, but only on weekdays (Monday through Friday). A list may also include a range in place of one (or more) of its values, like this:
  0 0,6,9-15,18 * * 1-5 /usr/bin/top -n 1 -b -S
which would allow this cron job to execute every hour from 9 am (9) until 3 pm (15), instead of just at noon (12).
The asterisk (*) fields in the example above are ranges, consisting of every value in the range from the first possible value through the last possible value, which are different for each field. In the original example, the third field's asterisk means '1-31' (days of the month), the fourth field's asterisk means '1-12' (months of the year), and the fifth field's asterisk means '0-7' (days of the week, Sunday being both values zero (0) and seven (7)).
Ranges with step values express lists like this:
  0 0,6,12,18 * * * /usr/bin/top -n 1 -b -S
in this abbreviated way:
  0 */6 * * * /usr/bin/top -n 1 -b -S
meaning every sixth value in the range of all possible values ('*' or '0-23'), counting from the first one (0, 6, 12, and 18). This way of expressing lists makes it much easier to run a cron job every 5 minutes:
  */5 * * * * /usr/bin/top -n 1 -b -S
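or every other day of the month (days 1, 3, 5, and so on, because the day of the month range starts at 1 and the count starts over each month):
  0 12 */2 * * /usr/bin/top -n 1 -b -S
in a way that is short and easy to read and interpret.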
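To check my reading of a field before trusting it in a crontab, I can expand it by hand. The function below is a small sketch of the rules described above (values, lists, ranges, '*', and step values); 'expand_field' is a hypothetical helper of my own, it does not handle names like 'mon', and it is not taken from cron's source.

```shell
#!/bin/sh
# expand_field FIELD LOW HIGH
# Print the values FIELD selects, where LOW and HIGH are the lowest and
# highest values allowed in that position (0 and 23 for hours, for example).
# Plain sh has no local variables, so the names used here are global.
expand_field() {
  field=$1 low=$2 high=$3
  out=
  set -f                                      # keep '*' from matching filenames
  set -- $(printf '%s' "$field" | tr ',' ' ') # split the list on commas
  set +f
  for item in "$@"; do
    case $item in
      */*) range=${item%%/*}; step=${item##*/} ;;  # range with a step value
      *)   range=$item; step=1 ;;
    esac
    case $range in
      '*') first=$low; last=$high ;;               # every possible value
      *-*) first=${range%%-*}; last=${range##*-} ;;
      *)   first=$range; last=$range ;;            # a single value
    esac
    v=$first
    while [ "$v" -le "$last" ]; do
      out="$out $v"
      v=$((v + step))
    done
  done
  echo "${out# }"
}

expand_field '0,6,9-15,18' 0 23   # prints: 0 6 9 10 11 12 13 14 15 18
expand_field '*/6' 0 23           # prints: 0 6 12 18
expand_field '*/2' 1 31           # odd days of the month, 1 through 31
```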
Cron Job Output
Sometimes I never want to receive the output of a command I put in my crontab. There are a couple of ways to do this. The "all or nothing" way is to set MAILTO="" on the first line of my crontab, but then I will get no output (that I don't arrange to get somehow via the command) for any of my cron jobs. I typically do not do this.
I regularly use output redirection to funnel my command output to a file (if I want a more permanent record of the cron job than an email), or to the 'electronic wastebin' (or 'bitbucket') which is /dev/null (if I want no record of the cron job). If I want to ping a list of hosts every 10 minutes, I can write a script called 'pinghosts.sh' that provides nicely formatted output from the 'ping' command. The script could automatically put its output to a file, but if I want to use 'pinghosts.sh' from the command line too, that might be a problem. Instead, I could do this:
  */10 * * * * /home/myuser/bin/pinghosts.sh > /home/myuser/pinghosts.log
and every ten minutes the output of 'pinghosts.sh' will replace the contents of the file 'pinghosts.log' ('>' causes the output of the 'pinghosts.sh' command to overwrite the contents of the 'pinghosts.log' file) in my home directory (if 'myuser' is my username). If I want 'pinghosts.log' to be more like a regular log file, tracking output over time, I need to use a different redirection symbol:
  */10 * * * * /home/myuser/bin/pinghosts.sh >> /home/myuser/pinghosts.log
('>>' causes the output of the 'pinghosts.sh' command to be appended to the 'pinghosts.log' file). If I plan to keep 'pinghosts.sh' output over time, I should make sure it doesn't produce more output than I have disk (or quota) space. I could just throw away the file when it gets too large, or maybe just every month or so (hmm...sounds like a job for cron, or maybe logrotate).
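If I went the "job for cron" route, the clean-up could be as small as this sketch. The function name 'rotate_if_large' and the size limit are my own assumptions for illustration; logrotate does this sort of thing far more thoroughly.

```shell
#!/bin/sh
# rotate_if_large FILE MAXBYTES
# If FILE is bigger than MAXBYTES, move it aside to FILE.old (replacing any
# earlier FILE.old) so the next '>>' starts a fresh log.
rotate_if_large() {
  file=$1 max=$2
  [ -f "$file" ] || return 0          # nothing to do if there is no log yet
  size=$(( $(wc -c < "$file") ))      # size in bytes
  if [ "$size" -gt "$max" ]; then
    mv -f "$file" "$file.old"
  fi
}

# Hypothetical limit of about 1 MB for the log from the example above.
rotate_if_large /home/myuser/pinghosts.log 1000000
```

Saved as a script (say, a hypothetical /home/myuser/bin/rotate-pinghosts.sh), it could get its own crontab line that runs shortly after midnight each day.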
If 'pinghosts.sh' produces any error messages, 'pinghosts.log' will not contain them. Instead, an email message containing the error messages will be sent to my user. If that is what I want, I am set, but many times I want error messages to be recorded in my logs of output, so the above examples will not do. To put all output into the file I need to do this [1]:
  */10 * * * * /home/myuser/bin/pinghosts.sh >> /home/myuser/pinghosts.log 2>&1
This works for shells like 'bash' (sh, ksh, zsh) and redirects STDERR (error output) to STDOUT (normal output), just after STDOUT has been redirected to the file 'pinghosts.log'.
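The order of the two redirections matters, and it is easy to try out away from cron. In this sketch 'noisy' is a made-up stand-in for 'pinghosts.sh' that writes one line to each output, and a temporary file stands in for the log:

```shell
#!/bin/sh
# noisy writes one line to STDOUT and one line to STDERR.
noisy() {
  echo "normal output"
  echo "error output" >&2
}

LOG=$(mktemp)

# STDOUT is sent to the file first, then STDERR follows it there.
noisy > "$LOG" 2>&1
BOTH=$(cat "$LOG")

# Reversed: STDERR is pointed at the original STDOUT (captured here by the
# command substitution) before STDOUT is sent to the file.
LEAKED=$(noisy 2>&1 > "$LOG")
ONLYOUT=$(cat "$LOG")

echo "file with '> file 2>&1':"
echo "$BOTH"
echo "file with '2>&1 > file':"
echo "$ONLYOUT"
echo "escaped the file: $LEAKED"

rm -f "$LOG"
```

In a crontab, whatever "escapes the file" in the second form is what would arrive by email.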
Sometimes I just want a cron job script to do what I want and never tell me about it. If I have a script that notifies me by email of "garbage day" every week, I don't want another email telling me it sent the first email successfully. To make this happen, instead of a log file I use /dev/null as the target of my output:
  0 19 * * 2 /home/myuser/bin/garbage-day-email.sh > /dev/null 2>&1
but this will never alert me if the email is unsuccessful, so I may opt for this version:
  0 19 * * 2 /home/myuser/bin/garbage-day-email.sh > /dev/null
instead. This way, at 7 pm on Tuesday evenings I will get an email cheerfully reminding me of "garbage day", or an uglier error message that does the same (if email is working).
Cron Job Scripts
Sometimes what I want to do from a cron job is a whole lot more complicated than a single command. When that kind of situation occurs, it is time to make a script (usually a shell script) to run from my crontab.
A simple backup of my home directory could be one command [2]:
  0 3 * * * /bin/tar czvf /backup/myuser-`/bin/date +%Y%m%d`.tgz /home/myuser
or with '%' characters escaped on many systems (like CentOS and Ubuntu) [3]:
  0 3 * * * /bin/tar czvf /backup/myuser-`/bin/date +\%Y\%m\%d`.tgz /home/myuser
which executes a backup of my home directory every day at 3 am (and uses a date code on the archive filename), but I may want to do some more work in the cron job to report more information and set some file attributes after the backup occurs. So, I would write a script to do these tasks:
  #!/bin/bash
  ARCHIVE=/backup/myuser-`/bin/date +%Y%m%d`.tgz
  /bin/tar czvf $ARCHIVE /home/myuser
  /bin/chown myuser:mygroup $ARCHIVE
  /bin/chmod 640 $ARCHIVE
  /bin/echo " "
  /bin/echo -n "Disk Space Used: "
  /bin/ls -sh $ARCHIVE
  /bin/echo "File Details:"
  /bin/ls -l $ARCHIVE
I could name it backup-myuser.sh and call it from my crontab:
  0 3 * * * /home/myuser/bin/backup-myuser.sh
and get my backups, control file ownership and permissions, and report on the resulting archive file. In addition to that, I can change the details of "what gets done" by editing a script and not messing around with my crontab unless I want to change the script name or when it runs.
In my opinion there is something wrong with this backup script. It uses only one shell variable ($ARCHIVE) to keep from refiguring the archive file name every time it is used, but it uses no others. When I am writing a cron script, I like to overuse shell variables. I like to put every executable path, every file path, and any text used in more than one location into shell variables. This can make the script file larger and it can make it smaller, depending on a few things, but I do not do it for saving some space in the script file. I do it for portability, maintainability, and (if I choose representative [4] variable names) readability reasons. If I copy or move the script where files are in different locations, for whatever reason, I can change the script quickly and easily and adapt it to its new environment.
A rewrite of the above backup example that follows the 'overuse shell variables' philosophy is here:
  #!/bin/bash
  # backup-myuser.sh

  # file paths
  TAR=/bin/tar
  CHOWN=/bin/chown
  CHMOD=/bin/chmod
  ECHO=/bin/echo
  LS=/bin/ls
  DATE=/bin/date

  # script variables
  DATECODE=`$DATE +%Y%m%d`
  ARCHIVE=/backup/myuser-$DATECODE.tgz
  BACKDIR=/home/myuser
  BUSER=myuser
  BGROUP=mygroup
  BMODE=640

  # do the work
  $TAR czvf $ARCHIVE $BACKDIR
  $CHOWN $BUSER:$BGROUP $ARCHIVE
  $CHMOD $BMODE $ARCHIVE

  # report the results
  $ECHO " "
  $ECHO -n "Disk Space Used: "
  $LS -sh $ARCHIVE
  $ECHO "File Details:"
  $LS -l $ARCHIVE
This example includes a few other "good things" (in my opinion) like whitespace (the blank lines separating different parts of the script), comments (the lines with pound signs (#) followed by explanatory text), and some single use shell variables ($BUSER, $BGROUP, and $BMODE). The first two things are aimed at improving the ability of others (or me, six months after writing it) to quickly read and understand this script. The single use shell variables are an effort to provide a single place (the 'script variables' section) to make any foreseen and simple changes to the script, hopefully reducing the probability of future editors (myself included) injecting bugs into the script.
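Keeping every executable path in a variable also makes it cheap to check them all when the script lands on a new host. This is only a sketch of that idea: 'check_executables' is a hypothetical helper, and the two paths shown are examples in the style of the 'file paths' section above.

```shell
#!/bin/sh
# file paths (adjust for the local system)
LS=/bin/ls
DATE=/bin/date

# check_executables PATH...
# Report each PATH that is not an executable file on STDERR, and return
# non-zero if any of them is missing.
check_executables() {
  missing=0
  for path in "$@"; do
    if [ ! -x "$path" ]; then
      echo "missing executable: $path" >&2
      missing=1
    fi
  done
  return $missing
}

check_executables "$LS" "$DATE" && echo "all file paths look usable"
```

In a real cron script I would probably exit at this point when the check fails, so the emailed error names the path to fix.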
Better Cron Job Scripts
I like to have up-to-date local mirrors of things like CPAN, Project Gutenberg, and Fedora Core installation and update files. These are all potentially long-running processes, so I will want to use some sort of locking mechanism. These "locks" will be used to tell new cron job processes that an older process is already on the job (it's just taking a while). The "locking" I use will check for a file in a directory (used only for cron job locks) and create it if it does not exist. Just before creating it, I use shell trap commands to set up the removal of that lock file on exit (regular script end and various errors).
I will also want to run the same script on a few different hosts, hopefully without having to do a lot of editing and setup beforehand. To make this happen I will need to test for the existence of some directories and files I need, and create and set their permissions correctly if they do not exist. Some errors take a lot of effort to handle, and if you must handle them it may make sense to handle them in Perl or Python instead of a command shell like bash. However, bash is more than capable of handling the "low hanging fruit" of directory and file existence and creation.
Here is the example:
  #!/bin/sh

  # rsync-fedora4-mirror

  # common source and destination
  CMNSRC=mirror.cs.wisc.edu::fedora-linux-core
  CMNDST=/var/ftp/pub/fedora
  # user
  CUSER=`/usr/bin/whoami`
  # directories
  CRONDIR=/home/$CUSER/cron
  LOCKDIR=$CRONDIR/lock
  EXFILE=$CRONDIR/fc4-excludes.txt
  # options
  GETSOURCES=0
  RSYNCOPTS="-rlHtS --delete --delete-excluded --exclude-from=$EXFILE"

  # files
  RSYNC=/usr/bin/rsync
  ECHO=/bin/echo
  RM=/bin/rm
  TOUCH=/bin/touch
  MKDIR=/bin/mkdir
  FIND=/usr/bin/find
  CHMOD=/bin/chmod
  # directory and file modes for cron and mirror files
  CDMODE=700
  CFMODE=600
  MDMODE=755
  MFMODE=644
  
  # testing for directories and files needed
  if [ ! -d $CRONDIR ]; then $MKDIR -p $CRONDIR; $CHMOD $CDMODE $CRONDIR; fi
  if [ ! -d $LOCKDIR ]; then $MKDIR -p $LOCKDIR; $CHMOD $CDMODE $LOCKDIR; fi
  if [ ! -f $EXFILE ]; then $TOUCH $EXFILE; $CHMOD $CFMODE $EXFILE; fi
  
  # lock file creation and removal
  LOCKFILE=$LOCKDIR/`basename $0`.lock
  [ -f $LOCKFILE ] && $ECHO $LOCKFILE exists && exit 0
  # signal 9 (KILL) cannot be trapped, so only 2 (INT) and 15 (TERM) are listed
  trap "{ $RM -f $LOCKFILE; exit 255; }" 2
  trap "{ $RM -f $LOCKFILE; exit 255; }" 15
  trap "{ $RM -f $LOCKFILE; exit 0; }" EXIT
  $TOUCH $LOCKFILE
 
  # now mirror and set permissions for each group of files
 
  # release
  RSYNCSRC=$CMNSRC/4/i386/os/
  RSYNCDST=$CMNDST/4/i386/os/
  if [ ! -d $RSYNCDST ]; then $MKDIR -p $RSYNCDST; fi
  cd $RSYNCDST
  $RSYNC $RSYNCOPTS $RSYNCSRC $RSYNCDST
  $FIND $RSYNCDST -type d -exec $CHMOD $MDMODE \{\} \;
  $FIND $RSYNCDST -type f -exec $CHMOD $MFMODE \{\} \;
  
  # updates
  RSYNCSRC=$CMNSRC/updates/4/i386/
  RSYNCDST=$CMNDST/updates/4/i386/
  if [ ! -d $RSYNCDST ]; then $MKDIR -p $RSYNCDST; fi
  cd $RSYNCDST
  $RSYNC $RSYNCOPTS $RSYNCSRC $RSYNCDST
  $FIND $RSYNCDST -type d -exec $CHMOD $MDMODE \{\} \;
  $FIND $RSYNCDST -type f -exec $CHMOD $MFMODE \{\} \;
  
  [ $GETSOURCES -lt 1 ] && exit 0
  
  # release sources
  RSYNCSRC=$CMNSRC/4/SRPMS/
  RSYNCDST=$CMNDST/4/SRPMS/
  if [ ! -d $RSYNCDST ]; then $MKDIR -p $RSYNCDST; fi
  cd $RSYNCDST
  $RSYNC $RSYNCOPTS $RSYNCSRC $RSYNCDST
  $FIND $RSYNCDST -type d -exec $CHMOD $MDMODE \{\} \;
  $FIND $RSYNCDST -type f -exec $CHMOD $MFMODE \{\} \;
  
  # updates sources
  RSYNCSRC=$CMNSRC/updates/4/SRPMS/
  RSYNCDST=$CMNDST/updates/4/SRPMS/
  if [ ! -d $RSYNCDST ]; then $MKDIR -p $RSYNCDST; fi
  cd $RSYNCDST
  $RSYNC $RSYNCOPTS $RSYNCSRC $RSYNCDST
  $FIND $RSYNCDST -type d -exec $CHMOD $MDMODE \{\} \;
  $FIND $RSYNCDST -type f -exec $CHMOD $MFMODE \{\} \;
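The "test for a directory, create it, set its mode" lines near the top of this script all follow one pattern, so they could be folded into a small function. This is just a sketch of that refactoring: 'ensure_dir' is a name I made up, and a temporary directory stands in for the real cron directory.

```shell
#!/bin/sh
MKDIR=/bin/mkdir
CHMOD=/bin/chmod

# ensure_dir DIR MODE
# Create DIR (and any missing parents) with MODE if it does not exist yet;
# leave an existing directory and its mode alone.
ensure_dir() {
  dir=$1 mode=$2
  if [ ! -d "$dir" ]; then
    $MKDIR -p "$dir" && $CHMOD "$mode" "$dir"
  fi
}

# Demonstration in a scratch area instead of /home/$CUSER/cron.
SCRATCH=$(mktemp -d)
ensure_dir "$SCRATCH/cron" 700
ensure_dir "$SCRATCH/cron/lock" 700
ls -ld "$SCRATCH/cron" "$SCRATCH/cron/lock"
rm -rf "$SCRATCH"
```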
This is a rather trusting example, as many problems could occur if the mirror you are using has "problems". Eliminate the "--delete" and "--delete-excluded" arguments from the RSYNCOPTS variable definition to be less trusting.
If you are not developing software based on Fedora Core RPMs, executing (as the user who will run the cron job):
  $ echo debug >> ~/cron/fc4-excludes.txt
will save space on the mirror, especially if you do not plan to use the 'debug' binaries and sources. However, if you do want to use many Fedora Core RPMs in your development plans, you will want to get the SRPMs, so change GETSOURCES to '1' instead of '0' and this script will get them for you.
I recommend using a mirror that is close to you (network wise), so changing the CMNSRC variable is a good thing to do. Unfortunately, you may have to change other parts of the script to make it work because not all mirrors are arranged identically.
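One more thought on the locking in this script: the lock is made in two steps (test for the file, then touch it), so two copies started at nearly the same moment could both pass the test. A common alternative, which is not what the script above does, is to use mkdir, because it tests and creates in a single step. A minimal sketch, with hypothetical function names and a temporary directory standing in for $LOCKDIR:

```shell
#!/bin/sh
# acquire_lock and release_lock are hypothetical helper names for this sketch.
# mkdir fails if the directory already exists, so the "test" and the
# "create" happen as one step.
acquire_lock() {
  mkdir "$1" 2>/dev/null
}

release_lock() {
  rmdir "$1" 2>/dev/null
}

# A temporary directory stands in for $LOCKDIR from the script above.
DEMODIR=$(mktemp -d)
LOCK=$DEMODIR/rsync-fedora4-mirror.lock

if acquire_lock "$LOCK"; then
  echo "first run: lock acquired"
fi

if ! acquire_lock "$LOCK"; then
  echo "second run: $LOCK exists, so exit quietly"
fi

release_lock "$LOCK"
rmdir "$DEMODIR"
```

In a real cron script the release would still belong in a trap, as in the example above, so the lock goes away when the script ends or is interrupted.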
That's all for now, but I may add more later. Good luck!
Notes:
[1] The magical incantation '2>&1' has more meaning when it is considered that, when running in a UNIX or UNIX-like environment, each file opened (for reading or writing) is assigned a file descriptor number. By default, all commands have three file descriptors open as they run: STDIN (standard input), STDOUT (standard output), and STDERR (standard error). These are assigned the numbers 0, 1, and 2 respectively. So in the given syntax, '2' means standard error, '>' means '2' is being redirected, '&' means '1' is a file descriptor (and not a file named '1'), and '1' means standard output. The reason '2>&1' is specified after STDOUT is redirected to a file is that the command is read by the shell interpreter from left to right and descriptors are resolved immediately. If '2>&1' came first, STDERR would go to the original STDOUT (and not the file), which is not what I want here.
[2] The command '/bin/date +%Y%m%d' is enclosed in backticks, which executes the command and leaves the command output in its place. The output of this command creates an eight digit integer like "20050601", which represents the current year (2005), month (06), and day (01) the command executes. I like this 'datecode' because it sorts well.
[3] The command '/bin/date +%Y%m%d' will not work as shown on many systems, as the '%' characters will have to be escaped (preceded with a backslash ('\') character) or they will be interpreted as a newline by cron. I thank Todd Allis for writing to tell me.
[4] By representative I mean the shell variable accurately represents what it replaces/contains to most people (or just those likely to view and edit the script). So $BACKDIR contains the directory to be backed up, and $BMODE contains the mode of the backup file. Different labels mean different things to different people, but some attention and effort dedicated to naming shell variables can go a long way towards making the script 'easy to digest'.
Author: Troy Johnson