Scheduling Multiple Experiments: The labschedule Tool

The labschedule program helps you run multiple experiments, either sequentially or in parallel.

labschedule [<options>] <executable> [<arguments>]

The --for option of labschedule specifies a set of for loops, nested in the order given on the command line. The innermost command is a call to labrun that runs <executable> with the given <arguments>. Each loop defines a variable whose name is the number of the loop preceded by a % (e.g., %1 for the first loop). Every occurrence of a variable on the command line is replaced, in turn, by each value of that loop's range. Loop ranges can be given in several ways; the simplest is to list the individual values on the command line.

$\Rightarrow$ Tutorial 13:: Use labschedule to execute four calls to labrun that run four experiments sorting n = 500, 1000, 1500 and 2000 numbers, with the second argument of sort-demo set to 10.
Solution:: labschedule --for '500 1000 1500 2000' sort-demo %1 10
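The loop expansion described above can be pictured with a small Python sketch. It is only an illustration of the substitution rule, not labschedule's actual implementation; the function name expand_commands is hypothetical, and the sketch prints the innermost commands rather than calling labrun.

```python
from itertools import product

def expand_commands(loop_values, template):
    """Yield one argument list per combination of loop values.

    loop_values: one list of values per loop, outermost loop first.
    template: argument list that may contain %1, %2, ... placeholders.
    (Hypothetical helper, written only to illustrate the substitution.)
    """
    # product() varies the last list fastest, which matches nested for loops.
    for combo in product(*loop_values):
        args = []
        for arg in template:
            # Substitute higher-numbered placeholders first so that %10
            # is not mistaken for %1 followed by a literal 0.
            for index in range(len(combo), 0, -1):
                arg = arg.replace(f"%{index}", str(combo[index - 1]))
            args.append(arg)
        yield args

commands = list(expand_commands([[500, 1000, 1500, 2000]],
                                ["sort-demo", "%1", "10"]))
for command in commands:
    print(" ".join(command))
# sort-demo 500 10
# sort-demo 1000 10
# sort-demo 1500 10
# sort-demo 2000 10
```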

Each call to labschedule creates a log file that records information about the tasks that were scheduled, including when they were started, when they finished, and the command used for scheduling. This file can be found in ./lab_log, along with a .out file containing the output of every successfully completed experiment and a .err file containing the output of every failed experiment. The file names have a form similar to those created by labrun:

<executable>-<date>-<time>.[out|log|err]

where <executable> is schedule by default. The calls to labrun issued by labschedule use the option --name=schedule-%1-%2- to indicate that schedule and the names of the loop variables should be used to name the output files of labrun.

$\Rightarrow$ Tutorial 14:: List the log and output files created by Tutorial 13.
Solution:: ls ./lab_log/schedule*

A loop range can also be specified using any Python expression. Particularly helpful here is the range expression. Given one integer argument n, range(n) expands to 0, ..., n-1. Two arguments specify a starting point other than 0 and the (excluded) end point, and a third argument specifies a step size other than 1.

$\Rightarrow$ Tutorial 15:: Reschedule the experiments from Tutorial 13 using the range expression to specify the loop values. Run labschedule in verbose mode to see what happens.
Solution:: labschedule -v --for 'range(500, 2001, 500)' sort-demo %1 10
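To check which values a range expression produces before scheduling anything, you can evaluate it in a Python interpreter. The snippet below uses only Python's built-in range; the end point is excluded, which is why the solution above uses 2001 rather than 2000.

```python
# One argument: start at 0, stop before the argument.
print(list(range(4)))              # [0, 1, 2, 3]

# Two arguments: explicit start, stop before the second argument.
print(list(range(2, 6)))           # [2, 3, 4, 5]

# Three arguments: explicit start, excluded stop, and step.
values = list(range(500, 2001, 500))
print(values)                      # [500, 1000, 1500, 2000]
```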

You will notice that the experiments were not actually rerun (unless something went wrong in Tutorial 13). The labschedule tool recognizes that the specified experiments have already completed successfully and, by default, does not run a successful experiment again. This behavior can be changed with the --noskip option.

$\Rightarrow$ Tutorial 16:: Rerun the experiments from Tutorial 13 and then examine the ./lab_log directory to find the files created by the rerun.
Solution:: labschedule -v --noskip --for 'range(500, 2001, 500)' sort-demo %1 10
ls ./lab_log/schedule*

Another alternative for rerunning the experiments would be to change the name associated with the experiment using the --name option.

The syntax used for supplying comment values for the log file (Tutorial ) is also available for specifying for loop ranges. In particular, the values can be read from a file by giving the name of the file after an @ on the command line.

$\Rightarrow$ Tutorial 17:: Run five experiments with n = 600, 1000, 1250, 1500, and 2000 and the second argument of sort-demo set to 10.
Solution:: The following input file, n_values, provides the 5 values for n:

600 1000 1250 1500 2000

labschedule --for @n_values sort-demo %1 10
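As a rough picture of the two forms of loop specification seen so far (an explicit list and an @file reference), the following Python sketch turns either form into a list of values. The helper parse_loop_spec is hypothetical, and the assumption that the file holds whitespace-separated values is taken from the n_values example above; labschedule's real parsing may differ, and Python expressions such as range are not handled here.

```python
import os
import tempfile

def parse_loop_spec(spec):
    """Return the loop values described by spec (hypothetical helper).

    '@name' reads whitespace-separated values from the file 'name';
    anything else is treated as a whitespace-separated list of values.
    """
    if spec.startswith("@"):
        with open(spec[1:]) as handle:
            return handle.read().split()
    return spec.split()

# Recreate the n_values file from the tutorial in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "n_values")
    with open(path, "w") as handle:
        handle.write("600 1000 1250 1500 2000\n")
    from_file = parse_loop_spec("@" + path)

print(from_file)                        # ['600', '1000', '1250', '1500', '2000']
print(parse_loop_spec("500 1000"))      # ['500', '1000']
```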

$\Rightarrow$ Tutorial 18:: Add a second loop that varies the second argument of sort-demo from 10 to 14, stepping by 1, as well as varying n as in the previous tutorial. Use the --print option to simply print the commands that would be executed without actually running them.
Solution:: labschedule --print --for @n_values --for 'range(10, 15)' sort-demo %1 %2

Without the --print option in Tutorial 18, 25 calls to labrun would have been made (five values in each of the two loops), resulting in 25 separate labrun log files and output files. You can use the --nesting option to reduce the number of files created. This option indicates how deeply nested the calls to labrun produced by labschedule should be. If the nesting level is less than the number of loops specified on the command line, the command that labschedule gives labrun to execute is another call to labschedule. In this second call, the --direct flag indicates that the second labschedule should run the executable it is given directly, without another call to labrun.

$\Rightarrow$ Tutorial 19:: Use the --nesting option to run the experiments from Tutorial 18 with only five calls to labrun. Change the name prefix for the experiment to sort using the --name option.
Solution:: labschedule --name=sort --nesting=1 --for @n_values --for 'range(10, 15)' sort-demo %1 %2
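The effect of the nesting level on the number of labrun calls can be worked out with a few lines of Python. This sketch assumes the reading given above, namely that labrun is invoked once per combination of the outermost loops up to the nesting level and that the remaining loops run inside each call; the function labrun_calls is hypothetical.

```python
from math import prod

def labrun_calls(loop_sizes, nesting=None):
    """Number of labrun invocations for the given loop sizes.

    loop_sizes: number of values in each loop, outermost first.
    nesting: how many loops sit outside the labrun call;
             None means all of them (no --nesting given).
    (Hypothetical helper based on the description in the text.)
    """
    depth = len(loop_sizes) if nesting is None else min(nesting, len(loop_sizes))
    return prod(loop_sizes[:depth])

# Five values in n_values and five values in range(10, 15).
sizes = [5, len(range(10, 15))]
print(labrun_calls(sizes))             # 25
print(labrun_calls(sizes, nesting=1))  # 5
```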

The --maxtasks option of labschedule allows you to specify that more than one task can be started at once. This generally makes sense only for multiprocessor systems.

$\Rightarrow$ Tutorial 20:: Execute the following 4 sort-demo experiments simultaneously, given as (n, second argument) pairs: (5000, 10), (10000, 10), (5000, 5), and (10000, 5).
Solution:: labschedule --maxtasks=4 --for '5000 10000' --for '5 10' sort-demo %1 %2
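The idea of capping the number of simultaneously running tasks can be illustrated with Python's standard library. This is a sketch of bounded parallelism in general, not of how labschedule implements --maxtasks; run_all is a hypothetical helper, and a Python one-liner stands in for sort-demo so that the example runs on any machine.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def run_all(commands, max_tasks):
    """Run each command as a subprocess, at most max_tasks at a time.

    Returns the exit codes in the same order as the commands.
    (Hypothetical helper, for illustration only.)
    """
    def run_one(command):
        return subprocess.run(command, capture_output=True).returncode

    with ThreadPoolExecutor(max_workers=max_tasks) as pool:
        return list(pool.map(run_one, commands))

# Stand-in for sort-demo: a one-liner that just echoes its two arguments.
stand_in = "import sys; print(sys.argv[1], sys.argv[2])"
commands = [
    [sys.executable, "-c", stand_in, str(n), str(second)]
    for n, second in product([5000, 10000], [5, 10])
]

exit_codes = run_all(commands, max_tasks=4)
print(exit_codes)  # [0, 0, 0, 0]
```

With max_tasks set to 1 the same commands would run one after another, which corresponds to sequential scheduling.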

Tobias Polzin 2002-11-18