utopya_jobscript module

Introduction

A job script is a (usually small) text file with commands that perform some task. A simple example is:

#! /usr/bin/env python

# do something:
print( "boe!" )

Such a script could be run in the foreground, in which case the user has to wait for the job to finish before control is regained. Standard output (“boe!”) and, if present, standard error are printed directly and can be watched by the user.

When the script is run in background, control is given back to the user while the script remains running. The standard output and error should be redirected to files.

Alternatively, the script could be submitted to a batch system. In this case, options to identify the job, to specify the destination of standard output and error, and to request resources (memory, CPUs) could be inserted in the header of the script:

#! /usr/bin/env python

#BSUB -J myjob
#BSUB -oo myjob.out
#BSUB -eo myjob.err

# do something:
print( "boe!" )

The classes provided by this module facilitate creation of job scripts for which the run destination (foreground, background, batch system) is flexible.

Jobscripts to run in foreground

To run a job in the foreground, use one of the following classes:

  • UtopyaJobScriptForeground

  • UtopyaJobScriptRedirect

Jobscripts to run in background

To run in background, use the UtopyaJobScriptBackground class.

Jobscripts to be submitted to a batch system

High performance clusters with a large number of processors and many users logged in at the same time are usually equipped with a batch system to handle jobs. Batch jobs are submitted to a queue, and the batch system empties the queue by assigning jobs to the first available processors.

Special commands are required to submit jobs to the queue, list the currently submitted and running jobs, and if necessary remove jobs from the queue.

Batch job files typically have special comments at the top to tell the batch system the destination queue, the name of output/error files to be used, required memory and maximum run time, etc.

Which batch system is available usually depends on the machine vendor and/or administrator. Each type of batch system has its own job handling commands and its own format for the batch options at the top of the job file. For each type, a separate class needs to be defined to handle creation and submission. A base class UtopyaJobScriptBatch is provided from which batch-type-specific classes can be derived; see its documentation for the methods to be re-defined.

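The following specific batch systems are already supported:

  • LSF, by the UtopyaJobScriptBatchLSF class;

  • PBS, by the UtopyaJobScriptBatchPBS class;

  • SLURM, by the UtopyaJobScriptBatchSlurm class;

  • LoadLeveler, by the UtopyaJobScriptBatchLoadLeveler class.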

If queue systems are fully occupied, testing the creation of job files could suffer from long waiting times. To avoid wasting precious development time, the special UtopyaJobScriptBatchTest class is provided. This will create jobs using fake job options, and run the script in foreground while redirecting standard output and error.

Class hierarchy

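The classes provided by this module have been derived with the following hierarchy:

  • utopya_base.UtopyaBase
      • UtopyaJobScript
          • UtopyaJobScriptForeground
          • UtopyaJobScriptRedirect
          • UtopyaJobScriptBackground
          • UtopyaJobScriptBatch
              • UtopyaJobScriptBatchTest
              • UtopyaJobScriptBatchLSF
              • UtopyaJobScriptBatchPBS
              • UtopyaJobScriptBatchSlurm
              • UtopyaJobScriptBatchLoadLeveler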

Classes

class utopya_jobscript.UtopyaJobScript

Bases: utopya_base.UtopyaBase

Base class for an object that can be used to create and submit a job script.

Derived classes probably need to re-define the Submit() method only.

Example of usage:

# init object:
jb = UtopyaJobScript()

# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )

# write:
jb.Create( 'myjob.jb', lines )

# submit created script
# (here dummy method that issues an error):
jb.Submit( 'myjob.jb' )

Create(jbfile, lines)

Create job file with a name provided by jbfile, and content in the lines argument.

The ‘lines’ should either be a list of ‘str’ objects, or a single ‘str’ object, but in both cases newline characters should be included.

The created job file is made executable to allow execution from a command line.
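The behaviour described for Create() can be imitated with the standard library alone. The following is a minimal sketch under that assumption, not the module's implementation; the function name create_job_file and the use of a temporary directory are chosen for illustration only:

```python
import os
import stat
import tempfile


def create_job_file(jbfile, lines):
    """Write 'lines' (a str or a list of str) to 'jbfile' and make it executable."""
    # accept a single str as well as a list of str;
    # newline characters are expected to be included already:
    if isinstance(lines, str):
        lines = [lines]
    with open(jbfile, "w") as f:
        f.writelines(lines)
    # add execute permission for user, group and others:
    mode = os.stat(jbfile).st_mode
    os.chmod(jbfile, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


# example: write a tiny script into a temporary directory
workdir = tempfile.mkdtemp()
jbfile = os.path.join(workdir, "myjob.jb")
create_job_file(jbfile, ['#! /usr/bin/env python\n', 'print( "boe!" )\n'])
```

Because writelines() does not add line endings, a line without a trailing newline would be glued to the next one; this is why the newline characters have to be part of the input.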

GetOptionsRc(rcfile, name='batch', rcbase='', env={})

Template for derived classes that need to include batch job options in a job file. This version returns an empty line.

Submit(jbfile, _indent='')

Template for the job submission method, to be re-defined by derived classes.

CheckStatus(pid_file, _indent='')

Template for method that checks status of submitted job.

class utopya_jobscript.UtopyaJobScriptForeground

Bases: utopya_jobscript.UtopyaJobScript

Class to create a job that runs in foreground.

Example of usage:

# init object:
jb = utopya.UtopyaJobScriptForeground()

# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )

# write:
jb.Create( 'myjob.jb', lines )

# run script in foreground:
jb.Submit( 'myjob.jb' )

Submit(jbfile, _indent='')

Run job file in foreground. Standard output and error are not redirected.

CheckStatus(pid_file, _indent='')

Check status of running job. A job running in the foreground is already finished. Therefore this method always returns the same value:

  • ‘stopped’: the process is already finished

class utopya_jobscript.UtopyaJobScriptRedirect

Bases: utopya_jobscript.UtopyaJobScript

Class to create a job for running in foreground, while redirecting standard output and standard error to files.

Example of usage:

# init object:
jb = utopya.UtopyaJobScriptRedirect()

# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )

# write:
jb.Create( 'myjob.jb', lines )

# run script in foreground, redirect output:
jb.Submit( 'myjob.jb' )

Submit(jbfile, _indent='')

Run job file in foreground, but with standard output and error redirected to files. The output and error files have the same name as the job file but extensions ‘.out’ and ‘.err’ respectively.
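As an illustration of this naming rule, the sketch below runs a command in the foreground and redirects its output with the standard subprocess module. It is not the code used by this class; the function name run_redirected is hypothetical, and the interpreter is called explicitly so that the example does not depend on the shebang line or on execute permissions:

```python
import os
import subprocess
import sys
import tempfile


def run_redirected(command, jbfile):
    """Run 'command' and wait for it; write output to <base>.out and <base>.err."""
    # replace the extension of the job file: myjob.jb -> myjob.out, myjob.err
    base, _ext = os.path.splitext(jbfile)
    with open(base + ".out", "w") as out, open(base + ".err", "w") as err:
        # this call blocks until the command has finished:
        result = subprocess.run(command, stdout=out, stderr=err)
    return result.returncode


# example: create a small script and run it
workdir = tempfile.mkdtemp()
jbfile = os.path.join(workdir, "myjob.jb")
with open(jbfile, "w") as f:
    f.write('print( "boe!" )\n')
status = run_redirected([sys.executable, jbfile], jbfile)
```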

class utopya_jobscript.UtopyaJobScriptBackground

Bases: utopya_jobscript.UtopyaJobScript

Class to create job for running in background.

Example of usage:

# init object:
jb = utopya.UtopyaJobScriptBackground()

# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )

# write:
jb.Create( 'myjob.jb', lines )

# run script in background:
jb.Submit( 'myjob.jb' )

Submit(jbfile, _indent='')

Run job file in background.

Standard output and error are written to files with the same name as the job file but extensions ‘.out’ and ‘.err’ respectively.

The process id is written to a file with extension ‘.pid’.
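A background start of this kind could look roughly as follows when written with subprocess.Popen. This is a sketch under the assumption that the ‘.pid’ file holds just the process id as text; the function name run_in_background is hypothetical and the actual file layout used by this class may differ:

```python
import os
import subprocess
import sys
import tempfile


def run_in_background(command, jbfile):
    """Start 'command' without waiting for it; return the Popen object."""
    base, _ext = os.path.splitext(jbfile)
    # the child process receives its own copies of the file handles,
    # so the handles opened here can be closed right after the start:
    with open(base + ".out", "w") as out, open(base + ".err", "w") as err:
        process = subprocess.Popen(command, stdout=out, stderr=err)
    # store the process id for later status checks:
    with open(base + ".pid", "w") as f:
        f.write("%i\n" % process.pid)
    return process


# example: create a small script and start it
workdir = tempfile.mkdtemp()
jbfile = os.path.join(workdir, "myjob.jb")
with open(jbfile, "w") as f:
    f.write('print( "boe!" )\n')
process = run_in_background([sys.executable, jbfile], jbfile)
# control is back immediately; wait here only to keep the example tidy:
process.wait()
```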

CheckStatus(pid_file, _indent='')

Read the process id from the ‘.pid’ file created by the Submit() method, and check the current status. Returns a str with one of the following values:

  • ‘running’ if the process is still running;

  • ‘stopped’ if the process id is not present anymore, or is in ‘zombie’ state; for background processes the return status cannot be checked yet to see whether the process exited without errors.
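One possible way to perform such a check on a POSIX system is sketched below. It is an assumption about how this could be done, not a copy of the method: signal 0 is used to test whether the process id exists, and on Linux the state field of /proc/<pid>/stat is read to recognize a zombie. The function name check_status is hypothetical:

```python
import os
import subprocess
import sys
import tempfile


def check_status(pid_file):
    """Return 'running' or 'stopped' for the process id stored in 'pid_file'."""
    with open(pid_file) as f:
        pid = int(f.read().strip())
    try:
        # signal 0 does not disturb the process, it only tests for existence:
        os.kill(pid, 0)
    except ProcessLookupError:
        return "stopped"
    except PermissionError:
        # the process exists but belongs to another user
        pass
    # Linux only: the state letter follows the command name in parentheses
    stat_file = "/proc/%i/stat" % pid
    if os.path.exists(stat_file):
        try:
            with open(stat_file) as f:
                state = f.read().rpartition(")")[2].split()[0]
        except OSError:
            # the process ended between the two checks
            return "stopped"
        if state == "Z":
            return "stopped"
    return "running"


# example: start a sleeping process, check it, end it, check again
workdir = tempfile.mkdtemp()
pid_file = os.path.join(workdir, "myjob.pid")
process = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
with open(pid_file, "w") as f:
    f.write("%i\n" % process.pid)
status_while_running = check_status(pid_file)
process.terminate()
process.wait()
status_after_end = check_status(pid_file)
```

A check based on the process id alone cannot tell a successful run from a failed one, which matches the limitation noted above for background processes.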

class utopya_jobscript.UtopyaJobScriptBatch

Bases: utopya_jobscript.UtopyaJobScript

Base class to create and submit a batch job script. The base class itself does not support a particular system, but only provides generic methods that are shared by the specific classes. Derived classes probably need to re-define the following methods only:

  • GetOptionsRc() that should call the same method from the parent (this) class with proper arguments;

  • Submit() to submit a job script to batch queue.

Example of usage of the base class; derived classes should be used in the same way:

# init object:
jb = utopya.UtopyaJobScriptBatch()

# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'UtopyaJobScriptBatch.rc', rcbase='appl', \
                                env={'name':'myjob'} )

# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )

# write:
jb.Create( 'myjob.jb', lines )

# submit:
jb.Submit( 'myjob.jb' )

GetOptions(keys, options, comment='#', prefix='BATCH', arg='-', assign=' ')

Return str line (with newline breaks) with job options. For a LoadLeveler job, for example, this could look like:

'#@ job_name = myjob\n#@ output = myjob.out\n#@ error = myjob.err\n#@ queue'

In an actual job script this is then expanded into separate lines:

#@ job_name = myjob
#@ output   = myjob.out                     
#@ error    = myjob.err                    
#@ queue

With a different queue system, the job options should probably look different. For an LSF job for example, the options could look like:

#BSUB -J myjob
#BSUB -o myjob.out
#BSUB -e myjob.err

Although the formatting is different, the options in general consist of lines starting with a unique comment pragma, followed by a flag, and usually a value. Flags might be preceded by a ‘-’ sign; flag and value might be separated by whitespace or a ‘=’ sign. In a general notation the format of a job option is:

<comment><prefix> <arg><flag>[<assign><value>]

The formatting to be used is defined by the optional keyword arguments passed to this method. For the LoadLeveler job for example the formatting is defined by:

comment = '#'
prefix  = '@'
arg     = ''
assign  = '='

which will give option lines:

#@ <flag>[=<value>]

For the LSF system these keywords are necessary:

comment = '#'
prefix  = 'BSUB'
arg     = '-'
assign  = ' '

which will give the lines:

#BSUB -<flag>[ <value>]

The flag/value pairs should be passed to this method by two arguments: a list of ‘keys’ that defines the order of the flag/value pairs, and a dictionary ‘options’ with a flag/value tuple for each of the elements in the ‘keys’ list. For example, for the LoadLeveler job the arguments could be:

keys = ['name','out','err','end']
options = { 'name'  : ('job_name','myjob'),
            'out'   : ('output','myjob.out'),
            'err'   : ('error','myjob.err'),
            'end'   : ('queue',None) }

and for the LSF job:

keys = ['name','out','err']
options = { 'name' : ('J','myjob'),
            'out'  : ('o','myjob.out'),
            'err'  : ('e','myjob.err') }
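The formatting rule described above is simple enough to be written down in a few lines. The function below is a stand-alone sketch that follows the notation <comment><prefix> <arg><flag>[<assign><value>]; it is not the method itself, and the name format_options is chosen for illustration:

```python
def format_options(keys, options, comment="#", prefix="BATCH", arg="-", assign=" "):
    """Return a str with one job option line per key, joined by newlines."""
    lines = []
    for key in keys:
        flag, value = options[key]
        # <comment><prefix> <arg><flag>
        line = comment + prefix + " " + arg + flag
        # [<assign><value>] is only added if a value is defined:
        if value is not None:
            line = line + assign + value
        lines.append(line)
    return "\n".join(lines)


# LSF style options:
lsf = format_options(
    ["name", "out", "err"],
    {"name": ("J", "myjob"), "out": ("o", "myjob.out"), "err": ("e", "myjob.err")},
    prefix="BSUB",
)

# LoadLeveler style options; here the assign string includes the spaces:
ll = format_options(
    ["name", "out", "err", "end"],
    {
        "name": ("job_name", "myjob"),
        "out": ("output", "myjob.out"),
        "err": ("error", "myjob.err"),
        "end": ("queue", None),
    },
    prefix="@",
    arg="",
    assign=" = ",
)
```

With these arguments, ‘lsf’ holds the three ‘#BSUB’ lines and ‘ll’ the four ‘#@’ lines, each as a single str without a trailing newline.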

GetOptionsRc(rcfile, name='batch', rcbase='', env={})

Read batch options and formatting rules from rcfile settings, and pass these to the GetOptions() method. The result is similar, a ‘str’ line with batch options to be written to a job file, including newline characters.

The ‘name’ is the first part of the rcfile keys that should be used.

The ‘rcbase’ is an optional prefix for the name. This allows an rcfile to have multiple batch job definitions, each with a different ‘rcbase’ to be used for different job files.

Example rcfile settings for rcbase ‘appl’ and name ‘batch’:

! which keywords:
appl.batch.options           :  jobname output error nodes memory queue

! values:
appl.batch.option.jobname    :  n %(env:name)
appl.batch.option.output     :  o %(jobname).out
appl.batch.option.error      :  e %(jobname).err
appl.batch.option.nodes      :  r nodes:4
appl.batch.option.memory     :  r memory:2G
appl.batch.option.queue      :  q

! batch job option format keyword:
appl.batch.format            :  myformat

! Define format of batch options, e.g.:
!   #BATCH -flag value
! If whitespace is essential, enclose value by single quotes:
myformat.comment       :  #
myformat.prefix        :  BATCH
myformat.arg           :  -
myformat.assign        :  ' '
! format of template to substitute (environment) values:
myformat.template      :  %(key)
myformat.envtemplate   :  %(env:key)

The first setting is a list of option keys, here ‘jobname’, ‘output’, etc.

For each option key, a flag/value pair needs to be defined; in the example above, for key ‘jobname’ the flag/value pair ‘n %(env:name)’ is defined. Typically, the flag consists of just a cryptic letter (‘n’) as required by the batch system, while the option key is longer and more descriptive.

The first part of a flag/value pair is the flag, separated by whitespace from the value(s) which form the remainder of the line. Multiple options could therefore share the same flag; this is useful when the job scheduler requires the same flag to be used for different settings, for example to set resources.

The value could be defined literally, as is done above for “nodes:4”, the value for the number of nodes.

The value could also contain templates for substitution of a value from a flag/value pair assigned to another key. In the example, the value assigned to “jobname” is also used for the output and error files. The format of the template is defined here as ‘%(key)’, where only the presence of the word “key” is required. A loop over all keys (“jobname”, “output”, etc) will be performed and the word “key” in the template is replaced by the current key; if the result is found in a value, it is decided that the template is used and it is replaced by the corresponding value.

A special substitution is defined for variables from the so-called environment. In this case, the template should contain the word “env:key”. The environment is an optional dictionary that is passed by the calling method and contains specific values at run time; a loop over the environment keys is performed to search for matching templates. For example, the following environment contains the job name to be used:

env = { 'name' : 'myjob' }

With this environment, the above example will result in the following job options (actually a single str with newlines):

#BATCH -n myjob
#BATCH -o myjob.out
#BATCH -e myjob.err
#BATCH -r nodes:4
#BATCH -r memory:2G
#BATCH -q
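The substitution steps described above can be sketched as follows. The settings are given here as a plain dictionary instead of being read from an rcfile, the function name expand_options is hypothetical, and, as a simplification, a value can only refer to keys that appear earlier in the list of keys; the actual method may resolve templates differently:

```python
def expand_options(keys, settings, env, template="%(key)", envtemplate="%(env:key)"):
    """Return {key: (flag, value)} with the templates in the values substituted."""
    options = {}
    for key in keys:
        # the first word is the flag, the remainder (if any) is the value:
        parts = settings[key].split(None, 1)
        flag = parts[0]
        value = parts[1] if len(parts) > 1 else None
        if value is not None:
            # substitute values from the environment, e.g. '%(env:name)':
            for name, envvalue in env.items():
                value = value.replace(envtemplate.replace("key", name), str(envvalue))
            # substitute values assigned to earlier keys, e.g. '%(jobname)':
            for other, (_flag, othervalue) in options.items():
                if othervalue is not None:
                    value = value.replace(template.replace("key", other), othervalue)
        options[key] = (flag, value)
    return options


# settings as in the rcfile example above:
keys = ["jobname", "output", "error", "nodes", "memory", "queue"]
settings = {
    "jobname": "n %(env:name)",
    "output": "o %(jobname).out",
    "error": "e %(jobname).err",
    "nodes": "r nodes:4",
    "memory": "r memory:2G",
    "queue": "q",
}
options = expand_options(keys, settings, env={"name": "myjob"})

# format as '#BATCH -<flag>[ <value>]' lines:
text = "\n".join(
    "#BATCH -" + flag + ("" if value is None else " " + value)
    for flag, value in (options[key] for key in keys)
)
```

Because the options are stored per key rather than per flag, the two settings that share the flag ‘r’ do not overwrite each other.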

Submit(jbfile, _indent='')

Template for submit method. Derived classes should re-implement this method to submit the named ‘jbfile’ to a batch queue.

class utopya_jobscript.UtopyaJobScriptBatchTest

Bases: utopya_jobscript.UtopyaJobScriptBatch

Class to create test job files that are not submitted to a queue but simply run in foreground. Useful for testing job script creation.

Example of usage:

# init object:
jb = utopya.UtopyaJobScriptBatchTest()

# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'settings-UtopyaJobScriptBatchTest.rc', rcbase='appl', \
                                env={'name':'myjob'} )

# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )

# write:
jb.Create( 'myjob.jb', lines )

# submit:
jb.Submit( 'myjob.jb' )

GetOptionsRc(rcfile, rcbase='', env={})

Return str line (with newline characters) with job options based on rcfile settings.

The rcfile settings should start with ‘[<rcbase>.]batch.test’, where the rcbase might be empty or equal to ‘*’ for default settings.

Example settings for rcbase ‘appl’:

! job format for this application:
appl.batch.test.format            :  test_format
! which keywords:
appl.batch.test.options           :  name output error
! flags and values:
appl.batch.test.option.name       :  J myjob
appl.batch.test.option.output     :  oo %(name).out
appl.batch.test.option.error      :  eo %(name).err

! Define format of batch options, e.g.:
!   #TEST -flag value
test_format.comment       :  #
test_format.prefix        :  TEST
test_format.arg           :  '-'
test_format.assign        :  ' '
test_format.template      :  %(key)
test_format.envtemplate   :  %(env:key)

This will return the following job options as a str with newline characters:

#TEST -J myjob
#TEST -oo myjob.out
#TEST -eo myjob.err

Submit(jbfile, _indent='')

Test routine that runs the batch job file in background.

Standard output and error are written to files with the same name as the job file but extensions ‘.out’ and ‘.err’ respectively.

class utopya_jobscript.UtopyaJobScriptBatchLSF

Bases: utopya_jobscript.UtopyaJobScriptBatch

Class to create job for submission to LSF batch system.

Example of job options:

#BSUB -J myjob
#BSUB -oo myjob.out
#BSUB -eo myjob.err

Example of usage:

# init object:
jb = utopya.UtopyaJobScriptBatchLSF()

# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'UtopyaJobScriptBatchLSF.rc', rcbase='appl', \
                        env={'name':'myjob'} )

# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )

# write:
jb.Create( 'myjob.jb', lines )

# submit:
jb.Submit( 'myjob.jb' )

See also man pages of batch job commands:

  • bsub

  • bjobs

  • bkill

GetOptionsRc(rcfile, rcbase='', env={})

Return str line (with newline characters) with job options based on rcfile settings.

The rcfile settings should start with ‘[<rcbase>.]batch.lsf’, where the rcbase might be empty or equal to ‘*’ for default settings.

Example settings for rcbase ‘appl’:

! job format for this application:
appl.batch.lsf.format            :  lsf_format
! which keywords:
appl.batch.lsf.options           :  name output error
! values:
appl.batch.lsf.option.name       :  J myjob
appl.batch.lsf.option.output     :  oo %(name).out
appl.batch.lsf.option.error      :  eo %(name).err

! Define format of batch options, e.g.:
!   #BSUB -flag value
lsf_format.comment       :  #
lsf_format.prefix        :  BSUB
lsf_format.arg           :  '-'
lsf_format.assign        :  ' '
lsf_format.template      :  %(key)
lsf_format.envtemplate   :  %(env:key)

This will return the following job options as a str with newline characters:

#BSUB -J myjob
#BSUB -oo myjob.out
#BSUB -eo myjob.err

Submit(jbfile, _indent='')

Submit job file. Information on job id and commands to follow and cancel the job are written to a file with the same name but extension ‘.info’.
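To give an idea of what such an info file could contain, the sketch below extracts the job id from the message printed by ‘bsub’ and composes a few lines with the ‘bjobs’ and ‘bkill’ commands. Both the assumed message form (‘Job <12345> is submitted to queue <normal>.’) and the layout of the info lines are illustrations only; the function names are hypothetical and the actual content written by this method may differ:

```python
import re


def parse_bsub_output(text):
    """Return the job id (as str) found in the message printed by 'bsub'."""
    # assumed message form: 'Job <12345> is submitted to queue <normal>.'
    match = re.search(r"Job <(\d+)>", text)
    if match is None:
        raise ValueError("could not find job id in: %r" % text)
    return match.group(1)


def info_lines(job_id):
    """Return lines for an info file; this layout is an illustration only."""
    return [
        "job id : %s\n" % job_id,
        "follow : bjobs %s\n" % job_id,
        "cancel : bkill %s\n" % job_id,
    ]


# example with a message as it might be printed after submission:
job_id = parse_bsub_output("Job <12345> is submitted to queue <normal>.\n")
lines = info_lines(job_id)
```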

class utopya_jobscript.UtopyaJobScriptBatchPBS

Bases: utopya_jobscript.UtopyaJobScriptBatch

Class to create job for submission to PBS batch system.

Example of job options:

#PBS -N myjob
#PBS -o myjob.out
#PBS -e myjob.err

Example of usage:

# init object:
jb = utopya.UtopyaJobScriptBatchPBS()

# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'UtopyaJobScriptBatchPBS.rc', rcbase='appl', \
                        env={'name':'myjob'} )

# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )

# write:
jb.Create( 'myjob.jb', lines )

# submit:
jb.Submit( 'myjob.jb' )

See also man pages of batch job commands:

  • qsub

  • qscan

  • qdel

  • qstat

GetOptionsRc(rcfile, rcbase='', env={})

Return str line (with newline characters) with job options based on rcfile settings.

The rcfile settings should start with ‘[<rcbase>.]batch.pbs’, where the rcbase might be empty or equal to ‘*’ for default settings.

Example settings for rcbase ‘appl’:

! job format for this application:
appl.batch.pbs.format            :  pbs_format
! which keywords:
appl.batch.pbs.options           :  name output error
! values:
appl.batch.pbs.option.name       :  N myjob
appl.batch.pbs.option.output     :  o %(name).out
appl.batch.pbs.option.error      :  e %(name).err

! Define format of batch options, e.g.:
!   #PBS -flag value
pbs_format.comment       :  #
pbs_format.prefix        :  PBS
pbs_format.arg           :  '-'
pbs_format.assign        :  ' '
pbs_format.template      :  %(key)
pbs_format.envtemplate   :  %(env:key)

This will return the following job options as a str with newline characters:

#PBS -N myjob
#PBS -o myjob.out
#PBS -e myjob.err

Submit(jbfile, _indent='')

Submit job file. Information on job id and commands to follow and cancel the job are written to a file with the same name but extension ‘.info’.

The process id is written to a file with extension ‘.pid’.

CheckStatus(pid_file, _indent='')

Read the job id from the ‘.pid’ file created by the Submit() method, and check the current status. Returns a str with one of the following values:

  • ‘running’ if the job is still running;

  • ‘stopped’ if the job id is not present anymore, which is interpreted to mean that the job has stopped.
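The decision rule “job id not listed means stopped” can be illustrated with a small function that works on the text of a job listing. The listing layout used here, with the job id as the first word of a line in the form ‘<number>.<server>’, is an assumption about typical ‘qstat’ output and may differ between PBS installations; the function name status_from_listing is hypothetical:

```python
def status_from_listing(job_id, listing):
    """Return 'running' if 'job_id' appears in the job listing, else 'stopped'."""
    for line in listing.splitlines():
        words = line.split()
        if not words:
            continue
        # assumed layout: the job id is the first word, often '<number>.<server>'
        if words[0] == job_id or words[0].startswith(job_id + "."):
            return "running"
    return "stopped"


# example listing, typed in by hand for this illustration:
listing = """\
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
1234.server       myjob            user              00:00:01 R batch
"""
status_listed = status_from_listing("1234", listing)
status_missing = status_from_listing("123", listing)
```

Comparing whole job ids rather than searching for a substring avoids that job ‘123’ is mistaken for job ‘1234’. Note that in this sketch a job that is still waiting in the queue is also reported as ‘running’, since only the presence of the job id is tested.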

class utopya_jobscript.UtopyaJobScriptBatchSlurm

Bases: utopya_jobscript.UtopyaJobScriptBatch

Class to create job for submission to SLURM batch system.

Example of job options:

#SBATCH --job-name=myjob
#SBATCH --output=myjob.out
#SBATCH --error=myjob.err

Example of usage:

# init object:
jb = utopya.UtopyaJobScriptBatchSlurm()

# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'UtopyaJobScriptBatchSlurm.rc', rcbase='appl', \
                        env={'name':'myjob'} )

# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )

# write:
jb.Create( 'myjob.jb', lines )

# submit:
jb.Submit( 'myjob.jb' )

See also man pages of batch job commands:

  • sbatch

  • squeue

  • scancel

GetOptionsRc(rcfile, rcbase='', env={})

Return str line (with newline characters) with job options based on rcfile settings.

The rcfile settings should start with ‘[<rcbase>.]batch.slurm’, where the rcbase might be empty or equal to ‘*’ for default settings.

Example settings for rcbase ‘appl’:

! job format for this application:
appl.batch.slurm.format            :  slurm_format
! which keywords:
appl.batch.slurm.options           :  name output error
! values:
appl.batch.slurm.option.name       :  job-name myjob
appl.batch.slurm.option.output     :  output %(name).out
appl.batch.slurm.option.error      :  error %(name).err

! Define format of batch options, e.g.:
!   #SBATCH --flag=value
slurm_format.comment       :  #
slurm_format.prefix        :  SBATCH
slurm_format.arg           :  '--'
slurm_format.assign        :  '='
slurm_format.template      :  %(key)
slurm_format.envtemplate   :  %(env:key)

This will return the following job options as a str with newline characters:

#SBATCH --job-name=myjob
#SBATCH --output=myjob.out
#SBATCH --error=myjob.err

Submit(jbfile, _indent='')

Submit job file. Information on job id and commands to follow and cancel the job are written to a file with the same name but extension ‘.info’.

class utopya_jobscript.UtopyaJobScriptBatchLoadLeveler

Bases: utopya_jobscript.UtopyaJobScriptBatch

Class to create job for submission to LoadLeveler batch system.

Example of job options:

#@ name = myjob
#@ output = myjob.out
#@ error = myjob.err
#@ queue

Example of usage:

# init object:
jb = utopya.UtopyaJobScriptBatchLoadLeveler()

# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'UtopyaJobScriptBatchLoadLeveler.rc', \
                                rcbase='appl', \
                                env={'name':'myjob'} )

# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )

# write:
jb.Create( 'myjob.jb', lines )

# submit:
jb.Submit( 'myjob.jb' )

See also man pages of batch job commands:

  • llsubmit

  • llq

  • llcancel

GetOptionsRc(rcfile, rcbase='', env={})

Return str line (with newline characters) with job options based on rcfile settings.

The rcfile settings should start with ‘[<rcbase>.]batch.loadleveler’, where the rcbase might be empty or equal to ‘*’ for default settings.

Example settings for rcbase ‘appl’:

! job format for this application:
appl.batch.loadleveler.format            :  loadleveler_format
! which keywords:
appl.batch.loadleveler.options           :  name output error queue
! values:
appl.batch.loadleveler.option.name       :  name myjob
appl.batch.loadleveler.option.output     :  output %(name).out
appl.batch.loadleveler.option.error      :  error %(name).err
appl.batch.loadleveler.option.queue      :  queue

! Define format of batch options, e.g.:
!   #@ key = value
loadleveler_format.comment       :  #
loadleveler_format.prefix        :  @
loadleveler_format.arg           :  ''
loadleveler_format.assign        :  ' = '
loadleveler_format.template      :  %(key)
loadleveler_format.envtemplate   :  %(env:key)

This will return the following job options as a str with newline characters:

#@ name = myjob
#@ output = myjob.out
#@ error = myjob.err
#@ queue

Submit(jbfile, _indent='')

Submit job file. Information on job id and commands to follow and cancel the job are written to a file with the same name but extension ‘.info’.