utopya_jobscript module
Introduction
A job script is a (usually) small text file to do something important. A simple example is:
#! /usr/bin/env python
# do something:
print( "boe!" )
Such a script could be run in the foreground, in which case the user has to wait for the job to finish before regaining control. Standard output (“boe!”) and, if present, standard error are printed directly and can be watched by the user.
When the script is run in background, control is given back to the user while the script remains running. The standard output and error should be redirected to files.
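The background case can be sketched with the standard `subprocess` module. This is only an illustration of the idea, not the code used by this module; the helper name `run_in_background` and the file names are hypothetical.

```python
import subprocess
import sys


def run_in_background(command, out_path, err_path):
    """Start `command` without waiting for it; send stdout/stderr to files.

    Hypothetical helper, not part of utopya_jobscript.
    """
    out = open(out_path, "w")
    err = open(err_path, "w")
    try:
        process = subprocess.Popen(command, stdout=out, stderr=err)
    finally:
        # the child process holds its own copies of the file descriptors
        out.close()
        err.close()
    return process


if __name__ == "__main__":
    # control returns immediately; wait() is only called here to end cleanly
    job = run_in_background([sys.executable, "-c", 'print("boe!")'],
                            "myjob.out", "myjob.err")
    job.wait()
```

The returned `Popen` object carries the process id (`job.pid`), which is what a later status check would need.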
Alternatively, the script could be submitted to a batch system. In this case, options to identify the job, to specify the destination of standard output and error, and to request resources (memory, CPUs) could be inserted in the header of the script:
#! /usr/bin/env python
#BSUB -J myjob
#BSUB -oo myjob.out
#BSUB -eo myjob.err
# do something:
print( "boe!" )
The classes provided by this module facilitate creation of job scripts for which the run destination (foreground, background, batch system) is flexible.
Jobscripts to run in foreground
To run a job in the foreground, use one of the following classes:
- UtopyaJobScriptForeground, which runs the script without redirecting standard output and error;
- UtopyaJobScriptRedirect, which redirects standard output (and error) to files.
Jobscripts to run in background
To run in background, use the UtopyaJobScriptBackground class.
Jobscripts to be submitted to a batch system
High performance clusters with a large number of processors and many users logged in at the same time are usually equipped with a batch system to handle jobs. Batch jobs are submitted to a queue, and the batch system empties the queue by assigning jobs to the first available processors.
Special commands are required to submit jobs to the queue, to list the currently submitted and running jobs, and, if necessary, to remove jobs from the queue.
Batch job files typically have special comments in the top to tell the batch system the destination queue, the name of output/error files to be used, required memory and maximum run time, etc.
Which batch system is available usually depends on the machine vendor and/or administrator. Each type of batch system has its own job handling commands and its own format for the batch options at the top of the job file. For each type, a separate class needs to be defined to handle creation and submission. A base class UtopyaJobScriptBatch is provided from which batch type specific classes can be derived; see its documentation for the methods to be re-defined.
The following specific batch systems are already supported:
- For LSF, which uses the ‘bsub’ command to submit, use the UtopyaJobScriptBatchLSF class.
- For SLURM, which uses the ‘sbatch’ command to submit, use the UtopyaJobScriptBatchSlurm class.
- For PBS, which uses the ‘qsub’ command to submit, use the UtopyaJobScriptBatchPBS class.
- For the IBM LoadLeveler queue, use the UtopyaJobScriptBatchLoadLeveler class.
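As a rough illustration of how the submit commands differ, the sketch below maps each batch system to a command line. The function `submit_command` is hypothetical and not part of this module. It assumes the common conventions that `bsub` reads the job script from standard input, while `sbatch`, `qsub` and `llsubmit` take the job file as an argument.

```python
def submit_command(system, jbfile):
    """Return (argv, use_stdin) for submitting `jbfile`.

    Hypothetical helper; the actual classes may build their commands differently.
    """
    if system == "lsf":
        # usual LSF form: bsub < myjob.jb  (script is passed via stdin)
        return ["bsub"], True
    commands = {"slurm": "sbatch", "pbs": "qsub", "loadleveler": "llsubmit"}
    return [commands[system], jbfile], False


if __name__ == "__main__":
    for name in ("lsf", "slurm", "pbs", "loadleveler"):
        print(name, submit_command(name, "myjob.jb"))
```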
If the queues are fully occupied, testing the creation of job files can suffer from long waiting times. To avoid wasting development time, the special UtopyaJobScriptBatchTest class is provided. It creates jobs using fake job options and runs the script in the foreground while redirecting standard output and error.
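Class hierarchy
The classes provided by this module have been derived with the following hierarchy:
utopya_base.UtopyaBase
    UtopyaJobScript
        UtopyaJobScriptForeground
        UtopyaJobScriptRedirect
        UtopyaJobScriptBackground
        UtopyaJobScriptBatch
            UtopyaJobScriptBatchTest
            UtopyaJobScriptBatchLSF
            UtopyaJobScriptBatchPBS
            UtopyaJobScriptBatchSlurm
            UtopyaJobScriptBatchLoadLeveler
Classes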
- class utopya_jobscript.UtopyaJobScript¶
Bases:
utopya_base.UtopyaBase
Base class for an object that can be used to create and submit a job script.
Derived classes probably need to re-define the Submit() method only.
Example of usage:
# init object:
jb = UtopyaJobScript()
# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )
# write:
jb.Create( 'myjob.jb', lines )
# submit created script
# (here dummy method that issues an error):
jb.Submit( 'myjob.jb' )
- Create(jbfile, lines)¶
Create job file with a name provided by jbfile, and content in the lines argument.
The ‘lines’ should either be a list of ‘str’ objects, or a single ‘str’ object, but in both cases newline characters should be included.
The created job file is made executable to allow execution from a command line.
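What Create() is described to do can be sketched in a few lines. This is a minimal illustration under assumptions, not the actual implementation: the function name `create_job_file` is hypothetical, and which permission bits the real method sets is not stated here.

```python
import os
import stat


def create_job_file(jbfile, lines):
    """Write `lines` to `jbfile` and make the file executable.

    `lines` is a str or a list of str; newlines must already be included.
    Hypothetical stand-in for the Create() method.
    """
    if isinstance(lines, str):
        lines = [lines]
    with open(jbfile, "w") as f:
        f.writelines(lines)
    # assumption: add execute permission for user, group and others
    mode = os.stat(jbfile).st_mode
    os.chmod(jbfile, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


if __name__ == "__main__":
    create_job_file("myjob.jb", ['#! /usr/bin/env python\n', 'print( "boe!" )\n'])
```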
- GetOptionsRc(rcfile, name='batch', rcbase='', env={})¶
Template for derived classes that need to include batch job options in a job file. This version returns an empty line.
- Submit(jbfile, _indent='')¶
Template for the job submission method, to be re-defined by derived classes.
- CheckStatus(pid_file, _indent='')¶
Template for method that checks status of submitted job.
- class utopya_jobscript.UtopyaJobScriptForeground¶
Bases:
utopya_jobscript.UtopyaJobScript
Class to create a job that runs in foreground.
Example of usage:
# init object:
jb = utopya.UtopyaJobScriptForeground()
# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )
# write:
jb.Create( 'myjob.jb', lines )
# run script in foreground:
jb.Submit( 'myjob.jb' )
- Submit(jbfile, _indent='')¶
Run job file in foreground. Standard output and error are not redirected.
- CheckStatus(pid_file, _indent='')¶
Check status of running job. A job running in the foreground is already finished when this method is called. Therefore this method always returns the same value:
- ‘stopped’: the process is already finished.
- class utopya_jobscript.UtopyaJobScriptRedirect¶
Bases:
utopya_jobscript.UtopyaJobScript
Class to create a job that runs in foreground, while redirecting standard output and standard error to files.
Example of usage:
# init object:
jb = utopya.UtopyaJobScriptRedirect()
# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )
# write:
jb.Create( 'myjob.jb', lines )
# run script in foreground, redirect output:
jb.Submit( 'myjob.jb' )
- Submit(jbfile, _indent='')¶
Run job file in foreground, but with standard output and error redirected to files. The output and error files have the same name as the job file but extensions ‘.out’ and ‘.err’ respectively.
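The redirect behaviour described above could look roughly as follows. This is a sketch only: the helper `run_redirected` is hypothetical, and it assumes that "same name but other extension" means replacing the extension of the job file (`myjob.jb` gives `myjob.out` and `myjob.err`).

```python
import os
import subprocess


def run_redirected(jbfile):
    """Run an executable job file in the foreground; return its exit status.

    Standard output and error go to <base>.out and <base>.err.
    Hypothetical stand-in for UtopyaJobScriptRedirect.Submit().
    """
    base, _ext = os.path.splitext(jbfile)
    with open(base + ".out", "w") as out, open(base + ".err", "w") as err:
        # call() blocks until the script has finished
        return subprocess.call([os.path.abspath(jbfile)], stdout=out, stderr=err)
```

For example, `run_redirected('myjob.jb')` would return once the script has finished, leaving its output in `myjob.out`.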
- class utopya_jobscript.UtopyaJobScriptBackground¶
Bases:
utopya_jobscript.UtopyaJobScript
Class to create job for running in background.
Example of usage:
# init object:
jb = utopya.UtopyaJobScriptBackground()
# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )
# write:
jb.Create( 'myjob.jb', lines )
# run script in background:
jb.Submit( 'myjob.jb' )
- Submit(jbfile, _indent='')¶
Run job file in background.
Standard output and error are written to files with the same name as the job file but extensions ‘.out’ and ‘.err’ respectively.
The process id is written to a file with extension ‘.pid’.
- CheckStatus(pid_file, _indent='')¶
Read the process id from the ‘.pid’ file created by the Submit() method, and check the current status. Returns a str with one of the following values:
- ‘running’ if the process is still running;
- ‘stopped’ if the process id is not present anymore, or is in ‘zombie’ state; for background processes the return status cannot yet be checked to see whether the process exited without errors.
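A simplified version of such a status check is sketched below for POSIX systems. It is not the module's implementation: the function `check_status` is hypothetical, and unlike the method described above it does not detect the ‘zombie’ state. It only tests whether the process id still exists, by sending signal 0.

```python
import os


def check_status(pid_file):
    """Return 'running' or 'stopped' for the process id stored in `pid_file`.

    Simplified, hypothetical sketch (POSIX only; no zombie detection).
    """
    with open(pid_file) as f:
        pid = int(f.read().strip())
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return "stopped"
    except PermissionError:
        # the process exists but belongs to another user
        return "running"
    return "running"
```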
- class utopya_jobscript.UtopyaJobScriptBatch¶
Bases:
utopya_jobscript.UtopyaJobScript
Base class to create and submit a batch job script. The base class itself does not support a particular system, but only provides generic methods that are shared by the specific classes. Derived classes probably need to re-define the following methods only:
- GetOptionsRc(), which should call the same method from the parent (this) class with proper arguments;
- Submit(), to submit a job script to the batch queue.
Example of usage of the base class, derived classes should be used in the same way:
# init object:
jb = utopya.UtopyaJobScriptBatch()
# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'UtopyaJobScriptBatch.rc', rcbase='appl',
                           env={'name':'myjob'} )
# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )
# write:
jb.Create( 'myjob.jb', lines )
# submit:
jb.Submit( 'myjob.jb' )
- GetOptions(keys, options, comment='#', prefix='BATCH', arg='-', assign=' ')¶
Return str line (with newline breaks) with job options. For example for a LoadLeveler job this could look like:
'#@ job_name = myjob\n#@ output = myjob.out\n#@ error = myjob.err\n#@ queue'
In an actual job script this is then expanded into separate lines:
#@ job_name = myjob
#@ output = myjob.out
#@ error = myjob.err
#@ queue
With a different queue system, the job options should probably look different. For an LSF job for example, the options could look like:
#BSUB -J myjob
#BSUB -o myjob.out
#BSUB -e myjob.err
Although the formatting differs, the options in general consist of lines starting with a unique comment pragma, followed by a flag and usually a value. Flags might be preceded by a ‘-’ sign; flag and value might be separated by whitespace or a ‘=’ sign. In a general notation the format of a job option is:
<comment><prefix> <arg><flag>[<assign><value>]
The formatting to be used is defined by the optional keyword arguments passed to this method. For the LoadLeveler job for example the formatting is defined by:
comment = '#'
prefix  = '@'
arg     = ''
assign  = '='
which will give option lines:
#@ <flag>[=<value>]
For the LSF system these keywords are necessary:
comment = '#'
prefix  = 'BSUB'
arg     = '-'
assign  = ' '
which will give the lines:
#BSUB -<flag>[ <value>]
The flag/value pairs should be passed to this method by two arguments: a list of ‘keys’ that defines the order of the flag/value pairs, and a dictionary ‘options’ with a flag/value tuple for each of the elements in the ‘keys’ list. For example for the LoadLeveler job, the arguments could be:
keys = ['name','out','err','end']
options = { 'name' : ('job_name','myjob'),
            'out'  : ('output','myjob.out'),
            'err'  : ('error','myjob.err'),
            'end'  : ('queue',None) }
and for the LSF job:
keys = ['name','out','err']
options = { 'name' : ('J','myjob'),
            'out'  : ('o','myjob.out'),
            'err'  : ('e','myjob.err') }
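The general notation `<comment><prefix> <arg><flag>[<assign><value>]` can be illustrated with a small formatting function. This is a sketch that mirrors the documented arguments of GetOptions(); the name `format_options` is hypothetical and the real method may treat details such as the final newline differently.

```python
def format_options(keys, options, comment="#", prefix="BATCH", arg="-", assign=" "):
    """Format flag/value pairs as batch option lines joined by newlines.

    Hypothetical stand-in for GetOptions(); a value of None yields a bare flag.
    """
    lines = []
    for key in keys:
        flag, value = options[key]
        line = comment + prefix + " " + arg + flag
        if value is not None:
            line += assign + str(value)
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    keys = ["name", "out", "err"]
    options = {"name": ("J", "myjob"),
               "out": ("o", "myjob.out"),
               "err": ("e", "myjob.err")}
    print(format_options(keys, options, prefix="BSUB"))
```

With the LoadLeveler settings (`prefix='@'`, `arg=''`, `assign='='`) the same function produces lines of the form `#@ <flag>=<value>`, and a bare `#@ queue` for the entry whose value is None.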
- GetOptionsRc(rcfile, name='batch', rcbase='', env={})¶
Read batch options and formatting rules from rcfile settings, and pass these to the GetOptions() method. The result is similar: a ‘str’ line with batch options to be written to a job file, including newline characters.
The ‘name’ is the first part of the rcfile keys that should be used.
The ‘rcbase’ is an optional prefix for the name. This allows a rcfile to have multiple batch job definitions, each with a different ‘rcbase’ to be used for different job files.
Example rcfile settings for rcbase ‘appl’ and name ‘batch’:
! which keywords:
appl.batch.options            :  jobname output error nodes memory queue
! values:
appl.batch.option.jobname     :  n %(env:name)
appl.batch.option.output      :  o %(jobname).out
appl.batch.option.error       :  e %(jobname).err
appl.batch.option.nodes       :  r nodes:4
appl.batch.option.memory      :  r memory:2G
appl.batch.option.queue       :  q
! batch job option format keyword:
appl.batch.format             :  myformat
! Define format of batch options, e.g.:
!   #BATCH -flag value
! If whitespace is essential, enclose value by single quotes:
myformat.comment              :  #
myformat.prefix               :  BATCH
myformat.arg                  :  -
myformat.assign               :  ' '
! format of template to substitute (environment) values:
myformat.template             :  %(key)
myformat.envtemplate          :  %(env:key)
The first setting is a list of option keys, here ‘jobname’, ‘output’, etc.
For each option key, a flag/value pair needs to be defined; in the example above, for key ‘jobname’ the flag/value pair ‘n %(env:name)’ is defined. Typically, the flag consists of just a cryptic letter (‘n’) as required by the batch system, while the option key is longer and more descriptive.
The first part of a flag/value pair is the flag, separated by whitespace from the value(s) which form the remainder of the line. Multiple options could therefore share the same flag; this is useful when the job scheduler requires the same flag to be used for different settings, for example to set resources.
The value could be defined literally, as “nodes:4” is defined above as the value for the number of nodes.
The value could also contain templates for substitution of a value from a flag/value pair assigned to another key. In the example, the value assigned to “jobname” is also used for the output and error files. The format of the template is defined here as ‘%(key)’, where only the presence of the word “key” is required. A loop over all keys (“jobname”, “output”, etc.) is performed and the word “key” in the template is replaced by the current key; if the result is found in a value, the template is considered to be in use and is replaced by the corresponding value.
A special substitution is defined for variables from the so-called environment. In this case, the template should contain the word “env:key”. The environment is an optional dictionary that is passed by the calling method and contains specific values at run time; a loop over the environment keys is performed to search for matching templates. For example, the following environment contains the job name to be used:
env = { 'name' : 'myjob' }
With this environment, the above example will result in the following job options (actually a single str with newlines):
#BATCH -n myjob
#BATCH -o myjob.out
#BATCH -e myjob.err
#BATCH -r nodes:4
#BATCH -r memory:2G
#BATCH -q
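The template substitution described for the rcfile settings can be sketched as follows. This is an illustration under assumptions rather than the actual code: the function `substitute` is hypothetical, the flag/value pairs are given directly instead of being read from an rcfile, and a value may only refer to keys that appear earlier in the key list.

```python
def substitute(keys, pairs, env, template="%(key)", envtemplate="%(env:key)"):
    """Expand '%(env:name)' and '%(otherkey)' templates in option values.

    Hypothetical sketch; `pairs` maps option key -> (flag, value or None).
    """
    resolved = {}
    for key in keys:
        flag, value = pairs[key]
        if value is not None:
            # environment templates, e.g. '%(env:name)'
            for name, replacement in env.items():
                value = value.replace(envtemplate.replace("key", name), replacement)
            # templates that refer to previously resolved keys, e.g. '%(jobname)'
            for other, (_flag, other_value) in resolved.items():
                if other_value is not None:
                    value = value.replace(template.replace("key", other), other_value)
        resolved[key] = (flag, value)
    return resolved


keys = "jobname output error nodes memory queue".split()
pairs = {"jobname": ("n", "%(env:name)"),
         "output": ("o", "%(jobname).out"),
         "error": ("e", "%(jobname).err"),
         "nodes": ("r", "nodes:4"),
         "memory": ("r", "memory:2G"),
         "queue": ("q", None)}
resolved = substitute(keys, pairs, env={"name": "myjob"})

# format as '#BATCH -flag value' lines
lines = []
for key in keys:
    flag, value = resolved[key]
    lines.append("#BATCH -" + flag + ("" if value is None else " " + value))
print("\n".join(lines))
```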
- Submit(jbfile, _indent='')¶
Template for submit method. Derived classes should re-implement this method to submit the named ‘jbfile’ to a batch queue.
- class utopya_jobscript.UtopyaJobScriptBatchTest¶
Bases:
utopya_jobscript.UtopyaJobScriptBatch
Class to create test job files that are not submitted to a queue but simply run in foreground. Useful for testing job script creation.
Example of usage:
# init object:
jb = utopya.UtopyaJobScriptBatchTest()
# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'settings-UtopyaJobScriptBatchTest.rc', rcbase='appl',
                           env={'name':'myjob'} )
# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )
# write:
jb.Create( 'myjob.jb', lines )
# submit:
jb.Submit( 'myjob.jb' )
- GetOptionsRc(rcfile, rcbase='', env={})¶
Return str line (with newline characters) with job options based on rcfile settings.
The rcfile settings should start with ‘[<rcbase>.]batch.test’, where the rcbase might be empty or equal to ‘*’ for default settings.
Example settings for rcbase ‘appl’:
! job format for this application:
appl.batch.test.format            :  test_format
! which keywords:
appl.batch.test.options           :  name output error
! flags and values:
appl.batch.test.option.name       :  J myjob
appl.batch.test.option.output     :  oo %(name).out
appl.batch.test.option.error      :  eo %(name).err
! Define format of batch options, e.g.:
!   #TEST -flag value
test_format.comment       :  #
test_format.prefix        :  TEST
test_format.arg           :  '-'
test_format.assign        :  ' '
test_format.template      :  %(key)
test_format.envtemplate   :  %(env:key)
This will return the following job options as a str with newline characters:
#TEST -J myjob
#TEST -oo myjob.out
#TEST -eo myjob.err
- Submit(jbfile, _indent='')¶
Test routine that runs the batch job file in foreground instead of submitting it to a queue.
Standard output and error are written to files with the same name as the job file but extensions ‘.out’ and ‘.err’ respectively.
- class utopya_jobscript.UtopyaJobScriptBatchLSF¶
Bases:
utopya_jobscript.UtopyaJobScriptBatch
Class to create job for submission to LSF batch system.
Example of job options:
#BSUB -J myjob
#BSUB -oo myjob.out
#BSUB -eo myjob.err
Example of usage:
# init object:
jb = utopya.UtopyaJobScriptBatchLSF()
# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'UtopyaJobScriptBatchLSF.rc', rcbase='appl',
                           env={'name':'myjob'} )
# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )
# write:
jb.Create( 'myjob.jb', lines )
# submit:
jb.Submit( 'myjob.jb' )
See also man pages of batch job commands:
bsub
bjobs
bkill
- GetOptionsRc(rcfile, rcbase='', env={})¶
Return str line (with newline characters) with job options based on rcfile settings.
The rcfile settings should start with ‘[<rcbase>.]batch.lsf’, where the rcbase might be empty or equal to ‘*’ for default settings.
Example settings for rcbase ‘appl’:
! job format for this application:
appl.batch.lsf.format            :  lsf_format
! which keywords:
appl.batch.lsf.options           :  name output error
! values:
appl.batch.lsf.option.name       :  J myjob
appl.batch.lsf.option.output     :  oo %(name).out
appl.batch.lsf.option.error      :  eo %(name).err
! Define format of batch options, e.g.:
!   #BSUB -flag value
lsf_format.comment       :  #
lsf_format.prefix        :  BSUB
lsf_format.arg           :  '-'
lsf_format.assign        :  ' '
lsf_format.template      :  %(key)
lsf_format.envtemplate   :  %(env:key)
This will return the following job options as a str with newline characters:
#BSUB -J myjob
#BSUB -oo myjob.out
#BSUB -eo myjob.err
- Submit(jbfile, _indent='')¶
Submit job file. Information on the job id and the commands to follow and cancel the job is written to a file with the same name but extension ‘.info’.
- class utopya_jobscript.UtopyaJobScriptBatchPBS¶
Bases:
utopya_jobscript.UtopyaJobScriptBatch
Class to create job for submission to PBS batch system.
Example of job options:
#PBS -N myjob
#PBS -o myjob.out
#PBS -e myjob.err
Example of usage:
# init object:
jb = utopya.UtopyaJobScriptBatchPBS()
# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'UtopyaJobScriptBatchPBS.rc', rcbase='appl',
                           env={'name':'myjob'} )
# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )
# write:
jb.Create( 'myjob.jb', lines )
# submit:
jb.Submit( 'myjob.jb' )
See also man pages of batch job commands:
qsub
qscan
qdel
qstat
- GetOptionsRc(rcfile, rcbase='', env={})¶
Return str line (with newline characters) with job options based on rcfile settings.
The rcfile settings should start with ‘[<rcbase>.]batch.pbs’, where the rcbase might be empty or equal to ‘*’ for default settings.
Example settings for rcbase ‘appl’:
! job format for this application:
appl.batch.pbs.format            :  pbs_format
! which keywords:
appl.batch.pbs.options           :  name output error
! values:
appl.batch.pbs.option.name       :  N myjob
appl.batch.pbs.option.output     :  o %(name).out
appl.batch.pbs.option.error      :  e %(name).err
! Define format of batch options, e.g.:
!   #PBS -flag value
pbs_format.comment       :  #
pbs_format.prefix        :  PBS
pbs_format.arg           :  '-'
pbs_format.assign        :  ' '
pbs_format.template      :  %(key)
pbs_format.envtemplate   :  %(env:key)
This will return the following job options as a str with newline characters:
#PBS -N myjob
#PBS -o myjob.out
#PBS -e myjob.err
- Submit(jbfile, _indent='')¶
Submit job file. Information on the job id and the commands to follow and cancel the job is written to a file with the same name but extension ‘.info’.
The job id is written to a file with extension ‘.pid’.
- CheckStatus(pid_file, _indent='')¶
Read the job id from the ‘.pid’ file created by the Submit() method, and check the current status. Returns a str with one of the following values:
- ‘running’ if the job is still running;
- ‘stopped’ if the job id is not present anymore, which is interpreted as the job having stopped.
- class utopya_jobscript.UtopyaJobScriptBatchSlurm¶
Bases:
utopya_jobscript.UtopyaJobScriptBatch
Class to create job for submission to SLURM batch system.
Example of job options:
#SBATCH --job-name=myjob
#SBATCH --output=myjob.out
#SBATCH --error=myjob.err
Example of usage:
# init object:
jb = utopya.UtopyaJobScriptBatchSlurm()
# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'UtopyaJobScriptBatchSlurm.rc', rcbase='appl',
                           env={'name':'myjob'} )
# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )
# write:
jb.Create( 'myjob.jb', lines )
# submit:
jb.Submit( 'myjob.jb' )
See also man pages of batch job commands:
sbatch
squeue
scancel
- GetOptionsRc(rcfile, rcbase='', env={})¶
Return str line (with newline characters) with job options based on rcfile settings.
The rcfile settings should start with ‘[<rcbase>.]batch.slurm’, where the rcbase might be empty or equal to ‘*’ for default settings.
Example settings for rcbase ‘appl’:
! job format for this application:
appl.batch.slurm.format            :  slurm_format
! which keywords:
appl.batch.slurm.options           :  name output error
! values:
appl.batch.slurm.option.name       :  job-name myjob
appl.batch.slurm.option.output     :  output %(name).out
appl.batch.slurm.option.error      :  error %(name).err
! Define format of batch options, e.g.:
!   #SBATCH --flag=value
slurm_format.comment       :  #
slurm_format.prefix        :  SBATCH
slurm_format.arg           :  '--'
slurm_format.assign        :  '='
slurm_format.template      :  %(key)
slurm_format.envtemplate   :  %(env:key)
This will return the following job options as a str with newline characters:
#SBATCH --job-name=myjob
#SBATCH --output=myjob.out
#SBATCH --error=myjob.err
- Submit(jbfile, _indent='')¶
Submit job file. Information on the job id and the commands to follow and cancel the job is written to a file with the same name but extension ‘.info’.
- class utopya_jobscript.UtopyaJobScriptBatchLoadLeveler¶
Bases:
utopya_jobscript.UtopyaJobScriptBatch
Class to create job for submission to LoadLeveler batch system.
Example of job options:
#@ name = myjob
#@ output = myjob.out
#@ error = myjob.err
#@ queue
Example of usage:
# init object:
jb = utopya.UtopyaJobScriptBatchLoadLeveler()
# obtain line with job options from rcfile:
options = jb.GetOptionsRc( 'UtopyaJobScriptBatchLoadLeveler.rc',
                           rcbase='appl',
                           env={'name':'myjob'} )
# fill script lines:
lines = []
lines.append( '#! /usr/bin/env python\n' )
lines.append( '\n' )
lines.append( options )
lines.append( '\n' )
lines.append( '# do something:\n' )
lines.append( 'print( "boe!" )\n' )
lines.append( '\n' )
# write:
jb.Create( 'myjob.jb', lines )
# submit:
jb.Submit( 'myjob.jb' )
See also man pages of batch job commands:
llsubmit
llq
llcancel
- GetOptionsRc(rcfile, rcbase='', env={})¶
Return str line (with newline characters) with job options based on rcfile settings.
The rcfile settings should start with ‘[<rcbase>.]batch.loadleveler’, where the rcbase might be empty or equal to ‘*’ for default settings.
Example settings for rcbase ‘appl’:
! job format for this application:
appl.batch.loadleveler.format            :  loadleveler_format
! which keywords:
appl.batch.loadleveler.options           :  name output error queue
! values:
appl.batch.loadleveler.option.name       :  name myjob
appl.batch.loadleveler.option.output     :  output %(name).out
appl.batch.loadleveler.option.error      :  error %(name).err
appl.batch.loadleveler.option.queue      :  queue
! Define format of batch options, e.g.:
!   #@ key = value
loadleveler_format.comment       :  #
loadleveler_format.prefix        :  @
loadleveler_format.arg           :  ''
loadleveler_format.assign        :  ' = '
loadleveler_format.template      :  %(key)
loadleveler_format.envtemplate   :  %(env:key)
This will return the following job options as a str with newline characters:
#@ name = myjob
#@ output = myjob.out
#@ error = myjob.err
#@ queue
- Submit(jbfile, _indent='')¶
Submit job file. Information on the job id and the commands to follow and cancel the job is written to a file with the same name but extension ‘.info’.