General Usage of Quest
Latest revision as of 09:53, 14 May 2016
- Login
ssh [netid]@quest.it.northwestern.edu
where [netid] is your NetID. The first time you log in, you will be asked for a file in which to save the key and then, twice, for a passphrase. Press Enter at all three prompts and you should be logged in successfully.
Our group folder is located in /projects/b1011/luijten-group
- Example of job.mbs file
#
### AUTOMATICALLY GENERATED BATCH FILE
#MOAB -q [queue_name]
#MOAB -A b1011
#MOAB -l walltime=[dd:hh:mm:ss]
#
### name of job
#MOAB -N [name_of_job]
#
### mail for begin/end/abort
#MOAB -m ea
#MOAB -M [email_address]
#
### number of nodes and processors per node
#MOAB -l nodes=2:ppn=6
#
### indicates that job should not rerun if it fails
#MOAB -r n
#
### stdout and stderr merged as stderr
#MOAB -j eo
#
### write stderr to file
#MOAB -e log.err
#
### the shell that interprets the job script
#MOAB -S /bin/bash

module load [module]
cd /projects/b1011/luijten-group/[job_location]
time mpirun -np 12 [directory_name]/[lammps_version] -in input.dat
if [ $? -eq 0 ] ; then
    touch COMPLETED
fi
[queue_name] There are two options for the queue name: collab or collab-preempt. Both have a startup priority of 5000. The collab queue allows a maximum of 262 cores and a maximum walltime of 7 days. There are no resource restrictions for collab-preempt, but note that jobs in queues ending in ‘-preempt’ can be interrupted and re-queued by jobs from a higher-priority queue.
[dd:hh:mm:ss] This is the maximum allowed running time for your job. dd: days; hh: hours; mm: minutes; ss: seconds.
[name_of_job] This is the name of your job as it will be shown in the queue.
[email_address] This is the email address at which you receive the system notice when the job begins, aborts or ends.
[module] The module to load. For mpirun this is the module mpi. For a full list of available modules, run module available from the command line.
[job_location] This is the path of the folder where your input file is located.
[directory_name] This is the path of the directory that contains the LAMMPS executable.
[lammps_version] This is the build of LAMMPS you want to run. It must be in [directory_name].
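Before submitting, it can help to check the request against the values described above. The following is a minimal sketch, not part of Moab or Quest: the function names walltime_to_seconds, total_procs and fits_collab are hypothetical, and the 7-day limit is the collab maximum quoted above. It converts a dd:hh:mm:ss walltime to seconds and computes the processor count that corresponds to a nodes/ppn request (2 nodes with 6 processors each gives the 12 passed to mpirun -np in the example).

```shell
# Hypothetical helpers for sanity-checking a job request.
# They are not Moab commands and do not contact the scheduler.

# Convert a dd:hh:mm:ss walltime string to seconds.
# awk treats each field as a decimal number, so "08" is read as 8.
walltime_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 86400 + $2 * 3600 + $3 * 60 + $4 }'
}

# Total number of processors for a request of nodes=$1:ppn=$2.
total_procs() {
    echo $(( $1 * $2 ))
}

# Succeed if the walltime fits within the 7-day collab limit
# (7 * 86400 = 604800 seconds).
fits_collab() {
    [ "$(walltime_to_seconds "$1")" -le 604800 ]
}
```

For example, total_procs 2 6 prints the value to pass to mpirun -np, and fits_collab 03:12:00:00 succeeds because three and a half days is below the limit.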
- Submit jobs
msub job.mbs
- Cancel jobs
canceljob [job_number]
canceljob `seq [first_job_number] [last_job_number]`
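Because cancelling a range of jobs cannot be undone, it may be worth previewing which job numbers seq will produce before running the command. The sketch below is a dry run only: cancel_range_preview is a hypothetical helper, the job numbers 1001 to 1003 are made up for illustration, and nothing is sent to the scheduler.

```shell
# Hypothetical helper: print one canceljob command per job number
# in the inclusive range $1..$2, without executing anything.
cancel_range_preview() {
    for id in $(seq "$1" "$2"); do
        echo "canceljob $id"
    done
}

# Preview the commands for an illustrative range of job numbers.
cancel_range_preview 1001 1003
```

Once the printed list looks right, the same range can be passed to the real canceljob command shown above.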
- Check job status
showq
show all jobs
showq -r
show running jobs
showq -i
show idle jobs
showq -w user=[netid]
show jobs belonging to the user specified, where [netid] is your NETID.
showq -w acct=[account number]
show jobs belonging to the account specified. Grail allocation account number: b1011; CCTSM allocation account number: b1023; ESAM allocation account number: b1020.
qstat
show your own jobs
checkjob [job_ID]
This command displays detailed information about a submitted job’s status and diagnostic information that can be useful for troubleshooting submission issues. It can also be used to obtain useful information about completed jobs such as the allocated nodes, resources used, and exit codes. NUIT recommends using the flag ‘-vvv’ or ‘-v -v -v’ to gather additional diagnostic information.
mjobctl -m partition=<partition name> <job number>
This command sets the partition for a job that is already in the queue. This is useful if you forgot to specify a partition in the batch file, or if you want to change the partition (for example from quest3 to quest4), because it avoids having to delete and resubmit the job.
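When several queued jobs need the same change, the command can be repeated once per job. The following is a minimal dry-run sketch under stated assumptions: repartition_preview is a hypothetical helper, the job numbers are made up, and the function only prints the mjobctl commands rather than running them.

```shell
# Hypothetical helper: print the mjobctl command for each job number.
# Usage: repartition_preview <partition name> <job number>...
repartition_preview() {
    partition=$1
    shift
    for id in "$@"; do
        echo "mjobctl -m partition=$partition $id"
    done
}

# Preview moving two illustrative jobs to quest4.
repartition_preview quest4 1001 1002
```

Each printed line has the same form as the command above and can be run as-is once checked.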