Grid Engine cluster configuration
| Grid Engine root directory ($SGE_ROOT) | ${cfg.sge.root} |
| Cell name ($SGE_CELL) | ${cfg.cell.name} |
| Cluster name ($SGE_CLUSTER_NAME) | ${cfg.sge.cluster.name} |
| Qmaster port ($SGE_QMASTER_PORT) | ${cfg.sge.qmaster.port} |
| Execd port ($SGE_EXECD_PORT) | ${cfg.sge.execd.port} |
| Group id range ($SGE_GID_RANGE) | ${cfg.gid.range} |
| Qmaster spool directory | ${cfg.qmaster.spool.dir} |
| Global execd spool directory | ${cfg.execd.spool.dir} |
| Spooling method | ${cfg.spooling.method} |
| Spooling directory | ${cfg.db.spooling.dir} |
| JMX port | ${cfg.sge.jmx.port} |
| JVM library path | ${cfg.sge.jvm.lib.path} |
| JMX SSL server keystore path | ${cfg.sge.jmx.ssl.keystore} |
| Administrator mail | ${cfg.admin.mail} |
| Host type | Succeeded | Failed |
| Qmaster host | ${add.qmaster.host} | ${add.qmaster.host.failed} |
| Execution host(s) | ${cfg.exec.host.list} | ${add.exec.host.list.failed} |
| Shadow host(s) | ${cfg.shadow.host} | ${add.shadow.host.failed} |
| Admin host(s) | ${cfg.admin.host.list} | ${add.admin.host.list.failed} |
| Submit host(s) | ${cfg.submit.host.list} | ${add.submit.host.list.failed} |
How to start with Grid Engine
- Set the environment:
  - if you are a csh/tcsh user:
    source ${cfg.sge.root}/${cfg.cell.name}/common/settings.csh
  - if you are a sh/bash/ksh user:
    . ${cfg.sge.root}/${cfg.cell.name}/common/settings.sh
  This will set or expand the following environment variables:
  - $SGE_ROOT (always necessary)
  - $SGE_CELL (if you are using a cell other than default)
  - $SGE_CLUSTER_NAME (always necessary)
  - $SGE_QMASTER_PORT (if you haven't added the service sge_qmaster)
  - $SGE_EXECD_PORT (if you haven't added the service sge_execd)
  - $PATH/$path (to find the Grid Engine binaries)
  - $MANPATH (to access the manual pages)
- Submit one of the sample scripts contained in the ${cfg.sge.root}/examples/jobs directory:
    qsub ${cfg.sge.root}/examples/jobs/simple.sh
  or
    qsub ${cfg.sge.root}/examples/jobs/sleeper.sh
- Use the qstat command to monitor the job's behavior:
    qstat -f
- After the job finishes executing, check your home directory for
  the redirected stdout/stderr files
  <script-name>.o<job-id> and
  <script-name>.e<job-id>.
  The job-id is a unique integer that Grid Engine assigns consecutively to each job.
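The steps above can be sketched as a few small POSIX shell helpers. This is only an illustrative sketch: the function names are hypothetical, the "Your job N (...) has been submitted" message format is an assumption about what qsub prints, and the polling loop assumes that `qstat -j <id>` exits non-zero once the job is no longer known to the qmaster.

```shell
# Hypothetical helpers; none of these functions ship with Grid Engine.

# Report which always-required variables are still unset after
# sourcing settings.sh.
check_sge_env() {
    missing=""
    for name in SGE_ROOT SGE_CLUSTER_NAME; do
        # eval reads the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}

# Extract the job id from qsub's confirmation message.
# Assumed format: Your job 42 ("simple.sh") has been submitted
parse_job_id() {
    printf '%s\n' "$1" | sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Poll until the job has left the system.
# Assumption: "qstat -j <id>" fails once the job no longer exists.
wait_for_job() {
    job_id=$1
    interval=${2:-10}
    while qstat -j "$job_id" >/dev/null 2>&1; do
        sleep "$interval"
    done
}

# Print the default stdout and stderr file names for a job
# submitted from a script, as described in the last step.
job_output_files() {
    printf '%s/%s.o%s\n' "$HOME" "$1" "$2"
    printf '%s/%s.e%s\n' "$HOME" "$1" "$2"
}

# Usage sketch (needs a working cluster and a sourced settings.sh):
#   check_sge_env || exit 1
#   id=$(parse_job_id "$(qsub "$SGE_ROOT/examples/jobs/simple.sh")")
#   wait_for_job "$id" && job_output_files simple.sh "$id"
```

Keeping the parsing and path logic in separate functions makes each piece easy to check without a running cluster.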
Administering Grid Engine
Grid Engine startup scripts:
| Daemon | Location | Actions |
| Qmaster | ${cfg.sge.root}/${cfg.cell.name}/common/sgemaster | start, stop, restart |
| Exec daemon | ${cfg.sge.root}/${cfg.cell.name}/common/sgeexecd | start, stop, softstop, restart |
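As a rough illustration of the startup scripts listed above, the following sketch builds the command line for a daemon and rejects actions that the script does not list. The function name `sge_ctl_command` is hypothetical, and the fallback cell name `default` is an assumption used when $SGE_CELL is unset.

```shell
# Hypothetical helper: print the startup-script command for a daemon,
# accepting only the actions listed for that daemon.
sge_ctl_command() {
    daemon=$1
    action=$2
    case "$daemon" in
        qmaster) script=sgemaster; allowed="start stop restart" ;;
        execd)   script=sgeexecd;  allowed="start stop softstop restart" ;;
        *) echo "unknown daemon: $daemon" >&2; return 2 ;;
    esac
    # Surrounding spaces ensure that only whole words match.
    case " $allowed " in
        *" $action "*) ;;
        *) echo "unsupported action for $daemon: $action" >&2; return 2 ;;
    esac
    # Assumption: fall back to the cell name "default" if SGE_CELL is unset.
    printf '%s/%s/common/%s %s\n' "$SGE_ROOT" "${SGE_CELL:-default}" "$script" "$action"
}

# Usage sketch: print the command first, then run it once it looks right.
#   sge_ctl_command execd softstop
```

Printing the command instead of executing it keeps the helper safe to try on a host where the daemons are running.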
Startup messages are written to the SMF service log files.
To find the name of a log file, run svcs -l SERVICE_NAME, for example:
svcs -l svc:/application/sge/qmaster:${cfg.sge.cluster.name}
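To pull just the log file name out of the svcs listing, a small filter such as the one below may help. It is a sketch that assumes the listing contains a line whose first field is `logfile` followed by the path; the function name `smf_logfile` is hypothetical.

```shell
# Hypothetical filter: read "svcs -l" output on stdin and print the
# path from its "logfile" line (assumed layout: "logfile   /path").
smf_logfile() {
    awk '$1 == "logfile" { print $2; exit }'
}

# Usage sketch on a host that runs SMF:
#   svcs -l SERVICE_NAME | smf_logfile
```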
After startup, the daemons normally log their messages to the following
files in their spool directories; they can also be configured to use syslog.
| Qmaster: | ${cfg.qmaster.spool.dir}/messages |
| Exec daemon: | <execd_spool_dir>/<hostname>/messages |
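When scanning these messages files, it can help to filter by severity. The sketch below assumes each line has the pipe-separated layout `date time|thread|host|level|message` with a one-letter level such as E for errors. Both the layout and the function name `sge_messages_by_level` are assumptions to check against your own log files.

```shell
# Hypothetical filter: print only the lines whose fourth
# pipe-separated field equals the requested level.
# Assumed line layout: date time|thread|host|level|message
sge_messages_by_level() {
    level=$1
    shift
    # Remaining arguments are file names; with none, awk reads stdin.
    awk -F'|' -v lvl="$level" '$4 == lvl' "$@"
}

# Usage sketch:
#   sge_messages_by_level E /path/to/qmaster/spool/messages
```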
Useful links