bmig - migrate one or more started batch jobs
bmig [ -h ] [ -V ] [ -f ] [ -m host_name ... ] [ -u user_name | all ] [ -J job_name ] [ jobId ... ]
Migrate one or more started jobs (i.e., running or suspended jobs). Only checkpointable or rerunnable jobs are considered for migration.
When a checkpointable job is migrated, the job is first checkpointed as though bchkpnt -k (see bchkpnt(1)) were invoked. If the checkpointing is successful, the job is re-queued and scheduled to be restarted on another host. If the checkpointing is unsuccessful, the job remains on the same host.
For a non-checkpointable rerunnable job, the job is first killed as though bkill (see bkill(1)) were invoked. The job is then re-queued to be rerun from the beginning on another host.
When a migrating job is restarted or rerun on another host, the environment variable, LSB_RESTART, is set to the value Y.
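A job script can use LSB_RESTART to tell a first start from a restart or rerun after migration. The following is a minimal sketch, not part of LSF: the function name job_start_mode and the printed messages are hypothetical, and the only behavior taken from this page is that LSB_RESTART has the value Y after migration.

```shell
#!/bin/sh
# Report how this invocation of the job was started.
# Assumption from the text above: LSB_RESTART is set to Y when a
# migrating job is restarted or rerun on another host.
job_start_mode() {
    if [ "${LSB_RESTART:-}" = "Y" ]; then
        echo "restarted"
    else
        echo "fresh"
    fi
}

# Branch on the start mode; the messages are illustrative only.
case "$(job_start_mode)" in
    restarted)
        echo "job was restarted or rerun after migration"
        ;;
    fresh)
        echo "job is starting for the first time"
        ;;
esac
```

Any value other than Y, including an unset variable, is treated here as a first start.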
A user can migrate only his or her own jobs. Only root and the LSF administrator can migrate jobs submitted by other users.
A job cannot be migrated if it is currently being migrated.
A migrating job cannot be restarted successfully if the checkpoint directory is not accessible on the host restarting the job.
No output is sent to the user if a migrating job is killed (see bkill(1) ) while pending to be restarted on another host.
A job is checkpointed and killed for migration only if it is not already being checkpointed at that moment. Thus, if a job is checkpointed periodically and the checkpoint period is very short, the job may not be migrated.
-h Print command usage to stderr and exit.
-V Print LSF release version to stderr and exit.
-f Force the job to be checkpointed even if non-checkpointable conditions exist (non-checkpointable conditions are operating system-specific). This option has no effect when migrating a non-checkpointable rerunnable job.
-m host_name ...
Restrict candidate jobs for migration to those on hosts host_name ...,
which may be one or more host names or host group names defined in the
lsbatch system. If more than one host is specified, the list must be
enclosed in double quotes (") or single quotes ('). (Membership of a
host group may be found using the bmgroup command.)
-u user_name | all
Operate on the jobs submitted by the user or user group specified by
user_name, or by all users if the reserved user name all is given. If
jobId is not specified, then only the most recently submitted qualifying
job will be operated upon. The -u option is ignored if a job ID
other than 0 is specified in the jobId option.
-J job_name
Operate on the jobs that have the specified job_name. If jobId is not
specified, then only the most recently submitted qualifying job will
be operated upon. The -J option is ignored if a job ID other than 0
is specified in the jobId option.
jobId ...
Operate only on the jobs specified by jobId .... The options -u and
-J have no effect if a job ID other than 0 is specified. Jobs submitted
by any user may be specified here without using the -u option. If
the reserved job ID 0 is used, then the operation is applied to all
the started jobs satisfying option -u and -J, and all other job IDs
are ignored. If no jobId is specified, then the most recently submitted
job will be operated upon. Job IDs are returned at job submission
time (see bsub(1)), and may be obtained with the bjobs command (see
bjobs(1)).
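The options above can be combined as in the following sketch. It runs in dry-run mode by default and only prints the command lines; the BMIG variable, the job ID 1234, the job name sim_run, the user alice, and the host names hostA and hostB are all placeholders, not values from this page. How -m combines with job ID 0 is assumed from the option descriptions rather than confirmed.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
# Set BMIG=bmig on a host with LSF installed to run the commands.
BMIG="${BMIG:-echo bmig}"

# Migrate one specific started job (1234 is a placeholder job ID).
$BMIG 1234

# Migrate the most recently submitted started job named sim_run.
$BMIG -J sim_run

# Migrate all started jobs of user alice; the reserved job ID 0
# selects every started job that satisfies -u and -J.
# Migrating another user's jobs requires root or the LSF administrator.
$BMIG -u alice 0

# Restrict candidates to jobs on two hosts. The host list is quoted
# so that it is passed as a single argument.
# Assumption: -m narrows the set selected by -u all and job ID 0.
$BMIG -m "hostA hostB" -u all 0

# Force checkpointing of job 1234 even if non-checkpointable
# conditions exist.
$BMIG -f 1234
```

In dry-run mode echo prints its arguments joined by spaces, so the quotes around the host list do not appear in the printed line.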
bsub(1), brestart(1), bchkpnt(1), bjobs(1), bqueues(1), bhosts(1), bugroup(1), mbatchd(8), lsbatch(5), kill(1)