Getting help on the HPC clusters
- Overview of getting help on the HPC clusters
- Information to include/tips for submitting tickets
- Using the script command to record actions
- Submitting a help ticket
- Workshops
Overview of getting help on the HPC clusters
We are constantly trying to improve our documentation on the clusters,
and we ask that you look there first for basic questions about usage and
how to do things. We are developing a FAQ, and we have general
usage documentation. Please read these before asking questions via
help tickets.
While the HPC systems staff will try to assist you on just about any
question, we are generally not very familiar with the various applications
used in your research, and therefore cannot always
provide much useful assistance.
Often such questions are best directed at your colleagues. The Division of
Information Technology is trying to find ways to facilitate such collaboration
(suggestions are welcome).
Of course, not all questions are covered (or covered clearly) in the
documentation, and in these cases you should open a
help ticket.
And unfortunately, sometimes there are real hardware, software, or other
problems with the system. While we are sometimes aware of these issues from
our own monitoring, HPC clusters are by their nature complicated beasts, and some
issues are not easily detected from monitoring. So if you encounter an
issue that you believe to be of a system nature, please open a
help ticket.
Information to include/tips for submitting tickets
To help us serve you better, when you submit a help ticket, please:
- Open a new ticket for new issues. Replying to email from the ticketing
system will append to the existing ticket, which is appropriate if you are
providing further information, etc. about an ongoing issue. But new issues
need to have their own, new ticket. DO NOT REPLY to a
previous email if you are starting a new ticket, or at least remove the
ISSUE= and PROJ= parts of the subject line before sending.
- Provide a descriptive subject line, succinctly summarizing the problem.
Something like "HPC problems" is NOT very useful to us.
- Include your login name (i.e. your @umd.edu email address).
- Please include the name of the cluster you are experiencing a problem
with (e.g. Deepthought/DT, Deepthought2/DT2, Evergreen, Bswift, etc).
- If there are jobs involved in the issue, please provide at least some of
the job numbers.
- Similarly, provide the full path to the script you used to submit the
job, and to the stdout/stderr files from the jobs.
- If you received an error message when running a command (not already
in the stderr/stdout files above), please provide the message. The
script command (see below) might be helpful.
- Please do NOT attach the contents of the files above,
or include them in the body of the message. Just provide the paths to the
files on the system. If you plan to submit the job again with a slightly
modified version of the file, please copy it and give us the name that you
copied it to.
- Please, PLEASE do NOT attach screenshots of text.
Please get the output into a file (see the script command below
to capture output if needed), and give us the path
to that file.
- If you are reporting what seem to be problems with your jobs being stuck
in the pending state in the queue, do NOT delete the jobs unless we tell you
to do so. Deleting the jobs makes it harder to diagnose the problems.
- If you are having connection issues, please include the exact command
you are running, the host you are trying to connect to, the username you
are using (DO NOT INCLUDE PASSWORDS), the approximate time of the failed
attempts (as accurately as you can), and if possible the IP address of the
machine you are trying to connect from. The URL
http://noc.net.umd.edu/cgi-bin/netmgr/whoami will give you that last
piece of information.
- For new tickets, please provide context and complete information. Do
NOT assume that we are aware of matters discussed in other tickets you
submitted. Several people respond to the tickets; the person answering your
current ticket might not have dealt with your previous tickets. We also have
thousands of users; even if the same person dealt with your previous issues,
they might not remember them. In general, you do not need to, and should NOT, mention previous
tickets, but if you think one is relevant, either give the ticket number (so
we can look it up) or a succinct but complete summary. For follow-ups on
the same ticket number, we already have a history of the ticket and so you
do NOT have to repeat things in every response.
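As a rough illustration of the checklist above, the following Python sketch assembles a message body containing the requested details. It is not an official tool: the TicketInfo fields, the format_ticket function, and every value in the example (user name, job numbers, file paths) are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TicketInfo:
    """Details the checklist asks for (field names are illustrative only)."""
    subject: str
    login: str
    cluster: str
    description: str
    job_ids: list = field(default_factory=list)
    file_paths: list = field(default_factory=list)


def format_ticket(info: TicketInfo) -> str:
    """Build a plain-text message that lists file paths instead of file contents."""
    lines = [
        f"Subject: {info.subject}",
        "",
        f"Login name: {info.login}",
        f"Cluster: {info.cluster}",
    ]
    if info.job_ids:
        lines.append("Job numbers: " + ", ".join(str(j) for j in info.job_ids))
    if info.file_paths:
        lines.append("Relevant files (paths on the cluster, not attached):")
        lines.extend(f"  {path}" for path in info.file_paths)
    lines += ["", info.description]
    return "\n".join(lines)


# Placeholder values only; substitute your own details.
example = TicketInfo(
    subject="Jobs on Deepthought2 fail with 'ncap2: Command not found'",
    login="username@umd.edu",
    cluster="Deepthought2",
    description="The job script exits immediately; the error is in the stderr file.",
    job_ids=[1234567, 1234568],
    file_paths=["/home/username/myjob.sh", "/home/username/myjob.err"],
)
print(format_ticket(example))
```

The point of the sketch is the shape of the message: a specific subject, the cluster name, job numbers, and paths to files rather than attachments or pasted contents.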
Using the script command to record actions
Sometimes when diagnosing an issue, we will ask you to show us exactly
what commands you issued and what they returned. Or you may need to show us
a long, complicated error message. A useful tool in these cases is the
script command; once you issue it, it starts a new shell and logs all
of your input to, and all of the output from, that shell. This is not that
useful for programs that run in a graphical environment, but it provides a fairly
good log for command line processes.
For example, the following logs the session to the file help.script
in the home directory:
login-1:~: script help.script
Script started, file is help.script
login-1:~: date
Tue Oct 21 10:41:07 EDT 2014
login-1:~: module list
Currently Loaded Modulefiles:
1) dept/Glue
login-1:~: ncap2
ncap2: Command not found.
login-1:~: exit
exit
Script done, file is help.script
login-1:~:
login-1:~:
login-1:~: cat help.script
Script started, file is help.script
login-1:~: date
Tue Oct 21 10:41:07 EDT 2014
login-1:~: module list
Currently Loaded Modulefiles:
1) dept/Glue
login-1:~: ncap2
ncap2: Command not found.
login-1:~: exit
exit
Script done on Tue Oct 21 10:42:51 2014
login-1:~:
NOTE: Always remember to exit the shell started by the script command.
And, as in the above example, it can be useful
to print the contents of the file (e.g. with the cat
command) to verify things were properly recorded.
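When only a single non-interactive command needs to be recorded, a short Python wrapper is one possible alternative to an interactive script session. The sketch below is an illustration under stated assumptions, not a site-provided utility: the capture_command function and the log file name are hypothetical, and it only works for real executables (shell built-ins, aliases, and functions are not executables and still need script).

```python
import subprocess
from pathlib import Path


def capture_command(argv, log_path):
    """Run argv and write its command line, combined output, and exit status to log_path."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the same stream as stdout
            text=True,
        )
        output, status = result.stdout, result.returncode
    except FileNotFoundError as exc:
        # The executable does not exist; 127 mirrors the usual shell convention.
        output, status = f"{exc}\n", 127
    if output and not output.endswith("\n"):
        output += "\n"  # keep the status line on its own line
    command_line = " ".join(argv)
    Path(log_path).write_text(f"$ {command_line}\n{output}[exit status {status}]\n")
    return status
```

For instance, calling capture_command(["hostname"], "help.log") would leave a small log in help.log whose full path can then be given in the ticket, in the same way as a file produced by script.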
Submitting a help ticket
There are several ways to actually submit the ticket to the UMD
Division of Information Technology:
- To update an existing ticket, you can just reply to a previous email
for that ticket.
- You can open a new ticket by emailing hpcc-help@umd.edu.
Please provide a reasonable subject line.
- You can submit a new ticket online via the web interface.
NOTE: The Division of Information Technology at
the University of Maryland does NOT maintain the
MARCC/Bluecrab HPC cluster. While you are welcome to submit a ticket
to the Division of Information Technology for support with issues
on the Bluecrab cluster, and we will try to assist you, many matters
will require or be more readily solved by
contacting the MARCC support staff. Again, please provide a reasonable
subject line. If you decide to contact both support staffs on the same
issue, kindly:
- send separate emails to marcc-help@marcc.jhu.edu and hpcc-help@umd.edu, or blind carbon
copy (BCC) both help addresses. Both of these addresses feed into
ticketing systems, and the automated replies can create some minor havoc
when two distinct ticketing systems are included on the same issue.
- inform us in the ticket that you have contacted both groups.