Transition from r43sgao or NIC to the newest Linux cluster: Forge

Click here for remaining issues.

0. Log on to Forge using Putty

If this is your first time logging on to Forge using Putty:

1. Turn on Xwin-32 or Xming (for graphics)
2. On your PC, find and double click on Putty
3. Find "r43" and highlight it.
4. Click "Load"
5. Change the Host Name to "forge.mst.edu"
6. Click on "Save" and put "Forge" under "Saved Sessions"
7. Double click "Forge" to bring up the login window and login.
8. Copy cshrc.sgao from /share/apps/sgao/ to your home directory by typing
  /bin/cp   /share/apps/sgao/cshrc.sgao   ~/.cshrc   (note the dot before cshrc).
9. Type /bin/cp   /share/apps/sgao/aliases.sgao   ~/.aliases   (note the dot before aliases).
10. Type /bin/cp   ~sgao/.gmt*4   .   (note the final dot, which stands for the current directory).
11. Check which shell you are using by typing echo   $0
12. If the result of the previous command is not "-tcsh", change your default shell using the following steps:
  chsh
  (and enter your password)
  /bin/tcsh
13. Type /bin/tcsh
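
Steps 8 and 9 overwrite any ~/.cshrc and ~/.aliases you already have. The following is a minimal sketch, not part of the original setup, of a helper that keeps a backup first. The function name backup_then_copy and the .bak suffix are assumptions made for illustration, and the function is written in sh syntax (not tcsh), so it would need to be saved to a file and sourced from sh or bash.

```shell
# Copy SRC over DEST, first saving any existing DEST as DEST.bak.
# Hypothetical helper: the name and the .bak suffix are illustrative only.
backup_then_copy() {
    src="$1"
    dest="$2"
    # Keep the old file, if there is one, so it can be restored later.
    if [ -e "$dest" ]; then
        cp "$dest" "$dest.bak"
    fi
    cp "$src" "$dest"
}
```

For example, backup_then_copy /share/apps/sgao/cshrc.sgao ~/.cshrc would leave the previous file in ~/.cshrc.bak.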

If this is NOT your first time logging on to Forge using Putty:

Only do steps 1, 2, and 7.

0.5. Load the "sgao" module

All the packages that we routinely use (SAC, GMT, etc.) are now in a single package called a "module".
To use them, every time after you logon to Forge (or when you open a new window),
you need to type "ml   sgao" to load the modules (including SAC, GMT etc.; ml stands for "module load").
If you add "ml   sgao" at the end of your ~/.cshrc file, you will no longer need to type "ml sgao" every time, but this will slow down some jobs.
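
If you do decide to add the line to ~/.cshrc, the sketch below shows one way to append it only when it is not already there, so that repeating the step does not create duplicates. The function name add_ml_sgao is hypothetical, and the code is in sh syntax rather than tcsh.

```shell
# Append "ml sgao" to a startup file unless the line is already present.
# The file is passed as an argument so the function can be tried on a copy.
add_ml_sgao() {
    rcfile="$1"
    # Match "ml", one or more spaces or tabs, then "sgao" on a line by itself.
    if ! grep -q '^ml[[:space:]][[:space:]]*sgao[[:space:]]*$' "$rcfile" 2>/dev/null; then
        echo 'ml sgao' >> "$rcfile"
    fi
}
```

A typical call would be add_ml_sgao ~/.cshrc. Remember the trade-off noted above: loading the module in every new shell may slow down some jobs.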

1. Re-compile a Fortran code on Forge:

Most Fortran codes compiled on r43sgao or NIC can NOT be executed on Forge. Therefore, they have to be re-compiled.
To do so:

vi   Makefile
If there is a "-m32", change it to "-m64".
Change "f77" or "g77" to "gfortran" and add "-std=legacy   -ffixed-line-length-90   -fbackslash" if any of these flags is missing.
(Note 1: A space is required before each dash.
Note 2: It is VERY important to add -ffixed-line-length-90. If you don't, codes may give wrong results even if they compile and run.)
If /cor1/sgao appears in the Makefile, change it to /home/sgao.

For example, change
g77   -g   -m32   v03.f   -o   evselect.exe
to
gfortran   -g   -m64   -std=legacy   -ffixed-line-length-90   -fbackslash   v03.f   -o   evselect.exe
(Note that a "Tab" is needed in front of gfortran).

Save Makefile
Type "touch   *.f"
Type "make" to compile.
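
The Makefile edits above can also be applied with sed. The following is only a sketch under stated assumptions: it handles command lines that begin with whitespace followed directly by f77 or g77 (as in the example above), it does not touch Makefiles that set the compiler through a variable, and it does not add missing flags to lines that already use gfortran. The function name update_makefile and the .bak backup are illustrative, and the code is in sh syntax rather than tcsh. Check the result by eye before typing "make".

```shell
# Rewrite a Makefile for gfortran on a 64-bit system, keeping a .bak copy.
# Hypothetical helper; review the edited Makefile before compiling.
update_makefile() {
    makefile="$1"
    cp "$makefile" "$makefile.bak"
    # 1) -m32 becomes -m64
    # 2) /cor1/sgao becomes /home/sgao
    # 3) a leading f77 or g77 becomes gfortran plus the three required flags,
    #    unless the line already contains -std=legacy
    sed -E \
        -e 's/-m32/-m64/g' \
        -e 's#/cor1/sgao#/home/sgao#g' \
        -e '/-std=legacy/!s/^([[:space:]]+)(f77|g77)([[:space:]]+)/\1gfortran -std=legacy -ffixed-line-length-90 -fbackslash\3/' \
        "$makefile.bak" > "$makefile"
}
```

After running update_makefile Makefile, the original is still available as Makefile.bak. The flags end up in a different order than in the hand-edited example, which does not matter to gfortran.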

Note: The new compiler (gfortran) does not like a "Tab" in a continued line of a Fortran program.
If you get a warning message like "Warning: Extension: Tab character in format at ...",
vi the Fortran code and replace the "Tab" with spaces.
(Note: A warning message is not fatal, so this change is not essential.)
Then re-compile by typing "make".
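
To find the offending lines before opening vi, a small helper like the one below can list every line that contains a Tab, with its line number. This is a sketch, and the function name list_tab_lines is hypothetical. It only reports the lines and changes nothing, because an automatic Tab-to-space conversion could shift a continuation mark into the wrong column.

```shell
# Print "line-number:text" for every line that contains a Tab character.
# When several files are given, grep also prefixes each match with the file name.
list_tab_lines() {
    tab=$(printf '\t')
    grep -n "$tab" "$@"
}
```

For example, list_tab_lines v03.f shows which lines of v03.f to fix by hand.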

If other issues occur, read the error messages carefully,
vi the Fortran code, go to the problematic lines, and make the changes.
Then re-compile by typing "make".

Note for those who are using the "original" (that is, without modifications) programs under /home/sgao/progs:
You can simply copy the corresponding *.f and Makefile under the right dir in /home/sgao/progs to your corresponding directory,
and type make to compile. All the programs and Makefiles under /home/sgao/progs have been tested.

2. Run jobs under interactive mode:

After you log on to Forge, you are on the "login node", which is basically a workstation that all users log on to.

You can run jobs that do not require a lot of CPU time on the login node such as:
vi, checking SWS measurements in 4c*, checking H-k results, writing and compiling GMT, Fortran, and other programs, running short (e.g., less than a few hours) programs etc.
There is no time limit on the login node (that is, you will not be kicked out after 7 days; see below).

But IT does not like long-running jobs to be run on the login node because this will slow down the login node and affect other users.

To use one of the many (hundreds) non-login nodes, type
sinteractive   --time=168:00:00   --cpus-per-task=2

(Note 1: I've made an alias called "run2" for the above command, so if you type "run2", you are in the interactive mode;
Note 2: You may need to wait for tens of seconds while your request waits in the queue.)

The "168" above is the number of hours requested for your job. The maximum is 168 hours (7 days).
The "2" in "cpus-per-task=2" means you can use 2 CPUs. For some of the MultiJob programs (MTZ, H-k stacking),
you may want to increase this number (to 10 or even larger, to a maximum of about 80).
To get out of the assigned node and get back to the login node, type "exit".
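
If you often need a different number of hours or CPUs, a small helper can build the command line for you. The sketch below only prints the command and does not run it. The function name interactive_cmd is hypothetical, and the code is in sh syntax rather than tcsh.

```shell
# Print the sinteractive command for a given number of hours and CPUs.
# Refuses requests above 168 hours, the 7-day limit described above.
interactive_cmd() {
    hours="$1"
    cpus="$2"
    if [ "$hours" -gt 168 ]; then
        echo "error: the maximum is 168 hours" >&2
        return 1
    fi
    echo "sinteractive --time=${hours}:00:00 --cpus-per-task=${cpus}"
}
```

For example, interactive_cmd 24 10 prints a command that asks for 24 hours and 10 CPUs, which you can then copy and paste at the prompt.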

3. Run jobs in the background (Batch mode)

You can run all the yy* and zz* command files and other programs that do not require human input using the batch mode.
Because the maximum amount of allowed time is 7 days, you may need to split your job into
several and submit them separately if you think that they may take more than 7 days.

You can submit batch jobs from the login node or any other node.

Steps:
Copy /home/sgao/demo/00_batch/run.batch to the directory where your command file resides.
Then vi run.batch and change yy_Do_all.cmd to the actual name of your command file or program.
You may also want to change the job-name to something that you can recognize (e.g., Yellowstone-MTZ; no spaces allowed).
To submit your job, type:
sbatch run.batch
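
The two edits to run.batch described above can also be scripted. The sketch below assumes that the job name in run.batch is set on a line of the form "#SBATCH --job-name=..."; the actual file under /home/sgao/demo/00_batch may be laid out differently, so check the result before submitting. The function name customize_batch and the .orig backup are illustrative, and the code is in sh syntax rather than tcsh.

```shell
# Replace the demo command file name and the job name in a copy of run.batch.
# Arguments: batch file, your command file, your job name (no spaces).
# Assumes neither name contains the characters | or &.
customize_batch() {
    batchfile="$1"
    cmdfile="$2"
    jobname="$3"
    cp "$batchfile" "$batchfile.orig"
    sed \
        -e "s|yy_Do_all\.cmd|$cmdfile|g" \
        -e "s|^\(#SBATCH[[:space:]]*--job-name=\).*|\1$jobname|" \
        "$batchfile.orig" > "$batchfile"
}
```

For example, customize_batch run.batch yy_MTZ.cmd Yellowstone-MTZ (where yy_MTZ.cmd is a made-up file name) prepares the file, after which "sbatch run.batch" submits it as usual.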

To monitor the progress, type
sview % (note there is a space before %), click on Jobs, and then sort by UserID.

In the folder where you sent your job, there should be a Forge-????.out file where ???? is the job number.
This file holds info about your job. Use the "more" command to see the content of this file (e.g., more Forge*out).
To make sure that this file is related to your current job, use " ls -l Forge*out " to find out the date/time of creation.
More than one such file may exist, because one is created every time you use sbatch. So make sure that you are looking at the right one.
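
When several of these files have piled up, the most recently modified one is usually the one you want. The sketch below picks it out by modification time. The function name latest_out is hypothetical, and the code is in sh syntax rather than tcsh.

```shell
# Print the most recently modified Forge-*.out file in a directory.
# With no argument, the current directory is searched.
latest_out() {
    dir="${1:-.}"
    # ls -t sorts newest first; head keeps only the first name.
    ls -t "$dir"/Forge-*.out 2>/dev/null | head -n 1
}
```

For example, more "$(latest_out)" in sh or bash pages through the newest output file in the current directory.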

4. Mount your W-drive (that is, your web space):

This can only be done on the Login node. Simply type:
"mountdfs" and enter your password.
To cd to your W-drive, type "cd   /mnt/dfs/$USER/userweb/$USER" or simply "www" (because I have made an alias)
Note: The W-drive will go off-line if you are idle for more than about 5 minutes. In this case, run mountdfs again.

5. Where are my files?

Your files under your "home directory" on r43sgao, which was named /nethome/users/your_username,
are now under /share/geodatd/users/your_username.

Your files under /home on r43sgao are now under /share/geodatc/your_username.

Your files under /share/geodata on r43sgao are now under /share/geodata. (no name change)

Your files under /share/geodatb on r43sgao are now under /share/geodatb. (no name change)
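
The mapping above can be written as a small lookup, which may help when updating old paths in command files. This is a sketch only: the function name new_path and the user name jdoe in the example are made up, and the code is in sh syntax rather than tcsh.

```shell
# Translate a path from r43sgao to its new location on Forge.
# Paths under /share/geodata and /share/geodatb are returned unchanged.
new_path() {
    case "$1" in
        /nethome/users/*) echo "/share/geodatd/users/${1#/nethome/users/}" ;;
        /home/*)          echo "/share/geodatc/${1#/home/}" ;;
        *)                echo "$1" ;;
    esac
}
```

For example, new_path /home/jdoe/run1 prints /share/geodatc/jdoe/run1.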


For more information about Forge, click here