New on Bridges
- Outbound connections enabled for interactive jobs
- Beginning June 1, 2017, GPUs will be allocated as a separate resource
- An updated scratch file system named pylon5 will replace pylon1
- The walltime limit for the LM partitions is increased to 14 days
Outbound Connections for Interactive Jobs
A new option, --egress, has been enabled for interactive jobs. This allows you to monitor your Bridges jobs on your local machine. To enable outbound connections, add --egress to your interact command:
interact --egress ...more options...
More information on the interact command is found in the Running Jobs section of the Bridges User Guide.
Allocating for GPUs Separately
From Bridges' beginning, users with a Bridges Regular Memory allocation have had access to Bridges' GPU nodes. Due to high user demand, the GPU nodes will be allocated as a separate resource beginning on June 1, 2017. Starting in June, users with existing Regular Memory allocations will not have access to the GPU nodes. Only users with a Bridges GPU allocation will be able to use them. You will receive notification of the exact date for this transition as it nears.
If you have a current Regular Memory allocation on Bridges that ends later than June 2017 and you wish to use the GPU nodes, you can request a transfer of some of the Service Units (SUs) in your Bridges Regular Memory allocation to a Bridges GPU allocation.
When you apply for an allocation that will start on June 1, 2017 or later, you will be able to request the Bridges GPU resource when submitting your proposal.
Transferring part of a Regular Memory allocation
Beginning in early May, you can transfer some of your Regular Memory allocation to a GPU allocation. To do this, submit a transfer request through the XSEDE User Portal. Instructions for submitting a transfer request are found here: https://portal.xsede.org/knowledge-base/-/kb/document/avva
Charging for GPUs
Bridges contains two types of GPU nodes: nodes with NVIDIA Tesla K80 GPUs and nodes with NVIDIA Tesla P100 GPUs. Because the two node types differ in performance, they are charged at different rates.
Each K80 node holds 4 GPUs, each of which can be allocated separately. Service Units (SUs) are defined in terms of GPU-hours:
1 GPU-hour = 1 SU
Note that the use of an entire K80 GPU node for one hour would be charged 4 SUs.
Each P100 node holds 2 GPUs, which can be allocated separately. Service Units (SUs) are again defined in terms of GPU-hours:
1 GPU-hour = 2.5 SUs
Note that the use of an entire P100 node for one hour would be charged 5 SUs.
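The two rates above work out as in this small sketch of the arithmetic (the rates and GPU counts are those stated above; the 3-hour job length is just an example):

```shell
# SU charges per the rates above: K80 GPUs at 1 SU per GPU-hour (4 GPUs per node),
# P100 GPUs at 2.5 SUs per GPU-hour (2 GPUs per node).
hours=3
k80_node_sus=$(( 4 * 1 * hours ))                                  # whole K80 node
p100_node_sus=$(awk -v h="$hours" 'BEGIN { print 2 * 2.5 * h }')   # fractional rate, so use awk
echo "Whole K80 node for ${hours} hours:  ${k80_node_sus} SUs"
echo "Whole P100 node for ${hours} hours: ${p100_node_sus} SUs"
```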
For more information on Bridges' GPU nodes, see https://www.psc.edu/index.php/bridges/user-guide/gpu-use.
For information on running a job on Bridges, including how to submit a job that uses GPUs and how to select the type of GPU node you will use, see https://www.psc.edu/index.php/bridges/user-guide/running-jobs.
Updated Scratch File System, pylon5
An upgraded scratch file system, named pylon5, will be available on Bridges as of March 7 to replace pylon1. pylon5 will serve the same purpose as pylon1, specifically:
- It is fast, temporary storage for running jobs
- Files are wiped after 30 days
- It is not backed up
Be aware that any job scripts which specifically reference /pylon1 must be edited to reference /pylon5 instead.
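One way to make that edit in place is with sed, sketched below. The script name myjob.sh and the paths inside it are hypothetical placeholders; substitute your own script names and verify the result before submitting.

```shell
# Create a toy job script for illustration; on Bridges you would already have one.
printf '#!/bin/bash\ncd /pylon1/mygroup/myuser\n' > myjob.sh

# Rewrite every /pylon1 path to /pylon5 in place.
sed -i 's|/pylon1|/pylon5|g' myjob.sh

# Check the result.
grep pylon myjob.sh
```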
You are responsible for moving your files from pylon1 to pylon5. You will have from March 7 to March 30 to do this. The pylon1 file system will be decommissioned on April 4, 2017, and any files remaining there will be lost.
Moving your pylon1 files to pylon5
Substitute your group name and username for "yourgroup" and "yourusername" in these examples. We recommend using rsync rather than cp to copy your files from pylon1 to pylon5. An advantage of rsync over the cp command is that if the transfer is interrupted, running the same rsync command again copies only those files that were missed (or partially copied) on the first try.
We also recommend that you do the file transfer via the srun command. This will avoid deterioration of service on the login nodes, and will capture any errors in the job output file.
To transfer all of your pylon1 files, use this command:
srun -p RM-small rsync -av /pylon1/yourgroup/yourusername/ /pylon5/yourgroup/yourusername/
If you want rsync to calculate checksums of the files in both places as additional validation, add the -c option:
srun -p RM-small rsync -avc /pylon1/yourgroup/yourusername/ /pylon5/yourgroup/yourusername/
More information on the rsync command and its options can be found by typing man rsync.
Time limit on Large Memory Partition Increased
In response to demand from users taking advantage of Bridges' Large Memory nodes, the walltime limit in the LM partition has been extended to 14 days. For information on how to run jobs in the LM partition, see https://www.psc.edu/bridges/user-guide/running-jobs.