Table of Contents
- Data Transfer Overview
- Transfers on TeraGrid
- Transfers to and from NCSA's Mass Storage System
- Offsite Transfers
Data Transfer Overview
Some common data-movement tasks include:
- Moving data from a production run off of the scratch file system.
- Transfer to NCSA Mass Storage System (MSS) (see below).
- Transfer from an NCSA cluster to an offsite location (see below).
- Transfer between NCSA clusters.
- SCP
- Advantages:
- Recursive feature allows simple reproduction of entire directory hierarchies
of files.
- Data is transmitted over a secure channel.
- Convenient way to transfer source code or other relatively small
files to/from your /home directory.
- Disadvantages:
- Each individual file is transmitted separately. This per-file overhead
becomes an issue when network latency is high.
- Performance is poor over wide area links due to small TCP window sizes.
- File transfers larger than 2GB are not supported.
- Recommendations:
- Use to transfer small files and/or directories containing source code or other
relatively small file sets.
- Use tar to bundle directories containing large numbers of files before sending
them over high-latency networks.
- FTP
- Advantages: Long-established Internet protocol and therefore widely available
and easy to implement.
- Disadvantages: Data is transmitted over an open channel.
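The window-size limitation noted for SCP can be made concrete with a rough calculation. A single TCP stream cannot move more than one window of data per round trip, so its throughput is bounded by the window size divided by the round-trip time. The sketch below is illustrative only: the function name max_kbps and the sample figures (a 64 KiB window and a 50 ms round trip) are assumptions, not measurements from NCSA systems.

```shell
# max_kbps WINDOW_BYTES RTT_MS
# Upper bound on single-stream TCP throughput, in kbit/s.
# At most one window is delivered per round trip, and
# bits per millisecond is the same quantity as kbit/s.
max_kbps() {
    echo $(( $1 * 8 / $2 ))
}

# Assumed example values: 64 KiB window, 50 ms round-trip time.
max_kbps 65536 50      # prints 10485 (about 10 Mbit/s)
```

Under these assumptions a single stream tops out near 10 Mbit/s no matter how fast the link is, which is why tools that open several streams at once tend to reach better aggregate rates over wide area networks.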
Transfers on TeraGrid
It is important to remember that transfers made between TeraGrid sites have the
full complement of TeraGrid tools available, including Globus GSI authentication
and dedicated GridFTP servers at each site. The sites are connected over a
high-bandwidth Wide Area Network (WAN). Within this framework, transfers between
computing centers can be best carried out by utilizing the combined network
bandwidth of several machines at the endpoints of a transfer. For more information
about data transfer on TeraGrid, see the Data Transfer Overview page.
Transfers to and from NCSA's Mass Storage System
Connectivity of each cluster into MSS varies. In general, multiple transfer streams
will achieve the best aggregate transfer rates. The following utilities are installed
on all production clusters at NCSA and can be used to transfer data to MSS.
uberftp
- Command line or interactive FTP interface.
- Parallel streams can be enabled.
- GSI (grid-proxy) authentication available for TeraGrid users.
- Supports third-party transfers and the GridFTP protocol.
mssftp/msscmd
- mssftp allows a passwordless interactive FTP session to be initiated from
any NCSA production machine.
- msscmd is a command line interface for sending FTP commands to MSS.
globus-url-copy
- Command line GridFTP client.
- Newer versions allow striped transfers across multiple servers.
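As a rough illustration of a multi-stream transfer, the sketch below assembles a globus-url-copy command line that requests parallel streams. It prints the command instead of running it, so it can be inspected on a machine without GridFTP access. The helper name gridftp_cmd, the host gridftp.example.org, and the file paths are all hypothetical placeholders; the right stream count for a given cluster is not stated here and would need to be found by experiment.

```shell
# gridftp_cmd STREAMS SRC_URL DST_URL
# Print (do not run) a globus-url-copy command that uses STREAMS
# parallel TCP streams for a single transfer.
gridftp_cmd() {
    # -vb reports transfer performance while the copy runs;
    # -p sets the number of parallel streams.
    printf 'globus-url-copy -vb -p %s %s %s\n' "$1" "$2" "$3"
}

# Hypothetical endpoints; substitute a real GridFTP server and real paths.
gridftp_cmd 4 file:///scratch/user/run01.tar \
    gsiftp://gridftp.example.org/home/user/run01.tar
```

Running the printed command requires a valid grid proxy on the local host.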
Offsite Transfers: From NCSA Facilities to Remote Systems
NCSA has eliminated clear text passwords. All outside connections
must be made through SSH or Kerberos-enabled Telnet.
Enabling Passwordless Login via Kerberos
Delegating Grid Credentials to a Remote Workstation
- To use GSI grid authentication from a remote workstation or non-TeraGrid
cluster, the Globus Toolkit (or at least a subset of it) must be installed.
- Grid credentials can then be passed to the remote client machine by using
an existing TeraGrid- or NCSA-accepted X.509 certificate as the initial
proxy.
- Valid proxies can be issued and stored on a TeraGrid or NCSA
MyProxy server, then delegated to a remote system. Refer to
NCSA's MyProxy Server page for instructions on configuring a local installation
of MyProxy to connect to the NCSA server. Note: MyProxy is included in the
Globus Toolkit.
- Once a valid proxy certificate exists on a correctly configured host, GSI
authentication tools will automatically connect to hosts for which the user has
been granted access.
Offsite Transfer Examples
SSH
The following transfers were performed from a Linux workstation outside of
the NCSA domain. A valid NCSA-issued Kerberos ticket was obtained by running kinit,
thus enabling secure passwordless access to NCSA HPC resources.
Copy a local directory structure via streaming tar onto NCSA TeraGrid.
$ tar -cf - tst/ | ssh user@tg-login4.ncsa.teragrid.org "tar xf -"
Copy a local directory into a tarball on the Tungsten cluster.
$ tar -cf - tst/ | ssh user@tuna.ncsa.uiuc.edu "cat > tit.tar"
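The streaming-tar pattern used in the SSH examples can be rehearsed locally before it is pointed at a remote host. In the sketch below a subshell stands in for the ssh session, and the temporary directories and sample file are made up for the demonstration.

```shell
# Rehearse the streaming-tar copy locally; the second subshell stands in for ssh.
src=$(mktemp -d)      # stand-in for the local working directory
dst=$(mktemp -d)      # stand-in for the remote home directory
mkdir -p "$src/tst/sub"
echo "sample" > "$src/tst/sub/data.txt"

# Same shape as: tar -cf - tst/ | ssh user@host "tar xf -"
(cd "$src" && tar -cf - tst/) | (cd "$dst" && tar -xf -)

# diff -r exits with status 0 only when both trees are identical.
diff -r "$src/tst" "$dst/tst" && echo "copy verified"
```

Because tar writes one continuous stream, the directory hierarchy arrives intact without a separate transfer per file.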
FTP
With a valid NCSA Kerberos ticket, users can enjoy
passwordless access to the NCSA Mass Storage System from a remote workstation.
Performance
Check with the network administrator of your local site for connectivity details
and possible firewall and/or network bottlenecks that can lead to unexpected
or inconsistent network bandwidth or functionality. Transfers can only take
place as fast as the slowest component in the network chain.