Showing posts with label globus. Show all posts
Showing posts with label globus. Show all posts

Thursday, September 17, 2015

Globus endpoint sharing available to UM researchers

We have described in a number of Blog posts some features and benefits of using the the Globus File Transfer service.  Now that UM is a Globus Provider you have a new feature available to you, sharing of directories and files with your collaborators who are Globus users as well.

There are two avenues of sharing possible for you now. The first is via standard server endpoints that have sharing enabled and another via "Globus Connect Personal" client endpoints. Today I will describe sharing for standard servers endpoints only. Sharing for Personal Connect endpoints is a bit more complicated due to differences between OS versions of the client and will be described later.

To see if the endpoint you use has sharing enabled navigate to the endpoint in "Manage Endpoints" within the Globus web interface.  Click on the Sharing tab, note that you may have to Activate (login) a session on the endpoint first.  If sharing is enabled you will be told so and will see a "Add Shared Endpoint" button in the panel.  Shared endpoints are essentially sub-endpoints you can create and provide access to any other Globus user.

Lets go ahead and make a shared endpoint from umich#flux by clicking on the button.  You are presented a web form to provide required information:

Host Path  ~/Test_share

You can either give a complete absolute path or use unix shorthand (~/) for my home directory as I have done (make sure the shared directory exists first!).

New Endpoint Name   traeker#Test_share

Description Tell others know what this is about.


Clicking on the "Create and Manage permissions" button creates the shared endpoint and presents you with a new panel to manage permissions. It shows you the current access setting and clicking on the "Add Permission" button presents you with a number of options of how to share this endpoint with other Globus users.

Share With     check which one to use among  email, user, group, all users

Permissions  check one or both of read, write


A couple of things you to keep in mind as you set these parameters:
  • Be careful about choosing all users as this will allow all users logged into Globus to access this share.
  • By default only read permission is set. If you allow write permission you could get files containing viruses and also get yourself into trouble with any disk usage quotas.


One easy way to manage permissions to a large group of people is to create a Globus group and populate it with users.  Be advised that the entire group will have the same permissions so if you need some users to have different permissions, you either create a different group or add each user to the share individually. Using groups comes in handy when you have multiple shared directories to similar sets of collaborators.

Once a directory is shared with another Globus user he/she can find that endpoint name via the "shared with me" filter on the top of the endpoint list panel.  With name in hand they can now transfer files from/to that endpoint by typing in the name under the "Transfer Files" screen just like another other endpoint they have access to.

You can go back to this shared endpoint to add new or edit any access settings.


Globus endpoint sharing is very powerful as it gives non-UM collaborators access to your research data without having to create a UM "Sponsored account" for them to access your systems.  This is very similar to other cloud file sharing services like Box and Dropbox.  The big difference is that Globus does not store the data and thus quotas are managed by your systems policies.



Sunday, April 26, 2015

Filetransfer Tool Performance

On the HPC systems at ARC-TS we have two primary tools for transferring data, scp (secure copy), and Globus (GridFTP).  Other tools like rsync and sftp operate over scp and thus will have performance comparable to that tool.

So which tool performs the best?  We are going to test two cases each moving data to the XSEDE machine Gordon at SDSC. One test will be for moving a single large file, the second will be many small files.

Large file case.

For the large file we are moving a single 5GB file from Flux's scratch directory to the Gordon scratch directory.  Both filesystems can move data at GB/s rates so the network or tool will be the bottleneck.

scp / gsiscp

[brockp@flux-login2 stripe]$ gsiscp -c arcfour all-out.txt.bz2
                             gordon.sdsc.xsede.org:/oasis/scratch/brockp/temp_project
india-all-out.txt.bz2                     100% 5091MB  20.5MB/s  25.6MB/s   04:08   

Duration: 4m:08s

Globus 

Request Time            : 2015-04-26 22:41:04Z
Completion Time         : 2015-04-26 22:42:44Z
Effective MBits/sec     : 427.089

Duration: 1m:40s  2.5x faster than SCP

Many File Case

In this case the same 5GB file was split into 5093 1MB files.  Many may not know that every file has overhead, and that it is well known that moving many small files of the same size is much slower than moving one larger file of the same total size.  How much impact and can Globus help with this impact read below.

scp / gsiscp

[brockp@flux-login2 stripe]$ time gsiscp -r -c arcfour iobench
                             gordon.sdsc.xsede.org:/oasis/scratch/brockp/temp_project/
real 28m9.179s

Duration: 28m:09s

Globus

Request Time            : 2015-04-27 00:18:40Z
Completion Time         : 2015-04-27 00:25:30Z
Effective MBits/sec     : 104.423

Duration: 7m:50s  3.6x faster than SCP

Conclusion

Globus provides significant speedup both for single large files and many smaller files over scp.  The result is even more significant the smaller the files because of the overhead in scp doing one file at a time.