Data Access, Processing and Transfer

Automatic Data Processing

If the "start mosflm" box is ticked in the Characterisation Data Collection Tab of the CrystalControl GUI, then the automatic strategy calculation is enabled and Mosflm runs in the background.

If the "start XDS" box is ticked in the Standard Data Collection Tab of the CrystalControl GUI, then the automatic data processing is enabled and XDSAPP runs in the background.

The data processing produces two sets of results:

  • pre: a preprocessing run, limited to the frames covering the first 5°, 15°, 45°, etc. of the rotation
  • full: a full data processing run, i.e. including all the frames

During your beamtime, the results of automatic data processing can be viewed in the web browser at the beamline as well as in the browser of the remote session.

 

Manual Data Processing

Prerequisites

Local users can perform manual data processing as described at the beamline. Users with remote access can process their data only through their DESY Science Account. Users may apply for Science Accounts to analyze data measured at PETRA III or FLASH on resources located at the DESY computer center. Science Accounts have a standard lifetime of 3 years (renewable).

Applying for a Science Account:

  1. the user submits a request to the beamline scientist/manager
  2. the user receives a pre-filled form by email from the DESY-FS Administrator within a few business days
  3. the form has to be signed and sent back to the Administrator (a scan or photo of the signed hard copy, in PDF format, by email or fax)
  4. the user receives the initial credentials from the beamline scientist/manager who forwarded the account request. If on site at DESY, the user can instead present a valid ID card or passport to the FS-EC Administrators to retrieve the initial password for the account.

Multi Factor Authentication (MFA) at DESY:

Access to the DESY computers requires two-factor authentication. After you have received your Science Account, you will need to set up an authenticator app on your smartphone. You can find the instructions here.

 

Accessing Data

There are three methods to access DESY computers from outside using a Science Account:

  1. Command line:
    in a Linux terminal or Cygwin (Windows), type:
    • ssh username@max-fs-display.desy.de
    • enter your Science Account password
    • enter your 2FA token
    • cd /asap3/petra3/gpfs/p11/YEAR/data/YourBeamTimeID
  2. Graphical User Interface:
    in a web browser type:
    • https://max-fs-display.desy.de:3389/
    • enter your Science Account credentials
    • enter your 2FA token
    • launch e.g. an XFCE session
    • open a terminal
    • cd /asap3/petra3/gpfs/p11/YEAR/data/YourBeamTimeID
  3. Installing and using the FastX3 client:
    • visit Downloads DESYcloud and log in with your DESY account.
    • click on the desycloud link next to the starnet entry. If you get to an empty page, just reload the page.
    • go to StarNet_FastX3/client/3.3 (or the highest version listed) and download the package matching your OS
      1. FastX-*-setup.exe and FastX3.msi are regular Windows installers
      2. FastX-*-setup_nonroot.exe is a "portable" Windows installer that can, for example, be placed on a USB stick
      3. FastX3-*.dmg is the macOS installer
      4. FastX3-*.rhel7.x86_64.tar.gz is a tarball for Linux; it also works on Ubuntu or Debian (possibly requiring some additional packages)
    • install and launch FastX3
    • configure a new https connection (host: max-fs-display.desy.de, port: 3389, auth: SSH)
    • double click on the new connection
    • enter your 2FA token
    • launch e.g. an XFCE session
    • open a terminal
    • cd /asap3/petra3/gpfs/p11/YEAR/data/YourBeamTimeID
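All three access methods end in the same directory. As a convenience, the path can be assembled by a small shell function, for instance placed in your shell startup file on the DESY side. This is only a sketch: the function name beamtime_dir and the year and beamtime ID in the example call are placeholders, not values from this guide.

```shell
#!/bin/sh
# Hypothetical helper: print the P11 data directory for one beamtime.
# $1 = year of the beamtime (four digits), $2 = your beamtime ID
beamtime_dir() {
    case "$1" in
        [0-9][0-9][0-9][0-9]) ;;
        *) echo "usage: beamtime_dir YEAR BEAMTIME_ID" >&2; return 1 ;;
    esac
    if [ -z "$2" ]; then
        echo "usage: beamtime_dir YEAR BEAMTIME_ID" >&2
        return 1
    fi
    printf '/asap3/petra3/gpfs/p11/%s/data/%s\n' "$1" "$2"
}

# Example call with placeholder values (replace them with your own):
beamtime_dir 2024 11001234
```

After logging in, cd "$(beamtime_dir 2024 11001234)" would then change into the data folder, again with your own year and beamtime ID substituted.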

Pros & cons:

  • ssh: simple and convenient, but no GPU acceleration, and X11 over ssh is slow. Sessions are not persistent.
  • browser: simple, fast, works everywhere and offers GPU hardware acceleration, but has some limitations (copy & paste, browser capturing key shortcuts). Supports session sharing.
  • client: fast and works best, but requires installation of a client
  • browser and client offer persistent sessions: when you disconnect, your session continues to run without any client connected. Reconnecting to the display server gives access to the running session.
 

Resources Usage and Constraints

Please read the following instructions carefully:

  • the login nodes (display nodes) are interactive, shared nodes, so users must not run intensive jobs on these machines: do NOT use more than 4 threads or 40 GB of RAM for processing
  • these nodes are for small jobs, code development, job submission, short-running applications with a low CPU and memory footprint, and graphical applications requiring GPU acceleration
  • the nodes are shared between all users, i.e. whatever a user does will affect other users working on these machines as well
  • demanding compute jobs have to be executed on the SLURM-based compute cluster, typically as batch jobs (i.e. without a GUI)
  • the resources are for work on data taken at DESY Photon Science (i.e. PETRA III or FLASH) or for work for DESY
  • users must follow the computing rules they agreed to when signing the account form, for example with regard to local software deposit
  • these resources cannot be used for any kind of commercial purposes
  • misuse of the resources can lead to revocation of the account
 

Data processing

Each processing folder contains two subfolders, 'xdsapp' and 'manual'. The 'manual' folder contains a template for manual processing with all the correct parameters. However, to shorten the processing time and make use of the computing infrastructure, reprocess your data on the Maxwell nodes by using a script that queues the jobs in SLURM:

  • template scripts can be found at /asap3/petra3/gpfs/common/p11/processing_scripts/
    • xds_users.sh: for external users
    • xds_inhouse.sh: for in-house users
  • the scripts differ in the partition (pscpu should be used for internal users of Photon Science and psxcpu for external users) and in the XDS version used for processing
  • copy the script to the folder where you run the processing
  • launch the script by typing sbatch xds_name_of_the_script.sh (e.g. sbatch xds_users.sh for external users)
  • the job is then assigned a free node, so that it runs fast
  • the XDS log is written to a file named xds-job-###.out
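The copy-and-submit steps above can be sketched as two small shell functions. The function names xds_template and submit_xds are hypothetical and not part of the provided scripts; the template directory and file names are the ones listed above. submit_xds only works on a machine where sbatch is available.

```shell
#!/bin/sh
# Directory holding the template scripts (as given in the list above).
SCRIPT_DIR=/asap3/petra3/gpfs/common/p11/processing_scripts

# Hypothetical helper: map the user type to the template file name.
xds_template() {
    case "$1" in
        external) echo "xds_users.sh" ;;
        inhouse)  echo "xds_inhouse.sh" ;;
        *) echo "usage: xds_template external|inhouse" >&2; return 1 ;;
    esac
}

# Hypothetical helper: copy the matching template into the current
# directory and queue it with sbatch.
submit_xds() {
    template=$(xds_template "$1") || return 1
    cp "$SCRIPT_DIR/$template" . || return 1
    sbatch "$template"
}
```

From the folder where the processing should run, an external user would call submit_xds external. The state of the queued job can then be checked with squeue -u "$USER".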

Serial crystallography data should be processed with nXDS via a script that queues the jobs in SLURM:

  • a template script (nxds.sh) as well as an example input file (nXDS.INP) can be found at /asap3/gpfs/common/p11/processing_scripts/
  • the partition in the script is set to psxcpu for external users (pscpu should be used for internal users of Photon Science)
  • copy the script and the nXDS.INP to the folder where you run the processing
  • the nXDS.INP file needs some editing to fit your data path and data collection parameters (wavelength, detector distance, oscillation...)
  • the processing is launched by typing sbatch nxds.sh
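Editing nXDS.INP can be done in any text editor. For repeated runs, a small helper that rewrites one keyword line may be convenient. The sketch below makes several assumptions: the function name set_param is hypothetical, the keyword names in the commented calls follow the XDS.INP conventions and should be checked against the example nXDS.INP, each keyword is assumed to start its own line, and all values are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: set "KEYWORD= value" in an input file.
# $1 = input file, $2 = keyword (without '='), $3 = new value
# Only lines that start with the keyword are changed, and anything after
# the '=' on such a line (including trailing comments) is replaced.
# The original file is kept as <file>.bak.
set_param() {
    sed -i.bak "s|^\([[:space:]]*$2=\).*|\1 $3|" "$1"
}

# Example calls (assumed keyword names, placeholder values):
# set_param nXDS.INP X-RAY_WAVELENGTH 1.0332
# set_param nXDS.INP DETECTOR_DISTANCE 200.0
# set_param nXDS.INP OSCILLATION_RANGE 0.1
```

Because '|' is used as the sed delimiter, values containing '/' (such as data paths) are fine, but values containing '|' or '&' would need escaping.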

If manual processing through a terminal or GUI is absolutely necessary:

  • allocate a node to yourself by typing in a terminal: salloc --nodes=1 --partition=psxcpu --time=xx:xx:xx
  • ssh to the node allocated to you
  • load the modules of the required crystallographic software on the node (instructions below)
  • Note: terminate the allocation after use by typing exit twice in your terminal window
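As an illustration of the allocation step, the sketch below prints a complete salloc command line for a given wall time, using salloc's --nodes option. The function name alloc_cmd and the two-hour wall time are assumptions made for the example; psxcpu is the partition for external users named above.

```shell
#!/bin/sh
# Hypothetical helper: print the salloc command for a single node.
# $1 = wall time as HH:MM:SS
# $2 = partition (optional, defaults to psxcpu for external users)
alloc_cmd() {
    case "$1" in
        [0-9][0-9]:[0-9][0-9]:[0-9][0-9]) ;;
        *) echo "time must be given as HH:MM:SS" >&2; return 1 ;;
    esac
    echo "salloc --nodes=1 --partition=${2:-psxcpu} --time=$1"
}

# Example: request one node for two hours (placeholder wall time).
alloc_cmd 02:00:00

# Remaining steps, to be typed by hand once the allocation is granted:
#   squeue -u "$USER"    # the NODELIST column shows the allocated node
#   ssh <node-name>      # log in to that node and load the modules
#   exit                 # first exit leaves the node,
#   exit                 # second exit releases the allocation
```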

 

Crystallographic and Scientific Software Packages

Some software packages are available out of the box as system installs or as part of the OS default repository; others are available via the module functionality. A description of how to enable and use a particular software package is given on Confluence.

Crystallographic software packages are in the Photon Science section.

In a terminal type (as examples):

  • module avail # to check what programs and versions are available
  • module load xray # sets the environment for old xds and helper GUIs such as XDSAPP
  • module load maxwell xds # sets the environment for new xds
  • module load ccp4/7 # sets the environment for ccp4 version 7
  • module load phenix # sets the environment for phenix

Once the correct module is loaded, the program can be called from the command line by typing its shortcut or alias (xdsapp, phenix, albula, etc.).
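To confirm that loading a module actually put a program on your PATH, a check along these lines can help. It is a minimal sketch: the function name check_program is hypothetical, and the program names in the commented calls are the shortcuts mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: report whether a program is reachable on PATH.
# $1 = program name or shortcut
check_program() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found, load the corresponding module first" >&2
        return 1
    fi
}

# Example session on a node (module names as listed above):
# module load xray
# check_program xdsapp
# module load phenix
# check_program phenix
```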

 

 

Data Transfer

 
  1. Data are transferred locally at P11 via rsync. The local contact sets up the backup to the users' external hard drive.
    • For remote session users, an external HDD can be sent with the dewar.

  2. Data are also available for download via the Gamma portal, as described here.

  3. A third option is to use Globus Online, which is available exclusively to users with Science Accounts and an institute subscription to the Globus Group:
    • check with your local administrator whether your institute is a subscriber of the Globus Group. If so, log in to Globus Online by choosing your organization and identity provider
    • create an endpoint on your machine by downloading the suitable "Globus Connect" for your operating system
    • choose DESY in the "Collection Search" and browse to your beamtime folder
    • start transferring data in the file manager section; further instructions can be found on FS-EC's Confluence page