How to run calculations on the "Blokhin" cluster

Cluster Overview
Blokhin is a 234-core high-performance computing cluster with a peak performance of 10.5 TFlops. The cluster is aimed at computationally intensive tasks that require high processor performance and large amounts of RAM.
 23 nodes (h1-h23), each with a 6-core Intel Core i7-5930K processor @ 3.5 GHz (Haswell) and 64 GB DDR4
 3 nodes (b1-b3), each with two 8-core Intel Xeon E5-2667 v2 processors with a core clock of 3.3 GHz and 384 GB DDR3
 2 nodes (b4-b5) for very memory-demanding tasks, each with two 8-core Intel Xeon E5-2620 v4 processors with a maximum core clock of 3.0 GHz. Node b4 has 768 GB of RAM, node b5 has 640 GB.

The main cluster control node is equipped with two 8-core Intel Xeon E5-2667 v2 processors. The Intel Parallel Studio XE 2015 Cluster Edition development suite for C++ and Fortran gives programs better performance and stability than open-source compilers.
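The core count quoted in the overview can be checked with a short tally. The sketch below assumes that the 234-core figure includes the control node; the dictionary layout and names are illustrative only.

```python
# Tally of CPU cores from the node descriptions in the overview.
# Assumption: the 234-core total includes the control node.
node_groups = {
    # group: (number of nodes, processors per node, cores per processor)
    "h1-h23": (23, 1, 6),
    "b1-b3": (3, 2, 8),
    "b4-b5": (2, 2, 8),
    "control node": (1, 2, 8),
}

cores_per_group = {
    name: nodes * cpus * cores
    for name, (nodes, cpus, cores) in node_groups.items()
}
total_cores = sum(cores_per_group.values())

for name, cores in cores_per_group.items():
    print(f"{name}: {cores} cores")
print(f"total: {total_cores} cores")
```

Under that assumption the groups contribute 138 + 48 + 32 + 16 = 234 cores.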

Cluster connection
Head node address: hpc.nano.sfedu.ru    IP address:  195.208.250.3
Calculations are run over SSH (for example, with PuTTY). Files are exchanged over FTP (for example, with Total Commander or WinSCP). Access requires a login and password issued in advance.
The cluster can also be accessed through the WebHPC cluster management system.

Installed Software
The cluster has licensed software for performing quantum mechanical calculations. 
FDMNES - finite difference method for calculating X-ray absorption spectra http://neel.cnrs.fr/spip.php?article3137. Approximation used: density functional theory; the calculation is carried out in real space on a grid around the absorbing atom.
VASP5.4 - pseudopotential method for calculating the electronic structure, geometry optimization and molecular dynamics of periodic structures https://www.vasp.at. The calculation is carried out in reciprocal space (band method), using a plane-wave expansion of the wave function within density functional theory. LDA+U and GW corrections can be included.
USPEX - evolutionary algorithm for finding stable structures with minimal free energy http://uspex.stonybrook.edu/uspex.html. It has an interface to VASP. The only structural input required is the stoichiometry of the desired compound.
ADF - molecular orbital method within density functional theory for calculating the electronic structure, excitations and geometry optimization of molecules and nanoclusters. Our laboratory has software (ADFEmis) for calculating absorption and emission spectra from wave functions calculated in ADF.
Wien2k - full-potential method for calculating the electronic structure, spectra and geometry optimization of periodic structures. The calculation is carried out in reciprocal space (band method), using an expansion of the wave function in plane waves and localized orbitals within density functional theory. Spin-orbit interaction, LDA+U and noncollinear magnetism can be included.
ORCA - molecular orbital method within density functional theory for calculating the electronic structure, excitations and geometry optimization of molecules and nanoclusters. In addition to DFT, various many-electron approaches are implemented: MP2, CC, CI, etc.
MOLCAS - molecular orbital method within density functional theory for calculating the electronic structure, excitations and geometry optimization of molecules and nanoclusters. In addition to DFT, various many-electron approaches are implemented: MP2, CC, CI, etc.
ABINIT - pseudopotential method for calculating the electronic structure, geometry optimization and molecular dynamics of periodic structures http://www.abinit.org. The calculation is carried out in reciprocal space (band method), using a plane-wave expansion of the wave function within density functional theory. LDA+U, GW and DMFT corrections can be included.
Quantum ESPRESSO (web-page) - a package for quantum-mechanical calculations in a plane-wave basis using separable norm-conserving pseudopotentials (Hamann-Schlüter-Chiang, Troullier-Martins and other types), ultrasoft pseudopotentials, or the PAW method. Exchange-correlation functionals of various types are supported, from the local density approximation (LDA) to the generalized gradient approximation (GGA) in the forms proposed by different authors. An extensive library of pseudopotentials for various atoms is available for QE. In addition, pseudopotentials with the required parameters can be generated with the ld1 program.
Anaconda - a collection of packages for Python 3. To run the script proga.py on the cluster with the Anaconda Python, use the command: run-cluster  parametersRunCluster  "/opt/anaconda/bin/python -u proga.py"
XTLS - program for calculating absorption and emission spectra with multiplet ligand field theory (MLFT).


To use the graphical front ends of programs on the cluster, install Xming on your local computer. After installation, start XLaunch, choose to start a program, and specify the cluster IP, your username and password. In the window that opens, run one of the commands:
xcrysden - graphic editor for Wien2k
molcas MolGUI - graphic editor for MOLCAS

General set of commands for managing calculations
run-cluster parameters "command"   -   put a task in the execution queue on the cluster (run-cluster -h  - print help)
queue   -   view the task queue
scancel JobId   -   cancel the task
sinfo  -  information about free and loaded nodes
queueMem  -  view the task queue with information about the memory used
queueHist  -  view task history
scontrol show job JobId  -  view calculation details (for example, the folder in which it is running)
du -sh folderName/ - check the amount of disk space used by the directory folderName
multirun parameters "command"  - batch job launch; takes a set of parameter values and runs the command for each of them

run-cluster parameters
  -l, --mpi=                                MPI library. Values: openmpi, intelmpi, no. Default: intelmpi
  -n, --ntasks=                          Number of MPI processes. Default: 1
  -t, --threads-per-task=           Number of threads (for example, OpenMP) per MPI process. Default: 1
  -m, --mem-per-task=             Memory reserved per MPI process, in megabytes. Default: 2000
  -j, --job-name=                      Name of the task. Default: the executable command
  -e, --email=                           Email address for status change messages. Not sent by default
  -w, --node-list=                      List of nodes to run the task on. Example 1: h2      Example 2:   h[1-5,7]    Default: selected automatically
  --same-node                         Run the task on a single node, without splitting its cores across neighboring nodes
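
The node-list examples use a bracket notation for groups of nodes. As an illustration only, the following Python sketch shows how such an expression can be read. The helper expand_node_list is hypothetical and not part of run-cluster, and it assumes that the brackets hold a comma-separated list of numbers and inclusive ranges, as the h[1-5,7] example suggests.

```python
import re

def expand_node_list(expr):
    """Expand a node list such as 'h[1-5,7]' into individual node names."""
    match = re.fullmatch(r"([A-Za-z]+)\[([0-9,\-]+)\]", expr)
    if match is None:
        # Plain name such as 'h2': nothing to expand.
        return [expr]
    prefix, body = match.groups()
    nodes = []
    for part in body.split(","):
        if "-" in part:
            first, last = part.split("-")
            # Assumption: both ends of the range are included.
            nodes.extend(f"{prefix}{i}" for i in range(int(first), int(last) + 1))
        else:
            nodes.append(f"{prefix}{int(part)}")
    return nodes

print(expand_node_list("h2"))        # ['h2']
print(expand_node_list("h[1-5,7]"))  # ['h1', 'h2', 'h3', 'h4', 'h5', 'h7']
```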

Examples:
run-cluster    -n 6    -j myjob    -e gudasergey@gmail.com     fdmnes_01
run-cluster    --mpi=openmpi    -n 4    -m 10000    -w big   "proga1 -k 3 file.csv"
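
When run-cluster is called from a script, the command to execute has to reach it as a single quoted argument, as in the second example. A minimal Python sketch of assembling such a command line is given below. The function build_run_cluster and its argument names are hypothetical; only the flags listed in the parameter table are used.

```python
import shlex

def build_run_cluster(command, ntasks=1, mem_per_task=2000, job_name=None,
                      email=None, node_list=None, mpi=None, same_node=False):
    """Return a run-cluster command line as a single shell-safe string."""
    args = ["run-cluster", "-n", str(ntasks), "-m", str(mem_per_task)]
    if mpi is not None:
        args.append(f"--mpi={mpi}")
    if job_name is not None:
        args += ["-j", job_name]
    if email is not None:
        args += ["-e", email]
    if node_list is not None:
        args += ["-w", node_list]
    if same_node:
        args.append("--same-node")
    # The command to execute is passed as one argument, so it gets quoted.
    args.append(command)
    return shlex.join(args)

print(build_run_cluster("proga1 -k 3 file.csv", ntasks=4,
                        mem_per_task=10000, mpi="openmpi", node_list="big"))
```

shlex.join (Python 3.8 or later) quotes only the arguments that need it, so the command string with spaces ends up in single quotes while plain flags stay unquoted.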

FDMNES run commands
run-cluster parameters fdmnes_versionNumber
Version numbers: 00, 01, 02, 03, ...
For example, the command
run-cluster fdmnes -m 7500 -n 6 -j M034 -e yourmail@mail.ru
starts a calculation with the latest version of fdmnes on six cores, reserves 7.5 gigabytes of memory per core, names the task M034 in the queue, and sends status notifications to yourmail@mail.ru.

VASP run command
vasp   --kpar=..   --npar=..   --ncore=..   --maxmem=..   [other options] 
The total number of parallel MPI processes equals the product kpar*npar*ncore.
Parameter list.
mandatory:
  -k, --kpar=<i>            the number of k-points calculated simultaneously
  -b, --npar=<i>           the number of bands calculated simultaneously
  -c, --ncore=<i>         the number of parallel processes that work on one orbital simultaneously
  -m, --maxmem=<i>   the amount of memory reserved per MPI process, in megabytes
optional:
  -j, --job-name=<s>    The name of the task. Default: the name of the current folder
  -e, --email=<s>         Email address for status change messages. Not sent by default
  -s, --same-node        Run the task on a single node, without splitting its cores across neighboring nodes
  -w, --node-list=<s>    List of nodes to run the task on. Example 1: h2      Example 2:   h[1-5,7]     Default: selected automatically
  -a, --wait                    Wait for the task to complete (required when starting via multirun)
  -h, --help                    Print help
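
Because the number of MPI processes is a product of three options, the total memory a vasp call reserves grows quickly. The Python sketch below works through the arithmetic; the function name and the option values are hypothetical, chosen only for illustration.

```python
def vasp_resources(kpar, npar, ncore, maxmem_mb):
    """Return (MPI processes, total reserved memory in MB) for a vasp call."""
    processes = kpar * npar * ncore
    # maxmem is reserved per MPI process, so the total scales with the product.
    return processes, processes * maxmem_mb

# Hypothetical values, chosen only to illustrate the arithmetic.
processes, total_mb = vasp_resources(kpar=2, npar=2, ncore=3, maxmem_mb=4000)
print("vasp --kpar=2 --npar=2 --ncore=3 --maxmem=4000")
print(f"MPI processes: {processes}")      # 2 * 2 * 3 = 12
print(f"reserved memory: {total_mb} MB")  # 12 * 4000 = 48000
```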

vaspFull NPAR NCORE MEMORY(Mb)  -  old command for calculations with spin-orbit interaction and noncollinear magnetism

The NCORE and NPAR tags in the INCAR file are ignored and overwritten.

The user manual (in English) is available at: http://cms.mpi.univie.ac.at/vasp/vasp/vasp.html
To start multiple single-point calculations, i.e. when some characteristic of the system has to be calculated while repeatedly varying one of the parameters, use:
run-cluster -l no -m Memory -n 6 -w Node mvasp

USPEX run commands
   1. In the INPUT.txt file, specify the command vasp NPAR NCORE MEMORY(Mb) for starting VASP (substitute the required values for NPAR, NCORE and MEMORY). Set whichCluster to 1. Set the allowed number of simultaneously running VASP tasks with numParallelCalcs.
   2. Run the calculation (in the background):
        USPEX  -r -o >  output.txt  2>&1  &
The main USPEX script runs as a regular program on the main node, while VASP runs as Slurm tasks on the cluster. To terminate a program together with all the processes it started, use the rkill command: rkill process_id. The process id can be found with findUSPEX. If you started background processes, leave the terminal with the exit command rather than by closing the window with the cross button.

ADF run commands
link to ADF manual
When you save an input file, ADFGUI generates three files: *.adf, *.pid and *.run. To run from the command line, copy the file with the *.run extension to the cluster, for example N2_freq.run.
To start the calculation, execute the following commands in the directory containing N2_freq.run:
convertRun2Job N2_freq.run
run-cluster ./N2_freq.job
By default, calculations are started with the latest version of ADF. To start a calculation with an older version, change the environment variables in the terminal with the command:
. gotoADF2014    or   . gotoADF2016    or   . gotoADF2017
To return to the original environment variables, just close the terminal.

link to the GUI configuration manual for launching tasks on the cluster
*kirlom: addition to the manual. In the queue parameters, it is better to write the Run command in the following format:
run-cluster -n 6 -e your@email.com -j $jobname -m 8000 "$job"
The task name then matches the name of the calculation file, which is convenient when viewing the list of tasks on the cluster, and a notification about the start and completion of the calculation is sent by email.
*guda: and if you use run-cluster -n 6 -e your@email.com -j $jobname -m 8000 "$job" --same-node, the task will not be split between two nodes. Note that large tasks sometimes crash, in which case you need to request more RAM.

To run the new ADF2017 version, change the run command in the ADF jobs task queue:
bash -c "source gotoADF2017; run-cluster -n 2 -m 8000 -j $jobname '$job' --same-node"

Wien2k run commands
Calculations are initialized through the web interface. To start it, log in to your account with PuTTY, run the w2web command and choose a port number (ports already taken: Guda = 7890 (443), Kravtsova = 7891, Bugaev = 7892, Pankin = 7893; other users choose the next free number and add it here). After setting the port number, open the interface in a browser at http://hpc.nano.sfedu.ru:7890 (replace the number at the end with your own port).

After initialization, DO NOT START the calculation from the graphical shell. Through PuTTY, go to the calculation folder and run the usual run-cluster command without mpirun (i.e. with the -l no option). The ".machines" file is generated automatically so that lapw0 runs on the first selected node and the remaining programs run on all nodes in parallel over k-points.
Example command to run on 1 core:
run-cluster -n 1 -l no 'run_lapw -i 30 -cc 0.0001'

Clarification: -i is the maximum number of iterations of the SCF cycle, -cc is the charge convergence criterion. You can also use -ec 0.0001, the total energy convergence criterion, or both at the same time. For spin-polarized calculations, use runsp_lapw ... For calculations with spin-orbit interaction, perform a standard calculation, save the results with save_lapw nonrel, initialize the spin-orbit parameters with initso_lapw, and start the calculation with run_lapw -so.

Example command for parallel calculation on 6 cores:
run-cluster -n 6 -l no 'run_lapw -p -i 30 -cc 0.0001'
Example command for geometry optimization:
run-cluster -n 6 -l no -j name 'min -s 1 -j "run_lapw -p -I 80 -fc 1.0 -it" '

To calculate spectra, several commands must be run in sequence. To avoid waiting for each command to finish before starting the next one, create a file named wienscript in the calculation directory with the following content (example of a spin-polarized calculation with spin-orbit interaction):
#!/bin/bash
x lapw1 -up -p -c
x lapw1 -dn -p -c
x lapwso -up -p -c
x lapw2 -up -c -so -qtl -p
x lapw2 -dn -c -so -qtl -p
Run it with the command: run-cluster -n 6 -l no -j JOBNAME --same-node ./wienscript

ORCA run commands
orca nprocs memoryPerThread InputFile

MOLCAS run commands
run-cluster -n NCPU   molcas   INPUT_file

ABINIT run commands
run-cluster -n NCPU -l no 'abinit <tbase1_x.files>& log'

Quantum Espresso run commands
It is recommended to run calculations on one node:
run-cluster -n 6 -m 10000 --same-node "pw.x -input scf.inp scf.out"
run-cluster -n 6 -m 10000 --same-node "ph.x -input ph.in ph.out"
or to run on a large node:
run-cluster -n 16 -m 40000 --same-node "pw.x -input scf.inp scf.out" -j yourcalc_scf
run-cluster -n 16 -m 40000 --same-node "ph.x -input ph.in ph.out" -j yourcalc_ph

XTLS run commands
export HOST="main" - specify that we will run the calculations on main
x925 NiOsdMat - xtls run command
xc < spcana.x - XANES spectrum calculation command
xc < rspc_scav2.RIXS.x - RIXS spectrum calculation command

Batch tasks launch
The multirun command takes a given set of parameter values, substitutes them automatically into the input files and program parameters, and runs the programs. To then find the resulting data in the output files and collect the values into the table result.txt, use the multirunCollectResults command. Procedure for batch job launch:
1) Replace the changing values in the input files (or program parameters) with unique parameter names. For example: atomN_coordX, atomN_coordY.
2) Run multirun with parameters:
   --ntasks - number of tasks that can be performed simultaneously (default: 1)
   --set - set of parameter values. Examples: param = 3.5, 7, 20, 123.7,... - list of values
              param = 20:0.1:50 - range specified by the start value, step and end value
              param = 100, 200, 250:10:300, 305, 308:0.5:320, 320.1 - mixed method (ranges and individual values)
              [param1; param2] = [set1; set2] - Cartesian product of two or more sets. Example:
              [atomN_coordX; atomN_coordY] = [-5:0.1:5; 0:0.5:10,15,30]
   --cont, --no-cont - whether to continue a previous computation (default: yes)
   --depend - dependent parameters (separated by ;), which are calculated from those given in --set. Example:
                     atomN_coordZ= 20 + 3*atomN_coordX - 0.5*atomN_coordY
Example:
multirun --set="[paramX; paramY]=[8:1:14; 8:1:14]"  --no-cont vasp-wait 1 1 4000 &
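
To see how many jobs a given --set expression produces, the value syntax can be expanded by hand. The Python sketch below is not the multirun implementation. The function names are hypothetical, and it assumes that a range start:step:end includes the end value and that the step is positive. Decimal is used so that steps such as 0.1 do not accumulate rounding errors.

```python
from decimal import Decimal
from itertools import product

def expand_values(spec):
    """Expand '100, 200, 250:10:300' into a flat list of Decimal values."""
    values = []
    for item in spec.split(","):
        item = item.strip()
        if ":" in item:
            start, step, end = (Decimal(x) for x in item.split(":"))
            if step <= 0:
                raise ValueError("step must be positive in this sketch")
            current = start
            # Assumption: the end value itself belongs to the range.
            while current <= end:
                values.append(current)
                current += step
        else:
            values.append(Decimal(item))
    return values

def expand_product(specs):
    """Cartesian product of several value specifications."""
    return list(product(*(expand_values(s) for s in specs)))

print([str(v) for v in expand_values("100, 200, 250:10:300, 305")])
grid = expand_product(["8:1:14", "8:1:14"])
print(len(grid))  # 7 * 7 = 49 parameter combinations
```

Under these assumptions the multirun example with [8:1:14; 8:1:14] corresponds to 49 separate runs.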
3) Compose a regular expression that finds the desired result value in the output file. The regular expression must capture the result value as its only group, group number 1. The best way to compose and test a regular expression is to use Rubular. Regular expression for matching numbers: [+-]?(?:\d*\.|\d+)\d*(?:[eE][+-]?\d+)?
Example: to find a number in the text when the word "entropy" stands to its left, separated by a few spaces, and "(sigma->0)" follows immediately on its right, use the regular expression:
entropy\s+([+-]?(?:\d*\.|\d+)\d*(?:[eE][+-]?\d+)?)\(sigma
The parentheses around the number regex [+-]?(?:\d*\.|\d+)\d*(?:[eE][+-]?\d+)? mean that the number is returned as a group.
Feel free to contact the administrator for help with composing regular expressions (in your message, include an example output file and the number that has to be found in it automatically).
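
A regular expression of this kind can also be tested locally with Python's re module before it is passed to the collector. In the sketch below the sample line is made up for illustration and is not taken from a real output file. The pattern follows the description in step 3: the word "entropy", whitespace, the number as group 1, then "(sigma".

```python
import re

NUMBER = r"[+-]?(?:\d*\.|\d+)\d*(?:[eE][+-]?\d+)?"
# The parentheses around NUMBER make it capture group 1.
pattern = re.compile(r"entropy\s+(" + NUMBER + r")\(sigma")

# Made-up sample line, not taken from a real output file.
sample = "step 12: entropy   -1.25e-3(sigma->0) done"

match = pattern.search(sample)
if match:
    print(match.group(1))  # -1.25e-3
else:
    print("no match")
```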
4) Call the result collector multirunCollectResults with the parameters:
   --outfile - name of the program output file to search (by default, the output printed to the screen)
   --regexp - regular expression used to search the output file
   --multiline - whether the dot (any character) in the regular expression also matches a line feed (default: no)
Example:
multirunCollectResults --outfile=OUTCAR --regexp="entropy=\s*([+-]?(?:\d*\.|\d+)\d*(?:[eE][+-]?\d+)?)\(sigma"
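
The effect of an option like --multiline can be tried out in Python, where the corresponding switch is re.DOTALL. This is a sketch under the assumption that multirunCollectResults uses comparable regular expression semantics; the sample text is made up.

```python
import re

text = "begin\nvalue 42\nend"

# Without DOTALL the dot stops at a line break, so nothing matches.
single_line = re.search(r"begin.*?(\d+)", text)
# With DOTALL the dot also matches the newline.
multi_line = re.search(r"begin.*?(\d+)", text, flags=re.DOTALL)

print(single_line)          # None
print(multi_line.group(1))  # 42
```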

The multirunCollectResults command can be used even before multirun has finished. This is helpful, for example, for testing a regular expression. When running under multirun, use run-cluster-and-wait instead of run-cluster, and vasp-wait instead of vasp. multirun writes its log to the file log_multirun.txt. If it is going to run for a long time, start it in the background (with the & symbol at the very end of all arguments) and then end the cluster session with the exit command rather than by closing the terminal window with the cross button. A running task can be killed with the rkill command:  rkill   id_calculation. The process id can be found with ps -x.

To run Jupyter Notebook with PyFitIt, connect to the cluster over SSH and run:
/opt/anaconda/bin/jupyter notebook --ip='*'
Before doing so, copy the PyFitIt project to your directory.