GW parallel strategies
In this tutorial we will see how to set up the variables governing the parallel execution of yambo, in order to perform calculations that are efficient in terms of both CPU time and memory to solution. As a test case we will consider the 2D material hBN. Because of its reduced dimensionality, GW calculations turn out to be very delicate: besides the usual convergence studies with respect to k-points and sums over bands, low-dimensional systems require a sizable amount of vacuum in order to be treated as isolated, which translates into a large number of plane waves. As for the other tutorials, it is important to stress that this tutorial is meant to illustrate the functionality of the key variables and to run in a reasonable time, so it does not aim at the accuracy needed to reproduce experimental results. Please also note that the scaling performance illustrated below may depend significantly on the underlying parallel architecture. Nevertheless, some general considerations are tentatively drawn when discussing the results.
If you are now inside bellatrix,

$ pwd
/scratch/cecam.schoolXY/yambo_YOUR_NAME

you need to obtain the appropriate tarball:

$ cp /scratch/cecam.school/yambo_parallel/hBN-2D.tar.gz .      (notice that this time there is no XY!)
$ tar -zxvf hBN-2D.tar.gz
$ ls
YAMBO_TUTORIALS
$ cd YAMBO_TUTORIALS/hBN-2D/YAMBO
To run a calculation on bellatrix you need to go through the queue system, as explained in the Tutorials home. In the YAMBO folder, together with the input file, you will see the job.sh script:
$ ls
First run the initialization as usual.
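If the initialization has not been run yet, it can be done, as in the previous tutorials, by invoking yambo without any input file in the folder containing the SAVE database. This is only a sketch, and it assumes the yambo executable is available in your environment (for instance after loading the same modules used in job.sh below) and that a light serial run is allowed on the node you are logged into:

$ yambo

This step generates the r_setup report and the setup databases needed by the GW run.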
Second, have a look at both the input file and the submission script. The input file reads:

$ cat yambo_gw.in
#
#
# Y88b    /   e           e    e      888~~\    ,88~-_
#  Y88b  /   d8b         d8b  d8b     888   |  d888   \
#   Y88b/   /Y88b       d888bdY88b    888 _/  88888    |
#    Y8Y   /  Y88b     / Y88Y Y888b   888  \  88888    |
#     Y   /____Y88b   /   YY   Y888b  888   |  Y888   /
#    /   /      Y88b /          Y888b 888__/    `88_-~
#
#
# GPL Version 4.1.2 Revision 120
# MPI+OpenMP Build
# http://www.yambo-code.org
#
ppa                              # [R Xp] Plasmon Pole Approximation
gw0                              # [R GW] GoWo Quasiparticle energy levels
HF_and_locXC                     # [R XX] Hartree-Fock Self-energy and Vxc
em1d                             # [R Xd] Dynamical Inverse Dielectric Matrix
NLogCPUs=0                       # [PARALLEL] Live-timing CPU`s (0 for all)
X_all_q_CPU= ""                  # [PARALLEL] CPUs for each role
X_all_q_ROLEs= ""                # [PARALLEL] CPUs roles (q,k,c,v)
X_all_q_nCPU_LinAlg_INV= 1       # [PARALLEL] CPUs for Linear Algebra
X_Threads= 0                     # [OPENMP/X] Number of threads for response functions
DIP_Threads= 0                   # [OPENMP/X] Number of threads for dipoles
SE_CPU= ""                       # [PARALLEL] CPUs for each role
SE_ROLEs= ""                     # [PARALLEL] CPUs roles (q,qp,b)
SE_Threads= 0                    # [OPENMP/GW] Number of threads for self-energy
EXXRLvcs= 21817      RL          # [XX] Exchange RL components
Chimod= ""                       # [X] IP/Hartree/ALDA/LRC/BSfxc
% BndsRnXp
   1 | 200 |                     # [Xp] Polarization function bands
%
NGsBlkXp= 8          Ry          # [Xp] Response block size
% LongDrXp
 1.000000 | 0.000000 | 0.000000 |        # [Xp] [cc] Electric Field
%
PPAPntXp= 27.21138   eV          # [Xp] PPA imaginary energy
% GbndRnge
   1 | 200 |                     # [GW] G[W] bands range
%
GDamping= 0.10000    eV          # [GW] G[W] damping
dScStep= 0.10000     eV          # [GW] Energy step to evaluate Z factors
DysSolver= "n"                   # [GW] Dyson Equation solver ("n","s","g")
%QPkrange                        # [GW] QP generalized Kpoint/Band indices
  1| 1| 3|6|
%
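Note that the parallelization variables (X_all_q_CPU, X_all_q_ROLEs, SE_CPU, SE_ROLEs and the *_Threads counters) are left empty or set to zero in this template: they will be filled in by the job script. As a rough guide, each entry of X_all_q_CPU assigns a number of MPI tasks to the corresponding role in X_all_q_ROLEs (q: transferred momenta, k: k-points, c: conduction bands, v: valence bands), and the product of the entries should match the total number of MPI tasks; the same holds for the self-energy roles (q: transferred momenta, qp: quasi-particle states, b: bands). Purely as an illustration (these values are not part of the tutorial files), a 16-task run could be distributed as:

X_all_q_CPU= "1 2 4 2"           # 1*2*4*2 = 16 MPI tasks
X_all_q_ROLEs= "q k c v"
SE_CPU= "1 2 8"                  # 1*2*8 = 16 MPI tasks
SE_ROLEs= "q qp b"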
The submission script fills in these parallel variables by appending them to a copy of the input file, and then launches yambo through srun:
$ cat job.sh
#!/bin/bash
#
#SBATCH -N 1
#SBATCH -t 06:00:00
#SBATCH -J test
#SBATCH --reservation=cecam_course
#SBATCH --tasks-per-node=16
#
module purge
module load intel/16.0.3
module load intelmpi/5.1.3
#
export OMP_NUM_THREADS=1
#
jobname="First_GW_run"
jdir=${jobname}
cdir=${jobname}_out
#
filein0=yambo_gw.in
filein=yambo_gw_${label}.in
#
cp -f $filein0 $filein
cat >> $filein << EOF
X_all_q_CPU= "1 1 $ncpu 1"       # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v"         # [PARALLEL] CPUs roles (q,k,c,v)
X_all_q_nCPU_LinAlg_INV= $ncpu   # [PARALLEL] CPUs for Linear Algebra
X_Threads= 0                     # [OPENMP/X] Number of threads for response functions
DIP_Threads= 0                   # [OPENMP/X] Number of threads for dipoles
SE_CPU= " 1 1 $ncpu"             # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b"               # [PARALLEL] CPUs roles (q,qp,b)
SE_Threads= 0                    # [OPENMP/GW] Number of threads for self-energy
EOF
#
echo "Running on $ncpu MPI tasks, $nthreads OpenMP threads"
srun -n $ncpu -c $nthreads $bindir/yambo -F $filein -J $jdir -C $cdir
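Note that job.sh, as listed here, uses a few shell variables ($ncpu, $nthreads, $label, $bindir) that are not defined in the excerpt above. Before submitting, you need to set them near the top of the script; the following is only a sketch, with placeholder values chosen as an assumption to match the #SBATCH settings (16 tasks per node) and OMP_NUM_THREADS=1:

ncpu=16                           # number of MPI tasks, consistent with --tasks-per-node
nthreads=1                        # OpenMP threads per task, consistent with OMP_NUM_THREADS
label=${ncpu}MPI_${nthreads}OMP   # tag used to name the generated input file
bindir=/path/to/yambo/bin         # location of the yambo executables: set this to your installation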
As soon as you are ready, submit the job:
$ sbatch job.sh
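Once the job is submitted, you can monitor it with the standard SLURM tools, for example:

$ squeue -u $USER

When the run completes, the report and log files produced by yambo will appear in the First_GW_run_out directory selected through the -C option of the srun command above.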