Parallelization for non-linear response calculations

From The Yambo Project
Revision as of 11:47, 23 March 2022

By default yambo_nl is parallelized over frequencies, which is the most efficient way to distribute the calculation among the processors. Two other parallelizations are available in the code:

K-points parallelization
If your system is large and requires more memory, or you have only a few frequencies, you can change the parallelization strategy. Running with the flag "-V par" adds the parallelization options to your input file. You can then turn on the parallelization over k-points, choosing the values so that the product of the cores in k-space and in frequency space equals the total number of cores. For example, if you have 16 cores you can set:

NL_CPU= "4 4"                   # [PARALLEL] CPUs for each role
NL_ROLEs= "w k"                 # [PARALLEL] CPUs roles (w,k)
DIP_CPU= "4 2 2"                      # [PARALLEL] CPUs for each role
DIP_ROLEs= "k c v"                    # [PARALLEL] CPUs roles (k,c,v)
                           

In this way the code distributes the wave-functions over 4 cores, which reduces the amount of memory required. If this is not enough, you can use the OpenMP parallelization, see below.
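The rule that the product of the cores assigned to each role must equal the total number of cores can be checked before submitting a job. The following is a minimal sketch in Python, not part of yambo: the helper names `parse_cpu_string` and `check_layout` are hypothetical, and it assumes the values are given exactly as the space-separated strings shown in the input above.

```python
from math import prod


def parse_cpu_string(value: str) -> list[int]:
    """Turn a string such as "4 2 2" into [4, 2, 2]."""
    return [int(token) for token in value.split()]


def check_layout(cpu: str, roles: str, total_mpi_tasks: int) -> dict[str, int]:
    """Check a *_CPU / *_ROLEs pair against the total number of MPI tasks.

    Hypothetical helper: it only verifies the arithmetic described in the
    text (one count per role, product equal to the total).
    """
    counts = parse_cpu_string(cpu)
    names = roles.split()
    if len(counts) != len(names):
        raise ValueError(f"{len(counts)} CPU counts given for {len(names)} roles")
    if prod(counts) != total_mpi_tasks:
        raise ValueError(
            f"product of {counts} is {prod(counts)}, expected {total_mpi_tasks}"
        )
    return dict(zip(names, counts))


# Values taken from the 16-core example above.
nl_layout = check_layout("4 4", "w k", 16)
dip_layout = check_layout("4 2 2", "k c v", 16)
print(nl_layout)
print(dip_layout)
```

With the values from the example, both checks pass: 4 x 4 = 16 for the non-linear part and 4 x 2 x 2 = 16 for the dipoles.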

Open-MP
Another possibility is to compile the code with the --enable-open-mp flag and then use the OpenMP parallelization. For example, set the number of threads to 2 with the command:

export OMP_NUM_THREADS="2"
                         

and yambo_nl will automatically use the available threads. In the log file you will find:

.....
<---> P1: MPI Cores-Threads   : 16(CPU)-2(threads)
<---> P1: MPI Cores-Threads   : NL(environment)-4 4(CPUs)-w k(ROLEs)
.....

Combining all these parallelizations lets you use a large number of cores. For example, imagine you want to calculate the response for 40 different frequencies; you could set:

"4 cores Open-MP" x "10 cores k-points" x "40 cores frequencies" = 1600 cores.

We advise against using more than 4 OpenMP threads, unless you have memory problems in your calculations.
Note that the restart of interrupted calculations works only on frequencies.
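The core count in the 40-frequency example can be reproduced with a short calculation. This is only an illustrative sketch: the function names are hypothetical, and it assumes that each OpenMP thread occupies one core and that the k-point and frequency counts are MPI tasks.

```python
def mpi_tasks(k_tasks: int, w_tasks: int) -> int:
    """Number of MPI tasks: k-point tasks times frequency tasks."""
    return k_tasks * w_tasks


def total_cores(omp_threads: int, k_tasks: int, w_tasks: int) -> int:
    """Total cores, assuming one core per OpenMP thread of every MPI task."""
    return omp_threads * mpi_tasks(k_tasks, w_tasks)


# Values from the example: 4 OpenMP threads, 10 on k-points, 40 on frequencies.
print(mpi_tasks(10, 40))       # 400 MPI tasks
print(total_cores(4, 10, 40))  # 1600 cores
```

Under these assumptions the job runs 400 MPI tasks with 4 threads each, which gives the 1600 cores quoted above.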