Issue while compiling VASP(vasp.6.4.2) with Nvidia GPU

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am


#1 Post by bhargabkakati » Sun Mar 24, 2024 1:10 pm

Dear experts, I am encountering an issue while compiling VASP (vasp.6.4.2) with Nvidia GPU. For your reference, I have provided the specifications below:

OS: Ubuntu 22.04

CPU: 36 Core

GPU: Nvidia RTX A6000

CUDA Version: 12.2

$ nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver

Copyright (c) 2005-2023 NVIDIA Corporation

Built on Tue_Jun_13_19:16:58_PDT_2023

Cuda compilation tools, release 12.2, V12.2.91

Build cuda_12.2.r12.2/compiler.32965470_0


Additionally, I have attached the makefile.include and the compilation error messages for further analysis. Your guidance on resolving this issue would be highly appreciated.

Thank You.

andreas.singraber
Global Moderator
Posts: 249
Joined: Mon Apr 26, 2021 7:40 am


#2 Post by andreas.singraber » Sun Mar 24, 2024 11:06 pm

Hello!

It seems your mpif90 command is not a wrapper around the NVIDIA nvfortran compiler executable as expected, but rather one around the gfortran compiler (maybe the default system compiler). Of course, gfortran cannot understand the NVIDIA compiler flags and throws an error. You can check this by running

Code:

mpif90 -show

This will show you the underlying compiler and the flags and libraries used by the mpif90 wrapper. I expect that it will show gfortran in your case. However, the correct wrapper must use the nvfortran compiler. Perhaps the "wrong" mpif90 comes first in your PATH variable and hence takes precedence over the correct one. I suggest searching for the right mpif90 wrapper. For example, in my installation of NVHPC 23.11 the mpif90 wrapper is located here:

Code:

/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/comm_libs/mpi/bin/mpif90

Of course, the actual location depends on your installation. If you can find the right wrapper, just use it with its full path in the makefile.include file, e.g.,

Code:

...
FC          = /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/comm_libs/mpi/bin/mpif90 -acc -gpu=cc60,cc70,cc80,cuda12.2
FCL         = /opt/nvidia/hpc_sdk/Linux_x86_64/23.11/comm_libs/mpi/bin/mpif90 -acc -gpu=cc60,cc70,cc80,cuda12.2 -c++libs
...

Alternatively, ask your system administrators to fix the environment to use the correct wrapper.
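
The wrapper check described above can also be scripted. The following is only a minimal sketch and not part of the original reply: the helper names list_mpif90 and wrapped_compiler are hypothetical, and the sketch assumes that the wrapper prints the underlying compiler as the first word of its -show output.

```shell
#!/bin/sh
# Hypothetical helper: print every executable mpif90 found on PATH,
# in lookup order (the first one listed is the one the shell will run).
list_mpif90() {
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -x "$dir/mpif90" ]; then
            echo "$dir/mpif90"
        fi
    done
    IFS=$old_ifs
}

# Hypothetical helper: given the text printed by "mpif90 -show" as one
# argument, print the basename of the compiler named first in it.
# Assumption: the compiler is the first space-separated word.
wrapped_compiler() {
    first_word=${1%% *}
    basename "$first_word"
}

# Only query the wrapper if one is installed at all.
if command -v mpif90 >/dev/null 2>&1; then
    list_mpif90
    echo "first mpif90 wraps: $(wrapped_compiler "$(mpif90 -show)")"
fi
```

If the last line reports gfortran instead of nvfortran, one of the other paths printed by list_mpif90 may be the wrapper to put into makefile.include.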

Hope this helps, all the best,

Andreas Singraber

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am


#3 Post by bhargabkakati » Mon Mar 25, 2024 1:33 pm

Dear Andreas Singraber,
Thank you for the suggestion. Your solution resolved the issue. Now I am encountering a new issue, as shown below. I can compile VASP successfully without the wannier90 interface, but when I try to add the wannier90 interface the following error occurs. Any help would be greatly appreciated. For your convenience, I have attached the log file and the makefile.include. Thank you.

/usr/bin/ld: cannot find -lwannier: No such file or directory
pgacclnk: child process exit status 1: /usr/bin/ld
make[2]: *** [makefile:132: vasp] Error 2
make[2]: Leaving directory '/home/cms-gpu/softwares/vasp.6.4.2/build/gam'
cp: cannot stat 'vasp': No such file or directory
make[1]: *** [makefile:129: all] Error 1
make[1]: Leaving directory '/home/cms-gpu/softwares/vasp.6.4.2/build/gam'
make: *** [makefile:17: gam] Error 2
make: *** Waiting for unfinished jobs....
/usr/bin/ld: cannot find -lwannier: No such file or directory
pgacclnk: child process exit status 1: /usr/bin/ld
make[2]: *** [makefile:132: vasp] Error 2
make[2]: Leaving directory '/home/cms-gpu/softwares/vasp.6.4.2/build/ncl'
cp: cannot stat 'vasp': No such file or directory
make[1]: *** [makefile:129: all] Error 1
make[1]: Leaving directory '/home/cms-gpu/softwares/vasp.6.4.2/build/ncl'
make: *** [makefile:17: ncl] Error 2
/usr/bin/ld: cannot find -lwannier: No such file or directory
pgacclnk: child process exit status 1: /usr/bin/ld
make[2]: *** [makefile:132: vasp] Error 2
make[2]: Leaving directory '/home/cms-gpu/softwares/vasp.6.4.2/build/std'
cp: cannot stat 'vasp': No such file or directory
make[1]: *** [makefile:129: all] Error 1
make[1]: Leaving directory '/home/cms-gpu/softwares/vasp.6.4.2/build/std'
make: *** [makefile:17: std] Error 2

andreas.singraber
Global Moderator
Posts: 249
Joined: Mon Apr 26, 2021 7:40 am


#4 Post by andreas.singraber » Mon Mar 25, 2024 1:45 pm

Hello!

The linker cannot find the file libwannier.a. In your makefile.include you have the following lines:

Code:

WANNIER90_ROOT ?= /home/cms-gpu/softwares/wannier90-3.1.0-serial
LLIBS          += -L$(WANNIER90_ROOT)/lib -lwannier

Hence, the linker will search in /home/cms-gpu/softwares/wannier90-3.1.0-serial/lib for this file (note the lib at the end). Is the file actually present in this directory? I noticed that in a recent build it was not in the lib directory but rather in the Wannier90 base folder. If that is the case for you as well, then you need to remove the /lib in the second line above:

Code:

LLIBS          += -L$(WANNIER90_ROOT) -lwannier
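
To see where libwannier.a actually ended up, a search under the Wannier90 folder is usually enough. This is a minimal sketch and not part of the original reply: the helper name find_libwannier_dir is hypothetical, and the path is simply the WANNIER90_ROOT value quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print the directory below the given root that
# contains libwannier.a (first match only); fail if no such file exists.
find_libwannier_dir() {
    lib=$(find "$1" -name libwannier.a -type f 2>/dev/null | head -n 1)
    [ -n "$lib" ] || return 1
    dirname "$lib"
}

# Example call with the WANNIER90_ROOT value from makefile.include.
root=/home/cms-gpu/softwares/wannier90-3.1.0-serial
if dir=$(find_libwannier_dir "$root"); then
    echo "use: LLIBS += -L$dir -lwannier"
else
    echo "libwannier.a not found under $root (was 'make lib' run?)"
fi
```

The directory printed is the one that has to follow -L in the LLIBS line.
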
Hope this helps, all the best,
Andreas Singraber

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am


#5 Post by bhargabkakati » Mon Mar 25, 2024 7:59 pm

Dear sir,
Thank you very much. I seem to have overlooked the directory of libwannier.a. After putting in the correct directory (removing "/lib") I got a long string of new errors, as shown below. I can't figure out what the issue is (it might be something minor). Any help would be much appreciated. Thank you.

/usr/bin/ld: ws_distance.F90:(.text+0xa15): undefined reference to `_gfortran_st_write_done'
/usr/bin/ld: ws_distance.F90:(.text+0xa64): undefined reference to `_gfortran_st_write'
/usr/bin/ld: ws_distance.F90:(.text+0xa74): undefined reference to `_gfortran_transfer_integer_write'
/usr/bin/ld: ws_distance.F90:(.text+0xa84): undefined reference to `_gfortran_transfer_integer_write'
/usr/bin/ld: ws_distance.F90:(.text+0xa94): undefined reference to `_gfortran_transfer_integer_write'
/usr/bin/ld: ws_distance.F90:(.text+0xa9c): undefined reference to `_gfortran_st_write_done'
/usr/bin/ld: ws_distance.F90:(.text+0xb48): undefined reference to `_gfortran_st_close'
/usr/bin/ld: ws_distance.F90:(.text+0xb72): undefined reference to `_gfortran_string_trim'
/usr/bin/ld: ws_distance.F90:(.text+0xbb4): undefined reference to `_gfortran_concat_string'
/usr/bin/ld: ws_distance.F90:(.text+0xbf8): undefined reference to `_gfortran_concat_string'
/usr/bin/ld: /home/cms-gpu/softwares/wannier90-3.1.0/libwannier.a(ws_distance.o): in function `__w90_ws_distance_MOD_ws_translate_dist':
ws_distance.F90:(.text+0x172f): undefined reference to `_gfortran_internal_pack'
pgacclnk: child process exit status 1: /usr/bin/ld
make[2]: *** [makefile:132: vasp] Error 2
make[2]: Leaving directory '/home/cms-gpu/softwares/vasp.6.4.2/build/std'
cp: cannot stat 'vasp': No such file or directory
make[1]: *** [makefile:129: all] Error 1
make[1]: Leaving directory '/home/cms-gpu/softwares/vasp.6.4.2/build/std'
make: *** [makefile:17: std] Error 2

andreas.singraber
Global Moderator
Posts: 249
Joined: Mon Apr 26, 2021 7:40 am


#6 Post by andreas.singraber » Tue Mar 26, 2024 9:03 am

Hello!

I think the problem is that you are trying to link your NVIDIA-compiled VASP to the GNU-compiled Wannier90 library. I do not know whether this is possible, i.e., if there is ABI-compatibility between the NVIDIA and GNU compilers. If yes, then maybe this will work:

Code:

LLIBS          += -L$(WANNIER90_ROOT) -lwannier -lgfortran

However, I would not recommend this. Instead, please try to compile the Wannier90 library with the NVIDIA compiler as well. It should not be difficult; essentially, just set

Code:

F90 = nvfortran

in the make.inc file in the Wannier90 directory. In the same directory, rebuild the Wannier90 library with the commands make clean followed by make lib. Then go back to the VASP directory and try to link again.
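
Whether a given libwannier.a was built with gfortran can be checked from its symbol table: references to _gfortran_* symbols, like those in the linker errors quoted earlier in this thread, point to the GNU Fortran runtime. This is a minimal sketch and not part of the original reply: the helper name needs_gfortran_runtime is hypothetical, and the path is the one that appears in the linker output.

```shell
#!/bin/sh
# Hypothetical helper: read a symbol listing (e.g. "nm" output) on stdin
# and succeed if any symbol refers to the GNU Fortran runtime.
needs_gfortran_runtime() {
    grep -q '_gfortran_'
}

# Example call; the path is taken from the linker error messages.
lib=/home/cms-gpu/softwares/wannier90-3.1.0/libwannier.a
if [ -f "$lib" ] && command -v nm >/dev/null 2>&1; then
    if nm "$lib" 2>/dev/null | needs_gfortran_runtime; then
        echo "$lib references the gfortran runtime; rebuild it with nvfortran"
    else
        echo "$lib has no gfortran runtime references"
    fi
fi
```

After make clean and make lib with F90 = nvfortran, the same check should report no gfortran runtime references.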

Best,
Andreas Singraber

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am


#7 Post by bhargabkakati » Tue Mar 26, 2024 10:33 am

Dear Andreas Singraber,
Your solution worked. I have now successfully compiled VASP with the wannier90 interface. Thank you very much for your help.
