Error when compiling with mpiCC and nvidia

Posted: Tue Nov 26, 2024 4:58 pm
by jchapmanbu

Hello, I am trying to install VASP 6.4.3 on my local machine using the makefile.include.nvhpc_ompi_mkl_omp_acc as the base makefile. I have included my current installation files as an attachment here. I have installed all the required libraries and confirmed that each component of those libraries is working correctly. However, when I run make all I get the errors below. This is beyond my understanding of compilation since it seems like the version of mpiCC within the nvidia hpc_sdk thinks that the code in timing_.c is wrong, which I'm sure isn't the case. This would imply that my version of mpiCC isn't compatible with this VASP installation, but I have no idea why. Any help you can provide is greatly appreciated.

When running /opt/nvidia/hpc_sdk/Linux_x86_64/24.11/comm_libs/mpi/bin/mpiCC -v, I get:

Export NVCOMPILER=/opt/nvidia/hpc_sdk/Linux_x86_64/24.11
Export PGI=/opt/nvidia/hpc_sdk
nvc++-Warning-No files to process

and
/opt/nvidia/hpc_sdk/Linux_x86_64/24.11/comm_libs/mpi/bin/mpif90 -v, gives:

Export NVCOMPILER=/opt/nvidia/hpc_sdk/Linux_x86_64/24.11
Export PGI=/opt/nvidia/hpc_sdk
nvfortran-Warning-No files to process

*********************************************************************************************************************************************
Error when running make all:

"timing_.c", line 10: error: incomplete type "void" is not allowed
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "mode" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "utime" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "stime" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "now" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "minpgf" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "majpgf" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "maxrsize" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "avsize" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "swaps" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "ios" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "cswitch" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 10: error: identifier "ierr" is undefined
void timing_C(mode,utime,stime,now,minpgf,majpgf,maxrsize,avsize,swaps,ios,cswitch,ierr)
^

"timing_.c", line 12: error: expected a ";"
double *utime,*stime,*now,*maxrsize,*avsize;
^

"timing_.c", line 15: error: expected a declaration
{
^

"timing_.c", line 36: error: this declaration has no storage class or type specifier
totaltime = *utime + *stime;
^

"timing_.c", line 38: error: this declaration has no storage class or type specifier
*minpgf = (int) rudata.ru_minflt;
^

"timing_.c", line 38: error: variable "minpgf" has already been defined (previous definition at line 13)
*minpgf = (int) rudata.ru_minflt;
^

"timing_.c", line 38: error: identifier "rudata" is undefined
*minpgf = (int) rudata.ru_minflt;
^

"timing_.c", line 39: error: this declaration has no storage class or type specifier
*majpgf = (int) rudata.ru_majflt;
^

"timing_.c", line 39: error: variable "majpgf" has already been defined (previous definition at line 13)
*majpgf = (int) rudata.ru_majflt;
^

"timing_.c", line 41: error: this declaration has no storage class or type specifier
intsize = ((double) rudata.ru_ixrss) + ((double) rudata.ru_idrss) + ((double) rudata.ru_isrss);
^

"timing_.c", line 43: error: this declaration has no storage class or type specifier
*maxrsize = (double) rudata.ru_maxrss;
^

"timing_.c", line 44: error: this declaration has no storage class or type specifier
*avsize = intsize / totaltime / 100 ;
^

"timing_.c", line 44: error: a value of type "int" cannot be used to initialize an entity of type "int *"
*avsize = intsize / totaltime / 100 ;
^

"timing_.c", line 46: error: this declaration has no storage class or type specifier
*swaps = (int) rudata.ru_nswap;
^

"timing_.c", line 46: error: variable "swaps" has already been defined (previous definition at line 13)
*swaps = (int) rudata.ru_nswap;
^

"timing_.c", line 47: error: this declaration has no storage class or type specifier
*ios = ((int) rudata.ru_inblock) + ((int) rudata.ru_oublock);
^

"timing_.c", line 47: error: variable "ios" has already been defined (previous definition at line 13)
*ios = ((int) rudata.ru_inblock) + ((int) rudata.ru_oublock);
^

"timing_.c", line 48: error: this declaration has no storage class or type specifier
*cswitch = (int) rudata.ru_nvcsw;
^

"timing_.c", line 48: error: variable "cswitch" has already been defined (previous definition at line 13)
*cswitch = (int) rudata.ru_nvcsw;
^

"timing_.c", line 49: error: expected a declaration
}
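
A note on the messages above: lines 10 and 12 of timing_.c, as quoted in the log, form an old-style (K&R) C function definition, where the parameter types are declared after the parameter list. The mpiCC -v output earlier in this post shows that wrapper driving nvc++, a C++ compiler, and C++ does not accept that form. One possible reading, which is an assumption and not something confirmed in this thread, is that the file was parsed by a front end that rejects old-style definitions. The sketch below uses a made-up function add_C and the generic cc and c++ commands (not the VASP sources or the NVIDIA wrappers) and simply reports what each compiler does with such a definition.

```shell
#!/bin/sh
# Hypothetical reproduction; the C snippet is invented for illustration.
workdir=$(mktemp -d)

cat > "$workdir/knr.c" <<'EOF'
/* Old-style (K&R) definition: parameter types follow the parameter list. */
void add_C(a, b, sum)
int *a, *b, *sum;
{
    *sum = *a + *b;
}
EOF

# Run one compiler on the file and report whether it accepted the source.
# Usage: try_compile LABEL COMPILER [FLAGS...]
try_compile() {
    label=$1; shift
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$label: $1 not found, skipped"
    elif "$@" -c "$workdir/knr.c" -o "$workdir/knr.o" >/dev/null 2>&1; then
        echo "$label: accepted"
    else
        echo "$label: rejected"
    fi
}

try_compile "compiled as C99" cc -std=c99
try_compile "compiled as C++" c++ -x c++
```

A C compiler in C99 mode typically accepts the file, while a C++ compiler rejects it. If the same pattern shows up with your toolchain, it is worth checking which compiler the makefile.include assigns to the C sources.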


Re: Error when compiling with mpiCC and nvidia

Posted: Wed Dec 11, 2024 1:52 pm
by pedro_melo

Dear jchapmanbu,

I am afraid that the NVIDIA HPC-SDK you are trying to use is not yet supported by VASP. You can check here for the different compilers and respective versions that we have tested. Could you try compiling VASP with a slightly older version of NVIDIA HPC-SDK?

Kind regards,
Pedro


Re: Error when compiling with mpiCC and nvidia

Posted: Tue Jan 07, 2025 5:33 pm
by jchapmanbu

Hello,
I have updated to the correct version of the NVIDIA HPC-SDK and that solves the compilation issue. However, I'm now getting an error during the linking step. It looks like my MKL libraries can't be found (errors below), but I'm not sure why, because MKLROOT is set inside the makefile.include (files attached). I have never experienced this problem when compiling on Linux, so even though I'm using an Ubuntu environment on Windows, I'm wondering if this is a Windows-specific issue? Any help you can provide is appreciated.

Errors:

/usr/bin/ld: cannot find -lmkl_scalapack_ilp64_dll.lib: No such file or directory
/usr/bin/ld: cannot find -lmkl_blacs_intelmpi_ilp64_dll.lib: No such file or directory
/usr/bin/ld: cannot find -lmkl_intel_lp64: No such file or directory
/usr/bin/ld: cannot find -lmkl_intel_thread: No such file or directory
/usr/bin/ld: cannot find -lmkl_core: No such file or directory

My mkl libraries are installed at: /mnt/c/Users/jc112358/software/Intel/oneAPI/mkl/2025.0

And these are the files there:

a.out mkl_cdft_core_dll.lib mkl_rt.lib mkl_sycl_dft_dll.lib mkl_sycl_vmd_dll.lib
cmake mkl_core.lib mkl_scalapack_ilp64.lib mkl_sycl_dftd_dll.lib mkl_sycld.lib
mkl_blacs_ilp64_dll.lib mkl_core_dll.lib mkl_scalapack_ilp64_dll.lib mkl_sycl_dll.lib mkl_sycld_dll.lib
mkl_blacs_intelmpi_ilp64.lib mkl_intel_ilp64.lib mkl_scalapack_lp64.lib mkl_sycl_lapack_dll.lib mkl_tbb_thread.lib
mkl_blacs_intelmpi_lp64.lib_cp mkl_intel_ilp64_dll.lib mkl_scalapack_lp64_dll.lib mkl_sycl_lapackd_dll.lib mkl_tbb_thread_dll.lib
mkl_blacs_intelmpi_lp64.so mkl_intel_lp64.lib_cp mkl_sequential.lib mkl_sycl_rng_dll.lib mkl_tbb_threadd.lib
mkl_blacs_lp64_dll.lib mkl_intel_lp64.so mkl_sequential_dll.lib mkl_sycl_rngd_dll.lib mkl_tbb_threadd_dll.lib
mkl_blacs_msmpi_ilp64.lib mkl_intel_lp64_dll.lib mkl_sycl.lib mkl_sycl_sparse_dll.lib pkgconfig
mkl_blacs_msmpi_lp64.lib mkl_intel_thread.lib mkl_sycl_blas_dll.lib mkl_sycl_sparsed_dll.lib
mkl_blas95_ilp64.lib mkl_intel_thread_dll.lib mkl_sycl_blasd_dll.lib mkl_sycl_stats_dll.lib
mkl_blas95_lp64.lib mkl_lapack95_ilp64.lib mkl_sycl_data_fitting_dll.lib mkl_sycl_statsd_dll.lib
mkl_cdft_core.lib mkl_lapack95_lp64.lib mkl_sycl_data_fittingd_dll.lib mkl_sycl_vm_dll.lib


Re: Error when compiling with mpiCC and nvidia

Posted: Wed Jan 08, 2025 11:30 am
by pedro_melo

Dear jchapmanbu,

I noticed some things in your makefile.include:
1) The line that assigns MKLROOT in your makefile.include is commented out with a #. If MKLROOT is not defined as an environment variable either, the compiler will not find the libraries.
2) That same line has a slightly different path, "mnt/c/Users/jc112358/software/Intel/oneAPI" instead of "/mnt/c/Users/jc112358/software/Intel/oneAPI/mkl/2025.0". Please check that the correct path is written.
3) I think the correct syntax for linking libraries is to omit the file extension, so for example it should be -lmkl_scalapack_ilp64_dll, not -lmkl_scalapack_ilp64_dll.lib
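
To make point 3 concrete: on Linux, GNU ld turns -lNAME into a search for libNAME.so and then libNAME.a in each library directory, and only the form -l:FILE asks for a file name verbatim. The sketch below mimics that name mapping so a directory can be checked before linking. The function names ld_candidates and check_lib and the demo files are made up for illustration, and the sketch only looks at file names, not at whether a file is in a format the linker can use.

```shell
#!/bin/sh
# Print the file names GNU ld looks for when given -lNAME on Linux.
ld_candidates() {
    case $1 in
        :*) echo "${1#:}" ;;              # -l:FILE asks for FILE verbatim
        *)  echo "lib$1.so lib$1.a" ;;    # -lNAME asks for libNAME.so, then libNAME.a
    esac
}

# Report whether any candidate for -lNAME exists in directory DIR.
# Usage: check_lib DIR NAME
check_lib() {
    dir=$1
    name=$2
    for f in $(ld_candidates "$name"); do
        if [ -e "$dir/$f" ]; then
            echo "found $dir/$f"
            return 0
        fi
    done
    echo "missing: -l$name (looked for $(ld_candidates "$name") in $dir)"
    return 1
}

# Demo with empty placeholder files in a temporary directory.
demo=$(mktemp -d)
touch "$demo/mkl_core.lib" "$demo/libmkl_rt.so"
check_lib "$demo" mkl_core || true   # no match: only mkl_core.lib is present
check_lib "$demo" mkl_rt   || true   # match: libmkl_rt.so is present
```

Run against the directory listed in the previous post, a check like this would report -lmkl_core as missing even with the extension removed, because the files there are named mkl_core.lib and so on, without the lib prefix or a .so/.a suffix. That naming is the one used by the Windows MKL packages, while the Linux packages ship libmkl_*.so and libmkl_*.a.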

Let me know if this helps. Kind regards,
Pedro


Re: Error when compiling with mpiCC and nvidia

Posted: Thu Jan 09, 2025 6:30 pm
by jchapmanbu

My MKLROOT is defined at the top of my makefile.include as an export statement because if I define it the way that is included in the VASP references I get this error:

nvfortran-Fatal-MKLROOT not found. Please set the environment variable MKLROOT to the location of Intel's Math Kernel Libraries (the part of the path to the *.so files that precedes 'lib/<arch>'.)

When I define it as an export statement at the top of the makefile.include I do not get this error, but then I get the errors mentioned in my last response. I tried removing the .lib extension from the library names and the error persists. Since I'm including "export MKLROOT=/mnt/c/Users/jc112358/software/Intel/oneAPI/mkl/2025.0" at the top of the makefile.include, I'm confused as to how the ld command can't find the files in the MKLROOT path when it's explicitly defined at the beginning of the makefile. I've also added the MKLROOT/lib to the main PATH in the load_modules.sh and the ld command still can't find the files. I know this doesn't seem like a VASP error but rather an issue with the environment I have set up, but I'm curious if anyone has encountered this when compiling VASP on the Ubuntu shell in Windows before.
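
On the question of where ld looks: MKLROOT is only a variable that the makefile expands, and PATH is only used to locate executables. The linker searches for -l names in the directories passed with -L (plus its built-in defaults). A minimal sketch for listing those directories from a link line follows. The function name link_dirs and the example paths are hypothetical, and the sketch assumes the -L/dir spelling with no space after -L.

```shell
#!/bin/sh
# Print the directories named by -L flags in a link line, one per line.
link_dirs() {
    for flag in $1; do        # unquoted on purpose: split the line into words
        case $flag in
            -L?*) echo "${flag#-L}" ;;
        esac
    done
}

# Hypothetical link line; substitute the flags from your own makefile.include.
flags="-L/opt/example/mkl/lib -lmkl_intel_lp64 -lmkl_core -L/opt/example/extra"
link_dirs "$flags"
```

Each directory printed this way can then be inspected for files named libNAME.so or libNAME.a for every -lNAME on the same line. If none of the printed directories is the one that actually holds the libraries, the linker will report "cannot find" regardless of how MKLROOT or PATH is set.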