ifort 9.0 with Red Hat Linux EM64T

Discussion on computers, ROMS installation and compiling

Moderators: arango, robertson

nganju
Posts: 82
Joined: Mon Aug 16, 2004 8:47 pm
Location: U.S. Geological Survey, Woods Hole

ifort 9.0 with Red Hat Linux EM64T

#1 Post by nganju » Mon Aug 22, 2005 6:40 pm

Hello, I just purchased a dual-processor 3.6 GHz Xeon machine (Dell Precision 670n) with the "Red Hat Enterprise Linux WS v4 Intel EM64T 64 bit" operating system. I plan on installing the recent Intel compiler (9.0), which claims to "support" 64-bit Intel processors.

Dell's website says "All XEON Processors Support Intel® Extended Memory 64 Technology".

Question 1: Has anyone set up a similar system to compile the ROMS code (successfully)? Any advice on makefiles, flags, etc. would be appreciated.

Question 2: What does it mean to "support" 64bit computing? Is this a 64bit machine or some intermediate step? If the processor "supports" 64bit and the compiler "supports" 64bit does that mean the machine is using 64bit computing?

Thanks...

kate
Posts: 3695
Joined: Wed Jul 02, 2003 5:29 pm
Location: IMS/UAF, USA

#2 Post by kate » Thu Aug 25, 2005 6:20 pm

It could mean one of two things, as far as I know:

1. If the floating point registers are 64-bit, then double precision should be as fast as single precision for doing floating point operations. This has been true of the IBM Power series, at least pwr3 and pwr4.

2. If the OS and hardware support 64-bit memory addressing, you can run problems that are larger than 2 GB in memory. The BENCHMARK3 test built into ROMS is a problem requiring a 64-bit address space. Some systems (such as IBM, SGI, etc.) have two different modes, two different object file types, and require a consistent build with 64-bit mode enabled, including for the NetCDF library used.

I guess there is also the question of a 64-bit file system, determining whether you can have files larger than 2 GB.
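
To see which of these cases applies on a given Linux machine, one can ask the system directly. This is a minimal sketch, assuming a POSIX shell with `uname` and `getconf` available; `arch_bits` is a hypothetical helper, and its list of machine names is illustrative, not exhaustive.

```shell
#!/bin/sh
# Map a machine name (as printed by `uname -m`) to its address width.
# arch_bits is a hypothetical helper; the list of names is not exhaustive.
arch_bits() {
    case "$1" in
        x86_64|amd64|ia64|ppc64) echo 64 ;;
        i?86|ppc)                echo 32 ;;
        *)                       echo unknown ;;
    esac
}

# The kernel architecture reflects the installed OS, not just the CPU.
machine=$(uname -m)
echo "kernel architecture : $machine ($(arch_bits "$machine")-bit)"

# LONG_BIT is the size of a C long in the default userland ABI.
echo "userland long size  : $(getconf LONG_BIT) bits"
```

Note that a 64-bit-capable CPU running a 32-bit operating system will still report a 32-bit architecture here. On Linux, the `lm` flag in /proc/cpuinfo indicates whether the processor itself can run in 64-bit mode.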

black

#3 Post by black » Thu Sep 29, 2005 9:27 pm

nganju,
I am curious to see if your configuration worked. I am trying a similar configuration myself: dual 3.2 GHz Xeons, Fedora Core 4 64-bit, Intel Fortran compiler 9.0. What version of ROMS did you use? Any problems compiling and running?
Thanks.

nganju

#4 Post by nganju » Mon Oct 03, 2005 4:20 pm

I am still working on installing the netCDF libraries. The ./configure script needs a file called "crt1.o" that is supposed to be in usr/bin/lib; on my machine it is in the folder usr/bin/lib64. So I simply need to copy the file, but alas the system administrator did not give me full rights on the machine and he is on vacation, so right now I am stalled. Please keep me posted on your progress; I will let you know when I move forward... good luck!

black

#5 Post by black » Mon Oct 03, 2005 6:58 pm

Well, we tried to install ifort 9.0 64-bit, but it didn't work. The 32-bit version installed fine (it comes with the 64-bit version), but installation failed for 64-bit. So we went ahead and tried to install the netCDF libraries anyway, using the 32-bit compiler. That did not work either. Can't remember the exact error. Now we are using the Portland Group compiler because it has been tested on FC4 64-bit. We installed it and the netCDF libraries with no problem. I will be trying to run ROMS 2.2 later today. I'll keep you posted.

nganju

#6 Post by nganju » Mon Oct 03, 2005 7:05 pm

ifort 32 and 64 installed OK on the system (I had to install it under my user account and not as root; I still lack access). I am not trying to install netCDF with the 64-bit compiler until I can get the 32-bit one running.

jivica
Posts: 122
Joined: Mon May 05, 2003 2:41 pm
Location: The University of Western Australia, Perth, Australia

libs without root passwd

#7 Post by jivica » Mon Oct 10, 2005 7:10 am

You can use the LD_LIBRARY_PATH variable to specify a directory which should be included in the search path,
like (bash):
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/your_path_to_spec_lib

should work..
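
A slightly more defensive variant of the same idea is sketched below. It avoids adding the directory twice and avoids a stray leading colon when the variable starts out empty (the glibc loader treats an empty entry as the current directory). `path_append` is a hypothetical helper name and `/home/me/lib` is a placeholder, not a path from this thread.

```shell
#!/bin/sh
# Append directory $2 to the colon-separated list $1 and print the result.
# Skips the directory if it is already listed, and avoids a leading colon
# when the list is empty. path_append is a hypothetical helper name.
path_append() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;      # already present, unchanged
        "::")     printf '%s\n' "$2" ;;      # list was empty
        *)        printf '%s\n' "$1:$2" ;;   # normal append
    esac
}

# /home/me/lib is a placeholder for wherever the library was installed.
LD_LIBRARY_PATH=$(path_append "${LD_LIBRARY_PATH:-}" "/home/me/lib")
export LD_LIBRARY_PATH
```

Putting these lines in a shell startup file makes the setting persistent without needing root access.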
Cheers, Ivica

dglee
Posts: 2
Joined: Thu Sep 23, 2004 4:35 pm
Location: Busan National University

#8 Post by dglee » Mon Oct 17, 2005 5:11 am

Hi, I successfully installed ifort 9.0 on an AMD64 machine. I have a 64-bit netCDF library. If anyone wants the 64-bit netCDF, please let me know.
John

nganju

netcdf with ifort9.0

#9 Post by nganju » Wed Nov 02, 2005 10:09 pm

It looks like at least one person has had success building netCDF with ifort 9.0. What flags might you have used, and what adjustments did you have to make to environment variables? I am still getting an error "skipping incompatible /lib*.* when searching for -lm", and I have tried adding all sorts of paths to LD_LIBRARY_PATH. Any ideas??
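
With the GNU linker, a "skipping incompatible ... when searching for -l..." message generally means that a candidate library was found but was built for a different architecture than the objects being linked, for example a 32-bit library in a 64-bit build. One way to check is to read the ELF class byte of the files involved. This is a minimal sketch, assuming a POSIX `od`; `elf_class` is a hypothetical helper, and the two library paths are only the conventional locations, not ones confirmed in this thread.

```shell
#!/bin/sh
# Print 32 or 64 according to the ELF class byte (offset 4) of a file,
# or "not-elf" if the file does not start with the ELF magic number.
# elf_class is a hypothetical helper name.
elf_class() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo not-elf
        return 0
    fi
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Compare the math library in both conventional locations, where present.
# On some systems libm.so is a text linker script and reports "not-elf".
for lib in /usr/lib/libm.so /usr/lib64/libm.so; do
    if [ -f "$lib" ]; then
        echo "$lib: $(elf_class "$lib")"
    fi
done
```

Running the same check on the object files produced by the compiler shows whether the build and the library agree. Where the `file` utility is installed, it reports the same information for shared libraries and object files.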

nganju

#10 Post by nganju » Tue Dec 06, 2005 7:57 pm

Thanks for everyone's suggestions, all is running now. On a side note, I am getting a slight speed-up (~1-2%) when hyperthreading is turned off. I tried various tilings, and in all cases it was faster with HT off. Any reason to leave HT on?

shchepet
Posts: 185
Joined: Fri Nov 14, 2003 4:57 pm

Ifort + EM64T

#11 Post by shchepet » Mon Dec 12, 2005 6:50 pm

I start my reply with nganju's question about the slight speed-up (~1-2%) when hyperthreading is turned off.

Yes, this is consistent with my own experience, and you should always turn hyperthreading OFF in the BIOS of your dual-Xeon machine. I observed improvement by as much as 7 to 10%. In the case of a single-processor Pentium 4 machine, which also supports hyperthreading, I found that the best approach is to use a single-processor kernel specifically optimized for Pentium 4 (Mandrake Linux has kernels labelled as "...i686-4GB-up". I do not know the exact analog of this name in Fedora or Red Hat, but I suspect they do the same). In this case it does not matter whether hyperthreading is ON or OFF in the BIOS. The observed performance improvement (relative to the SMP or Enterprise kernel, which is the default for hyperthreaded Pentium 4) is up to 15% (say 3 seconds out of 18).

A possible, speculative explanation is that hyperthreading is designed for a server kind of environment, where the processor is working on multiple tasks with very low computing load but is constantly interrupted by disk I/O or network activity. In this case the system can utilize resources better, because if one task gets stuck waiting for I/O, the processor power is given to somebody else rather than wasted. In the case of a heavy computing task, when nothing else is going on in the machine and your memory bus is saturated all the time, you are better off with an extra real, rather than virtual, CPU, and hyperthreading only confuses thread scheduling.


Question 2: What does it mean to "support" 64bit computing? Is this a 64bit machine or some intermediate step? If the processor "supports" 64bit and the compiler "supports" 64bit does that mean the machine is using 64bit computing?


Intel was kind of evasive about it. EM64T (the so-called Extended Memory 64 Technology) is somewhat in between ia32 and ia64 (which is true 64-bit). However, as far as I know, it has nothing to do with "64-bit computing", i.e. register length and hardware design for arithmetic operations. All Intel CPUs, starting with Pentium III, have 80-bit registers (which is even longer than 64) and are capable of performing double-precision arithmetic as fast as single precision with no performance penalty whatsoever (in fact, there is no performance penalty even for 80-bit arithmetic).

This results in a peculiar feature that if you compute something like

      sum=0.D0
      do i=1,100000
        sum=sum+A(i)**2
      enddo

then sum grows and at some point it becomes much larger than each individual A(i)**2, resulting in roundoff error. However, the variable "sum" is permanently stored in an 80-bit register and A(i)**2 is promoted to 80 bits before it is added to sum. As a result, Pentium is more accurate than any other processor with 64-bit arithmetic in this particular example.

Once the number leaves the register, it is truncated to 64 bits, which is the standard double precision.
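
The roundoff effect itself can be seen without a compiler. The sketch below uses awk, whose numbers are ordinary 64-bit doubles, to add a small term to a large running sum and report whether the small term survived. It illustrates only the loss in standard double precision, not the 80-bit register behaviour; `absorbed` is a hypothetical helper name and the values are chosen purely for illustration.

```shell
#!/bin/sh
# Add a small term to a large running sum in double precision (awk's
# number type) and report whether the small term survived the addition.
# absorbed is a hypothetical helper name.
absorbed() {
    awk -v big="$1" -v small="$2" 'BEGIN {
        sum = big + 0                 # force numeric context
        sum = sum + small             # one step of the accumulation loop
        result = (sum == big + 0) ? "lost" : "kept"
        print result
    }'
}

absorbed 1e17 1   # increment is far below the spacing of doubles near 1e17
absorbed 1e3  1   # plenty of mantissa bits left for the small term
```

A double carries a 53-bit mantissa, so neighbouring values near 1e17 are 16 apart and the added 1 disappears. An 80-bit register carries a 64-bit mantissa and would hold 1e17 + 1 exactly, which is the advantage described above.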


EM64T and ia64 processors support 64-bit addressing, resulting in the capability to run jobs using more than 2 GB of memory, provided that the compiler and operating system also support it.

EM64T + a 64-bit operating system allows you to create netCDF files larger than 2 GB, provided that you overcome all the hassles of compiling the netCDF library with the proper flags. (Even if you succeed, good luck with portability and reading them using standard ia32 MATLAB.)


Everything said above also applies to the AMD Opteron processor, which is fully compatible with Intel's EM64T.


One can install either a 32- or 64-bit version of the Linux operating system on an EM64T machine. In my experience, using the 64-bit version with appropriate compilers makes the machine faster by as much as 25 to 30%.


Question 1: Has anyone set up a similar system to compile the ROMS code (successfully)? Any advice on makefiles, flags, etc. would be appreciated.

Yes.

I am mainly using the Intel 8.1 compiler (latest release, which is now 8.1.033). The 9.0 is still somewhat immature: it works, but always complains about "skipping incompatible libraries", and it results in slightly slower executables (within less than 5%). My experience with Intel compilers is: do not bother until the release number reaches .030 or more.

The flags are below (these are actually for Opterons, which do not support Intel's latest SSE3 instructions. On the newest P4/Xeon one can also set the X-flags to -axP -xP, resulting in faster executables; overall, you do not see much difference compared with P4 on a 32-bit operating system: all differences are hidden inside the compiler):


CPP = /lib/cpp -traditional
CPPFLAGS = -D__IFC


#
# Compiler settings: -fpp2 is required only if -openmp is present.
# Not having -fpp2 here just causes compiler warning (-fpp is set to
# level 2 by -openmp), but other than that has no effect.

# Switch -pc80 increases precision of floating point operation to
# 64 bits (vs. 53 bits double precision default).
#
# -qp compiles and links for function profiling with gprof(1);
# this is the same as specifying -p or -pg.
#
# Setting FFLAGS = -O2 -mp (or lower optimization level) is needed
# to pass ETALON_CHECK: -O3 causes roundoff-level differences from
# the length of innermost i-loop (the results still pass ETALON_CHECK
# if NP_XI = NSUB_X = 1, regardless of partition in ETA-direction).


OMP_FLAG = -fpp2 -openmp

# CFT = ifc $(OMP_FLAG) -pc80 -tpp7 -axW -xW
# CFT = ifort $(OMP_FLAG) -pc80 -tpp7 -axN -xN -auto -stack_temps
# CFT = ifort $(OMP_FLAG) -pc80 -tpp7 -axN -xN -align dcommon -auto -stack_temps
CFT = ifort $(OMP_FLAG) -w95 -tpp7 -align dcommon -auto -stack_temps
# -xW

# -warn unused

LDR = $(CFT)

# FFLAGS = -O3 -IPF_fma -ip
FFLAGS = -O2 -mp
# -prof_gen
#
# FFLAGS = -g -CA -CB -CS -CU -CV
# FFLAGS = $(CPPFLAGS) -O0

# LDFLAGS =


COMP_FILES = work.pc work.pcl ifort*

LCDF = -lnetcdf
# LCDF =/usr/local/lib/libnetcdf.a

LMPI = /opt/mpich-1.2.7p1/lib/libmpich.a

shchepet

ifort + EM64T

#12 Post by shchepet » Mon Dec 12, 2005 7:10 pm

I forgot to mention: because on my machine(s) I always have multiple compilers and multiple operating systems, I always place the netCDF library in the "lib" directory of the compiler which was used to compile it.

This way, I just have to specify

LCDF = -lnetcdf


without a path to it. The path is actually controlled by LD_LIBRARY_PATH, which is set in my .cshrc file consistently with the compiler.
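
For readers using a Bourne-style shell instead of csh, the same arrangement might look like the sketch below. It assumes the compiler is installed with sibling `bin` and `lib` directories, which may not match every installation; `compiler_libdir` is a hypothetical helper name.

```shell
#!/bin/sh
# Given the full path of a compiler binary, print the sibling "lib"
# directory of the same installation (.../bin/ifort -> .../lib).
# compiler_libdir is a hypothetical helper; the layout is an assumption.
compiler_libdir() {
    bindir=$(dirname "$1")
    printf '%s/lib\n' "$(dirname "$bindir")"
}

# Use whichever ifort comes first in PATH, if there is one, and put its
# lib directory at the front of the library search path.
if fc=$(command -v ifort 2>/dev/null); then
    libdir=$(compiler_libdir "$fc")
    LD_LIBRARY_PATH="$libdir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
fi
```

Because the library directory is derived from the compiler found in PATH, switching compilers also switches which netCDF build is picked up.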
