xCAT Cluster Management Tools

xCAT is a very good cluster management toolkit for the experienced user. It takes some time to configure and the setup can get a little tedious, but the amount of control that you get over the cluster is fantastic. It works well with the IBM proposed solution for a cluster, and with the relevant hardware the cluster is really easy to manage. A very good set of documentation is already available, so I will not attempt to recreate it.

Please refer to http://www.alphaworks.ibm.com/tech/xCAT for detailed instructions.


NWChem for OSX on PPC

NWChem (http://www.emsl.pnl.gov/docs/nwchem/nwchem.html) is a computational chemistry package that is designed to run on high-performance parallel supercomputers as well as conventional workstation clusters. It aims to be scalable both in its ability to treat large problems efficiently, and in its usage of available parallel computing resources. NWChem has been developed by the Molecular Sciences Software group of the Environmental Molecular Sciences Laboratory (EMSL) at the Pacific Northwest National Laboratory (PNNL). Most of the implementation has been funded by the EMSL Construction Project.

Compiling for OSX on PPC

Installing NWChem on the PPC architecture was a pain, mainly because of problems in some of the Global Arrays libraries, which have since been fixed (I hope). The steps I took to get NWChem compiled are documented below.

Tools

Below are the open-source or free tools used to get things going:

  • GCC3.3.6
  • gcc3 from Xcode
  • LAM/MPI 7.1.1

More specifically,

  • gcc is the one from Xcode
  • f77 is the one from GCC3.3.6
  • LAM/MPI is compiled with the compilers mentioned above
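A quick way to see which compilers the shell will actually pick up might look like the sketch below. The helper name report_tool and the list of tool names (gcc, g77, mpicc, mpif77) are assumptions for illustration; adjust them to whatever your installation provides.

```shell
#!/bin/sh
# Report where a build tool resolves on PATH and the first line of its
# version banner. Returns 1 if the tool cannot be found.
report_tool() {
    tool=$1
    path=$(command -v "$tool" 2>/dev/null) || {
        echo "$tool: not found on PATH"
        return 1
    }
    echo "$tool: $path"
    # Most GNU tools answer --version; show only the first line of the banner.
    "$path" --version 2>/dev/null | head -n 1
    return 0
}

# Hypothetical tool list; a missing tool is reported but does not stop the loop.
for tool in gcc g77 mpicc mpif77; do
    report_tool "$tool" || true
done
```

If gcc resolves somewhere other than the Xcode installation, or g77 is not the GCC3.3.6 one, fixing PATH before building saves a lot of confusion later.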

Patches to the code

In armci/src/GNUmakefile, in the statement

SOCKETS  = $(SYSTEM_V)

you need to add MACX, i.e.,

SOCKETS  = $(SYSTEM_V) MACX
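The same edit can be scripted. The sketch below is one possible way to do it with sed; the function name add_macx_to_sockets is made up for this example, and it assumes the SOCKETS line looks exactly like the one shown above.

```shell
#!/bin/sh
# Append MACX to the SOCKETS line of an ARMCI GNUmakefile.
# Output goes through a temporary file so the same code works with
# both GNU sed and the BSD sed shipped with OS X.
add_macx_to_sockets() {
    makefile=$1
    tmp=$(mktemp "${TMPDIR:-/tmp}/sockets.XXXXXX") || return 1
    # Only a line that ends right after $(SYSTEM_V) is rewritten, so
    # running the function a second time does not add MACX twice.
    sed 's/^\(SOCKETS[[:space:]]*=[[:space:]]*\$(SYSTEM_V)\)[[:space:]]*$/\1 MACX/' \
        "$makefile" > "$tmp" && cat "$tmp" > "$makefile"
    status=$?
    rm -f "$tmp"
    return $status
}

# Apply it only when run from the directory that contains armci/.
if [ -f armci/src/GNUmakefile ]; then
    cp armci/src/GNUmakefile armci/src/GNUmakefile.orig   # keep a backup
    add_macx_to_sockets armci/src/GNUmakefile
    grep -n '^SOCKETS' armci/src/GNUmakefile
fi
```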

In tcgmsg-mpi/nxtval-armci.c, comment out line 105 (if(NODEID_()== NXTV_SERVER)ARMCI_Free(pnxtval_counter);) so that finalize_nxtval() reads:

void finalize_nxtval()
{
  /* if(NODEID_() == NXTV_SERVER)ARMCI_Free(pnxtval_counter); */
    ARMCI_Finalize();
}
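Line numbers can shift between releases, so it may be worth confirming where the call sits before editing. A minimal sketch, with find_nxtval_free as a hypothetical helper name:

```shell
#!/bin/sh
# Print every line of a C file that frees pnxtval_counter, prefixed with
# its line number, so the line to comment out can be confirmed first.
find_nxtval_free() {
    grep -n 'ARMCI_Free(pnxtval_counter)' "$1"
}

# Only runs when invoked from the directory that contains tcgmsg-mpi/.
if [ -f tcgmsg-mpi/nxtval-armci.c ]; then
    find_nxtval_free tcgmsg-mpi/nxtval-armci.c || echo "no matching line found"
fi
```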

Also note that I am using GA4, i.e. I replaced the “tools” directory in the NWChem src directory with the GA4 one and copied the original GNUmakefile over.

Environment Variables

Below are the environment variables used:

# NWCHEM stuff
export TCGRSH=ssh
export NWCHEM_TOP=/cluster/nwchem-4.7/src/nwchem-4.7
export NWCHEM_TARGET=MACX
# LAM_MPI configuration for NWCHEM
export MPI_LOC=/opt/cluster/lam-7.1.1/gcc-3
export MPI_LIB=$MPI_LOC/lib
export MPI_INCLUDE=$MPI_LOC/include
export LIBMPI="-llamf77mpi -lmpi -llam -lpthread"
export NWCHEM_NWPW_LIBRARY=/opt/cluster/nwchem-4.7/data/

Ensure that your FC and CC are pointing to the right compilers.
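Before starting a long build it can help to confirm that everything is actually set in the current shell. The following is a small sketch of such a check; check_vars is a hypothetical helper, not something shipped with NWChem.

```shell
#!/bin/sh
# Report which of the named environment variables are set.
# Returns non-zero if at least one is unset or empty.
check_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (portable sh).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        else
            echo "ok: $name=$value"
        fi
    done
    return $missing
}

check_vars NWCHEM_TOP NWCHEM_TARGET MPI_LOC MPI_LIB MPI_INCLUDE LIBMPI FC CC || true

# The MPI directories should exist before the build starts.
for dir in "${MPI_LIB:-}" "${MPI_INCLUDE:-}"; do
    if [ -n "$dir" ] && [ ! -d "$dir" ]; then
        echo "not a directory: $dir"
    fi
done
```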

Compiling NWChem

The modules and make commands were configured as

$ make nwchem_config NWCHEM_MODULES=all
$ gcc_select 3.3
$ make TARGET=MACX USE_MPI=y DIAG=PAR

The installation mechanics are as described in the INSTALL file.

Compiling ga-mpi.x for Global Arrays (GA) tests

Below are the commands I used to compile ga-mpi.x to test the GA component.

cc -I../../include -DMACX -O   -c -o ga-mpi.o ga-mpi.c -L/opt/cluster/lam-7.1.1/gcc-3/lib -llamf77mpi -lmpi -llam -lpthread -I/opt/cluster/lam-7.1.1/gcc-3/include
/opt/cluster/gcc-3.3.6/bin/g77 -c -O -O3 -funroll-loops -fno-second-underscore -Wno-globals -I../../include -DMACX  ffflush.F
mpicc -I../../include -DMACX -O   -c -o util.o util.c
if [ -f ga-mpi.c ]; then
  /opt/cluster/gcc-3.3.6/bin/g77 -g -O3 -funroll-loops -fno-second-underscore -Wno-globals \
    -o ga-mpi.x ga-mpi.o util.o -L../../lib/MACX -lglobal -lma -llinalg -larmci \
    -L/opt/cluster/lam-7.1.1/gcc-3/lib -ltcgmsg-mpi -llamf77mpi -lmpi -llam -lpthread -lm \
    -L/usr/lib/gcc/darwin/default -lgcc
else
  /opt/cluster/gcc-3.3.6/bin/g77 -g -O3 -funroll-loops -fno-second-underscore -Wno-globals \
    -o ga-mpi.x ga-mpi.o util.o ffflush.o -L../../lib/MACX -lglobal -lma -llinalg -larmci \
    -L/opt/cluster/lam-7.1.1/gcc-3/lib -ltcgmsg-mpi -llamf77mpi -lmpi -llam -lpthread -lm \
    -L/usr/lib/gcc/darwin/default -lgcc
fi
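Once ga-mpi.x links, it can be started under LAM/MPI roughly as sketched below. The process count of 2 is an arbitrary choice, and the run and run_ga_test helpers are hypothetical. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Print a command when DRY_RUN is 1 (the default here), otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Boot LAM, run the GA test binary, then shut LAM down again.
run_ga_test() {
    nprocs=${1:-2}                      # number of MPI processes (assumed)
    run lamboot || return 1             # start the LAM runtime
    run mpirun -np "$nprocs" ./ga-mpi.x
    status=$?
    run lamhalt                         # stop the LAM runtime
    return $status
}

run_ga_test 2
```

With no arguments lamboot starts LAM on the local machine only; for a multi-node run it takes a host file, which is left out of this sketch.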

Errors

If the compiler reports errors about the flags

-mtune=970 -mcpu=970...

I changed these to

-mcpu=powerpc

in config/makefile.h.
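For anyone who prefers to script that change, a sed-based sketch follows. It assumes the flags appear literally as -mtune=970 and -mcpu=970 and that dropping -mtune while swapping -mcpu is the intended result; the function name replace_970_flags is made up for this example. Check the file with grep first, since other 970-specific flags may be present.

```shell
#!/bin/sh
# Drop -mtune=970 and turn -mcpu=970 into -mcpu=powerpc in a makefile.
# Output goes through a temporary file so it works with GNU and BSD sed.
replace_970_flags() {
    file=$1
    tmp=$(mktemp "${TMPDIR:-/tmp}/makefile_h.XXXXXX") || return 1
    sed -e 's/[[:space:]]*-mtune=970//g' \
        -e 's/-mcpu=970/-mcpu=powerpc/g' \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Apply it only when run from the directory that contains config/.
if [ -f config/makefile.h ]; then
    grep -n -e '-mtune=970' -e '-mcpu=970' config/makefile.h   # show what will change
    cp config/makefile.h config/makefile.h.orig                # keep a backup
    replace_970_flags config/makefile.h
fi
```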