Gaussian

Accessing Gaussian

If you require access to Gaussian, please contact arc-help@lists.leeds.ac.uk asking to be added to the Gaussian group. Your request must include the phrase:

I agree to abide by the licensing conditions for academic use and citation as published by Gaussian Inc. and which may be varied from time to time.

Our licensing agreement with Gaussian Inc. allows for the use of their programs ONLY for academic research purposes. It is not permitted to use the software for commercial development, or for any software application being developed for commercial release.

In addition, it is not permitted to compare the performance of the Gaussian programs with competitor products (e.g. Molpro, Jaguar, etc.).

The source code CANNOT be used or accessed by any individual involved in the development of computational algorithms that may compete with those of Gaussian Inc.

The source code or binaries CANNOT be copied or made available for use outside of The University of Leeds.

Our license agreement with Gaussian, Inc. requires that all academic work created using the Gaussian package cite the use of Gaussian.

The required and proper citation information may be found on the Gaussian website at:
http://www.gaussian.com/g_tech/g_ur/m_citation.htm

Program Technical Support

The most recent version of the Gaussian manual, which has detailed explanations of the program options, a range of examples, and instructions on scratch file saving, memory allocation, running in parallel, etc., is available at:

http://www.gaussian.com/g_tech/g_ur/g09help.htm

Initial setup

To set up Gaussian, at the command prompt, enter:
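For example, assuming the module is simply named gaussian (module avail gaussian will list the versions actually installed):

    module load gaussian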

Using Scratch for Temporary Files

Gaussian uses the environment variable GAUSS_SCRDIR to determine where to write its scratch files during a run. Normal Gaussian runs are undertaken on the compute nodes. These have a local hard disk, /scratch, which offers the best performance for this data; by default, GAUSS_SCRDIR is set to /scratch.

There are times when setting this to /scratch is not appropriate, e.g. for some runs the /scratch disks may not be large enough to accommodate the temporary files. The size of /scratch on the compute nodes is as follows:

ARC2

 

Nodes     /scratch size   Memory / machine   Cores / machine
Compute   500GB           32GB               16

Sometimes the Gaussian temporary files can be large and the /scratch directory will not be large enough.
In this case we advise setting GAUSS_SCRDIR to a directory on /nobackup instead:
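For example (the directory name is illustrative):

    mkdir -p /nobackup/$USER/gaussian_scratch
    export GAUSS_SCRDIR=/nobackup/$USER/gaussian_scratch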

As /nobackup is a parallel filesystem, it should offer better performance than other network drives on the system, although it will still be slower than the compute nodes' local scratch disks.

When jobs finish, please delete any scratch files, since they can consume a considerable amount of disk space. In general, the only scratch file necessary to save from any given Gaussian job is the .chk file (for restart purposes). Other scratch files, such as read-write files (.rwf), integral files (.int) and second derivative files (.d2e), need not be saved. Automatic deletion of all scratch files but the checkpoint file is easily accomplished by appropriate placement of the following command in the link 0 section of the Gaussian input file (its use is described in detail in the Gaussian manual):
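The command is %NoSave: files named on % lines before %NoSave are deleted at the end of the run, while those named after it are kept. A sketch of a link 0 section that keeps only the checkpoint file (the filenames are illustrative):

    %RWF=water.rwf
    %NoSave
    %Chk=water.chk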

Running the code

Launching on the login nodes

Gaussian can be launched by entering g09 on the command line followed by the input filename:
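For example, for an input file called input.com (the filename is illustrative):

    g09 input.com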

Only very short runs should be launched on the login nodes. Compute-intensive runs should be executed through the batch queues (see below).

For further information on the usage of Gaussian, please consult the online documentation.

Launching through the batch queues

To run through the batch queues, construct a script requesting the resources required for the job. It may be appropriate to consider both memory usage and CPU time limits for the job.

A sample script follows:
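(A sketch; the module name and the input filename input.com are assumptions.)

    #!/bin/bash
    # Run in the current working directory, exporting the current environment
    #$ -cwd -V
    # Request 1 hour of runtime
    #$ -l h_rt=1:00:00
    # Request 1GB of memory (the default)
    #$ -l h_vmem=1G

    module load gaussian
    g09 input.com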

This will request 1 hour of runtime on a single processor with 1GB of memory. Since 1GB is the default memory allocation, the line #$ -l h_vmem=1G does not strictly need to be included.

Running in Parallel (Shared Memory)

The Gaussian executable is configured for shared memory parallel execution.

Unlike most shared-memory codes, Gaussian uses its own thread management. It is therefore important to set both the Gaussian input file (link 0 command: %NProcShared=<np>) and the job submission script to use an identical number of execution threads. The OMP_NUM_THREADS variable should only be set to 1, or not at all (the default), for Gaussian to work correctly.

A sample script is:
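(A sketch, assuming the shared-memory parallel environment is called smp; replace <np> with the number of threads required.)

    #!/bin/bash
    #$ -cwd -V
    #$ -l h_rt=1:00:00
    # 1GB of memory per core
    #$ -l h_vmem=1G
    # Request <np> cores on a single node
    #$ -pe smp <np>

    module load gaussian
    g09 input.com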

This will request <np> processors, each with 1GB memory.

On ARC2, the maximum size of a job is 16 cores and a total of 32GB of memory; on ARC3, the maximum is 24 cores and a total of 128GB on a low-memory node or 768GB on a high-memory node.

To instruct Gaussian to start <np> threads, %NProcShared=<np> should be set in the Gaussian input file.
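For example, for a four-thread run the link 0 section of the input file would begin:

    %NProcShared=4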

It is possible to combine the submission script and input file into a single script so that the np in the submission script always matches %NProcShared=np in the Gaussian input file:
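A sketch of such a script, using the scheduler's $NSLOTS variable so the thread count always matches the number of cores requested (the route section and geometry below are purely illustrative):

    #!/bin/bash
    #$ -cwd -V
    #$ -l h_rt=1:00:00
    #$ -pe smp 4

    module load gaussian

    # Write the input file, taking %NProcShared from the number of
    # slots granted by the scheduler so the two always agree.
    cat > formaldehyde.com <<EOF
    %NProcShared=$NSLOTS
    %Chk=formaldehyde.chk
    #P HF/6-31G(d) Opt

    Formaldehyde geometry optimisation

    0 1
    C    0.000    0.000    0.000
    O    0.000    0.000    1.215
    H    0.945    0.000   -0.550
    H   -0.945    0.000   -0.550

    EOF

    g09 formaldehyde.com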

The above script can be submitted to the batch queues through the qsub command:
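Assuming the script above is saved as gaussian_job.sh:

    qsub gaussian_job.sh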

When it runs, it creates the input file, formaldehyde.com, and then runs the job, such that the .log and .chk files are saved in the current working directory. The .rwf files are written to /scratch on the compute node where the job runs and are deleted upon completion of the job.

While Gaussian may be run on several processors, note that requesting the maximum number of processors and the maximum available memory is often not appropriate, and may even slow performance.

In practice, Gaussian does not appear to scale well over more than 4-6 processors, depending on the job type. The Gaussian manual includes a section that discusses efficiency, and offers recommendations regarding memory allocation for various types of serial and parallel jobs run at different levels of theory with different basis sets.

Running in Parallel (Distributed Memory)

While parallel runs of Gaussian using the method above are limited to the CPUs within a single compute node, it is also possible to make use of a larger number of CPUs across distributed nodes by using the Linda parallel execution environment.

This requires adding some additional instructions to both the job submission script and the input file.

Submission script for ARC2:
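(A sketch of a whole-node request on ARC2; the exact resource syntax should be checked against the local documentation. tpp is threads per process.)

    #!/bin/bash
    #$ -cwd -V
    # 30 hours of runtime
    #$ -l h_rt=30:00:00
    # Two whole nodes, with 16 threads per Linda process
    #$ -l nodes=2,tpp=16

    module load gaussian
    g09 input.com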

This will set up:

  • A 30hr runtime job
  • Using all the cores on 2 nodes

The corresponding link 0 section of the input file would therefore be:
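(A sketch; the hostnames are placeholders for the two nodes actually allocated to the job.)

    %NProcShared=16
    %LindaWorkers=node001,node002
    %Chk=input.chk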

Note that:

  • %nprocshared has the same value as tpp

Managing temporary files

Gaussian creates large temporary files in $GAUSS_SCRDIR. To prevent this directory from filling up, it is best to clear it out at the end of the run. An example script which runs Gaussian and then clears the temporary files is:
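(A sketch; the input filename is an assumption, and the rm pattern relies on Gaussian's default Gau-<pid> scratch file naming.)

    #!/bin/bash
    #$ -cwd -V
    #$ -l h_rt=1:00:00

    module load gaussian
    g09 input.com

    # Remove the scratch files left behind by this run
    rm -f "$GAUSS_SCRDIR"/Gau-*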

Using Gaussview

Gaussview is installed for the preparation of Gaussian input files and viewing of output.

It may be better to install this application on a local workstation, rather than viewing the graphics over the network.

An X server should be running on the local machine and your SSH connection should have X11 forwarding enabled in order to use the Gaussview GUI. Documentation on gaining SSH access to the ARC systems is in the web pages that describe how to log in from each of the three main operating systems, i.e. Logon from Linux, Logon from Mac OS X and Logon from Windows.

Note that the X server needs to have 3D OpenGL extensions. Most Linux/Mac X servers will have this functionality; however, older versions of Exceed may not support it. Gaussview appears to work with the Cygwin X server, but has made Xming crash. ARC3 also has X2GO installed; this handles the X-windowing, normally providing smooth images/animations with no lag between frames and handling graphics communications without producing errors in your shell.

Launching on the front end

The Gaussian module sets up an alias, gv, for Gaussview:
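After loading the module, Gaussview can be started with:

    gv

An output file can also be opened directly, e.g. gv formaldehyde.log (the filename is illustrative).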

Launching through the batch queues

If your use of Gaussview is compute-intensive, you are encouraged to submit it as an interactive job to the batch queues. The following line can be used for this:
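(A sketch using qrsh; the exact form of an interactive request may differ between systems.)

    qrsh -cwd -V -l h_rt=1:00:00 gv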

This will ask for 1 hour of runtime.

Launching Gaussian jobs from within Gaussview

This is possible; however, you are encouraged to save the Gaussian input file and submit the job to the queues via the command line.