The back end and libraries are now up for general testing on a variety of boxes and operating systems. The tarball is usually updated as the web page is. To download and install g95 on unix systems, run the following command (originally from Joost Vandevondele):

wget -O -
linux.tgz | tar xvfz -

This will create a directory named 'g95-install' in the current directory. Run (or better yet make an appropriate symbolic link to) ./g95-
g95 in order to run g95.


Simple coarray example

This example shows how to use coarrays to split up an integration over multiple cores on the same computer or across a network. The same program runs on each core, referred to as an "image". Each image has a unique image number, one to the total number of cores/images.

This example defines a derived type called 'work', which is used to communicate between image number one and the rest of the images. Square bracket signify a coarray in the variable declaration. Square brackets in expressions indicate memory references on another image. The 'sync images' statement causes an image to wait until the specified image synchronizes against the original image.

The example can be extended by modifying the 'work' derived type to include inputs and outputs to each sub-task. In our example, an integral is split into sub-ranges, which are passed to separate images which compute the integral. Image 1 adds the individual ranges up into the final integral. The example can be modified to collect and collate individual results in any manner desired.

module parallel_module

type work
   double precision :: a, b, integral
   logical :: halt
end type work

type(work) :: coarray[*]   ! Shared by all images

! integrate-- Integration.  Our example is the definite integral of
! f(x) = x.  Your integral or sub-calculation goes here.

  double precision function integrate(a, b)
    double precision a, b

    integrate = 0.5 * (b*b - a*a)
  end function integrate

! worker-- Infinite loop that processes integration requests from the
! master.  All images except #1 end up here.

  subroutine worker
       sync images(1)   ! Wait for image 1 to set up our work
       if (coarray % halt) stop

       coarray % integral = integrate(coarray % a, coarray % b)
       sync images(1)   ! Release image 1 to compute final integral
  end subroutine worker

! parallel_integral-- Subroutine for parallel integration of the
! interval (a, b).
  double precision function parallel_integral(a, b)
    double precision a, b, h

    if (num_images() == 1) then
       parallel_integral = integrate(a, b)

! Set up the work for each image

    h = (b - a) / (num_images() - 1)
    do i=2, num_images()
       coarray[i] % halt = .FALSE.
       coarray[i] % a = a + (i-2) * h
       coarray[i] % b = a + (i-1) * h

    sync images(*)    ! Release workers
    sync images(*)    ! Wait for workers to finish

!  Collect the results

    parallel_integral = 0.0
    do i=2, num_images()
       parallel_integral = parallel_integral + coarray[i] % integral
  end function parallel_integral

! shutdown-- Called by image #1 to cause all other images to stop.

  subroutine shutdown
    do i=2, num_images()
       coarray[i] % halt = .TRUE.
    end do

    sync images(*)
  end subroutine shutdown
end module parallel_module

! Top-level program that calls the parallel integrator

program p
  use parallel_module

  if (this_image() /= 1) call worker

  do i=1, 10
     print *, i, parallel_integral(0.0d0, real(i, kind=8))

  call shutdown
end program p

Running a G95 coarray program:

SMP and Network versions of coarrays are available on the x86/linux, x86-64/linux and ia64/linux platforms, with g95 0.94 compiled after Jan 15, 2013. No special options are needed to compile your program.

To run a 10-image coarray program on an SMP machine use

  ./a.out --g95 images=10

To run a 10-image coarray program across a (heterogeneous) network use

  ./a.out --g95 master images=10

On a single machine. On one or more compute nodes, use

  ./a.out --g95 host images=1

where the number of images on the host nodes add up to the number of images on the master. Multiple images started on a host node communicate with each other much more efficiently than using the network.

When you start the programs, they automatically link up with each other over the network. When the master locates the number of requested images, it starts your program on all machines.

SMP coarrays are free, more than four networked hosts requires purchasing a license.

Quick Overview of Coarrays

Coarrays are an exciting parallel programming extension for fortran. They are special variables that can be shared across multiple instances of the same program, which are called 'images'. The main advantage of coarrays is the high level of integration with the fortran language itself, making programs vastly more readable than subroutine calls to parallel libraries. Synchronization primitives are also provided.

Coarrays look like fortran arrays, except with square brackets instead of round. The square brackets indicate a reference to a coarray on another (or maybe the same) image. A simple coarray declaration looks like:

INTEGER :: x[*]

This declares x to be an integer that is sharable across images. In an expression, x refers to the x on the current image. x[1] refers to the x on image one and so on. The *-notation is like a fortran assumed-size array-- the bounds of the coarray are determined at run time, not compile time.

Using a coarray variable without square brackets refers to the local value within expressions. Using a coarray variable with square brackets refers to the coarray variable on a particular image. On the right side of an assignment, the value is loaded from an image. On the left side of an assignment, the value is stored to that image. The statement:

x[1] = x[2]
causes the value of x on image two to be loaded, then stored to x on image one. The statement has the same effect when executed on any image.

A simple coarray program is:

integer :: m[*]

print *, m

if (THIS_IMAGE() == 1) then
   do i=1, NUM_IMAGES()
      print *, i, m[i]

The THIS_IMAGE() intrinsic returns the image number of the current image, which is one through NUM_IMAGES(), which returns the total number of images in the calculation.

The SYNC ALL statement causes all images to wait until all images have reached the SYNC ALL statement. Once this has happened image one prints the value of m for all other images. All other images quietly terminate.

A huge advantage that coarrays have over message passing libraries is that the programs are much more readable. Message passing libraries have to be called, and the argument lists can be lengthy and hard to read. With coarrays, the square brackets tell you that cross-image communication is taking place and language statements are used to implement synchronization instead of subroutines.

For more information on coarrays and how to use them, see John Reid's paper "Coarrays in the next Fortran Standard" and my own Coarray Compendium. The compendium is a gentle but not exhaustive introduction to coarrays. It includes some example programs.


Coarrays in the next Fortan Standard (John Reid, N1747) HTTP
Coarray Compendium HTTP