Optimal interpolation Fortran module with Octave interface

From GHER

Revision as of 07:37, 17 July 2014 by Alex (Talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

This Fortran 90 module performs a n-dimensional optimal interpolation (OI). The optimal interpolation allows to interpolate arbitrarily located observations to a regular grid using a background field as first guess. The observations and the background fields may contain errors. Optimal interpolation merges observations and background taking their expected variances into account. The merged field is optimal in the sense that it has the lowest error variance.

The error correlation function of the background field is assumed to decrease exponentially with the square of the distance. The correlation length in every dimension must be specified. Other correlation functions can easily be implemented.

In some disciplines, this technique is called objective analysis or kriging but the underling algorithm is the same.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

This package is also included in the octave-forge repository. Some Linux distributions already include octave-forge and you only need to install the octave-forge package in order to use this package. For other distribution, you can compile either octave-forge or optiminterp from source.

Contents

Implementation

Let's start with some notations. The vector \mathbf x^b is the background state estimation, a first guess. The observation are contained in the vector \mathbf y^o. The operator \mathbf H extracts from a state vector the corresponding values at the location of the observation. The vector \mathbf H \mathbf x^b represents thus the first guess at the location of the observations. The difference is noted \mathbf d:


\mathbf d =\mathbf y^o - \mathbf H \mathbf x^b

The elements of matrix \mathbf P^b the represents to covariance between two points (x1,...,xn) and (y1,...,yn) in the n-dimensional space. Knowing the error covariance of the observations \mathbf R and the error covariance of the background state \mathbf P^b the Kalman gain matrix is defined as:


\mathbf K
=\mathbf P^b \mathbf H^T \left(
\mathbf H \mathbf P^b \mathbf H^T+ \mathbf R \right)^{-1}


The optimal state vector \mathbf x^a can be computed as:

 
\mathbf x^a - \mathbf x^b = \mathbf K \mathbf d

The optimal interpolation code operates on difference from the background state. From the difference \mathbf d it computes the state vector correction \mathbf x^a -\mathbf x^b. It is thus up to the users of this program to form the vector \mathbf d and after state vector correction is obtained to compute the analysis \mathbf x^a.

The error covariance \mathbf P^a of the analyzed state \mathbf x^a is given by:

 
\mathbf P^a  =  \mathbf P^b - \mathbf K \mathbf H \mathbf P^b


The following assumptions are made:

  • \mathbf R is assumed diagonal.
  • \mathbf P^b is simply given by a parametrization. Two analytical structures are available:
  1. Gaussian (default):  
P^b(x_1,...,x_n,y_1,...,y_n)=\sigma(x_1,...,x_n)^2 \exp\left(
-\frac{(x_1-y_1)^2}{L_1^2}
...
-\frac{(x_n-y_n)^2}{L_n^2} 
\right)
  2. DIVA (based on modified Bessel function 2nd kind). Use preprocessor option -DBACKGROUND_COVARIANCE_DIVA to enable this background covariance.

The error variance σ(x1,...,xn)2 and the correlation lengths L1,...,Ln have to be specified.

  • Only a limited number m (specified to the user) contribute to the analysis of a given point.

Requirements

  1. Fortran 90 compiler
  2. LAPACK and BLAS

For the optional GNU Octave interface you will need:

  1. Octave with its development package

Download the package

Download the sources of the toolbox from http://octave.sourceforge.net/optiminterp/index.html

Fortran interface

Decompress the file

tar -xzvf optiminterp-0.3.4.tar.gz
cd optiminterp-0.3.4

The optimal interpolation module is written in Fortran 90. The file example_optiminterp.F90 in src shows how to use the optimal interpolation module from Fortran. On a Linux or UNIX system, you would compile this example, by

cd src
gfortran -o example_optiminterp optimal_interpolation.F90  example_optiminterp.F90 -llapack -lblas

Replace gfortran with the name of your Fortran 90 compiler. You can check, if the example is correctly compiled by:

$ ./example_optiminterp
Expected:  0.022206
Obtained:  0.022206

Octave interface

Installation from your Linux repository

Some Linux distributions include already this package. In this case you can install it using your distributions' package manager. Under Debian, Ubuntu or Mint this can be done from a terminal via

sudo apt-get install octave-optiminterp

Users of Fedora can install this package by issuing

yum install octave-optiminterp

This works also on Red Hat and Centos after enabling the Extra Packages for Enterprise Linux repositories.

Automatic installation using Octave's package manager

Inside octave, you can download and install the package by issuing:

pkg install -verbose -forge -auto optiminterp

You should see no error messages. If you are curious to know where octave has placed the optimal interpolation toolbox issue the following command:

octave:3> pkg list
Package Name | Version | Installation directory
-------------+---------+-----------------------
 optiminterp |   0.3.4 | /home/abarth/octave//optiminterp-0.3.4

If your version of octave was compiled with g77, you can tell mkoctfile to use a different Fortran compiler. To use the gfortran compiler, you have to set the following environment variables before starting octave and installing the package.

export F77=$HOME/local/gfortran-20080226/bin/gfortran
export FLIBS=$($F77 -print-file-name=libgfortran.so)

For g95:

export F77=g95
export FFLAGS=-fno-second-underscore
export FLIBS=$(g95 -print-file-name=libf95.a)

For ifort:

export F77=ifort
export FLIBS="-L/opt/intel/fc/10.0.023/lib -lifcore_pic  -lguide"

Manual installation

Decompress the file:

tar -xzvf optiminterp-0.3.4.tar.gz
cd optiminterp-0.3.4/src

Since we need to link Fortran and C++ code, the manual installation is compiler specific. Basically we need to know where the Fortran run-time libraries are located. The locations and the names of the run-time libraries are specified with the FLIBS environment variable.

gfortran

If you have gfortran version 4.1 or newer, all you need to do is to compile with:

gfortran -fPIC -c optimal_interpolation.F90 optiminterp_wrapper.F90
FLIBS=$(gfortran -print-file-name=libgfortran.so) mkoctfile optiminterp.cc optiminterp_wrapper.o optimal_interpolation.o

Unfortunately, the Fortran 90 interpolation module doesn't work with gfortran 4.0. It compiles but it generates a run-time error. In this case, I recommend to use the g95 Fortran compiler.

g95

As gfortran, g95 is also an open source Fortran 90 compiler. Binaries for most platforms are available at the g95 website. Download and extract the tar file corresponding to your platform, e.g.:

cd $HOME
wget -O - http://ftp.g95.org/g95-x86_64-32-linux.tgz | tar  xvfz -

This installs g95 in your home directory under g95-install. In order to avoid some typing you can define an alias. You need probably to adapt the name of the g95 executable as it depends on your platform.

alias g95="$HOME/g95-install/bin/x86_64-unknown-linux-gnu-g95"

This alias is only defined in the current shell. In order to make this alias permanent you need to add this line in your $HOME/.bashrc file.

Issue the following command:

g95  -fno-second-underscore  -fPIC -c optimal_interpolation.F90 optiminterp_wrapper.F90
FLIBS=$(g95 -print-file-name=libf95.a) mkoctfile optiminterp.cc optiminterp_wrapper.o optimal_interpolation.o

Intel compiler

Assuming that your Intel compiler is installed in /opt/intel/fc/9.0/, you can compile it with:

ifort -fPIC -O3 -c optimal_interpolation.F90 optiminterp_wrapper.F90
FLIBS="-L/opt/intel/fc/9.0/lib -lifcore_pic  -lguide" mkoctfile -v optiminterp.cc optiminterp_wrapper.o optimal_interpolation.o

PGI compiler

The PGI compiler works also:

pgf90 -fPIC -c optimal_interpolation.F90 optiminterp_wrapper.F90
DL_LD="pgf90 -fPIC" mkoctfile -v optiminterp.cc optiminterp_wrapper.o optimal_interpolation.o

or assuming that the Fortran run-time libraries are in /usr/pgi/linux86-64/6.0/libso/, you can compile it with:

pgf90 -fPIC -c optimal_interpolation.F90 optiminterp_wrapper.F90
FLIBS="-L/usr/pgi/linux86-64/6.0/libso/ -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lpgc" mkoctfile -v optiminterp.cc optiminterp_wrapper.o optimal_interpolation.o

Testing the optimal interpolation module

The octave script "test_optiminterp.m" performs a 1D, 2D, 3D and 4D optimal interpolation. All tests should pass; any error indicates that either there is a bug in the optimal interpolation package or that it is incorrectly installed.

octave:1> test_optiminterp
Testing 1D-optimal interpolation: OK
Testing 2D-optimal interpolation: OK
Testing 3D-optimal interpolation: OK
Testing 4D-optimal interpolation: OK

Using the optimal interpolation module

The following is a minimal example about how to use the optimal interpolation package in Octave. This example is included in the file "example_optiminterp.m". The package also includes an example about how to use the OI package from Fortran 90 called "example_optiminterp.f".

% number of observations to interpolate

on = 200;

% create randomly located observations within 
% the square [0 1] x [0 1] 

x = rand(1,on);
y = rand(1,on); 

% the underlying function to interpolate

f = sin(6*x) .* cos(6*y);

% the error variance of the observations

var = 0.01 * ones(on,1);

% the grid onto which the observations are interpolated

[xi,yi] = ndgrid(linspace(0,1,100));

% the correlation length in x and y direction

lenx = 0.1;
leny = 0.1;

% number of influential observations

m = 30;

% run the optimal interpolation
% fi is the interpolated field and vari is its error variance

[fi,vari] = optiminterp2(x,y,f,var,lenx,leny,m,xi,yi);


Simple example for using the optimal interpolation module with your own data

Let's assume that your data is the text file data.txt with the following contents:

  0.699949  -1.262174   4.785741
 -1.984238  -1.404258   5.116080
 -1.268750  -0.309276   4.821194
  0.997442  -0.887413   4.810200
  1.584304   1.681749   5.087108
  0.177603   1.051811   4.981727
  [...]

The 1st and 2nd column represent the Cartesian coordinate in a 2D space and the 3rd column represents the observations.

This data is interpolated by the script example_optiminterp_data.m copied below:

% This script illustrates a 2D interpolated with user-data

% load the data file
data=load('data.txt');

% the 1st and 2nd column represent the location
x = data(:,1);
y = data(:,2);
% the 3rd column represents the observations to interpolate
obs = data(:,3);

% create a regular mesh from -2 to 2 in x and y direction
% with a grid spacing of 0.1

[xi,yi] = ndgrid(-2:0.1:2,-2:0.1:2);

% number of influential points

m = 30;

% compute the first guess (or background), here we take the 
% mean of all observations
mean_obs = mean(obs);

% remove the mean
f = obs - mean_obs;

% var is the error variance ratio between of the observations and background.
% observations are assumed to be 100 times more accurate (in terms of error variance) than the
% background estimation (here the mean)

var = 0.01 * ones(size(f));

% call optiminterp for 2d case with a correlation length of 1.5 in x and y direction

[fi,vari] = optiminterp2(x,y,f,var,1.5,1.5,m,xi,yi);

% put the mean back
obsi = fi + mean_obs;

Here is what you get:

File:Example optiminterp data.jpg

If you want to try this example, download example_optiminterp_data.m and data.txt and place them in the same folder. Start Octave, and then change the directory to this folder and execute example_optiminterp_data:

octave:1> cd /your/folder
octave:2> example_optiminterp_data

Parameters of optiminterp1,...,optiminterp4

Input

  • f: The vector f corresponds to a data point (observation) minus the background (\mathbf d =\mathbf y^o - \mathbf H \mathbf x^b)
  • x,y,z,...: Every element in f corresponds to a point located at (x,y,z,...)
  • var: The error variance of the observations divided by the error variance of the background field \frac{R}{P^b}.
  • lenx, leny, lenz,... are the correlation lengths in x-,y-,z-,... directions respectively (L1,L2,L3,...).
  • m: represents the number of influential points. If m increases, the algorithm converges to the optimal solution. m should be as large as your computer resources allow. In practice, m should be more than 30.
  • xi, yi, zi,... are the data points where the field is interpolated.

Output

  • fi: The vector fi corresponds to the analysis increment (\mathbf x^a - \mathbf x^b).
  • vari: The error variance of the analysis divided by the error variance of the background field \frac{P^a}{P^b}.

It is important not to forget to:

  1. remove the background estimation (first guess) from the observations
  2. divide the error variance of the observations R by the error variance of the background Pf.

Compiler benchmarks

Some informal benchmarks:

Compiler Version Optimization Run-time (s)
gforrtan 4.0 -O3 3.81
gfortran 4.3.0 20080206 -O3 3.79
g95 0.90 -O3 5.15
ifort 10.0 20070426 -O3 3.06

Compilation with OpenMP

The optimal interpolation module is parallelized using OpenMP directives. To use multiple CPUs of a SMP machine you need to enable OpenMP (using the -openmp flag of the Intel compiler or -mp flag of PGI compiler) and set the environment variable OMP_NUM_THREADS to the desired number of CPUs to use before starting octave.

Troubleshooting

  • Undefined symbols with Intel's Fortran Compiler.
octave:1> test_optiminterp
Testing 1D-optimal interpolation: failed
error: [...]/optiminterp-0.3.4/optiminterp.oct: undefined symbol: __intel_cpu_indicator

You will also need to link against Intel's ifport library by adding "-lifport" at the end of the link command. If you get other error messages complaining about undefined symbols you might want to follow the instructions described in Intel's Compilers for Linux - Compatibility with the GNU Compilers (Thanks Claudio for pointing this out)

  • Runtime error with gfortran.
octave:1> test_optiminterp
Testing 1D-optimal interpolation: Fortran runtime error: dimension of return array incorrect

This problem is fixed in optiminterp 0.2.4.

  • Issuing Control-C stops octave after calling optiminterp compiled with g95. I don't know why this happen. Since gfortran is faster anyway than g95, recommend using gfortran instead.

Comments and feedback

Any suggestions, comments, bug fixes, ... are of course very appreciated. Send them to: a dot barth at ulg dot ac dot be.

Alexander Barth

Personal tools