DINEOF init file

From GHER

Revision as of 03:42, 15 July 2012 by WikiSysop (Talk | contribs)

The Init file

All the information DINEOF needs has to be specified in the file dineof.init. A quick look at the one provided with the package should help you adapt it to your data.

  • The gappy data file and the mask are specified as follows:
data = ['/yourpath/your_gappy_data_file.ext']
mask = ['/yourpath/your_mask_file.mask']

or, if using netCDF format:

data = ['/yourpath/your_gappy_data_file.nc#vbe']
mask = ['/yourpath/your_mask_file.nc#vbe_mask']

The ''#'' separates the netCDF file name from the variable name.
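As an illustration of the file#variable notation, the following Python sketch splits such an entry into its two parts. The helper name split_spec is hypothetical and only mirrors the convention described above; it is not how DINEOF itself reads the init file.

```python
def split_spec(spec):
    """Split 'file.nc#var' into (file, var).

    Hypothetical helper: for an entry without '#' (binary format),
    the variable name is returned as None.
    """
    path, sep, var = spec.partition("#")
    return path, (var if sep else None)


netcdf_entry = split_spec("/yourpath/your_gappy_data_file.nc#vbe")
binary_entry = split_spec("/yourpath/your_gappy_data_file.ext")
```

Here netcdf_entry holds the file name and the variable name ''vbe'' separately, while binary_entry has no variable name.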

  • Then the name of the time file has to be specified. Specifying it activates the filtering of the temporal covariance matrix. As an example:
time = '/yourpath/your_gappy_data_file.nc#time'
alpha = 0.01
numit = 3

The time data have to specify the time increment between the individual images of your data file. For example, if all your images are separated by 1 day, your time file can consist of the numbers 1,2,3,4... etc. If day 3 is missing, it should be 1,2,4... The numbers themselves do not matter (the series can start at 1 or at 5000), but the increments between them do. The time file should be in binary format or in netCDF.
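A minimal Python sketch of how such time values could be derived from the acquisition dates of the images. The function name time_values and the example dates are assumptions made for illustration; writing the values to a binary or netCDF file is not shown.

```python
from datetime import date


def time_values(dates, start=1.0):
    """Return one time value per image, in days.

    Only the increments matter, so 'start' (hypothetical parameter)
    can be any number.
    """
    first = dates[0]
    return [start + (d - first).days for d in dates]


# Daily images with day 3 missing (made-up dates).
images = [date(2012, 7, 1), date(2012, 7, 2), date(2012, 7, 4)]
values = time_values(images)
increments = [b - a for a, b in zip(values, values[1:])]
```

With these dates, values is 1, 2, 4 and the increments are 1 and 2 days. Choosing start=5000 would shift the values but leave the increments unchanged.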

The other two parameters let you increase or decrease the strength of the smoothing (alpha) and the reach of the filter (numit). For details, see the paper "Enhancing temporal correlations in EOF expansions for the reconstruction of missing data using DINEOF".

If you want to deactivate this option, use:

alpha = 0

  • Next, the number of modes you want to calculate (nev) and the maximum size of the Krylov subspace (ncv; see Toumazou and Cretaux, 2001) are specified. You can set these two numbers quite high (though smaller than the temporal size of your data), as DINEOF will only compute the optimal number of EOFs + 3.
nev = 15
ncv = 20

ncv must be at least nev+5 and smaller than the temporal size of your matrix. You can begin by setting nev = 15 and then adjust this number for subsequent reconstructions. For example, if you run DINEOF and only 6 modes are needed, you can reduce nev to 10. If all 15 modes have been used for the reconstruction, DINEOF probably needs more information to compute it; try increasing nev to, for example, 20. The number of retained modes is shown in the screen output and is also saved in a file on your disk.
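The two constraints above can be checked before launching a run. The following Python sketch does so; the function check_modes and the variable nt (the temporal size of the data matrix) are hypothetical names, not part of DINEOF.

```python
def check_modes(nev, ncv, nt):
    """Return a list of problems with nev/ncv for a matrix of temporal size nt."""
    problems = []
    if ncv < nev + 5:
        problems.append("ncv must be at least nev+5")
    if ncv >= nt:
        problems.append("ncv must be smaller than the temporal size")
    return problems


ok = check_modes(nev=15, ncv=20, nt=100)       # values used in the example above
too_small = check_modes(nev=15, ncv=18, nt=100)
too_large = check_modes(nev=15, ncv=20, nt=20)
```

An empty list means both constraints are satisfied. Because ncv is at least nev+5, a valid ncv also guarantees that nev is smaller than the temporal size.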

  • The parameter neini allows you to start the calculation of the EOFs from any EOF:
neini = 1

With this value, the calculation starts at the first EOF (recommended).

  • The next parameter is tol, which is set to 1.0e-8 and which you can use for DINEOF without changing it.
  • The parameter nitemax sets the maximum number of iterations for each EOF calculation. You should set it to a large number (like 300) and increase it only if you see that this number is reached for each EOF.
nitemax = 300
  • The parameter toliter sets the precision criterion that defines when the DINEOF iterations stop automatically: they stop once the ratio (rms of successive missing data reconstruction)/stdv(existing data) becomes lower than toliter.
toliter = 1.0e-3
  • Next is the rec parameter. When set to ''1'', DINEOF will reconstruct the whole matrix using the EOF basis. With rec = 0, only missing data are reconstructed, and initially present points will be exactly the same as in the input matrix. Setting it to ''1'' gives smoother results and also avoids cold spikes at cloud edges and other sources of noise present in the initial matrix.
  • The parameter eof, when set to 1, writes the EOF basis used in DINEOF to disk: left and right EOFs, singular values and variance for each mode. If you are only interested in the reconstructed matrix, you can set this parameter to 0.
  • When running DINEOF with different variables, you must set norm to 1, to normalize the different input matrices. If you are using only one variable, you can set norm to 0.
  • Output folder. The EOF modes files will be written to the folder specified in this line.
  • clouds: this parameter allows you to choose the points for cross-validation. If commented out, DINEOF will randomly generate the cross-validation points for the error assessment of the reconstruction. However, you can also choose your own set of points to be used for cross-validation, usually in the form of clouds. An Octave/Matlab routine is provided (see folder Scripts) to generate this file.
  • The name of the reconstructed matrix is specified next, in ''results''. If you want it to be written to another folder, you can also specify that on this line. If you want the output to be written in binary format, you can specify something like:
results = ['/yourpath/your_gappy_data_file.ext.filled']

If you want the output to be written in netCDF format, then you have to write something like:

results = ['/yourpath/your_gappy_data_file.nc#vbe_filled']

The #vbe_filled means that a variable called ''vbe_filled'' will be created in the netCDF file. Note that the input and output formats are independent, i.e., you can use the binary format for the input and the netCDF format for the output, and vice versa.
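Two of the parameters described in the list above shape the matrix written to ''results'': toliter decides when the iterations stop, and rec decides which points are replaced. The Python sketch below illustrates both on plain lists. It is only an interpretation of the text: the function names are hypothetical, the rms is assumed to be taken over the change between two successive reconstructions at the missing points, stdv is assumed to be the population standard deviation, and missing values are represented by None.

```python
import math


def stopping_ratio(previous_fill, new_fill, existing):
    """(rms of change between successive fills) / stdv(existing data)."""
    n = len(new_fill)
    rms = math.sqrt(sum((a - b) ** 2 for a, b in zip(previous_fill, new_fill)) / n)
    mean = sum(existing) / len(existing)
    # Population standard deviation (an assumption, see lead-in).
    stdv = math.sqrt(sum((x - mean) ** 2 for x in existing) / len(existing))
    return rms / stdv


def merge(original, reconstruction, rec):
    """rec = 1: use the reconstruction everywhere.
    rec = 0: keep present points, fill only the missing ones (None)."""
    if rec == 1:
        return list(reconstruction)
    return [r if o is None else o for o, r in zip(original, reconstruction)]


original = [1.0, None, 3.0]
reconstruction = [1.1, 2.0, 2.9]
only_gaps = merge(original, reconstruction, rec=0)
everything = merge(original, reconstruction, rec=1)

toliter = 1.0e-3
ratio = stopping_ratio([2.0], [2.0005], [1.0, 3.0])
converged = ratio < toliter
```

With rec=0 the present values 1.0 and 3.0 are kept and only the gap is filled, whereas with rec=1 all three values come from the reconstruction.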


  • seed: DINEOF needs a random number to initiate the choice of the cross-validation points. If you repeat the same experiment with the same seed, the results will be the same. If you choose another value for seed, the results might vary slightly.
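The effect of the seed can be illustrated with a small Python analogy. This is not DINEOF's own random number generator or selection procedure; the function pick_cv_points, the number of candidate points and the seed value are all made up for the example.

```python
import random


def pick_cv_points(candidates, n, seed):
    """Pick n cross-validation points reproducibly from the candidate indices."""
    rng = random.Random(seed)  # private generator, fully determined by the seed
    return sorted(rng.sample(candidates, n))


valid_points = list(range(1000))  # indices of non-missing points (made up)
run_a = pick_cv_points(valid_points, 30, seed=243435)
run_b = pick_cv_points(valid_points, 30, seed=243435)
```

Both runs select exactly the same points because they share the same seed. A different seed would generally select a different set, which is why the results of the reconstruction might then vary slightly.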