# NcArray

This toolbox for GNU Octave and Matlab allows accessing a single NetCDF file or a collection of NetCDF files (or OPeNDAP URLs) as a multi-dimensional array. An ncArray object behaves like a regular Octave/Matlab array (for manipulating the variable of a NetCDF file) and like an Octave/Matlab structure (for accessing the NetCDF attributes).

This toolbox uses the functions `ncinfo`, `ncread` and `ncwrite` to manipulate the NetCDF files and works on Matlab and Octave provided those functions exist. This is the case for Matlab since version R2011a and for Octave with the netcdf package 1.0.0 (or later).
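Whether these three functions are available in the current session can be checked before using the toolbox. The following is a minimal sketch that relies only on the built-in `exist` function; the variable names `required`, `is_missing` and `missing` are illustrative, and in Octave the netcdf package is assumed to have been loaded beforehand (for example with `pkg load netcdf`).

```octave
% Names of the functions that ncArray relies on (taken from the text above)
required = {'ncinfo','ncread','ncwrite'};

% exist returns 0 when a name is not known to the interpreter
is_missing = cellfun(@(name) exist(name) == 0, required);
missing = required(is_missing);

if isempty(missing)
  disp('ncinfo, ncread and ncwrite are available.');
else
  fprintf('Missing function: %s\n', missing{:});
end
```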

# Downloading and installing

## Using Octave's package manager (for Octave only)

Run the following command in Octave:

    pkg install -forge -auto ncarray

## Copying the source code (for Matlab and Octave)

The source code is available at SourceForge and distributed under the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. It can be downloaded with:

    svn co https://svn.code.sf.net/p/octave/code/trunk/octave-forge/extra/ncArray/inst ncArray

In Octave/Matlab, the toolbox must be added to the function search path, for example with `addpath`:

    addpath /path/to/your/copy/of/ncArray

## Testing (and benchmarking)

All tests in the script `test_ncarray` should pass.

### Matlab (R2013a)

    >> tic; test_ncarray; toc
    All tests passed.
    Elapsed time is 23.669172 seconds.

### Octave (3.8.1) and octave-netcdf 1.0.6

    >> tic; test_ncarray; toc
    All tests passed.
    Elapsed time is 8.55791 seconds.

Run times were measured on a Dell Latitude E6530 with an SSD.

# Using the toolbox

## ncArray

An ncArray object is created by passing the file name and the variable name to the `ncArray` function:

    A = ncArray('file.nc','temp');

The order of the dimensions is reversed relative to the order that the tool ncdump reports. For example, if according to ncdump a variable has the dimensions time, latitude and longitude, the dimensions of the ncArray array are ordered longitude, latitude and time. This choice is consistent with the Matlab functions `ncread` and `ncwrite`. It stems from the fact that Matlab/Octave store arrays in column-major order, while programs written in C (such as ncdump) use row-major order (more info: http://en.wikipedia.org/wiki/Row-major_order).
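The effect of the two storage orders can be reproduced with plain in-memory arrays. The sketch below does not use NetCDF at all; the names `M`, `c_buffer` and `B` are illustrative, and `c_buffer` merely stands in for the memory layout a C program would produce.

```octave
% Octave/Matlab store arrays column by column (column-major order):
M = [1 2 3; 4 5 6];        % size 2 x 3
memory_order = M(:)';      % elements in storage order: 1 4 2 5 3 6

% A C program storing the same 2 x 3 table row by row (row-major order)
% would lay out the values as 1 2 3 4 5 6. Reading that buffer without
% reordering gives an array whose dimensions appear reversed (3 x 2):
c_buffer = 1:6;
B = reshape(c_buffer,[3 2]);

% The data are the same, only the index order is swapped: B(j,i) equals M(i,j)
disp(isequal(B', M));      % displays 1
```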

    % load a subset
    v = A(index_range1,index_range2,...);
    % load the entire variable:
    v = full(A)
    % load an attribute value (Octave only):
    attribute_value = A.attribute_name;
    % or (Octave and Matlab)
    attribute_value = A.('attribute_name')
    % write data to the NetCDF file:
    A(:,:) = data;

Note that here `A = data;` will not work, because it replaces the ncArray A by the variable data.

For files conforming to the CF conventions, the coordinates can also be loaded with the ncArray object:

    value = A(:,:,:);
    [longitude,latitude,time] = A(:,:,:).coord;

## ncCatArray

Often a variable is split across multiple NetCDF files. The function `ncCatArray` allows accessing a collection of NetCDF files as a multi-dimensional array. This function works similarly to the Matlab `cat` function, which concatenates arrays along a given dimension. All NetCDF files must have a similar structure. In particular, the variable must have the same size in every file, except for the dimension along which it is concatenated.
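The size rule can be illustrated with the built-in `cat` function on in-memory arrays. In this sketch, `part1` and `part2` are hypothetical stand-ins for the same variable stored in two files; no NetCDF file is read.

```octave
% Stand-ins for the variable stored in two files: same size in
% dimensions 1 and 2, different lengths along dimension 3
part1 = ones(4,3,2);
part2 = 2*ones(4,3,5);

% Concatenate along the third dimension
whole = cat(3,part1,part2);

disp(size(whole));         % displays 4 3 7
```

Only the length along the concatenation dimension differs between the parts; a mismatch in any other dimension makes `cat` fail.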

The file names can be specified as a cell array of strings:

    A = ncCatArray(dim,{'file1.nc','file2.nc',...},varname)

`dim` is the dimension along which the variable `varname` in the files 'file1.nc', 'file2.nc',... is concatenated.

A shell wildcard ("globbing pattern") can also be used if all file names are similar:

    A = ncCatArray(dim,'file*.nc',varname)

The variable `varname` in all files (in alphabetic order) matching 'file*.nc' will be concatenated. File names can also be constructed by a function handle:

    A = ncCatArray(dim,filenamefun,varname,range)

For example:

    A = ncCatArray(3,@(t) ['file-' datestr(t,'yyyymmdd') '.nc'],...
                   varname,...
                   datenum(2012,07,08):datenum(2012,07,09));
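The file names produced by such a function handle can be inspected without opening any file. The sketch below reuses the handle and the date range from the example above and evaluates the handle once per element of the range with `arrayfun`; that `ncCatArray` evaluates it in the same way is an assumption, and the variable names `filenamefun`, `range` and `filenames` are illustrative.

```octave
% File-name function and range taken from the example above
filenamefun = @(t) ['file-' datestr(t,'yyyymmdd') '.nc'];
range = datenum(2012,07,08):datenum(2012,07,09);

% Evaluate the handle once per element of the range
filenames = arrayfun(filenamefun,range,'UniformOutput',false);

disp(filenames{1});        % file-20120708.nc
disp(filenames{2});        % file-20120709.nc
```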

Data can be loaded and written as before using a range of indices. If attributes are accessed, the attributes of the first NetCDF file are used.

## Reduction

The following reduction functions are defined:

- sum: sum
- prod: product
- mean: mean
- nanmean: mean ignoring NaNs
- var: variance
- std: standard deviation
- min: minimum
- max: maximum
- moment: central moment of a given order
- reduce: general-purpose reduction function

All reduction functions operate over a given dimension and load only the minimum amount of data into memory.
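The idea of reducing along a dimension while holding only one slice at a time can be sketched as follows. This is not the toolbox's actual implementation: an in-memory array `data` stands in for the ncArray or ncCatArray object, and the names `acc`, `n` and `m` are illustrative.

```octave
% Stand-in for a large variable of size 2 x 3 x 4
data = reshape(1:24,[2 3 4]);

% Accumulate the sum slice by slice along the third dimension.
% With a file-backed array, each iteration would read only one slice.
n = size(data,3);
acc = zeros(size(data,1),size(data,2));
for k = 1:n
  acc = acc + data(:,:,k);
end
m = acc / n;

% The result matches the mean computed on the whole array at once
disp(max(abs(m(:) - reshape(mean(data,3),[],1))) < 1e-12);   % displays 1
```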

Computing the mean over a large number of files (Analysis*/xa-xf.nc) can be done by:

    icediff = ncCatArray(3,'Analysis*/xa-xf.nc','icec');
    micediff = mean(icediff,3);

## Compressed and remote files

`ncArray` attempts to download remote files (when the location starts with http:// or ftp://) and decompresses files with the extensions `.gz`, `.bz2` or `.xz`. It uses the directory defined in the global variable `CACHED_DECOMPRESS_DIR` (which defaults to a temporary directory in /tmp). This directory is not removed at the end of an Octave/Matlab session, so that downloaded (or decompressed) files can be reused quickly. If the total size of the files in this directory exceeds the limit set by the global variable `CACHED_DECOMPRESS_MAX_SIZE` (expressed in bytes; the default is `1e10`, i.e. 10 GB), the oldest files are deleted.
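Both settings can be changed by assigning the global variables. The sketch below assumes that the toolbox reads them when a file is accessed, so they would need to be set before the first ncArray object is created. The directory name `ncarray-cache` and the 2 GB limit are hypothetical values chosen for illustration, and creating the directory in advance is a precaution rather than a documented requirement.

```octave
% Both variables are global, so they must be declared before being assigned
global CACHED_DECOMPRESS_DIR CACHED_DECOMPRESS_MAX_SIZE

% Hypothetical cache location; any writable directory could be used
CACHED_DECOMPRESS_DIR = fullfile(tempdir(),'ncarray-cache');

% Lower the size limit from the default 1e10 bytes (10 GB) to 2e9 bytes (2 GB)
CACHED_DECOMPRESS_MAX_SIZE = 2e9;

% Assumption: the directory should already exist when the toolbox uses it
if ~exist(CACHED_DECOMPRESS_DIR,'dir')
  mkdir(CACHED_DECOMPRESS_DIR);
end
```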