# Acquisition et analyse des données, compléments

Materials for Data acquisition and analysis (OCEA0035-1)

## Organisation

This course is given in the first semester each year. We will review the theoretical concepts needed for the course, but the main focus is on applying various data analysis techniques to several data sets. The exercises are done in Matlab or Octave, and the necessary code will be either provided or developed during the lecture.

# Exercises

## Exercise 1

Quality control

Using the file 8762075.sealevel.txt or 8762075.sealevel.xls, which contains the hourly sea level height on the West Florida Shelf in 2004, detect the suspect data that might be classified as outliers. Discuss why these suspect data should or should not be classified as outliers.
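A possible starting point is sketched below in Python, purely for illustration (the exercises themselves are done in Matlab or Octave). It combines two common tests: a range check against minimum and maximum thresholds, and a check on the deviation from the mean. The function name `flag_suspect`, the thresholds, and the sample values are hypothetical and are not taken from the sea level file.

```python
from statistics import mean, stdev

def flag_suspect(values, vmin, vmax, n_sigma=3.0):
    """Return indices of values outside [vmin, vmax] or far from the mean.

    The thresholds are illustrative assumptions; choose them from the data.
    """
    # Range test first, so gross errors do not distort the statistics.
    in_range = [v for v in values if vmin <= v <= vmax]
    mu = mean(in_range)
    sigma = stdev(in_range)
    suspect = []
    for i, v in enumerate(values):
        out_of_range = not (vmin <= v <= vmax)
        far_from_mean = abs(v - mu) > n_sigma * sigma
        if out_of_range or far_from_mean:
            suspect.append(i)
    return suspect

# Made-up hourly sea level values (m), not taken from the exercise file.
sea_level = [0.10, 0.12, 0.15, 0.11, 9.99, 0.13, 0.09, 1.50, 0.14, 0.12]
suspect_idx = flag_suspect(sea_level, vmin=-2.0, vmax=2.0, n_sigma=2.0)
print(suspect_idx)
```

In a short series a single large value inflates the standard deviation, so the choice of `n_sigma` strongly affects what is flagged. A flagged value is only suspect: whether it is a true outlier or a real event is the subject of the discussion.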

## Exercise 2

Linear regression

To illustrate the application of linear regression and calculation of trends, atmospheric carbon dioxide from Mauna Loa (Hawaii) will be used.

Data: CO2 (ppmv) and time; Excel format

Apply a linear regression to the data (code here). Discuss whether a linear regression is appropriate in this case. Then compute a linear regression for separate periods and compare the trends for each period.
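The idea of comparing trends by period can be sketched as follows, in Python for illustration only (the exercise itself is done in Matlab or Octave). The function names `linear_fit` and `trend_by_period`, the period boundaries, and the synthetic series are assumptions made for this example; they are not the Mauna Loa record.

```python
def linear_fit(t, y):
    """Ordinary least-squares fit y = slope * t + intercept."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return slope, intercept

def trend_by_period(t, y, periods):
    """Fit a separate line on each (start, end) interval, end excluded."""
    trends = {}
    for start, end in periods:
        pairs = [(ti, yi) for ti, yi in zip(t, y) if start <= ti < end]
        ts = [p[0] for p in pairs]
        ys = [p[1] for p in pairs]
        trends[(start, end)] = linear_fit(ts, ys)[0]
    return trends

# Synthetic series (not the Mauna Loa record): the growth rate doubles
# halfway through, so one straight line cannot describe the whole series.
years = list(range(1960, 1980))
co2 = [315.0 + 1.0 * (yr - 1960) if yr < 1970
       else 325.0 + 2.0 * (yr - 1970) for yr in years]

slope_all, intercept_all = linear_fit(years, co2)
trends = trend_by_period(years, co2, [(1960, 1970), (1970, 1980)])
print(f"whole series: {slope_all:.2f} ppmv/yr")
for period, slope in trends.items():
    print(period, f"{slope:.2f} ppmv/yr")
```

The single fit over the whole series gives a slope between the two per-period slopes, which is one way to see that a single straight line hides a change in the growth rate.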

## Exercise 3

Filters and interpolation

Using this time series from the data in Exercise 1 (date information here), apply a Gaussian-window filter to extract the annual cycle. If there are small gaps in the time series, the data will be interpolated to a regular time step. Compare the linear and spline interpolation methods.
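A Gaussian-window filter is a weighted running mean whose weights follow a Gaussian curve. The sketch below is written in Python for illustration only (the exercise itself is done in Matlab or Octave) and assumes a regular time step with no gaps. The function names, the window parameters `half_width` and `sigma`, and the synthetic series are assumptions chosen for this example.

```python
import math

def gaussian_weights(half_width, sigma):
    """Normalised Gaussian weights for offsets -half_width..half_width."""
    raw = [math.exp(-0.5 * (k / sigma) ** 2)
           for k in range(-half_width, half_width + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def gaussian_filter(values, half_width, sigma):
    """Weighted running mean with a Gaussian window.

    Near the ends the window is truncated and the weights renormalised,
    which is only one of several possible edge treatments.
    """
    weights = gaussian_weights(half_width, sigma)
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        wsum = 0.0
        for k in range(-half_width, half_width + 1):
            j = i + k
            if 0 <= j < n:
                w = weights[k + half_width]
                acc += w * values[j]
                wsum += w
        smoothed.append(acc / wsum)
    return smoothed

def rms_diff(a, b):
    """Root-mean-square difference between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Synthetic, regularly sampled series: a slow cycle (period 240 samples)
# plus a fast oscillation (period 12 samples). Not the sea level data.
n = 720
slow = [math.sin(2 * math.pi * i / 240) for i in range(n)]
noisy = [s + 0.3 * math.sin(2 * math.pi * i / 12) for i, s in enumerate(slow)]
smoothed = gaussian_filter(noisy, half_width=36, sigma=12.0)

print(f"RMS difference to slow cycle, raw:      {rms_diff(noisy, slow):.3f}")
print(f"RMS difference to slow cycle, filtered: {rms_diff(smoothed, slow):.3f}")
```

The window width controls which time scales survive: a wider window removes more of the fast variability but also damps the slow cycle slightly and degrades the estimate near the ends of the series.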

## Exercise 4

Error Assessment

Using these data from the Cariaco basin (Venezuela), assess the error of the model with respect to the observations using the various error measures seen during the lecture. Explain what these results tell us about the model performance.
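As an illustration, the sketch below computes four commonly used measures: bias, mean absolute error, root-mean-square error, and the Pearson correlation coefficient. Whether these match the measures covered in the lecture is an assumption. The code is in Python for illustration only (the exercise itself is done in Matlab or Octave), and the function name `error_measures` and the sample values are made up; they are not the Cariaco data.

```python
import math

def error_measures(model, obs):
    """Bias, MAE, RMSE and Pearson correlation of model against observations."""
    n = len(obs)
    diff = [m - o for m, o in zip(model, obs)]
    bias = sum(diff) / n
    mae = sum(abs(d) for d in diff) / n
    rmse = math.sqrt(sum(d * d for d in diff) / n)
    m_mean = sum(model) / n
    o_mean = sum(obs) / n
    cov = sum((m - m_mean) * (o - o_mean) for m, o in zip(model, obs))
    var_m = sum((m - m_mean) ** 2 for m in model)
    var_o = sum((o - o_mean) ** 2 for o in obs)
    corr = cov / math.sqrt(var_m * var_o)
    return {"bias": bias, "mae": mae, "rmse": rmse, "corr": corr}

# Made-up values: the model reproduces the variability of the observations
# exactly but is offset by a constant of one unit.
obs = [20.0, 21.0, 22.0, 23.0, 24.0]
model = [21.0, 22.0, 23.0, 24.0, 25.0]

scores = error_measures(model, obs)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this example the correlation is perfect while the bias and the RMSE are not zero, which shows why several measures are needed: each one describes a different aspect of the model performance.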

## Final Exercise

Exploratory data analysis

Using these data from the TAO website (Tropical Atmosphere Ocean array):

- Exploratory representation of data and analyses:
  - Detection of exclusion values (missing values)
  - Detection and elimination of outliers (minimum and maximum threshold, deviation from mean...)
  - Other suspect data?
- Representation of data:
  - Time series
  - Vertical profiles (different seasons, different years)
  - Distribution of variables (histogram)
  - Discussion on representation of data: scales, readability of axes, units ...
- Interpolation of data to get rid of gaps
- Filter the data to analyse the seasonal cycle
- Description of the seasonal and inter-annual variability of the variables downloaded
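
The first and third steps of the list (detecting exclusion values, then interpolating over gaps) can be sketched as follows, in Python for illustration only (the exercise itself is done in Matlab or Octave). The exclusion value `FLAG`, the function names, the `max_gap` limit, and the sample values are all hypothetical; the actual exclusion value must be read from the header of the downloaded file.

```python
def mask_missing(values, flag):
    """Replace the exclusion value `flag` with None."""
    return [None if v == flag else v for v in values]

def fill_gaps_linear(t, values, max_gap=None):
    """Linearly interpolate None entries between valid neighbours.

    Gaps at the start or end of the series are left as None.
    If max_gap is given, gaps longer than max_gap points are also left.
    """
    filled = list(values)
    valid = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(valid, valid[1:]):
        gap = right - left - 1
        if gap == 0:
            continue
        if max_gap is not None and gap > max_gap:
            continue
        for i in range(left + 1, right):
            frac = (t[i] - t[left]) / (t[right] - t[left])
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled

FLAG = -9999.0  # hypothetical exclusion value, check the file header
hours = [0, 1, 2, 3, 4, 5, 6, 7]
temp = [28.0, FLAG, 29.0, 29.5, FLAG, FLAG, FLAG, 27.5]  # made-up values

masked = mask_missing(temp, FLAG)
filled = fill_gaps_linear(hours, masked, max_gap=2)
print(filled)
```

Limiting the gap length is a design choice: interpolating across a long gap creates values that look smooth but carry no information, which can distort the filtered seasonal cycle.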