Data Format


Data are available for download in the following forms:

Native Unmodified Sensor Messages: The unmodified data messages output from the sensor are provided at the native resolution in daily text files. The time of collection and the sensor ID are prepended to each data message. No quality control has been performed.
Native Resolution Parameter Values: These data are parsed from the sensor data messages and provided as individual parameters at the native resolution (e.g. 1 Hz). Processed data are also sometimes available. Timestamps and data frequencies vary between sensors. Preliminary quality flags are included.
Binned Data with Statistics: These data from individual parameters are provided in temporal (e.g. one-minute) bins. The mean and accompanying statistics are provided. Preliminary quality flags are included.
Binned Data without Statistics: This dataset includes data from all sensors and all parameters. The data are provided in temporal (e.g. one-minute) bins. Only the mean is provided (no statistics, no quality flags).
 

Native Unmodified Sensor Messages

All sensor data received by CORIOLIX are archived as ASCII flat files, one file per sensor per day. The unmodified raw messages (strings) are stored at the native resolution collected by the sensor. The format of the strings varies by sensor. Each message string is prepended with the UTC timestamp of receipt and the CORIOLIX sensor ID. No quality control has been performed on these data. These datasets are delivered to a long-term archive (e.g. R2R or NCEI) after every cruise. These native resolution data files are also available to end-users for download.
 

Native Resolution Parameter Values

Data Download Page: High Resolution Data Download

For real-time applications, a subset of sensor data are extracted and stored within CORIOLIX. In some cases, post-processed parameters are also derived. These searchable datasets are available at their full resolution. The collection timestamps and data values are unmodified. Preliminary quality control flags are provided as an additional field.

Date Range

High resolution data are retained and available for download only for a limited period. The retention period can be set to 10 days or more (90 days is recommended) on the Data/Access/Archive page.


 

Binned Data with Statistics

Data Download Page: Binned Data with Statistics Download

Binning Method

Data are binned using an (unweighted) mean over a set time interval. Statistics are also captured (std, min, max, num, spotval, median) and made available to the end-user.

Circular data (such as heading or wind direction) are averaged using circular mean and circular standard deviation calculations. For example, the circular average of 355° and 5° is 0°, not 180°.
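The circular averaging described above can be sketched as follows. This is a minimal illustration, not the CORIOLIX implementation: the function names are hypothetical, and the standard deviation uses the common sqrt(-2 ln R) definition, which is an assumption about what "circular standard deviation" means here.

```python
import math

def circular_mean_deg(angles_deg):
    """Circular mean of angles in degrees (hypothetical helper)."""
    # Sum the unit vectors for each angle, then take the angle of the sum.
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_deg)
    # Wrap into the 0-360 range; rounding may return 360.0 instead of 0.0.
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0

def circular_std_deg(angles_deg):
    """Circular standard deviation in degrees (assumed definition)."""
    n = len(angles_deg)
    mean_sin = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    mean_cos = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    # R is the mean resultant length; clamp to 1.0 to absorb rounding error.
    r = min(math.hypot(mean_sin, mean_cos), 1.0)
    if r == 0.0:
        return math.inf  # directions cancel completely
    return math.degrees(math.sqrt(-2.0 * math.log(r)))

# Headings on either side of north average to north (0 or 360 within
# rounding), whereas a simple arithmetic mean of 355 and 5 would give 180.
north = circular_mean_deg([355.0, 5.0])
```

An arithmetic mean fails here because angles wrap at 360°; averaging the sine and cosine components avoids the discontinuity.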

Vector data (such as true winds) are averaged using vector averaging. First, the components (u,v,w) are each averaged using a simple mean. These mean components are then used to derive the mean direction and mean magnitude:
mean_direction = rad2deg(atan2(mean_u, mean_v)) + 180
mean_magnitude = sqrt(mean_u^2 + mean_v^2)
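The two formulas above translate directly into a short function. The sketch below is illustrative only: the function name and sample values are hypothetical, and it assumes u and v are the eastward and northward components of the flow, so that the +180 offset yields the direction the wind comes from.

```python
import math

def vector_mean(u_values, v_values):
    """Vector-average direction (degrees) and magnitude (hypothetical helper)."""
    # Step 1: average each component with a simple unweighted mean.
    mean_u = sum(u_values) / len(u_values)
    mean_v = sum(v_values) / len(v_values)
    # Step 2: derive direction and magnitude from the mean components,
    # exactly as written in the formulas above.
    mean_direction = math.degrees(math.atan2(mean_u, mean_v)) + 180.0
    mean_magnitude = math.sqrt(mean_u ** 2 + mean_v ** 2)
    return mean_direction, mean_magnitude

# Flow moving due west (u negative, v zero) is reported as coming from 90.
direction, magnitude = vector_mean([-4.0, -2.0], [0.0, 0.0])
```

Note that the magnitude of the mean vector can be smaller than the mean of the individual magnitudes when the direction varies within the bin.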


Binning Time Interval

All parameters are binned by default to 1-minute bins, centered on "nice" minute timestamps (12:03:00, 12:04:00, etc). On request prior to a cruise, binned data at a different (custom) time interval may also be generated for that cruise. The default for the custom bin interval is 10 minutes. For data collected less often than once per minute (e.g. ceilometer data, collected every six minutes), the bins between data points are filled with NaNs. The timestamp provided with each bin is the center time of the bin.
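As a rough illustration of center-aligned bins, the sketch below maps a sample timestamp to the center time of its bin. The function name is hypothetical, and the edge rule (a sample exactly half an interval after a center goes to the next bin) is an assumption, since the boundary handling is not stated here.

```python
import math
from datetime import datetime, timezone

def bin_center(ts, interval_s=60):
    """Return the center time of the bin containing ts (hypothetical helper).

    Each bin is assumed to span [center - interval/2, center + interval/2).
    """
    epoch = ts.timestamp()
    # Round to the nearest multiple of the interval, with halves rounding up.
    center = math.floor(epoch / interval_s + 0.5) * interval_s
    return datetime.fromtimestamp(center, tz=timezone.utc)

sample = datetime(2024, 1, 1, 12, 2, 45, tzinfo=timezone.utc)
one_minute = bin_center(sample)        # bin centered on 12:03:00
ten_minute = bin_center(sample, 600)   # bin centered on 12:00:00
```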


Date Range

Binned data are retained and available for download only for a limited period. The retention period can be set to 10 days or more (90 days is recommended) on the Data/Access/Archive page.


Binned Data Quality

Bins are derived only from data that have flag values of 0, 1, or 2 (i.e. no suspect or failed data). For example, if there are 10 data points within the binning time interval, and one of them has a flag set to "fail", only the remaining 9 data points are used to calculate the bin value.
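The filtering rule above can be sketched as follows. This is a simplified illustration with hypothetical names: it assumes one integer flag per data point, and returning NaN for a bin with no acceptable data is an assumption rather than documented behavior.

```python
import math

# Flag values that are accepted into a bin, per the rule above.
ACCEPTED_FLAGS = {0, 1, 2}

def bin_mean(values, flags):
    """Unweighted mean of the values whose flag is accepted (hypothetical helper).

    Returns (mean, number_of_values_used).
    """
    kept = [v for v, f in zip(values, flags) if f in ACCEPTED_FLAGS]
    if not kept:
        return math.nan, 0  # assumed behavior for an empty bin
    return sum(kept) / len(kept), len(kept)

# Ten data points, one flagged 4: only the other nine contribute.
values = [10.0] * 9 + [999.0]
flags = [1] * 9 + [4]
mean, num_values = bin_mean(values, flags)
```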


Binned Quality Flags

Quality flags representing the binned data are a composite of all of the flags from that bin. For each flag position, the largest value is used in the composite binned flag.

For example:
Individual Data Flags
111144000022222222111100000000
121122000033222222114400000000
Composite Flag
121144000033222222114400000000
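The composite rule (largest value at each flag position) can be reproduced with a few lines of code. The function name is hypothetical, and the sketch assumes every flag is a string of single digits of equal length, as in the example above.

```python
def composite_flag(flags):
    """Combine flag strings by taking the largest digit at each position."""
    if len({len(f) for f in flags}) != 1:
        raise ValueError("all flag strings must have the same length")
    # zip(*flags) groups the characters found at each position; for single
    # digits, the largest character is also the largest numeric value.
    return "".join(max(chars) for chars in zip(*flags))

individual = [
    "111144000022222222111100000000",
    "121122000033222222114400000000",
]
combined = composite_flag(individual)
```

Applied to the two individual flags from the example, this yields the composite flag shown above.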


Binned Data Structure

A query for a single day of one-minute binned data will return 1440 rows (60 minutes * 24 hours).

Every downloaded binned data file contains the following columns:
time: the center timestamp of the bin
spot_time: the timestamp of the spot value
spot_value: data value collected closest in time to the center time of the bin
num_values: the number of values in the bin

When relevant, binned data files will also contain these additional columns:
mean: the unweighted mean (see above for circular and vector means)
stddev: the standard deviation
minimum: the minimum value in the bin
maximum: the maximum value in the bin
median: the median value in the bin
Note that for some data types (e.g. text), a spot value is a more relevant metric than the mean.

 

Binned Data without Statistics

Data Download Page: Binned Data Download

Binning Method

Binned data may be represented by either spot values or unweighted means over a set time interval. Text, geographic point, and datetime data are represented by spot values. Circular data (such as heading or wind direction) are averaged using circular mean and circular standard deviation calculations. For example, the circular average of 355° and 5° is 0°, not 180°.

Vector data (such as true winds) are averaged using vector averaging. First, the components (u,v,w) are each averaged using a simple mean. These mean components are then used to derive the mean direction and mean magnitude:
mean_direction = rad2deg(atan2(mean_u, mean_v)) + 180
mean_magnitude = sqrt(mean_u^2 + mean_v^2)
All other numeric data (floats, integers) are calculated using unweighted means.


Binning Time Interval

All parameters are binned by default to 1-minute bins, centered on "nice" minute timestamps (12:03:00, 12:04:00, etc). On request prior to a cruise, binned data at a different (custom) time interval may also be generated for that cruise. The default for the custom bin interval is 10 minutes. For data collected less often than once per minute (e.g. ceilometer data, collected every six minutes), the bins between data points are filled with NaNs. The timestamp provided with each bin is the center time of the bin.


Date Range

Binned data are retained and available for download only for a limited period. The retention period can be set to 10 days or more (90 days is recommended) on the Data/Access/Archive page.


Binned Data Quality

Bins are derived only from data that have flag values of 0, 1, or 2 (i.e. no suspect or failed data). For example, if there are 10 data points within the binning time interval, and one of them has a flag set to "fail", only the remaining 9 data points are used to calculate the bin value.


Binned Data Structure

A query for a single day of one-minute binned data will return 1440 rows (60 minutes * 24 hours).

Every downloaded binned data file contains the following columns by default:
time: the center timestamp of the bin
latitude: the latitude closest in time to the bin center timestamp (a spot value)
longitude: the longitude closest in time to the bin center timestamp (a spot value)
point: the geo point derived from the lat/lon values
In addition, the file may contain multiple additional parameters on request (one column per parameter).
Each parameter column contains either the mean or spot value (as appropriate) for that parameter.

 
 
 
RCRV Datapresence and Engineering Support Center
Oregon State University
Corvallis, OR 97331, USA
coriolix_support@oregonstate.edu