Data are binned using an (unweighted) mean over a set time interval.
Statistics are also captured (std, min, max, num, spotval, median) and made available to the end-user.
Circular data (such as heading or wind direction) are averaged using circular mean and circular standard deviation calculations.
For example, the circular mean of 355 and 5 degrees is 0 degrees, not 180 degrees (the arithmetic mean).
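The circular statistics described above can be sketched as follows. This is a minimal illustration, not the system's actual code: the function names are hypothetical, and the standard deviation uses one common definition, sqrt(-2 ln R), where R is the mean resultant length. The source does not state which definition is used.

```python
import math

def _mean_sin_cos(angles_deg):
    """Return the mean sine and mean cosine of a list of angles in degrees."""
    n = len(angles_deg)
    mean_sin = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    mean_cos = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    return mean_sin, mean_cos

def circular_mean_deg(angles_deg):
    """Circular mean in degrees, wrapped into [0, 360)."""
    mean_sin, mean_cos = _mean_sin_cos(angles_deg)
    result = math.degrees(math.atan2(mean_sin, mean_cos)) % 360.0
    # Floating-point rounding can turn a tiny negative angle into exactly 360.0.
    return 0.0 if result >= 360.0 else result

def circular_std_deg(angles_deg):
    """Circular standard deviation in degrees (assumed definition: sqrt(-2 ln R))."""
    mean_sin, mean_cos = _mean_sin_cos(angles_deg)
    # Clamp R to 1.0 so rounding error cannot produce a negative square root.
    r = min(math.hypot(mean_sin, mean_cos), 1.0)
    if r == 0.0:
        return math.inf  # angles cancel completely; spread is undefined
    return math.degrees(math.sqrt(-2.0 * math.log(r)))
```

Calling circular_mean_deg([355.0, 5.0]) returns a value within rounding error of 0, whereas a plain arithmetic mean of the same inputs gives 180.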
Vector data (such as true winds) are averaged using vector averaging.
First, the components (u,v,w) are each averaged using a simple mean.
These mean components are then used to derive the mean direction and mean magnitude:
mean_direction = rad2deg(atan2(mean_u, mean_v)) + 180
mean_magnitude = sqrt(mean_u^2 + mean_v^2)
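The two formulas above translate directly into Python. The sketch below assumes the u and v components are supplied as plain lists of numbers; the function name is hypothetical. The w component is omitted because the formulas for direction and magnitude do not use it.

```python
import math

def vector_mean(u_values, v_values):
    """Average the components first, then derive direction and magnitude."""
    mean_u = sum(u_values) / len(u_values)
    mean_v = sum(v_values) / len(v_values)
    # Same expressions as the formulas in the text; atan2 takes (u, v) in that order.
    mean_direction = math.degrees(math.atan2(mean_u, mean_v)) + 180.0
    mean_magnitude = math.sqrt(mean_u ** 2 + mean_v ** 2)
    return mean_direction, mean_magnitude
```

Because atan2 returns a value between -180 and 180 degrees, adding 180 places the resulting direction between 0 and 360 degrees.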
All parameters are binned by default to 1-minute bins centered on whole-minute timestamps (12:03:00, 12:04:00, etc.). On request prior to a cruise, binned data at an additional, custom time interval may also be generated for that cruise. For data collected less frequently than once per minute (e.g. ceilometer data, which are collected every six minutes), the bins that fall between data points are filled with NaN. The timestamp provided with each bin is the center time of the bin.
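A minimal sketch of centered binning with NaN fill is shown below. It is illustrative only: the function names are hypothetical, and the choice to treat each bin as the half-open interval from 30 seconds before the center up to (but not including) 30 seconds after it is an assumption, since the source does not say how boundary samples are assigned.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta, timezone

BIN_WIDTH = timedelta(minutes=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def bin_center(t, width=BIN_WIDTH):
    """Center of the bin containing t (assumed span: [center - w/2, center + w/2))."""
    n = (t - EPOCH + width / 2) // width
    return EPOCH + n * width

def bin_means(samples, width=BIN_WIDTH):
    """samples: iterable of (timezone-aware datetime, value).

    Returns {bin center: unweighted mean}; bins with no samples hold NaN.
    """
    groups = defaultdict(list)
    for t, value in samples:
        groups[bin_center(t, width)].append(value)
    if not groups:
        return {}
    result = {}
    center, last = min(groups), max(groups)
    while center <= last:
        values = groups.get(center)
        result[center] = sum(values) / len(values) if values else math.nan
        center += width
    return result
```

With a sensor that reports every six minutes, this produces one populated bin followed by five NaN bins, which matches the behavior described for ceilometer data.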
Binned data are stored and available only for a configurable number of days. This retention period can be set from 10 to 365 days (90 days is recommended) on the Data/Access/Archive page.
Three binned products are provided: "All", "Best", and "SAMOS". They differ in the quality of the data used to calculate each bin.
All Bins (a)
The "all" binned product bins all data points within the time interval regardless of quality. Quality flags are ignored.
Best Bins (b)
Best bins are derived only from data that have flag values of 0, 1, or 2 (i.e. no suspect or failed data).
For example, if there are 10 data points within the binning time interval, and one of them has a flag set to "fail",
only the remaining 9 data points are used to calculate the bin value.
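The selection rule can be sketched as follows, assuming each data point carries a flag string with one digit per quality test and that a point is excluded when any digit is something other than 0, 1, or 2. The function names and the (value, flag string) layout are hypothetical.

```python
def is_best(flag_string):
    """True when every flag position holds 0, 1, or 2."""
    return all(ch in "012" for ch in flag_string)

def best_mean(points):
    """points: list of (value, flag_string) pairs within one bin.

    Returns (mean, num) using only points that pass is_best;
    the mean is NaN when no point qualifies.
    """
    kept = [value for value, flags in points if is_best(flags)]
    if not kept:
        return float("nan"), 0
    return sum(kept) / len(kept), len(kept)
```

For ten points where one has a disallowed flag, best_mean reports num as 9 and averages only those nine values.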
Special note: QA/QC happens at various stages throughout the data lifecycle. "Best" bins are regenerated (a) at various stages of the automated QA/QC process, (b) on demand by the sensor technician during the manual QA/QC process, and (c) at the end of the cruise. As a consequence, the "best" binned values may change throughout the cruise. Data earlier in the lifecycle will, by definition, have had fewer opportunities for flags to be set to "suspect" or "fail".
SAMOS Bins (c)
The third binned product, "SAMOS", uses only a subset of flags to determine whether data should be included in the bin or not.
Data must pass only the following four quality tests to be included in the bin:
(1) Gap Test
(2) Syntax Test
(3) Gross Range Test
(4) Global Range Test
SAMOS performs further quality assessment on these binned data.
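The idea of checking only a subset of flag positions can be sketched as below. Two things here are assumptions and are not taken from the source: the zero-based positions of the four tests within the flag string are placeholders, and "pass" is taken to mean a flag value of 0, 1, or 2.

```python
# Hypothetical positions of the four tests in the flag string.
# The real layout is not given in this document.
SAMOS_TEST_POSITIONS = {
    "gap": 0,
    "syntax": 1,
    "gross_range": 2,
    "global_range": 3,
}

def passes_samos(flag_string, positions=SAMOS_TEST_POSITIONS):
    """True when each of the four listed tests has a flag of 0, 1, or 2 (assumed)."""
    return all(flag_string[i] in "012" for i in positions.values())
```

Flags at every other position are ignored, so a point that fails an unrelated test is still included in the SAMOS bin.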
Quality flags representing the binned data are a composite of all of the flags from that bin.
For each flag position, the largest value is used in the composite binned flag.
For example:
Individual Data Flags
111144000022222222111100000000
121122000033222222114400000000
Composite Flag
121144000033222222114400000000
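The position-by-position maximum can be written compactly in Python. This sketch assumes the flags are equal-length strings of single digits, as in the example above; the function name is hypothetical.

```python
def composite_flag(flag_strings):
    """Combine flag strings by taking the largest digit at each position."""
    flag_strings = list(flag_strings)
    if len({len(f) for f in flag_strings}) > 1:
        raise ValueError("all flag strings must have the same length")
    # zip(*...) groups the characters column by column; for single digits,
    # the largest character is also the largest numeric value.
    return "".join(max(column) for column in zip(*flag_strings))
```

Applied to the two individual flag strings in the example, this reproduces the composite flag shown.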
Special note: QA/QC happens at various stages throughout the data lifecycle.
As a consequence, the composite quality flags may change throughout the cruise.
Each downloaded binned data file contains the following columns:
Datetime: the center timestamp of the bin
Latitude: the latitude closest in time to the bin center timestamp (a spot value)
Longitude: the longitude closest in time to the bin center timestamp (a spot value)
Parameter Data: a JSON dictionary containing the binned data and statistics
Multiple parameters may be included in a single file (one column per parameter).
Binned data and statistics for a given parameter and time bin are stored as a JSON dictionary.
An example follows below:
{
"a": [
359.017,
5.259,
0.1,
359.9,
120,
359.5,
-999
],
"b": [
358.982,
5.321,
1,
359.9,
117,
359.5,
-999
],
"c": [
358.982,
5.321,
1,
359.9,
117,
359.5,
-999
],
"fa": "224122222222222222222222222222",
"fb": "221122222222222222222222222222",
"fc": "221122222222222222222222222222",
"sa": "2021-05-06 22:12:59Z",
"sb": "2021-05-06 22:12:59Z",
"sc": "2021-05-06 22:12:59Z"
}
where:
"a": all array (mean, std, min, max, num, spot, median)
"b": best array (mean, std, min, max, num, spot, median)
"c": samos array (mean, std, min, max, num, spot, median)
"fa": all flags (composite of the flags from the data used to derive the "all" bin)
"fb": best flags (composite of the flags from the data used to derive the "best" bin)
"fc": samos flags (composite of the flags from the data used to derive the "samos" bin)
"sa": spot datetime (all)
"sb": spot datetime (best)
"sc": spot datetime (samos)
Spot values are unmodified data values that were collected closest in time to the center time of the bin.
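When reading a downloaded file, each parameter cell can be unpacked into named fields. The sketch below is one possible approach, assuming the cell is available as a JSON string with the keys listed above; the BinStats type and parse_bin function are hypothetical names. Values are passed through unchanged.

```python
import json
from typing import NamedTuple

class BinStats(NamedTuple):
    """Statistics in the order used by the "a", "b", and "c" arrays."""
    mean: float
    std: float
    min: float
    max: float
    num: int
    spot: float
    median: float

def parse_bin(cell):
    """Parse one parameter cell into a dict keyed by product name."""
    raw = json.loads(cell)
    products = {}
    for key, name in (("a", "all"), ("b", "best"), ("c", "samos")):
        products[name] = {
            "stats": BinStats(*raw[key]),
            "flags": raw["f" + key],
            "spot_time": raw["s" + key],
        }
    return products
```

After parsing, parse_bin(cell)["best"]["stats"].mean gives the mean of the "best" bin, and the "flags" and "spot_time" entries hold the matching composite flags and spot datetime.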