Reference documents

| Quality control of bottle data (IML) |
GENERAL INFORMATION |
|
The information presented in this document is based on the expertise developed by IML's data management section, whose members have been compiling and examining discrete data over the past several years. We drew largely on those procedures used by NOAA's National Oceanographic Data Center during the production of the World Ocean Database (Conkright et al. 2002) as well as many of the tests proposed in the GTSPP Real-Time Quality Control Manual ("Global Temperature-Salinity Pilot Project"; Unesco 1990).
The quality control of bottle data includes the validation of temperature, salinity, chlorophyll, dissolved oxygen, nitrate, nitrite, phosphate, and silicate measurements from a sample of seawater. The quality control procedure consists of a set of tests performed in the MATLAB environment.
Like the quality control procedures for CTD data, bottle data quality control is divided into five steps:
- Step 1: Tests validating the important metadata such as the time and position.
- Step 2: Tests comparing data values within a profile.
- Step 3: Comparison of the profile to a climatology.
- Step 4: Comparison of all profiles from the same mission.
- Step 5: Visual inspection of the cruise track and of the data themselves.
All metadata can be modified, in particular the time-space coordinates, without making the profile unusable.
No data are modified by the quality control procedure. A quality flag is added (see next section) to flag the data as good, doubtful, erroneous, or missing only for the step 2 tests. If the data must be modified for some reason, these modifications are made outside the quality control procedure and the quality flags must then be adjusted manually.
Conkright, M.E., J.I. Antonov, O. Baranova, T.P. Boyer, H.E. Garcia, R. Gelfeld, D. Johnson, R.A. Locarnini, P.P. Murphy, T.D. O'Brien, I. Smolyar, and C. Stephens. 2002. World Ocean Database 2001, Volume 1: Introduction. Edited by S. Levitus. NOAA Atlas NESDIS 42, U.S. Government Printing Office, Washington, D.C., 167 pp.
Unesco. 1990. GTSPP real-time quality control manual. Intergovernmental Oceanographic Commission, Manuals and Guides no. 22.
 |
DESCRIPTION OF THE INDIVIDUAL QUALITY FLAGS |
The tests performed during step 2 add quality flags to the temperature, salinity, chlorophyll, dissolved oxygen, nitrate, nitrite, phosphate, and silicate data. The quality flag is a whole number between 0 and 9. The meanings of the quality flags are given in the following table:
| Flag | Meaning |
|---|---|
| 0 | no quality control |
| 1 | value seems correct |
| 2 | value appears inconsistent with other values |
| 3 | value seems doubtful |
| 4 | value seems erroneous |
| 5 | value was modified |
| 6 | reserved for future use |
| 7 | possible problem with data point; further investigation required (IML temporary flag) |
| 8 | QC was performed by data producer |
| 9 | value missing |
The quality flags of higher value (except 7) take precedence over those with lower value; e.g., a QC flag of 0 has a lower priority than a flag of 9. As such, if a test judged a data value as doubtful (flag 3) and the following test judged it as erroneous (flag 4), the quality flag 4 would be retained. The QC flag of 7 is an exception since it is temporary.
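The precedence rule can be sketched in a few lines of Python. This is only an illustration, not the IML implementation: the function name `merge_flag` is hypothetical, and the treatment of the temporary flag 7 (it replaces only flags 0 and 1, and is replaced by any flag from 2 upward) is an assumption, since the text says only that 7 is an exception.

```python
# Hypothetical helper; the ranking of flag 7 is an assumption (see lead-in).
def _rank(flag: int) -> float:
    # Rank 7 between 1 and 2 so that it overrides only "no QC" and "correct".
    return 1.5 if flag == 7 else float(flag)

def merge_flag(current: int, new: int) -> int:
    """Return the flag to keep when a test proposes `new` for a value flagged `current`."""
    for flag in (current, new):
        if not 0 <= flag <= 9:
            raise ValueError(f"quality flag out of range: {flag}")
    return max(current, new, key=_rank)

print(merge_flag(3, 4))  # doubtful, then erroneous: erroneous is kept
```

With this ranking, a value flagged 7 by one test and doubtful (3) by another ends up as 3, so the temporary flag never hides a firmer judgement.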
|
 |
GLOBAL QCFF FLAG |
The QCFF ("quality control failed flag") identifies which test(s) a quality flag results from. It applies to the step 2 quality control tests as well as test 3.6. Each of these tests is associated with a number 2^x, where x is a positive integer. Before the quality control is run, a QCFF value of 0 is assigned to each line of data. When a test fails, the value 2^x associated with that test is added to the QCFF. Because each test contributes a distinct power of two, the tests that failed can be identified unambiguously from the QCFF value. If the QC flag of a record is modified by hand, a value of 1 is added to the QCFF.
The following is a list of the tests that are currently available:
| TESTS: DESCRIPTION (QCFF) |
 |
Test 1.1: GTSPP Platform Identification
Test 1.2: GTSPP Impossible Date/Time
Test 1.3: GTSPP Impossible Location
Test 1.4: GTSPP Position on Land
Test 1.5: GTSPP Impossible Speed
Test 2.1: Global Impossible Parameter Values (2)
Test 2.2: Regional Impossible Parameter Values (4)
Test 2.4: Profile Envelope (16)
Test 2.5: Constant Profile (32)
Test 2.7: Replicate Comparisons (128)
Test 2.8: Bottle versus CTD Measurements (TEMP, PSAL, DOXY) (256)
Test 2.9: Excessive Gradient or Inversion (TEMP, PSAL, NTRZ, PHOS) (512)
Test 2.10: Surface Dissolved Oxygen Data versus Percent Saturation (1024)
Test 3.5: Petrie Monthly Climatology (TEMP, PSAL)
Test 3.6: Brickman Monthly Climatology (NTRA, PHOS, SLCA) (2048)
Test 5.1: Cruise Track Visual Inspection
Test 5.2: Ratio and Profile Visual Inspection (station data)
Test 5.3: Replicates Visual Inspection (whole cruise data)
Test 5.4: Bottle Versus CTD Measurements Visual Inspection (whole cruise data)
Test 5.5: Ratio and Profile Visual Inspection (whole cruise data)
Test 5.6: Variable patterns with time (whole cruise data) |
If a data point failed test 2.9, then the QCFF would be 512. If test 2.7 had already failed, then the QCFF would be 512+128=640. |
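The QCFF arithmetic above can be sketched as a bit mask. The values come from the test list; the dictionary and function names are hypothetical, and a bitwise OR is used in place of addition (an assumption that gives the same result when each test is recorded at most once).

```python
# QCFF values taken from the test list; "manual" is the value added
# when a QC flag is modified by hand.
QCFF_BITS = {
    "manual": 1,
    "2.1": 2, "2.2": 4, "2.4": 16, "2.5": 32, "2.7": 128,
    "2.8": 256, "2.9": 512, "2.10": 1024, "3.6": 2048,
}

def add_failure(qcff: int, test: str) -> int:
    """Record a failed test; OR keeps the value stable if it is recorded twice."""
    return qcff | QCFF_BITS[test]

def failed_tests(qcff: int) -> list:
    """List the tests whose bit is set in a QCFF value."""
    return [name for name, bit in QCFF_BITS.items() if qcff & bit]

qcff = add_failure(add_failure(0, "2.7"), "2.9")
print(qcff, failed_tests(qcff))  # 640 ['2.7', '2.9']
```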
 |
DESCRIPTION OF THE QUALITY CONTROL TESTS |
Test 1.1: Platform Identification
This test verifies that all the mission profiles were sampled from the same ship.
Test 1.2: Impossible Date/Time
This test verifies that the date and time of the beginning and end of the profile fall within the mission dates.
Test 1.3: Impossible Location
This test verifies that the profile's position is possible; that is, that the latitude falls between -90 and 90 and the longitude between -180 and 180.
Test 1.4: Position on Land
This test checks whether the profile's position falls on land, using a detailed map of the coastline of the Estuary and Gulf of St. Lawrence. The test is also done for missions that take place in the Hudson Bay area or off the eastern Canadian coast, but the maps for those areas are not as detailed.
Test 1.5: Impossible Speed
This test checks the ship speed between two consecutive profiles. The speed is calculated from the time and position at the beginning of the profile and the time and position at the end of the preceding profile. If the end position or date/time of the preceding profile is missing, the test uses the coordinates at the beginning of the preceding profile instead. The calculated speed is compared with the ship's cruising speed.
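Tests 1.3 and 1.5 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, distance is taken along a great circle on a spherical Earth, and a speed above the cruising speed counts as a failure (the source does not give the exact comparison or any tolerance).

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0       # assumption: spherical Earth, mean radius
KM_PER_NAUTICAL_MILE = 1.852

def location_is_possible(lat: float, lon: float) -> bool:
    """Test 1.3: latitude in [-90, 90] and longitude in [-180, 180]."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two positions given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def ship_speed_knots(prev_time, prev_lat, prev_lon, time, lat, lon):
    """Speed between the end of the preceding profile and the start of this one."""
    hours = (time - prev_time).total_seconds() / 3600.0
    if hours <= 0:
        raise ValueError("profiles must be in chronological order")
    return great_circle_km(prev_lat, prev_lon, lat, lon) / KM_PER_NAUTICAL_MILE / hours

def speed_is_possible(speed_knots: float, cruising_speed_knots: float) -> bool:
    """Test 1.5 (assumed rule): fail when the speed exceeds the cruising speed."""
    return speed_knots <= cruising_speed_knots
```

For example, two profiles one degree of latitude apart and six hours apart imply a speed of about 10 knots, which passes for a ship cruising at 12 knots.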
 |
Test 2.1: Globally Impossible Variable Values |
Test 2.1 checks if the temperature, salinity, chlorophyll, dissolved oxygen, nitrate, nitrite, phosphate, and silicate data are globally possible based on the criteria in the table below (from WOD05, Conkright et al. 2002). If a data value is judged impossible and thus erroneous, its QC flag is replaced by 4.
| Code | Variable | Unit | Minimum value | Maximum value |
|---|---|---|---|---|
| TEMP | Temperature | °C | -2.5 | 35 |
| PSAL | Salinity | (psu) | 0 | 50 |
| DOXY | Dissolved oxygen | mL/L | 0 | 11 |
| CPHL | Chlorophyll | mg/m3 | 0 | 50 |
| NTRZ | Nitrate+Nitrite | mmol/m3 | 0 | 515 |
| NTRI | Nitrite | mmol/m3 | 0 | 15 |
| PHOS | Phosphate | mmol/m3 | 0 | 4.5 |
| SLCA | Silicate | mmol/m3 | 0 | 250 |
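As an illustration, the global range check of test 2.1 might look like the sketch below. The limits are those of the table; the function name is hypothetical, and two details are assumptions: the bounds are treated as inclusive, and flags that already outrank 4 (5, 6, 8, 9) are left unchanged.

```python
# Global limits from the test 2.1 table: code -> (minimum, maximum).
GLOBAL_LIMITS = {
    "TEMP": (-2.5, 35.0), "PSAL": (0.0, 50.0), "DOXY": (0.0, 11.0),
    "CPHL": (0.0, 50.0), "NTRZ": (0.0, 515.0), "NTRI": (0.0, 15.0),
    "PHOS": (0.0, 4.5), "SLCA": (0.0, 250.0),
}

def global_range_flag(code, value, flag):
    """Return the QC flag after test 2.1 (bounds assumed inclusive)."""
    if value is None:                # missing value: nothing to test
        return flag
    lo, hi = GLOBAL_LIMITS[code]
    if lo <= value <= hi:
        return flag                  # a passing value keeps its current flag
    # Assumption: flags above 4 (other than the temporary 7) are kept.
    return 4 if flag in (0, 1, 2, 3, 7) else flag

print(global_range_flag("TEMP", 36.0, 1))  # 4
```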
Test 2.2: Regionally Impossible Variable Values |
Test 2.2 checks if the temperature, salinity, chlorophyll, dissolved oxygen, nitrate, nitrite, phosphate, and silicate data are regionally possible based on the criteria in the table below. The ranges are those given for the coastal North Atlantic (WOD05; Conkright et al. 2002), except that the salinity range was modified to reflect conditions particular to the Gulf. If a data value is judged impossible and thus erroneous, its QC flag is replaced by 4.
| Code | Variable | Unit | Minimum value | Maximum value |
|---|---|---|---|---|
| TEMP | Temperature | °C | -2.5 | 35 |
| PSAL | Salinity | (psu) | 0 | 35 |
| DOXY | Dissolved oxygen | mL/L | 0 | 10 |
| CPHL | Chlorophyll | mg/m3 | 0 | 50 |
| NTRZ | Nitrate+Nitrite | mmol/m3 | 0 | 515 |
| NTRI | Nitrite | mmol/m3 | 0 | 15 |
| PHOS | Phosphate | mmol/m3 | 0 | 4.5 |
| SLCA | Silicate | mmol/m3 | 0 | 250 |
The region is defined by the following coordinates (longitude, latitude):
(-56.0, 52.0), (-73.0, 49.5), (-73.0, 46.0), (-64.5, 46.0), (-62.3, 45.2), (-56.0, 48.2), (-56.0, 52.0).
If the study area is outside these boundaries, the test is not done.
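Whether a position lies inside the region can be decided with a point-in-polygon test. The sketch below uses the ray-casting method on the vertices listed above; the function name is hypothetical, the method itself is an assumption (the source does not say how the boundary check is done), and it is applied here to a single position rather than to a whole study area.

```python
# Region vertices (longitude, latitude) as listed above; the polygon is closed.
REGION = [(-56.0, 52.0), (-73.0, 49.5), (-73.0, 46.0), (-64.5, 46.0),
          (-62.3, 45.2), (-56.0, 48.2), (-56.0, 52.0)]

def in_region(lon: float, lat: float, polygon=REGION) -> bool:
    """Ray casting: count how many edges a ray running east from the point crosses."""
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:]):
        if (y1 > lat) != (y2 > lat):             # edge straddles the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:                     # crossing lies east of the point
                inside = not inside
    return inside

print(in_region(-61.0, 48.0))  # True: inside the region
print(in_region(-50.0, 45.0))  # False: outside the region
```

Planar geometry in degrees is adequate for a sketch at this scale; positions exactly on the boundary are not handled specially.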
Conkright, M.E., J.I. Antonov, O. Baranova, T.P. Boyer, H.E. Garcia, R. Gelfeld, D. Johnson, R.A. Locarnini, P.P. Murphy, T.D. O'Brien, I. Smolyar, and C. Stephens. 2002. World Ocean Database 2001, Volume 1: Introduction. Edited by S. Levitus. NOAA Atlas NESDIS 42, U.S. Government Printing Office, Washington, D.C., 167 pp.
|
 |
Test 2.4: Profile Envelope |
Test 2.4 checks whether the temperature, salinity, chlorophyll, dissolved oxygen, and nutrient data fall within the limits presented below by depth interval. The data value is judged doubtful if it does not fall within the permitted interval and its QC flag is set to 3. Data already judged erroneous are not considered.
| Code | Depth interval (m) | Unit | Minimum value | Maximum value |
|---|---|---|---|---|
| TEMP | 0-50 | °C | -2.5 | 35 |
| TEMP | 50-100 | °C | -2.5 | 30 |
| TEMP | 100-400 | °C | -2.5 | 28 |
| TEMP | 400-1100 | °C | -2.0 | 28 |
| PSAL | 0-50 | (psu) | 0 | 35 |
| PSAL | 50-100 | (psu) | 1 | 35 |
| PSAL | 100-400 | (psu) | 3 | 35 |
| PSAL | 400-1100 | (psu) | 10 | 35 |
| DOXY | 0-30 | mL/L | 0 | 10 |
| DOXY | 30-200 | mL/L | 0 | 9 |
| DOXY | 200-1500 | mL/L | 0 | 8 |
| CPHL | 0-1500 | mg/m3 | 0 | 50 |
| NTRZ | 0-1500 | mmol/m3 | 0 | 515 |
| NTRI | 0-1500 | mmol/m3 | 0 | 15 |
| PHOS | 0-500 | mmol/m3 | 0 | 4.5 |
| PHOS | 150-1500 | mmol/m3 | 0.01 | 4.5 |
| SLCA | 0-150 | mmol/m3 | 0 | 250 |
| SLCA | 150-900 | mmol/m3 | 0.01 | 250 |
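A depth-dependent envelope check can be sketched as a lookup by depth interval. Only the TEMP and DOXY rows of the table are reproduced here to keep the example short. The function name is hypothetical, and the handling of shared interval boundaries (a depth of exactly 50 m falls in the shallower interval) is an assumption.

```python
# (top depth, bottom depth, minimum, maximum) per interval, from the table.
ENVELOPE = {
    "TEMP": [(0, 50, -2.5, 35.0), (50, 100, -2.5, 30.0),
             (100, 400, -2.5, 28.0), (400, 1100, -2.0, 28.0)],
    "DOXY": [(0, 30, 0.0, 10.0), (30, 200, 0.0, 9.0), (200, 1500, 0.0, 8.0)],
}

def envelope_flag(code, depth, value, flag):
    """Return the QC flag after test 2.4; failures are judged doubtful (3)."""
    if flag in (4, 9):                       # erroneous or missing: not considered
        return flag
    for top, bottom, lo, hi in ENVELOPE[code]:
        if top <= depth <= bottom:           # first match wins on shared boundaries
            if lo <= value <= hi:
                return flag
            # Assumption: flags above 3 (other than the temporary 7) are kept.
            return 3 if flag in (0, 1, 2, 7) else flag
    return flag                              # depth not covered by the table

print(envelope_flag("TEMP", 75.0, 31.0, 1))  # 3: above the 50-100 m maximum of 30
```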
Test 2.5: Constant profile |
Test 2.5 checks whether the temperature, salinity, chlorophyll, dissolved oxygen, or nutrient data have identical values throughout a profile. To fail this test, the variable must have the same value at all depths. The quality flags are then set to 7; the values must be checked individually and the flags changed to a valid QC code. The test is done for all the replicates of a variable. Discrete data already flagged doubtful, erroneous, or missing are not considered.
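A minimal sketch of the constant-profile check follows. The function name is hypothetical; requiring at least two eligible values and comparing with exact equality are assumptions.

```python
def constant_profile_flags(values, flags):
    """Set the flag to 7 for every eligible value if all eligible values are identical."""
    # Values already flagged doubtful (3), erroneous (4), or missing (9) are skipped.
    eligible = [i for i, f in enumerate(flags) if f not in (3, 4, 9)]
    if len(eligible) < 2:                    # assumption: one value cannot be "constant"
        return list(flags)
    first = values[eligible[0]]
    if all(values[i] == first for i in eligible):
        return [7 if i in eligible else f for i, f in enumerate(flags)]
    return list(flags)

print(constant_profile_flags([5.0, 99.0, 5.0], [1, 4, 1]))  # [7, 4, 7]
```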
 |
Test 2.6: Freezing point |
| The freezing point is calculated from the salinity and the pressure. A temperature value lower than the corresponding freezing point is judged erroneous and its flag is set to 4. Temperature data previously judged erroneous or missing are not considered. |
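The source does not say which freezing-point formula is used. The sketch below assumes the UNESCO (1983) formula of Fofonoff and Millard, with salinity on the practical scale and pressure in decibars; the function names are hypothetical.

```python
import math

def freezing_point(salinity: float, pressure_dbar: float) -> float:
    """Freezing point of seawater in °C (UNESCO 1983 formula, assumed here)."""
    s = salinity
    return (-0.0575 + 1.710523e-3 * math.sqrt(s) - 2.154996e-4 * s) * s \
        - 7.53e-4 * pressure_dbar

def freezing_point_flag(temperature, salinity, pressure_dbar, flag):
    """Return the QC flag after test 2.6; colder than freezing is erroneous (4)."""
    if flag in (4, 9):                       # erroneous or missing: not considered
        return flag
    if temperature < freezing_point(salinity, pressure_dbar):
        # Assumption: flags above 4 (other than the temporary 7) are kept.
        return 4 if flag in (0, 1, 2, 3, 7) else flag
    return flag

print(round(freezing_point(40.0, 500.0), 6))  # published check value: -2.588567
```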
Test 2.7: Replicate samples |
This test compares replicates of a sample among themselves. It applies to temperature, salinity, chlorophyll, dissolved oxygen, and nutrient data. The maximum deviations tolerated between replicates are listed in the table below and were determined empirically by examining several data sets. The quality flags of replicate values that fail the test are set to 7, thus the values must be subsequently verified individually and a valid QC flag assigned. Data previously judged doubtful, erroneous, or missing are not considered.
| Code | Variable | Unit | Tolerated difference |
|---|---|---|---|
| TEMP | Temperature | °C | 0.01 |
| PSAL | Salinity | (psu) | 0.01 |
| DOXY | Dissolved oxygen | mL/L | 0.5 |
| CPHL | Chlorophyll | mg/m3 | 0.5 |
| NTRZ | Nitrate+Nitrite | mmol/m3 | 3.5 |
| NTRI | Nitrite | mmol/m3 | 0.1 |
| PHOS | Phosphate | mmol/m3 | 0.5 |
| SLCA | Silicate | mmol/m3 | 4.0 |
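One possible reading of the replicate comparison is sketched below. The tolerances are those of the table. The function name is hypothetical, and the rule itself is an assumption: the spread (maximum minus minimum) of the eligible replicates is compared with the tolerance, and all eligible replicates are flagged 7 when it is exceeded.

```python
# Tolerated differences between replicates, from the test 2.7 table.
REPLICATE_TOLERANCE = {
    "TEMP": 0.01, "PSAL": 0.01, "DOXY": 0.5, "CPHL": 0.5,
    "NTRZ": 3.5, "NTRI": 0.1, "PHOS": 0.5, "SLCA": 4.0,
}

def replicate_flags(code, values, flags):
    """Flag all eligible replicates of one sample with 7 if their spread is too large."""
    # Replicates already flagged doubtful (3), erroneous (4), or missing (9) are skipped.
    eligible = [i for i, f in enumerate(flags) if f not in (3, 4, 9)]
    if len(eligible) < 2:
        return list(flags)
    kept = [values[i] for i in eligible]
    if max(kept) - min(kept) > REPLICATE_TOLERANCE[code]:
        return [7 if i in eligible else f for i, f in enumerate(flags)]
    return list(flags)

print(replicate_flags("PHOS", [1.0, 1.8], [1, 1]))  # [7, 7]
```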
 |
Test 2.8: Comparison of bottle and CTD data |
This test compares the values measured on bottle samples with the values of the same variable recorded by the CTD. It applies to temperature, salinity, and dissolved oxygen; chlorophyll is not considered because the CTD fluorometers are not (for the moment) calibrated. The maximum tolerated differences are given in the table below. All replicates of a variable are compared with the data recorded by the CTD. Those that fail the test must be verified individually; thus, their quality flags are set to 7. Data previously judged doubtful, erroneous, or missing are not considered.
| Code | Variable | Unit | Tolerated difference |
|---|---|---|---|
| TEMP | Temperature | °C | 0.1 |
| PSAL | Salinity | (psu) | 0.2 |
| DOXY | Dissolved oxygen | mL/L | 1.0 |
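The bottle-versus-CTD comparison reduces to an absolute difference checked against the tolerance. In this sketch the function name is hypothetical and the matching of a bottle sample to the CTD reading at the same depth is assumed to have been done by the caller.

```python
# Tolerated bottle-minus-CTD differences, from the test 2.8 table.
CTD_TOLERANCE = {"TEMP": 0.1, "PSAL": 0.2, "DOXY": 1.0}

def bottle_vs_ctd_flag(code, bottle_value, ctd_value, flag):
    """Return the QC flag of one bottle value after comparison with the CTD reading."""
    if flag in (3, 4, 9):                    # doubtful, erroneous, missing: not considered
        return flag
    if abs(bottle_value - ctd_value) > CTD_TOLERANCE[code]:
        return 7                             # temporary flag: verify individually
    return flag

print(bottle_vs_ctd_flag("PSAL", 31.0, 31.5, 1))  # 7
```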
Test 2.9: Excessive gradients and inversions |
This test is based on the maximum gradients and inversions given in the World Ocean Database 2001 (WOD01; Conkright et al. 2002). Vertical gradients (positive changes) or inversions (negative changes) of temperature, salinity, nitrate+nitrite, and phosphate are calculated to see whether they exceed the values in the table below. The gradient is obtained as the difference between one observation (V2) and the previous observation (V1), divided by the difference in depth between the two observations. The calculation is repeated for all replicates of the same variable. Missing values are temporarily replaced by the average of the replicates judged correct (Q=1) or possibly problematical (Q=7) so that the test can still be completed. Data that fail this test are given a flag of 7. Data previously judged doubtful, erroneous, or missing are not considered.
| Code | Variable | Unit | Inversion | Gradient |
|---|---|---|---|---|
| TEMP | Temperature | °C/m | -10 | 10 |
| PSAL | Salinity | (psu)/m | -0.1 | 5 |
| DOXY | Dissolved oxygen | (mL/L)/m | checks not applicable | checks not applicable |
| CPHL | Chlorophyll | (mg/m3)/m | checks not applicable | checks not applicable |
| NTRZ | Nitrate+Nitrite | (mmol/m3)/m | -1.0 | 1.0 |
| PHOS | Phosphate | (mmol/m3)/m | -1.0 | 1.0 |
| SLCA | Silicate | (mmol/m3)/m | checks not applicable | checks not applicable |
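The gradient calculation described for this test can be sketched as follows. The limits are those of the table. The function name is hypothetical, and because the source does not say which observation of a failing pair receives the flag, the sketch simply returns the index pairs that fail.

```python
# (inversion limit, gradient limit) per metre, from the test 2.9 table.
GRADIENT_LIMITS = {
    "TEMP": (-10.0, 10.0), "PSAL": (-0.1, 5.0),
    "NTRZ": (-1.0, 1.0), "PHOS": (-1.0, 1.0),
}

def gradient_failures(code, depths, values):
    """Return index pairs (i-1, i) whose vertical change per metre exceeds the limits."""
    inversion, gradient = GRADIENT_LIMITS[code]
    failed = []
    for i in range(1, len(depths)):
        dz = depths[i] - depths[i - 1]
        if dz == 0:                          # same depth: no gradient can be computed
            continue
        change = (values[i] - values[i - 1]) / dz   # (V2 - V1) / depth difference
        if change < inversion or change > gradient:
            failed.append((i - 1, i))
    return failed

# Salinity drops by 0.3 per metre between 10 m and 20 m: an excessive inversion.
print(gradient_failures("PSAL", [0, 10, 20], [30.0, 31.0, 28.0]))  # [(1, 2)]
```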
Conkright, M.E., J.I. Antonov, O. Baranova, T. P. Boyer, H.E. Garcia, R. Gelfeld, D. Johnson, R.A. Locarnini, P.P. Murphy, T.D. O'Brien, I. Smolyar, C. Stephens. 2002. World Ocean Database 2001, Volume 1: Introduction. Ed: Sydney Levitus, NOAA Atlas NESDIS 42, U.S. Government Printing Office, Washington, D.C., 167 pp.
|
Test 2.10: Oxygen saturation (percentage) in surface waters |
| This test verifies that the percent oxygen saturation in surface waters, i.e., 0-10 m, falls between 85% and 150%. If the dissolved oxygen value fails this test, it is assigned a quality flag of 3. Dissolved oxygen data previously judged doubtful, erroneous, or missing are not considered. |
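A sketch of the surface saturation check follows. The source does not describe how the oxygen concentration at saturation is computed, so it is taken here as an input supplied by the caller; the function and parameter names are hypothetical, and the 85% and 150% bounds are treated as inclusive.

```python
def surface_oxygen_flag(depth_m, doxy, doxy_at_saturation, flag):
    """Return the QC flag of a dissolved oxygen value after test 2.10.

    `doxy` and `doxy_at_saturation` must be in the same unit (e.g. mL/L).
    """
    if flag in (3, 4, 9):                    # doubtful, erroneous, missing: not considered
        return flag
    if depth_m > 10.0:                       # test applies to the 0-10 m layer only
        return flag
    percent = 100.0 * doxy / doxy_at_saturation
    return flag if 85.0 <= percent <= 150.0 else 3

print(surface_oxygen_flag(5.0, 5.0, 7.0, 1))  # 3: about 71% saturation
```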
Test 3.5: Petrie's monthly climatology (temperature and salinity) |
This climatology was compiled by Petrie et al. (1996) for the Gulf of St. Lawrence. The averages and standard deviations of temperature, salinity, and density at fixed depths in 21 regions of the gulf were calculated for each month. This test uses the climatology to assess the validity of observations from a mission. If the difference between an observation and the climatology exceeds three standard deviations, a warning is given. It is then the user's responsibility to decide whether to reject a set of observations or to add quality indicators to some observations. The limitation of this test is that a data point cannot be rejected simply because it fails: the profile may reflect a particular event, but it is also possible that the instrument was not functioning properly and the data are incorrect. No QC flag is modified as a result of this test.
Petrie, B., K. Drinkwater, A. Sandström, R. Pettipas, D. Gregory, D. Gilbert and P. Sekhon. 1996. Temperature, salinity and sigma-t atlas for the Gulf of St. Lawrence. Can. Tech. Rep. Hydrogr. Ocean Sci. 178, v+256 pp.
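The three-standard-deviation comparison can be sketched as below. The function name and the data layout (dictionaries keyed by depth) are hypothetical, and the numbers in the example are made up for illustration; they are not taken from the atlas.

```python
def climatology_warnings(observations, climatology, n_std=3.0):
    """Return the depths where an observation departs from the climatology.

    observations: {depth: value}
    climatology:  {depth: (monthly mean, monthly standard deviation)}
    A warning is raised when |value - mean| exceeds n_std standard deviations.
    No QC flag is changed; the result is only a list of depths to inspect.
    """
    warnings = []
    for depth, value in sorted(observations.items()):
        if depth not in climatology:         # no climatological value at this depth
            continue
        mean, std = climatology[depth]
        if abs(value - mean) > n_std * std:
            warnings.append(depth)
    return warnings

# Illustrative values only.
clim = {0: (10.0, 1.5), 50: (1.0, 0.5), 100: (2.0, 0.4)}
obs = {0: 11.0, 50: 3.0, 100: 2.5}
print(climatology_warnings(obs, clim))  # [50]
```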
Test 3.6: Brickman's monthly climatology (nitrate, phosphate, silicate) |
This climatology was compiled by Brickman and Petrie (2003) for the Gulf of St. Lawrence. Monthly averages and standard deviations of nitrate, phosphate, and silicate measurements were calculated for four depth intervals in 12 gulf regions. Test 3.6 uses this climatology to assess the validity of the observations of each profile of a mission. If the difference between an observation and the climatology exceeds three standard deviations, a warning is given. It is then the user's responsibility to decide whether to reject a set of observations or to add quality indicators to particular observations. The limitation of this test is that a data point cannot be rejected simply because it fails: the profile may reflect a particular event, but it is also possible that the instrument was not functioning properly and the data are incorrect. In addition, certain regions of the gulf have only sparse measurements, especially at some times of the year; this must also be taken into consideration. Observations failing this test are flagged Q=7. The data values must then be examined individually and valid flags assigned.
Brickman, D. and B. Petrie. 2003. Nitrate, silicate and phosphate atlas for the Gulf of St. Lawrence. Can. Tech. Rep. Hydrogr. Ocean Sci. 231, xi+152 pp. |
 |
Test 5.1: Visualization of the cruise track |
| This test plots the cruise track, allowing the identification of gross position errors. |
Test 5.2: Visualization of profiles |
This step allows one to examine each profile, with each variable plotted against depth. All observations are displayed on two screen images as follows:
- T-S diagram of the CTD data at the bottle sample depths.
- CTD profiles of temperature, salinity and dissolved oxygen with bottle sample measurements of the same variables plotted by depth.
- Observations of chlorophyll, nitrate, nitrite, phosphate and silicate by depth. Averages of replicates flagged correct or possibly problematical (Q=1 or 7) are drawn as a solid line.
- N:P and N:Si ratios are plotted with depth. The ratios are calculated from the averages of the replicates judged correct or possibly problematical (Q=1 or 7).
For all these graphs, bottle data already flagged during previous tests as doubtful (Q=3), erroneous (Q=4), or possibly problematical (Q=7) are plotted with different symbols indicating their Q flag. No Q flags are automatically assigned with this test. Outlier values are examined individually and quality flags determined as necessary. |
Test 5.3: Visualization of replicates |
| Theoretically, replicates should have a 1:1 ratio, although this is rarely the case in practice. To identify potentially erroneous observations, replicate measures of temperature, salinity, chlorophyll, dissolved oxygen, and nutrients from the whole mission are plotted. For all these graphs, bottle data already flagged during previous tests as doubtful (Q=3), erroneous (Q=4), or possibly problematical (Q=7) are plotted with different symbols indicating their Q flag. No Q flags are automatically assigned with this test. Outlier values are examined individually and quality flags determined as necessary. |
Test 5.4: Visualization of bottle data vs data collected with the CTD |
| Theoretically, measurements made of the same variable but using different methods should agree; however, in practice this is rarely the case. To identify potentially bad observations, bottle data measurements of temperature, salinity, chlorophyll, and dissolved oxygen are plotted vs the corresponding measurements made with the CTD for the whole mission. For all these graphs, bottle data already flagged during previous tests as doubtful (Q=3), erroneous (Q=4), or possibly problematical (Q=7) are plotted with different symbols indicating their Q flag. If the calibration of a CTD sensor is inadequate (or not done), the relationship between the bottle data values and the sensor readings will not be strong. Nevertheless, outlier values can still be identified and investigated by this method. |
Test 5.5: Visualization of all the profiles of a mission |
| The data resulting from bottle sample analyses of temperature, salinity, chlorophyll, dissolved oxygen, nutrients, and N:P and N:Si ratios for the whole mission are plotted by depth to reveal any potentially aberrant observations. For all these graphs, bottle data already flagged during previous tests as doubtful (Q=3), erroneous (Q=4), or possibly problematical (Q=7) are plotted with different symbols indicating their Q flag. No Q flags are automatically assigned with this test. Outlier values are examined individually and quality flags determined as necessary. |
Test 5.6: Visualization of the pattern of bottle data as a function of depth, time, and space |
| This test allows one to simultaneously view the bottle data analyses of chlorophyll, dissolved oxygen, nitrate, phosphate, silicate, and the N:P and N:Si ratios for the whole mission as a function of depth and time/space. The bottle sample depths are first plotted as histograms. Then the variable averages at these depths are plotted as points superimposed on the depths. Only data judged correct or possibly problematical (Q=1 or 7) are used to calculate the averages. This type of graph allows one to identify data that deviated from the overall general pattern. |
QUALITY CONTROL RESULTS |
| The bottle data that underwent this quality control must be re-examined if the quality flag assigned was 7: this flag is temporary and indicates that there is a possible problem with this data point. For example, it could indicate that replicate values are not similar. In this case, the data manager should examine the replicates and determine what Q flag is appropriate. No Q flags of 7 should remain in the archived data set. |
 |
 |
Reviewed: 2012-02-21