Backscattering of light is commonly measured by ocean observing systems, including ships and autonomous platforms, and is used as a proxy for the concentration of water column constituents such as phytoplankton and particulate carbon. Multiple on-going projects involve large numbers of independent measurements of backscatter, as well as other biologically relevant parameters, to understand how biology is changing in time and space throughout the global ocean. Rarely are there sufficient measurements to test how well these instruments are inter-calibrated in real-world deployment conditions. This paper develops a procedure to align multiple independently calibrated backscatter instruments to each other using nearby profiling casts and applies this method to nine instruments deployed during a recent field campaign in the North Pacific during August–September of 2018. This process revealed several incorrect calibrations; post-alignment, all nine instruments aligned extremely well with each other. We also tested an alignment to a deep-water reference and found that this method is generally sufficient but has significant limitations; this procedure lacks the ability to correct instruments measuring only shallow profiles and can only account for additive offsets, not multiplicative changes. These findings highlight the utility of process studies involving several independent measurements of similar parameters in the same area.
1. Introduction
Optical backscattering, bb(λ) (m−1), is an important aquatic inherent optical property that is measurable by in situ instruments and can be derived from satellite measurements of ocean reflectance. Backscatter from particulates, bbp(λ) (m−1), is correlated with the concentration and microphysical characteristics of water column constituents such as chlorophyll (Westberry et al., 2010), detrital/particulate biomass (Stramski and Kiefer, 1991), and particulate carbon in organic (POC; Stramski et al., 1999; Cetinić et al., 2012) and inorganic (PIC; Balch et al., 2005) forms. Current standard practice for ocean observing platforms is to include estimates of bb(λ) from at least one wavelength (Perry and Rudnick, 2003; Johnson and Claustre, 2016; Organelli et al., 2017). Measurements of near-surface bb(λ) and associated bbp(λ) are important components of validation studies for satellite remote sensing algorithms (Werdell et al., 2018; Bisson et al., 2021) and necessary to evaluate uncertainty and error in satellite-derived bbp retrievals (Bisson et al., 2019).
Backscattering sensors typically measure the volume scattering function (VSF), β(θ,λ) (m−1 sr−1), at a given angle θ (Sullivan et al., 2013). The particulate portion of the VSF, βp(θ,λ) (m−1 sr−1), is calculated by subtracting the contribution from seawater:
$$\beta_p(\theta,\lambda) = \beta(\theta,\lambda) - \beta_{sw}(\theta,\lambda) \tag{1}$$
where the seawater contribution to the VSF, βsw, depends primarily on the scattering angle, wavelength, and salinity (S; Zhang et al., 2009), and is weakly dependent on temperature (T; Zhang and Hu, 2018) and pressure (p; Hu et al., 2019). The particulate backscattering coefficient is the integral of VSF at all backwards angles (i.e., θ > 90°):
$$b_{bp}(\lambda) = 2\pi \int_{\pi/2}^{\pi} \beta_p(\theta,\lambda)\,\sin\theta\,\mathrm{d}\theta \tag{2}$$
which can be simplified using an appropriate scaling factor, χp(θ), to (Oishi, 1990; Boss and Pegau, 2001; Sullivan et al., 2013):
$$b_{bp}(\lambda) = 2\pi\,\chi_p(\theta)\,\beta_p(\theta,\lambda) \tag{3}$$
A recent study found that χp varies in different waters but can be assumed to be spectrally invariant (Zhang et al., 2021a). In practice, constant globally representative values are often used (Sullivan et al., 2013; Schmechtig et al., 2015). Measurements in this work are reported in units of particulate backscatter (m−1) following instrument-specific calculations delineated in Equations 1 and 3, even though the actual measurements are in units of counts and calibrated to VSF (m−1 sr−1), for ease of comparison between instruments.
Most observational campaigns involve either one or only a small number of backscattering sensors, making it difficult to recognize biases or inconsistencies in the measurements. Here, we have taken a statistical, data-oriented approach to comparing bbp(700) (hereafter, bbp implies a measurement at 700 nm) measured by multiple instruments deployed during the August–September 2018 North Pacific campaign of the project EXport Processes in the Ocean from RemoTe Sensing (EXPORTS-NP; Siegel et al., 2021). Optical backscattering calibrations included a scaling factor and a “dark counts” offset parameter. By comparing sensors to each other, we found that many of the instruments deployed during EXPORTS-NP have calibrations that are inconsistent with each other at 95% confidence, using known sources of uncertainty. We recommended several modifications to the scaling factors and offsets for these instruments which, when applied, brought the entire suite of backscattering sensors into agreement with each other. Finally, we compared the results from our inter-sensor alignment to a simpler method for adjusting measurements by subtracting the median value for each sensor at depth (e.g., Briggs, 2011, Version 1.4, March 16, 2011). This method resulted in similar final alignments, but is limited to use on instruments with sufficiently deep profiles and cannot account for multiplicative factors, which were found to be important here.
2. Data and methods
2.1. Instrumental measurements
EXPORTS-NP data used here come from six platforms (Table 1, Figure 1): the R/V Sally Ride (SR), the R/V Roger Revelle (RR), a Lagrangian float (LF; D’Asaro, 2003), a Wirewalker (WW; Rainville and Pinkel, 2001), a Seaglider (SG; Eriksen et al., 2001), and a Biogeochemical float (BGC; Johnson and Claustre, 2016). All ship-based sensors considered here were deployed over the side, either on a rosette or from slow-drop profiles. The EXPORTS-NP experiment was centered on the LF, which drifted below the sea surface at 1025.85 kg m−3, or about 100 m (Siegel et al., 2021). Other platforms provided spatiotemporal context about the LF (Figure 1). We use data from nine instruments across these six platforms (Table 1), encompassing all of the instruments deployed during this cruise that measured VSF throughout the water column (to at least 100 m depth) either at 700 nm or at two wavelengths such that β(θ, 700) could be interpolated. Instrument names typically are given by describing how many wavelengths of backscatter (BB) and fluorescence (FL) they measure. Exceptions to this norm are the Multi-Channel Optical Measurement Sensor (MCOMS; Wetlabs), which measures one wavelength each of backscatter and fluorescence, and the HydroScat-6 (HS6; Hobi Labs), which measures scattering at six wavelengths. In this paper, instruments are identified by the instrument type and platform; for example, BB2FL-SG is an instrument measuring backscatter at two wavelengths (here we only consider 700 nm) and fluorescence at one wavelength, and is mounted on the Seaglider.
Sensor | Platform | Scattering Angle (degrees) | Resolution (10−4 m−1) | Deep Value^a (10−4 m−1) |
---|---|---|---|---|
BB9 | RR (profile) | 124 | 0.2 | — |
HS6 | RR (profile) | 140 | 0.2 | — |
FLBB | RR (rosette) | 142 | 0.4 | 2.0 |
FLBB | SR (rosette) | 142 | 0.1 | 3.2 |
BB9 | SR (profile) | 124 | 0.3 | — |
BB2FL | SG | 124 | 0.2 | 2.4 |
BBFL2 | WW | 124 | 0.2 | 2.2 |
FLBB | LF | 142 | 0.5 | — |
MCOMS | BGC | 150 | 0.02 | 2.3 |
^a Median value between 400 and 500 m depth.
The instruments are assumed to measure β(θ,700) at (or near) one of three different scattering angles: 124, 142, or 150°, although a recent study found that these assumed scattering angles can be off by several degrees (Zhang et al., 2021b). We subtracted the seawater portion of the VSF, βsw, using Equation 1, and converted into bbp using Equation 3 with χp(θ) specifically determined for the EXPORTS-NP cruise. Calculated χp was dependent on θ but did not exhibit significant variation with depth, time, or wavelength. For this region, the following χp(θ) values (standard deviation) were used: 1.04 (0.005) for 124°, 1.14 (0.01) for 142°, and 1.16 (0.015) for 150° (Zhang et al., 2021a). For the HS6-RR at 140°, we used a value of 1.14 but increased the uncertainty to 0.02, because this instrument measured VSF over a smaller angular width than the other sensors and at a slightly different center angle. Every instrument considered here except for the BB9 sensors measured β(θ,λ) at 700 nm; for the BB9-RR and BB9-SR we linearly interpolated from measurements of β(θ,λ) at wavelengths adjacent to 700 nm: 650 and 715 nm for BB9-RR and 652 and 717 nm for BB9-SR. (The BB9-SR also measured at 679 nm, but this wavelength is known to be contaminated by chlorophyll fluorescence; see Boss et al. (2007).)
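The conversion described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the processing code used for the data submission: the function names, the dictionary `CHI_P`, and the sample VSF values are hypothetical, while the χp values and the BB9-RR wavelengths are the ones quoted in the text.

```python
import math

# chi_p values quoted in the text for EXPORTS-NP (Zhang et al., 2021a);
# the 140 degree entry is the value adopted in the text for the HS6-RR.
CHI_P = {124: 1.04, 140: 1.14, 142: 1.14, 150: 1.16}


def interp_to_700(wl_lo, beta_lo, wl_hi, beta_hi, target=700.0):
    """Linearly interpolate VSF between two bracketing wavelengths (nm)."""
    weight = (target - wl_lo) / (wl_hi - wl_lo)
    return beta_lo + weight * (beta_hi - beta_lo)


def bbp_from_vsf(beta, beta_sw, angle_deg):
    """Equations 1 and 3: remove seawater VSF, then scale by 2*pi*chi_p."""
    beta_p = beta - beta_sw
    return 2.0 * math.pi * CHI_P[angle_deg] * beta_p


# Hypothetical BB9-RR readings (m^-1 sr^-1) at 650 and 715 nm, plus a
# hypothetical seawater VSF; real values come from the calibrated data.
beta_700 = interp_to_700(650.0, 2.6e-4, 715.0, 2.2e-4)
bbp_700 = bbp_from_vsf(beta_700, beta_sw=0.6e-4, angle_deg=124)
```

Keeping the interpolation and the χp scaling as separate steps mirrors the order described above: the VSF is first brought to 700 nm, and only then converted to bbp.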
Instruments used during the experiment were calibrated at several different facilities following instrument-type specific procedures (Table S1). All but two instruments were calibrated by Sunstone Scientific, an independent instrument developer, which uses a serial solution of NIST-traceable polystyrene microspherical beads in Milli-Q pure water, with known particle size distributions and VSF modeled using Mie theory (Sullivan et al., 2013), and measures the wavelength distribution of each sensor. The MCOMS-BGC was instead calibrated by Seabird, a major oceanographic instrument manufacturer whose calibrations differ from Sunstone Scientific in that the calibration is against a reference sensor and assumes a nominal wavelength distribution for each instrument, which can be 5–10 nm different from the measured value. The HS6-RR was calibrated by Hobi Instrument Services using the plaque method. Briefly, the angular response function of the instrument is calculated for the whole sampling volume, based on signals measured over variable distances of the plaque (of known surface scattering function) from the sensor (detailed description in Dana and Maffione (2002)). These factory calibrations result in a dark signal, to be subtracted from each measurement, and a multiplicative scaling factor to convert the raw signal into VSF. The minimum resolution for each sensor is equivalent to the response associated with one count of the sensor (Table 1).
Standard procedure is to use an in situ dark signal measured during a field deployment rather than a factory dark calibration. This signal is acquired by conducting a standard profile for each instrument with black tape over the detector, such that it should not measure any scattered light. During the cruise we obtained dark profiles from all instruments deployed from the two ships. For FLBB-SR, the taped cast was within 2% of the factory calibration, so the factory calibrated value was used instead (these decisions reflect the data submitted to the official data repository).
Backscattering records often contain spikes thought to be due to sporadic measurements of large, sinking particulates. We used sequential minimum and maximum filters, as in Briggs et al. (2011), to remove positive spikes in the BB2FL-SG data, consistent with the processing used in the data submission for this product. The backscattering signal from several of the other sensors exhibited maximum and minimum spikes, which are likely driven by averaging and binning procedures used upstream in the data analysis. Therefore, for all other sensors a running 7-point median filter was used to remove spikes from the backscattering data.
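The two despiking approaches can be illustrated with a short pure-Python sketch. The 7-point window for the median filter is from the text; the window used for the minimum and maximum filters is not stated here, so the default of 7 below is an assumption, as are the function names, the edge handling (the window simply shrinks at the profile ends), and the sample profiles.

```python
from statistics import median


def running_filter(values, size, func):
    """Apply func over a centered window, shrinking it at the profile ends."""
    half = size // 2
    return [
        func(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def despike_min_max(values, size=7):
    """Running minimum followed by running maximum: removes positive spikes.

    The window size is an assumption; the text does not specify it.
    """
    return running_filter(running_filter(values, size, min), size, max)


def despike_median(values, size=7):
    """Running 7-point median: removes both positive and negative spikes."""
    return running_filter(values, size, median)


# Hypothetical profiles in units of 1e-4 m^-1.
glider_profile = [2.0, 2.1, 2.0, 9.5, 2.2, 2.1, 2.3, 2.2, 2.4]
other_profile = [2.0, 2.1, 2.0, 9.5, 2.2, 2.1, 2.3, 2.2, 0.1, 2.4, 2.3]

baseline = despike_min_max(glider_profile)
smoothed = despike_median(other_profile)
```

The minimum-then-maximum filter only suppresses positive excursions, which is why a median filter is the more natural choice when a record contains spikes of both signs.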
Finally, in some instances we eliminated certain time periods from each instrument. Several of the autonomous platforms either started taking measurements before the main part of the field deployment or remained afterwards. To avoid any possible issues that might arise from the temporal evolution of the biology in the region before or after the main part of the cruise, we limited all measurements to the time period of the main deployment, which was August 14 to September 9, 2018. For the FLBB-RR, we also eliminated the first week of sampling because these data are much noisier than the later casts; we believe these casts were contaminated with light from a neighboring instrument (an Underwater Vision Profiler). We also detected as-yet unexplained increases in bbp with depth in the FLBB-LF that appeared to be dependent on instrument trajectory through the water column. This instrument floats at about 100 m depth, and every day rises to the surface, transmits its location, profiles from the surface to about 200 m, and then returns to its target isopycnal. Surprisingly, the FLBB-LF recorded an increase in bbp from 100–200 m while moving downwards through the water column, but not while moving upwards. An increase in bbp with depth from 100 to 200 m was not recorded by any other instrument. We have therefore only considered data from the upper 100 m for the FLBB-LF.
2.2. Finding matched pairs of observations
Spatiotemporal variations in biological parameters affecting bbp can occur on all scales and complicate comparisons between instruments deployed on different platforms. Internal waves, which involve vertical oscillations of up to tens of meters and act on hourly timescales, are particularly problematic. We remove the influence of internal waves by matching measurements to density surfaces rather than pressure (or depth), using coincident measurements of T and S. The conductivity meter connected to the BB9-RR profiles was malfunctioning during this field deployment, but we found a stable relationship between T and S, allowing us to predict S to a high level of accuracy. For the BB9-RR, salinity for the data used here was calculated as:
The HS6-RR did not include measurements of T or S, so we instead used measurements from the nearest rosette CTD from the RR (typically within 1 hour). For both the HS6-RR and BB9-RR, the T and S used here are the same as those in the official data submission.
For each cast, the potential density was calculated and filtered VSF data were sorted into 0.1 kg m−3 bins, from 1024 to 1027.4 kg m−3. The mean value in each bin was used for the comparison, and the standard deviation was used in the uncertainty analysis (see below). Our density bin size generally allowed for multiple values in each bin; the number of measurements per bin varies depending on the platform but is generally 3–10 for densities less than 1026.5 kg m−3 (about 200 m) and 20–100 in the less stratified waters below. Note that orienting measurements to density bins not only removes the influence of internal waves, but also means that the more numerous measurements made in the less stratified, deeper waters do not dominate the final fit.
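As a rough sketch of this binning step, the following NumPy function averages a profile into the 0.1 kg m−3 density bins described above. The function name and the choice to return NaN for empty bins are assumptions made for illustration; potential density is taken as an input rather than computed.

```python
import numpy as np

# Bin edges from the text: 0.1 kg m^-3 bins spanning 1024 to 1027.4 kg m^-3.
EDGES = np.linspace(1024.0, 1027.4, 35)


def bin_by_density(sigma, bbp, edges=EDGES):
    """Return per-bin mean, sample standard deviation, and count."""
    sigma = np.asarray(sigma, dtype=float)
    bbp = np.asarray(bbp, dtype=float)
    idx = np.digitize(sigma, edges) - 1      # bin index for each sample
    n_bins = len(edges) - 1
    mean = np.full(n_bins, np.nan)
    std = np.full(n_bins, np.nan)
    count = np.zeros(n_bins, dtype=int)
    for j in range(n_bins):
        values = bbp[idx == j]               # out-of-range samples never match
        count[j] = values.size
        if values.size:
            mean[j] = values.mean()
        if values.size > 1:
            std[j] = values.std(ddof=1)
    return mean, std, count


# Hypothetical samples: two in the first bin, one near 1025.55 kg m^-3.
mean, std, count = bin_by_density([1024.05, 1024.07, 1025.55], [1.0, 3.0, 5.0])
```

Because every bin contributes one mean value regardless of how many samples it holds, the densely sampled deep water does not outweigh the stratified upper ocean in later fits.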
Dedicated inter-calibration profiles were performed between some instruments on different platforms. These dedicated calibration profiles were always within 1 km and 2 hours of each other (and generally much closer). Opportunistic inter-calibration profiles at similar temporal and spatial separations were also frequently found, especially between autonomous platforms. However, there were several pairs of instruments where no such profile matches were present, and in several cases only one such profile match existed, increasing the uncertainty on the final fitted parameters and leading to a few instances where the lack of data covering a sufficient range of values led to clearly nonsensical relationships (e.g., negative correlations). We therefore decided to use less restrictive criteria for profile matches, and allow profiles within 5 km and 6 hours of each other. These exact parameters are not crucial; sensitivity testing of these parameters with sample platforms did not result in significant differences for relatively small changes, although r2 relationships did tend to drop starting at 10 km separation. In practice these parameters may change based on the ocean region and should be decided using the data at hand.
When comparing different instrument profiles, using only profiles that appear to be in a similar water mass is important. Setting temporal and spatial separation thresholds is one method for meeting this objective. Another is to disallow individual measurement pairings with different T and S along the same density surface. We found that setting restrictive T and S difference thresholds resulted in a smaller number of samples but generally did not increase the correlation between profiles (r2) noticeably, suggesting that in this region, T and S differences between profiles separated by up to 5 km and 6 hours do not tend to signify changes in bbp. We therefore set relatively permissive thresholds of |ΔT| ≤ 0.5°C and |ΔS| ≤ 0.1 (practical salinity), which in this region of the ocean roughly correspond to a density difference of 0.1 kg m−3, the resolution of our density bins.
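The matching criteria from this section (5 km, 6 hours, |ΔT| ≤ 0.5°C, |ΔS| ≤ 0.1) can be expressed as two small predicates. This is a sketch only: the dictionary layout of a cast, the function names, and the use of a haversine distance on a spherical Earth are assumptions rather than details given in the text.

```python
import math
from datetime import datetime

MAX_KM, MAX_HOURS = 5.0, 6.0      # profile-level thresholds from the text
MAX_DT, MAX_DS = 0.5, 0.1         # per-bin |dT| (deg C) and |dS| thresholds


def distance_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle (haversine) distance between two positions in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat, dlon = p2 - p1, math.radians(lon2 - lon1)
    h = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2.0 * radius_km * math.asin(math.sqrt(h))


def profiles_match(cast_a, cast_b):
    """True when two casts fall within the spatial and temporal thresholds."""
    hours = abs((cast_a["time"] - cast_b["time"]).total_seconds()) / 3600.0
    km = distance_km(cast_a["lat"], cast_a["lon"], cast_b["lat"], cast_b["lon"])
    return km <= MAX_KM and hours <= MAX_HOURS


def bins_match(t_a, s_a, t_b, s_b):
    """True when T and S on a shared density surface are similar enough."""
    return abs(t_a - t_b) <= MAX_DT and abs(s_a - s_b) <= MAX_DS


# Hypothetical casts about 2 km and 3 hours apart.
ship = {"time": datetime(2018, 8, 20, 12), "lat": 50.10, "lon": -144.90}
float_cast = {"time": datetime(2018, 8, 20, 15), "lat": 50.12, "lon": -144.90}
matched = profiles_match(ship, float_cast)
```

Keeping the thresholds as module-level constants makes the kind of sensitivity testing described above a matter of changing four numbers.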
2.3. Sensor comparison and uncertainty analysis
For each sensor pairing, we calculated the best-fit relationship between Sensor 1 (x) and Sensor 2 (y), using models for a best-fit line, y = ax + b, and offset, y = x + b. We used orthogonal distance regression (ODR), a type-II linear regression that takes into account uncertainty on both x and y, from the scientific Python (scipy) odr package. We first calculated the best-fit line, eliminated those measurements with a z-score magnitude greater than 3 (i.e., three standard deviations away from the best-fit line), and re-calculated the best-fit line and offset without these outliers.
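A minimal sketch of this two-pass fit with `scipy.odr` is given below, assuming that module is available in the installed SciPy. The function names and starting guesses are illustrative, and the z-score is computed here from orthogonal distances to the first-pass line, which is one plausible reading of the text rather than a confirmed detail of the authors' code.

```python
import numpy as np
from scipy import odr


def fit_line(x, y, sx, sy):
    """ODR fit of y = a*x + b with uncertainties on both axes."""
    model = odr.Model(lambda beta, x: beta[0] * x + beta[1])
    data = odr.RealData(x, y, sx=sx, sy=sy)
    return odr.ODR(data, model, beta0=[1.0, 0.0]).run()


def fit_offset(x, y, sx, sy):
    """ODR fit of y = x + b (slope fixed at 1)."""
    model = odr.Model(lambda beta, x: x + beta[0])
    data = odr.RealData(x, y, sx=sx, sy=sy)
    return odr.ODR(data, model, beta0=[0.0]).run()


def fit_with_outlier_rejection(x, y, sx, sy, z_max=3.0):
    """First-pass line fit, drop |z| > z_max points, then refit both models."""
    x, y, sx, sy = (np.asarray(v, dtype=float) for v in (x, y, sx, sy))
    a, b = fit_line(x, y, sx, sy).beta
    # Signed orthogonal distance of each point from the first-pass line
    # (assumed definition of the residual used for the z-score).
    dist = (y - (a * x + b)) / np.sqrt(1.0 + a ** 2)
    z = (dist - dist.mean()) / dist.std(ddof=1)
    keep = np.abs(z) <= z_max
    line = fit_line(x[keep], y[keep], sx[keep], sy[keep])
    offset = fit_offset(x[keep], y[keep], sx[keep], sy[keep])
    return line, offset, keep
```

The returned `line.beta` and `line.sd_beta` hold the fitted a and b and their standard errors, and `offset.beta[0]` holds the fitted offset for the slope-fixed model.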
Uncertainty in binned bbp measurements comes from the sensor, the environment, and the calibration. Sensor and environment errors were assumed to be measurement-independent. Calibration errors, on the other hand, will equally bias all measurements from a given sensor. This section only involves the measurement-independent errors; calibration errors are considered in the next section.
For the measurement-independent errors, the resolution of each sensor (i.e., one count) was used as a metric for sensor noise. For sensors that did not measure at exactly 700 nm, the noise from the nearest wavelength was used (715 nm for BB9-RR and 717 nm for BB9-SR). Converting into backscattering units, this error has a range of 0.02–0.5 × 10−4 m−1 (Table 1). This error is convolved with environmental uncertainty associated with variance in the signal in each density bin. For N measurements within a density bin j with observed variance , the recorded instrument variance is , where σnoise is the factory-calibrated instrument resolution. The other two types of environmental uncertainty are errors in the calculation of seawater scattering, assumed to be 2.24% (Dall’Olmo et al., 2009; Zhang et al., 2009), and errors in χp, σχp, given above. Uncertainties are added in quadrature; that is (cf. Equations 1, 3),
These results do not take into account calibration uncertainties. The dark offset and multiplicative scaling factor applied originally to each sensor have their own uncertainties (σD and σS, respectively), which act in the same way on each measurement. Calibrations from Sunstone Scientific include an estimate of the scaling factor uncertainty, which typically ranges from 2 to 2.2%, although this uncertainty level is likely an under-estimate (Zhang et al., 2021b). For those instruments without such an estimate (HS6-RR and MCOMS-BGC), we used a value of 2%. The dark offset uncertainty is the standard deviation of the signal measured during the dark profiles, calculated either in situ or in a factory calibration. The new uncertainties on a and b are:
(see Text S1 for a derivation). In most cases, this additional calibration uncertainty ( and ) dominates the total uncertainty in a and b. In this manuscript, we express uncertainties as 95% confidence intervals, calculated using a t-distribution with degrees of freedom equal to n − 2 for the best-fit line and n − 1 for the best-fit offset, where n is the number of measurement comparisons.
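The confidence-interval convention described here can be sketched with `scipy.stats.t`. The helper names are hypothetical, and the standard error passed in is assumed to already include the calibration terms; the sketch only shows how the t-distribution multiplier and the consistency check work.

```python
from scipy import stats


def ci95_half_width(std_err, n, n_params):
    """Half-width of a two-sided 95% interval with n - n_params degrees of freedom."""
    t_crit = stats.t.ppf(0.975, df=n - n_params)
    return t_crit * std_err


def consistent_with(value, half_width, target):
    """True when target lies inside value +/- half_width."""
    return abs(value - target) <= half_width


# n_params is 2 for the best-fit line and 1 for the best-fit offset.
line_multiplier = ci95_half_width(1.0, n=287, n_params=2)
offset_multiplier = ci95_half_width(1.0, n=287, n_params=1)
```

For example, `consistent_with(1.31, 0.08, 1.0)` is false, matching the HS6-RR to FLBB-RR scaling factor of 1.31 ± 0.08 in Table 2, whereas `consistent_with(0.95, 0.06, 1.0)` is true for the FLBB-SR value of 0.95 ± 0.06.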
3. Results
Median bbp profiles show clear differences between many of the instruments (Figure 2a). The most obvious outlier is the FLBB-LF, which consistently measured higher bbp and a smaller range of variability in comparison to all other platforms. Smaller differences are also clearly apparent at depth, where the range of variability in bbp is especially low.
We used the methodology described above to match observations between platforms with each other, and find best-fit alignments between each platform pairing (Table 2, Figures 3, S1–S28). In these figures, fits that are consistent (within a 95% confidence interval) are shown in black, whereas fits where either the multiplicative factor a is statistically dissimilar (at 95% confidence) from 1 or the offset parameter b (in the best-fit offset case) is statistically dissimilar from 0 are in gray. If no paired profiles were found that match the spatial and temporal distance criteria described above, no relationship is shown (e.g., between FLBB-LF and BB9-SR; Figure 3).
Sensor 1 (x) | Sensor 2 (y) | Profiles | n | r2 | Best-Fit Line^a: a | Best-Fit Line^a: b (10−4 m−1) | Best-Fit Offset^a: b (10−4 m−1) | Fig. |
---|---|---|---|---|---|---|---|---|
FLBB-SR | FLBB-RR | 11 | 287 | 0.96 | 0.95 ± 0.06^b | −1.0 ± 0.5^b | −1.2 ± 0.5 | S1 |
BBFL2-WW | FLBB-RR | 42 | 1,113 | 0.94 | 1.00 ± 0.06 | −0.3 ± 0.6 | −0.2 ± 0.6 | S2 |
BB2FL-SG | FLBB-RR | 19 | 420 | 0.94 | 1.02 ± 0.06 | −0.5 ± 0.6 | −0.4 ± 0.6 | S3 |
BB9-RR | FLBB-RR | 15 | 149 | 0.73 | 0.94 ± 0.09 | 0.3 ± 0.8 | −0.1 ± 0.6 | S4 |
BB9-SR | FLBB-RR | 4 | 85 | 0.95 | 0.96 ± 0.07 | 0.3 ± 1.5 | 0.2 ± 1.6 | S5 |
HS6-RR | FLBB-RR | 26 | 365 | 0.84 | 1.31 ± 0.08 | 0.6 ± 0.6 | 2.2 ± 0.5 | S6 |
FLBB-LF | FLBB-RR | 7 | 61 | 0.72 | 2.13 ± 0.38 | −37.8 ± 11.7 | −14.0 ± 4.1 | S7 |
MCOMS-BGC^c | FLBB-RR | — | — | — | 1.05 ± 0.10 | −0.3 ± 0.5 | −0.1 ± 0.5 | — |
BBFL2-WW | FLBB-SR | 7 | 192 | 0.92 | 1.15 ± 0.07 | 0.4 ± 0.5 | 1.1 ± 0.4 | S8 |
BB2FL-SG | FLBB-SR | 16 | 344 | 0.96 | 1.05 ± 0.06 | 0.6 ± 0.4 | 0.8 ± 0.4 | S9 |
BB9-RR | FLBB-SR | 5 | 58 | 0.83 | 1.02 ± 0.14 | 1.4 ± 1.2 | 1.6 ± 0.5 | S10 |
BB9-SR | FLBB-SR | 41 | 870 | 0.98 | 1.04 ± 0.06 | 1.3 ± 1.6 | 1.5 ± 1.5 | S11 |
HS6-RR | FLBB-SR | 7 | 91 | 0.63 | 1.41 ± 0.17 | 1.5 ± 1.0 | 3.9 ± 0.5 | S12 |
FLBB-LF | FLBB-SR | 6 | 58 | 0.70 | 2.44 ± 0.45 | −43.0 ± 13.6 | −12.5 ± 4.1 | S13 |
MCOMS-BGC | FLBB-SR | 1 | 24 | 0.99 | 1.11 ± 0.08 | 0.7 ± 0.3 | 1.1 ± 0.2 | S14 |
BB2FL-SG | BBFL2-WW | 36 | 798 | 0.89 | 0.96 ± 0.06 | −0.0 ± 0.5 | −0.2 ± 0.5 | S15 |
BB9-RR | BBFL2-WW | 15 | 156 | 0.71 | 1.03 ± 0.10 | 0.1 ± 0.8 | 0.3 ± 0.5 | S16 |
BB9-SR | BBFL2-WW | 3 | 70 | 0.98 | 0.87 ± 0.06 | 1.0 ± 1.4 | 0.4 ± 1.6 | S17 |
HS6-RR | BBFL2-WW | 27 | 362 | 0.81 | 1.32 ± 0.09 | 0.7 ± 0.6 | 2.5 ± 0.5 | S18 |
FLBB-LF | BBFL2-WW | 3 | 27 | 0.71 | 1.87 ± 0.51 | −31.6 ± 13.1 | −13.6 ± 4.3 | S19 |
BB9-RR | BB2FL-SG | 6 | 55 | 0.79 | 0.90 ± 0.12 | 0.7 ± 0.9 | 0.1 ± 0.5 | S20 |
BB9-SR | BB2FL-SG | 4 | 67 | 0.93 | 1.01 ± 0.08 | 0.8 ± 1.6 | 0.8 ± 1.6 | S21 |
HS6-RR | BB2FL-SG | 9 | 98 | 0.76 | 1.19 ± 0.15 | 0.8 ± 0.9 | 1.8 ± 0.5 | S22 |
FLBB-LF | BB2FL-SG | 15 | 129 | 0.62 | 1.76 ± 0.24 | −29.8 ± 8.6 | −13.7 ± 4.1 | S23 |
BB9-SR | BB9-RR | 4 | 41 | 0.86 | 1.01 ± 0.14 | 0.0 ± 2.0 | 0.1 ± 1.6 | S24 |
HS6-RR | BB9-RR | 20 | 203 | 0.63 | 1.38 ± 0.14 | −0.1 ± 0.9 | 2.2 ± 0.5 | S25 |
FLBB-LF | BB9-RR | 2 | 14 | 0.86 | 2.04 ± 0.54 | −36.1 ± 14.4 | −14.4 ± 4.5 | S26 |
HS6-RR | BB9-SR | 3 | 42 | 0.75 | 1.25 ± 0.20 | 0.9 ± 1.9 | 2.2 ± 1.6 | S27 |
FLBB-LF | HS6-RR | 1 | 8 | 0.66 | 1.40 ± 1.04 | −23.5 ± 23.0 | −15.1 ± 4.9 | S28 |
^a Boldface indicates a recommended alignment (to FLBB-RR). Italics indicate an alignment which is inconsistent, within a 95% confidence interval, with a = 1 for the best-fit line or b = 0 for the best-fit offset.
^b Recommended despite being within 95% confidence interval of a = 1 because doing so significantly improved several other postalignment comparisons; see text.
^c Using alignments between MCOMS-BGC/FLBB-SR and FLBB-SR/FLBB-RR; see Text S2.
Five instruments, all four measuring at 124° and the FLBB-RR, agree with each other within uncertainty (black arrows). In contrast, three instruments require adjustments (grey arrows) in most, if not all, comparisons: FLBB-LF, HS6-RR, and FLBB-SR. The FLBB-LF required a scaling factor adjustment ranging from 1.4 to 2.4. Although this range appears large, the error bars on this value are generally high, such that all comparisons are consistent (within 95% confidence) with an additional required multiplicative factor of about 2.0, as well as an offset of approximately −35 × 10−4 m−1. The HS6-RR was similarly consistently biased, requiring a scaling factor adjustment (discounting the alignment with FLBB-LF) of about 1.3, and an offset adjustment of approximately 0.5 × 10−4 m−1.
The FLBB-SR alignment is more complicated. Discounting its comparisons with the FLBB-LF and the HS6-RR, which are consistent with those of the other instruments, two matches require a multiplicative factor adjustment (with MCOMS-BGC and BBFL2-WW) and three others require only an additive adjustment (note that the relationship is shown “in the other direction” with the FLBB-RR; e.g., the best-fit offset of −1.2 × 10−4 m−1 for FLBB-SR to align with FLBB-RR and the best-fit offset of +1.6 × 10−4 m−1 for BB9-RR to align with the FLBB-SR are consistent with each other). All of the matches are consistent with a reduction by a factor of about 0.95 (and an offset of about −1.0 × 10−4 m−1), suggesting that a relatively minor multiplicative factor could be warranted, although it does not rise to the level of 95% confidence.
To form our alignment, we picked a common reference instrument. We then tested the alignment of all sensors to this reference instrument to see if it leads to improved comparisons, consistent with a = 1 and b = 0 m−1, between non-reference instruments. We chose FLBB-RR as the reference because it already agreed well with several other instruments (Figure 3), profiled below 100 m, and was calibrated by an in situ taped cast during the deployment, which is considered best practice (Table S1). If the 95% confidence interval of a for the best-fit line included 1 and the confidence interval of b for the best-fit offset included 0, no modification was made (BBFL2-WW, BB2FL-SG, BB9-RR, and BB9-SR). For the other three instruments (FLBB-SR, FLBB-LF, and HS6-RR), we aligned each to the FLBB-RR using the values from the best-fit line comparison. For the FLBB-SR, the 95% confidence interval of the best-fit a included 1; however, applying the best-fit line to this instrument significantly improved the post-alignment comparisons.
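Applying the recommended alignment amounts to a linear map y = ax + b for the three adjusted sensors and the identity for the rest. The sketch below uses the best-fit line values listed in Table 2 for alignment to FLBB-RR; the dictionary and function names are hypothetical.

```python
# Best-fit line parameters (a, b) for aligning each sensor to FLBB-RR,
# copied from Table 2; b is in units of 1e-4 m^-1.
ALIGN_TO_FLBB_RR = {
    "FLBB-SR": (0.95, -1.0),
    "HS6-RR": (1.31, 0.6),
    "FLBB-LF": (2.13, -37.8),
}


def align_bbp(sensor, bbp_1e4):
    """Map bbp (in 1e-4 m^-1) onto the FLBB-RR scale via y = a*x + b."""
    a, b = ALIGN_TO_FLBB_RR.get(sensor, (1.0, 0.0))  # unlisted: unchanged
    return [a * value + b for value in bbp_1e4]


# Hypothetical FLBB-LF readings in units of 1e-4 m^-1.
aligned = align_bbp("FLBB-LF", [20.0, 22.0])
```

Sensors whose fits were already consistent with a = 1 and b = 0 pass through unchanged, which matches the rule of making no modification in those cases.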
After this alignment, we repeated the ODR for each set of instruments. This regression was done in the same way as the initial regression, except that we augmented the final uncertainties on a and b by the uncertainties in the alignment; that is, Equation 6 became:
We found improvements between almost all other comparisons (Figure 4). There were still two mis-matches (at 95% confidence), between FLBB-LF and BB9-RR and between BB9-SR and BBFL2-WW, which matches expectations; for 28 comparisons we statistically expect 5%, or 1–2, to mis-match at the given confidence interval.
The methodology outlined here works best with a dense network of backscattering sensors. A dense network, however, is not typical of most field campaigns. An alternative approach is to assume that bbp at some suitable deep depth is constant, and align each instrument to a common deep value (e.g., Briggs, 2011). Such analysis is only possible for those sensors that descend to such a depth. For the sensors considered here, only five (FLBB-SR, FLBB-RR, BB2FL-SG, BBFL2-WW, and MCOMS-BGC) descended to at least 500 m, and only three of these (FLBB-SR, FLBB-RR, and MCOMS-BGC) included multiple profiles down to 1000 m. We attempted an alignment to a common deep value by using the median backscatter between 400 and 500 m depth (i.e., subtracting each instrument’s deep value and then adding the deep value for FLBB-RR; Table 1). The final result by design aligns the median instrument signal perfectly between 400 and 500 m (Figure 2c). The differences between the deep-water alignment, based on median values, and the recommendation here, based on comparisons between near-coincident measurement profiles, were always within 0.2 × 10−4 m−1 of the best-fit offset result, suggesting an alignment to this deep value is generally sufficient. However, the drawbacks to this method are that it requires deeper profiles than several of our instruments made, and it does not account for multiplicative factors.
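The deep-value alignment is simple enough to sketch directly. The deep values below are the 400 to 500 m medians from Table 1; the function names are hypothetical, and `deep_median` shows how such a value could be obtained from a single profile under the assumption that depth and bbp are given as parallel lists.

```python
from statistics import median

# Median bbp between 400 and 500 m from Table 1, in units of 1e-4 m^-1.
DEEP_VALUE = {"FLBB-RR": 2.0, "FLBB-SR": 3.2, "BB2FL-SG": 2.4,
              "BBFL2-WW": 2.2, "MCOMS-BGC": 2.3}


def deep_median(depth_m, bbp, top=400.0, bottom=500.0):
    """Median bbp over the reference depth window."""
    window = [b for z, b in zip(depth_m, bbp) if top <= z <= bottom]
    if not window:
        raise ValueError("profile does not reach the reference depth window")
    return median(window)


def deep_offset(sensor, reference="FLBB-RR"):
    """Additive correction that moves a sensor's deep value onto the reference."""
    return DEEP_VALUE[reference] - DEEP_VALUE[sensor]


offsets = {name: deep_offset(name) for name in DEEP_VALUE}
```

With the Table 1 values, the FLBB-SR offset works out to about −1.2 × 10−4 m−1, the same as its best-fit offset in Table 2. The sketch also makes the limitation explicit: the correction is purely additive, and sensors without a deep value (such as the FLBB-LF) cannot be handled at all.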
4. Discussion
The EXPORTS-NP campaign allowed us to evaluate the alignment of different backscattering sensors, each of which was calibrated independently. Our methodology utilized chance encounters between different platforms, rather than only making use of dedicated comparison casts. The region studied here, near Ocean Station Papa in the North Pacific, is generally considered an “eddy desert” (Chelton et al., 2007; 2011) with low eddy kinetic energy (Xu et al., 2014) and an energy budget dominated by relatively large-scale surface fluxes of heat and precipitation (Large, 1996; Ren and Riser, 2009). The range of bbp measured in these oligotrophic waters was relatively small, making it statistically easier to find biases in bbp measured by different platforms as these biases were large compared to the range of observed bbp. Repeating this analysis for a region with a greater range of bbp, such as the North Atlantic, would increase confidence in this methodology, with a caveat that the criteria for a measurement pairing may need to be modified in environments with greater small-scale variability.
Theoretically, some variation in bbp between platforms could be a result of the different sampling strategies of the platforms. For this paper, we only considered data gathered during the main part of the EXPORTS-NP experiment when all platforms were taking measurements; while the BB2FL-SG, FLBB-LF, and MCOMS-BGC measurements continued past those of the main cruise (and, in the case of BB2FL-SG, also preceded EXPORTS-NP), we did not consider data from these extended deployment periods. Some platforms sampled in different regions; for example, the SR surveyed the broader region surrounding the area sampled by the RR and LF. However, the fact that comparisons between sensors from the same platform yielded similar results to sensors from different platforms suggests this is not a major concern for this study. For example, the FLBB-RR and the HS6-RR sampled in very similar waters, as they were on the same platform. Our approach suggests that the HS6-RR data require a scaling by a factor of 1.31, a factor that is quite consistent with comparisons between all other platforms, even those with substantially different sampling patterns.
Our results suggest that several instruments require adjustments to their bbp measurements to align with those of the reference sensor, FLBB-RR. The instrument requiring the most substantial modification was the FLBB-LF, which measured much higher bbp over a smaller range of values than all the other instruments. The calibrations for this instrument were also somewhat unusual, with a standard deviation error of 4 counts for the dark value (normally, this uncertainty is about one count). However, a calibration error is unlikely, as pre- and post-cruise calibrations (both by Sunstone Scientific) of dark counts offset and scaling factor differed by only 3–5% (Table S1) and cannot explain the large discrepancy observed here. One possibility is that the FLBB-LF was partially obstructed during the cruise, leading to a large amount of permanent observed “backscatter” and a smaller range of variability. Even with these instrument-specific issues, the method outlined here was able to align the data from the FLBB-LF to other instruments and recover data that, after alignment, appears to be consistent with bbp measured by the other platforms (Figure 2b).
Another anomalous instrument was the HS6-RR. One major difference between the HS6-RR and all other instruments was the calibration procedure, which used the plaque method rather than beads (see Data and Methods). In oligotrophic waters, biases resulting from different calibration procedures could be magnified, potentially resulting in differences in scale factor of the size seen here. Another difference in the HS6-RR observations was the width of the scattering angle measured. In a recent Monte Carlo simulation, Zhang et al. (2021b) estimated the full-width at half-maximum (FWHM) of the angular response weighting function for the HS6 to be 4.6°, which is significantly narrower than the 41° FWHM of Seabird sensors (Twardowski et al., 2012). This difference could lead the χp factor to be overestimated by 3–4% for Seabird sensors (Zhang et al., 2021a). In this manuscript, we accounted for the effect of these very different scattering angle widths by increasing the uncertainty for χp used for the HS6-RR, but a full investigation of the effects of varying scattering angle widths on the resulting bbp calculations remains to be done. Given the small range of observed χp(θ), variation in this parameter would be unlikely to result in the scaling factor discrepancy seen here.
We have also recommended an adjustment to the FLBB-SR, including a relatively minor (5%) multiplicative change and an offset of about −1 × 10−4 m−1. This negative offset was present in multiple comparisons (Figure 3). We are not at present able to explain this offset. The most common problem with dark casts, which could lead to an additive offset, is mis-application of the tape causing stray light to enter, biasing the dark counts high and therefore the final calibrated signal low, which is the opposite of what we see here. However, we do note that a correction for this sort of offset can be made using a deep-value adjustment, which does not require as dense a network of instruments.
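The sign of a dark-count bias follows from the linear calibration usually applied to these sensors, in which the calibrated signal is a scale factor multiplied by the raw counts minus the dark counts. The sketch below assumes that linear form; the function name and all numerical values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def calibrated_signal(raw_counts, dark_counts, scale_factor):
    """Assumed linear calibration: scale_factor * (raw_counts - dark_counts)."""
    return scale_factor * (raw_counts - dark_counts)


# Hypothetical values, for illustration only.
scale_factor = 2.0e-6  # calibrated signal units per count
raw_counts = 150.0     # in-water measurement
true_dark = 50.0       # dark counts from a correctly taped cast
leaky_dark = 55.0      # dark counts biased high by stray light

good = calibrated_signal(raw_counts, true_dark, scale_factor)
biased = calibrated_signal(raw_counts, leaky_dark, scale_factor)

# The bias equals -scale_factor * (leaky_dark - true_dark) for any raw_counts.
additive_bias = biased - good
```

Under this assumed form the bias is negative and does not depend on the raw counts, so it acts as a purely additive offset of the kind a deep-value adjustment can remove.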
A number of observational programs are invested in understanding changes in biological parameters, such as optical backscatter, throughout the global ocean. By necessity, such programs involve many different platforms. For example, the BGC-Argo array currently consists of over 400 operational floats, with plans to eventually operate 1,000 floats throughout the global ocean (Claustre et al., 2020). Uncertainty for these sensors is not, however, dominated by the precision of their own measurements, but rather by their calibration. Therefore, even if the sensor can resolve changes in bbp as small as 2 × 10−6 m−1, this resolution does not mean that the total uncertainty of this sensor is this low: calibration uncertainties are generally recorded as about 2%, an uncertainty that will be compounded when trying to understand how measurements from this particular instrument compare with other BGC floats (or measurements from other platforms entirely, as considered here). Indeed, the deep-water alignment analysis presented here (Figure 2c) is highly similar to a recent study of BGC-Argo floats that uncovered biases of about 0.4 × 10−4 m−1 in several instruments (Poteau et al., 2017), which were ultimately linked to scaling factor errors of up to 20% (Barnard, 2019). Other large projects such as GO-SHIP (Global Ocean Ship-based Hydrographic Investigations Program; Talley et al., 2016) and Bio-GO-SHIP also involve multiple measurements of optical backscatter at different wavelengths and, possibly, different scattering angles. The results given here suggest that it will be difficult to interpret small temporal or spatial changes in bbp through these programs as resulting from actual long-term variability, rather than calibration errors between instruments.
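To make the contrast between sensor resolution and calibration uncertainty concrete, the sketch below compares the two for a single hypothetical bbp value. It rests on two assumptions that are not taken from the analysis: the 2% calibration uncertainty is treated as a relative uncertainty on bbp itself (ignoring dark-count uncertainty), and the calibration errors of two sensors are treated as independent and combined in quadrature. The function names and the bbp value are illustrative only.

```python
import math

RESOLUTION = 2e-6           # smallest resolvable bbp change (m^-1), from the text
CAL_REL_UNCERTAINTY = 0.02  # about 2% calibration uncertainty, from the text


def calibration_uncertainty(bbp):
    """Absolute calibration uncertainty (m^-1), assumed proportional to bbp."""
    return CAL_REL_UNCERTAINTY * bbp


def difference_uncertainty(bbp_a, bbp_b):
    """Uncertainty of bbp_a - bbp_b, assuming independent calibration errors."""
    return math.hypot(calibration_uncertainty(bbp_a),
                      calibration_uncertainty(bbp_b))


bbp = 5e-4  # hypothetical bbp value (m^-1), for illustration only

single = calibration_uncertainty(bbp)    # one sensor
pair = difference_uncertainty(bbp, bbp)  # difference between two sensors
ratio = single / RESOLUTION              # calibration uncertainty vs resolution
```

For this hypothetical value the single-sensor calibration uncertainty is several times larger than the resolution, and the uncertainty of a two-sensor difference is larger again by a factor of the square root of two.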
To reduce concerns about inter-comparability between measurements, conducting multiple calibrations of instruments, ideally before and after deployment, is important. The differences between these calibrations should be considered and, when necessary, incorporated into the error of the resulting observations. Dark counts have been observed to vary due to external factors such as power delivery and platform-specific effects, potentially accounting for some deviation in factory and in situ values (e.g., Cetinić et al., 2009). Conducting taped casts of instruments whenever possible is therefore important. Care should be taken when conducting these casts, as they can be prone to errors (e.g., incorrect placement of the electrical tape can lead to higher dark counts). Finally, care should be taken whenever an analysis using data from multiple instruments shows relatively small changes in backscattering or quantities derived from backscattering, such as POC, because these differences may be due to inaccurate calibrations rather than real changes in backscattering or derived variables. One major step forward will be high quality sensor calibration and characterization of each sensor (i.e., calculating the angular and spectral response functions, as in Zhang et al., 2021b). Another necessary advance is a better characterization of variations in time and space in χp(θ), as well as differences between sensors measuring at the same centroid scattering angle but with different fields of view (e.g., the HS6-RR and FLBB-RR). These will be especially important when comparing data from large observational campaigns such as BGC-Argo and (Bio-)GO-SHIP.
5. Conclusions
This manuscript details the process of comparing a dense network of backscattering sensors with each other. This methodology was applied to the EXPORTS-NP cruise data using nine different backscattering sensors measuring either at 700 nm or at adjacent wavelengths that were interpolated to 700 nm. The results of this alignment indicated several sensors that required updated multiplicative and/or additive parameters, and we provide recommended alignments of all sensors to a single instrument (see Table 2). The methodology applied in this paper is possible only for field experiments that utilize a large number of sensors, such that meaningful intercomparisons can be made. In addition to the North Pacific EXPORTS experiment considered here (Siegel et al., 2021), this methodology would also be appropriate for the North Atlantic EXPORTS campaign (Siegel et al., 2016) or the 2008 North Atlantic Bloom Experiment (Cetinić et al., 2012).
Upcoming satellite missions, such as PACE (Plankton, Aerosol, Cloud, ocean Ecosystem; Werdell et al., 2019), are predicated on the ability to accurately derive aquatic properties, such as bbp(700), for use in subsequent estimation of phytoplankton compositions, carbon stocks, and other important biological and chemical constituents that inform advanced climate studies. Given the role of the open ocean in such studies, and the very small bbp signals in oligotrophic areas, collecting in situ measurements of high fidelity and accuracy at low signals is essential to support global satellite data product performance assessments. Of further importance is that measurements from varied instruments be consistent and inter-comparable so as not to introduce additional uncertainty into satellite performance assessments when they are validated against mixed datasets. The scales of correction reported here are of the same order of magnitude as the uncertainties required for robust satellite performance assessments, indicating the importance of large in situ observational arrays to conduct alignments such as the one reported here to ensure inter-comparability of measurements.
Data accessibility statement
The BGC float data used here are from Float 5905988 and can be accessed at https://usgodae.org/ftp/outgoing/argo/dac/aoml/5905988/. All other datasets used in this paper can be found within the SeaBASS data archive at: doi:10.5067/SeaBASS/EXPORTS/DATA001. Code to perform the analysis found in this paper is at doi:10.5281/zenodo.6143682.
Supplemental files
Supplemental materials document the calibrations for each instrument (Table S1), derive how to include calibration uncertainty in the best-fit parameter uncertainty (Text S1), describe the comparison between MCOMS-BGC and FLBB-RR (Text S2), show the individual comparison plots between different instruments (Figures S1–S28; see Table 2), and provide a detailed figure caption for Figures S1–S28 (Text S3).
Acknowledgments
The authors would like to thank the captains and crews of the R/Vs Roger Revelle and Sally Ride. They are grateful to the entire EXPORTS scientific team, especially Dave Siegel, for planning and implementing this cruise, and for collecting, processing, and submitting for public use the resulting data. Conversations with Eric D’Asaro, Deric Gray, Norm Nelson, and Collin Roesler assisted in the interpretation of our results. BGC float data are collected and made freely available by the International Argo Program. The authors thank NOAA/AOML and Andrea Fassbender for supporting the BGC float used here, and the editors, Giorgio Dall’Olmo, and an anonymous reviewer for their constructive comments on earlier drafts of this article.
Funding
ZKE was supported by the NASA Postdoctoral Program at the NASA Goddard Space Flight Center, administered by Universities Space Research Association. IC is supported by the Universities Space Research Association, and PJW is supported by the NASA PACE mission. The authors were also supported by NASA grants 80NSSC17K0656 and 80NSSC20K0350 (XZ), 80NSSC17K0568 (EB), 80NSSC17K0663 (CL and MJP), and 80NSSC17K0662 and 80NSSC18K1323 (MO).
Competing interests
The authors are aware of no competing interests.
Author contributions
Conception and design of article: ZKE, IC, EB.
Analysis and interpretation of data: ZKE, IC, XZ, EB, PJW, LH.
Contributed to acquisition of data: IC, XZ, EB, SF, CL, MO, MJP.
Drafting and/or revising the article text and figures: All.
Approved the submitted version for publication: All.
References
How to cite this article: Erickson, ZK, Cetinić, I, Zhang, X, Boss, E, Werdell, PJ, Freeman, S, Hu, L, Lee, C, Omand, M, Perry, MJ. 2022. Alignment of optical backscatter measurements from the EXPORTS Northeast Pacific Field Deployment. Elementa: Science of the Anthropocene 10(1). DOI: https://doi.org/10.1525/elementa.2021.00021
Domain Editor-in-Chief: Jody W. Deming, University of Washington, Seattle, WA, USA
Guest Editor: David Siegel, Department of Geography and Earth Research Institute, University of California Santa Barbara, Santa Barbara, CA, USA
Knowledge Domain: Ocean Science
Part of an Elementa Special Feature: EXport Processes in the Ocean from RemoTe Sensing (EXPORTS)