Multi-decadal surface ozone trends at globally distributed remote locations

Extracting globally representative trend information from lower tropospheric ozone observations is extremely difficult due to the highly variable distribution and interannual variability of ozone, and the ongoing shift of ozone precursor emissions from high latitudes to low latitudes. Here we report surface ozone trends at 27 globally distributed remote locations (20 in the Northern Hemisphere, 7 in the Southern Hemisphere), focusing on continuous time series that extend from the present back to at least 1995. While these sites are only representative of less than 25% of the global surface area, this analysis provides a range of regional long-term ozone trends for the evaluation of global chemistry-climate models. Trends are based on monthly mean ozone anomalies, and all sites have at least 20 years of data, which improves the likelihood that a robust trend value is due to changes in ozone precursor emissions and/or forced climate change rather than naturally occurring climate variability. Since 1995, the Northern Hemisphere sites are nearly evenly split between positive and negative ozone trends, while 5 of 7 Southern Hemisphere sites have positive trends. Positive trends are in the range of 0.5–2 ppbv decade –1 , with ozone increasing at Mauna Loa by roughly 50% since the late 1950s. Two high elevation Alpine sites, discussed by previous assessments, exhibit decreasing ozone trends in contrast to the positive trend observed by IAGOS commercial aircraft in the European lower free-troposphere. The Alpine sites frequently sample polluted European boundary layer air, especially in summer, and can only be representative of lower free tropospheric ozone if the data are carefully filtered to avoid boundary layer air. The highly variable ozone trends at these 27 surface sites are not necessarily indicative of free tropospheric trends, which have been overwhelmingly positive since the mid-1990s, as shown by recent studies of ozonesonde and aircraft observations.


Introduction
Of the greenhouse gases (CO 2 , CH 4 , O 3 , N 2 O, H 2 O, synthetic greenhouse gases, e.g. HFCs, SF6) tropospheric ozone is perhaps the most difficult to observe and quantify on the global scale, due to its acute spatial variability resulting from its variable lifetime (minutes in the polluted boundary layer, to roughly three weeks in the free troposphere ) and its range of sources (injection from the stratosphere, or photochemical production from natural and anthropogenic precursor gases) and sinks (surface deposition and chemical destruction) [Monks et al., 2009;Lin et al., 2019]. The challenge is compounded by varying emissions of ozone precursor gases which, over the past few decades, have shifted from high and mid-latitudes toward low latitudes, where ozone production efficiency is greater . With respect to quantifying tropospheric ozone's impact on climate change, the most useful observations are those from globally distributed, long-term surface sites in remote locations, and free tropospheric observations (ozonesonde or aircraft), especially those from the upper troposphere, where ozone's long-wave radiative effect is most effective [Kuai et al., 2017].
Since the 1990s, periodic updates of observed surface and free tropospheric ozone trends in remote locations have appeared in the literature at irregular intervals [e.g. Logan 1985Logan , 1999aLogan , b, 2012Oltmans and Levy, 1994;Oltmans et al., 1998Oltmans et al., , 2006Oltmans et al., , 2013Parrish et al., 2012;Gaudel et al., 2018]. These data analyses have been complemented by a series of assessments that summarize the current state of knowledge regarding tropospheric ozone's global distribution and trends [e.g. IPCC, 2013;Gaudel et al., 2018;Blunden et al., 2018]. These studies and assessments have been extremely useful for keeping the research community and policy makers informed on tropospheric ozone's trends and variability. However, due to ozone's continuous global redistribution and the 1-2 year time lag between data availability and publication date, these findings quickly become outdated. Furthermore, most of the previous global trend analyses focused on very few sites, some of which were impacted by local and regional pollution sources, as will be shown later.
Recently, The Tropospheric Ozone Assessment Report (TOAR) produced a series of peer-reviewed publications that documented the state of knowledge on ozone's global distribution and trends, relying on surface observations through the year 2014 (https://collections.elementascience.org/toar). While TOAR provided a wide range of analysis on global ozone trends with relevance to climate [see Gaudel et al., 2018, also known as TOAR-Climate] it could not cover every aspect in detail. Fortunately, TOAR was specifically designed to enable new research, independent of the original TOAR effort, allowing future studies to explore unaddressed science questions. This paper is part of that growing legacy, joining several new independent studies that have been enabled by the wide range of ozone metrics archived in the TOAR Surface Ozone Database [Lu et al., 2018;Jaffe et al., 2018;Strode et al., 2018;Chang et al., 2019;Shen et al., 2019;Seltzer et al., 2020]. Our analysis has several goals: 1) identify the most useful sites in the TOAR database for understanding longterm ozone trends at remote locations worldwide, for the purpose of evaluating global chemistry-climate models; 2) document and display the wide range of ozone trends and multi-year ozone fluctuations recorded at remote sites around the world using the most recently available data (most sites have data through [2017][2018]; 3) demonstrate that the ozone trends recorded at high altitude Alpine sites in central Europe do not match the ozone increases observed by commercial aircraft in the lower free troposphere above Europe; and 4) use the TOAR database and the recent peer-reviewed literature to place these remote ozone time series in the context of ozone trends observed in the polluted boundary layer and the free troposphere.
This analysis focuses on ozone observations from 27 remotes sites, selected from over 9000 ozone time series in the TOAR Surface Ozone Database [Schultz et al., 2017]. Although few in number, these are the best available sites for understanding long-term ozone trends at remote locations worldwide, and they are well suited for evaluating the ozone trends calculated by the global chemistry-climate models that simulate the evolution of tropospheric ozone on multi-decadal time scales Myhre et al., 2017;Young et al., 2018]. Our analysis goes beyond the scope of TOAR, which only addressed five of the 27 sites selected for this paper , by using the most recently available data, and by using a more refined trend analysis method that is based on monthly anomalies.
The analysis and results are presented as follows. Section 2 describes the site selection method and the statistical analysis applied to estimate the long-term ozone trends at each site, as well as periods of multi-year ozone fluctuations. Sites were limited to those with continuous ozone time series that extend from the present back to at least 1995, so that multi-decadal ozone trends can be assessed. Importantly, all sites have at least 20 years of data, which improves the likelihood that any robust trend value is due to changes in ozone precursor emissions and/or forced climate change rather than naturally occurring climate variability [Weatherhead et al., 1998;Barnes et al., 2016]. Section 3 presents the results of the trend analysis focusing on: 1) the four longest remote ozone time series, established in the mid-1970s; 2) all 27 remote sites since 1995; and 3) a comparison between the long-term trends at the high elevation surface sites in the Alps, and the lower free-troposphere above Europe, as observed by commercial aircraft. Section 4 provides context for the ozone trends at the 27 remote sites by comparing them to observed trends in the polluted boundary layer and the free troposphere. Finally, Appendix S-A in the Supplemental Material describes the advantages of the statistical methods employed by this study over those used by TOAR. The appendix also provides a comparison between these statistical methods and several other popular methods, as well as an evaluation of polynomial fits, which can lead to inaccurate trend estimates. Watch (GAW) program and therefore follow strict data quality control guidelines Schultz et al., 2015], with station audits and calibration information available from the World Calibration Centre for Surface Ozone, Carbon Monoxide, Methane and Carbon Dioxide [Klausen et al., 2003;Buchmann et al., 2009;WCC-Empa, 2019]. All surface ozone data were uploaded to the TOAR Surface Ozone Database [Schultz et al., 2017], processed by the TOAR Database team and then retrieved from the TOAR Surface Ozone Database via the JOIN web interface: https://join.fz-juelich.de. Data processing involved reformatting the data, and where necessary, unit conversions and time shift to UTC. Ozone observations made from commercial aircraft above Western Europe are also used, as described below.

Trend estimation
To describe and quantify the long term changes in ozone at each site we chose a linear regression model (described below) to quantify the ozone rate of change, and locally weighted regression (or lowess, or loess smoother) to describe the nonlinear multi-year ozone fluctuations at each site [Cleveland, 1979;Cleveland and Devlin, 1988]. We considered other statistical methods for estimating the long-term rate of change and multiyear fluctuations, and we provide a detailed comparison of all methods in Appendix S-A to demonstrate why the linear regression model and locally weighted regression are appropriate for this study.
Due to ozone's strong seasonal cycle, the trend estimated from monthly observed ozone can be less accurate than the trend estimated from monthly anomalies, if several months of data are missing. Therefore, we report trends based on monthly mean ozone anomalies, which also reveal the influence of unforced climate variability on interannual ozone fluctuations [Logan et al., 2012;Oltmans et al., 2013;Tarasick et al., 2016;Lin et al., 2014;. The observed monthly means are calculated by the TOAR Surface Ozone Database from hourly observations (for either daytime or nighttime conditions, as described below). Acceptance of a monthly mean depends on the hourly data availability being greater than 75%; the data availability requirement was relaxed to 50% for Kislovodsk due to intermittent data sampling in the early part of the record. To calculate a monthly anomaly we first calculated monthly mean ozone at a particular site for the entire time series, and then calculated the 16-year mean for all 12 months of the year over the base period 1995-2010. The difference between the observed monthly mean and the 16-year monthly mean for the same month yields the monthly anomaly.
Trends are estimated with the following linear regression model: Where y is the monthly mean ozone anomaly, t is a monthly index from January 1957 to December 2018, α is a constant, β is a linear trend, γ and δ are coefficients for a 12-month harmonic series of seasonal cycle (M = 1,…, 12), and R t is AR(1) to account for autocorrelation, i.e.
R t = ρR t-1 + ∈ t with ∈ t a normal random error series. The sine and cosine terms account for any residual seasonal signal that may remain in the monthly anomalies. Equation (1) follows the form of standard statistical methods for calculating trends in time series of environmental data characterized by seasonal cycles and autocorrelation [Cochrane and Orcutt, 1949;Weatherhead et al., 1998;Chandler and Scott, 2011]. This method is appropriate for long time series and the estimated trends are not disproportionately affected by outliers as the time series are comprised of at least twenty years of monthly values. Trends are reported with 95% confidence intervals and p-values. The p-value is the probability under a specified statistical model that a statistical summary of the data would be equal to or more extreme than its observed value. Following the recent recommendation of the American Statistical Association [Wasserstein and Lazar, 2016;Wasserstein et al., 2019] we do not treat p < 0.05 as a "bright-line" to label a trend as statistically significant or meaningful. Trend values are reported regardless of p-value, and some of the figures follow the TOAR color scheme for indicating the p-value of a particular ozone time series [Fleming and Doherty et al., 2018;Gaudel et al., 2018;Mills et al., 2018]. The Supplemental Material illustrates the degree to which trends can differ when they are based on simple linear regression of observed monthly means, versus the linear regression model (1) and monthly anomalies. For example, at Tustervatn, Norway (Figure S-1a) the ozone trend based on the observed monthly mean is 0.6 ± 0.9 ppbv decade -1 (p = 0.17), while the trend based on anomalies is stronger with tighter confidence limits, 0.9 ± 0.8 ppbv decade -1 (p = 0.02). Minamitorishima, is a very small island at the edge of the tropics in the North Pacific Ocean (Figure S-1b), impacted by Asian outflow during the winter months [Wada et al., 2011]. The trend at Minamitorishima based on observed monthly means indicates an ozone decrease of 14% (-1.7 ± 1.8 ppbv decade -1 ; p = 0.07) from 1994 to 2018. However, the trend based on anomalies shows a weaker decrease of only 3% (-0.6 ± 0.8 ppbv decade -1 ; p = 0.17). In addition, the monthly anomalies reveal repeated cycles of ozone increases and decreases that are not evident from the observed monthly means. The ozone fluctuations at Minamitorishima are related to shifting transport patterns produced by climate variability [Okamoto et al., 2018].
Monthly anomaly time series and their associated linear trend values are shown for all sites in Appendix S-B. These plots also show the results of the locally weighted regression (or lowess, or loess) method, which is used to describe the multi-year ozone fluctuations at each site [Cleveland, 1979;Cleveland and Devlin, 1988]. The lowess smoother fits are not used for a formal trend estimate, instead they represent a method for estimating a regression surface through a multivariate smoothing procedure, fitting the monthly ozone anomalies to the most attributable long term variability in the time series, in a moving fashion analogous to how a moving average is computed for a time series. The attributable long-term variability is subjective and for our purposes we selected a time scale of 10 years for examining ozone variability. A comparison of our selected lowess smoother to one that is sensitive to a shorter time scale, and to a similar low pass, filterbased fast Fourier transform is presented in Appendix S-A. As shown in Appendix S-B, the lowess smoother can be used to identify multi-year periods of ozone increases and decreases not detected by the linear regression model. In most cases the linear trend provides an adequate description of the long-term rate of change; clear exceptions are, 1) Zugspitze, which can be summarized by a strong upward trend (6.4 ± 1.1 ppb decade -1 , p ≤ 0.01) over 1978-1997, and a weak downward trend (-1.2 ± 1.0 ppb decade -1 , p = 0.01) over 1998-2017; 2) Centennial, which exhibits a sharp drop in 2004; 3) Jungfraujoch which experienced several years of anomalously low ozone in the early 1990s; and 4) Izaña, where ozone increased abruptly in the late 1990s.

Comparison of polynomial fits to the lowess smoother
Appendix S-A in the Supplemental Material provides an evaluation of polynomial fits, which have been used in previous research (e.g. Logan et al., 2012;Parrish et al., 2014) to estimate long-term ozone trends. The idea of a polynomial fit is analogous to gradually adding "nodes" to the trend estimate, with each additional term of the polynomial equation adding another bend to the curve, thus one or multiple change(s) in the trend can be revealed in the resulting estimate. However, due to the challenges of fitting polynomials to highly variable tropospheric ozone time series (described below) they are not as effective as the lowess smoother for identifying the multi-year ozone variability common to tropospheric ozone observations. Here we provide a brief summary of two of the most severe problems associated with polynomial fits, and the reader is referred to Appendix S-A for further details.
1) A polynomial rarely fits the data as well as commonly used low-pass filter methods, such as the lowess smoother. As demonstrated by this analysis, the lowess smoother provides a robust characterization of the ozone variability across the span of the multi-decadal time series. In contrast, low order polynomial fits bear the risk of missing features detected by the lowess smoother, while higher order polynomials contain artefacts in spite of their closer fit. Figure 1 compares the lowess smoother to 2 nd , 3 rd , 4 th , 10 th and 20 th order polynomials for the midlatitude sites of Jungfraujoch, Switzerland and Strath Vaich Dam, Scotland. At Jungfraujoch the 2 nd and 3 rd order polynomials miss major ozone fluctuations revealed by the lowess smoother, and indicate ozone has been decreasing since 2010, even though it has actually increased. The higher order polynomials approach the pattern revealed by the lowess smoother but they exaggerate the ozone increase since 2010. Similar results are found for Strath Vaich Dam, with the 20 th order polynomial having greatly exaggerated variability at either end of the time series; this poor fit is known as the Runge phenomenon [Runge, 1901;Fornberg and Zuev, 2007]. 2) For polynomial fits, the trend solution at the beginning of the time series can be impacted by the data at the end of the time series. Figure 2 compares the lowess smoother to polynomials at Jungfraujoch and Strath Vaich Dam, for the full time series (ending in 2015-2018) and for the condition when the time series ends in 2010. The addition of data to the end of the time series causes the polynomial to bend differently through the data in the early years. This demonstration shows that the slope in the early part of the time series is inaccurate, and is merely an artefact of the polynomial fit, rather than being a true representation of the data. In contrast, the addition of data to the end of the time series has no impact on the lowess fit through the early part of the time series, as expected. We also bring attention to the fact that the fit of the lowess smoother can change abruptly at the end of the time series when additional data become available. For example, the lowess smoother follows the Strath Vaich Dam data sharply downwards when the time series ends in 2010, as no other information is available in this time-limited scenario. But once additional years of data become available the lowess smoother then accounts for the increase of ozone after 2010 and bends upwards, again following the available data. This example of abrupt changes in localized trends demonstrates why the lowess smoother (or any trend estimator) should not be used to extrapolate trends beyond the available data.

Data selection
The focus of this study is to quantify long-term ozone trends worldwide, at as many remote monitoring sites as possible. We require each site to have at least 20 years of data, beginning no later than 1995 and extending to the present (2013-2018). Sites must be in remote locations far from fresh anthropogenic emissions, which could lead to localized production or destruction of ozone [Escudero et al., 2014;Simon et al., 2015;Strode et al., 2018]. The TOAR Surface Ozone Database, which holds ozone metrics from more than 9,000 sites worldwide [Schultz et al., 2017], was scanned for sites that meet these criteria, yielding the 27 remote surface sites reported in Table 1. Twenty sites are in the Northern Hemisphere and seven are in the Southern Hemisphere; six sites are in the polar regions and two are in the tropics; thirteen sites are high elevation (>1500 m), of which eight are mountaintop sites; six sites are influenced by the marine boundary layer due to their island or coastal locations (see Table 1 for a listing of the sites in each of these environments). The diurnal ozone cycle at each site was examined to select the time of day when atmospheric conditions are well-mixed. This criterion typically resulted in daytime (08:00 to 19:59 solar time) data being selected for low elevation sites, to avoid the nighttime hours when ozone is likely to be depleted under the nighttime temperature inversion. Conversely, nighttime (20:00 to 07:59 solar time) data were selected for mountaintop sites to focus on regionally representative ozone, and to avoid local air masses with depleted ozone that are transported from the valleys to the mountaintops under daytime upslope wind conditions [Price and Pales, 1963;Weiss-Penzias et al., 2006;Gheusi et al., 2011;Gallagher et al., 2012;Cristofanelli et al., 2013]. The mountaintop sites in this analysis are Mauna Loa, Hawaii [Oltmans et al., 1986]; Zugspitze [Gilge et al., 2010], Jungfraujoch [Cui et al., 2011] and Sonnblick [Gilge et al., 2010] in the Alps; Mt. Waliguan in central China [Xu et al., 2016]; Kislovodsk High Mountain Station in southern Russia [Tarasova et al., 2009]; Izaña on the island of Tenerife in the Canary Islands ; and El Tololo in the Andes Mountains of Chile [Anet et al., 2017]. In the case of Kislovodsk High Mountain Station, which has a weak diurnal cycle, 24-hour data were used due to limited sampling in the early part of the record. Some sites, such as those in the polar regions (e.g Zeppelin [Solberg, 2003]) or those in the marine boundary layer (e.g. American Samoa), have virtually no diurnal ozone cycle and therefore data from day or night, or all 24 hours of the day can be used. The time of day selected for each site is indicated in Tables 2 and 3.   Table 3: As in Table 2 but the change in ozone is reported as the total increase or decrease of ozone from the beginning of the period to the end of the period, in units of ppbv or percent, based on the linear regression values in Table 2 (e.g. from 1995 to 2017 ozone at Grand Canyon decreased by 3.4 ppbv or 9%). Trends with p-values <= 0.05 are shown in bold font, and trends with p-values in the range 0.05 < p <= 0.10 are shown in italics. A p-value of 0.00 indicates any value less than 0.005. Data selection varies by site, with nighttime (N), daytime (D) and 24-hour (24)  Every surface site selected for this analysis was screened to ensure its remote location, using information provided by the TOAR Surface Ozone Database. The first step was to identify all rural sites with data spanning 1995-2015, using an objective algorithm developed by TOAR [Schultz et al., 2017]. This algorithm takes advantage of several global gridded geo-data sets (human population density, satellite-detected tropospheric column NO 2 , a bottom-up NO x emission inventory, satellite-detected nighttime lights of the world, and satellite-detected land cover), which are cross-referenced with the locations of each site in the TOAR database. Using thresholds of human population, satellite-detected tropospheric column NO 2 , and satellitedetected nighttime lights of the world, the algorithm identified 400 rural sites in the Northern Hemisphere and just 8 in the Southern Hemisphere. Next, the JOIN web interface (https://join.fz-juelich.de) was used to visually inspect the surroundings of each rural site, as revealed by current satellite imagery (provided by Google maps; see Schultz et al. [2017] for further details), to identify the sites in remote settings. All sites are more than 100 km from large urban areas, with the exception of the three Alpine sites and Cape Point. As described below the data at these sites were filtered by time of day (Alpine sites) or wind direction (Cape Point) to limit the influence from nearby urban areas.
Sections 2.3.1 through 2.3.6 provide additional information on selected sites to describe unique data filtering methods, historical observations, or site characteristics that may influence interpretation of the ozone time series. Three additional sites from the rural German boundary layer are described in Section 2.3.7. These sites are not remote, and they are shown to provide context to the ozone trends observed at the high altitude Alpine sites. Finally, Section 2.3.8 described the IAGOS commercial aircraft observations above Western Europe. Focus is placed on the observations at 650 hPa to demonstrate that the IAGOS data reveal the ozone trend in the lower free troposphere above Europe, while ozone observations at the same altitude in the Alps are impacted by the European boundary layer.  [Oltmans, 1981]. Prior to 1975, ozone was measured using an electrochemical concentration cell (ECC) meter, which depends on the oxidation-reduction of potassium iodide to measure ozone. In 1975 the measurement technique was updated to the modern UV-absorption method. As described by Oltmans [1981], concurrent measurements using both methods were conducted to ensure time series consistency. Ozone observations at American Samoa were suspended in January, 2016, but may resume at a later date.

Historical ozone observations from MLO and SPO
Historical ozone observations are available at MLO and SPO and are reported here in order to extend the MLO and SPO time series as far back in time as possible. Ozone was first measured at MLO over a 2-year period from August, 1957to July, 1959[Price and Pales, 1963 using the Regener Automatic instrument, based on the Ehmert technique [Bowen and Regener, 1951]. As discussed by TOAR [Tarasick and Galbally et al., 2019], the Ehmert technique is a wet chemical method that uses a neutral buffered sampling solution containing iodide and thiosulfate (S 2 O 3

2-
). The technique is reliable and compares very well to the UV-absorption method [Tarasick and Galbally et al., 2019]. A Regener Automatic instrument also measured ozone at SPO from February 1961 to July 1963 [Oltmans and Komhyr, 1976a, b].

High and low humidity ozone records at MLO
MLO is located at the interface between the tropics and mid-latitudes, and is therefore influenced by both easterly tropical air masses and westerly mid-latitude air masses [Harris and Kahl, 1990;Oltmans et al., 2006]. The frequency of these air masses varies with time of year and on interannual and even decadal time scales due to unforced short-term climate variability associated with El Niño/Southern Oscillation (ENSO) and the Pacific-North American pattern [Lin et al., 2014]. Ozone is typically greater in the mid-latitude air masses, and the long term trend at MLO is affected by the relative frequency of air mass transport from high and low latitudes in response to climate variability [Lin et al., 2014]. To reduce the noise in the long-term ozone trend due to climate variability we apply an air-mass classification method for examining ozone trends at MLO, previously used by the NOAA State of the Climate reports [Arndt et al., 2018] and described in detail by the Tropospheric Ozone Assessment Report . Co-located dewpoint observations are used to separate the ozone observations into low humidity air samples, representative of mid-latitude air masses that originate at higher altitudes and higher latitudes, and high humidity air samples, representative of tropical air masses from lower latitudes and lower altitudes.

Zugspitze merged ozone records
Ozone was measured at the summit of Zugspitze (2960 m) in the Alps of southern Germany from 1978 to 2010 by the late H.-E. Scheel of the Fraunhofer Institute (IFU) and the Karlsruher Institut für Technologie, Garmisch-Partenkirchen, Germany [Scheel et al., 1997]. These data have been widely reported in the literature [Oltmans et al., 2006;Gilge et al., 2010;Logan et al., 2012;Gaudel et al., 2018] and are available from the TOAR Surface Ozone Database. A chemiluminescence instrument was used from 1978 to 1996, when a UV-absorption instrument was installed [Logan et al., 2012]. The two instruments overlapped for three years and the data from the two methods were found to have similar accuracy, according to WMO/GAW audits [Herzog et al., 1996]. In particular, the long-term stability of the sensitivity of the chemiluminescence instrument (Bendix 8002) has been comparable or even better than the stability of the different UV instruments used at the site. In 2000 the German Federal Environment Agency (UBA) commenced ozone measurements from the Schneefernerhaus station (2656 m), 300 m below the Zugspitze summit [Gilge et al., 2010], however the record in the TOAR Surface Ozone Database begins in 2002. Comparison of the overlapping time period shows that the summit and Schneefernerhaus ozone records are very similar in terms of diurnal, seasonal and interannual variations, with the main difference being that the Schneefernerhaus ozone values are slightly lower [Zellweger et al., 2011]. To estimate the ozone trend at Zugspitze from 1978 to the present, the summit and Schneefernerhaus ozone anomaly records must be merged. A complication arises because reference ozone values spanning the 1995-2010 base period can't be calculated for the Schneefernerhaus record, which only begins in 2002. Therefore, the following procedure was employed. First, monthly mean nighttime ozone at Schneefernerhaus was compared to the corresponding summit values for the period 2002-2010. Due to its lower elevation, ozone at Schneefernerhaus is 1.3 ppbv less than the summit values, on average. Monthly ozone anomalies were calculated at the summit site using the summit's 1995-2010 base period. The summit ozone base period was then reduced by 1.3 ppbv (making the two overlapping times series virtually indistinguishable from each other) and used to calculate monthly ozone anomalies at Schneefernerhaus. The two monthly ozone anomaly records could then be merged, using 1978-2001 data from the summit, and 2002-2016 data from Schneefernerhaus.

Mace Head, Ireland
The monitoring site of Mace Head, Ireland, is situated in a very rural area, 50 km west of Galway, the largest municipality in the region (population, 80,000). While this site on the western edge of Europe often samples aged marine boundary layer air from the North Atlantic Ocean, it is also impacted by air masses containing aged pollution from Europe [Derwent et al., 2007]. Previous studies have filtered the Mace Head data using co-located trace gas observations or back trajectories to remove the influence from distant European emissions [Derwent et al., 2018], resulting in a different trend for the filtered data compared to the full unfiltered record. For the present analysis the full unfiltered ozone record has been utilized, although nighttime data were omitted to avoid observations affected by local nighttime deposition. Therefore, the ozone trends reported for Mace Head are influenced by both the marine boundary layer and aged air masses from Europe. Comparison of the daytime ozone observations to the baseline values reported by Derwent et al. [2018] shows that both data sets have similar seasonal cycles, with the annual mean baseline ozone being approximately 2 ppbv greater than the unfiltered daytime data. Derwent et al. [2018] concluded that coarse resolution global models (e.g. 2° × 2.5°) may have difficulty in simulating the recirculation of aged air masses from Europe to Mace Head, and recommended that these particular models be evaluated against the filtered time series. Looking forward, the most recent generation of global atmospheric chemistry models are run at much higher resolution (e.g. 1° × 1°, or finer) He et al., 2020], and therefore evaluating these models against the full unfiltered ozone record will test their ability to replicate ozone levels in the aged European plumes.

Cape Point, South Africa
Another coastal monitoring site used in this analysis is Cape Point, South Africa. This site is not remote due to its proximity to Cape Town Metropolitan Municipality, 25 km to the north (population 4.0 million). However, the station's principal investigator uses observed wind direction to filter the data set and omit ozone observations influenced by recent emissions from the nearby urban area, equal to 22% of the total data set [Brunke et al., 2004]. The subset of data filtered by wind direction is used for this analysis; no other remote ozone time series in the TOAR database was archived with a full time series and a subset filtered by wind direction. Due to the filtering of this time series by wind direction these ozone observations are representative of the air masses originating to the west of Cape Point. Therefore, when comparing this filtered time series to a global chemistry-climate model we recommend that the model grid cell chosen for evaluation be located to the west of Cape Point, where recent continental outflow is infrequent.

Cape Grim, Tasmania
Australia's Cape Grim Baseline Air Pollution Station is located on a rural coastal bluff in northwestern Tasmania and was established in 1976 [Derek et al., 2016]. The ozone monitoring program commenced in 1976 and surface ozone data with quality assured measurements have been obtained since 1982 [Galbally et al., 2000]. The Cape Grim ozone record is the longest mid-latitude record available from the Southern Hemisphere and it is an important data set for monitoring ozone changes in this part of the world. While the air sampled at Cape Grim generally approaches from the Southern Ocean to the west, the observed ozone can often be modified due to influence from the boundary layer of Australia or Tasmania. Molloy and Galbally [2014] used co-located observations of radon, wind speed, wind direction and condensation nuclei to filter 10 years (1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001) of Cape Grim ozone data to identify baseline transport conditions from the Southern Ocean (wind and radon data are available from the TOAR database). They found that approximately 40% of the data set was representative of baseline conditions and that the positive ozone trend was approximately 8% greater in baseline air compared to the full data set for those years. The data set used here is the recently produced 1982-2017 Cape Grim surface ozone on the WMO GAW/BiPM standard scale, available at the World Data Centre for Reactive Gases (https://www.gaw-wdcrg.org/). This record has not yet been selected for baseline conditions.

Izaña, North Atlantic Ocean
The Izaña high mountain Observatory (28.3° N, -16.5° E, 2367 m), Canary Islands, located in the subtropical North Atlantic Ocean under the descending branch of the Hadley cell (around 30° N), is representative, most of the time, of free-tropospheric background conditions . These conditions are fully assured during nighttime (20:00-08:00 UTC) when katabatic flow conditions predominate. Surface ozone measurements began in 1987 in the framework of what is now the WMO/GAW program. Surface ozone quality assurance is maintained with periodic audits carried out by Empa, The Swiss Federal Laboratories for Materials Science and Technology (6 in total, with a seventh in progress, WCC-Empa (2019)).
The free troposphere background conditions at Izaña are characterized by the presence of air masses from the mid-troposphere over the central and western North Atlantic Ocean, with relatively high ozone values because they are frequently impacted by upper tropospheric air masses throughout their trajectories [Prospero et al., 1995;Oltmans et al., 1996;Rodríguez et al., 2004;Rodríguez and Cuevas, 2013], or by strong stratosphere-troposphere exchange processes, driven by deep lows and cut-off lows north of the Canary Islands [Kentarchos et al., 2000;Cuevas et al., 2013]. The opposite situation occurs with the arrival of Saharan air masses, characterized by low ozone values . The low ozone values are partly due to a low latitude origin within the North African continental boundary layer, and partly due to these air masses having a high mineral dust content, and relatively high water vapor (in contrast to clean free troposphere conditions), that lead to ozone destruction [Andrey et al., 2014]. The impact of each of these two factors on the observed ozone is still an open question. Therefore, tropospheric ozone interannual variability and trends at Izaña are modulated not only by changes in tropospheric chemistry, but also by changes in natural atmospheric processes such as baroclinic events that favor ozone enrichment in the midand upper troposphere, and those that modulate the frequency and intensity of the Saharan Air Layer intrusions over the subtropical North Atlantic, mainly in summer, which cause ozone reduction. These processes are, in turn, modulated by changes in large-scale processes and meteorological patterns, such as mid-latitude Rossby waves, the Saharan Heat Low, and the North African Dipole Intensity [Rodríguez et al., 2015;Cuevas et al. 2017]. The ozone trend at Izaña was previously found to be 0.19 ± 0.05% yr −1 for the period 1988-2009 .

German boundary layer observations
Three long-term ozone monitoring sites located at relatively low elevations in the southern German boundary layer were selected for comparison to the high elevation Alpine sites: Hohenpeissenberg, Schwarzwald-Süd and Pfälzerwald-Hortenkopf ( Table 1). The sites are in rural locations and unlikely to be influenced by fresh NO emissions, which destroy ozone. However, the sites are impacted by aged boundary layer pollution and the longterm trends at these sites reflect ozone levels as they have responded to changing ozone precursor emissions within Germany and neighboring countries.
2.3.8. Commercial aircraft observations of ozone and specific humidity above Western Europe As detailed in Section 3.3 below, Alpine mountain sites, which have previously been used to assess long-term ozone changes in the free troposphere over Europe, show significant impacts from aged regional pollution. A more robust estimate of lower free tropospheric ozone changes can be obtained from frequent commercial aircraft observations. Since 1994 the In-Service Aircraft for the Global Observing System (IAGOS) program (formerly known as MOZAIC, 1994-2011) has measured ozone worldwide, using instruments onboard commercial aircraft of internationally operating airlines Petzold et al., 2015; http://www.iagos.org]. Ozone is measured using a dual-beam UV-absorption monitor (time resolution of 4 seconds) with an accuracy estimated at about ± (2 nmol mol -1 + 2%) . Because most IAGOS aircraft have belonged to airlines based in Europe since the program began in 1994, Western Europe is the program's most frequently sampled region of the world. Above northwestern Europe (0°-15° E, 47°-55° N) IAGOS aircraft measured 34,600 ozone profiles between 1994 and 2016, with 99% of profiles from Frankfurt, Paris, Munich, Brussels, Dusseldorf and Amsterdam. The lower tropospheric portions of the profiles have been shown to be regionally representative of ozone across Western Europe . The sampling frequency varies according to airline schedules, but on average, four profiles are recorded somewhere in this region every day. Previous analysis has shown that 12 profiles per month can produce monthly mean ozone values in the free troposphere with an uncertainty less than 5-10% [Saunois et al., 2012], which demonstrates that the high sampling frequency by IAGOS aircraft above Western Europe can easily produce accurate monthly mean ozone profiles. IAGOS aircraft can take-off and land at any time of day and all data are used in this analysis. No diurnal ozone cycle occurs in the free troposphere above Europe (above the 750 hPa level), although a clear ozone cycle occurs in the boundary layer, and is strongest below 950 hPa [Petetin et al., 2016]. For this analysis ozone trends are estimated for several pressure levels from 950 to 250 hPa, which avoids the lowest layers with very strong diurnal cycles. Any data points in the stratosphere were removed from the analysis, as determined from the potential vorticity values associated with each ozone profile, and available from the IAGOS data portal (https://doi. org/10.25326/20). Particular attention is given to the ozone trend at 650 hPa because this pressure surface resides in the lower free troposphere, and because it is close to the average pressure of the high elevation ozone monitor at the summit of Jungfraujoch (located at 3580 m above sea level with an annual average pressure of 655 hPa, based on observations by MeteoSwiss, as reported in the TOAR Surface Ozone Database).
Following the methods of Schultz [1995] and Weiss-Penzias et al. [2006] this analysis also takes advantage of IAGOS observations of relative humidity, temperature and pressure to calculate vertical profiles of specific humidity. Relative humidity was measured by the MOZAIC Capacitive Hygrometer (1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011) Neis et al., 2015a] and the IAGOS Capacitive Hygrometer (2011-present) [Neis et al., 2015b]. These observations are used to estimate the quantity of European boundary layer air at the summit of Jungfraujoch. We used the observed monthly median specific humidity values in the boundary layer and at 650 hPa, as measured by IAGOS ( Figure S-2), to answer the question: How much air from the European boundary layer must be mixed with free tropospheric air at 650 hPa to achieve the same specific humidity values as Jungfraujoch? These calculations were based on simple linear mixing ratio relationships, with the assumption that specific humidity is a conserved tracer. However, specific humidity is not entirely conserved within an air mass ascending from the boundary layer to a pressure level of 650 hPa, because some of the moisture will be lost through precipitation. Therefore, estimates of the percentage of boundary layer air at the summit of Jungfraujoch are considered to be a lower bound.

Trends since the early 1970s
Before focusing on ozone trends based on monthly anomalies, we first illustrate the range of ozone values that can be found at remote locations around the world, in terms of the seasonal cycle, interannual variability, multi-year fluctuations and long-term trends. We examine ambient monthly mean ozone values at the four sites that provide the longest continuous ozone time series at remote locations (40+ years), beginning in either 1973 or 1975: Barrow Atmospheric Baseline Observatory, Mauna Loa Observatory (MLO), American Samoa Observatory and South Pole Observatory. Trends from these sites are periodically reported in the literature [Oltmans et al. 1994[Oltmans et al. , 1998[Oltmans et al. , 2006Arndt et al., 2018;Gaudel et al., 2018], and here we include the most recently available data, along with the reliable historical observations from MLO (1957MLO ( -1959 and South Pole (1961Pole ( -1963 to evaluate ozone changes over six decades. These time series are shown in Figure 3 with straight lines fit through the data, based on simple linear regression for illustra-tive purposes; for time series of this length, the slope of the trend line through the ambient ozone values is very similar to the slope estimated from monthly anomalies, as reported in Tables 2 and 3. Monthly ozone anomalies for each of these sites are shown in Appendix S-B (see Supplemental Material). The two Northern Hemisphere sites show increasing ozone since the 1970s, with the trend at MLO in the tropics being twice as strong as the trend at Barrow in the Arctic. In the Southern Hemisphere, trends at Samoa (tropical South Pacific Ocean) and South Pole are similarly positive and weak, but with South Pole having a much higher p-value (p = 0.22). Extension of the trend lines at MLO and South Pole back to the time of the historical observations shows that ozone at South Pole has changed little since the early 1960s, while ozone at MLO has increased by roughly 50% since the late 1950s (based on an average value of ~3 0 ppbv in the late 1950s and 45 ppbv in 2018). This increase, on the northern edge of the tropics, is similar to the northern mid-latitude increases reported by Tarasick and Galbally et al. [2019], who compared a range of historical ozone observations at rural sites (made between 1896 and 1975) to modern observations. Figure 3 also shows that ozone in the low humidity air masses (biased toward a mid-latitude origin) is much greater than ozone in the high humidity air masses (biased toward a tropical origin); see Section 2.3.1 for a description of the method for selecting low and high humidity air samples at MLO. Table 2 shows that ozone in the low humidity air masses has increased at twice the rate of ozone in the high humidity air masses. Specifically, since 1974, ozone has increased in the low humidity air masses at MLO at the rate of 2.1 ± 0.4 ppbv decade -1 , for a total increase of 9.1 ppbv or 24% , with the lowess smoother showing only small deviations from the linear regression line (Appendix S-B). As discussed by , the low humidity air masses at MLO are more likely to represent mid-latitude air originating to the west (i.e. Asia) than the high humidity air masses which are more likely to come from the tropics and from the east (i.e. Central America). Ozone precursor emission rates are changing rapidly across Asia due to economic forces and air quality control policies, with increases in some regions and decreases in others Lin et al., 2017;Miyazaki et  2018]. Therefore, further modelling work is required to quantify the impact of changing Asian emissions on the MLO ozone trend.

Trends across all remote locations
Focus is now shifted to all 27 remote surface sites in an attempt to assess long-term surface ozone trends on the global scale. Trends are also assessed for the IAGOS aircraft observations at 650 hPa above Western Europe, and for the low and high humidity air masses observed at MLO (see anomaly time series plots in Appendix S-B). The lowess fits to each of these time series are shown in Figure 4, comparing the ozone fluctuations for all sites in northern high latitudes, northern mid-latitudes, the tropics and the southern extra-tropics. The lowess fits demonstrate the variability that is found among sites even in the same latitude bands. For example, in the northern polar region both Zeppelin and Tustervatn show a strong increase of ozone from 1990 until 2004, followed by a strong decrease, a pattern not seen at the other four sites. At northern midlatitudes most sites show an increase of ozone during the 1990s, but since 2000 the time series diverge with some showing increases and others decreases. In the tropics the dry air masses at MLO are unique as they show a steady increase from 1973 to 2018, while Minamitorishima has a repeated cycle not seen at the other sites. In the southern extra-tropics the time series have diverged since 2010 with half showing increasing ozone and the other half showing decreases. Figure 5 shows the ozone trends for each time series with an arrow-based approach, used previously by Cooper et al. [2012] and by TOAR [Fleming et al., 2018;Gaudel et al., 2018;Mills et al., 2018]. The figure reports trends (based on monthly anomalies) through the most recently available year, but beginning in two different time periods: 1) 21 sites that began between 1974 and 1990; and 2) all 27 sites since 1995. The corresponding numeric values of the trends, including confidence intervals and p-values, are reported in Tables 2 and 3. Because trends since 1995 are available for all sites, and because this time period is longer than 20 years (necessary to overcome the influence of periodic climate variability [Barnes et al., 2016]) the main conclusions from this analysis are based on trends since 1995. As a first step towards periodically documenting 21 st century ozone trends, and for comparison to the global trends reported by TOAR-Climate, Figure    Northern Hemisphere trends since 1974-1990 are mainly positive with 11 of 17 time series showing positive trends in the range of 0.6-2.3 ppbv decade -1 (p-values less than 0.10). Two time series have negative trends (p-values less than 0.10) occurring in the Caucasus Mountains of southern Russia (Kislovodsk) and the high elevations of the western USA (Gothic in Colorado). Only 4 sites are available from the Southern Hemisphere, all tending towards increasing ozone, with the strongest trends at the midlatitude sites of Cape Grim and Cape Point (0.4-1.6 ppbv decade -1 ; p-value < 0.05). Trends are positive but weak at American Samoa (0.2 ± 0.2 ppbv decade -1 ; p = 0.05) and at South Pole (0.3 ± 0.4 ppbv decade -1 ; p = 0.16).
Northern Hemisphere trends since 1995 are nearly evenly split with 5 of 23 time series having positive trends (p < 0.10), and 5 sites having negative trends (p < 0.10). The remaining sites show no clear indication of a trend. The positive trends occurred in the Arctic, western Ireland, Western Europe (IAGOS observations), central China and in the low humidity air masses at MLO, while the negative trends are clustered in the western USA and the Alps. Section 4 provides some discussion of the variable ozone trends at northern mid-latitudes and places these trends in the context of free tropospheric trends, which are less variable and predominantly positive. To understand if the trends vary seasonally, Supplemental Figure 5: Decadal ozone trends based on monthly anomalies at remote surface sites. Also shown are the IAGOS trends at 650 hPa above Europe (∇). All trends end with the most recently available year, but begin in 1973-1990 (top) and 1995 (bottom). Arrows in the right panel indicate the ozone rate of change (ppbv decade -1 ), and all trend values correspond to the data reported in Table 2. Symbol colors indicate the p-value associated with the trend at each site. High elevation sites (>1500 m a.s.l.) are indicated (∆), and MLO trends are shown for low humidity (top arrow, labeled 'L'), all (middle arrow, labeled 'A') and high humidity (bottom arrow, labeled 'H') air masses, and offset by latitude for clarity. DOI: https://doi.org/10.1525/elementa.420.f5 Figure S-4 shows results for all four seasons. In general, the latitudinal variation of trends is similar in all seasons, however the magnitude and sign of the trends can vary greatly from one season to the next. For example, in the Northern Hemisphere negative trends dominate during June-July-August, while positive trends dominate during December-January-February. Since 1995 seven time series are available from the Southern Hemisphere with five having positive trends (p < 0.10), one site (El Tololo) with little or no change, and only the site of Ushuaia in southern Argentina showing a negative trend (p < 0.05). Lu et al. [2019] used a global chemical transport model to explore the impact of the poleward expansion of the Southern Hemisphere Hadley circulation on tropospheric ozone across the Southern Hemisphere. They determined that the observed ozone increases cannot be explained by increases of anthropogenic ozone precursor emissions. Rather, the poleward expansion of the Southern Hemisphere Hadley circulation has changed transport patterns across Southern mid-and high latitudes, and it has also shifted the flux of stratospheric ozone into the troposphere toward higher latitudes. The shifting transport patterns and their redistribution of ozone precursors has produced an environment at mid-and high latitudes that is more conducive to photochemical ozone production, which may partially explain the positive ozone trends at four of the Southern Hemisphere extratropical sites. Further work is required to understand why the trend at Ushuaia is opposite to the other Southern Hemisphere sites. This coastal site is on a peninsula with a small airport runway 500 m to the north and the city of Ushuaia (population 75,000) 5 km to the north. While fresh anthropogenic emissions can affect the site, the frequency is low as the winds are predominantly westerly, guided by the mountainous terrain bordering the Beagle Channel [Adame et al., 2019]. Furthermore, the station operator removes observations from the ozone record when the site is impacted by air mass transport from the city. Adame et al. [2019], conducted a transport climatology for Ushuaia based on back-trajectories and found that during all seasons the site is impacted by air masses transported from the mid-and high latitudes of the South Pacific Ocean and from Antarctica. This extensive source region is virtually devoid of anthropogenic or biomass burning emissions, and the seasonal cycle of ozone and CO is consistent with photochemical processes in the remote marine boundary layer where low NO x concentrations lead to net ozone destruction [Galbally et al., 2000]. Therefore, the trend at this site is likely to have greater impact from climate variability than from changes of ozone precursors.
Since 2000 Northern Hemisphere ozone trends have a similar pattern to the trends beginning in 1995, with a range of negative and positive values, but with the rate of change becoming more extreme in some cases ( Figure  S-3). The wider spread in the range of trend values can be partly explained by the shorter length of the time series (15-18 years) as short-term climate variability (e.g. El Niño/Southern Oscillation) modulates ozone at remote locations due to periodic shifts of transport pathways [Barnes et al., 2016;Fischer et al., 2011;Lin et al., 2014]. Timescales greater than 20 years are typically required to overcome the influence of climate variability. As in the Northern Hemisphere, the pattern of ozone trends in the Southern Hemisphere is similar to that when the time series began in 1995, but with more variability and greater p-values.

The relationship between low-elevation, mountaintop and free tropospheric ozone trends above Europe
This section demonstrates that the IAGOS commercial aircraft observations can be used to quantify the ozone trend in the lower free troposphere above Western Europe, whereas unfiltered nighttime ozone observations at the high Alpine sites are impacted by the European boundary layer and therefore do not reflect the ozone trend in the lower free troposphere of Western Europe. The Alpine sites of Zugspitze, Jungfraujoch and Sonnblick have provided important ozone observations from the center of Western Europe since 1978, and the trends have been analyzed to gain insight into the changes of baseline ozone that flows into Europe [Zanis et al., 1999;Oltmans et al., 2006;Gilge et al., 2010;Cui et al., 2011;Logan et al., 2012;Parrish et al., 2012;Cooper et al., 2014;. Because these sites are located 2800-3600 m above sea level, some past studies have considered these sites to be representative of the lower free-troposphere. However, many studies of trace gas and particulate matter observations at these and other Alpine sites, in conjunction with regional-scale transport analyses, have clearly shown that Alpine peaks are frequently impacted by emissions from the European boundary layer, especially in the summer months [Baltensperger et al., 1997;Carnuth et al., 2002;Zellweger et al., 2003;Henne et al., 2004;Steinbacher et al., 2004;Henne et al., 2005;Kaiser et al., 2007;Folini et al., 2009;Zellweger et al., 2009;Henne et al., 2010;Griffiths et al., 2014;Collaud Coen et al., 2018;Yuan et al., 2019a]. Previous work has questioned the reliability of the Jungfraujoch observations prior to 1994 [Zanis et al., 1999;Logan et al., 2012], however the remainder of this analysis focuses on data since 1994.
The Alpine sites are not truly remote because they are located in the center of Europe. However, local influences can be avoided by limiting the trend analysis to nighttime observations, which excludes the daytime upslope winds that transport air from the valleys up to the summits. The nighttime air masses may contain residual air from the previous day's boundary layer if the boundary layer exceeded the elevation of the mountaintop, as demonstrated at North American sites like Mt. Bachelor, Oregon (2.8 km above sea level) [Weiss-Penzias et al., 2006] and Whistler, British Columbia (2.2 km above sea level) [Gallagher et al., 2012]. Influence from the European boundary layer at the Alpine summits is most common in the warm months, as determined from an analysis of the boundary layer tracer, radon, which has elevated levels at Jungfraujoch during the months of April-September [Griffiths et al., 2014]. Similarly, studies have documented the growth of the Alpine boundary layer to more than 1 km above the summit of Zugspitze [Carnuth et al., 2002], and a weekly cycle of CO 2 at the summit of Zugspitze, caused by regional scale emissions [Yuan et al., 2019a,b]. Moist Alpine boundary layer air has also been observed in the lower free troposphere above Milan, Italy, after vertical transport by mountain venting processes and subsequent advection from the Alps to Milan [Henne et al., 2005].
In contrast to the high Alpine sites, the IAGOS observations at 650 hPa are above the European boundary layer [Petetin et al., 2016], as described in Section 2.3.8. This difference is illustrated by Figure 6 which compares monthly values of observed specific humidity (based on a 1995-2014 climatology) at the summit of Jungfraujoch to specific humidity at 650 hPa above Western Europe, as reported by IAGOS commercial aircraft (at the same altitude as Jungfraujoch). Nighttime specific humidity values are greater at Jungfraujoch during all months, indicating a greater influence from the moist boundary layer at Jungfraujoch, with an even larger difference for daytime conditions. The enhancement is greatest during the warm months, matching the Jungfraujoch radon analysis by Griffiths et al. [2014]. Comparison of European boundary layer specific humidity values to the values at 650 hPa and at Jungfraujoch (see Section 2.3.8) indicates that approximately 18% of the air at Jungfraujoch is from the boundary layer during the daytime or nighttime hours of winter (December-February), increasing to 33% during the night and 45% during the day in the summer months (June-August). This measurement-based analysis demonstrates that the Alpine observations are not fully representative of the lower free-tropospheric air masses observed by the IAGOS aircraft, especially in summer. Figure 7 shows ozone trends above Western Europe (1994-2017) as measured by IAGOS aircraft on 10 pressure levels from the lower troposphere to the upper troposphere. The monthly means of ambient ozone are shown to illustrate the pronounced seasonal cycle of ozone that occurs above Western Europe, with a strong peak in summer at all levels (the data were filtered according to potential vorticity to remove stratospheric observations from the highest levels). Based on monthly anomalies, ozone increased in the mid-and upper troposphere (600-250 hPa) at the rate of 1.3 to 4.5 ppbv decade -1 (p-values less than 0.05), consistent with ozone increases above Germany and Belgium as observed by ozonesondes Van Malderen et al., 2016]; trends in the lower troposphere (950-700 hPa) are also positive, but weaker (1.0-1.2 ppbv decade -1 ; p-values less than 0.05). The trend at 650 hPa, at the same altitude as Jungfraujoch, is 1.3 ppbv decade -1 (p = 0.00) ( Table 2), which is similar to the trends in the mid-tropospheric layers above. Figure 8 directly compares the IAGOS trend at 650 hPa to the trends at the three Alpine sites since 1995, plus the trends at three low-elevation rural sites in southern Germany, which are indicative of the polluted European boundary layer (see also Table S-1). IAGOS is the only time series with a clear positive trend, whereas ozone at Jungfraujoch is relatively unchanged since 1995, and ozone has decreased at Sonnblick, Zugspitze and the three low-elevation sites. These trends are based on data from all months of the year, and as shown next, the difference between IAGOS and the Alpine sites is driven by the summer months. Figure 9 shows the ozone anomaly trends at the same sites as in Figure 8, but for the warm months of May-August when the European boundary layer is deep, and for the cold months of November-February when the boundary layer is shallow. During the cold months differences among the sites are small. IAGOS and Jungfraujoch show small positive trends, while ozone is flat at Zugspitze and Sonnblick, similar to the low-elevation sites. However, there is a greater difference between IAGOS and the other  sites during the warm months. Ozone has decreased since 1995 at the three low-elevation sites, consistent with widespread summertime ozone decreases observed across much of Europe, as illustrated by the TOAR analysis in Supplemental Figure S-5. The low-elevation sites show the high surface ozone anomalies that occurred across Europe during the heatwave of August, 2003 [Vautard et al., 2005]. The trends at the Alpine sites are similar and they also show the summertime peak in 2003. In contrast the IAGOS trend (above the boundary layer) is positive but weak (0.6 ppbv decade -1 ; p = 0.34), and it shows no peak during 2003 (although IAGOS observed strong ozone enhancements below 650 hPa during August, 2003 [Tressol et al., 2008]). Further analysis is provided in Supplemental Figure S-6, to determine if the trends of the high and low ozone extremes (5 th and 95 th percentiles) are similar between IAGOS and Jungfraujoch. During the warm months there are no clear similarities between the two sites, but during the cold months positive trends are strongest for the 5 th percentiles (1.0 -1.9 ppbv decade -1 ), at both sites.
In conclusion, the negative trends at Sonnblick and Zugspitze, based on data from all months (Figure 8), are driven by the summertime decreases (Figure 9), which are also seen at the low elevation sites in southern Germany. The relatively flat trend at the higher elevation site of Jungfraujoch is in between the positive IAGOS trend and the negative trends at Sonnblick and Zugspitze (Figure 8). This observation is consistent with the analysis of Collaud Coen et al. [2018], who found less influence from the European boundary layer at Jungfraujoch than at Zugspitze or Sonnblick. While the IAGOS data provide the most straight-forward assessment of lower freetropospheric ozone trends above Europe (1.3 ± 0.8 ppbv decade -1 since 1994), it is still feasible to filter the Alpine data to isolate observations made in the free troposphere using co-located observations of trace gases and particulate matter Yuan et al., 2018]. As a basic illustration, we used specific humidity observations at Jungfraujoch to isolate the low humidity air samples, which should be biased towards free tropospheric conditions. We found that during the warm months, the trend at Jungfraujoch approaches that of IAGOS when the sampling is restricted to low humidity air samples (Table S-2).

Discussion
The goal of this analysis has been to examine long-term ozone trends at as many remote surface sites as possible, for the purposes of understanding multi-decadal regionalscale changes of surface ozone and for the evaluation of global atmospheric chemistry models [Young et al., 2018]. However, because these 27 sites only cover a relatively small portion of the globe, they do not provide a comprehensive overview of global surface ozone trends. Current research is attempting to quantify the extent to which the global ozone monitoring network represents the global ozone distribution at the surface and in the free troposphere [Sofen et al., 2016;Brown-Steiner et al., 2018;Chang et al., 2019;Young et al., 2018]. Sofen et al. [2016] gathered present-day ozone observations from approximately 2400 regionally-representative monitoring sites worldwide, including most of the sites from the present analysis.
Using a global atmospheric chemistry model they determined that these sites are representative of 25% of the Earth's surface area (33% in the Northern Hemisphere and 18% in the Southern Hemisphere). The coverage was limited despite the large number of sites because most sites were clustered in North America, Europe and Japan. Therefore, we conclude that the trends reported by the present analysis are representative of far less than 25% of the Earth's surface.
To place the 27 remote sites in the context of the global ozone monitoring network, Figure 10 shows the global distribution of observed surface ozone using all available monitoring sites (approximately 5000) from the TOAR Surface Ozone Database, below 2000 m elevation, and averaged onto a 2° × 2° grid for the period 2005-2014 [Schultz et al., 2017]. These data were retrieved from the TOAR archive (https://doi.pangaea.de/10.1594/ PANGAEA.876108), which contains many observationbased data products (data through 2014) to facilitate research on regional and global ozone distributions. The ozone metric shown in Figure 10 is the monthly mean of the maximum daily 8-hour average, selected because the majority of these sites (urban, suburban and rural) experience diurnal cycles [Strode et al., 2018], and this metric avoids the nighttime and early morning hours when surface ozone is at a minimum due to surface deposition under stable conditions. The value shown for each grid cell is the annual maximum of the 12 monthly values to reveal the time of year and the ozone values associated with peak photochemical ozone production, which is strongest in low and mid-latitudes (plots of standard seasonal ozone values from the TOAR database are provided by Schultz et al. [2017] and Gaudel et al. [2018]). In terms of the seasonal ozone peak, the greatest ozone values are found from April to August at northern mid-latitudes (25°-45° N), primarily in the southwestern USA, the Mediterranean region, northern India, eastern China, South Korea and Japan. Ozone values are much less in the tropics and southern mid-latitudes. Similar results are found when the data are limited to just the rural sites ( Figure S-7). With respect to latitude, peak ozone levels occur between 20°-50° N, the same zone with the greatest diversity of ozone trends seen at our remote sites ( Figure 5) and in the polluted boundary layer of North America, Europe and East Asia (Figure S-5;Chang et al., 2017), consistent with the expectation that photochemical ozone production is most active in this region and is responding to changing precursor emissions Ziemke et al., 2019].
Focusing on the variable ozone trends at northern mid-latitudes, one might conclude that overall, ozone decreases are more common than increases since 1995 ( Figure 5). But as discussed above, the number of available sites is too small to construct an area-weighted trend that is representative of the entire latitude band. An obvious shortcoming of this analysis is the lack of long-term rural monitoring sites across South and East Asia where ozone increases are expected due to increasing ozone precursor emissions Lin et al., 2017;. Relying on the peer-reviewed literature and the TOAR surface ozone database (with data through 2014), Gaudel et al. [2018] concluded that surface and lower tropospheric ozone has generally increased across East Asia over the past two decades. This conclusion is reinforced by a new study that has provided the first overview of China's extensive surface ozone monitoring network (1600 nonrural sites) for the period 2013-2017 [Lu et al., 2018]. Over this short 5-year period the data show a broad increase of ozone across the country. An update to this trend spanning the years 2013-2018 is provided in Supplemental Figure S-8, showing that ozone was similarly high in 2018. Our analysis also lacks long-term rural ozone monitoring at the west coast of North America, which would indicate the impact of rising Asian ozone on the eastern side of the North Pacific Ocean. Focusing on annual average baseline ozone observations, Jaffe et al. [2018] found ozone increases at the summit of Mt Bachelor, Oregon (2763 m above sea level), from 2004 to 2016, but decreases over the same period at the marine boundary layer site of Trinidad Head on the northern California coast, located 850 km to the southwest of Mt. Bachelor (neither site has data prior to 2004). The opposing ozone trends at these baseline monitoring sites in the same region of the western USA illustrate the variable nature of ozone trends at northern mid-latitudes.
In terms of its contribution to radiative forcing, ozone in the free troposphere has a greater impact than ozone at the surface [Lacis et al., 1990;Forster and Shine, 1997;Worden et al., 2008Worden et al., , 2011Bowman et al., 2013;Kuai et al., 2017]. Supplemental Table S-3 summarizes recent studies that have quantified trends of free tropospheric ozone and tropospheric column ozone at multiple locations around the world based on ozonesondes and IAGOS commercial aircraft observations Cooper et al., 2014;Tarasick et al., 2016;Van Malderen et al., 2016;Cohen et al., 2018;Gaudel et al., 2018], and serves as an update to a similar table that appeared in the Fifth Assessment Report of the Intergovernmental Panel on Climate Change [IPCC, 2013; see Table 2.SM.1 in Hartmann et al., 2013]. Focusing on time series at least 20-years in length, Table  S-3 demonstrates a general increase of ozone in the free troposphere of both hemispheres, although a few regions show no increase, or even a decrease in a few isolated layers of the troposphere. For the longest time series (>30 years) the trends can vary across shorter time periods, depending on the decades chosen for analysis Tarasick et al., 2016]. These findings are consistent with the earlier assessment by IPCC [IPCC, 2013]. The ozone increases are especially prevalent across the northern tropics and mid-latitudes, with the strongest increases above South and East Asia in the range 2-6 ppbv decade -1 , which are generally far greater than trends at the remote surface sites (Table 2). Clearly, the highly variable positive and negative trends at the surface, as shown in Figure 5, are not indicative of ozone trends in the free troposphere, which are overwhelmingly positive. This comparison is a further demonstration that scattered surface ozone trends cannot be assumed to be representative of ozone trends in other regions and levels of the troposphere. While the TOAR database reveals the observation-based surface ozone distribution from all available monitoring sites (see Figure 10, ), very large data gaps exist and models must be used for a complete global picture. The surface ozone distribution from the Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP) multi-model ensemble generally captures the same ozone peaks as the TOAR observations but with a high bias [Young et al., 2013a[Young et al., , 2013b[Young et al., , 2018. The ACCMIP ensemble also indicates that the northern mid-latitude ozone enhancement (Figure 10) circles the globe, and is present in regions without ozone observations. Comparison to model simulations of pre-industrial ozone indicates that the northern mid-latitude ozone enhancement is due to anthropogenic emissions [Young et al. 2013a[Young et al. , 2013b. Over time, the region of the Northern Hemisphere with the greatest net ozone production due to anthropogenic emissions has migrated southward as emissions have shifted from North America and Europe to Asia. A model study determined that emissions changes (in terms of magnitude and location) increased lower tropospheric ozone in the 10°-30° N latitude band by 3-5 ppb over the period 1980-2010 . This modeled rate of increase (1.0-1.7 ppbv decade -1 ) is similar to the observed rates at MLO and Izaña ( Table 2). No other long-term remote monitoring sites are available from high elevations in the 10°-30° N latitude band, however additional support is provided by IAGOS commercial aircraft profiles, which revealed lower tropospheric ozone increases above southern India and Southeast Asia from the period 1994. Furthermore, the Hok Tsui coastal site just south of Hong Kong has measured ozone in this very polluted region of southern China since 1994. The data have been filtered by Wang et al. [2009;2019] to focus on marine air masses from the South China Sea, revealing a strong ozone increase of 3.6 ppbv decade -1 .
Finally, we show how newly available ozone observations (since 2010) have updated our understanding of ozone variability at the three high elevation Alpine sites. Previous analysis of long-term ozone trends at these sites concluded that ozone increased throughout the 1990s, peaked in the early 2000s and then decreased until the time series ended in 2009-2011 [Parrish et al., 2012. The rates of decrease were strongest in summer and were determined with quadratic fits. Our analysis, based on the lowess smoother, also shows ozone maxima in the late 1990s or early 2000s, followed by decreasing ozone until approximately 2010 (see time series plots in Appendix S-B in the Supplemental Material). However, the newly available data through 2016-2018 show that ozone has remained approximately level at the three Alpine sites since 2010, with a slight increase since 2015 at Jungfraujoch. When focusing on just the summer months, ozone in recent years is consistent with the overall decreasing trends observed at these sites since the late 1990s or early 2000s (Section 3.3 and Table S-1). In contrast, ozone in the other seasons has remained level or increased slightly (e.g Jungfraujoch in winter, Table  S-1), resulting in annually averaged ozone levels that have changed little since 2010. Extension of these time series with future observations, and application of the lowess smoother to the new data, can quickly reveal if the nearlevel ozone values of recent years will continue.

Conclusions
This study has estimated surface ozone trends at 27 globally distributed remote locations (20 in the Northern Hemisphere, and 7 in the Southern Hemisphere), focusing on continuous ozone time series that extend from the present back to at least 1995. The analysis builds on the original TOAR effort by providing multi-decadal trend analysis at 22 remote sites that were not explored by . To avoid the confounding influence of ozone's seasonal cycle and periods of missing data, the trends are based on monthly mean ozone anomalies rather than ambient ozone mixing ratios, using a statistical method that was not employed by TOAR . Since 1995, ozone trends at the available Northern Hemisphere sites are nearly evenly split between positive and negative trends, with more than half of the trends being relatively weak, between 1 and -1 ppbv decade -1 (p-values typically greater than 0.10); the Southern Hemisphere had positive trends at 5 out of 7 sites. In both hemispheres, the robust positive trends (p-values less than 0.10) are in the range of 0.5-2 ppbv decade -1 . The longest time series available are from South Pole, where ozone is essentially unchanged since the early 1960s, and Mauna Loa, Hawaii where ozone has increased by roughly 50% since the late 1950s (~2.3 ppbv decade -1 ).
Particular focus was placed on the high elevation Alpine monitoring sites, which were compared to IAGOS commercial aircraft observations in the lower free-troposphere above Western Europe. The Alpine sites are frequently impacted by the polluted European boundary layer, especially in summer, and can only be used to quantify lower free tropospheric ozone if the time series are carefully filtered to avoid observations impacted by the boundary layer. In contrast, the IAGOS observations at 650 hPa are above the European boundary layer, and show that ozone in the lower free troposphere above Western Europe has increased at the rate of 1.3 ± 0.8 ppbv decade -1 since 1994. Stronger increases were detected in the mid-and upper troposphere (1.5-4.5 ppbv decade -1 ).
We emphasize that while we used all long-term remote monitoring sites available, the data are only representative of much less than 25% of the global surface area, and therefore they cannot be used to produce a global mean surface ozone trend. Full understanding of global surface ozone trends based on observations would require a major expansion of the surface ozone observation network. Based on newly developed methods [Sofen et al., 2016;Weatherhead et al., 2017;Chang et al., 2019] the locations of these additional sites could be carefully planned to maximize their spatial representativeness.
We conducted a review of the recent peer-reviewed literature to summarize free tropospheric (primarily in the range 700-300 hPa) and tropospheric column ozone trends around the world, beginning in 1995 or earlier (Table S-3). The results revealed a general increase of ozone in the free troposphere of both hemispheres, although a few sites showed no increase, or even a decrease in a few isolated layers of the troposphere, and the trends can vary depending on the time period chosen for analysis. Since the mid-1990s free tropospheric ozone increases are especially prevalent across the northern tropics and mid-latitudes, with the strongest increases above South and East Asia in the range 2-6 ppbv decade -1 , which are generally far greater than trends at the remote surface sites, either at sea level or on mountaintops. We conclude that the highly variable positive and negative trends at the surface are not necessarily indicative of ozone trends in the free troposphere, which have been overwhelmingly positive since the mid-1990s.
The ozone time series presented in this analysis provide an important benchmark for the evaluation of multi-decadal chemistry-climate model simulations of tropospheric ozone. We recommend that these remote sites constitute one component of a three-step approach: 1) Because ozone has a greater impact on radiative forcing in the free troposphere than at the surface, the models should be evaluated against in situ free tropospheric observations from ozonesondes and commercial aircraft [Liu et al., 2013;Cohen et al., 2018;Gaudel et al., 2018;Stauffer et al., 2018;Tarasick and Galbally et al., 2019]. 2) Evaluating coarse-resolution chemistry climate models in regions of high ozone precursor emissions can be challenging, due to the difference in scale between a model grid cell and ozone observations at a single polluted location. For this purpose TOAR has provided pre-processed gridded products at 2° × 2° and 5° × 5° resolution based on all, rural and urban observations [Schultz et al., 2017;Young et al., 2018]. 3) Ozone production and loss proceed at different rates at remote surface sites than in the free troposphere, and the ozone time series presented in this paper are ideal for evaluating model surface processes at remote locations [Young et al., 2018;Lin et al., 2019].
Care is required when selecting the appropriate model grid cell and vertical layer for comparison to these remote surface sites. For example, the Cape Point data were filtered for onshore flow conditions when transport is from the west and south. Therefore to focus on conditions above the South Atlantic Ocean a model grid cell should be chosen that is to the west of Cape Town where recent continental outflow is infrequent. For a mountaintop site, such as Jungfraujoch, the actual elevation of the sta-tion will likely be far above the coarse model terrain, and therefore a grid cell should be chosen that does not reside in the free troposphere above the Alpine boundary layer. However, this complication does not arise for Mauna Loa because the terrain of Hawaii is not resolved by a coarseresolution global-scale model. Therefore, a direct comparison can be made between the nighttime ozone observations at Mauna Loa (representative of the lower free troposphere) and a free tropospheric model grid cell at the same elevation. Selection of the most appropriate grid cell for model evaluation will vary by model. We recommend that a range of grid cells surrounding the sampling coordinates be examined, in order to identify the cell that most closely matches the sampling environment of the remote monitoring site. Once the appropriate model time series has been selected for evaluation against the observed ozone time series, we recommend the following straightforward statistical analysis to determine if the modeled long-term ozone trend, bias and interannual variability are consistent with the observed values. 1) Calculate monthly ozone anomalies for the model time series using the same reference period, and drop any monthly values that do not appear in the observed time series. 2) Compare the slopes of the observed and modelled regression lines using the Student's t-test based on the standard error of regression models, paying special attention to the equality of the fitted variances associated with the slopes, as described in the tutorial by Andrade and Estévez-Pérez [2014]. If nonlinear changes are a dominant feature of the trend, as indicated by the lowess smoother, we recommend the application of piecewise linear regression. However, the time series segments must be long enough to arrive at meaningful parameter estimates. The piecewise linear regression method is described in the Supplemental Material, and a recent example is provided by Banerjee et al. [2020]. 3) This same method can be used to quantify the model bias. 4) The standard R-squared metric can be used to quantify the variance in the observations that is explained by the model. 5) There is no ideal metric for comparing two non-linear trends (e.g a comparison of the lowess smoother), however the simple root mean squared error (RMSE) can be used for this purpose, as can similar model skill measures [Krause et al., 2005]. 6) Additional guidance can be found in the textbooks by Box et al. [2015], Chandler and Scott [2011] and Judd et al. (2011).
The field of climate change research has developed a range of widely accepted statistical methods for the analysis of trends, extreme events and model evaluation [von Storch and Zwiers, 1999;Zwiers and von Storch, 2004;Hennemuth et al., 2013], as well as common practices for communicating uncertainty in assessment findings [Mastrandrea et al., 2011]. Statistical analysis in the field of atmospheric chemistry has not yet reached this level of maturity. Development of a suite of best statistical practices for atmospheric chemistry would be of great value to the field, and is an endeavor that could be effectively addressed by the next phase of the Tropospheric Ozone Assessment Report (https://igacproject.org/activities/TOAR/TOAR-II).

Data accessibility statement
The surface ozone observations reported in this paper, from 1970 to the present, can be downloaded from the TOAR Surface Ozone Database via the JOIN web interface: https://join.fz-juelich.de. The historical ozone observations from MLO (1957MLO ( -1959 and SPO (1961SPO ( -1963 are available from NOAA's Global Monitoring Laboratory: ftp://aftp.cmdl.noaa.gov/data/ozwv/SurfaceOzone/Historical/. IAGOS aircraft observations are available from: https://doi.org/10.25326/20. In addition, the monthly surface ozone values upon which the trends were calculated can be easily downloaded as ASCII text files from the following URL: http://doi. org/10.34730/e792cad833174ebcafd9f052711e5660.

Supplemental file
The supplemental file for this article can be found as follows: • Supplemental Material. Supplemental figures, tables and appendices. DOI: https://doi.org/10.1525/elementa.420.s1 ECE CLRTAP (EMEP), AMAP and through Norwegian Institute for Air Research (NILU) internal resources. Specific developments have been possible due to projects like EUSAAR (EU-FP5), EBAS-Online (Norwegian Research Council INFRA) and HTAP (European Commission DG-ENV). A large number of specific projects have supported development of data and meta data reporting schemes in dialog with EBAS data providers. The IAGOS program acknowledges the European Commission for its support of the MOZAIC project (1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003) the preparatory phase of IAGOS (2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013) and IGAS (2013IGAS ( -2016; the partner institutions of the IAGOS Research Infrastructure (FZJ, DLR, MPI, KIT in Germany, CNRS, Météo-France, Université Paul Sabatier in France, and University of Manchester, UK); the French Atmospheric Data Center AERIS for hosting the database; and the participating airlines (Lufthansa, Air France, China Airlines, Iberia, Cathay Pacific, Hawaiian Airlines) for transporting the instrumentation free of charge. Special thanks also goes to the following scientists, air quality specialists and agencies for providing ozone data from the remote and rural monitoring sites: Weili Lin, Meteorological Observation Center, China Meteorological Administration, Beijing, for the Mt. Waliguan data; Rolf Weller, Alfred Wegener Institute, Germany, for the Neumayer, Antarctica data; Mr. Kazuyuki Saito (ghg_obs@met.kishou.go.jp), Japan Meteorological Agency (JMA) for the Minamitorishima GAW global station data; Gary Lear, U.S. Environmental Protection Agency Clean Air Markets Division Clean Air Status and Trends Network (CASTNET), for the data from Denali, Gothic, Great Basin, Grand Canyon and Centennial; Ute Dauert and Stefan Feigenspahn, Umweltbundesamt Dessau, Germany for obtaining the ozone measurements from the German federal air quality network; Thumeka Mkololo, South African Weather Service, for providing the baseline filtered ozone data at Cape Point; Gastón Torres Aravena, Dirección Meteorológica de Chile, for providing the data from El Tololo, Chile; Iris Buxbaum, Umweltbundesamt/Federal Environment Agency, Austria, for providing the Sonnblick data; Landesamt für Umwelt, Wasserwirtschaft und Gewerbeaufsicht, Rheinland Pfalz provided the Pfälzerwald-Hortenkopf measurements; Landesanstalt für Umwelt, Messungen und Naturschutz, Baden-Württemberg provided the Schwarzwald-Süd measurements; the Scottish Environment Protection Agency provided the Strath Vaich Dam data. Martin Steinbacher acknowledges funding from the GAW Quality Assurance/ Science Activity Centre Switzerland (QA/SAC-CH), which is supported by MeteoSwiss and Empa.