Top-down estimates of anthropogenic VOC emissions in South Korea using formaldehyde vertical column densities from aircraft during the KORUS-AQ campaign

Nonmethane volatile organic compounds (NMVOCs) result in ozone and aerosol production that adversely affects the environment and human health. For modeling purposes, anthropogenic NMVOC emissions have been typically compiled using the “bottom-up” approach. To minimize uncertainties of the bottom-up emission inventory, “top-down” NMVOC emissions can be estimated using formaldehyde (HCHO) observations. In this study, HCHO vertical column densities (VCDs) obtained from the Geostationary Trace gas and Aerosol Sensor Optimization spectrometer during the Korea–United States Air Quality campaign were used to constrain anthropogenic volatile organic compound (AVOC) emissions in South Korea. Estimated top-down AVOC emissions differed from those of the up-to-date bottom-up inventory over major anthropogenic source regions by factors of 1.0 ± 0.4 to 6.9 ± 3.9. Our evaluation using a 3D chemical transport model indicates that simulated HCHO mixing ratios using the top-down estimates were in better agreement with observations onboard the DC-8 aircraft during the campaign relative to those with the bottom-up emission, showing a decrease in model bias from –25% to –13%. The top-down analysis used in this study, however, has some limitations related to the use of HCHO yields, background HCHO columns, and AVOC speciation in the bottom-up inventory, resulting in uncertainties in the AVOC emission estimates. Our attempt to constrain diurnal variations of the AVOC emissions using the aircraft HCHO VCDs was compromised by infrequent aircraft observations over the same source regions. These limitations can be overcome with geostationary satellite observations by providing hourly HCHO VCDs.


Introduction
Nonmethane volatile organic compounds (NMVOCs) are emitted by biogenic and anthropogenic activities and contribute to ozone production and the formation of secondary organic aerosols (Sillman, 1999; Kanakidou et al., 2005). Ozone and aerosols are harmful air pollutants and also influence the climate because they affect the radiative balance (Bellouin et al., 2020; Skeie et al., 2020; the Intergovernmental Panel on Climate Change (IPCC), 2013). In addition, several anthropogenic NMVOCs are carcinogenic, and their emissions are strictly regulated to protect human health (EPA, 2018). Therefore, accurate information on NMVOC emissions is essential for understanding the effects of natural and anthropogenic processes on air quality and climate.
Anthropogenic NMVOC emissions are typically compiled using the "bottom-up" approach, in which emissions are inferred from information on the degree of fossil fuel use, population, traffic, and industrial activities, combined with species emission factors that depend on burning types and the regulatory efficiency of air pollutant controls (Woo et al., 2012). However, these proxies carry uncertainties that cause discrepancies between model simulations based on bottom-up emissions and observations. To minimize the discrepancies between the models and the observations, "top-down" emissions have been statistically estimated using observations as a constraint to alter the bottom-up emissions. Various observations can be used, including in situ and remote-sensing observations (Cao et al., 2018; Kaiser et al., 2018; Fried et al., 2020; Souri et al., 2020).
Observations of formaldehyde (HCHO), which is an abundant species produced by the oxidation of NMVOCs, have primarily been used to estimate top-down emissions of NMVOCs. In particular, satellite measurements of HCHO, which cover broad areas, have been used to estimate biogenic isoprene emissions, which are the most substantial emission source of NMVOCs globally (Kaiser et al., 2018). Anthropogenic NMVOC emission estimation with satellite data is somewhat challenging because of the instrumental sensitivity and the coarse pixel sizes of satellite measurements. Zhu et al. (2014) overcame this limitation by temporally oversampling the ozone monitoring instrument (OMI) HCHO data to estimate total anthropogenic volatile organic compound (AVOC) emissions in Houston. Cao et al. (2018) estimated anthropogenic NMVOC emissions as well as biogenic and biomass burning emissions in China using both HCHO and glyoxal satellite measurements together with an adjoint model, and Souri et al. (2020) estimated NMVOC emissions using the Ozone Mapping and Profiler Suite (OMPS) HCHO satellite data and a nonlinear joint analytical inversion method.
HCHO satellite measurements have advanced spatially and temporally to help improve the accuracy of top-down estimates. The Tropospheric Monitoring Instrument (TROPOMI) measures HCHO vertical columns with a finer spatial resolution of 7 × 3.5 km² (5.5 × 3.5 km² since August 2019) than those of previous instruments: OMI, OMPS, the Global Ozone Monitoring Experiment-2 (GOME-2), the Scanning Imaging Absorption Spectrometer for Atmospheric Cartography (SCIAMACHY), and GOME. Instruments on future geostationary satellites, including the Geostationary Environment Monitoring Spectrometer (GEMS) in Asia, Tropospheric Emissions: Monitoring Pollution (TEMPO) in North America, and Sentinel-4 in Europe, will measure HCHO vertical columns 8 times per day with fine spatial resolutions comparable with those of TROPOMI (Ingmann et al., 2012; Zoogman et al., 2017; Kim et al., 2020).
HCHO measurements from aircraft platforms can enhance top-down estimates with finer spatial resolutions than those of satellites, although aircraft measurements are limited in spatial and temporal coverage. Nowlan et al. (2018) reported HCHO column measurements from the GEOstationary Coastal and Air Pollution Events Airborne Simulator (GCAS) with a spatial resolution of 1 × 1 km², which is 24.5 (19.25 since August 2019) times finer in pixel area than that of TROPOMI. Another instrument, the Geostationary Trace gas and Aerosol Sensor Optimization (GeoTASO) spectrometer, can provide trace gas column measurements with an even finer spatial resolution of 250 × 250 m² (Nowlan et al., 2016; Judd et al., 2018). However, NMVOC emission estimates have not yet been conducted using HCHO column measurements from aircraft platforms.
Here, we focus on estimating anthropogenic NMVOC emissions in South Korea, which are largely undetermined, using the top-down approach. HCHO column densities were used as the observational constraints for this study and were obtained from GeoTASO onboard the National Aeronautics and Space Administration (NASA) B200 aircraft during the Korea-United States Air Quality (KORUS-AQ) campaign, which occurred in May-June 2016. Furthermore, we explore the future application of HCHO observations from geostationary satellites by examining the diurnal variation in AVOC emissions. In Section 2, we describe our observations and a 3D chemical transport model used in this study, and in Section 3, the method for top-down estimates is described. Finally, we estimate the total AVOC emissions and discuss the results and future applications for geostationary satellites.

KORUS-AQ campaign
The KORUS-AQ campaign was an international cooperative air quality field study based out of Osan Air Base, Songtan, South Korea (about 60 km south of Seoul) in May-June 2016 and was jointly hosted by the National Institute of Environmental Research in South Korea and NASA in the United States (https://www-air.larc.nasa.gov/missions/korus-aq). Air quality in South Korea is influenced by complex sources, such as local emissions from traffic and industries, as well as transport from China. Therefore, KORUS-AQ offered the opportunity to understand local and transport effects on the air quality in South Korea. One of the KORUS-AQ objectives was a detailed comparison of complementary measurements between in situ and remote-sensing instruments to prepare for a new generation of air pollutant measurements from geostationary satellites (GEMS in Asia, TEMPO in North America, Sentinel-4 in Europe). During the campaign, extensive measurements of air pollutants such as aerosols and trace gases were conducted from aircraft, ground sites, and ships and were guided by air quality model simulations. We briefly describe the observation data and the model used in this study below.

Observations
Kwon et al: Top-down estimates of AVOC emissions using HCHO VCDs from aircraft. Art. 9(1)
We used HCHO vertical column densities (VCDs) from GeoTASO and in situ observations from the DC-8 aircraft. GeoTASO is an ultraviolet-visible (UV-VIS) airborne spectrometer used to derive VCDs of trace gases (Leitch et al., 2014). During the KORUS-AQ campaign, GeoTASO onboard the NASA B200 aircraft measured backscattered radiances in two channels (UV: 265-410 nm; VIS: 405-695 nm) with a spatial resolution of 250 × 250 m². Nowlan et al. (2016, 2018) described GeoTASO measurements and HCHO retrievals related to aircraft instruments. During the campaign, two steps were followed to retrieve the HCHO VCD from GeoTASO. First, the slant columns were derived using the Smithsonian Astrophysical Observatory trace gas retrieval algorithm, which was developed for HCHO retrievals in the OMI and OMPS instruments on satellites and the GCAS instrument on an aircraft (González Abad et al., 2015; González Abad et al., 2016; Nowlan et al., 2018). The fitting window was 328.5-356.5 nm, and an observed radiance spectrum was used as a reference spectrum in a radiative transfer equation based on the Beer-Lambert law. The radiance reference spectrum was obtained using measured radiances at approximately 11:30 local time on May 25, 2016, over a mountain region (37.6°N, 128.4°E), which is relatively clean in terms of local anthropogenic and biogenic sources. The slant columns retrieved using the radiance reference were differential slant columns because the radiance reference already included background HCHO columns over the reference region. Therefore, background correction was required.
Second, the HCHO slant columns with background corrections were converted to vertical columns using air mass factors (AMFs), which are correction factors used to convert the slant light path to the vertical light path as a function of parameters such as geometric angles, surface reflectance, cloud, trace gas profiles, and aerosols. Nowlan et al. (2018) defined the vertical column below the aircraft as a function of the differential slant column and of the vertical columns and AMF values below and above the aircraft (Equation 1), where V is the VCD, dS is the differential slant column density, A is the AMF, subscript R stands for reference, and superscripts ↓ and ↑ indicate the quantity below and above the aircraft, respectively. We used model simulation results from the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem) with a horizontal resolution of 4 km × 4 km for background correction (V_R↓ and V_R↑) and the vertical column above the aircraft (V↑; Goldberg et al., 2019). AMF values were calculated at 342 nm using WRF-Chem simulations, bidirectional reflectance distribution function, and aerosols from a 12.5-km GEOS-5 (the Goddard Earth Observing System, Version 5) nature run with hourly output. Following the approach applied by Goldberg et al. (2019) for NO₂, the WRF-Chem HCHO profiles for the AMF calculations were scaled with the ratio of model HCHO to the in situ HCHO observations from the DC-8 aircraft (this factor is altitude-dependent but typically is on the order of 2). AMF values were calculated assuming cloud-free conditions, that is, only pixels with cloud flags equal to 0 were used.
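Equation 1 can be recovered from the definitions above. The following is a sketch rather than a quotation of Nowlan et al. (2018): it assumes that the total slant column is the sum of the below-aircraft and above-aircraft contributions, each being a vertical column multiplied by its AMF, and that the same holds over the reference scene. The intermediate symbols S and S_R are introduced here only for illustration.

```latex
\begin{align}
S   &= V^{\downarrow} A^{\downarrow} + V^{\uparrow} A^{\uparrow}, \\
S_R &= V_R^{\downarrow} A_R^{\downarrow} + V_R^{\uparrow} A_R^{\uparrow}, \\
dS  &= S - S_R, \\
V^{\downarrow} &= \frac{dS + V_R^{\downarrow} A_R^{\downarrow} + V_R^{\uparrow} A_R^{\uparrow} - V^{\uparrow} A^{\uparrow}}{A^{\downarrow}}.
\end{align}
```

The last line shows why the model-based background terms matter: any error in the simulated reference columns or in the above-aircraft column passes directly into the retrieved column below the aircraft.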
The NASA DC-8 aircraft completed 20 research flights carrying various instruments to measure gaseous species, aerosols, and cloud information. To provide profile information, missed approach and vertical spiral measurements were performed near the Olympic Park and Mt. Taehwa, corresponding to adjacent ground sites near the eastern part of Seoul. The HCHO mixing ratio was measured by the Compact Atmospheric Multispecies Spectrometer (CAMS; Weibring et al., 2010; Richter et al., 2015; Fried et al., 2020) onboard the DC-8 aircraft. During KORUS-AQ, Fried et al. (2020) reported a limit of detection (28-80 pptv) and 6% accuracy (1σ level) for 1-s CAMS HCHO measurements. A comprehensive discussion of this instrument, calibration methods, and its performance and deployment during the KORUS-AQ study can be found in Fried et al. (2020) and references therein. To examine the NMVOC simulation in the model, we used NMVOC species measured by the Whole Air Sampler (WAS; https://airbornescience.nasa.gov/instrument/WAS_UCI) and the Proton-Transfer-Reaction Mass Spectrometer (PTR-MS; https://airbornescience.nasa.gov/instrument/PTR-MS). Measurement techniques, calibrations, and performances for WAS and PTR-MS were discussed in Müller et al. (2014) and Simpson et al. (2020). Table 1 summarizes the volatile organic compound (VOC) species from each measurement, and the characteristics for each measurement are summarized in Table S1.
For the validation of GeoTASO HCHO VCDs, we averaged GeoTASO data on a 0.1° × 0.1° grid; these gridded values were compared with those from the DC-8 aircraft and were also used in our analysis below. We used GeoTASO pixels in cloud-free scenes for which the retrieved slant column plus twice the random fitting uncertainty was positive. To calculate HCHO VCDs from the DC-8 aircraft, it was assumed that the DC-8 HCHO mixing ratio below 2 km is uniformly distributed within the planetary boundary layer (PBL) of 2 km and that the DC-8 HCHO mixing ratio in the free troposphere is the same everywhere. Averaging kernels (AK) of GeoTASO were applied to calculate DC-8 HCHO VCDs. Therefore, DC-8 HCHO VCDs are calculated as follows:
$$\mathrm{VCD}_{\mathrm{DC8}}(i,j) = \sum_{k=1}^{n_{\mathrm{PBL}}} \mathrm{AK}(k) \cdot x(i,j,k) + \sum_{k=n_{\mathrm{PBL}}}^{n_{\mathrm{DC8}}} \mathrm{AK}(k) \cdot x_0(k), \qquad (2)$$
where x(i, j, k) is the partial column at the location (i, j) and vertical layer k, x₀(k) is the partial column in the free troposphere at the vertical layer k, n_PBL is the layer of the PBL, and n_DC8 is the layer of the aircraft altitude.
Figures 1a and 1b show GeoTASO and DC-8 HCHO VCDs, respectively. High GeoTASO HCHO VCDs of 1.3 × 10¹⁶ to 2.8 × 10¹⁶ molecules cm⁻² appeared over AVOC source regions including Seoul, Daesan, and Daegu. DC-8 HCHO VCDs were also high over the source regions, reaching 5.4 × 10¹⁶ molecules cm⁻² near Daesan, but differed from the GeoTASO VCDs, especially in the Daesan region. This discrepancy resulted from the different flight observation times between the B200 aircraft carrying GeoTASO and the DC-8 aircraft. To compare the two data sets, we collected data over the same locations where the B200 and DC-8 aircraft both flew within an hour. The GeoTASO HCHO VCDs show good agreement with those of the DC-8 aircraft, with a correlation coefficient of 0.64 except for two outliers (Figure 1c). The correlation coefficient improved to 0.89 during 16:00-18:00 local time, when trace gases were uniformly mixed in the PBL, which was the assumption made when calculating DC-8 VCDs.
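The DC-8 column calculation described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the layer split into separate PBL and free-troposphere lists, and all numerical inputs are hypothetical, and the conversion from mixing ratio to partial column assumes the air partial column of each layer is already known.

```python
def uniform_pbl_partial_columns(mixing_ratio, air_partial_columns):
    """Partial HCHO columns for a well-mixed PBL.

    mixing_ratio: HCHO mole fraction (e.g. 3e-9 for 3 ppbv), constant in the PBL
    air_partial_columns: air column of each PBL layer (molecules cm-2)
    """
    return [mixing_ratio * n_air for n_air in air_partial_columns]


def ak_weighted_vcd(ak_pbl, x_pbl, ak_ft, x0_ft):
    """Sum AK(k) * x(k) over the PBL layers and the free-tropospheric layers."""
    if len(ak_pbl) != len(x_pbl) or len(ak_ft) != len(x0_ft):
        raise ValueError("each averaging-kernel list must match its partial-column list")
    pbl_term = sum(ak * x for ak, x in zip(ak_pbl, x_pbl))
    ft_term = sum(ak * x for ak, x in zip(ak_ft, x0_ft))
    return pbl_term + ft_term


# Hypothetical inputs, chosen only to illustrate the call.
x_pbl = uniform_pbl_partial_columns(3e-9, [2.6e24, 2.3e24])
vcd = ak_weighted_vcd([0.6, 0.9], x_pbl, [1.1, 1.2], [1.0e15, 0.5e15])
print(f"{vcd:.3e} molecules cm-2")
```

Keeping the PBL and free-troposphere layers in separate lists avoids any ambiguity about which sum the boundary layer n_PBL belongs to.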
The two outliers appeared over Daesan at 11:30-12:30 local time on June 2, 2016. At this time, the DC-8 aircraft flew in the plumes emitted from the petrochemical complexes over Daesan, and the DC-8 HCHO mixing ratio reached up to 34 ppbv at 11:43 as found by Fried et al. (2020). However, the simulated HCHO mixing ratio below 2 km used for GeoTASO AMF calculations was 7.5 ppbv at the maximum, which is approximately 4.5 times lower than the DC-8 observation. If we accounted for this discrepancy in the AMF calculation by scaling up the simulated HCHO profiles based on the observation, the GeoTASO VCDs would increase by 48%, resulting in a closer agreement with the DC-8 HCHO VCDs (Figure 1c). This indicates the importance of a priori profiles for GeoTASO HCHO retrievals with high spatial resolutions.
In addition, the spatial disagreement between the DC-8 and B200 observations could also have contributed to the two outliers. At the point where the highest DC-8 HCHO mixing ratio was observed, the B200 aircraft flew near the plume while turning, and GeoTASO onboard the B200 aircraft could not capture the plume properly. Therefore, comparisons between the in situ and remote-sensing observations should be made carefully in terms of spatial and temporal overlaps, especially for targeting concentrated pollution plumes.

Model description
To evaluate our estimates of AVOC top-down emissions in South Korea during the KORUS-AQ campaign, we used a global 3D chemical transport model (GEOS-Chem v12.01) and its nested version (Bey et al., 2001; Wang et al., 2004). The nested model is driven by assimilated meteorological data from the Goddard Earth Observing System-Forward Processing (GEOS-FP) over the East Asian domain with a 0.25° × 0.3125° horizontal resolution and 47 vertical layers from the surface to 0.01 hPa. Boundary conditions obtained from the global simulation with a 2° × 2.5° horizontal resolution were used for the nested simulation.
Anthropogenic emissions over East Asia were taken from the KORUS v5 inventory (Woo et al., n.d.), which is based on the Clean Air Policy Support System in South Korea. The VOC speciation in the KORUS v5 is based on the Statewide Air Pollution Research Center chemical mechanism (SAPRC-99) and is then mapped to GEOS-Chem VOC species (Table S2). We describe AVOC emissions using the GEOS-Chem VOC speciation.
Total AVOC emissions from the KORUS v5 were 420 mol s⁻¹ (765 Gg year⁻¹) during the KORUS-AQ campaign in South Korea. Fractions for each species are shown in Figure 2. Goldberg et al. (2019) showed that NOₓ emissions from the KORUS v1 inventory were underestimated by approximately 50% for the Seoul Metropolitan Area (SMA) [...] (Guenther et al., 2006; Guenther et al., 2012), and biomass burning emissions were calculated using the Global Fire Emissions Database Version 4 inventory (van der Werf et al., 2010). We added simplified ethene chemistry with the hydroxyl radical (OH) and ozone to the model based on IUPAC recommendations (Tables S3 and S4). Ethene accounts for the second largest amount of AVOC emissions in South Korea (Figure 2) and was ranked fourth in contribution to OH reactivity in SMA during the KORUS-AQ campaign (https://espo.nasa.gov/sites/default/files/documents/KORUS-AQ%20RSSR.pdf, last access: May 12, 2020). Also, we considered methanol emitted from biogenic, biomass burning, and oceanic sources (van der Werf et al., 2010; Guenther et al., 2012; Chen et al., 2019). Figure 3 shows simulated HCHO mixing ratios below 2 km with and without ethene chemistry and methanol emissions in the model. Ethene has a relatively short chemical lifetime of approximately 1.4 days (Atkinson and Arey, 2003) and is emitted from industrial sources. Adding the ethene chemistry to the model increased the HCHO mixing ratio, especially over Daesan and Busan-Changwon-Geoje areas, for which the mixing ratio below 2 km increased by 0.25 ppbv (Figure 3c). Methanol emissions increase the HCHO mixing ratio, but the impacts are not significant because of methanol's long chemical lifetime of approximately 12 days (Atkinson, 2000).

Method for the top-down emission estimates
Total AVOC emissions can be estimated using the total HCHO net production per unit time and the HCHO yields of AVOC species (Zhu et al., 2014). Total HCHO net production per unit time is calculated using Equation 3, where S is the total HCHO net production rate (kmol h⁻¹), τ_HCHO is the lifetime of HCHO, VCD₀ is the background HCHO VCD, and dA is the differential area.
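The form of Equation 3 implied by these definitions can be written as follows. This is a reconstruction under the assumption of a steady-state balance between HCHO production and loss over the source region, with the conversion from molecules to kmol left implicit; the published equation may arrange the terms differently.

```latex
S = \frac{1}{\tau_{\mathrm{HCHO}}} \int \left( \mathrm{VCD} - \mathrm{VCD}_0 \right) \, dA
```

Here the integral runs over the area of the source region, so S grows with both the column enhancement above background and the extent of the enhanced area.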
To estimate AVOC emissions from anthropogenic sources using HCHO observations, we removed background HCHO produced by other sources including the oxidation of methane and biogenic VOCs. Therefore, the background region was chosen within 37°-38°N and 128.5°-130°E over the northeastern area of South Korea, where there are a few biogenic sources and no anthropogenic sources. The GeoTASO HCHO VCD is 6.3 × 10¹⁵ molecules cm⁻² over the background area in Figure 1, which is lower than those over other regions. For the background HCHO VCD, we used the averaged HCHO VCD from GEOS-Chem, which is 6.3 × 10¹⁵ molecules cm⁻² and is consistent with the value from GeoTASO. However, the effects of biogenic sources on the background HCHO VCD can differ between source regions. We estimated the uncertainty of the background HCHO VCD from the variation of biogenic sources and present a discussion below. HCHO lifetimes below 2 km over land were calculated using DC-8 OH mixing ratios and photolysis rates from the Tropospheric Ultraviolet and Visible radiation model (Madronich and Flocke, 1999). They varied from 1.9 to 4.3 h for 8:00-17:00 local time, and the mean value for 9:00-15:00 local time was 2.2 h during KORUS-AQ.
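A lifetime of this kind can be sketched as the inverse of the summed first-order loss rates. The snippet below is an illustration only: it assumes that OH oxidation and photolysis are the only sinks (deposition is ignored), takes OH as a number density rather than a mixing ratio, and uses hypothetical input values, including an approximate room-temperature rate constant for HCHO + OH.

```python
def hcho_lifetime_hours(oh_number_density, k_oh, j_total):
    """HCHO lifetime (h) against OH oxidation and photolysis.

    oh_number_density: OH concentration (molecules cm-3)
    k_oh: HCHO + OH rate constant (cm3 molecule-1 s-1)
    j_total: sum of the HCHO photolysis frequencies (s-1)
    """
    loss_rate = k_oh * oh_number_density + j_total  # first-order loss (s-1)
    if loss_rate <= 0.0:
        raise ValueError("total loss rate must be positive")
    return 1.0 / loss_rate / 3600.0


# Hypothetical midday values, for illustration only.
tau = hcho_lifetime_hours(oh_number_density=5.0e6, k_oh=8.5e-12, j_total=7.0e-5)
print(f"HCHO lifetime: {tau:.2f} h")
```

Because both loss terms peak around midday, a calculation of this form yields the shortest lifetimes near noon and longer ones in the morning and late afternoon.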
We used total HCHO net production rates (S) with HCHO yields and emission fractions of AVOC species to estimate total AVOC emissions (E; kmol h⁻¹) using Equation 4, where f_i is the emission fraction of species i and Y_i is the HCHO yield from the OH oxidation of species i. We used HCHO yields under high NOₓ conditions from previous studies (Dufour et al., 2009; Stavrakou et al., 2009; Zhu et al., 2014) that calculated HCHO yields for 1 day using the Master Chemical Mechanism (MCM). Table S5 summarizes the HCHO yields used in this study. In order to apply the HCHO yields to our top-down estimates, the air residence time over each source region should be at least 1 day. We used 10 m winds from the GEOS-FP meteorological data to calculate the air residence time over source regions (Figure 2) where AVOC emissions were estimated. The residence times over SMA1 and Daegu were estimated as 1.3 and 1.6 days, respectively, and are long enough for the top-down estimates using 1-day yields. Over SMA2 and Daesan, the residence times were approximately 0.7 and 0.6 days (approximately 16 and 15 h), respectively, which are marginally sufficient to apply 1-day yields for species with short lifetimes but are relatively short for species with long lifetimes. We found that the residence times over Gunsan, Busan-Changwon (B-C), and Geoje were 0.2-0.4 days (4-9 h), which are too short to apply 1-day yields. In Section 4.2, we discuss the uncertainties of our estimates with the use of 1-day HCHO yields over source regions with a varying degree of the air residence time.
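The two steps above can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' implementation: the area integral is replaced by a sum over grid cells, the total emission is taken as E = S / Σ_i(f_i · Y_i) as the definitions of f_i and Y_i imply, and every input value and species name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mol


def hcho_net_production(vcd, vcd0, cell_area_cm2, tau_h):
    """Total HCHO net production rate S (kmol h-1).

    vcd: gridded HCHO VCDs over the source region (molecules cm-2)
    vcd0: background HCHO VCD (molecules cm-2)
    cell_area_cm2: area of each grid cell (cm2), same length as vcd
    tau_h: HCHO lifetime (h)
    """
    if len(vcd) != len(cell_area_cm2):
        raise ValueError("vcd and cell_area_cm2 must have the same length")
    molecules = sum((v - vcd0) * a for v, a in zip(vcd, cell_area_cm2))
    return molecules / AVOGADRO / 1000.0 / tau_h


def total_avoc_emission(s, fractions, yields):
    """Total AVOC emission E = S / sum_i(f_i * Y_i), in the units of s."""
    weighted_yield = sum(fractions[sp] * yields[sp] for sp in fractions)
    if weighted_yield <= 0.0:
        raise ValueError("emission-weighted yield must be positive")
    return s / weighted_yield


# Hypothetical region of three grid cells of about 1e12 cm2 each.
s = hcho_net_production([2.0e16, 1.5e16, 1.1e16], 6.3e15, [1.0e12] * 3, 2.2)
e = total_avoc_emission(s, {"ethene": 0.4, "xylene": 0.6}, {"ethene": 1.6, "xylene": 1.2})
print(f"S = {s:.1f} kmol/h, E = {e:.1f} kmol/h")
```

Because E scales inversely with the emission-weighted yield, errors in the bottom-up speciation (f_i) feed directly into the top-down total.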
The total uncertainty of estimated AVOC emissions (σ_E) is calculated by uncertainty propagation (Equation 5), where σ_VCD is the total uncertainty of VCD, σ_VCD0 is the uncertainty of VCD₀, σ_τ is the uncertainty of τ_HCHO, σ_Yi is the uncertainty of Y_i, and σ_fi is the uncertainty of f_i. σ_VCD is calculated as the root sum square of the random uncertainty and systematic uncertainty for GeoTASO HCHO VCDs. The random uncertainty of HCHO VCDs ranged from 0.02 × 10¹⁵ to 0.5 × 10¹⁵ molecules cm⁻² and was small compared to the observed HCHO VCDs, which were obtained by averaging 100-71,000 pixels on a 0.1° × 0.1° grid. The systematic uncertainty of HCHO VCD is caused by absorption cross-section spectra, AMFs, and model simulation results for the background correction in the GeoTASO retrieval. A detailed uncertainty analysis of GeoTASO VCDs during KORUS-AQ is beyond the scope of this study, and the uncertainty analysis of HCHO VCDs from satellites and aircraft was discussed in detail in previous studies (Kwon et al., 2019; Nowlan et al., 2018). In this study, we defined the root mean square error of 6.8 × 10¹⁵ molecules cm⁻² in comparison with DC-8 HCHO VCDs (Figure 1) as the systematic uncertainty of the GeoTASO observations, which was 39% of the averaged GeoTASO HCHO VCDs. We derived σ_VCD0 from the variation of biogenic HCHO VCDs, which were calculated using the range of DC-8 isoprene observations during KORUS-AQ and the 1-day HCHO yield of isoprene. Biogenic HCHO VCDs range from 2.8 × 10¹⁵ to 5.5 × 10¹⁵ molecules cm⁻², and σ_VCD0 of 0.8 × 10¹⁵ molecules cm⁻² was calculated as the standard deviation of a uniform probability distribution over the range of the biogenic HCHO VCDs. σ_τ was 0.13 h, calculated as the standard deviation of the mean of the HCHO lifetimes for 9:00-15:00 local time during KORUS-AQ. σ_Yi is the standard deviation of a uniform probability distribution spanning the minimum and maximum yields in previous studies (Dufour et al., 2009; Stavrakou et al., 2009; Zhu et al., 2014; Stavrakou et al., 2015). If HCHO yields for specific species were not given in the previous studies, uncertainties were assumed to be 10% of the yields. σ_fi was obtained from the regression slopes for each species between the model using the bottom-up inventory and the DC-8 observations shown in Figure 4. Finally, we multiplied the total uncertainty σ_E from Equation 5 by a coverage factor (k = 2) with a confidence level of 95%.
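The building blocks of this uncertainty budget are simple to reproduce. The sketch below uses hypothetical helper names and the standard results that a uniform distribution over a range has a standard deviation of (max − min)/√12 and that independent uncertainties combine as a root sum square; it does not reproduce Equation 5 itself, only its ingredients.

```python
import math


def uniform_std(low, high):
    """Standard deviation of a uniform distribution on [low, high]."""
    return (high - low) / math.sqrt(12.0)


def root_sum_square(*terms):
    """Combine independent uncertainty components in quadrature."""
    return math.sqrt(sum(t * t for t in terms))


def expanded_uncertainty(sigma, k=2.0):
    """Apply a coverage factor k (k = 2 for roughly 95% confidence)."""
    return k * sigma


# Background term: biogenic HCHO VCDs range from 2.8e15 to 5.5e15 molecules cm-2.
sigma_vcd0 = uniform_std(2.8e15, 5.5e15)
print(f"sigma_VCD0 = {sigma_vcd0:.2e} molecules cm-2")

# VCD term: largest quoted random uncertainty combined with the systematic uncertainty.
sigma_vcd = root_sum_square(0.5e15, 6.8e15)
print(f"sigma_VCD  = {sigma_vcd:.2e} molecules cm-2")
```

With the range quoted in the text, the uniform-distribution formula gives about 0.78 × 10¹⁵ molecules cm⁻², which rounds to the 0.8 × 10¹⁵ stated above; the quadrature sum also shows that the systematic term dominates σ_VCD.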

Evaluation of model simulation
We used the 60-s DC-8 HCHO mixing ratios to evaluate the GEOS-Chem simulation. The DC-8 observations were first averaged for every hour on the horizontal and vertical grids of the model for a comparison with the simulation. Figure 5 shows simulated and observed HCHO mixing ratios, averaged below 2 km during KORUS-AQ, with 10 m winds from the GEOS-FP meteorological data. The highest value, up to 5.1 ppbv, was observed over Daesan, where large petrochemical complexes and power plants exist. In situ 1-s HCHO observations by Fried et al. (2020), however, showed a peak mixing ratio of 35 ppbv in a plume from the Daesan petrochemical complexes. The gridded DC-8 HCHO mixing ratios over Gunsan and Daegu were 4.0 and 2.9 ppbv, respectively. The spatial variability of the simulated HCHO shows good agreement with that of the observed HCHO, with a correlation coefficient of 0.62 between the model and the observations. However, we found that the values from the model were lower than the observations, with a regression slope of 0.74 and a normalized mean bias (NMB) of -25% between the two. Differences were significant over the Yellow Sea close to the Daesan and Gunsan regions, indicating possible underestimation of AVOC emissions from large industrial complexes. Also, over the westernmost flight tracks, the low HCHO mixing ratio in the model implies that the model underestimates either the background HCHO mixing ratio or the transport of AVOCs with relatively long lifetimes from China. The westernmost flight tracks over the Yellow Sea aimed to examine the long-range transport of air pollutants in weather conditions with westerly winds.
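The three statistics quoted above can be computed as in the following sketch. It assumes the commonly used definition NMB = Σ(model − obs) / Σ(obs) and an ordinary least-squares slope of model on observation; the text does not state which regression method was used, so the slope function is an assumption, and the sample values are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def normalized_mean_bias(model, obs):
    """NMB = sum(model - obs) / sum(obs); multiply by 100 for percent."""
    return sum(m - o for m, o in zip(model, obs)) / sum(obs)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = _mean(x), _mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ols_slope(obs, model):
    """Ordinary least-squares slope of model regressed on obs."""
    mo, mm = _mean(obs), _mean(model)
    sxy = sum((o - mo) * (m - mm) for o, m in zip(obs, model))
    sxx = sum((o - mo) ** 2 for o in obs)
    return sxy / sxx


# Hypothetical gridded HCHO mixing ratios (ppbv).
obs = [2.9, 4.0, 5.1, 1.8, 2.2]
model = [2.4, 2.9, 3.6, 1.5, 1.9]
print(f"NMB = {100 * normalized_mean_bias(model, obs):.0f}%")
print(f"r = {pearson_r(obs, model):.2f}, slope = {ols_slope(obs, model):.2f}")
```

A negative NMB together with a slope below 1 indicates that the model is low overall and increasingly so at the highest observed values, which is the pattern described for the industrial source regions.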
We also compared the simulated and observed HCHO profiles averaged in the morning and in the afternoon (Figure 6). In the morning, the simulated HCHO mixing ratio was generally lower than the observation at all altitudes. In the afternoon, while the simulated HCHO mixing ratios below 2 km were still lower than those of the DC-8 aircraft, differences between the model and the observation were reduced above 2 km. HCHO has a relatively short lifetime and reflects the local VOC emissions. Thus, the low bias in simulated HCHO near the surface implies that AVOC emissions are too low in the model. In the next section, we present a top-down estimate of AVOC emissions in South Korea and evaluate it by comparing the observations with the HCHO simulated from the top-down AVOC emissions.

Top-down total AVOC emissions in South Korea
We estimated total AVOC emissions over major AVOC source regions in South Korea: SMA, Daesan, Gunsan, Daegu, B-C, and Geoje (Figure 2). Other major source regions including Yeosu and Ulsan were excluded because no GeoTASO observations were available during the campaign. We further divided SMA into two source regions. SMA1 includes Seoul and Incheon, which are characterized by dense traffic and numerous industrial complexes, and SMA2 includes multiple industrial complexes including semiconductor industrial sources to the east of the main metropolitan area. Table 2 summarizes our top-down estimates of total AVOC emissions for each source region compared with the bottom-up emissions from the KORUS v5. The estimated top-down emissions are 1.5 ± 0.7 to 6.9 ± 3.9 times higher than those of the KORUS v5 depending on regions, while the estimated emission over SMA1 is consistent with the bottom-up inventory. The largest increase, by a factor of 6.9 ± 3.9, occurs in Daesan, where one of the largest petrochemical complexes in South Korea is located.
We found that our top-down estimate in Daesan is higher than the result of Fried et al. (2020), who also conducted a top-down estimate of VOC emissions focusing on the Daesan petrochemical complexes using their HCHO observations aboard the DC-8 aircraft. Fried et al. (2020) obtained a ratio of 2.9 ± 0.6 between their top-down VOC emission estimate and that of the KORUS v5 inventory. The discrepancy can be attributed to the scope of the study area. Our result includes not only the Daesan petrochemical complexes but also other nearby sources in the Daesan region as shown in Figure 2.
The top-down AVOC emissions may have been overestimated due to transport influences from outside of the Daesan region, which was mainly affected by easterly winds when GeoTASO observations occurred. We found significant VOC sources located to the east of Daesan and south of the SMA attributed to steel and semiconductor facilities. To quantify the effect of transport from these nearby sources, we used a GeoTASO HCHO VCD background value of 1.3 × 10¹⁶ molecules cm⁻², which was observed nearby but outside the Daesan region (36.6°-37.2°N and 126.8°-127.1°E). The estimated AVOC emissions were 159 ± 108 kmol h⁻¹ and were 3.7 ± 2.5 times higher than those in the bottom-up inventory. These values are still high compared to Fried et al. (2020) but are in agreement with their results within the mutual uncertainty estimates.
We provide the top-down AVOC emissions with uncertainties in Table 2. Individual input contributions (Equation 5) to the total uncertainty vary regionally, but HCHO VCDs and emission fractions are the most significant contributors across all regions. The uncertainty of HCHO VCDs accounts for 32% of the total uncertainty on average, and the uncertainty of emission fractions for ≥C3 alkenes accounts for 43% because ≥C3 alkenes in the model show the largest discrepancy from the DC-8 observations (Figure 4). Uncertainties in background HCHO VCDs and HCHO lifetimes account for 9% and 6% of the total uncertainty, respectively. The uncertainty associated with 1-day HCHO yields is 3%, which is relatively small.
However, the uncertainty of HCHO yields associated with the air residence times was not included in the total uncertainty; therefore, we examined the sensitivity of the top-down estimates to changes in the yields related to the air residence time. As discussed in Section 3, the air residence times for some regions were not long enough to estimate AVOC emissions using the 1-day HCHO yields from VOCs. HCHO yields from each VOC species for approximately 15 h were estimated with cumulative yields in the MCM scheme from Dufour et al. (2009), and HCHO yields for species not given in Dufour et al. (2009) were assumed to be 50% of the 1-day yields. The yield for toluene was estimated using the ratio of the 15-h yield to the 1-day yield for xylene. Using HCHO 15-h yields, the estimated AVOC emissions increased by 42 kmol h⁻¹ over SMA2 and Daesan compared to Table 2, but the increments are within uncertainty ranges.
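The rules used to build the shorter-residence-time yields can be sketched as a small lookup. This is a hypothetical illustration: the function name, the species keys, and all yield values are placeholders and are not taken from Dufour et al. (2009) or Table S5.

```python
def short_residence_yields(one_day, short_known, proxy=None, default_ratio=0.5):
    """Build HCHO yields for a residence time shorter than 1 day.

    one_day: {species: 1-day yield}
    short_known: {species: yield available for the shorter time}
    proxy: {species: proxy species whose short/1-day ratio is borrowed}
    default_ratio: fraction of the 1-day yield used when nothing else is known
    """
    proxy = proxy or {}
    result = {}
    for species, y_1day in one_day.items():
        if species in short_known:
            result[species] = short_known[species]
        elif species in proxy:
            ref = proxy[species]
            result[species] = y_1day * short_known[ref] / one_day[ref]
        else:
            result[species] = default_ratio * y_1day
    return result


# Placeholder yields, for illustration only.
one_day = {"xylene": 1.0, "toluene": 0.5, "ethane": 0.4}
yields_15h = short_residence_yields(one_day, {"xylene": 0.8}, proxy={"toluene": "xylene"})
print(yields_15h)
```

Since the total emission is inversely proportional to the emission-weighted yield, any reduction in yields from this procedure raises the top-down estimate, which is the direction of the change reported for SMA2 and Daesan.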
The air residence time over Gunsan was too short to estimate total AVOC emissions using 1-day yields; therefore, emissions may have been underestimated. The residence times over B-C and Geoje were also too short, but the effects of residence times on the estimated emissions may have been obscured by the easterly winds that prevailed when the GeoTASO observations occurred (Figure 1). B-C and Geoje are located southwest of Ulsan, where the largest petrochemical complex in South Korea exists (Figure 2). Therefore, the top-down estimates over B-C and Geoje may have been overestimated because the VCD₀ used in this study may be too low to account for the HCHO columns transported from Ulsan. Further research with more observational constraints is required to acquire improved estimates of total AVOC emissions over Gunsan, B-C, and Geoje.
Table 2 (fragment). Ratio of top-down to bottom-up AVOC emissions, in the column order of Table 2: 1.0 ± 0.4, 4.1 ± 1.8, 6.9 ± 3.9, 2.5 ± 1.1, 3.9 ± 1.7, 1.5 ± 0.7, 1.6 ± 0.9. The estimations in some areas are compromised by the short air residence times. AVOC = anthropogenic volatile organic compound.
Kwon et al: Top-down estimates of AVOC emissions using HCHO VCDs from aircraft. Art. 9(1), page 9 of 16
Figure 7 shows simulated HCHO mixing ratios using the top-down estimates in South Korea. Compared with the results of the bottom-up inventory (Figure 5), the top-down estimates reduced both positive and negative biases in source regions, whereas positive biases increased over the sea close to Daesan, B-C, and Geoje, where the estimated emissions can be affected by adjacent emission sources. In particular, a 2 ppbv increase in the simulated HCHO mixing ratio below 2 km occurred in Daesan compared to the results of the bottom-up inventory. Overall, the results using the top-down estimates show improved statistics relative to those obtained using the bottom-up inventory in comparison with the DC-8 observations (Table 3). In particular, the NMB decreased in magnitude from -25% to -13% in the model after using the top-down AVOC emissions. When the westernmost flight observations were excluded from the statistics calculation, to focus primarily on domestic sources in South Korea, the NMB was further reduced to -8%.
Figure 6 also shows simulated HCHO profiles using the top-down estimates, which are in better agreement with the DC-8 observations (solid lines in Figure 6). Simulated HCHO mixing ratios increase considerably below 2 km. In the morning, the simulated surface HCHO mixing ratio is higher than that observed by the DC-8 aircraft, which may be caused by low PBL heights in the model (Oak et al., 2019). In the afternoon, HCHO mixing ratios increase uniformly below 2 km following the development of the PBL. In the free troposphere, the top-down emissions do not change the simulated HCHO mixing ratio, indicating that domestic emissions do not significantly affect HCHO mixing ratios there. The simulated low bias in the free troposphere could be attributed to insufficient production of HCHO from long-lived species in the model.
Figure 8 shows the simulated and observed HCHO profiles over the major anthropogenic source regions where the total AVOC emissions were estimated using GeoTASO HCHO VCDs.
The simulated HCHO mixing ratios with the top-down estimates are higher than those with the bottom-up emission in most regions, resulting in better agreement with the DC-8 observations. However, in the B-C and Geoje areas, the simulated HCHO mixing ratios using both the bottom-up and top-down emissions are higher than the observations. As discussed above, the total AVOC emissions in B-C and Geoje might be overestimated due to emissions from Ulsan. In addition to the influences from adjacent sources, several issues need to be addressed in future analyses as discussed below.
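For readers who want to reproduce the kind of evaluation statistics reported in Table 3 (regression slope, R, RMSE, and NMB), the following minimal Python sketch shows the standard definitions. It assumes an ordinary least-squares regression of model values on observations, which may differ from the regression method actually used in this study, and the paired values are synthetic, not DC-8 or GEOS-Chem data.

```python
import math


def evaluation_stats(model, obs):
    """Return slope, R, RMSE, and NMB (%) for paired model/observation values.

    Assumes ordinary least-squares regression of model on observations;
    the study may have used a different regression method.
    """
    n = len(obs)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    s_oo = sum((o - mean_o) ** 2 for o in obs)
    s_mm = sum((m - mean_m) ** 2 for m in model)
    s_om = sum((o - mean_o) * (m - mean_m) for m, o in zip(model, obs))
    return {
        "slope": s_om / s_oo,
        "R": s_om / math.sqrt(s_oo * s_mm),
        "RMSE": math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / n),
        # Normalized mean bias: summed model-minus-observation difference
        # divided by the summed observations, in percent.
        "NMB": 100.0 * sum(m - o for m, o in zip(model, obs)) / sum(obs),
    }


# Synthetic example: a model that is uniformly 20% lower than the observations.
obs = [1.0, 2.0, 3.0, 4.0]
model = [0.8 * o for o in obs]
stats = evaluation_stats(model, obs)
```

In this synthetic case the model tracks the observations perfectly in shape (R of 1) but is biased low, which is the situation a negative NMB is meant to flag.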
In our top-down estimate method, we assumed no error in AVOC speciation in the inventory. Therefore, the estimated top-down total AVOC emission was speciated using the AVOC fractions of the bottom-up inventory. However, discrepancies between the model and observations vary among VOC species. Figure 4 compares AVOC species between the DC-8 aircraft observations and GEOS-Chem using the bottom-up inventory in the major source regions. Methyl ethyl ketone (MEK) and methanol concentrations in the model were very low compared to the DC-8 observations. Measured MEK was up to approximately 11 ppbv, which is higher than that previously reported by Kim et al. (2015). It is noted that the PTR-MS measurements of MEK may suffer from interferences (such as C4-aldehydes close to petrochemical sites or propylene glycol methyl ether acetate near semiconductor facilities, for example, Hwaseong-Pyeongtaek, which is an MEK hotspot in Figure S1). Nonetheless, we concluded that the GEOS-Chem model significantly underestimates MEK concentrations over South Korea. Methanol concentrations are also significantly underestimated (Figures S1-S3). Souri et al. (2020) showed that methanol was 10 times lower in the model than in the DC-8 observations during the KORUS-AQ campaign.
Table 3 (notes). The values in parentheses are statistics for domestic AVOC emission effects on the HCHO mixing ratio. Slope and R denote the regression slope and correlation coefficient, respectively. RMSE = root mean square error; NMB = normalized mean bias; AVOC = anthropogenic volatile organic compound; HCHO = formaldehyde.
In the model using the bottom-up inventory, acetaldehyde (ALD2) and acetone (ACET) were lower than the DC-8 observations. ALD2 is a common product of AVOC oxidation in the atmosphere, while ACET is both directly emitted and formed in the atmosphere from AVOC oxidation. ALD2 and ACET show high mixing ratios over Daesan and Gunsan (Figure S1), reflecting that they are photo-oxidation products of many AVOC precursors emitted from petrochemical and industrial facilities. Low ALD2 in the model may affect HCHO mixing ratios in source regions and surrounding areas because it has a relatively short lifetime of approximately 8.8 h compared to MEK (5.4 days), methanol (12 days), and ACET (68 days; Atkinson, 2000; Atkinson and Arey, 2003; Yáñez-Serrano et al., 2016).
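To illustrate why the short lifetime of ALD2 matters near source regions, the sketch below estimates the fraction of each species that would be oxidized after a given time, using the lifetimes quoted above. It assumes simple first-order decay with a constant lifetime, which is a simplification, and the 15-h elapsed time is an illustrative choice rather than a value derived for any specific region.

```python
import math

# Lifetimes quoted in the text, converted to hours.
LIFETIME_H = {
    "ALD2": 8.8,
    "MEK": 5.4 * 24.0,
    "methanol": 12.0 * 24.0,
    "ACET": 68.0 * 24.0,
}


def fraction_oxidized(lifetime_h, elapsed_h):
    """Fraction lost after `elapsed_h` hours, assuming first-order decay."""
    return 1.0 - math.exp(-elapsed_h / lifetime_h)


elapsed_h = 15.0  # illustrative elapsed time (assumption)
fractions = {
    species: fraction_oxidized(tau, elapsed_h)
    for species, tau in LIFETIME_H.items()
}

for species, frac in fractions.items():
    print(f"{species}: {100.0 * frac:.1f}% oxidized after {elapsed_h:.0f} h")
```

Under these assumptions most of the ALD2 reacts within the elapsed time, while only a small fraction of MEK, methanol, and ACET does, so a model deficit in ALD2 would be felt mainly in and around the source regions.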
The simulated mixing ratios over SMA2 were too high for toluene (>5 ppbv), xylene (>2 ppbv), ≥C3 alkenes (>2 ppbv), and ethene (>2.5 ppbv) compared to the observations, as shown in Figure 4. In contrast, the model failed to capture the high mixing ratios observed over Daesan for benzene, propane, ≥C4 alkanes (ALK4), ≥C3 alkenes, and ethene. These discrepancies may be caused by the VOC speciation in the bottom-up inventory. Simpson et al. (2020) conducted an analysis of WAS and PTR-MS VOC observations during the KORUS-AQ campaign and showed relatively low contributions from solvents, including toluene and xylene, to the total AVOCs over Seoul compared to those of the KORUS v5 inventory. The simulated aromatic contributions to the total AVOCs in Seoul largely reflected those of the KORUS v5 inventory and were higher than those of the DC-8 observations.

Diurnal variation of AVOC emissions
Aircraft observations can provide information on hourly concentration variations, although the spatial coverage of the aircraft observations is somewhat limited compared to satellite observations. We used the GeoTASO observations over SMA1 + SMA2 to constrain the diurnal variation of AVOC emissions as a demonstration of the potential application of geostationary satellite observations. The method described in Section 3 was applied as a function of time. HCHO lifetimes were calculated with DC-8 OH mixing ratios and photolysis rates at each time. The hourly background HCHO VCDs were calculated from GEOS-Chem over the background region described in Section 3. To account for diurnal variations of the HCHO yields of VOC species, we assumed that the yields are a function of OH mixing ratios, and the normalized OH mixing ratio was used as a scale factor for imposing the diurnal variation of the HCHO yields. Figure 9 shows the diurnal variations of the total AVOC emissions over the SMA1 + SMA2 areas, as implemented in GEOS-Chem and estimated from GeoTASO HCHO VCDs.
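The two time-dependent ingredients described above can be sketched as follows. The sketch assumes that the HCHO lifetime is taken against reaction with OH plus photolysis, that OH is supplied as a number density, and that the OH scale factor is normalized by the mean over the sampled hours. The text does not state the normalization, so that choice, along with the function names and all numeric inputs (including the approximate rate constant and photolysis frequency), is an assumption made for illustration only.

```python
def hcho_lifetime_hours(oh_conc, k_oh, j_total):
    """HCHO lifetime in hours against OH reaction plus photolysis.

    oh_conc : OH number density (molecules cm^-3)
    k_oh    : HCHO + OH rate constant (cm^3 molecule^-1 s^-1)
    j_total : total HCHO photolysis frequency (s^-1)
    """
    loss_rate = k_oh * oh_conc + j_total  # first-order loss rate, s^-1
    return 1.0 / loss_rate / 3600.0


def scale_yields_by_oh(daily_yield, oh_series):
    """Impose a diurnal cycle on a yield using mean-normalized OH (assumption)."""
    mean_oh = sum(oh_series) / len(oh_series)
    return [daily_yield * oh / mean_oh for oh in oh_series]


# Placeholder inputs, not campaign data.
K_OH_HCHO = 8.5e-12      # approximate room-temperature value
J_HCHO = 8.0e-5          # illustrative midday total photolysis frequency
oh_series = [2.0e6, 4.0e6, 6.0e6]  # hypothetical OH at three times of day

lifetimes_h = [hcho_lifetime_hours(oh, K_OH_HCHO, J_HCHO) for oh in oh_series]
hourly_yields = scale_yields_by_oh(1.0, oh_series)
```

Normalizing by the mean keeps the average of the hourly yields equal to the original yield, so the scaling redistributes the yield through the day without changing its mean; a different normalization (for example, by the daily maximum) would change the magnitude as well.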
Diurnal variations in the model were considered with diurnal scale factors based on fossil fuel use. In the model, the emission was 466 kmol h⁻¹ on average and was uniform in the daytime. The top-down emission was 378 (±213) kmol h⁻¹ on average and was smaller than that implemented in the model. Although our analysis yields diurnally varying top-down emissions, they differ marginally from those implemented in the model based on the fuel use.
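As a rough arithmetic check on the statement that the two estimates differ only marginally, the difference between the mean model emission and the mean top-down emission can be compared with the range quoted for the top-down value. The symbols below are introduced here for convenience only, and the comparison says nothing about how that range was derived.

```latex
\[
\left| E_{\mathrm{model}} - E_{\mathrm{top\text{-}down}} \right|
  = \left| 466 - 378 \right|
  = 88~\mathrm{kmol\,h^{-1}}
  < 213~\mathrm{kmol\,h^{-1}}
\]
```

The mean difference is therefore well inside the quoted range of the top-down estimate.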
The number of observations was not large enough to constrain the diurnal variation of AVOC emissions in Seoul. The airborne GeoTASO observations occurred over different locations in the SMA1 + SMA2 region with irregular observation frequencies during the campaign. Also, we assumed the diurnal variations of the HCHO yields for species using OH mixing ratios, which may not represent the real variation of the HCHO yields. However, this result shows the possible applications of observations from instruments onboard geostationary satellites, which conduct regular measurements of target species over the same locations and, therefore, can monitor emission changes more accurately.

Conclusions
We found that the simulated HCHO mixing ratios using the KORUS v5 bottom-up inventory were lower than those of the DC-8 observations during the KORUS-AQ campaign, implying that the AVOC emissions in South Korea are too low in the model. Therefore, we estimated the top-down AVOC emissions over major AVOC source regions in South Korea using GeoTASO HCHO vertical columns from the aircraft platform during the KORUS-AQ campaign. The estimated top-down AVOC emissions were 1.0 to 6.9 times higher than those of the KORUS v5 bottom-up inventory. The simulated HCHO mixing ratios using the top-down emission estimates showed better statistics than the results using the bottom-up inventory. In particular, NMB decreased in magnitude from -25% to -13% in the whole domain and was further reduced to -8% when focused on domestic sources only. The remaining limitations in our method were discussed. The estimated emissions over Daesan, which were 6.9 ± 3.9 times higher than those of the KORUS v5, were higher than the results of Fried et al. (2020). This was because our results included other nearby sources in the Daesan region as well as the Daesan petrochemical complexes, whose plumes were used for the top-down estimate in Fried et al. (2020). Nevertheless, overestimation may be caused by transport influences from outside of the Daesan region. After removing the impacts of transport from other nearby sources, the estimated AVOC emissions were 3.7 ± 2.5 times higher than those in KORUS v5.
In some regions, the air residence time was not long enough to use 1-day yields for the top-down estimates. We investigated the sensitivity of top-down estimates to the changes in HCHO yields related to the air residence time. Over SMA2 and Daesan, the estimated AVOC emissions increased by 42 kmol h⁻¹, but the increments were within the uncertainty ranges. In Gunsan, B-C, and Geoje, the air residence time was too short, and further research is required to improve the estimates.
In addition, we assumed no errors for the AVOC speciation, but there were missing anthropogenic sources for MEK, methanol, ALD2, and ACET in the bottom-up inventory. Compared with DC-8 observations, the simulated mixing ratios showed large biases for toluene, xylene, ≥C3 alkenes, and ethene over SMA2 and for benzene, ≥C4 alkanes, ≥C3 alkenes, and ethene over Daesan. Therefore, to improve the accuracy of the top-down estimate method, areas with calm weather conditions must be selected to satisfy the air residence time for yields, and AVOC speciation should be more accurate in the bottom-up inventory.
We estimated the hourly top-down AVOC emissions over SMA1 + SMA2 using GeoTASO HCHO VCDs to show the diurnal variation of AVOC emissions and to demonstrate the applicability of geostationary satellite observations. Although the estimated emissions varied diurnally, the hourly estimated emissions differed marginally from those implemented in the model. To accurately estimate the diurnal variation of AVOC emissions, regular measurements and HCHO yield diurnal variations are required. Geostationary satellites such as GEMS, TEMPO, and Sentinel-4 will enable the accurate and detailed evaluation of diurnal variation in AVOC emissions.

Data accessibility statement
Data from the Korea-United States Air Quality campaign used in this study can be downloaded through the data archive website (https://www-air.larc.nasa.gov/cgi-bin/ArcView/korusaq).

Supplemental files
The supplemental files for this article can be found as follows: Tables S1-S5. Figures S1-S3. Docx.