Representing climate evolution in ensembles of GCM simulations for the Hudson Bay System

Climate impact studies often require a reduction of the ensembles of opportunity from the Coupled Model Intercomparison Project when the simulations are used to drive impact models. An impact model ’ s nature limits the number of feasible realizations based on complexity and computational requirements or capacities. For the purpose of driving a hydrological model and an ocean model in the BaySys research program, two hierarchical, differently sized simulation ensembles were produced to represent climate evolution for the region of the Hudson Bay Drainage Basin. We compare a 19-member ensemble to a 5-member subset to demonstrate comparability of the driving climate used to produce model results. Ten extreme climate indicators and their changes are compared for the full study region and seven sub regions, on an annual and seasonal basis and for two future climate horizons. Results indicate stronger warming in the North and for cold temperatures and an East-West gradient in precipitation with larger absolute increases to the East and South of the Hudson Bay. Generally, the smaller ensemble is sufficient to adequately reproduce the mean and spread in the indicators found for the larger ensemble. The analysis of extreme climate indicators ensures that the tails of the distribution of temperature and precipitation are addressed. We conclude that joint analysis at the interface of the hydrological and ocean model domains are not limited by the application of differently sized climate simulation ensembles as driving input for the two different modeling exercises of the BaySys project environmental studies, yet acknowledging that impact model output may be dependent on other factors.


Introduction
Physical, biological, and biogeochemical processes in Hudson Bay are the focus of the Hudson Bay System (BaySys) multiyear research program. BaySys research aims to explore the influence of both climate change and anthropogenic regulation of freshwater on various marine components. Central to the project's research objectives are climate scenarios used to drive various impact models established by the BaySys teams, where changes in temperature and precipitation are the primary drivers for ecosystem change. This article presents a review of the climate scenarios developed for the BaySys project covering 1981-2070 at various temporal and spatial scales.
Situated entirely north of 45 latitude, the Hudson Bay Drainage Basin (HBDB) extends into the Arctic and spans eleven ecozones offering a multitude of landscapes and climatic characteristics. Because of the HBDB's geographic location, it experiences warming at a rate considerably greater than the global mean-a phenomenon known as Arctic amplification (Serreze and Barry, 2011). There is strong scientific consensus that temperatures in Canada will continue to increase into the future, and the warming rate is expected to be about twice that of global mean (Zhang et al., 2019, Stadnyk and. Certain oceanic processes within Hudson Bay itself, such as summer sea ice coverage, are particularly sensitive to temperature and show a declining trend in many oceans surrounding Canada over 1968(Derksen et al., 2019. Analysis of global climate model (GCM) sea surface temperature (SST) in Greenan et al. (2019) suggests that future projected increases in summer SST exceed winter SST, largely due to the absence of ice cover in summer, which allows for greater heat gains.
Beyond the impact of air temperature on SST and sea ice, hydrological processes and changes in the HBDB are driven by climate, which in turn impacts the freshwater entering Hudson Bay through rivers and streams (MacDonald et al., 2018). Sillmann et al. (2013) project statistically significant changes in several temperature and precipitation-based extreme indices for the HBDB including projections of fewer cold spells, more warm spells, increasing precipitation on wet days, and increased number of very wet days. Seneveratne et al. (2016) shows that changes in extreme variables, such as the annual minimum night time temperature and annual maximum 5day consecutive precipitation, are greater over the Hudson Bay and HBDB region, relative to the global mean.
Hudson Bay together with James Bay, Foxe Basin (FB), Hudson Strait, and Ungava Bay comprise the world's largest inland sea and is the terminus for a drainage area that spans approximately 3.8 million km 2 (Kuzyk and Candlish, 2019). The drainage area, herein termed the HBDB, is of special importance as it provides a major source of freshwater and nutrients that help sustain a diverse marine ecosystem. Climate scenarios developed for BaySys must therefore cover the oceanic boundaries, as well as the HBDB land mass. Minimum temperature, maximum temperature, and total precipitation at the daily resolution are the primary variables considered for climate scenario selection assessment of the climate ensembles in this study. Although additional variables were used to drive the ocean models, they were not considered as influential in driving change within the ocean system as temperature and precipitation (Lukovich et al., 2021).
Climate simulation ensembles from the CMIP archives (Coupled Model Intercomparison Project; Taylor, et al., 2012) provide a large number of individual simulations. The properties of such ensembles have been studied to understand their characteristics and improve the outcomes of climate analysis (e.g., Tebaldi andKnutti, 2007 Masson andKnutti, 2011), however for impact studies, the amount of data often exceeds the computational and processing capacities. Approaches to reduce the size of an ensemble while reasonably retaining its key properties in terms of variability and climate change signal have been proposed (Cannon, 2015;Casajus et al., 2016;Lutz et al, 2016;Mendlik and Gobiet, 2016;Wilcke and Bärring, 2016;Herger et al., 2018). Constraints in applying such procedures may arise from impact model requirements and data availability. To provide the hydrology and ocean circulation modeling groups within BaySys with transient climate information, two climate simulation ensembles were developed using a combination of meeting data availability constraints and the k-means clustering approach proposed by Casajus et al. (2016). A larger ensemble of 19 simulations at daily temporal resolution was provided to the BaySys hydrological modeling team (MacDonald et al., 2018, and a smaller subset of 5 climate scenarios was used by the BaySys ocean modeling team needing subdaily resolution and additional variables like air pressure, wind, and solar radiation (Lukovich et al., 2021). To ensure adequate diversity in future climate scenarios was captured by the smaller subset, an in-depth assessment of climate scenario performance between the larger hydrologic climate ensemble and ocean subset was needed.
Within the BaySys research program, several recent studies examine the regional and HBDB-wide hydrologic changes associated with climate impacts based on the larger climate scenario ensemble (MacDonald et al., 2018;Lilhare et al., 2019;Stadnyk et al., 2019;, and one over the extended pan-Arctic domain using the smaller climate scenario subset ensemble , Lukovich et al., 2021. More studies are currently underway; therefore, it is important to understand the differences and similarities among the two ensembles, and their relative variability, given their influence on recent literature. The objective of this article is to present the BaySys climate scenario ensemble development and to assess the evolution of climate variables across the HBDB terrestrial domain for two distinct ensembles: the main terrestrial ensemble, and a smaller subset of the main ensemble used to drive the ocean model for BaySys. Using a set of climate indicators proposed by the Expert Team on Climate Change Detection and Indices (ETCCDI; http:// etccdi.pacificclimate.org/list_27_indices.shtml), we examine to what extent the smaller subset is representative of the larger ensemble in terms of the evolution of historical climate across the terrestrial domain, and the projected changes for two future periods (near future: 2021-2050, far future: 2041-2070). This article focuses on climatic change across the terrestrial domain, and Lukovich et al. (2021) address changes over the marine system domain.

Study domain
The HBDB terrestrial region drains approximately onethird of Canada's landmass (3.8 M km 2 ) and spans 6 provinces, 2 territories, and 4 states of the United States of America. The HBDB terrestrial domain is assessed as one region and also subdivided in a portion of our analyses into 7 subregions by drainage outlets and associated upstream contributing area (i.e., FB; Northwestern Hudson Bay [NWHB]; Hudson Lowlands/West James Bay; East James Bay; East Hudson Bay/Ungava Bay [EHB-UB]; Nelson Basin [NB]; Churchill Basin). These subdivided regions were employed to compare global climate model data of different spatial resolution and to provide climate evolution analyses corresponding with BaySys teams analyzing runoff over similar regions. This subdivision allows to characterize spatially dependent climatic variability across the expansive HBDB region ( Figure 1).
Each subregion is highlighted by one of the 7 colors displayed in the map legend. The provinces/territories (Canada) and states (United States) are delimited in gray.
Major rivers and tributaries in the HBDB drain into the southern James Bay, into Hudson Bay centrally, and Hudson Strait and FB to the North. Owing to significant latitudinal and longitudinal gradients, vegetation and land cover are highly varied across the region. To the western edge of the drainage basin are the Rocky Mountains (approximately 4,000 masl in elevation), which are dominated by rock, moss and lichen coverage, predominantly coniferous forest with increasing deciduous forest moving eastward, and wetland-dominated terrain in low elevation regions overlying poorly draining soils and bedrock. East of the Rocky Mountains is the Canadian Prairie region to the south dominated by grasslands and agriculture, disconnected drainage regions (Spence and Woo, 2003;Shook et al., 2013), poorly draining clayey soils, and the Boreal forest and shield region to the north. The central portion of the basin (east and west James Bay regions) similarly resides within the Boreal ecoregion, dominated by coniferous forest, numerous wetlands, and extensive surface water. The southern edge of James Bay resides within the poorly draining and deltaic-like Hudson Plains ecoregion, with Taiga Shield dominating the Ungava Bay and Western Hudson Bay region further north along the eastern edge of the drainage basin. FB is classified as a mix of Southern and Northern Arctic regions (based on ecoregions defined by Bostock, 1970). Mean annual air temperatures commonly remain <0 C at the northern extent of the basin (FB); however, the southern regions of the basin are typical of the Great Plains with annual mean air temperature above freezing and large amplitudes between -30 C in the winter to þ30 C in the summer months. More detailed assessments of the hydrography and stratigraphy of the region are presented in Dery et al. (2016), Stadnyk et al. (2019), and Stadnyk et al. (2020).

Reference data and variables of interest
Historical daily observations of temperature and precipitation were obtained from the Hydrological Global Forcing Data (HydroGFDv2) reanalysis data set (Berg et al., 2018), which has global coverage at 0.5 spatial resolution. These data were selected as baseline historical observations for the BaySys project after an extensive comparison of several reanalysis products meeting the needs of the project, having sufficient spatial and temporal coverage of the domain, and demonstrated evidence within the peer-reviewed literature of their accuracy within the HBDB domain (Lilhare et al., 2019;Pokorny et al., 2020). HydroGFDv2 was selected among these products because of their demonstrated performance across the domain, spatial and temporal resolution, their global coverage, and near real-time updating with historical fields produced back to 1960. Their background field utilizes ERA40 and ERA-I, which also makes them consistent with the ERA-based background forcing used for the NEMO ocean model in BaySys, providing continuity between the terrestrial and marine domains. Data from the HydrGFDv2 data set were obtained from 1960 to 2018, including precipitation, minimum temperature, maximum temperature, and were provided by the Swedish Meteorological and Hydrological Institute for the BaySys project. Historical climate data from HydroGFDv2 were used to drive and calibrate the hydrologic model for the BaySys project (HYPE; Stadnyk et al., 2020) and were employed as the reference climate for the bias correction of global climate model simulation data.

Future climate scenarios and selection of ensembles
Climate simulations for the BaySys project were collected from GCMs participating in Phase five of the Coupled Model Intercomparison Project (CMIP5; Taylor, et al., 2012). A total of 54 simulations from 29 GCMs ( Table  1) were considered using daily outputs for minimum temperature, maximum temperature, and total precipitation archived for historic and future periods of interest. Simulations were selected from a larger ensemble (of 71) to include only GCMs driven by Representative Concentration Pathways RCP4.5 and RCP8.5 (van Vuuren et al., 2011). The lower emissions scenario RCP2.6 was deemed unlikely given current levels of emissions and mitigation success, and RCP6.0 was deemed redundant as it overlaps with RCP4.5 and RCP8.5 scenarios. In order to avoid skewing ensemble statistics toward GCMs with multiple runs, only 1 run per GCM-RCP combination was considered. This results in 54 simulation ensemble samples ( Table 1).
BaySys employs daily data from climate model simulations to drive hydrological models over the HBDB and using all 54 CMIP5 simulations exceeds the computational capacities of BaySys modeling teams. A k-means clustering approach was applied to reduce the 54 available simulations. The approach proposed by Casajus et al. (2016) allows to objectively select a subset of simulations. The method that partitions n objects, described by p variables into k clusters, permits to optimize the representation of the uncertainty range of the ensemble (Casajus et al., 2016). Based on a set of selection criteria the clusters of similar simulations are then represented by the simulation closest to the cluster's center (schematic illustration provided in appendix C of Kuzyk and Candlish, 2019). While a good representation of the CMIP5 ensemble was the key objective, some additional constraints were applied to the selection process for BaySys. The k-means algorithm was forced to meet these constraints as follows.
First, the number of 19 simulations was determined as the threshold to explain roughly 90% of the variance of the 54 simulation ensemble, following the cost-benefit optimization proposed by Casajus et al. (2016, supporting information S2). To assess climate change impacts in BaySys, both on the hydrological systems around the Hudson Bay, as well as the water body of Hudson Bay itself, it was important to include CMIP5 climate simulations that provided the variables and temporal resolution required to drive the NEMO ocean model for the Hudson Bay marine domain (Lukovich et al., 2021). Overall, accessible data archives contained 5 simulations for which the required 3-hourly data for driving NEMO were available, providing temperature, precipitation, surface pressure, longwave and shortwave radiation, specific humidity and wind. While the initial project planned for using three climate simulations to feed NEMO, additional computational capacities allowed to employ all five simulations providing the necessary data. Starting from this ensemble of opportunity, the clustering algorithm was forced to include these five models in the subset of 19 by applying The subset of 19 simulations selected through cluster analysis are bolded in the RCPs column. The subset of 5 simulations used to drive the NEMO model are bolded and accompanied with the "N" superscript in the RCPs column. weights. Second, BaySys as a large Canadian project sought to include results from Canadian climate research. Therefore, selection of the Canadian Earth System Model (CanESM) was favored as the representative of its respective cluster. Clustering was performed on the 54 CMIP5 simulations described above. Ten variables were used as clustering criteria, derived from spatial averages over the BaySys domain ( Figure 1) and consisting of annual (2) and seasonal (8) changes in temperature and precipitation between the reference period 1981-2010 (H) and the far future period 2041-2070 (F2). Variables were standardized to account for their different units. Numbered dots in Figure 2 show the 19-member ensemble (ENS) and the 5-member subset ensemble (SUB) in the 2D space of annual mean precipitation change and annual mean temperature change. The ensembles were used by BaySys project teams to drive hydrological modeling (MacDonald et al., 2018;Stadnyk et al., 2020;Tefs et al., 2021) and the NEMO ocean model (Lukovich et al., 2021). In this study we focus on the variables used in both modeling exercises, that is, temperature and precipitation.
The 54 simulation ensemble samples ( Table 1) originate from a wide range of modelling institutions, spatial resolution, emission scenarios (i.e., RCP) and responses to increased greenhouse gases, as represented by the Equilibrium Climate Sensitivity (ECS; Flato et al., 2013). Since climate change assessments are subject to various sources of uncertainty including emission scenario, GCM selection, and internal climate variability (e.g., Chen et al., 2011;Vermeulen et al., 2013), a broad ensemble is important to ensure multiple sources of climate simulation uncertainty are captured. While the ECS represents an individual model's global mean temperature (GMT) change in response to increased greenhouse gasses, it is also important to recognize regional differences, especially for northern areas where warming is accelerated (Serreze and Barry, 2011). To illustrate regional differences, Figure 3 shows the ratio of local mean temperature (LMT) change to GMT change using the median value from raw GCM surface temperature (CMIP5 tas variable) data from the 54, 19, and 5-simulation member ensembles. A moving 30-year window is used to generate local (GCM grid) and GMT changes, which are then interpolated onto a common 1 latitude Â 1 longitude grid. Following a time sampling approach (James et al., 2017) the median of 54 GCM simulations projects 2 C of GMT warming (0.61 C observed plus 1.39 C projected from 1981-2010) to occur in the 2030-2059 period. The median scaling relationship between LMT and GMT changes, developed from 54 GCM simulations, project the Hudson Bay and HBDB region to warm at approximately 1.9 times the rate of GMT, which is consistent with supplementary information provided in Seneviratne et al. (2016). Zhang et al. (2019) found that Canada's land mass, as a whole, is projected to warm at about twice the rate of GMT. Overall, it is evident that warming for the Hudson Bay and HBDB can occur more rapidly relative to the global average change.

Bias correction
Daily values of maximum and minimum temperature along with precipitation were extracted for the selected simulations for the period 1981-2070 (90 years). The raw simulation data were bias-corrected using the quantile mapping approach proposed by Mpelasoka and Chiew (2009). The HydroGFDv2 reanalysis data set was used for climatic reference. The higher resolution reference grid points were spatially averaged to match the lower resolution climate simulation grid points. Trends were removed from time series prior to the correction using a third order polynomial and reapplied subsequently. A moving window approach was applied to build transfer functions for each day of the year using 15 days prior to and after the day being corrected. Additive correction was used for minimum and maximum temperatures; precipitation was corrected multiplicatively.

Climate indicators
Historical and future climatologies and trends within the HBDB are assessed herein using 10 indicators ( Table 2) selected from 27 core climate extremes indices recommended by the ETCCDI (http://etccdi.pacific climate.org/ list_27_indices.shtml). The temperature based-indices selected to describe climate in the HBDB are the hottest and coldest day (TXx, TNn), along with the number of extreme cold days (FD, ID), and given mean winter (TN10p) and summer (TX90p) extreme temperature changes (Zhang et al., 2011). Selected precipitation based-indices include RX5day and CDD for characterization of multi-day wet and dry precipitation events, while PRCPTOT and SDII give useful information on daily precipitation for longer temporal windows (Sillmann et al., 2013a(Sillmann et al., , 2013b. As the definition of the temperature percentile based indices indicates, the threshold for a given day of the year calendar is calculated using data from 5 consecutive days centered on the day of interest over the selected base period, or from a sample of 150 data for this study (Zhang et al., 2005). Examples of these thresholds at the grid point for all 19 simulations are provided in supplemental Figures S1 (on January 1) and S2 (July 1) for TN10p, and Figure S3 and S4, respectively, on the same dates but for TX90p. Note that since the selected 19 simulations all share the same historical period (1981-2010), a model with two RCPs (4.5 and 8.5) gives identical (historical) results for its two members, where years 2006-2010 were sampled from the RCP4.5 experiment. show the ensemble median from 54, 19, and 5 GCM simulations, respectively. Data are re-gridded to a common 1 Â 1 grid. Hudson Bay and HBDB region is hatched for visual reference. Maps generated using M_MAP script (Pawlowicz, 2020

Analysis of climate indicators
From GCM precipitation and temperature data sets, climate extremes indices ( Table 2) were calculated per model grid for annual and seasonal aggregations using the MeteoLab toolbox (Santander Meteorology Group, 2015). Two different analyses were chosen to assess the climate change signal and compare the two differently sized ensembles. First, a global temporal analysis was based on annual and seasonal spatial averages over the entire HBDB, using all land-based grids. Trends were evaluated from the individual 90-year annual, and 90-year winter and summer time series, assessed for each simulation and ensemble at 5% significance (P value ¼ 0.05) using a modified version of the Mann-Kendall test (Mann, 1945;Kendall, 1975), which applies Mann-Kendall trend-free prewhitening (MK-TFPW) proposed by Yue et al. (2002). The MK-TFPW procedure improves the detection of significant trends in serially correlated hydrological time series. Past and future climatologies for both ensembles at the HBDB scale were compared using three time frames: the historical period 1981-2010 (H), a near future period 2021-2050 (F1) and a far future period 2041-2070 (F2). Violin plots were generated using samples of 570 and 150 absolute values derived from the 30-year annual time series for ENS 19 and SUB 5 simulations, respectively. Violin plots combine both box plot and probability density function information into one (Hintze and Nelson, 1998), with a vertical line representing a classical boxplot and separating two identical kernel density estimates of the ensemble distribution. A circle represents the data median, a thicker line their interquartile range, with a thin line extending above (below) the 75th (25th) percentile corresponding to the upper (lower) whisker range, and tails on the density trace indicating outliers.
Furthermore, an assessment of future mean change from both ensembles alongside their standard deviation was performed by computing change values as Delta F1 (F1-H) and Delta F2 (F2-H). In the specific cases of TX90p and TN10p, change assessment between time frames uses exceedance days instead of their rates as both indicators already represent exceedance rates relative to a baseline period (Sillmann et al., 2013b), which in our study corresponds to the historical (H) period.
The second analysis evaluates indicators across seven HBDB subregions (Figure 1) at the seasonal scale to evaluate both ensembles near and far future median changes at a regional scale corresponding with significant watershed drainages. Seasons are defined as DJF, MAM, JJA, and SON, by generating in each case 3 months maximum (for TXx and CDD), minimum (for TNn), total (for TX90p, TN10p, FD, ID, PRCPTOT, and RX5day), and average (for SDII) data sets. The two ensembles per subregion are represented by their mean and coefficient of variation (CV ¼ standard deviation/mean) for seasonal indicators for time periods H, F1, and F2, calculated from the 30-year time series of grids defined by subregion polygons. The future change of mean regional indicators and their CV were calculated per simulation as DF1 and DF2 with respect to H. ENS and SUB are represented by the ensemble median of near and far future changes (DF1 and DF2).

Temporal analysis at the global scale of the HBDB 3.1.1. Trend analysis
We detect significant positive trends in TXx, TNn, TX90p, PRCPTOT, RX5day, and SDII, and negative trends in TN10p, FD, ID (not significant for SUB summer), and CDD in annual, winter, and summer time series from 1981 to 2070 using both ensembles. Figures S5-S14 present the results of this analysis for the 10 indicators. Time series show the interannual variability in all indicators, with some discrepancies between ENS and SUB means for PRCPTOT and CDD (Table 3), where slopes of the trends detected are expressed per decade for all 10 indicators. Trends in TNn are stronger in winter, with slopes for SUB deviating slightly from ENS for annual, winter, and summer precipitation based indices and summer TXx, TNn, and TX90p.

Distributions and changes in annual averages
Historical and future climatologies are summarized using violin plots from both ensembles (Figure 4). Violin plots for TXx and TNn (Figure 4, first row) indicate warming from the historic (H) to future periods (F1, F2), with increasing mean exceedance rates of warm days (cold nights) increasing (decreasing) from 10% to more than 20% (less than 5%) by 2070 (Figure 4, second row). Since exceedance thresholds were held constant, we illustrate a decrease in the number of outlier warm days (cold nights) in the future. We similarly see a decline in FD and ID (third row). Precipitation indicators for amount (PRCPTOT), extremes (RX5day), and intensity (SDII) all indicate gradual increase in future periods, while dry periods (CDD) show a slight diminution. As can be expected, indicator distributions cluster around the median and appear to be relatively normally distributed, with some asymmetry appearing in the far future period for both the ENS than SUB. Skewed distributions are found for TXx, TX90p, and FD, ID. Table 4 summarizes the ensembles climatological near and far future mean changes with their standard deviation in brackets. Both ensemble (ENS and SUB) magnitudes of change are very similar, suggesting that SUB is representative of the ENS mean projected change ( Table  4). Minimum temperature (TNn) is projected to increase up to 5 C on average while an increase of near 3 C of maximum temperature (TXx) can be expected for the F2 period. From future periods F1 to F2 an augmentation of TX90p by more than 30 days to 55 days, on average, was found. Accordingly frost days (FDs) have a negative trend in the same temporal window with a reduction of up to 26 days, on average.

Spatial analyses by subregion of the HBDB
We analyze seasonal changes for the near (F1) and far (F2) future periods with respect to the reference period (H). Changes are represented for seasonal means and coefficient of variation (CV) by region (for details see Section 2.6), as shown on Figures 5-13. Both ENS and SUB future median changes of TXx and TNn's means indicate continuous warming, more pronounced for the TNn than for TXx. The strongest future warming of TXx (TNn) during winter (fall) is detected in both ensembles at higher latitudes in Foxe Basin, Northwestern Hudson Bay, and East Hudson Bay/Ungava Bay (Figures 5 and 6, respectively). Strong warming of TXx also occurs in Hudson Lowlands/ West James Bay and East James Bay during spring, where SUB shows larger increases than ENS. A peculiarity is noted over the Foxe Bay region for TXx, where SUB projects a decrease in the spring TXx mean for the far future period F2 that is associated with an increase in variability (CV), which is opposite to other trends and the ENS trend. The variability of both temperature indices is projected to decrease consistently and slightly, except for spring TXx derived from SUB. There is a slight increase in CV associated with lower TXx in both future periods. Both ensembles agree on persistent future increases (decreases) across seasons and regions in the number of warm days (cold nights), with their variability sometimes disagreeing in terms of direction of change (Figures 7 and  8, respectively). Changes are most pronounced for TX90p in SUB, where summer and fall median exceedance days increase by up to 27 days in the FB for the far future period. Small reductions in the number of exceedance days are projected for TN10p (4-9 days).
FDs and icing days (IDs) indicate small to no future mean change during winter across all subregions ( Figures  9 and 10, respectively). This pattern holds during summer, where most regions do not have FDs and IDs do not occur at all except in FB, hence no changes are found. It is during the shoulder seasons where climate change impacts these indicators most significantly by reducing their occurrence by 2-10 days for F1 in spring and in the fall. An overall reduction of 10-15 FD is expected in spring and fall for most subbasins by F2. A notable exception is found in FB where fall and summer FD decrease further into the future from F1 to F2, but little to no decrease in spring (Figure 9).
IDs similarly show little to no change in summer and winter. Variability in both FD and ID increase most across the southern HBDB in summer. ID decreases almost homogeneously across all subregions in spring, with no trend from F1 to F2 except the slight increase (1 day) shown by SUB for F2 in the FB. Continued and sustained warming into the far future (F2) appears to further reduce IDs only in the three northernmost subregions during fall. Overall, the effect of warming for the HBDB appears to reduce periods of freeze-thaw cycling more than it reduces the number of days with temperatures below freezing.
Total precipitation amount is projected to increase continuously into the future in all seasons, across the HBDB with two exceptions (Figure 11). A decrease of 5 mm in mean PRCPTOT is projected for the NB during summer by ENS for F2, and SUB indicates a very small decrease (<1 mm) in spring for F1 over the FB. Agreement between projections from ENS and SUB is generally lower for PRCPTOT across all seasons and regions than is observed for temperature indices, with SUB changes being slightly higher than ENS. Increases are larger in the Eastern and  Southern subregions, with more pronounced changes in spring and fall. With precipitation being a generally noisier and less spatially consistent variable, trends in the variability (CV) more often disagree between future time horizons, as well as among the ensembles, relative to the temperature indices.  Extreme 5-day precipitation expressed by the RX5day indicator ( Figure 12) is projected to continuously increase through our study period, with more pronounced increases in the winter and summer. A single outlier occurs in the far future from ENS where an almost negligible decrease is projected for the NB (-0.34 mm). By  comparing Figure 12 to Figure 4, we observe that the overall increase in RX5day projected across the HBDB is composed of seasonally and spatially quite varied changes.
For instance, increases are relatively homogeneous in space in the winter but are greater in the south in spring and summer. And while the Prairie subregions experience Figure 11. Changes in the regional seasonal total precipitation (PRCPTOT). Colors indicate ensembles median of changes to the indicator's regional mean, while upward/downward pointing triangles indicate ensemble median increase/decrease in spread (coefficient of variation). The four symbols per region are arranged to indicate changes in the mean for F1 in the left column and F2 in the right column and compare ENS values in the top row to SUB values in the bottom row. DOI: https://doi.org/10.1525/elementa.2021.00011.f11 Figure 12. Changes in the regional seasonal maximum 5 day precipitation (RX5day). Colors indicate ensembles median of changes to the indicator's regional mean, while upward/downward pointing triangles indicate ensemble median increase/decrease in spread (coefficient of variation little change in RX5day precipitation from fall to spring, summer increases are among the most pronounced. Analysis of SDII by subregion indicates very small changes (<1 mm/day; not shown). CDD as an indicator of meteorological drought exhibits relatively small positive and negative changes for most subregions and seasons. The number of consecutive dry days, however, decreases in the northern regions in the far future (F2) by up to 4 days in the summer and up to 7 days in spring ( Figure 13).

Discussion
In this study, we evaluate the comparability of two differently sized, hierarchical climate simulation ensembles. Derived from the CMIP5 data set, the smaller subset ensemble (SUB) consists of five simulations that are part of the larger 19-member ensemble (ENS). The rationale of the two ensembles stems from (1) data handling and computational capacity within the BaySys project, as well as (2) data availability. The nineteen member ensemble ENS was developed with the objective to force the hydrological modeling over the HBDB requiring only daily temperature and precipitation with a manageable number of inputs that span the uncertainty in the CMIP5 ensemble of opportunity . The smaller, five member ensemble simulations (SUB) were used to drive the ocean modeling of the Hudson Bay by the NEMO ocean circulation model (NEMO, Lukovich et al., 2021), which is computationally more demanding and requires additional forcing data; thus allowing the use of fewer climate simulations. In the evaluation of climate change and river regulation impacts on Hudson Bay, the results of the two modeling efforts converge at the interface of river mouths and estuaries. It is therefore of interest to assure that the two ensembles are commensurate in both their representation of climate, as well as their climate change signal. We use 10 climate indicators ( Table 2) chosen from the climate extremes indices recommended by the ETCCDI to evaluate the two ensembles.
In the "global" assessment over the entire HBDB, annual, as well as summer and winter trends are statistically significant for both temperature and precipitation related indicators. As would be expected, extreme cold temperatures (TNn) increase faster than extreme warm temperatures (TXx), however, the number of cold nights (TN10p) decreases at almost half the rate of the increase in warm days (TX90p). With increasing temperatures, FDs and IDs both become less frequent into the future. Their changes, however, do manifest very little in the winter and summer seasons, with most of the changes occurring in the shoulder seasons. Overall wet day precipitation (PRCPTOT) increases into the future as does the precipitation amount of extreme events described by five consecutive day precipitation (RX5day). While PRCPTOT increases are stronger in winter, RX5day increases are relatively uniform across all seasons. Precipitation increases are accompanied by a slight increase in daily precipitation intensity (SDII). Consequently, the number of consecutive dry days (CDDs) is declining across the HBDB. Table 3 indicates these results are stable across the two ensembles considered. Figure 13. Changes in the regional seasonal consecutive dry days. Colors indicate ensembles median of changes to the indicator's regional mean, while upward/downward pointing triangles indicate ensemble median increase/decrease in spread (coefficient of variation Comparison of the distributions of each indicator for the historical (H) near future (F1) and far future (F2) periods confirms the similarity of the two ensembles. Violin plots (Figure 4) illustrate that both the median and kernel density is well preserved between ENS and SUB for all three periods, although the violins for SUB tend to be less densely populated. Only in some cases are the SUB violins stripped of some outliers, seen as cut off tails, with the different populations leading to slightly modified shapes. Violin plots for seasonal indicators support good agreement between the two ensembles (not shown).
The comparability of the two ensembles persists within the climate change signal calculated for annual values of the indicators ( Table 4). Temperature changes deviate by no more than one-tenth of a degree, except for extreme cold temperatures in F2 where SUB warms 0.3 C less than ENS. Indicators reported in days disagree by no more than one day of change between ENS and SUB. Among the precipitation indicators, SUB indicates slightly stronger increases than ENS and exceeds the full ensemble change by 2%. This points to overall slightly wetter future conditions projected by SUB simulations and may be due to the strong representation of MIROC5 simulations, which have previously been described as simulating the overall highest PRCPTOT and RX5day (Sillmann et al., 2013a). Due to the smaller sample size, the standard deviation in the climate change signal is consequently larger for SUB than ENS.
For a more detailed portrait of the changes over the HBDB, quantification of future seasonal change was completed for 7 subregions of the HBDB. By evaluating ensemble median future change in each indicator's mean and coefficient of variation, we discern the details of the spatial and seasonal differences in climate change signals and reveal some disagreement between the ENS and SUB. From temperature indices, arctic amplification effects are detected within the northern FB, NWHB, and EHB-UB. This is evident from extreme warm temperature changes (TXx) for winter and summer, but less pronounced for the cold extreme temperatures (TNn). Stronger, more pronounced warming is also apparent in northern regions among less extreme climate characteristics, for example, the increase in the count of warm days (TX90p). In agreement with the detected trends discussed above, the number of cold nights (TN10p) decreases at a lower rate, more spatially uniform and with only a slight latitudinal gradient. The more detailed look at FDs and IDs (Figures 9 and 10) indicates that the changes occur mainly in the shoulder seasons (MAM and SON), where FDs decline more in the far future period than IDs. For the northern sub regions, we detect that the number of FD also decreases in summer (JJA).
The temporal-spatial differentiation in total precipitation (PRCPTOT) indicates that slight increases are stronger along the Eastern portion of the HBDB. This longitudinal gradient is most pronounced in fall. Five consecutive day cumulative precipitation (RX5day) similarly increases longitudinally with stronger increases in the East; however, this pattern is reversed in summer where southern and central subregions indicate stronger increases in extreme rainfall. These patterns appear to amplify into the far future, with larger gradients across the HBDB. Since larger amounts of consecutive 5-day precipitation have direct implications for streamflow (Sillmann et al., 2013b), these findings align well with similar spatial patterning described for runoff by Stadnyk et al. (2019). Changes in the number of CDDs across the HBDB mostly slightly decrease but with some indication of a small increase in fall for the southern and western subregions. Northern subregions may experience dry periods that are shortened by up to a week in the far future.
Trends in the variability of the indicators for the two future periods reveal only minor change. Because of these small changes, we retain only the sign of the change for the presentation of our results. We find that changes in variability are not only small, but also noisy, and the sign may change between future horizons and ensembles. Possible changes in variability at finer scales were not detected by our subregional lumped analysis.
Similarity between the two ensembles, ENS and SUB, holds in the majority of comparison across subregions and among seasons. While the sign of the change in variability flips for some regions and seasons, differences in the change signal are minor and remain mostly within the same or adjacent color classes, as can be seen in the maps. We can, therefore, state that the results of the climate change analysis of overall annual and seasonal climatologies over the terrestrial HBDB drawn from ENS or SUB are commensurate. The ensembles may differ in finer detail where SUB may not reproduce aspects sampled with ENS, thus the latter remaining the preferred ensemble. It should be noted that compared to ENS, SUB projects intensified warming over parts of Hudson Bay (Figure 3) which may affect results for marine studies related for example to sea ice and water temperatures.
In summary, we report that the future climate of the HBDB is projected to be warmer, with more extreme warm temperatures and less extreme cold temperatures, with more warm days and fewer cold nights. The number of days with freeze-thaw cycles (FD) will decrease, particularly in the shoulder seasons, leaving also less IDs in those seasons. Temperature-related changes are generally more pronounced in the northern, high-latitude portion of our study domain, confirming the arctic amplification effect. Future climate conditions also project an overall increase in precipitation, and rainfall from extreme 5-day events. Increases in precipitation are associated with an increase in daily precipitation intensity and a reduction in the duration of dry periods. While nuanced in some subregion seasonality, a general conclusion across the HBDB is a longitudinal gradient with stronger increases in absolute precipitation amounts for the Eastern portion of the drainage basin.

Conclusions
Making a selection from the large number of climate simulations available from CMIP ensembles is often a requirement for studies employing large-scale and/or long-time series impact models that ingest data at daily or finer time steps. Computational and data handling constraints make this step inevitable to reduce data input/ output and computational cost. In the context of studying climate change and river regulation impacts on the Hudson Bay for the BaySys Project, a hydrological model (HYPE;MacDonald et al., 2018;Stadnyk et al., 2020 and an ocean circulation model (NEMO; Lukovich et al., 2021) were driven with the two different sets of climate simulation data compared in this study. The smaller of the two ensembles used to drive NEMO was a subset of five simulations from the larger 19 simulation ensemble used to drive HYPE. The ensembles were selected from 54 CMIP5 simulations using annual and seasonal mean climate change signals of precipitation and temperature variables, as this was the focus for the BaySys project and the most influential driving variables for the BaySys suite of models. We analyzed the comparability of the two ensembles by comparing 10 climate indicators (Zhang et al., 2011) calculated from daily temperature and precipitation values for the HBDB and across 7 subregions. Statistics of the indicators were compared for a historical reference period and 2 future time periods, on an annual and seasonal basis. In general, cold temperatures over the HBDB will increase faster (up to 8 C) than warm temperatures (up to 4 C), with stronger warming in the North. This warming will reduce FDs and IDs almost exclusively during the shoulder seasons and more so in the southern portion of the drainage basin. Total precipitation as well as 5-day maxima will increase by up to 10% in future, with slightly higher average precipitation intensities. This affects dry spells, which our analysis projects to be slightly shorter in the future. Absolute precipitation increases along a west to east gradient, with larger increases in the eastern portion of the study region. This gradient is less pronounced for 5-day precipitation amounts, where increases are also projected for southern and eastern basins, particularly in summer and winter. By definition, the 19-member ensemble is more densely populated and therefore samples the climate in more detail. Our comparison, however, shows that both the baseline and future climate mean and variability of the indicators are commensurate among the two different-sized ensembles. Since our comparison is built on extreme indicators, this comparability is maintained regarding changes to highest and lowest temperatures, as well as the largest precipitation amounts and dry spells, although the most extreme outliers in the distributions are less frequent in the smaller ensemble. We conclude that over the HBDB, studies and sub projects of BaySys that rely on the smaller sub ensemble of 5 simulations sample a very similar climate as the ones that can employ the larger 19 simulation ensemble. It can be expected that joint analysis of model results the domain interfaces, although they may be subject to other factors, are not substantially limited by the representation of the driving climate through differently sized ensembles.

Data accessibility statement
All data used in this research are accessible from the Coupled Model Intercomparison Project version 5 (CMIP5) and were post-processed by Ouranos.

Supplemental files
The supplemental files for this article can be found as follows: Figures S1-S14. Docx