Projecting regions of North Atlantic right whale, Eubalaena glacialis, habitat suitability in the Gulf of Maine for the year 2050

North Atlantic right whales (Eubalaena glacialis) are critically endangered, and recent changes in distribution patterns have been a major management challenge. Understanding the role that environmental conditions play in habitat suitability helps to determine the regions in need of monitoring or protection for conservation of the species, particularly as climate change shifts suitable habitat. This study used three species distribution modeling algorithms, together with historical whale abundance data (1993–2009) and environmental covariate data, to build monthly ensemble models of past E. glacialis habitat suitability in the Gulf of Maine. The model was projected onto the year 2050 for a range of climate scenarios. Specifically, the distribution of the species was modeled using generalized additive models, boosted regression trees, and artificial neural networks, with environmental covariates that included sea surface temperature, bottom water temperature, bathymetry, a modeled Calanus finmarchicus habitat index, and chlorophyll. Year-2050 projections used downscaled climate anomaly fields from Representative Concentration Pathway 4.5 and 8.5. The relative contribution of each covariate changed seasonally, with an increase in the importance of bottom temperature and C. finmarchicus in the summer, when model performance was highest. A negative correlation was observed between model performance and sea surface temperature contribution. The 2050 projections indicated decreased habitat suitability across the Gulf of Maine in the period from July through October, with the exception of narrow bands along the Scotian Shelf. The results suggest that regions outside of the current areas of conservation focus may become increasingly important habitats for E. glacialis under future climate scenarios.


Introduction
Many species of baleen whales were hunted to near extinction in the last few centuries as a result of commercial whaling (Clapham et al., 1999). Of the Gulf of Maine species, the North Atlantic right whale (Eubalaena glacialis) suffered the greatest population loss due to high commercial demand and is currently one of the most endangered marine mammals on the planet (listed as "Critically Endangered" on the IUCN Red List; Cooke, 2020), with an estimated population size of 356 + 11 and 13 as of 2019 (Pettis et al., 2021). Important E. glacialis habitats overlap with regions of high human activity, including heavily trafficked shipping lanes and fishing grounds (Kraus et al., 2005), which has inhibited postwhaling recovery (Clapham et al., 1999). The anthropogenic threats of vessel strikes and entanglements in fishing gear currently pose high risks to the species (Kraus, 1990;Caswell et al., 1999;Knowlton and Kraus, 2001;Kraus et al., 2005;Kenney, 2018;Sharp et al., 2019). As use of the marine environment changes, other human activities, such as wind energy and aquaculture, could pose additional threats in the future. In 2010, E. glacialis appears to have experienced a regime shift that coincided with increases in deep water temperature and decreases in the biomass of Calanus finmarchicus, a crucial food source (Davis et al., 2017;Record et al., 2019b;Sorochan et al., 2019), suggesting that the species is vulnerable to the effects of climate change. The calving rate has decreased roughly 40% since 2010, raising concern for the future of the species (Kraus et al., 2016).
Despite their comparative rarity today, large whales, including E. glacialis, serve fundamental ecological roles. Prior to commercial whaling, large whales likely recycled more nitrogen than three times the atmospheric input to the ocean. Today, large whales still sustain productivity in ecosystems by bringing deep water nutrients to the surface despite lower population numbers (Roman and McCarthy, 2010). When whales die and sink to the bottom of the ocean, they provide substrate for whale fall ecosystems. Whale falls are unique communities of organisms that have evolved to live off of the resources provided by whale carcasses. A conservative estimate suggested that 50% of all whale mortalities eventually become whale falls, facilitating the recycling of nutrients (Smith and Baco, 2003). Additionally, large whales served as a major carbon storage pool prior to commercial whaling. Due to their large biomass, whales efficiently sequester carbon, and rebuilding populations could be part of a strategy to manage carbon (Pershing et al., 2010).
For much of the year-June through September-E. glacialis heavily rely upon C. finmarchicus, a lipid-rich copepod that is abundant and highly aggregated in the Gulf of Maine in the summer months (Murison and Gaskin, 1989;Beardsley et al., 1996;Baumgartner et al., 2003a;Runge et al., 2014). During the summer growing season, C. finmarchicus store lipids in order to survive overwintering (i.e., diapause; Ji, 2011), providing an energy-rich prey resource for whales. C. finmarchicus mature within months in temperate waters like the Gulf of Maine (Ji, 2011), which has historically provided suitable habitat for this zooplankton species. However, warming waters are likely affecting the suitability of the Gulf of Maine for C. finmarchicus (Plourde et al., 2009;Runge et al., 2014;Record et al., 2019a). The distribution of E. glacialis shifted around the year 2010 (Meyer-Gutbrod and Greene, 2014; Davis et al., 2017), with an increased presence in the mid-Atlantic region, a known migratory corridor, and a decreased presence in the eastern Gulf of Maine, suggesting that E. glacialis habitat suitability may be directly affected by climate change. As populations of copepods, namely C. finmarchicus, shift their distribution, E. glacialis will likely have to forage in new areas to find food . The current uncertainty in the seasonal patterns of E. glacialis feeding grounds and how they are likely to shift as climate change progresses confound efforts to protect the species from anthropogenic threats, including ship strikes and entanglement in fishing gear. In order to conserve E. glacialis, assessing current knowledge of climate-driven regime shifts and the effectiveness of current management strategies is imperative (Record et al., 2019b).
Species distribution models (SDMs) are an important tool for understanding and predicting habitat suitability . They can also help to provide a more complete picture of the distributions of rare or elusive species (Guisan and Zimmermann, 2000;Razgour et al., 2016;e.g., Bosso et al., 2018), such as E. glacialis. SDMs work by deriving relationships between species occurrence data (i.e., presence/absence) and relevant environmental covariates. These relationships can then be projected onto spatial and temporal domains of interest. Understanding the potential present and future distributions of a species is an important component of long-term resource management plans. Many SDM algorithms have been developed, each with strengths and weaknesses . Ensemble modeling allows for the merging of multiple SDMs in order to combine the benefits of individual SDM algorithms and improve predictive performance (Thuiller et al., 2009).
The focus of this study is E. glacialis in the Gulf of Maine (Figure 1). This region has historically been a critical foraging ground for the species. Intensive marine mammal surveys in this region over the past 30þ years (Brown et al., 2007) have provided comprehensive knowledge of the seasonal distribution patterns of E. glacialis. Because the climate has changed rapidly in the Gulf of Maine (e.g., Mills et al. 2013;Pershing et al., 2015), an improved understanding of possible future E. glacialis habitats is essential for effective resource management planning. To address these issues, we built and projected monthly ensemble SDMs of E. glacialis habitat suitability in the Gulf of Maine using an extensive presence/absence data set from multiple marine mammal surveys. We used output from a downscaled physical ocean model for moderate and extreme (i.e., high end of "business as usual") warming scenarios, satellitederived chlorophyll data, and modeled E. glacialis prey as environmental covariates. Because of the uncertainty surrounding the future state of the climate, as well as current knowledge of E. glacialis movement patterns, these projections do not represent forecasts. Instead, they constitute a synthesis of the current state of knowledge of the past distribution of E. glacialis and represent an approach to understanding what some potential future habitat scenarios for this critically endangered species might look like.

Study area
The study extent covered the Gulf of Maine from 39 to 45 N and 63 to 71 W ( Figure 1a). This domain includes the important historically recognized summer foraging grounds of E. glacialis, namely Cape Cod Bay (Mayo et al. 2004), the Great South Channel (CETAP, 1982;Kenney and Wishner, 1995), and the Bay of Fundy and Roseway Basin . The study extent also includes areas known to host large numbers of right whales, but which have not been identified as primary feeding grounds, such as the area between the New York shipping lanes and Cape Cod (Leiter et al., 2017), the deep basins of the Gulf of Maine, and the Scotian Shelf ; Figure 1a).

Species presence/absence data
We obtained marine mammal survey effort and associated sighting conditions from the North Atlantic Right Whale Consortium (NARWC, 2016) Sightings Database. From this data set, we extracted records of E. glacialis presence and absence (survey effort but E. glacialis not present) spanning the years 1993-2009 for our study area (Table S1; Figure 1b). We selected this range of years to overlap with the years represented by our environmental covariate data because of high survey effort and because the data preceded the year during which E. glacialis began to shift their distribution in response to changes in the Gulf of Maine climate (i.e., 2010). Included in the data set are survey information and E. glacialis sightings from numerous programs. Major data contributions to the data set we used (>5% of the dataset) came from shipboard and aerial surveys conducted by the Center for Coastal Studies from 1997 to 2009 (Brown et al., 2007), the New England Aquarium from 1993 to 2009, and the National Marine Fisheries Service (Cole et al., 2007). Our data set included data from line-transect surveys and from platforms of opportunity, as described in Pendleton et al. (2009). We excluded records for which the Beaufort sea state was greater than three and visibility was less than or equal to two nautical miles, both of which are indicators of poor conditions.

Environmental covariate data
Physical environmental covariate data were extracted from the present and future climate simulations of the Bedford Institute of Oceanography North Atlantic Model (BNAM) high resolution ocean circulation model of the North Atlantic Ocean (Brickman et al., 2016), with the Gulf of Maine region being the focus for this special issue. The BNAM simulations use an average of six Coupled Model Intercomparison Project Phase 5 future climate atmospheric forcings to produce Representative Concentration Pathway (RCP) 4.5 and RCP 8.5 climatologies for the periods centered between 2055 and 2075. In general, the model projections for the two RCPs are similar, with the main difference being that the RCP 4.5 scenario projects weaker temperature increases throughout the water column. The BNAM output includes temperature, salinity, and fluid dynamics fields for the entire water column, which have been used in numerous future climate studies (e.g., Beazley et al., 2018;Stanley et al., 2018;Le Corre et al., 2020;Mbaye et al., 2020). Each of these covariates was available across our study domain at a monthly temporal resolution for the climatological and year-2050 projections.
We used monthly chlorophyll data from the Ocean Colour Climate Change Initiative (Sathyendranath et al., 2019), linearly interpolated to match the spatial grid of the BNAM model. Our analysis took a climatological approach, using model averaged output representing the present climate period, centered on 1995. This period is representative of the conditions in the Gulf of Maine prior to the recent rapid climate shift (Pershing et al., 2015). The C. finmarchicus field was calculated using a life history model with sea surface temperature and surface chlorophyll as inputs (Ji, 2011). This model uses the seasonally varying conditions and provides a good index of the suitability of a location for diapause that can be estimated from available environmental data (Record et al., 2018). The bathymetry layer for the study extent was the NOAA Earth Topography relief model, ETOPO1 (Amante and Eakins, 2009), downsampled to match the grid of the other data layers.

Data processing
Presences and absences were extracted from the NARWC Sightings Database. Latitude and longitude values for each record were collated to a raster grid with 0.089 km by 0.058 km spatial resolution. The grid was defined by the native grid of the environmental covariate data described above-specifically that of the BNAM physical ocean model. Presence or absence of E. glacialis was assigned to each pixel with survey effort. A presence was recorded if there was at least one E. glacialis sighting at a given time within the pixel. An absence was recorded if there were no sightings of E. glacialis at a given time within the pixel. Duplicate records in one pixel on the same day were reduced to a presence if at least one record contained a presence and were converted to an absence if there was survey effort in that pixel but no recorded presence. The environmental covariate values at each pixel containing a presence or absence were appended to the sightings and effort data. The newly formatted data were cross-checked with the original environmental covariate and sightings data to ensure correct matching. We tested additional covariates relevant to E. glacialis and for which data were available in preliminary runs and excluded those that appeared to have very low explanatory power; for example, the Lyapunov exponent derived from flow fields, salinity, photosynthetically active radiation, and particulate organic carbon. Correlation coefficients were computed between the covariates chosen to determine which should be included in the model. Despite a high correlation between sea surface temperature and bottom temperature (r ¼ 0.95; Table S2), we decided to include both covariates due to differing biological significance-in particular, the importance of deep water temperatures to C. finmarchicus distributions. For this reason, covariates were ultimately chosen based on biological relevance (Guisan and Zimmermann, 2000) and contribution during preliminary model runs.

Species distribution modeling
Monthly individual and ensemble models were built using the biomod2 package in R (Thuiller et al., 2009; R Development Core Team, 2019). The package contains 10 inbuilt statistical SDMs. The biomod2 package was selected because of its ensemble functionality, which allowed us to create models more robust than individual models. Using a user-specified evaluation metric threshold, biomod2 averages individual models together to create an ensemble. We integrated generalized additive models (GAMs), boosted regression trees (BRTs), and artificial neural networks (ANNs) to create an ensemble. We tuned the GAMs by selecting the "GAM_mgcv" algorithm and setting the number of knots to 4. We tuned the BRTs by setting the number of trees to 1,000, the interaction depth to 9, the shrinkage to 0.01, the bag fraction to 0.5, and the number of cross-validation folds to 10. We tuned the ANNs by setting the number of cross-validation folds to 10 and the maximum number of iterations to 1,000. Model tuning was performed using the BIOMOD_ModelingOptions function in biomod2.
Each algorithm has strengths and weaknesses. Although GAMs are often accurate explanatory models, they tend to have low predictive capabilities (Roloff and Kernohan, 1999;Pearce et al., 2000;Guisan et al., 2002). GAMs serve as good baseline models that tend to capture general trends in the data. Despite potential for overfitting, BRTs learn the data in detail and generally have higher predictive performance than other models Elith et al., 2008). ANNs generally perform better than other models when the environmental covariates are strongly correlated (Li and Wang, 2013), as is the case in this study. Using ensemble models, we were able to leverage the strengths of each algorithm in order to improve model performance and projection capabilities.
For each modeling approach, we built separate models for each month of the year, working under the assumption that the relationship between E. glacialis distribution and environmental covariates is different at different times of the year. For all models, the covariates used were sea surface temperature, bottom water temperature, bathymetry (log depth), the modeled C. finmarchicus habitat index, and chlorophyll (log concentration). First, the data were formatted, so that the sightings data matched the environmental covariate data spatially and temporally. The individual monthly models for each of the three modeling algorithms were built using 10-fold cross-validation with random 70%/30%, training-testing, data splits for each fold. This approach resulted in a total of 30 models per month. Ten-fold cross-validation indicates that 10 models were built for each month and modeling algorithm, each with a different random 70%/30% data split to capture a range of possible model outputs. When extrapolating models across space or time onto non-analog conditions, random data splits can lead to overfitting SDMs (e.g., Elith et al., 2010;Anderson, 2013). To address this issue, we compared climatological fields with those from the year-2050 projections ( Figure S1). We found a high degree of overlap between the present and year-2050 projected conditions; therefore, we used random data splits. Response curves for the three model algorithms were plotted for each cross-validation fold. Evaluations and variable contributions were extracted from the biomod2 model object and plotted using a separate R script. We calculated the pairwise correlation between covariate contributions and the area under the receiver operator characteristic curve (AUC) scores in order to assess the relationship between the contribution of the covariates and model performance.
AUC and the true skill statistic (TSS) were used to compare ensemble and individual models, as both are common methods to evaluate SDMs (Fielding and Bell, 1997;Allouche et al., 2006;Liu et al., 2011). AUC compares the proportion of correctly classified presences to the proportion of incorrectly classified absences between the threshold values of 0 and 1. An AUC above 0.5 indicates better performance than a random model (Fielding and Bell, 1997). AUC is useful when true absence data are available and the goal of the model is to estimate the actual distribution of a species (Jiménez-Valverde, 2012), as is the case in this study. TSS compares the model output to a validation dataset that perfectly predicts the distribution, which is then used to compute the proportion of true positives for both presences and absences. Scores range between -1 and þ1, with a score of above zero indicating better performance than a random model (Allouche et al., 2006). Both AUC and TSS were computed using the inbuilt bio-mod2 function (Thuiller et al., 2009). An ensemble model was created for each month from a subset of the 30 monthly individual models (i.e., one model for each of the 10 cross-validation folds for each of the three algorithms). Models with an AUC greater than or equal to 0.7 were included in the ensemble. TSS was also calculated to verify the patterns seen using the AUC but was not used to determine inclusion and exclusion of the individual models from the ensembles. The ensemble was created by averaging the habitat suitability probabilities produced by the individual models. The ensemble models were projected back onto the environmental covariates to create one habitat suitability projection map for each month.
For the year-2050 projections, the ensembles and each of the individual models were projected onto the future climatological covariates. For the physical covariates, we used the projected climatologies for RCP 4.5 and RCP 8.5. For chlorophyll, we ran each climate scenario with three chlorophyll scenarios to obtain a wide range of results: half of present chlorophyll levels (HC), present chlorophyll levels (SC), and double present chlorophyll levels (DC). This approach gave six projected climate change scenarios for each month, and for each model. For C. finmarchicus, we computed the same index using the 2050 covariates as input under the SC scenario.

Model performance
The ensemble models consistently performed better than the individual models based on both AUC ( Figure 2) and TSS ( Figure S2). Individual models performed similarly to one another, with the evaluation score varying seasonally. BRTs performed worse than ANNs and GAMs in the winter months but performed better than ANNs and the same as GAMs in the fall, though these differences were within the range of variation for each method. The late summer and early fall (July, August, September, and October) had the best performance (ensemble AUC ! 0.9). The late winter, early spring, and November models had the worst performance (individual models: 0.7 AUC 0.8; ensembles: 0.8 AUC 0.9). For most of the results shown below, we focused on the July through October period, where model performance was consistently highest.

Covariate contribution
Covariate contributions varied seasonally, especially in the GAM and BRT models (Figure 3). A spring-bloom-like pattern was evident in April and May for both the GAM and BRT models based on the increased chlorophyll contribution. C. finmarchicus contribution appeared to have a high level of seasonality, especially in the BRTs where C. finmarchicus only contributed in the summer months (Figure 3c). The contribution of C. finmarchicus appeared to vary inversely with the contribution of bathymetry. This seasonality was less apparent in other models, especially ANNs, where the contribution of covariates appeared to vary little across the months (Figure 3a). In the winter, sea surface temperature contributed more in comparison to bottom temperature. In the summer months, when model performance was highest, bottom temperature contribution tended to dominate over sea surface temperature contribution. A correlation between the contribution of two covariates does not indicate that the two covariates are correlated.
Model performance varied with the relative contributions of some of the environmental covariates. There was a significant negative correlation between sea surface temperature and AUC for the three individual models (P < 0.001; Tables 1-3). There was a significant positive correlation between bottom temperature and AUC in the ANN and GAM models (P < 0.05; Tables 1 and 2), as well as between C. finmarchicus and AUC in the ANN and BRT models (P < 0.05; Tables 1 and 3).

Response curves
We chose to present the response curves for the BRT models for illustration ( Figure 4). The response curves for the ANN and GAM models are included in the supplement ( Figures S3 and S4). Model-predicted habitat suitability decreased with increased sea surface temperature and bottom temperature. Greater depths were associated with more suitable habitat than shallower depths. Increased C. finmarchicus and chlorophyll were associated with higher model-predicted habitat suitability. The different cross-validation folds (from each of the 10 model runs) appeared to result in similar response curves, with the exception of some extreme values of covariates (e.g., low sea surface temperature and high bottom temperature; Figure 4). Response curves were similar across the 4 months, with some variation. There was a stronger response to bottom temperature, for example, in the later months, and a weaker response to chlorophyll.

Climatological hindcasts
The ensemble model hindcasts aligned well with the 1993-2009 sightings data for the four months selected. In July, there was high habitat suitability in the central Gulf of Maine, especially off the northwestern edge of George's Bank and in Grand Manan Basin in the Bay of Fundy (Figure 5a). There was also high suitability along the Scotian Shelf and in Roseway Basin, despite a lack of  Asterisks indicate significant at P value < 0.05 (*), < 0.01 (**), or < 0.001 (***).
Art. 9(1) page 6 of 16 Ross et al: Projecting year-2050 North Atlantic right whale distributions sightings in that region. In August, high habitat suitability was concentrated primarily in Grand Manan Basin in the Bay of Fundy and in Roseway Basin, which is consistent with the relatively high number of sightings in these regions ( Figure 5b). In September, model hindcasts were similar to those in August, but with generally higher habitat suitability both in the Bay of Fundy and along the Scotian Shelf (Figure 5c). In October, in addition to the Bay of Fundy, there was a long band of high habitat suitability along much of the coast, but outside of coastal waters (Figure 5d).

Year-2050 projections
The year-2050 projections showed decreased habitat suitability in year-2050 under both the RCP 4.5 and RCP 8.5 scenarios. The overall pattern indicated a decline in habitat suitability in historical foraging areas, such as the Bay of Fundy and Cape Cod Bay, and an increase in habitat suitability in the eastern part of the domain along the Scotian Shelf and in Roseway Basin. Beyond this general pattern, some interesting nuances emerged. In July, the central Gulf of Maine and Roseway Basin became less suitable relative to the hindcasts (Figure 6a). The Bay of Fundy became unsuitable in all of the RCP 8.5 scenarios and in the HC scenario for RCP 4.5, and slightly less suitable in the SC and DC scenarios (Figure 6a). The July projections suggested a large decrease in habitat suitability in the central Gulf of Maine, specifically off of the northwestern edge of George's Bank (Figure 6a). A decrease in the Bay of Fundy, along the Scotian Shelf, and in Roseway Basin was observed, though these decreases were not as pronounced.
In August, the entire Gulf of Maine domain appeared unsuitable under all six scenarios, with the exception of slight suitability in Roseway Basin (Figure 6b). This lack of suitability was apparent, particularly in the RCP 4.5 and RCP 8.5 DC scenarios.
In September, there appeared to be a northeastward shift in habitat suitability, with the most suitable habitat located along the Scotian Shelf in all six scenarios ( Figure  6c). Habitat suitability decreased in the western Gulf of Maine, with the exception of a slight increase off the northwestern edge of George's Bank in both of the HC scenarios, relative to the hindcasts (Figure 6c). There was higher suitability relative to the hindcasts in the central Gulf of Maine in the SC and DC scenarios for RCP 4.5 and RCP 8.5 (Figure 6c).
In October, there was decreased habitat suitability in the western Gulf of Maine under all six scenarios ( Figure  6d). In the HC and DC scenarios for both RCP 4.5 and RCP 8.5, habitat suitability increased in Roseway Basin and along the Scotian Shelf (Figure 6d). The HC scenario under both RCP 4.5 and RCP 8.5 predicted more widespread suitability relative to the other scenarios.

Discussion
Managing right whales is a complex problem involving many human uses of the marine environment, such as energy development, ecotourism, and commercial shipping and fishing. Because of the dynamic and spatially varying objectives of the many stakeholders, there would be great benefit to understanding the changing distribution of Eubalaena glacialis (Brown et al., 2007;Pendleton et al., 2009;Pershing et al., 2009;Pendleton et al., 2012;Monsarrat et al., 2015;Gomez et al., 2017;Record et al., 2019b). Mortality rates for E. glacialis have increased markedly over the past few years as a byproduct of changing foraging patterns (Meyer-Gutbrod and Greene, 2014;Davies and Brillant, 2019;Record et al., 2019b). So far, the management response has been reactive to these changes, making adjustments only after spatiotemporal foraging patterns have changed and often after mortalities have occurred. Using forward-looking projections of changes in habitat suitability could help inform a more proactive approach to reducing humancaused mortality. To begin with, projections could aid in choosing which regions may warrant increased survey effort. There are regions in both the hindcasts and the year-2050 projections that show high habitat suitability but which have not been the focus of marine mammal survey programs. Projections can guide where to search for new aggregation areas before large mortality events occur. Taking the long view of changing species distributions also gives us a reference point for planning emerging  uses of the marine environment, such as the responsible development of aquaculture and wind energy resources. The ensemble models developed here had high performance metrics when evaluated using past E. glacialis data (1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009) and past climatologically averaged environmental covariates. Combining three individual models into ensembles allowed us to optimize model performance and hindcasting and projection capabilities. The ensemble models performed better than the individual models in part due to the selection threshold (AUC ! 0.7) for inclusion in the ensemble, which increased the AUC of the ensembles. Hindcasts indicated a high level of habitat suitability in areas and times with many sightings, such as Grand Manan Basin in the Bay of Fundy. In the past, the Bay of Fundy has been a crucial feeding ground for E. glacialis in July and August (Murison and Gaskin, 1989;Baumgartner et al., 2003b;Baumgartner and Mate, 2005;Brown et al., 2009;. In some cases, the hindcasts highlighted regions of limited historical survey effort. In October-a month of high model performance-the hindcasts indicated that Roseway Basin and regions along the Scotian Shelf are suitable E. glacialis Due to monitoring challenges, modeling approaches, and the dependence of the system on human behavior (e.g., emissions), projections of future whale distributions have substantial uncertainty. Using an ensemble approach and by projecting the models onto multiple future climate and chlorophyll scenarios, we attempted to explore this uncertainty space as fully as possible. Even with this widened view, there was some degree of consistency between the projected distributions. Generally, the model results showed a decrease in habitat suitability throughout the Gulf of Maine under most projections. In cases where suitable habitat remained, projections showed a northeastward shift in habitat suitability. This shift is consistent with other evidence that E. glacialis is susceptible to the effects of climate change through prey-driven changes (e.g., Meyer-Gutbrod and Greene, 2014; Davis et al., 2017;Record et al., 2019b).
One reason to undertake a modeling study like this one is to shed light on which aspects of the problem we understand fairly well and which aspects have knowledge gaps. The difference in model performance at different times of year is instructive in this regard. The individual models and the ensembles performed best in July, August, September, and October relative to the other months, with ensemble AUCs ranging from 0.92 to 0.95, demonstrating very strong performance (Pearce and Ferrier, 2000). These are among the months during which the Gulf of Maine has historically been an important foraging ground for E. glacialis (Murison and Gaskin, 1989;Kenney and Wishner, 1995;Brown et al., 2009;Pendleton et al., 2009;Patrician and Kenney, 2010). In particular, these are months when they are targeting the densely aggregated, lipid-rich stages of C. finmarchicus (Baumgartner et al., 2003b; Davies et al. 2014), suggesting that models based on the currently available covariates are able to capture this particular aspect of E. glacialis phenology accurately. The ensemble models performed the worst between November and June. The December performance was elevated somewhat, but this may have been due to the low number of sightings in December falsely inflating the AUC score (Table S1). These months also had the largest discrepancy between individual model evaluations and the ensemble evaluations. Some of the winter and spring months are historically important foraging times, particularly in Cape Cod Bay. During these months, other copepod species (i.e., Pseudocalanus spp. and Centropages spp.) can be more important than C. finmarchicus for periods of time (Mayo and Marx, 1990;Pendleton et al. 2009;Stamieszkin et al. 2010). Including these species as prey data layers could improve model performance. The ability to project these other copepod species onto year-2050 conditions would likely improve our projections of E. glacialis habitat suitability in the winter and spring months.
The environmental covariates each contributed in different ratios to the individual models. There was high seasonal variability, and during months when model performance was highest, bottom temperature and C.
finmarchicus had higher contributions. The ANN model did not capture seasonality, unlike the other two algorithms (Figure 3a), which may be due to the way that ANNs balance correlated covariates; they are adaptable and able to find a smooth fit for many correlated variables (Li and Wang, 2013). An increase in the contribution of chlorophyll in April and May in both the GAM and BRT models corresponds to the timing of the spring bloom; chlorophyll could be acting as a proxy in the models for copepod production during non-diapause months (Record et al., 2019a). The C. finmarchicus covariate used a model that focuses on diapause as an index of the lipid-rich prey resource. The seasonality of this covariate contribution reflects this choice: For example, this covariate only contributed in and adjacent to the summer months in the BRT models ( Figure 3c). This seasonality is consistent with the time period during which C. finmarchicus begins to enter diapause (Ji, 2011;Ji et al., 2017) and when E. glacialis feed upon this prey species in the Gulf of Maine .
The relationship between covariate contribution and model performance can be instructive as to the different roles of the covariates. A significant negative correlation between sea surface temperature and AUC for the three individual models suggests that a high reliance on sea surface temperature decreases model performance (Tables 1-3). On the contrary, bottom temperature was generally positively correlated with model performance.
Although sea surface temperature is among the most commonly used covariates in models of marine species (e.g., Nye et al., 2009;Pershing et al., 2009;Palialexis et al., 2011;Pendleton et al., 2012;Becker et al., 2018;Abrahms et al., 2019), these results illustrate a potential pitfall to ignoring bottom temperature, or other subsurface temperature measurements, when building SDMs in marine environments. The significant positive correlation between the C. finmarchicus habitat index and model performance also underscores the importance of including prey fields in E. glacialis SDMs. Relying on variables measured at the surface, like sea surface temperature and surface chlorophyll, is likely to miss important dynamics.
The year-2050 projections, although not forecasts per se, illustrate some important points. The lack of habitat suitability in the Bay of Fundy and the general northeastward shift in suitability suggests that E. glacialis is unlikely to return to its historical foraging patterns, potentially as a result of prey resources shifting northward into colder waters due to climate change (e.g., Record et al., 2019b;Sorochan et al., 2019). The September and October projections show an increase in habitat suitability in Roseway Basin and along the Scotian Shelf, indicating a shift to a narrower region of high suitability in comparison to the hindcasts. Although suitability decreased in the rest of the Gulf of Maine, this increase along Roseway Basin and the Scotian Shelf suggests the possibility of suitable habitat for E. glacialis just outside of the Gulf of Maine in the year 2050. Among the different climate scenarios tested, the greatest change in E. glacialis distribution patterns was generally associated with the higher emissions scenarios (RCP 8.5) and with the lowest chlorophyll scenarios (HC).
This study sets an initial benchmark for projecting future right whale distributions and leads to some logical next steps to improve the projections. If, as it appears, good model performance is related to the prey field input, then including Centropages spp. and Pseudocalanus spp. covariates would improve model performance for winter and spring months; including other Calanus species would allow for extending the model domain further into the Canadian waters that are becoming increasingly important. Strategies for integrating multiple types of right whale observation data (e.g., acoustic detections, whale-watch tours) into the model would broaden its applicability. The model framework could also be tuned to account for interannual variability as a way to make more near-term projections and to understand the higher resolution responses to environmental change. Some biological inconsistencies in the models, like the lack of seasonality in the covariate contributions of the ANNs, suggest that further intercomparison of different modeling techniques would be beneficial. Finally, we would benefit from a better understanding of the mechanisms underlying relevant processes, such as the biophysical coupling that aggregates prey into dense patches and whale movement behaviors beyond direct foraging. No model captures every process, but a carefully evaluated modeling exercise can synthesize what we do know about changing distributions and can also highlight knowledge gaps and data deficiencies.
Whales have many important ecological roles in the oceans (e.g., Roman and McCarthy, 2010;Pershing et al., 2010), and right whales have influence over human activities as a result of their status as critically endangered; maintaining their populations will likely mean adapting to long-term change. SDMs are inherently uncertain (Pearson et al., 2006), and that uncertainty should be kept in mind. For example, an important caveat is that these projections represent climatological 30-year means, and in any given year, historical foraging grounds could still be important. Still, projections can give us insights into possible future scenarios, so that we can be proactive when it comes to adapting to change. Evidence of past regime shifts, and the mechanisms suggested (e.g., Meyer-Gutbrod and Greene, 2014;Record et al., 2019b;Sorochan et al., 2019), indicate that future shifts in E. glacialis distribution are a strong possibility. A better understanding of our ability to predict future changes, and their associated uncertainty, will increase the likelihood of successful management and conservation plans.
Data accessibility statement E. glacialis sightings data were obtained from the North Atlantic Right Whale Consortium Sightings Database (www.narwc.org) and are available from the NARWC upon request. The physical oceanographic covariates from the downscaled BNAM climate model are archived and maintained within Fisheries and Oceans Canada. Requests for model output go through official channels via DB. The chlorophyll data are from the Ocean Colour Climate Change Initiative (https://www.oceancolour.org). The bathymetry data are from the NOAA ETOPO1 relief model (https://www.ngdc.noaa.gov/mgg/global/). Modeling code can be found at https://github.com/BigelowLab/ RIWH_2050.

Supplemental files
The supplemental files for this article can be found as follows: Table S1. Number of presences and absences used in monthly models of E. glacialis habitat suitability. Table S2. Correlation coefficients between environmental covariates. Figure S1. Density plots of present and year-2050 environmental covariates. Figure S2. Evaluation score for each model algorithm for each month using TSS. Figure S3. Response curves for the 10 cross-validation runs of the ANN models. Figure S4. Response curves for the 10 cross-validation runs of the GAM models.
Right Whale Consortium for curation and dissemination of marine mammal survey data. The collection of data utilized in this study was undertaken by numerous organizations and agencies whose field teams spent hundreds of hours in the air and at sea over the years. We recognize that the present study could not have been undertaken without those dedicated observers, their pilots, captains, support staff, and collaborators. We thank Philip Nyhus, Loren McClenachan, Michelle Staudinger, Robert Stephenson, Jody Deming, and an anonymous reviewer for providing extensive feedback and Manuel Gimond for providing technical help.

Funding
Funding for this project was provided by the Colby College Environmental Studies Department, the Colby College Provost Fund, the Colby College Presidential Scholars Opportunity Grant Fund, and Bigelow Laboratory for Ocean Sciences institutional funds. Support for NRR was provided by NASA grants NNX14AM77G and NNX16AG59G. Support for CAM was provided by NOAA Fisheries and the Massachusetts Division of Marine Fisheries.

Competing interests
The authors have no competing interests to declare.