The occurrence of a butterfly species (parnassius) is determined by climate and sedum presence

Abstract

Climate change is expected to affect some species more than others. Alpine species are particularly susceptible because changes in climate shift distributions to higher altitudes that can result in extirpation on certain mountains. Additionally, species that are reliant on interactions with other species can be at increased risk because mismatches in their distributions over time. Here, we compare the project changes in distribution of Rocky Mountain Apollo butterfly (Parnassius smintheus) and the host-plant they are reliant on. We hypothesize that climate change will negatively impact P. smintheus more relative to other alpine butterfly species because parnassius is dependent on a host-plant that has a different climate niche and thus decreasing overlap in area with climate change. We used maximum entropy (MaxEnt) to model the projected occurrence of P. smintheus, its host-plant, and four other butterfly species with climate change. We modelled the future distributions using four major global circulation models, each of the four representative concentration pathways of climate change, and two time frames (2050, 2070). The variables we used in the models included altitude, soil characteristics, and climate variables from WorldClim.

Species distribution modelling

All species distribution modelling was conducting using MaxEnt on a Linux server from Compute Canada. Two separate scripts were used: 1) models for parnassius with sedum pasmModels, and 2) models for all other species AllSppModels. In addition, we provide scripts that combine and crop all the climate rasters across the different RCPs & GCMs combineClimateRas.r. Scripts that were not run on the server include generating the resistance matrices resistanceMatrix.r, determining the relevant soil variables soildata.r, converting a digitial elevation model to a standard deviation at the 1 km resolution altprocessing.r, and creating a polygon for the study area studyArea.r. Climate and elevation rasters are not included in this repository because the rasters are too large for support on GitHub. The climate rasters were downloaded from WorldClim

None of this work would have been possible if not for those that make GBIF possible through data collection, cleaning, compiling, and publishing. The datasets used in this study can be found below:

Insect: GBIF.org (08 November 2018) GBIF Occurrence Download https://doi.org/10.15468/dl.sunrzb Plant: GBIF.org (08 November 2018) GBIF Occurrence Download https://doi.org/10.15468/dl.cwjtau

## load packages 
library(raster)
library(dismo)
library(ClusterR)

library(rgdal)
#library(rJava)
# library(maptools)
library(rgeos)
library(MASS)
library(bestglm)
library(tidyverse)

## Load study area
studyarea <- readOGR("data//studyOutline.shp")

## OGR data source with driver: ESRI Shapefile 
## Source: "D:\RStudio\CCTrophicInteractions\data\studyOutline.shp", layer: "studyOutline"
## with 25 features
## It has 10 fields

download sedum and parnassius data

# spp.name <- data.frame(spp=c("sela","pasm"),genus=c("Sedum","Parnassius"),species=c("lanceolatum","smintheus"))
# 
# ## Collected bias species from http://friendsofkootenay.ca/
# ## bias.spp <- read.csv("data//biasdata.csv", stringsAsFactors=FALSE)
# ## Collected from butterflies of North America https://www.butterfliesofamerica.com/list.htm
# bias.spp <- read.csv("data//biasdata2.csv", stringsAsFactors=FALSE)
# attach(bias.spp)
# 
# for(i in 1:length(Genus))
# {
# tryCatch({
# temp <- gbif(Genus[i],Species[i], geo=T, ext=extent(r0), ntries=10) ## extract species name from GBIF server
# if(is.null(temp)) stop(paste("Species failed: ",Genus[i],Species[i]))
# temp<-na.omit(temp[,c("lon","lat")]) ## remove all columns except lat-long
# temp <- temp[!duplicated(temp), ] ## duplicated occurences
# temp["species.name"]<- rep(paste(Genus[i]," ",Species[i]),nrow(temp)) ##generate column for species name
# ##write csv with species occurences
# write.csv(temp, paste0("data//bias//",Genus[i],".",Species[i],".csv"),row.names=F)
# }, 
# error=function(e){cat("ERROR :",conditionMessage(e), "\n")})
# }
# 
# ## List species files and join them
# occurdat<-list.files("data//bias//",pattern=".csv$",full=T)
# 
# butterflies <- data.frame()
# for(i in 1:length(occurdat)){
# temp<-read.csv(occurdat[i])
# butterflies<-rbind(butterflies,temp)
# }
# 
# write.csv(butterflies, "data//butterfly.occ.csv", row.names = FALSE)

Plot areas of test species

## Load species
sedm <- read.csv("data//sedumRed.csv")
pasm <- read.csv("data//parnassius.clean.csv")


## get lat long from each dataset
bothspp <- rbind(sedm[,c("lon","lat")], pasm[,c("lon","lat")])
bothspp[,"species"] <- c(rep("Sedum sp.",nrow(sedm)),rep("P. smintheus", nrow(pasm)))

## crop to study area
bothspp <- subset(bothspp, lon < -101 & lon > -148.7)  ## adjust lon
bothspp <- subset(bothspp, lat < 69 & lat > 32) ## adjust at

## Create convex polygons of study area
find_hull <- function(df) df[chull(df$lon, df$lat), ]
hulls <- plyr::ddply(bothspp, "species", find_hull)

## Colour blind palette
cbPalette <- c("#56B4E9","#E69F00")


## Plot distributions
ggplot(bothspp, aes(x=lon, y=lat, color=species, fill=species)) + geom_polygon(data = hulls, alpha = 0.5) + theme(text = element_text(size=12), axis.text.x = element_text(angle=35, hjust=1), panel.grid.major = element_blank(), panel.grid.minor = element_blank()) + xlab("Longitude") + ylab("Latitude") + geom_point(size=0.8) + scale_fill_manual(values = cbPalette) + scale_color_manual(values = cbPalette)

Generate raster bias file

# Get a list of species on GBIF for Canada
Insect: GBIF.org (08 November 2018) GBIF Occurrence Download https://doi.org/10.15468/dl.sunrzb
Plant: GBIF.org (08 November 2018) GBIF Occurrence Download https://doi.org/10.15468/dl.cwjtau
Lists were refined to Canada, has coordinates, Insecta/Plantae, &  observation/living speciman/museum, and without geospatial issues

plants <- read.csv("data//plantsbias.csv")
insect <- read.csv("data//insectbias.csv")

biasspp <- rbind(plants,insect[-2])

butterflies <- read.csv("data//butterfly.occ.csv")

## create empty raster
empty.ras <- raster(nrows=dim(r0)[1],ncols=dim(r0)[1], crs=crs(r0), ext=extent(r0))

coordinates(butterflies) <- ~ lon +lat ## convert species into sp.df
butterflies <- crop(butterflies, empty.ras) ## remove occurrences outside of study area

# ## plot occurrences
occur.ras<-rasterize(butterflies@coords, empty.ras, 1)
plot(occur.ras)

##density function to create bias raster
presences<-which(values(occur.ras)==1)
pres.locs<-coordinates(occur.ras)[presences,]

dens<-kde2d(pres.locs[,1],pres.locs[,2],n=c(nrow(occur.ras),ncol(occur.ras)))
dens.ras<-raster(dens)
plot(dens.ras)

## write bias file raster
writeRaster(dens.ras, "data//biasfile.bil", overwrite=T)

Test relevant climate variables

r1 <- raster("data\\rasters\\alt.bil") #altitude
r2 <- raster("data\\rasters\\bio_1.bil") # annual temp
r3 <- raster("data\\rasters\\bio_12.bil") # annual precip
r4 <- raster("data\\rasters\\bio_6.bil") # min temperature in coldest month
r5 <- raster("data\\rasters\\bio_5.bil") # max temp of warmest month (July temp)
r6 <- raster("data\\rasters\\snowpack\\snowpack10.bil") ## snow in October for higher latitudes
r7 <- raster("data\\rasters\\snowpack\\snowpack12.bil") ## snow in December for majority of spp range
r8 <- raster("data\\rasters\\tmin\\tmin_10.bil") # average min temperature in October for higher latitudes
r9 <- raster("data\\rasters\\tmin\\tmin_11.bil") # average min temperature in November for majority of spp range
r10 <- raster("data\\rasters\\tmax\\tmax_10.bil") # average max temperature in October for higher latitudes
r11 <- raster("data\\rasters\\tmax\\tmax_11.bil") # average max temperature in November for majority of spp range

## soil rasters as suggests by Jens as proxies for bedrock
topsoiloc <- raster("data//soildata//topsoiloc.bil") ## organic carbon in soil
topsoilph <- raster("data//soildata//topsoilph.bil")  ## ph in water of soil
topsoilcec <- raster("data//soildata//topsoilcec.bil")  ## cation exchange - EC of soil
topsoilsilt <- raster("data//soildata//topsoilsilt.bil")  ## silt in topsoil
topsoilsand <- raster("data//soildata//topsoilsand.bil")  ## sand in topsoil

soilstack <- stack(topsoiloc,topsoilph, topsoilcec, topsoilsilt, topsoilsand)
soilstack <- crop(soilstack, west.ext)

## load bias file
biasfile <- raster("data//biasfile.bil")

# climate.all <- raster()
# for(i in 1:length(occurdat)){
#   temp.ras <- raster(occurdat[i])
#   climate.all <- stack(climate.all, temp.ras)
# }

climate.all <- stack(r1,r3,r4,r10)

## crop to western Canada
climate.all <- crop(climate.all, extent(biasfile))

## Read in Parnassius occurrence
pasm.gps<- read.csv("data//sedumall.csv")

## convert facilitated occurrence into spatial points
pasm.gps <- pasm.gps[,1:2]
crs.world <-CRS("+proj=longlat +datum=WGS84")
coordinates(pasm.gps) <- ~lon+lat
proj4string(pasm.gps) <- crs.world

##crop out points to western Canada
pasm.gps <- crop(pasm.gps, extent(biasfile))

## extract climate values for parnassius
pasm.climate <- extract(soilstack, pasm.gps)

## extract climate values for background
backgr <- randomPoints(soilstack, 5000)
backgr.climate <- extract(soilstack, backgr)

## join two datasets and add presence absence column
climate.data <- rbind(data.frame(pasm.climate),data.frame(backgr.climate))
climate.data[,"y"] <- as.numeric( c(rep("1",nrow(pasm.climate)),rep("0",nrow(backgr.climate))))

# ## run best subsets
m1 <- bestglm(climate.data, family=binomial, IC="AIC")

## see best models and output
m1$BestModels
summary(m1)

## Dropped snowpack in November because not relevant in best subsets
## dropped mininum temperatures because not significant contributors in Maxent (i.e. <3 %)

Create the niche space of target species

## list target species
spp.list <- c("data//bias//Erebia.epipsodea.csv","data//bias//Pontia.occidentalis.csv","data//bias//Chlosyne.palla.csv","data//bias//Pieris.rapae.csv","data//parnassius.clean.csv","data//sedumRed.csv")

## load points
extractClim <- function(x){

gps<- read.csv(x)

## convert facilitated occurrence into spatial points
gps <- gps[,1:2]
crs.world <-CRS("+proj=longlat +datum=WGS84")
coordinates(gps) <- ~lon+lat
proj4string(gps) <- crs.world

##crop out points to western Canada
gps <- crop(gps, studyarea)

## extract climate values for species
gps.climate <- raster::extract(allstack, gps) ## all variables
## output climate variable
return(data.frame(gps.climate))
}


## join all the species into one dataframe
climSpecies <- data.frame()
  for(i in 1:length(spp.list)){
  temp <- extractClim(spp.list[i])
  spp.name <- basename(spp.list[i])
  spp.name <- gsub(".csv","", spp.name)
  temp[,"species"] <- spp.name
  climSpecies <- rbind(climSpecies, temp)
  }

library(vegan)

## remove NAs
climSpecies <- na.omit(climSpecies)
## Remove outlier soil OC
climSpecies <- subset(climSpecies, topsoiloc < 12)
##drop sedum
climSpecies <- subset(climSpecies, species!="sedumRed")

## DCA with ordination data
ordClim <- climSpecies[1:10][rowSums(climSpecies[1:10]) != 0 ,]
# ordClim <- decostand(climSpecies2[1:10], method="standardize")
ordClim <- climSpecies[1:10]

## conduct a DCA
dca1 <- decorana(ordClim)
summary(dca1) ## <2.5 axis lengths = linear

cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7","#000000")

## standardize
ordClim <- decostand(ordClim, method="standardize")

## Run PCA
pca1 <- rda(ordClim, scale=T, center=T) ## run as correlation matrix and centralize
par(mar = c(4.5, 4.5, 0.5, 0.5)) 
plot(pca1, type="n", xlab="PCA1 (52.4%)", ylab=c("PCA2 (15.2%)"), cex.lab=1.3,
     xlim=c(-2,2), ylim=c(-1,1.5))
ordihull(scores(pca1, choices=c(1,2), display="sites"),  group = climSpecies$species, draw="polygon", col=c("#999999", "#E69F00", "#56B4E9", "#009E73","#F0E442"), alpha=150)
orditorp(scores(pca1, choices=c(1,2), display="species")/2.5, display="species", cex=1)

unique(climSpecies2$species)

## compare differences among species
library(lmPerm)

## Axis 1
pcaOut <- scores(pca1, choices=c(1), display="sites")
m1 <- aov(pcaOut[,1] ~ climSpecies$species)
m1.pm <- aovp(pcaOut[,1] ~ climSpecies$species)
summary(m1.pm)
TukeyHSD(m1)

## Axis 2
pcaOut <- scores(pca1, choices=c(2), display="sites")
m2 <- aov(pcaOut[,1] ~ climSpecies$species)
m2.pm <- aovp(pcaOut[,1] ~ climSpecies$species)
summary(m2.pm)
TukeyHSD(m2)

### extract range of variables from species
library(tidyverse)

means <- climSpecies2 %>% group_by(species) %>% summarize_all(funs(mean)) %>% data.frame(.)
mins <- climSpecies2 %>% group_by(species) %>% summarize_all(funs(min)) %>% data.frame(.)
maxs <- climSpecies2 %>% group_by(species) %>% summarize_all(funs(max)) %>% data.frame(.)

summarized <- rbind(means,mins,maxs)
summarized[,"stat"] <- c(rep("mean",nrow(means)),rep("min",nrow(mins)),rep("max",nrow(maxs)))

## trends in climate patterns among species
summarized

## rename species
climSpecies <- merge(climSpecies, data.frame(species=unique(climSpecies$species), Species=c("E. epipsodea", "P. occidentalis","C. palla","P. rapae", "P. smintheus")))

### generate histograms for species
library(tidyverse)
##Altitude
ggplot(climSpecies, aes(x=alt)) + geom_density(aes(fill=Species), alpha=0.3) + xlab("altitude (m)") + theme(text = element_text(size=14))
## Alt diff
ggplot(climSpecies, aes(x=elevationdiff)) + geom_density(aes(fill=Species), alpha=0.3) + xlab("variation in elevation (SD)") + theme(text = element_text(size=14))
## Tmax11
ggplot(climSpecies, aes(x=tmax11)) + geom_density(aes(fill=Species), alpha=0.3)+ xlab("November tmax (°C)") + theme(text = element_text(size=14))
## bio3
ggplot(climSpecies, aes(x=bio3)) + geom_density(aes(fill=species), alpha=0.3)+ xlab("isothermality (SD)") + theme(text = element_text(size=14))
## bio19
ggplot(climSpecies, aes(x=bio19)) + geom_density(aes(fill=Species), alpha=0.3)+ xlab("precipitation in coldest quarter (mm)") + theme(text = element_text(size=14))
## bio6
ggplot(climSpecies, aes(x=bio6)) + geom_density(aes(fill=Species), alpha=0.3)+ xlab("coldest temp in coldest month") + theme(text = element_text(size=14))
## bio3
## prec7
ggplot(climSpecies, aes(x=prec07)) + geom_density(aes(fill=Species), alpha=0.3)+ xlab("July precipitation (mm)") + theme(text = element_text(size=14))
## topsoiloc
ggplot(climSpecies, aes(x=topsoiloc)) + geom_density(aes(fill=species), alpha=0.3) + xlim(0,12)+ xlab("soil organic carbon (%)") + theme(text = element_text(size=14))
## topsoilsilt
ggplot(climSpecies, aes(x=topsoilsilt)) + geom_density(aes(fill=Species), alpha=0.3) + xlab("silt in soil (%)") + theme(text = element_text(size=14))

Analyze differences in models

otherspp <- read.csv("othersppOut.csv")
pasmSedum <- read.csv("pasmOut.csv")

pasmClimate <- subset(otherspp, species == "parnassius.clean")

diffModels <- data.frame(modeltype=c(rep("climateOnly",nrow(pasmClimate)),rep("sedumClimate",nrow(pasmSedum))), lost=c(pasmClimate$areaLoss,pasmSedum$areaLoss),gain=c(pasmClimate$areaGain,pasmSedum$areaGain))

## loss
m1 <- glm(lost ~ modeltype, data=diffModels, family="quasibinomial")
summary(m1)

## 
## Call:
## glm(formula = lost ~ modeltype, family = "quasibinomial", data = diffModels)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.15666  -0.06490  -0.01407   0.06189   0.19677  
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -3.63623    0.04663 -77.988  < 2e-16 ***
## modeltypesedumClimate  0.41478    0.06040   6.867 5.65e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for quasibinomial family taken to be 0.00652604)
## 
##     Null deviance: 1.8646  on 239  degrees of freedom
## Residual deviance: 1.5508  on 238  degrees of freedom
## AIC: NA
## 
## Number of Fisher Scoring iterations: 6

anova(m1, test="Chisq")

## Analysis of Deviance Table
## 
## Model: quasibinomial, link: logit
## 
## Response: lost
## 
## Terms added sequentially (first to last)
## 
## 
##           Df Deviance Resid. Df Resid. Dev Pr(>Chi)    
## NULL                        239     1.8646             
## modeltype  1  0.31381       238     1.5508 4.08e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## pairwise comparison
library(emmeans)

## Welcome to emmeans.
## NOTE -- Important change from versions <= 1.41:
##     Indicator predictors are now treated as 2-level factors by default.
##     To revert to old behavior, use emm_options(cov.keep = character(0))

emm1 <- emmeans(m1, specs = pairwise ~ modeltype)
emm1

## $emmeans
##  modeltype    emmean     SE  df asymp.LCL asymp.UCL
##  climateOnly   -3.64 0.0466 Inf     -3.73     -3.54
##  sedumClimate  -3.22 0.0384 Inf     -3.30     -3.15
## 
## Results are given on the logit (not the response) scale. 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast                   estimate     SE  df z.ratio p.value
##  climateOnly - sedumClimate   -0.415 0.0604 Inf -6.867  <.0001 
## 
## Results are given on the log odds ratio (not the response) scale.

aggregate(lost ~ modeltype, data=diffModels,FUN=mean) ## values

##      modeltype       lost
## 1  climateOnly 0.02567501
## 2 sedumClimate 0.03836652

## gain
m2 <- glm(gain ~ modeltype, data=diffModels, family="quasibinomial")
summary(m2)

## 
## Call:
## glm(formula = gain ~ modeltype, family = "quasibinomial", data = diffModels)
## 
## Deviance Residuals: 
##       Min         1Q     Median         3Q        Max  
## -0.176901  -0.057754   0.000073   0.050952   0.150563  
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -2.75878    0.02883 -95.704  < 2e-16 ***
## modeltypesedumClimate -0.22549    0.04298  -5.247 3.43e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for quasibinomial family taken to be 0.005588141)
## 
##     Null deviance: 1.5139  on 239  degrees of freedom
## Residual deviance: 1.3592  on 238  degrees of freedom
## AIC: NA
## 
## Number of Fisher Scoring iterations: 6

anova(m2, test="Chisq")

## Analysis of Deviance Table
## 
## Model: quasibinomial, link: logit
## 
## Response: gain
## 
## Terms added sequentially (first to last)
## 
## 
##           Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
## NULL                        239     1.5139              
## modeltype  1  0.15465       238     1.3592 1.436e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## pairwise comparison
emm2 <- emmeans(m2, specs = pairwise ~ modeltype)
emm2

## $emmeans
##  modeltype    emmean     SE  df asymp.LCL asymp.UCL
##  climateOnly   -2.76 0.0288 Inf     -2.82     -2.70
##  sedumClimate  -2.98 0.0319 Inf     -3.05     -2.92
## 
## Results are given on the logit (not the response) scale. 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast                   estimate    SE  df z.ratio p.value
##  climateOnly - sedumClimate    0.225 0.043 Inf 5.247   <.0001 
## 
## Results are given on the log odds ratio (not the response) scale.

aggregate(gain ~ modeltype, data=diffModels,FUN=mean) ## values

##      modeltype       gain
## 1  climateOnly 0.05959259
## 2 sedumClimate 0.04814152

#### Compare among species

## revise pasm-sedum dataset
pasmSedum[,"species"] <- "parnassius-sedum"
names(pasmSedum)[2:3] <- c("cor.model","auc.model") ## rename columns
otherspp[,c("cor.current","cor.future")] <- NA ## add empty columns for sedum correlation
allspp <- rbind(pasmSedum, otherspp)

## proportion of range change
allspp[,"loss"] <- allspp[,"areaLoss"]/allspp[,"oldRange"] ## proportion of oldrange
allspp[,"gain"] <- allspp[,"areaGain"]/allspp[,"oldRange"]

## Gain
m3 <- glm(gain ~ species, data=allspp, family="binomial")

## Warning in eval(family$initialize): non-integer #successes in a binomial glm!

summary(m3)

## 
## Call:
## glm(formula = gain ~ species, family = "binomial", data = allspp)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.34255  -0.11431  -0.00081   0.09609   0.77780  
## 
## Coefficients:
##                            Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                 -1.3113     0.2232  -5.874 4.26e-09 ***
## speciesErebia.epipsodea     -0.7192     0.3621  -1.986 0.046988 *  
## speciesparnassius-sedum     -0.7026     0.3606  -1.948 0.051389 .  
## speciesparnassius.clean     -0.5486     0.3483  -1.575 0.115267    
## speciesPieris.rapae         -0.5457     0.3481  -1.568 0.116958    
## speciesPontia.occidentalis  -1.1893     0.4108  -2.895 0.003792 ** 
## speciessedumRed             -1.7815     0.5005  -3.559 0.000372 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 38.915  on 839  degrees of freedom
## Residual deviance: 19.215  on 833  degrees of freedom
## AIC: 230.05
## 
## Number of Fisher Scoring iterations: 6

anova(m3, test="Chisq")

## Analysis of Deviance Table
## 
## Model: binomial, link: logit
## 
## Response: gain
## 
## Terms added sequentially (first to last)
## 
## 
##         Df Deviance Resid. Df Resid. Dev Pr(>Chi)   
## NULL                      839     38.915            
## species  6     19.7       833     19.215 0.003131 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## pairwise comparison
emm3 <- emmeans(m3, specs = pairwise ~ species)
emm3

## $emmeans
##  species             emmean    SE  df asymp.LCL asymp.UCL
##  Chlosyne.palla       -1.31 0.223 Inf     -1.75    -0.874
##  Erebia.epipsodea     -2.03 0.285 Inf     -2.59    -1.472
##  parnassius-sedum     -2.01 0.283 Inf     -2.57    -1.459
##  parnassius.clean     -1.86 0.267 Inf     -2.38    -1.336
##  Pieris.rapae         -1.86 0.267 Inf     -2.38    -1.334
##  Pontia.occidentalis  -2.50 0.345 Inf     -3.18    -1.825
##  sedumRed             -3.09 0.448 Inf     -3.97    -2.215
## 
## Results are given on the logit (not the response) scale. 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast                               estimate    SE  df z.ratio p.value
##  Chlosyne.palla - Erebia.epipsodea       0.71918 0.362 Inf  1.986  0.4233 
##  Chlosyne.palla - parnassius-sedum       0.70259 0.361 Inf  1.948  0.4482 
##  Chlosyne.palla - parnassius.clean       0.54859 0.348 Inf  1.575  0.6985 
##  Chlosyne.palla - Pieris.rapae           0.54572 0.348 Inf  1.568  0.7031 
##  Chlosyne.palla - Pontia.occidentalis    1.18932 0.411 Inf  2.895  0.0582 
##  Chlosyne.palla - sedumRed               1.78147 0.501 Inf  3.559  0.0068 
##  Erebia.epipsodea - parnassius-sedum    -0.01659 0.402 Inf -0.041  1.0000 
##  Erebia.epipsodea - parnassius.clean    -0.17059 0.391 Inf -0.437  0.9995 
##  Erebia.epipsodea - Pieris.rapae        -0.17346 0.391 Inf -0.444  0.9994 
##  Erebia.epipsodea - Pontia.occidentalis  0.47014 0.447 Inf  1.051  0.9420 
##  Erebia.epipsodea - sedumRed             1.06229 0.531 Inf  2.001  0.4142 
##  parnassius-sedum - parnassius.clean    -0.15400 0.389 Inf -0.395  0.9997 
##  parnassius-sedum - Pieris.rapae        -0.15687 0.389 Inf -0.403  0.9997 
##  parnassius-sedum - Pontia.occidentalis  0.48674 0.446 Inf  1.091  0.9311 
##  parnassius-sedum - sedumRed             1.07888 0.530 Inf  2.036  0.3920 
##  parnassius.clean - Pieris.rapae        -0.00287 0.378 Inf -0.008  1.0000 
##  parnassius.clean - Pontia.occidentalis  0.64073 0.436 Inf  1.468  0.7638 
##  parnassius.clean - sedumRed             1.23287 0.522 Inf  2.363  0.2144 
##  Pieris.rapae - Pontia.occidentalis      0.64361 0.436 Inf  1.475  0.7596 
##  Pieris.rapae - sedumRed                 1.23575 0.522 Inf  2.369  0.2117 
##  Pontia.occidentalis - sedumRed          0.59214 0.565 Inf  1.047  0.9429 
## 
## Results are given on the log odds ratio (not the response) scale. 
## P value adjustment: tukey method for comparing a family of 7 estimates

aggregate(gain ~ species, data=allspp,FUN=mean) ## values

##               species       gain
## 1      Chlosyne.palla 0.21226636
## 2    Erebia.epipsodea 0.11603776
## 3    parnassius-sedum 0.11775074
## 4    parnassius.clean 0.13471366
## 5        Pieris.rapae 0.13504903
## 6 Pontia.occidentalis 0.07581329
## 7            sedumRed 0.04340590

## Loss
m4 <- glm(loss ~ species, data=allspp, family="quasibinomial")
summary(m4)

## 
## Call:
## glm(formula = loss ~ species, family = "quasibinomial", data = allspp)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.33500  -0.07252  -0.02074   0.04991   0.30572  
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                -3.01460    0.04489 -67.162  < 2e-16 ***
## speciesErebia.epipsodea    -0.29058    0.06816  -4.263 2.24e-05 ***
## speciesparnassius-sedum     0.74118    0.05546  13.364  < 2e-16 ***
## speciesparnassius.clean     0.22738    0.06048   3.759 0.000182 ***
## speciesPieris.rapae        -3.59718    0.26269 -13.694  < 2e-16 ***
## speciesPontia.occidentalis -0.77969    0.07867  -9.911  < 2e-16 ***
## speciessedumRed             0.18058    0.06106   2.958 0.003187 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for quasibinomial family taken to be 0.01077871)
## 
##     Null deviance: 26.2229  on 839  degrees of freedom
## Residual deviance:  8.7851  on 833  degrees of freedom
## AIC: NA
## 
## Number of Fisher Scoring iterations: 9

anova(m4, test="Chisq")

## Analysis of Deviance Table
## 
## Model: quasibinomial, link: logit
## 
## Response: loss
## 
## Terms added sequentially (first to last)
## 
## 
##         Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
## NULL                      839    26.2229              
## species  6   17.438       833     8.7851 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## pairwise comparison
emm4 <- emmeans(m4, specs = pairwise ~ species)
emm4

## $emmeans
##  species             emmean     SE  df asymp.LCL asymp.UCL
##  Chlosyne.palla       -3.01 0.0449 Inf     -3.10     -2.93
##  Erebia.epipsodea     -3.31 0.0513 Inf     -3.41     -3.20
##  parnassius-sedum     -2.27 0.0326 Inf     -2.34     -2.21
##  parnassius.clean     -2.79 0.0405 Inf     -2.87     -2.71
##  Pieris.rapae         -6.61 0.2588 Inf     -7.12     -6.10
##  Pontia.occidentalis  -3.79 0.0646 Inf     -3.92     -3.67
##  sedumRed             -2.83 0.0414 Inf     -2.92     -2.75
## 
## Results are given on the logit (not the response) scale. 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast                               estimate     SE  df z.ratio p.value
##  Chlosyne.palla - Erebia.epipsodea        0.2906 0.0682 Inf   4.263 0.0004 
##  Chlosyne.palla - parnassius-sedum       -0.7412 0.0555 Inf -13.364 <.0001 
##  Chlosyne.palla - parnassius.clean       -0.2274 0.0605 Inf  -3.759 0.0032 
##  Chlosyne.palla - Pieris.rapae            3.5972 0.2627 Inf  13.694 <.0001 
##  Chlosyne.palla - Pontia.occidentalis     0.7797 0.0787 Inf   9.911 <.0001 
##  Chlosyne.palla - sedumRed               -0.1806 0.0611 Inf  -2.958 0.0487 
##  Erebia.epipsodea - parnassius-sedum     -1.0318 0.0608 Inf -16.980 <.0001 
##  Erebia.epipsodea - parnassius.clean     -0.5180 0.0654 Inf  -7.922 <.0001 
##  Erebia.epipsodea - Pieris.rapae          3.3066 0.2639 Inf  12.532 <.0001 
##  Erebia.epipsodea - Pontia.occidentalis   0.4891 0.0825 Inf   5.929 <.0001 
##  Erebia.epipsodea - sedumRed             -0.4712 0.0659 Inf  -7.149 <.0001 
##  parnassius-sedum - parnassius.clean      0.5138 0.0520 Inf   9.879 <.0001 
##  parnassius-sedum - Pieris.rapae          4.3384 0.2609 Inf  16.631 <.0001 
##  parnassius-sedum - Pontia.occidentalis   1.5209 0.0724 Inf  21.019 <.0001 
##  parnassius-sedum - sedumRed              0.5606 0.0527 Inf  10.643 <.0001 
##  parnassius.clean - Pieris.rapae          3.8246 0.2620 Inf  14.599 <.0001 
##  parnassius.clean - Pontia.occidentalis   1.0071 0.0763 Inf  13.204 <.0001 
##  parnassius.clean - sedumRed              0.0468 0.0579 Inf   0.808 0.9843 
##  Pieris.rapae - Pontia.occidentalis      -2.8175 0.2668 Inf -10.562 <.0001 
##  Pieris.rapae - sedumRed                 -3.7778 0.2621 Inf -14.413 <.0001 
##  Pontia.occidentalis - sedumRed          -0.9603 0.0767 Inf -12.515 <.0001 
## 
## Results are given on the log odds ratio (not the response) scale. 
## P value adjustment: tukey method for comparing a family of 7 estimates

aggregate(loss ~ species, data=allspp,FUN=mean) ## values

##               species        loss
## 1      Chlosyne.palla 0.046770512
## 2    Erebia.epipsodea 0.035393793
## 3    parnassius-sedum 0.093347929
## 4    parnassius.clean 0.058018521
## 5        Pieris.rapae 0.001342621
## 6 Pontia.occidentalis 0.022003800
## 7            sedumRed 0.055513294

#### comparison with different RCP
library(tidyverse)

## standard error
se<-function(x) {sd(x)/sqrt(length(x))}

## separate GCM & RCP
allspp[,"RCP"] <- gsub("[^[:digit:]]","",allspp$model)
allspp[,"GCM"] <- gsub("[[:digit:]]","",allspp$model)

## rename species to be simpler
speciesShort <- data.frame(species = unique(allspp$species), spp=c("P. smintheus w/ Sedum","C. palla","P. smintheus","Sedum sp.","E. epipsodea","P. rapae","P. occidentalis"))
allspp <- merge(allspp, speciesShort, by="species")

meanchange <- allspp %>% group_by(RCP, spp) %>% summarize(loss=mean(areaLoss),errorLoss=se(areaLoss), gain=mean(areaGain), errorGain=se(areaGain))

### plot gain and loss as barplot


## convert change in study area to change from current distribution
allspp[,"loss"] <- allspp[,"areaLoss"]/allspp[,"oldRange"] ## proportion of oldrange
allspp[,"loss"] <- allspp[,"loss"]*-1 ## conert to negative
allspp[,"gain"] <- allspp[,"areaGain"]/allspp[,"oldRange"]
allspp[,"net"] <- allspp[,"loss"]+allspp[,"gain"]
allsppchange <- allspp %>% gather(change, value, c(loss,gain,net))

## convert values to percentagse
allsppchange[,"value"] <- allsppchange[,"value"]*100

change <- allsppchange %>% group_by(RCP, GCM, spp, change) %>% summarize(avg=mean(value), error=se(value)) %>%  data.frame(.)

## reorder plot
change$spp <- factor(change$spp, levels= c("P. smintheus","P. smintheus w/ Sedum","Sedum sp.","C. palla","E. epipsodea","P. occidentalis","P. rapae"))

## Rename GCM
gcmdf <- data.frame(GCM=c("gf","he","mg","mp"), gcm=c("GF (drier)","HE (drier & much warmer)","MG (wetter & less seasonal)","MP (modest warming & precipitation variability)"))
change <- merge(change, gcmdf, by="GCM")

## CB palette
cbPalette <- c("#56B4E9","#E69F00","#999999")

## Convert RCP to decimal
change$RCP <- as.character(as.numeric(change$RCP)/10)


## plot
ggplot(change, aes(x=spp, y=avg, fill=change)) + geom_bar(stat="identity", position=position_dodge()) + scale_fill_manual(values=cbPalette) + geom_hline(yintercept = 0) +geom_errorbar(aes(ymin=avg-error, ymax=avg+error), width=.2, position=position_dodge(.9)) + facet_grid(RCP ~ gcm ) + theme_bw()+ ylab("Percent change in range")  + xlab("") + theme(text = element_text(size=12), axis.text.x = element_text(angle=35, hjust=1), panel.grid.major = element_blank(), panel.grid.minor = element_blank()) + ylim(-20,40) + geom_vline(xintercept=2.5, linetype="dashed")

## values for results section
change %>% group_by(spp, change, GCM) %>% summarize(total=mean(avg)) %>%  data.frame(.)

##                      spp change GCM        total
## 1           P. smintheus   gain  gf  12.73532308
## 2           P. smintheus   gain  he  15.05210230
## 3           P. smintheus   gain  mg   7.77107945
## 4           P. smintheus   gain  mp  10.85659893
## 5           P. smintheus   loss  gf  -5.21624531
## 6           P. smintheus   loss  he  -3.54690218
## 7           P. smintheus   loss  mg  -2.60507321
## 8           P. smintheus   loss  mp  -2.78929670
## 9           P. smintheus    net  gf   7.51907777
## 10          P. smintheus    net  he  11.50520013
## 11          P. smintheus    net  mg   5.16600623
## 12          P. smintheus    net  mp   8.06730223
## 13 P. smintheus w/ Sedum   gain  gf   7.84925476
## 14 P. smintheus w/ Sedum   gain  he  15.16871048
## 15 P. smintheus w/ Sedum   gain  mg  11.23949190
## 16 P. smintheus w/ Sedum   gain  mp  13.07033273
## 17 P. smintheus w/ Sedum   loss  gf -13.99966609
## 18 P. smintheus w/ Sedum   loss  he -10.16195175
## 19 P. smintheus w/ Sedum   loss  mg  -5.97146174
## 20 P. smintheus w/ Sedum   loss  mp  -7.29908534
## 21 P. smintheus w/ Sedum    net  gf  -6.15041133
## 22 P. smintheus w/ Sedum    net  he   5.00675873
## 23 P. smintheus w/ Sedum    net  mg   5.26803016
## 24 P. smintheus w/ Sedum    net  mp   5.77124739
## 25             Sedum sp.   gain  gf  10.08347770
## 26             Sedum sp.   gain  he  17.20401871
## 27             Sedum sp.   gain  mg  11.60660773
## 28             Sedum sp.   gain  mp  14.99135796
## 29             Sedum sp.   loss  gf  -9.33036822
## 30             Sedum sp.   loss  he  -6.65938265
## 31             Sedum sp.   loss  mg  -2.55874429
## 32             Sedum sp.   loss  mp  -4.65891329
## 33             Sedum sp.    net  gf   0.75310948
## 34             Sedum sp.    net  he  10.54463606
## 35             Sedum sp.    net  mg   9.04786344
## 36             Sedum sp.    net  mp  10.33244467
## 37              C. palla   gain  gf  14.50704118
## 38              C. palla   gain  he  34.96690525
## 39              C. palla   gain  mg  17.52415236
## 40              C. palla   gain  mp  17.90844371
## 41              C. palla   loss  gf  -9.14959795
## 42              C. palla   loss  he  -3.49462992
## 43              C. palla   loss  mg  -2.56526157
## 44              C. palla   loss  mp  -3.49871547
## 45              C. palla    net  gf   5.35744323
## 46              C. palla    net  he  31.47227533
## 47              C. palla    net  mg  14.95889079
## 48              C. palla    net  mp  14.40972824
## 49          E. epipsodea   gain  gf  15.03072622
## 50          E. epipsodea   gain  he  16.08429303
## 51          E. epipsodea   gain  mg   8.26477189
## 52          E. epipsodea   gain  mp  14.63981915
## 53          E. epipsodea   loss  gf  -0.28321220
## 54          E. epipsodea   loss  he  -0.02357398
## 55          E. epipsodea   loss  mg  -0.14289250
## 56          E. epipsodea   loss  mp  -0.08736992
## 57          E. epipsodea    net  gf  14.74751402
## 58          E. epipsodea    net  he  16.06071905
## 59          E. epipsodea    net  mg   8.12187939
## 60          E. epipsodea    net  mp  14.55244923
## 61       P. occidentalis   gain  gf   2.50971005
## 62       P. occidentalis   gain  he   7.25886875
## 63       P. occidentalis   gain  mg   0.82672600
## 64       P. occidentalis   gain  mp   6.76705512
## 65       P. occidentalis   loss  gf  -8.93687554
## 66       P. occidentalis   loss  he  -6.40278041
## 67       P. occidentalis   loss  mg  -3.55227439
## 68       P. occidentalis   loss  mp  -3.31338715
## 69       P. occidentalis    net  gf  -6.42716549
## 70       P. occidentalis    net  he   0.85608834
## 71       P. occidentalis    net  mg  -2.72554839
## 72       P. occidentalis    net  mp   3.45366798
## 73              P. rapae   gain  gf   8.02200871
## 74              P. rapae   gain  he  10.63199663
## 75              P. rapae   gain  mg   2.85393418
## 76              P. rapae   gain  mp   8.81737528
## 77              P. rapae   loss  gf  -2.89463721
## 78              P. rapae   loss  he  -1.68913692
## 79              P. rapae   loss  mg  -2.17982333
## 80              P. rapae   loss  mp  -2.03792256
## 81              P. rapae    net  gf   5.12737150
## 82              P. rapae    net  he   8.94285971
## 83              P. rapae    net  mg   0.67411084
## 84              P. rapae    net  mp   6.77945272

Maps of projected species change in the future

###  get outline of western NA
# get California outline
us <- getData("GADM", country="USA", level=1) ## use state bounds from gadm website:
can <- getData("GADM", country="CAN", level=1) ## use state bounds from gadm website:
NApoly <- rbind(us,can)
studyarea <- raster("data//studyArea.tif")
NApoly <- crop(NApoly, studyarea)

NAreduced <- gSimplify(NApoly, tol=0.15, topology=FALSE) ## reduce tolerance to minimize islands
plot(NAreduced)
spdf <- SpatialPolygonsDataFrame(NAreduced, data.frame(NApoly))

studyOutline <- NAreduced

## Load study area
studyOutline <- readOGR("data//studyOutline.shp")

library(colorspace)
## list species
species <- unique(allspp$species)

### list rasters
rasDiff <- list.files("F://PASMmodels//RasterDiff//", pattern=glob2rx(paste0(species[5],"*85*")), full.names = T)

# rasDiff <- list.files("E://PASMmodels//RasterDiff//", pattern="*85*", full.names = T) ## parnassius w sedum
rasOut <- list.files("F://PASMmodels//rasOut//", pattern="current", full.names = T)

### Average across climate models
diffmap <- mean(stack(rasDiff))

## Load current raster file
curmap <- raster(rasOut[6])

## select color ramp
# colors <- rev(diverge_hcl(11, palette = "Blue Red 3"))
# colors[1:4] <- c( "#5F1415" , "#5F1415",  "#9D3D3D", "#9D3D3D" )
# colors[7:11] <- c("#265BAB","#265BAB","#002F70","#002F70")
# colors[5:6] <- "#FFFFFF00"

## select new color ramp
colors <- rev(diverge_hcl(11, palette = "Blue Red 3"))
colors[1:4] <- c( "#FF7F50" , "#FF7F50",  "#FFA400", "#FFA400" )
colors[7:11] <- c("#067BC2","#067BC2","#002F70","#002F70")
colors[5:6] <- "#FFFFFF00"


## set breakpoints 
breakpoints <- seq(-1,1, length.out = 11)

## set breakpoints for current distro
breakpointsgrey <- seq(0,1, length.out = 11)
colorgrey <- rev(sequential_hcl(11, palette = "Grays"))

## plot area where PASM is lost


par(mar = c(3, 3, 0.5, 0.5))
plot(curmap, breaks=breakpointsgrey, col=alpha(colorgrey,0.7), legend=FALSE) ## add current distro as grey 
plot(diffmap, breaks=breakpoints, col=colors, add=T) ## add change in distro with colours
plot(studyOutline, add=T, border = "#4C4C4C90") ## outline of western North America
text(-145,36, expression(italic("P. rapae")), cex=2)

pdf("figure3.pdf", useDingbats=F, width=21, height=7)
par(mfrow=c(2,2))
dev.off()
# 
# text(-145,36, expression(atop(italic("P. smintheus"),italic("w/ sedum"))), cex=2)