R codes

In this page, I aim to provide a collection of R codes and functions that can be valuable to fellow scientists engaged in plant ecology research or related fields.

Your feedback and contributions are highly encouraged as I do not consider myself an R expert. Please feel free to contact me at gabriele.midolo [at] gmail [dot] com for any question or feedbacks.

1. Quickly retrieve ‘LHS’ traits for the European flora from FloraVeg.EU

updated (September 2023)

FloraVeg.EU is a great source to obtain data for the European vegetation, including habitat information and species traits. Some of these information are fully available at the download section. The data on FloraVeg.EU also contains some information on three main functional traits of the ‘leaf-height-seed’ (LHS) scheme proposed in the famous paper by Westoby (1998), namely specific leaf area (SLA), plant height and seed mass.

Screenshot from the FloraVeg.EU website. For each species, some traits of the LHS scheme might be available. Original link: here.

These data are however not (yet!) available in the download section.

The best solution to get such trait data is to get in touch with my former group members in Brno (Czech Republic) (see contacts here). Of course databases like TRY and LEDA are also great option, but requires some times to access the data and process them.
Sometimes one just need traits for a few single species for a quick analysis - (or to make comparisons / check outliers of LHS traits with other databases). In such cases one can easily use R to quickly access such information using the rvest R package (Wickham, 2022). This works pretty good if your plant names already follows the nomenclature from Euro+Med!
N.B.: Please, cite the FloraVeg.EU when using the data obtained this way. See ‘Data source and citation’ below

Here is the get.LHS.FloraVegEU function:

get.LHS.FloraVegEU <- function(species) {
  require(rvest)
  
  url <- paste0('https://floraveg.eu/taxon/overview/',species)
  webpage <- read_html(URLencode(url))
    
  #1. Get plant height
  str_plant.height <- '#panel_2 .featureDetail:nth-child(1) b'
  plant.height_html <- html_nodes(webpage, str_plant.height)
  plant.height <- html_text(plant.height_html)
  plant.height <- gsub('[^0-9.-]', '', plant.height)
  if(identical(plant.height, character(0))) {plant.height=NA} else {
    if(plant.height=='' | plant.height=='-') {plant.height=NA} else {
      plant.height <- as.numeric(plant.height)
    }
    }
    
  #2. get specific leaf area
  str_SLA <- '#panel_3 .align-items-center div'
  SLA_html <- html_nodes(webpage, str_SLA)
  SLA <- html_text(SLA_html)
  if(identical(SLA, character(0))){
    SLA <- NA
   } else {
     for (k in 1:length(SLA)) {
       i2 <- gsub('mm2', '', SLA[[k]])
       SLA[[k]] <- gsub('[^0-9.-]', '', i2)
     }
     SLA <- suppressWarnings(as.numeric(SLA[!is.na(as.numeric(SLA))]))
     
     if (identical(SLA, numeric(0))){
       SLA <- NA
     }
     
  }
    
  #3. get seed mass
  str_SM <- '#panel_5 b'
  SM_html <- html_nodes(webpage, str_SM)
  SM <- html_text(SM_html)
  SM <- gsub('[^0-9.-]', '', SM)
  SM <- ifelse(identical(SM, character(0)), NA, SM)
  SM <- as.numeric(SM)
    
  res <- data.frame(
      plant_height = plant.height,
      sla = SLA,
      seed_mass = SM)

  return(res)
}

To use the function just type the name of a single species present in the FloraVeg.EU checklist to obtain the results. For example:

> get.LHS.FloraVegEU('Bellis perennis')
  plant_height   sla seed_mass
1         0.09 32.32      0.12

Note that species nomencalure must correspond the one in FloraVeg.EU. Mispelled species or species for which data are not available will return either an error or NA values. To work with several species and avoid this issue, one can easily apply a modified version of this function using the ‘possibly’ function of the purrr R package (Wickham, 2023).

#An example of species list (vector):
splis = c('Bellis perennis', #Existing species with full LHS information
          'Pinus mugo', #Existing species with incomplete LHS information
          'Pseudobetckea caucasica', #Existing species with no information 
          'Taraxacum midolii' #Mispelled / unexisting species
          )

library(purrr)
get.LHS.FloraVegEU_possibly <- possibly(get.LHS.FloraVegEU, 
                                        data.frame(plant_height = NA, sla = NA, seed_mass = NA))
res <- map(splis, get.LHS.FloraVegEU_possibly) #map the function
names(res) <- splis #set species names
res.df <- do.call(rbind, res) # bind dataset

The code returns:

> res.df
                        plant_height   sla seed_mass
Bellis perennis                 0.09 32.32      0.12
Pinus mugo                      3.12    NA      6.08
Pseudobetckea caucasica           NA    NA        NA
Taraxacum midolii                 NA    NA        NA

References:

FloraVeg.EU – Database of European Vegetation, Habitats and Flora. www.floraveg.eu

Westoby, M. (1998). A leaf-height-seed (LHS) plant ecology strategy scheme. Plant and soil, 199, 213-227. https://doi.org/10.1023/A:1004327224729

Wickham H (2023). purrr: Functional Programming Tools. R package version 1.0.1, https://CRAN.R-project.org/package=purrr.

Wickham H (2022). rvest: Easily Harvest (Scrape) Web Pages. R package version 1.0.3, https://CRAN.R-project.org/package=rvest.

Data source and citation for plant functional traits:

Axmanová, I. (2022). Plant height. – www.FloraVeg.EU.

Axmanová, I. (2022). Specific leaf area. – www.FloraVeg.EU.

Axmanová, I. (2022). Seed mass. – www.FloraVeg.EU.

French Flora database (baseflor), project of Flore et végétation de la France et du Monde: CATMINAT. Available at http://philippe.julve.pagesperso-orange.fr/catminat.htm [accessed June 2020]

García-Gutiérrez, T., Jiménez-Alfaro, B., Fernández-Pascual, E., & Müller, J. V. (2018). Functional diversity and ecological requirements of alpine vegetation types in a biogeographical transition zone. Phytocoenologia, 77–89. https://doi.org/10.1127/phyto/2017/0224

Guarino, R., La Rosa, M. & Pignatti, S. (Eds) (2019). Flora d’Italia, volume 4. Bologna: Edagricole.

Hintze, C., Heydel, F., Hoppe, C., Cunze, S., König, A., & Tackenberg, O. (2013). D3: The Dispersal and Diaspore Database – Baseline data and statistics on seed dispersal. Perspectives in Plant Ecology, Evolution and Systematics, 15(3), 180–192. https://doi.org/10.1016/j.ppees.2013.02.001

Ladouceur, E., Bonomi, C., Bruelheide, H., Klimešová, J., Burrascano, S., Poschlod, P., … Jiménez-Alfaro, B. (2019). The functional trait spectrum of European temperate grasslands. Journal of Vegetation Science, 30(5), 777–788. https://doi.org/10.1111/jvs.12784

Kaplan, Z., Danihelka, J., Chrtek, J. Jr., Kirschner, J., Kubát, K., Štěpánek, J. & Štech, M. (Eds) (2019). Klíč ke květeně České republiky [Key to the flora of the Czech Republic]. Ed. 2. Praha: Academia.

Kleyer, M., Bekker, R. M., Knevel, I. C., Bakker, J. P., Thompson, K., Sonnenschein, M., … Peco, B. (2008). The LEDA Traitbase: A database of life-history traits of the Northwest European flora. Journal of Ecology, 96(6), 1266–1274. https://doi.org/10.1111/j.1365-2745.2008.01430.x

Tavşanoğlu, Ç., & Pausas, J. (2018). A functional trait database for Mediterranean Basin plants. Scientific data, 5, 180135. https://doi.org/10.1038/sdata.2018.135

2. Plot pairwise complete correlation matrices

updated (September 2023)

It often happens that a species × trait matrix (or any data set really) is incomplete because of missing trait information for some species. Some species may have some traits, while others don’t. In my experience, this makes difficult to check trait-trait co-variation using common visualization tools in R. Here, I attempted to provide a simple solution by plotting pairwise correlation estimates retaining complete observations (with cor(..., use="pairwise.complete.obs")) for each pair in the lower-right panel and reporting the sample size in the upper-left panel. This way, one can always know how many species report trait data information for each pair and check their correlation value.
Here are the functions that you can load in your environment to easily produce these plots.

#Plot correlation matrix with gaps

#1. Accessory functions ####
getsamplesize <- function(vec_a, vec_b) {
  nrow(drop_na(data.frame(vec_a,vec_b)))
}
firstup <- function(x) {
  substr(x, 1, 1) <- toupper(substr(x, 1, 1))
  x
}
get_lower_tri<-function(cormat){
  cormat[upper.tri(cormat)] <- NA
  return(cormat)
}
get_upper_tri <- function(cormat){
  cormat[lower.tri(cormat)]<- NA
  return(cormat)
}

#2. load the function
#N.B. this function requires tidyverse, reshape2, and corrr packages

plot.inc.cor <- function(mat, round.label.digit=2, cor.method='pearson', size.label=4){
  
  corm <- cor(mat, method = cor.method, use="pairwise.complete.obs")
  
  pcorm <- cor(mat, method = cor.method, use="pairwise.complete.obs") %>%
    get_lower_tri %>%
    reshape2::melt()
  pcorm$value = ifelse(pcorm$Var1 == pcorm$Var2, NA, pcorm$value)
  psams <- corrr::colpair_map(mat, getsamplesize) %>%
    as.data.frame %>%
    tibble::column_to_rownames('term') %>%
    as.matrix %>%
    get_upper_tri %>%
    reshape2::melt() %>%
    setNames(names(pcorm)) 
  
  pcorm$value = round(pcorm$value, round.label.digit)
  # Format the numeric values with two decimal places
  pcorm$formatted_value <- sprintf(paste0('%.',round.label.digit,'f'), pcorm$value)
  pcorm$formatted_value <- ifelse(pcorm$formatted_value=='NA',NA,pcorm$formatted_value)
  
  p <- ggplot() + 
    geom_tile(data = pcorm, aes(x=Var1, y=Var2, fill=value), color='grey80')+
    geom_text(data = psams, aes(x=Var1, y=Var2, label = value, alpha=log(value)), color = "black", size = size.label) +
    geom_text(data = pcorm, aes(x=Var1, y=Var2, label = formatted_value), color = "black", size = size.label) +
    scale_fill_gradient2(low = "#4A6FE3", mid = "white", high = "#D33F6A", 
                         midpoint = 0, limit = c(-1,1), space = "Lab", 
                         na.value = 'white',
                         name=paste0(firstup(cor.method),'\ncorrelation')) +
    theme_classic() +
    guides(alpha='none')+
    theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1, color='black'),
          axis.text.y = element_text(color='black'),
          axis.title = element_blank(),
          axis.line = element_blank(),
          panel.background = element_rect(fill = "white",
                                          colour = "white")) +
    coord_fixed() +
    geom_abline(slope = 1, intercept = 0, color='grey80') 
  return(p)
}

Simulate some data and visualize the output with the plot.inc.cor function:

set.seed(9)

library(tidyverse); library(reshape2); library(corrr)

# Number of rows and columns in the dataframe
num_species <- 100
num_traits <- 7

# Generate random data with NAs
dmatrix <- matrix(runif(num_species * num_traits), nrow = num_species)

# Randomly replace 25% of values with NA
dmatrix[sample(length(dmatrix), length(dmatrix)*0.25)] <- NA

# Convert the matrix to a dataframe
dmatrix <- as.data.frame(dmatrix)
names(dmatrix) = paste0('Trait',1:ncol(dmatrix))

plot.inc.cor(dmatrix, cor.method='pearson', round.label.digit=2, size.label=3)

Example of output obtained, using default options. Upper left panel displays the number of observations without NAs (complete cases) for each pair; bottom right panel shows their correlation coefficient (default is ‘pearson’).

3. Calculate range elongation

updated (April 2024)

UPDATE! You can download the get_RLI function here. For additional details, please refer to the tutorial below and to the main publication available online: Midolo (2024) in Global Ecology and Biogeography.

Species range shape refers to how a species’ distribution differs from a perfectly symmetrical or isotropic pattern expected under neutral range dynamics. Some researchers have indeed proposed estimating shape elongation as a significant biogeographical species attribute (Rapoport, 1982; Brown et al., 1996; Pigot et al., 2010; Baselga et al., 2012; Castro-Insua et al., 2018).

Here I share an R code (= the get_RLI function) to calculate a metric similar to what described as the Range Linearity Index (RLI) by Pigot et al. (2010). The RLI is a dimensionless measure calculated as RLI = l² / a, where l represents the maximum linear dimension, and a represents the area of the polygon. To compute the l metric, we will determine the longest length among the shortest paths connecting all possible combinations of the westernmost, easternmost, northernmost, and southernmost geographical points at the extremes of each polygon. As a result, the RLI offers an intuitive way to quantify shape elongation, analogous to the long axis to width ratio (Graves 1988; Pigot et al. 2010). For example, an RLI value of 2 indicates a shape that is twice as long as it is wide.

Workflow to calculate range shape (= range elongation calculated as the Range Linearity Index, RLI) in the `get_RLI` function. The figure illustrates a schematic example for the native range of the wych elm (*Ulmus glabra*) (data source: Caudullo et al. 2017). Panel a) display the whole range of the species. Panel b) shows the largest range portions (= fragments) selected for RLI calculation (the biggest range fragments are used by default based on the rationale these are more important in defyining the whole range shape). RLI for species w is calculated as the average of RLI values of each range fragment (RLIi). RLI is the dimensioness ratio between squared length (l²) and area of the polygon, where l is the longest path across the shortest distance between the geographical points located at the extremes of each polygon (line in red; panel c). Maps are displayed with UTM EPSG:32633 projection.

Here is the code for the get_RLI function:

get_RLI <- function(range, # a multi polygons sf object of the species range
                    directional.constraint='none', # constraint the shortest path? default is none, use 'lon' or 'lat' to constrain the paths along longitude or latitude
                    fragment.inclusivity = 0.1, # the lowest this value, the highest is the proportion of fragments considered for range calculation. Lowest fragments 
                    columns.and.rows=100, # number of col and rows of the raster for each range fragment, increasing this number makes a raster with finer resolution, increasing considerably the amount of time!
                    direction = 8, # the 'directions' argument of the gdistance::transition() function. Values of 4, 8 and 16 are reccomended
                    graphical = T, # if T, plot the range with line objects (paths) used for RLI estimation
                    col.bl='red', # if graphical == T, the color of the final paths used for RLI estimation in the graph
                    very.graphical = F) # plot each fragment under examination and its candidate paths
  {
  st=Sys.time()
  
  UTM32 = '+proj=utm +zone=32 +datum=WGS84 +units=m +no_defs'
  orirange <- st_sf(range) # trasform to sf object
  st_geometry(orirange) <- 'geometry'
  orirange <- st_transform(orirange, UTM32) #reproject to UTM zone 32
  rprp <- orirange %>% st_cast('POLYGON') # cast to POLYGON (= separating fragments)
  rprp$area <- as.numeric(st_area(rprp$geometry)) # calculate area for each fragment in m^2
  rprp = rprp %>% arrange(desc(area))
  rprp$adif <- rprp$area/max(rprp$area) # proportional area difference with the nmain range
  #exclude too small fragments:
  rprpo = rprp
  rprp = rprp[rprp$adif>=fragment.inclusivity,]
  rprp$id = 1:nrow(rprp)
  
  print(paste0('Assessing ',nrow(rprp),' range fragments out of ', nrow(rprpo)))
  
  sentieri <- list() #list where to store fragments' RLI stats
  for (frag in 1:nrow(rprp)) {
    
    st.i <- Sys.time()
    
    polyi <- rprp[frag,]
    #apply Behrman transofmration, lon_0 set depending upon longitudinal extension (correcting for distortion)
    polyi = st_transform(polyi, crs = '+proj=cea +lon_0=0 +lat_ts=30 +x_0=0 +y_0=0 +datum=WGS84 +ellps=WGS84 +units=m +no_defs')
    if(round(abs(min(st_coordinates(polyi)[,1]))) == round(abs(max(st_coordinates(polyi)[,1])))){
      polyi = st_transform(rprp[frag,], crs = '+proj=cea +lon_0=10 +lat_ts=30 +x_0=0 +y_0=0 +datum=WGS84 +ellps=WGS84 +units=m +no_defs')
    }
    
    crs.polyi <- crs(as_Spatial(polyi))
    
    poly.grid <- st_make_grid(polyi, n=rep(columns.and.rows,2)) %>%
      st_sf() %>%
      mutate(row.id = row_number())
    
    defaultW <- getOption("warn")
    
    #1. define the area where the line can pass through (= 'playground')
    options(warn = -1)
    playground <- poly.grid %>%
      left_join(
        as.data.frame(st_intersects(poly.grid,polyi))%>%rename(value = col.id),'row.id'
      ) %>% dplyr::select(-row.id) %>%
      st_cast('POINT') %>%
      as_Spatial() %>% 
      as.data.frame() %>% 
      setNames(c('z','x','y')) %>% 
      dplyr::select(x,y,z) %>%
      rasterFromXYZ(crs = crs.polyi)
    options(warn = defaultW)
    
    #plot:
    if(very.graphical) {plot(playground, col='grey',main=paste0('fragment n. ',frag),legend=F)}
    
    
    #2. define the start/arrival point cells (= 'rallypoints')
    coord <- st_coordinates(polyi) %>% as.data.frame()
    
    options(warn = -1)
    rallypoints <- list(
      southernmost = coord[which.min(coord[, "Y"]), ],
      northernmost = coord[which.max(coord[, "Y"]), ],
      easternmost = coord[which.max(coord[, "X"]), ],
      westernmost = coord[which.min(coord[, "X"]), ]
    ) %>% 
      bind_rows(.id='flag') %>%
      st_as_sf(coords=c("X","Y"), crs = st_crs(polyi)) %>%
      dplyr::select(flag,geometry)
    
    raster.cooord <- poly.grid %>%
      left_join(
        as.data.frame(st_intersects(poly.grid,rallypoints))%>%rename(value = col.id),'row.id'
      ) %>% 
      filter(!is.na(value)) %>%
      dplyr::select(-row.id) %>%
      st_centroid() %>%
      st_coordinates %>%
      as.data.frame()
    
    rallyrasters <- list(
      southernmost = raster.cooord[which.min(raster.cooord[, "Y"]), ],
      northernmost = raster.cooord[which.max(raster.cooord[, "Y"]), ],
      easternmost = raster.cooord[which.max(raster.cooord[, "X"]), ],
      westernmost = raster.cooord[which.min(raster.cooord[, "X"]), ]
    ) %>% 
      bind_rows(.id='flag')
    
    options(warn = defaultW)
    
    if(directional.constraint == 'lon') {
      rallyrasters <- rallyrasters %>% filter(flag == 'easternmost' | flag == 'westernmost')
    }
    
    if(directional.constraint == 'lat') {
      rallyrasters <- rallyrasters %>% filter(flag == 'southernmost' | flag == 'northernmost')
    }
    
    rallyrasters_spatial <- rallyrasters %>% st_as_sf(coords = c('X','Y'), crs=crs.polyi)
    
    #3. force misplaced points towards raster, if needed
    check.allocation <- raster::extract(playground,rallyrasters_spatial)
    if (any(is.na(check.allocation))){
      badpoints <- rallyrasters[is.na(check.allocation),] %>% st_as_sf(coords=c('X','Y'))
      for (k in 1:nrow(badpoints)) {
        vvo <- rasterToPoints(playground) %>% as.data.frame() %>% filter(!is.na(z))
        vv <- st_as_sf(vvo,coords=c('x','y'))
        vv$dist <- st_distance(vv$geometry, badpoints[k,]$geometry)
        new.alloc <- vvo[which(vv$dist == min(vv$dist)),] %>% sample_n(1)
        rallyrasters[is.na(check.allocation),][,2:3][k,]$X = new.alloc$x
        rallyrasters[is.na(check.allocation),][,2:3][k,]$Y = new.alloc$y
      }}
    
    rallyrasters_spatial <- rallyrasters %>% st_as_sf(coords = c('X','Y'), crs=crs.polyi)
    
    if(very.graphical) {
      rallyrasters_spatial$flagcol = c(rep('grey30',4))[1:nrow(rallyrasters)]
      plot(rallyrasters_spatial, col=rallyrasters_spatial$flagcol,add=T,legend=F, pch=16)
    }
    
    thepoints2test <- rallyrasters[,-1]
    thepoints2test$id <- 1:nrow(thepoints2test)
    
    egrid_unique.combs <- as.data.frame(t(combn(thepoints2test$id,2))) %>% 
      bind_rows(data.frame(V1=thepoints2test$id,V2=thepoints2test$id)) %>% 
      setNames(c('n1','n2')) #a non-redundant version of expand_grid
    
    egrid <- cbind(
      expand_grid(n1=thepoints2test$id, n2=thepoints2test$id),
      expand_grid(d1=cbind(thepoints2test$X,thepoints2test$Y), d2=cbind(thepoints2test$X,thepoints2test$Y))
    ) %>% 
      semi_join(egrid_unique.combs, by=c('n1','n2'))%>% #a non-redundant tibble of pairwise combinations
      filter(n1!=n2)
    egrid <- egrid[,-c(1,2)]
    
    tr1 <- gdistance::transition(playground, transitionFunction=mean, directions=direction)
    tr1 <- gdistance::geoCorrection(tr1,'c')
    
    #start detecting the possible paths...
    if(very.graphical) {vcol <- viridis::viridis(nrow(egrid),direction = 1)}
    linee <- list()
    lunghezze <- vector()
    for (l in 1:nrow(egrid)) {
      linee[[l]] <- gdistance::shortestPath(tr1, 
                                    as.numeric(egrid[l,1]),
                                    as.numeric(egrid[l,2]),
                                    output="SpatialLines") %>%
        st_as_sf() %>%
        st_set_crs(tr1@srs) %>%
        st_transform(st_crs(orirange))
      
      if(very.graphical){plot(linee[[l]] %>% st_transform(crs(playground)), add=T, col=vcol[l],lwd=2, lty=1)}
      lunghezze[l] <- as.numeric(st_length(linee[[l]]))
    }
    
    I.stat <- max(lunghezze)
    a <- polyi$area
    
    which.best.line <- c(which(lunghezze == I.stat))
    bl.id <- ifelse(length(which.best.line) == 1, which.best.line, sample(which.best.line,1))
    bl <- linee[[bl.id]]

    sentieri[[frag]] <- list(
      best.path = bl,
      df = data.frame(fragment=frag, I=I.stat, A=a, RLI = as.numeric(I.stat)^2/a)
    )
    
    cati = str_pad(frag,nchar(nrow(rprp)),side='left',pad='0')
    et.i = Sys.time() - st.i
    message('Fragment ',cati,' time: ',paste(round(et.i,2)),' ',attr(et.i,'units'))
  }
  
  
  out = list(
    RLI.vals = bind_rows(purrr::map(sentieri, function(x){x$df})),
    RLI.lins = bind_rows(purrr::map(sentieri, function(x){st_sf(x$best.path)}))
  )
  
  if(graphical){
    plot(as_Spatial(orirange), col='black', border=NA)
    plot(as_Spatial(rprp),col='darkolivegreen3',add=T,border=NA)
    plot(out$RLI.lins, col=col.bl, add=T, lwd=3)
  }
  
  et=Sys.time()-st
  message(paste0('... DONE! Total elapsed time: ',paste(round(et,2)),' ',attr(et,'units')))
  return(out)
}

Here is a simple tutorial. In this example, we calculate RLI for Taurus fir (Abies cilicica), a tree species growing in Lebanon and Turkey. The original native range was downloaded here (source: Caudullo et al. 2017).

#1. prepare the data

#the function depends upon the following R packages:

library(tidyverse); library(sf); library(raster); library(gdistance)

#define coordinates of the simplifed range of Taurus fir

coords <- "MULTIPOLYGON (((30.70949 37.44737, 30.47034 37.32006, 30.52049 37.20553, 30.71815 37.2789, 30.70949 37.44737)), ((31.11012 37.42078, 30.97528 37.4679, 30.88733 37.36318, 30.93263 37.24811, 31.11012 37.42078)), ((36.46468 37.03979, 36.31841 36.89317, 36.33939 36.7773, 36.49918 36.8471, 36.46468 37.03979)), ((36.6707 37.29474, 36.66847 37.51403, 36.59149 37.58607, 36.59808 37.87782, 36.73713 37.93399, 36.84522 37.90433, 36.96676 37.9688, 36.96026 38.04769, 36.70301 38.12471, 36.67512 38.18166, 36.54407 38.27753, 36.43666 38.13201, 36.15473 38.10206, 36.21972 38.26696, 36.29701 38.28576, 36.39462 38.32924, 36.09729 38.36283, 35.89325 38.15608, 35.73671 38.09842, 35.69503 37.97337, 35.24333 37.68169, 35.04145 37.78691, 34.84408 37.54285, 34.47599 37.51281, 34.39836 37.40821, 34.6797 37.42135, 34.68839 37.31277, 34.55881 37.28519, 34.27545 37.15911, 34.18091 36.91732, 33.99108 36.94332, 33.80763 36.56233, 34.02319 36.61594, 34.19481 36.74509, 34.34339 36.97006, 34.59568 37.03146, 34.63063 37.1486, 34.94324 37.24132, 34.98489 37.32413, 35.56452 37.59327, 35.79142 37.8377, 35.8945 38.05866, 35.97055 37.94501, 35.95044 37.6577, 36.13366 37.56702, 36.30933 37.58298, 36.43974 37.49443, 36.46066 37.35781, 36.6707 37.29474)), ((36.23528 34.52347, 35.98332 34.34524, 35.96573 34.2879, 36.28164 34.46973, 36.23528 34.52347)), ((31.27839 37.37006, 31.34101 37.04838, 31.52389 36.88489, 31.95332 36.8085, 32.09129 36.57597, 32.23787 36.55928, 32.48722 36.16406, 33.13701 36.23291, 33.17804 36.38689, 33.09851 36.40655, 33.05761 36.52104, 32.96027 36.36613, 32.73956 36.37612, 32.63873 36.26141, 32.5014 36.32853, 32.48627 36.75422, 32.34106 36.76603, 32.26759 36.84551, 32.28035 36.75343, 32.11068 36.73524, 32.08518 36.86551, 31.86951 36.9232, 31.95336 37.16447, 31.70671 37.38032, 31.4169 37.29067, 31.27839 37.37006)), ((32.93654 36.86382, 32.99047 36.65178, 33.08621 36.62425, 33.04779 36.8336, 32.93654 36.86382)))"

#coordinates to multipolygon sf object:

range <- st_as_sfc(coords, crs='+proj=longlat +datum=WGS84 +no_defs')

#quickly visualise the geometry in a map (optional, requires mapview):

mapview::mapview(range)

Simplified polygon shape for Taurus fir (*Abies cilicica*) using mapview

Now, let’s run the get_RLI function, with default options:

#run the function with default options:
res <- get_RLI(range)
[1] "Assessing 2 range fragments out of 7" 
Fragment 1 time: 2.82 secs 
Fragment 2 time: 3.04 secs 
... DONE! Total elapsed time: 5.94 secs

Graphical output obtained from the function at the end of RLI calculation (set `graphical=FALSE` to supress the plotting). By default the function displays the different fragments composing species ranges and the longest paths (in red). These are only calculated and drawn for the range fragments considered in the analysis (in green; unutilized range fragment are displayed in black).

Let’s inspect the results:

res$RLI.vals # values of RLI for each fragment, units are in m
  fragment        I           A      RLI
1        1 248601.8  5779445898 10.69356
2        2 374410.0 12863135567 10.89803

weighted.mean(res$RLI.vals$RLI, w = res$RLI.vals$A) # obtain RLI, weighted by the size of each fragment
10.83464

res$RLI.lins # LINESTRING geometries (the drawn lines)

We calculated the species-level RLI, using the average of the RLI of each fragment weighted by its area A. We conclude that range elongation of Abies cilicica is quite pronounced (RLI = 10.8), meaning it is almost about eleven times long as it is wide.

We can allow the function to return some plot displaying all possible shorthests path considered for each range fragment included in the analysis by setting the parameter very.graphical=TRUE.

# show the intermediate steps of calculation for each path
get_RLI(range, very.graphical = T, graphical = F)

All possible paths considered in each of the fragments assessed. Across each of the four path, the longest one is selected to calculate l by default. N.B. the geographic coordinates on the xy follow the Behrman transformation for estimation of geographic extremes.

In some systems, one may want to focus exclusively upon the elongation along either longitude or latitude. For example, given the south-to-north distribution of Andean montane forests, one might be interested in knowing the elongation of a species along latitude (e.g. see Graves 1988). On the other hand, range elongation along longitiude might be a good indicator for e.g. plant species constrained to coastal lines in the Mediterranean. In this context, we can simply force the function to consider only northermost-to-southernmost points with directional.constraint = 'lat' or westernmost-to-easternmost points with directional.constraint = 'lon' (default is directional.constraint = 'none')

#constraint shortest path only latitude or longitude paths
get_RLI(range, directional.constraint = 'lat', col.bl = 'red')
get_RLI(range, directional.constraint = 'lon', col.bl = 'blue')

The visual output obtained when `directional.constraint` parameter is modified accordingly (edited figures)

Finally, one may also consider more fragments for RLI calculation, as in principle there is no reason to exclude smallest fragments. However, I) range polygons could be much more complex and bigger than what we are using here, resulting into non-negligible computation times for this function, and II) the RLI calculation weighted by area makes nearly useless to consider all possible tiny fragments composing a species range. Anyway, we can modify the fragment.inclusivity parameter to do so (default here is setted to 0.1). This is a threshold value. It allows the inclusion of fragments by calculating the ratio between the area of each fragment to the area of the fragment with the largest area. Only fragments with values greater or equal to fragment.inclusivity are included for RLI calculation. Thus, the lowest this value, the highest is the proportion of fragments considered for range calculation. Set fragment.inclusivity=0 if you want all fragments to be considered for RLI estimation.

#decrease inclusivity treshold for range fragments (i.e., consider an increasing number of smaller fragments for RLI estimation) 
res2 <- get_RLI(range, fragment.inclusivity = 0.02) # default is 0.1

> res2$RLI.vals # we now included more fragments
  fragment         I           A       RLI
1        1  34405.63   304506996  3.887422
2        2  34215.45   364252009  3.213975
3        3 248601.78  5779445898 10.693559
4        4 374409.97 12863135567 10.898029

> weighted.mean(res2$RLI.vals$RLI, w = res2$RLI.vals$A) #but they matter less in overall RLI calculation
[1] 10.58135

You can see we have now two smaller fragments considered for RLI calculation. Yet, these new smaller fragments weight less in the final average, and thus they do not affect the final RLI calculation that much. Again, this follow the assumption that smaller ranges matter less for overall range shape estimation.

References

Baselga, A., Lobo, J. M., Svenning, J. C., & Araújo, M. B. (2012). Global patterns in the shape of species geographical ranges reveal range determinants. Journal of Biogeography, 39(4), 760–771.

Brown, J. H., Stevens, G. C., & Kaufman, D. M. (1996). The Geographic Range: Size, Shape, Boundaries, and Internal Structure. Annual Review of Ecology and Systematics, 27(1), 597–623.

Castro‐Insua, A., Gómez‐Rodríguez, C., Svenning, J. C., & Baselga, A. (2018). A new macroecological pattern: The latitudinal gradient in species range shape. Global Ecology and Biogeography, 27(3), 357-367.

Caudullo, G., Welk, E., & San-Miguel-Ayanz, J. (2017). Chorological maps for the main European woody species. Data in Brief, 12, 662–666.

Graves, G. R. (1988). Linearity of Geographic Range and Its Possible Effect on the Population Structure of Andean Birds. The Auk, 105(1), 47–52.

Pigot, A. L., Owens, I. P. F., & Orme, C. D. L. (2010). The environmental limits to geographic range expansion in birds. Ecology Letters, 13(6), 705–715.

Rapoport, E. H. (1982). Aerography: Geographical Strategies of Species. Pergamon Press, Oxford UK.