Randy Posada

Advisors: Luna L. Sanchez-Reyes, Emily Jane McTavish

McTavish Lab

School of Natural Sciences

University of California, Merced

Figure: bat image (see Attributions for link).


My Report

Introduction

In this report I use the Open Tree of Life alongside Physcraper to create and access an updated phylogenetic tree of all bats, and I explore RStudio, using the ‘rotl’ package to interact with the Open Tree of Life services and the Open Tree Taxonomy.

There are over 1000 different species of bats. These extraordinary flying mammals use their hands to fly, which gives their order its name, Chiroptera, from the Greek for ‘hand wing’. Their fingers are connected to one another by a thin layer of skin, which allows these nocturnal mammals to take flight. Chiroptera are the only mammals capable of sustained flight.

The Open Tree of Life constructs a comprehensive, dynamic, and digitally available tree of all life by synthesizing published phylogenetic trees with taxonomic data. Its aim is to assemble a comprehensive phylogenetic tree for all named species. We can search the Open Tree Taxonomy for specific names or ids. To observe and interact with the synthetic tree we can use OneZoom. To view Chiroptera specifically in the tree, you can use the following link: One Zoom - Chiroptera.

We can use tools from the Open Tree of Life alongside RStudio to extract, construct, and update phylogenetic trees. The Open Tree Taxonomy (OTT) synthesizes taxonomic information and assigns each taxon a unique identifier known as an OTT id. To use OTT ids and interact with the Open Tree of Life services we need to install and load the rotl package. This package lets us extract phylogenetic trees, information about the studies used to build the synthetic tree, and OTT ids from within RStudio.


Interacting With The Open Tree Of Life

To get the OTT ids for a set of taxa we use the ‘rotl’ package. Any function from rotl that starts with ‘tnrs_’ interacts with the OTT. The function tnrs_match_names handles synonyms and misspellings and links each scientific name to its corresponding unique OTT id. We assign the result of matching our taxon name to the object resolved_names.

It is useful to know the class of an object, because the class determines which functions can be applied to it and how its contents can be manipulated. A class defines a data structure shared by all the objects that belong to it, which helps with ease of access, organization, and clarity.

The resolved_names object lets us view the search string, the unique name, and the ott_id of each match in the Open Tree of Life. Calling class() on the object returned by tnrs_match_names gives two classes: “match_names” and “data.frame”.
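The matching step and the class check described above can be sketched as follows. This is a minimal sketch, assuming the rotl package is installed and an internet connection is available; the search string "chiroptera" is the taxon used throughout this report.

```r
# Match the name "chiroptera" against the Open Tree Taxonomy
# (this sends a request to the Open Tree of Life services).
resolved_names <- rotl::tnrs_match_names(names = "chiroptera")

# Print the table of matches, one row per search string.
resolved_names

# Check which classes the object belongs to.
class(resolved_names)
```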

In the following chunk, we subset the object to obtain certain columns and, if needed, adjust the index to extract a specific element from a row. Since there is no dedicated function for extracting the values of a row from a match_names object, we use resolved_names and indexing. Subsetting by row gives us the values of all columns for that row.

Our goal is to obtain the ott_id for bats (Chiroptera). To extract a single column we subset by column name, for example ‘unique_name’, in the second position of the index of ‘resolved_names’. This way, we can extract one specific value from the column we want using the column name.
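The indexing described above works on any data frame, so it can be tried offline first. The sketch below uses a small stand-in table called toy_names (a hypothetical name) with the same column names as the matched-names table; its ott_id value is a made-up placeholder, not the real OTT id of Chiroptera.

```r
# Stand-in for the table returned by rotl::tnrs_match_names().
# The ott_id value is a made-up placeholder, not a real OTT id.
toy_names <- data.frame(
  search_string = "chiroptera",
  unique_name = "Chiroptera",
  ott_id = 999999L,
  stringsAsFactors = FALSE
)

# All columns of the first row.
toy_names[1, ]

# One column, selected by name in the second position of the index.
toy_names[, "unique_name"]

# The first value of the ott_id column, using the $ operator.
toy_names$ott_id[1]
```

The same three expressions should apply to resolved_names, since it is also a data frame.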

An OTT id is a unique numerical identifier assigned to a taxon in the Open Tree Taxonomy. Every taxon has a specific OTT id. These OTT ids allow us to interact with the Open Tree of Life.

If we want the unique name of the taxon as it is used in the synthetic Open Tree of Life, we can extract it from the unique_name column of our previous output.

The following code retrieves summary information about the current synthetic Open Tree:

The output summarizes the current synthetic Open Tree of Life (OTOL), as retrieved with the rotl package.
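A minimal sketch of that request, assuming rotl is installed and the Open Tree of Life services are reachable. The object name synthetic_tree_info is chosen here for illustration.

```r
# Ask the Open Tree of Life services for a summary of the
# current synthetic tree (needs an internet connection).
synthetic_tree_info <- rotl::tol_about()

# Print the summary.
synthetic_tree_info
```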

We assign the OTT id of our matched name ‘chiroptera’ to the object chiroptera_ott_id, so that the ott_id we wanted for Chiroptera is available for the steps that follow.
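One way to make this assignment is sketched below. It assumes rotl is installed and online, and it repeats the name matching so that the snippet runs on its own.

```r
# Match the taxon name (repeated here so the sketch is self-contained).
resolved_names <- rotl::tnrs_match_names(names = "chiroptera")

# Keep only the OTT id of the first (and only) match.
chiroptera_ott_id <- resolved_names$ott_id[1]
chiroptera_ott_id
```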

The following code will help us get the Chiroptera subtree from the synthetic tree:

chiroptera_subtree <- rotl::tol_subtree(ott_id = chiroptera_ott_id)

ape::Ntip(chiroptera_subtree)
#> [1] 1820

ape::plot.phylo(chiroptera_subtree, cex = 0.1, type = "fan")

It is important to check that our taxon is monophyletic in the synthetic tree. Non-monophyletic taxa are flagged as ‘invalid’ or ‘broken’, and when a taxon is ‘broken’ its ott_id is not assigned to any node in the synthetic tree. The following code tells us whether our taxon is in the tree:

rotl::is_in_tree(chiroptera_ott_id)
#> [1] TRUE

The output ‘TRUE’ confirms that our taxon is in the synthetic tree and is therefore monophyletic.

OTT ids and node ids allow us to interact with the synthetic OTOL.

One way to obtain branch lengths proportional to time is with the datelife package. An alternative way to get branch lengths on a tree is to arbitrarily generate them with ape::compute.brlen(). In the following section we will use the datelife package.
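For comparison, the ape alternative mentioned above can be sketched on a toy tree. The topology and tip labels (t1 to t5) are arbitrary and do not represent bats; the resulting branch lengths are not time estimates.

```r
# A five-tip toy topology without branch lengths (arbitrary labels).
toy_tree <- ape::read.tree(text = "((t1,t2),(t3,(t4,t5)));")
is.null(toy_tree$edge.length)

# Generate arbitrary branch lengths with Grafen's method.
toy_tree_brlen <- ape::compute.brlen(toy_tree, method = "Grafen")
toy_tree_brlen$edge.length
```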

Example 1: Chiroptera Families

First we will get all families from Chiroptera and their OTT ids.

chiroptera_families <- datelife::get_ott_children(ott_ids = chiroptera_ott_id, ott_rank = "family")

We will use Chiroptera families’ OTT ids to retrieve a tree from the Open Tree Of Life.

First, we must figure out how to extract the OTT ids as a vector.

Now we can use the OTT ids to extract a subtree from the Open Tree of Life.

Let’s look at the structure of the Chiroptera families subtree.
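The three steps above might look like the sketch below. It assumes that get_ott_children returns a list with one data frame per input taxon, named after the taxon (here Chiroptera) and holding an ott_id column; checking this with str() first is a good idea. Note that tol_induced_subtree can stop with an error if any of the ids is missing from the synthetic tree.

```r
# Prerequisites, repeated so the sketch runs on its own (needs internet).
chiroptera_ott_id <- rotl::tnrs_match_names(names = "chiroptera")$ott_id[1]
chiroptera_families <- datelife::get_ott_children(ott_ids = chiroptera_ott_id,
                                                  ott_rank = "family")

# Inspect the object to see where the OTT ids are stored.
str(chiroptera_families)

# Assumption: the ids live in the ott_id column of the "Chiroptera" element.
chiroptera_families_ott_ids <- chiroptera_families$Chiroptera$ott_id

# Extract the subtree induced by the family OTT ids.
chiroptera_families_subtree <- rotl::tol_induced_subtree(
  ott_ids = chiroptera_families_ott_ids
)

# Look at the structure of the resulting tree.
str(chiroptera_families_subtree)
```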

Plotting the tree of Chiroptera families:

ape::plot.phylo(chiroptera_families_subtree, cex = 0.8)

Example 2: Five Chiroptera Taxa That I Like

To get an even smaller bat tree with 5 taxa, first get the scientific names of families, genera, or species of bat. Then run rotl::tnrs_match_names to get the OTT ids.

Here I chose the following five taxa: “Megadermatidae”, “Mormoopidae”, “Vespertilionidae”, “Mystacinidae”, and “Furipteridae”.

my_ott_ids <- rotl::tnrs_match_names(c("Megadermatidae","Mormoopidae","Vespertilionidae","Mystacinidae","Furipteridae"))

We will need to extract the OTT ids only, because now we have the whole table.

Retrieving a subtree from the Open Tree Of Life, with taxon names as tip labels.
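A sketch of these two steps, assuming rotl is installed and online. The name my_ott_id_vector is introduced here for illustration; my_tree is the object plotted later in this example.

```r
# Match the five names (repeated so the sketch runs on its own).
my_ott_ids <- rotl::tnrs_match_names(c("Megadermatidae", "Mormoopidae",
                                       "Vespertilionidae", "Mystacinidae",
                                       "Furipteridae"))

# Keep only the column with the OTT ids.
my_ott_id_vector <- my_ott_ids$ott_id

# Get the induced subtree, using taxon names as tip labels.
my_tree <- rotl::tol_induced_subtree(ott_ids = my_ott_id_vector,
                                     label_format = "name")

# Print a summary of the tree.
my_tree
```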

This code chunk provides us with the info of our tree.

To plot the above tree, the ape function “plot.phylo” is used.

ape::plot.phylo(my_tree, cex = 1)

TASK 5: Describe how you get help to use a function in R.

To get the dates available for the five-taxon tree we will use the function datelife::get_datelife_result. But how do we use that function? Let’s get some help with ?
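The usual ways of asking R for help are sketched below, assuming the datelife package is installed.

```r
# Open the help page of a function from an installed package.
?datelife::get_datelife_result

# The same request written with help().
help("get_datelife_result", package = "datelife")

# Show the arguments of the function without opening the help page.
args(datelife::get_datelife_result)
```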

Now we can run the function with some confidence.

YOUR_DATELIFE_RESULT_OBJECT <- datelife::get_datelife_result(input = my_tree, get_spp_from_taxon = TRUE)

TASK 7: Take the output from datelife::get_datelife_result and run the following code chunk.

chiroptera_phylo_all <- datelife::summarize_datelife_result(YOUR_DATELIFE_RESULT_OBJECT, summary_format = "phylo_all")
names(chiroptera_phylo_all)
#> [1] "Shi, Jeff J., Daniel L. Rabosky. 2015. Speciation dynamics during the global radiation of extant bats. Evolution 69 (6): 1528-1545"                                                                                                                            
#> [2] "Bininda-Emonds, Olaf R. P., Marcel Cardillo, Kate E. Jones, Ross D. E. MacPhee, Robin M. D. Beck, Richard Grenyer, Samantha A. Price, Rutger A. Vos, John L. Gittleman, Andy Purvis. 2007. The delayed rise of present-day mammals. Nature 446 (7135): 507-512"
#> [3] "Bininda-Emonds, Olaf R. P., Marcel Cardillo, Kate E. Jones, Ross D. E. MacPhee, Robin M. D. Beck, Richard Grenyer, Samantha A. Price, Rutger A. Vos, John L. Gittleman, Andy Purvis. 2007. The delayed rise of present-day mammals. Nature 446 (7135): 507-512"
#> [4] "Bininda-Emonds, Olaf R. P., Marcel Cardillo, Kate E. Jones, Ross D. E. MacPhee, Robin M. D. Beck, Richard Grenyer, Samantha A. Price, Rutger A. Vos, John L. Gittleman, Andy Purvis. 2007. The delayed rise of present-day mammals. Nature 446 (7135): 507-512"
#> [5] "Hedges, S. Blair, Julie Marin, Michael Suleski, Madeline Paymer, Sudhir Kumar. 2015. Tree of life reveals clock-like speciation and diversification. Molecular Biology and Evolution 32 (4): 835-845"                                                          
#> [6] "Lack J.B., & Van den bussche R.A. 2010. Identifying the Confounding Factors in Resolving Phylogenetic Relationships in Vespertilionidae. Journal of Mammalogy, ."                                                                                              
#> [7] "Dumont E.R., Davalos L.M., Goldberg A., Santana S.E., Rex K., & Voigt C.C. 2012. Morphological innovation, diversification and invasion of a new adaptive zone. Proceedings of the Royal Society B: Biological Sciences, 279: 1797-1805."

plot_phylo_all plots the output of summarize_datelife_result. The output corresponds to all the chronograms that contain at least two of the taxa given as input to get_datelife_result.

datelife::plot_phylo_all(trees = chiroptera_phylo_all)

Example 3: All the Chiroptera!

Let’s get a full Chiroptera subtree from the Open Tree of Life.

When you run the get_datelife_result function it will get node ages from published trees that contain at least two taxa in your search:
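A sketch of both steps, assuming rotl and datelife are installed and online. With around 1800 tips the search can take a while.

```r
# Get the full Chiroptera subtree from the synthetic tree
# (prerequisites repeated so the sketch runs on its own).
chiroptera_ott_id <- rotl::tnrs_match_names(names = "chiroptera")$ott_id[1]
chiroptera_subtree <- rotl::tol_subtree(ott_id = chiroptera_ott_id)

# Search published chronograms for node ages of the taxa in the tree.
chiroptera_dr <- datelife::get_datelife_result(input = chiroptera_subtree)
```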

The datelife result object is not a tree but a list of tables with the node ages for each pair of taxa from your search. For our roughly 1800 species of Chiroptera, we got node ages from the following trees:

names(chiroptera_dr)
#> [1] "Shi, Jeff J., Daniel L. Rabosky. 2015. Speciation dynamics during the global radiation of extant bats. Evolution 69 (6): 1528-1545"                                                                                                                            
#> [2] "Bininda-Emonds, Olaf R. P., Marcel Cardillo, Kate E. Jones, Ross D. E. MacPhee, Robin M. D. Beck, Richard Grenyer, Samantha A. Price, Rutger A. Vos, John L. Gittleman, Andy Purvis. 2007. The delayed rise of present-day mammals. Nature 446 (7135): 507-512"
#> [3] "Bininda-Emonds, Olaf R. P., Marcel Cardillo, Kate E. Jones, Ross D. E. MacPhee, Robin M. D. Beck, Richard Grenyer, Samantha A. Price, Rutger A. Vos, John L. Gittleman, Andy Purvis. 2007. The delayed rise of present-day mammals. Nature 446 (7135): 507-512"
#> [4] "Bininda-Emonds, Olaf R. P., Marcel Cardillo, Kate E. Jones, Ross D. E. MacPhee, Robin M. D. Beck, Richard Grenyer, Samantha A. Price, Rutger A. Vos, John L. Gittleman, Andy Purvis. 2007. The delayed rise of present-day mammals. Nature 446 (7135): 507-512"
#> [5] "Hedges, S. Blair, Julie Marin, Michael Suleski, Madeline Paymer, Sudhir Kumar. 2015. Tree of life reveals clock-like speciation and diversification. Molecular Biology and Evolution 32 (4): 835-845"                                                          
#> [6] "Lack J.B., & Van den bussche R.A. 2010. Identifying the Confounding Factors in Resolving Phylogenetic Relationships in Vespertilionidae. Journal of Mammalogy, ."                                                                                              
#> [7] "Dumont E.R., Davalos L.M., Goldberg A., Santana S.E., Rex K., & Voigt C.C. 2012. Morphological innovation, diversification and invasion of a new adaptive zone. Proceedings of the Royal Society B: Biological Sciences, 279: 1797-1805."

We have 7 studies in OpenTree with ages for the Chiroptera. The code above provided all references of the seven studies as an output.

To get the actual chronograms we need to run another function:

chiroptera_phylo_all <-  datelife::summarize_datelife_result(chiroptera_dr, summary_format = "phylo_all")
# We write this object to a file because it takes a long time to compute
save(chiroptera_phylo_all, file="data/chiroptera_phylo_all.RData")

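Now we have to load it into the R workspace so that it is available for the next part. Note that the path is relative to the current working directory, so it may differ from the path used when saving the file.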

load("../data/chiroptera_phylo_all.RData")

The following function allows us to plot the trees with their ages.

datelife::plot_phylo_all(trees = chiroptera_phylo_all, write="pdf")

However, they are quite large, so we will not show them here for now.

Summarizing node ages is slow so we will save the output of datelife::summarize_datelife_result in the data folder. This function summarizes the node information from all the chronograms in chiroptera_phylo_all.
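A sketch of this step is shown below. It assumes that the summary is requested with summary_format = "phylo_median" and stored in chiroptera_phylo_median, the object plotted next; the file name is an assumption that follows the naming used earlier in this example. If chiroptera_dr is not in the workspace, it is rebuilt first, which is slow.

```r
# Rebuild the datelife result only if it is not already available.
if (!exists("chiroptera_dr")) {
  chiroptera_ott_id <- rotl::tnrs_match_names(names = "chiroptera")$ott_id[1]
  chiroptera_subtree <- rotl::tol_subtree(ott_id = chiroptera_ott_id)
  chiroptera_dr <- datelife::get_datelife_result(input = chiroptera_subtree)
}

# Summarize the node ages of all chronograms into a single median chronogram.
chiroptera_phylo_median <- datelife::summarize_datelife_result(
  chiroptera_dr,
  summary_format = "phylo_median"
)

# Save the result so it does not have to be computed again
# (hypothetical file name inside the data folder).
dir.create("data", showWarnings = FALSE)
save(chiroptera_phylo_median, file = "data/chiroptera_phylo_median.RData")
```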

To plot the chronogram we will use ape::plot.phylo:

ape::plot.phylo(chiroptera_phylo_median, cex = 1.2)
# Add the time axis:
ape::axisPhylo()
# And a little hack to add the axis name:
graphics::mtext("Time (myrs)", side = 1, line = 2, at = max(get("last_plot.phylo", envir = ape::.PlotPhyloEnv)$xx) * 0.5)

Updating a Chiroptera chronogram with Python

The Physcraper software allows us to update a published phylogeny with new DNA sequences from GenBank.

This can be your task for the fall if you are interested.

Reproducibility

Do you want to reproduce this report yourself?

The following piece of code will render this report as a pdf document:
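A minimal sketch using the rmarkdown package. The file name report.Rmd is hypothetical and should be replaced with the actual source file of this report; rendering to pdf also needs a LaTeX installation.

```r
# Hypothetical name of the R Markdown source file of this report.
report_file <- "report.Rmd"

# Render the report as a pdf document if the source file is present.
if (file.exists(report_file)) {
  rmarkdown::render(input = report_file, output_format = "pdf_document")
}
```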

Attributions

bat image Open Tree of Life add link to OTL and references