All Species Living Tree Project · Kylepedia

All Species Living Tree Project

Details

Title: Chapter 3 - The All-Species Living Tree Project

Citation: Yarza, Pablo, and Raul Munoz. “The all-species living tree project.” Methods in Microbiology 41 (2014): 45-59.

Overview

This paper is the third chapter of a volume on prokaryote systematics in the Methods in Microbiology book series. It summarizes how the database was created and curated, and describes some of the resources the project provides on top of it. The team describes the project as follows:

The aim of the project is to reconstruct separate and curated 16S and 23S rRNA datasets and trees spanning all sequenced type strains of the hitherto classified species of Archaea and Bacteria.

The project is guided by the editors of the journal Systematic and Applied Microbiology. The ARB/SILVA and List of Prokaryotic names with Standing in Nomenclature (LPSN) teams take care of the technical details.

Technical Details

Data Collection

Background

Before describing the project itself, the paper reviews the existing database resources for microbiology: sequence databases (INSDC), 16S-specific databases, nomenclature resources, and type strain information. The type strain part is noteworthy: type strain repositories are expected to hold the teams depositing a new strain to higher standards for the information they provide, although the authors note that problems still exist.

They also describe the main sources of error for the taxonomic data entered into biological databases: (1) different repositories containing different names for the same type strains, and (2) the large number of strains (> 1 million) and repositories (~600).
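The first source of error (the same type strain carrying different names in different repositories) can be illustrated with a minimal sketch. The repository names, strain identifiers, and species names below are made-up placeholders, and `find_name_conflicts` is a hypothetical helper, not something described in the paper:

```python
from collections import defaultdict

# Hypothetical records: (repository, strain identifier, species name recorded there).
# All values are placeholders for illustration only.
records = [
    ("repo_a", "STRAIN-0001", "Genus alpha"),
    ("repo_b", "STRAIN-0001", "Genus beta"),
    ("repo_a", "STRAIN-0002", "Genus gamma"),
    ("repo_b", "STRAIN-0002", "Genus gamma"),
]


def find_name_conflicts(records):
    """Return {strain_id: sorted names} for strains recorded under more than one name."""
    names_by_strain = defaultdict(set)
    for _repo, strain_id, name in records:
        # Strip whitespace so trivial formatting differences are not counted as conflicts.
        names_by_strain[strain_id].add(name.strip())
    return {
        strain_id: sorted(names)
        for strain_id, names in names_by_strain.items()
        if len(names) > 1
    }


conflicts = find_name_conflicts(records)
```

With roughly 600 repositories and over a million strains, a check like this would only flag candidates; deciding which name is correct is presumably where manual curation comes in.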

Taxonomy

Filtering

Notes:

Many of the curation steps are performed manually by team members, but the paper doesn’t describe what these manual steps actually are. It would be worth checking whether they are documented somewhere.

The SILVA databases are separate from the LTP tree. The LTP tree is limited to type strains, while SILVA is more inclusive.
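The difference in scope can be pictured as a filter over a more inclusive record set. This is only a sketch under stated assumptions: `SequenceRecord`, its fields, and the sample entries are hypothetical and do not reflect the actual SILVA or LTP schemas or the project's real pipeline.

```python
from dataclasses import dataclass


@dataclass
class SequenceRecord:
    # Hypothetical fields, chosen for illustration only.
    accession: str
    organism: str
    is_type_strain: bool


def type_strains_only(records):
    """Keep only records flagged as type strains (the LTP-style subset)."""
    return [r for r in records if r.is_type_strain]


# Placeholder data standing in for a broader, SILVA-like collection.
all_records = [
    SequenceRecord("ACC0001", "Genus alpha", True),
    SequenceRecord("ACC0002", "Genus alpha", False),
    SequenceRecord("ACC0003", "uncultured bacterium", False),
]

ltp_like = type_strains_only(all_records)
```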

1: TOBA 2: LPSN