How Do You Know If 2 Animals Belong To The Same Species?
RNA. 2007 Sep; 13(9): 1469–1472.
Distinguishing species
Tobias Müller
Department of Bioinformatics, Biocenter, University of Würzburg, Am Hubland, D-97074 Würzburg, Germany
Nicole Philippi
Department of Bioinformatics, Biocenter, University of Würzburg, Am Hubland, D-97074 Würzburg, Germany
Thomas Dandekar
Department of Bioinformatics, Biocenter, University of Würzburg, Am Hubland, D-97074 Würzburg, Germany
Jörg Schultz
Department of Bioinformatics, Biocenter, University of Würzburg, Am Hubland, D-97074 Würzburg, Germany
Matthias Wolf
Department of Bioinformatics, Biocenter, University of Würzburg, Am Hubland, D-97074 Würzburg, Germany
Received 2007 May 7; Accepted 2007 Jun 14.
Abstract
Given two organisms, how can one distinguish whether they belong to the same species or not? This might be straightforward for two divergent organisms, but can be extremely difficult and laborious for closely related ones. A molecular marker giving a clear distinction would therefore be of immense benefit. The internal transcribed spacer 2 (ITS2) has been widely used for low-level phylogenetic analyses. Case studies revealed that a compensatory base change (CBC) in the helix II or helix III ITS2 secondary structure between two organisms correlated with sexual incompatibility. We analyzed more than 1300 closely related species to test whether this correlation is generally applicable. In 93% of the cases where a CBC was found between organisms classified within the same genus, they belonged to different species. Thus, a CBC in an ITS2 sequence-structure alignment is a sufficient condition to distinguish even closely related species.
Keywords: CBC, compensatory base change; ITS2, internal transcribed spacer 2; phylogeny; species concept
INTRODUCTION
"A species is a reproductive community of populations (reproductively isolated from others), which occupies a specific niche in nature" (Mayr 1982). Even if several objections could be raised against this "definition," it is perhaps the best-known statement in modern biology. "In fact, it is an indicator hypothesis: it does not tell us what a biospecies is but how to recognize it, namely by observing reproduction or else by failing to observe the latter. Neither [common] reproduction nor [sexual] isolation are defining properties of a species but, at best, properties of organisms that may be used as symptoms of the latters' membership in a particular species. In other words two organisms do not belong to the same species because they mate and reproduce, but they only are able to do so because they belong to the same species" (Mahner and Bunge 1997). In this study we are looking for a molecular classifier that might indicate that two organisms belong to different species. We are interested in an indicator hypothesis that is easy to work with and additionally yields a certain probability that two organisms belong to distinct species. Compensatory base changes (CBCs) in the internal transcribed spacer 2 region (ITS2) of the nuclear rRNA cistron have been suggested as such a classifier. CBCs occur in a paired region of a primary RNA transcript when both nucleotides of a paired site mutate, while the pairing itself is maintained (e.g., G-C mutates to A-U). According to Coleman and Vacquier (2002), "…in all […] eukaryote groups where a broad array of species has been compared for both [rRNA] ITS2 sequence secondary structure and tested for any vestige of interspecies sexual compatibility, an interesting correlation has been found. When sufficient evolutionary distance has accumulated to produce even one CBC in the relatively conserved pairing positions of the ITS2 transcript secondary structure, taxa differing by the CBC are observed experimentally to be totally incapable of intercrossing" (see also Coleman 2003, 2007).
With the advent of the ITS2 Database (Schultz et al. 2005, 2006; Wolf et al. 2005a), currently consisting of 65,000 ITS2 sequences and their individual secondary structures, as well as of 4SALE, a program for synchronous sequence and secondary structure alignment and editing (Seibel et al. 2006), we are now able to test this hypothesis by a large-scale analysis. Here, we show that, although a lack of CBCs in ITS2 secondary structures is not an indicator of two organisms belonging to the same species, at least one CBC is a classifier with 93.11% reliability, at least for plants and fungi, indicating that two organisms belong to distinct species. Because CBCs in ITS2 secondary structures are found to correlate strongly with distinct biological species, one can use this molecular indicator for determining at least the minimal number of distinct species from a set of ITS2 secondary structures of a clade, e.g., in analyses of environmental samples or for metagenomics. Moreover, the correlation between CBCs and the species concept occurs independent of reproduction and mating affinities, i.e., it should work similarly well for sexual and asexual species.
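To make the CBC definition concrete, the following is a minimal Python sketch that counts CBCs between two aligned RNA sequences sharing one secondary structure. It is not the algorithm used by 4SALE or the CBCAnalyzer; the function names, the dot-bracket input format, and the decision to accept G-U wobble pairs as valid pairings are all assumptions made for illustration.

```python
# Hypothetical helper names; a simplified illustration, not the CBCAnalyzer method.
# Assumption: canonical Watson-Crick pairs plus G-U wobble pairs count as "paired".
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(structure):
    """Return (i, j) index pairs for matching brackets in a dot-bracket string."""
    stack, pairs = [], []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(index)
        elif symbol == ")":
            pairs.append((stack.pop(), index))
    return pairs


def count_cbcs(seq_a, seq_b, structure):
    """Count paired sites where both nucleotides differ but pairing is kept."""
    if not (len(seq_a) == len(seq_b) == len(structure)):
        raise ValueError("sequences and structure must have equal length")
    count = 0
    for i, j in paired_positions(structure):
        pair_a = (seq_a[i], seq_a[j])
        pair_b = (seq_b[i], seq_b[j])
        both_changed = pair_a[0] != pair_b[0] and pair_a[1] != pair_b[1]
        # Gap characters are never in VALID_PAIRS, so gapped sites are skipped.
        if both_changed and pair_a in VALID_PAIRS and pair_b in VALID_PAIRS:
            count += 1
    return count


structure = "(((...)))"
reference = "GGGAAACCC"
with_cbc = "GAGAAACUC"   # G-C became A-U at one paired site
with_hemi = "GGGAAACUC"  # G-C became G-U: only one side changed

print(count_cbcs(reference, with_cbc, structure))
print(count_cbcs(reference, with_hemi, structure))
```

The second comparison changes only one side of the pair (a so-called hemi-CBC) and is deliberately not counted, since the definition in the text requires both nucleotides of the paired site to mutate.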
RESULTS
The taxonomic distribution of all alignments is shown in Table 1. Because of the underrepresentation of data from animals, primarily for lack of ITS2 sequences in GenBank, conclusions can currently only be made for fungi and plants. The CBC distribution for the subselection of alignments from the same species versus alignments from different species in the same genus (nonspecies) is shown in Table 2. The data imply an overall CBC rate of 0.0525 in species versus an almost 14-fold higher rate of 0.71 in nonspecies. To calculate the probability for the occurrence of at least one CBC in a global pairwise species sequence-structure alignment, we normalized the absolute frequencies by the overall sum. With these data we could calculate the probability that two sequences belong to different species given at least one CBC in the pairwise global sequence-structure alignment as follows:
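The equation itself is not reproduced in this copy. A form consistent with the numbers in the text is Bayes' rule, assuming that the two reported rates (0.71 and 0.0525) are the fractions of nonspecies and species alignments with at least one CBC, and that the balanced sample gives both classes a prior of 0.5:

```latex
P(\text{non-species} \mid \text{CBC} > 0)
  = \frac{P(\text{CBC} > 0 \mid \text{non-species})\, P(\text{non-species})}
         {P(\text{CBC} > 0 \mid \text{non-species})\, P(\text{non-species})
          + P(\text{CBC} > 0 \mid \text{species})\, P(\text{species})}
  = \frac{0.71 \times 0.5}{0.71 \times 0.5 + 0.0525 \times 0.5}
  = \frac{0.71}{0.7625} \approx 0.9311
```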
TABLE 1.
TABLE 2.
Here, we have calculated a conditional probability of 93.11%, with an error rate of 6.89%, that two species are distinguished in current taxonomy, given that at least one CBC is observed. All further associated probabilities are shown below:
To estimate these probabilities for unbalanced samples, we count 66% species and 33% nonspecies alignments, and vice versa. The probability P(non-species|CBC > 0) changed to 87% and 97%, respectively.
If the emergence of CBCs is mainly caused by the amount of time two sequences evolved independently, the number of CBCs between two sequences should be dependent on their overall divergence. We analyzed the CBC rate in relation to the evolutionary distance by the Jukes-Cantor correction formula for all nonspecies sequence-structure alignments. Figure 1 shows a direct linear proportional relation between sequence divergence and the mean accumulated CBCs as indicated by the robust local regression curve (lowess) and the linear fit of the regression line. Its slope of 2.3 indicates that with one expected mutation per site, 2.3 CBCs can be expected between two sequences. At least one CBC can be expected between two sequences with a Jukes-Cantor distance of at least 0.42, although with a high variance. Complementarily, if on average every 2.5th site is mutated in the pairwise alignment, approximately one CBC can be expected. The behavior of CBCs on the structural level of the ITS2 as a marker for reconstructing phylogenies scales like the complete primary ITS2 sequence with its identities and mutations.
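The effect of sample balance can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' R code: the function names are hypothetical, and the two rates from the text (0.71 and 0.0525) are read as the fractions of nonspecies and species alignments that show at least one CBC.

```python
# Rates quoted in the text, interpreted (assumption) as P(CBC > 0 | class).
RATE_NONSPECIES = 0.71
RATE_SPECIES = 0.0525


def posterior_nonspecies_given_cbc(prior_nonspecies,
                                   rate_nonspecies=RATE_NONSPECIES,
                                   rate_species=RATE_SPECIES):
    """Bayes' rule: P(non-species | CBC > 0) for a given class prior."""
    prior_species = 1.0 - prior_nonspecies
    numerator = rate_nonspecies * prior_nonspecies
    return numerator / (numerator + rate_species * prior_species)


def posterior_species_given_no_cbc(prior_nonspecies,
                                   rate_nonspecies=RATE_NONSPECIES,
                                   rate_species=RATE_SPECIES):
    """Bayes' rule: P(species | CBC = 0) for a given class prior."""
    prior_species = 1.0 - prior_nonspecies
    numerator = (1.0 - rate_species) * prior_species
    return numerator / (numerator + (1.0 - rate_nonspecies) * prior_nonspecies)


# Balanced sample (400 species and 400 nonspecies alignments).
print(round(100 * posterior_nonspecies_given_cbc(0.5), 2))
print(round(100 * posterior_species_given_no_cbc(0.5), 2))

# Unbalanced priors: one third nonspecies, then two thirds nonspecies.
print(round(100 * posterior_nonspecies_given_cbc(1 / 3), 1))
print(round(100 * posterior_nonspecies_given_cbc(2 / 3), 1))
```

With balanced priors this reproduces the 93.11% and 76.57% figures quoted in the text. With one third nonspecies it gives about 87%; with two thirds nonspecies it gives a value a little above 96%, slightly lower than the quoted 97%, which may reflect different exact proportions or rounding in the original calculation.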
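The relation between divergence and CBC count can be illustrated with a short Python sketch. It uses the standard Jukes-Cantor correction and the slope of 2.3 quoted in the text; the zero intercept (a purely proportional fit) and the function names are assumptions, since the fitted intercept is not given here.

```python
import math

CBC_SLOPE = 2.3  # slope of the regression line as reported in the text


def jukes_cantor_distance(p_diff):
    """Jukes-Cantor corrected distance from the observed fraction of differing sites."""
    if not 0.0 <= p_diff < 0.75:
        raise ValueError("p_diff must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p_diff)


def expected_cbcs(distance, slope=CBC_SLOPE):
    """Mean number of CBCs expected at a given distance (assumes zero intercept)."""
    return slope * distance


# A Jukes-Cantor distance of 0.42 corresponds to roughly one expected CBC.
print(round(expected_cbcs(0.42), 3))

# Observed divergence of 30% of sites, corrected and converted to expected CBCs.
distance = jukes_cantor_distance(0.30)
print(round(distance, 4), round(expected_cbcs(distance), 2))
```

These are expected mean values only; the text notes a high variance around the fit, so the result should not be read as a prediction for an individual sequence pair.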
DISCUSSION
In a large-scale analysis, we tested the hypothesis that taxa differing by at least one CBC represent different species. In 93.11% of the cases where two sequences taken from the same genus show a CBC, they belong to different species. When interpreting this percentage, it has to be taken into account that classifying species of the same genus is the most challenging scenario, as they are highly related to each other. An even better classification can be expected when comparing species of different orders or even families. This result is also influenced by the distribution of species belonging to one genus or to different genera. Depending on the sampling distribution, we therefore conclude that a CBC is indeed (on a 93.11% level) a sufficient marker for the identification of species. For the exceptional cases, the CBCs are equally distributed over all four helices. A list of all taxonomic vagaries concerning NCBI's species designations is now open for discussion (for Supplemental material, see http://its2.bioapps.biozentrum.uni-wuerzburg.de/TaxonomicVagaries.pdf), i.e., it has to be tested that in these cases the NCBI taxonomy fits mating affinities of the respective organisms.
In contrast, the absence of a CBC between two taxa predicted that in only 76.57% of the cases they belong to the same species. There was a significant correlation between the average number of CBCs and the general divergence, measured as the Jukes-Cantor distance, between two ITS2 sequences. There is no causal relationship between the existence of a CBC and speciation. The CBC is rather a measure of elapsed evolutionary time, indicating that sufficient time has passed to make a speciation event very likely; it is not a necessary criterion.
The expected number of CBCs depends on the degree of divergence, the sequence length (complete or partial), and on the CBC rate per site. We suggest using the whole ITS2 sequence as a basis for species classification. This raises the question as to whether the divergence within a phylogenetic marker like the ITS2 could be used directly as a marker for speciation. We found a large variance of the distance between species of the same genus in this marker. Therefore, any cutoff for classification will be inherently error prone. In contrast, the presence of a CBC is a handy and binary classifier (yes or no) with a known error rate (6.89%) that could be routinely used by tools like 4SALE (Seibel et al. 2006), the CBCAnalyzer (Wolf et al. 2005b), or by the ITS2 Database (Schultz et al. 2006). Due to concerted evolution in rRNA repeats, a CBC may thus primarily be a molecular indicator for no genetic exchange between two populations. However, how CBCs evolve within the repeats remains unclear. It is a limitation of our study that currently sufficient data exist only for plants and fungi. It would be interesting to collect ITS2 data for animals as well and to analyze whether our results can be transferred to this group. If so, we propose using CBCs of ITS2 as a general marker for the identification and classification of eukaryotic species.
MATERIAL AND METHODS
ITS2 sequences and their predicted structures were retrieved from the ITS2 database v1.0 (Schultz et al. 2005, 2006; Wolf et al. 2005a) together with their taxonomic classification via the SOAP interface. Taxon-specific global multiple sequence-structure alignments were generated with 4SALE, a tool for synchronous sequence and secondary structure alignment and editing (Seibel et al. 2006), using an ITS2-specific scoring matrix (Wolf et al. 2005a). Alignments with at least three nonidentical sequences were subdivided into species and nonspecies alignments. The latter contained sequences from different but closely related species of the same genus. In this study, species designations are clearly not based on mating affinities, or any other species concept, but on NCBI's taxonomy as implemented in the ITS2 Database. In total, 1373 species and 400 nonspecies alignments were generated, manually checked, and curated. Four hundred species alignments were randomly chosen as a database for a balanced statistical analysis of 800 alignments. The CBCAnalyzer (Wolf et al. 2005b) as implemented in 4SALE (Seibel et al. 2006) was used to count CBCs. All statistical analyses were performed using the statistical environment R (Ihaka and Gentleman 1996). The robust linear regression was performed with the rlm function as implemented in the MASS package (Venables and Ripley 2002).
ACKNOWLEDGMENTS
We thank the DFG (German Research Foundation) for financial support (Mu-2831/1-1) and Philipp Seibel (Würzburg, Germany) for his help with 4SALE.
REFERENCES
- Coleman, A.W. ITS2 is a double-edged tool for eukaryote evolutionary comparisons. Trends Genet. 2003;19:370–375.
- Coleman, A.W. Pan-eukaryote ITS2 homologies revealed by RNA secondary structure. Nucleic Acids Res. 2007;35:3322–3329.
- Coleman, A.W., Vacquier, V.D. Exploring the phylogenetic utility of ITS sequences for animals: A test case for abalone (Haliotis). J. Mol. Evol. 2002;54:246–257.
- Ihaka, R., Gentleman, R. R: A language for data analysis and graphics. J. Comput. Graph. Statist. 1996;5:299–314.
- Mahner, M., Bunge, M. Foundations of biophilosophy. Springer; Berlin: 1997.
- Mayr, E. The growth of biological thought. Harvard University Press; Cambridge, MA: 1982.
- Schultz, J., Maisel, S., Gerlach, D., Muller, T., Wolf, M. A common core of secondary structure of the internal transcribed spacer 2 (ITS2) throughout the Eukaryota. RNA. 2005;11:361–364.
- Schultz, J., Muller, T., Achtziger, M., Seibel, P.N., Dandekar, T., Wolf, M. The internal transcribed spacer 2 database: A web server for (not only) low level phylogenetic analyses. Nucleic Acids Res. 2006;34:W704–W707. doi: 10.1093/nar/gki129.
- Seibel, P.N., Muller, T., Dandekar, T., Schultz, J., Wolf, M. 4SALE: a tool for synchronous RNA sequence and secondary structure alignment and editing. BMC Bioinformatics. 2006;7:498.
- Venables, W.N., Ripley, B.D. Modern applied statistics with S. Springer; Berlin: 2002.
- Wolf, M., Achtziger, M., Schultz, J., Dandekar, T., Muller, T. Homology modeling revealed more than 20,000 rRNA internal transcribed spacer 2 (ITS2) secondary structures. RNA. 2005a;11:1616–1623.
- Wolf, M., Friedrich, J., Dandekar, T., Muller, T. CBCAnalyzer: Inferring phylogenies based on compensatory base changes in RNA secondary structures. In Silico Biol. 2005b;5:291–294.
Articles from RNA are provided here courtesy of The RNA Society
Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1950759/
Posted by: takahashipleataring.blogspot.com