README.md
Lines changed: 14 additions & 4 deletions
@@ -2,7 +2,9 @@
2
2
Check trees for compatibility with defined monophyletic [edit - not right terminology ] groups - "The incontestable clan test"
3
3
4
4
## Background
5
-
###What does it do?
5
+
6
+
### What does it do?
7
+
6
8
Clan_check analyses single-copy phylogenetic trees to assess if they violate clans* defined by the user.
7
9
8
10
>*see the following paper for a definition of a "clan"
@@ -18,6 +20,7 @@ The software will also return a 1 if the none of the taxa from the clan are foun
18
20
A "0" means that two or more of the taxa from that clan were found and they were not in a clan (i.e. they were not together to the exclusion of all other taxa on the tree).
19
21
20
22
### But... why?
23
+
21
24
This is designed for large-scale phylogenomic analyses where the user may have thousands of phylogenetic trees. While every effort may have been taken to ensure that the best orthologs have been chosen, hidden paralogy sometimes makes it hard to get the choice right.
22
25
23
26
In these cases, the only evidence that the gene family may be problematic is when the resulting phylogenetic tree is incorrect for known or "incontestable" groups.
@@ -100,16 +103,23 @@ The output will be named `[phylip formatted tree file].scores.txt` and will have
100
103
101
104
Where `tree number` is in the same order as the input trees, `size` is the number of taxa in the tree, and `Clan x` is the clan defined by the xth line of the clan file.
102
105
106
+
### Interpreting the results
107
+
103
108
A "1" in the table means that this tree did not violate this clan.
109
+
104
110
A "0" in the table means that this tree violated this clan.
111
+
105
112
A "?" in the table means that there were not enough taxa from the Clan in this tree to carry out the test (minimum required is 2 taxa).
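The three outcomes above can be illustrated with a minimal Python sketch. This is not Clan_check's own code: the function `clan_score`, the representation of a tree as a list of bipartitions, and the six-taxon example tree are all assumptions made for illustration. It also assumes that a clan is tested only on the taxa actually present in the tree.

```python
def clan_score(taxa, splits, clan):
    """Score one clan against one unrooted tree (illustrative sketch only).

    taxa   : all taxon names in the tree
    splits : one side of each non-trivial bipartition (internal edge)
    clan   : taxon names defining the clan
    """
    taxa = frozenset(taxa)
    present = frozenset(clan) & taxa      # clan members found in this tree
    if len(present) < 2:
        return "?"                        # fewer than 2 taxa: cannot test
    others = taxa - present
    if len(others) <= 1:
        return "1"                        # assumption: trivially not violated
    sides = {frozenset(s) for s in splits}
    # The clan holds if some edge separates its members from everything else.
    return "1" if present in sides or others in sides else "0"


# Hypothetical unrooted tree ((a,b),(c,d),(e,f)), not the README's test data.
taxa = ["a", "b", "c", "d", "e", "f"]
splits = [{"a", "b"}, {"c", "d"}, {"e", "f"}]

print(clan_score(taxa, splits, ["c", "d"]))       # 1
print(clan_score(taxa, splits, ["c", "d", "a"]))  # 0
print(clan_score(taxa, splits, ["g", "d"]))       # ?
```

In this hypothetical tree, (c d) sits on one side of an internal edge, (c d a) does not, and (g d) cannot be tested because only "d" is present.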
106
113
107
-
All three trees did not contain Clan 3, (c d a) despite all three trees containing all three taxa
114
+
So in the test data:
115
+
116
+
* All three trees did not contain Clan 3, (c d a) despite all three trees containing all three taxa
108
117
109
-
Both tree 2 and tree 3 did not contain clan 1 (c d b), despite both trees containing all three taxa
118
+
* Both tree 2 and tree 3 did not contain clan 1 (c d b), despite both trees containing all three taxa
110
119
111
-
We could not test Clan 6 (g d) against Tree 1 or Tree 3 as neither of those trees had taxon "g".
120
+
* We could not test Clan 6 (g d) against Tree 1 or Tree 3 as neither of those trees had taxon "g".
112
121
122
+
For each tree, you can express the number of Clans violated as a sum or a percentage, or treat any violation as a reason to exclude the tree from further analyses. It all depends on what question you are asking and the level of stringency you wish to apply.
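As a rough sketch of those three options, the Python snippet below summarises the clan scores for a single tree. The helper `summarise` and its input (a plain list of "1", "0" and "?" strings for one tree) are hypothetical and do not parse the real `.scores.txt` file. Leaving "?" entries out of the percentage is an assumption, not something Clan_check prescribes.

```python
def summarise(scores):
    """Summarise one tree's clan scores (illustrative sketch only)."""
    # Assumption: untestable clans ("?") are left out of the denominator.
    tested = [s for s in scores if s in ("0", "1")]
    violated = tested.count("0")
    percent = 100.0 * violated / len(tested) if tested else None
    return {
        "violated": violated,           # sum of violated clans
        "tested": len(tested),          # clans that could be tested
        "percent_violated": percent,    # None if nothing could be tested
        "exclude_strict": violated > 0, # strictest rule: any violation
    }


# Hypothetical scores for one tree across four clans.
result = summarise(["1", "0", "?", "1"])
print(result["violated"], result["tested"], result["exclude_strict"])
```

A stricter or looser filter can then be applied to `percent_violated`, depending on the level of stringency you want.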