Summarizing Taxonomic Annotations by Rank
Last published: 2022-11-04 13:49:24
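Two-column OTU-to-taxonomy format: drop the confidence values from the sintax output and keep only the taxonomic annotation, rename the leading `d` (domain) prefix to `k`, replace `:` with `__` and `,` with `;`, and delete quotation marks.
The `otus.sintax` file used here is the one obtained after filtering out plastids and non-bacterial sequences.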
cut -f 1,4 results/otus.sintax \
|sed 's/\td/\tk/;s/:/__/g;s/,/;/g;s/"//g' \
> results/taxonomy2.txt
taxonomy2.txt (the file has no header line; the column names below are added for readability)
OTU ID | Taxonomy |
---|---|
ASV_1 | k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Streptosporangiales;f__Thermomonosporaceae;g__Actinocorallia |
ASV_2 | k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Burkholderiales;f__Comamonadaceae;g__Pelomonas |
ASV_3 | k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Burkholderiales;f__Comamonadaceae;g__Rhizobacter |
ASV_4 | k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Burkholderiales;f__Comamonadaceae;g__Rhizobacter |
ASV_8 | k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Streptomycetales;f__Streptomycetaceae;g__Streptomyces |
ASV_6 | k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Burkholderiales;f__Comamonadaceae;g__Piscinibacter |
ASV_7 | k__Bacteria;p__Bacteroidetes;c__Flavobacteriia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium |
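To see what the `cut` and `sed` steps do to a single record, the sketch below pushes one made-up sintax line through the same commands. The sample record and the helper name `sintax_to_tax2` are illustrative assumptions, not part of the original workflow. It assumes GNU sed (for `\t`) and the four-column tabbed sintax layout: ID, annotation with confidence, strand, annotation above the cutoff.

```shell
# Hypothetical helper: apply the cut/sed steps from the text to stdin.
sintax_to_tax2() {
  cut -f 1,4 \
    | sed 's/\td/\tk/;s/:/__/g;s/,/;/g;s/"//g'
}

# Made-up sintax record (tab-separated): ID, annotation with confidence,
# strand, annotation above the cutoff.
sample=$(printf 'ASV_1\td:Bacteria(1.00),p:"Actinobacteria"(0.98)\t+\td:Bacteria,p:"Actinobacteria"')

# Keep columns 1 and 4, then rewrite prefixes and separators.
result=$(printf '%s\n' "$sample" | sintax_to_tax2)
printf '%s\n' "$result"
```

Only columns 1 and 4 survive the `cut`, so the confidence values in column 2 never reach `sed`.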
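Eight-column OTU-to-taxonomy format: note that the annotations are ragged, since not every OTU is annotated down to the same rank.
When building the taxonomy table, ranks left blank for an OTU/ASV are filled with Unassigned.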
awk 'BEGIN{OFS=FS="\t"}{delete a; a["k"]="Unassigned";a["p"]="Unassigned";a["c"]="Unassigned";a["o"]="Unassigned";a["f"]="Unassigned";a["g"]="Unassigned";a["s"]="Unassigned";\
split($2,x,";");for(i in x){split(x[i],b,"__");a[b[1]]=b[2];} \
print $1,a["k"],a["p"],a["c"],a["o"],a["f"],a["g"],a["s"];}' \
results/taxonomy2.txt > results/matrix/otus.tax
sed 's/;/\t/g;s/.__//g;' results/matrix/otus.tax|cut -f 1-8 | \
sed '1 s/^/OTUID\tKingdom\tPhylum\tClass\tOrder\tFamily\tGenus\tSpecies\n/' \
> results/taxonomy.txt
taxonomy.txt
OTUID | Kingdom | Phylum | Class | Order | Family | Genus | Species |
---|---|---|---|---|---|---|---|
ASV_1 | Bacteria | Actinobacteria | Actinobacteria | Streptosporangiales | Thermomonosporaceae | Actinocorallia | Unassigned |
ASV_2 | Bacteria | Proteobacteria | Betaproteobacteria | Burkholderiales | Comamonadaceae | Pelomonas | Unassigned |
ASV_3 | Bacteria | Proteobacteria | Betaproteobacteria | Burkholderiales | Comamonadaceae | Rhizobacter | Unassigned |
ASV_4 | Bacteria | Proteobacteria | Betaproteobacteria | Burkholderiales | Comamonadaceae | Rhizobacter | Unassigned |
ASV_8 | Bacteria | Actinobacteria | Actinobacteria | Streptomycetales | Streptomycetaceae | Streptomyces | Unassigned |
ASV_6 | Bacteria | Proteobacteria | Betaproteobacteria | Burkholderiales | Comamonadaceae | Piscinibacter | Unassigned |
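The fill step can be tried on a single ragged record. The sketch below is a loop-based variant of the same idea rather than the exact command above; the helper name `fill_ranks` and the sample record are assumptions made for illustration.

```shell
# Hypothetical helper: expand "ID<TAB>k__..;p__..;..." into 8 tab-separated
# columns, writing Unassigned for every rank missing from the annotation.
fill_ranks() {
  awk 'BEGIN{OFS=FS="\t"; n=split("k p c o f g s", ranks, " ")}
  {
    for (i = 1; i <= n; i++) a[ranks[i]] = "Unassigned"   # reset defaults per row
    m = split($2, x, ";")
    for (j = 1; j <= m; j++) { split(x[j], b, "__"); a[b[1]] = b[2] }
    printf "%s", $1
    for (i = 1; i <= n; i++) printf "%s%s", OFS, a[ranks[i]]
    printf "\n"
  }'
}

# Made-up record annotated only down to class.
filled=$(printf 'ASV_9\tk__Bacteria;p__Firmicutes;c__Bacilli\n' | fill_ranks)
printf '%s\n' "$filled"
```

Resetting all seven ranks at the start of each row plays the role of `delete a` plus the default assignments in the original one-liner, so values from a previous row cannot leak into the next.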
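Summarize at the phylum, class, order, family, and genus levels: the `-rank` option takes `p`, `c`, `o`, `f`, `g`, the abbreviations of phylum, class, order, family, and genus.
The full set of ranks is Kingdom, Phylum, Class, Order, Family, Genus, Species.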
mkdir -p results/tax
for i in p c o f g;do
usearch -sintax_summary results/otus.sintax \
-otutabin results/otutab_rare.txt -rank ${i} \
-output results/tax/sum_${i}.txt
done
sed -i 's/(//g;s/)//g;s/\"//g;s/\#//g;s/\/Chloroplast//g' results/tax/sum_*.txt
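The effect of the cleanup substitutions is easiest to see on a couple of strings. The sketch below applies the same `sed` expressions to two made-up names of the kind they target; the helper name `clean_names` and both sample strings are assumptions, and GNU sed is assumed as in the rest of the pipeline.

```shell
# Hypothetical helper: the same substitutions as the sed -i call, applied to stdin.
clean_names() {
  sed 's/(//g;s/)//g;s/\"//g;s/\#//g;s/\/Chloroplast//g'
}

# Made-up names: one quoted with a /Chloroplast suffix, one with '#' and parentheses.
cleaned=$(printf '%s\n' '"Cyanobacteria/Chloroplast"' '#(Unassigned)' | clean_names)
printf '%s\n' "$cleaned"
```

Because the substitutions are global and unanchored, they also strip these characters from anywhere else in the file, which is harmless here since the remaining cells are numeric.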
sum_p.txt
Phylum | KO1 | KO2 | KO3 | KO4 | KO5 | KO6 | OE1 | OE2 | OE3 | OE4 | OE5 | OE6 | WT1 | WT2 | WT3 | WT4 | WT5 | WT6 | All |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Proteobacteria | 65.9 | 51.6 | 62.4 | 63 | 74.8 | 70 | 57.5 | 48.2 | 51 | 56.9 | 59.5 | 57.3 | 51.6 | 61.2 | 57.4 | 56.7 | 57 | 57.8 | 53.9 |
Actinobacteria | 26.5 | 40.3 | 27.7 | 28.7 | 17.4 | 24.3 | 29.5 | 36 | 34.2 | 29.1 | 31 | 29.7 | 36.1 | 30 | 32.5 | 34.5 | 34.8 | 32 | 28.5 |
Bacteroidetes | 3.03 | 2.67 | 7.29 | 3.05 | 4.75 | 2.12 | 3.62 | 8.07 | 8.71 | 7.4 | 3.72 | 3.53 | 5.36 | 6.03 | 4.7 | 4.11 | 4.83 | 5.75 | 5 |
Firmicutes | 1.74 | 1.96 | 1.22 | 3.23 | 0.73 | 1.52 | 5.19 | 2.81 | 2.56 | 3.37 | 1.79 | 5.2 | 0.82 | 0.62 | 1.22 | 1.02 | 0.53 | 1.11 | 3.9 |
Chloroflexi | 1.85 | 2.43 | 0.63 | 1.38 | 1.56 | 1.2 | 1.64 | 3.03 | 1.7 | 1.88 | 2.12 | 1.93 | 4.76 | 0.99 | 1.87 | 2.04 | 2.01 | 2.1 | 3.7 |
Acidobacteria | 0.3 | 0.34 | 0.19 | 0.13 | 0.22 | 0.27 | 0.95 | 0.41 | 0.55 | 0.27 | 0.72 | 0.84 | 0.33 | 0.34 | 0.92 | 0.46 | 0.13 | 0.33 | 1.4 |
sum_g.txt
Genus | KO1 | KO2 | KO3 | KO4 | KO5 | KO6 | OE1 | OE2 | OE3 | OE4 | OE5 | OE6 | WT1 | WT2 | WT3 | WT4 | WT5 | WT6 | All |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Unassigned | 10.6 | 10.6 | 11.3 | 9.83 | 7.32 | 11.2 | 12.4 | 10.2 | 9.72 | 10.5 | 10.6 | 12.3 | 9.14 | 8.85 | 12.4 | 10.3 | 8.41 | 10.7 | 17.1 |
Nocardioides | 1.67 | 2.44 | 2.58 | 1.84 | 1.79 | 2.29 | 1.72 | 2.8 | 1.96 | 1.6 | 2.25 | 1.75 | 1.68 | 1.45 | 2.19 | 1.8 | 1.85 | 1.83 | 2.8 |
Gaiella | 0.57 | 0.92 | 0.51 | 0.71 | 0.39 | 0.6 | 0.84 | 1.09 | 0.6 | 0.81 | 0.83 | 1 | 0.97 | 0.74 | 0.8 | 0.8 | 0.75 | 0.66 | 2.6 |
Steroidobacter | 2.52 | 2.21 | 1.77 | 1.74 | 1.26 | 1.91 | 3.45 | 2.53 | 2.51 | 2.46 | 3.07 | 2.97 | 1.95 | 2.77 | 3.31 | 3.03 | 2.55 | 3.38 | 2.4 |
Acidibacter | 1.61 | 1.5 | 1.19 | 1.14 | 0.87 | 1.19 | 2.49 | 1.31 | 1.46 | 1.36 | 1.75 | 2.06 | 1.35 | 1.19 | 2.13 | 1.59 | 1.19 | 2.08 | 1.6 |
Streptomyces | 3.22 | 2.83 | 3.14 | 2.27 | 1.17 | 1.84 | 2.61 | 3.74 | 8.57 | 4.99 | 3.16 | 3.24 | 4.19 | 3.58 | 3.36 | 6.34 | 4.6 | 3.5 | 1.6 |
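Once the summary tables exist, a quick way to inspect them is to rank the taxa by the last column (`All`). The sketch below is one possible approach, not part of the original workflow: the helper name `top_taxa` is hypothetical, the files are assumed to be tab-separated with a single header line, and the demo file holds only a few values copied from the phylum table above.

```shell
# Hypothetical helper: print the N taxa with the largest value in the last column.
# $1: summary table (tab-separated, header on line 1); $2: number of rows to keep
top_taxa() {
  tail -n +2 "$1" \
    | awk -F '\t' '{ print $NF "\t" $1 }' \
    | sort -t "$(printf '\t')" -k1,1nr \
    | head -n "$2"
}

# Small demo file: taxon, one sample column, and the All column.
demo=$(mktemp)
printf 'Phylum\tKO1\tAll\n'            >  "$demo"
printf 'Firmicutes\t1.74\t3.9\n'       >> "$demo"
printf 'Actinobacteria\t26.5\t28.5\n'  >> "$demo"
printf 'Proteobacteria\t65.9\t53.9\n'  >> "$demo"
printf 'Bacteroidetes\t3.03\t5\n'      >> "$demo"

top=$(top_taxa "$demo" 2)
printf '%s\n' "$top"
rm -f "$demo"
```

On a real run the call would look like `top_taxa results/tax/sum_g.txt 10`. Using `$NF` instead of a fixed column number keeps the helper independent of how many samples the table contains.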