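Two-column OTU taxonomy format: drop the confidence values from the sintax output and keep only the taxonomy annotation, replace ":" with "__", and delete quotation marks. The otus.sintax file used here is the one obtained after removing plastids and non-bacterial sequences and filtering.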
cut -f 1,4 results/otus.sintax \
  | sed 's/\td/\tk/;s/:/__/g;s/,/;/g;s/"//g' \
  > results/taxonomy2.txt
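To make the effect of this step concrete, the sketch below runs the same cut/sed chain on a single made-up sintax record. The OTU ID and taxon names are invented for illustration and do not come from the real data; it also assumes GNU sed, which understands \t inside a pattern.

```shell
# One hypothetical sintax record with four tab-separated fields:
# ID, prediction with confidences, strand, prediction above the cutoff
line=$(printf 'OTU_1\td:Bacteria(1.00),p:Firmicutes(0.95)\t+\td:Bacteria,p:Firmicutes')

# Same transformation as above, applied to the sample line instead of the file
result=$(printf '%s\n' "$line" | cut -f 1,4 | sed 's/\td/\tk/;s/:/__/g;s/,/;/g;s/"//g')

printf '%s\n' "$result"
```

For this sample the result is the OTU ID, a tab, and then k__Bacteria;p__Firmicutes: the domain prefix d becomes k, each ":" becomes "__", and commas become semicolons.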
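Eight-column OTU taxonomy format: note that the annotations are ragged, because not every OTU is classified at every rank.
When building the taxonomy table, fill the empty ranks of each OTU/ASV with Unassigned.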
# make sure the output directory exists
mkdir -p results/matrix
# default every rank to Unassigned, then overwrite the ranks present in column 2
awk 'BEGIN{OFS=FS="\t"}{delete a; a["k"]="Unassigned";a["p"]="Unassigned";a["c"]="Unassigned";a["o"]="Unassigned";a["f"]="Unassigned";a["g"]="Unassigned";a["s"]="Unassigned";
  split($2,x,";");for(i in x){split(x[i],b,"__");a[b[1]]=b[2];}
  print $1,a["k"],a["p"],a["c"],a["o"],a["f"],a["g"],a["s"];}' \
  results/taxonomy2.txt > results/matrix/otus.tax
# strip any leftover rank prefixes, keep the 8 columns, and prepend a header row
sed 's/;/\t/g;s/.__//g;' results/matrix/otus.tax | cut -f 1-8 | \
  sed '1 s/^/OTUID\tKingdom\tPhylum\tClass\tOrder\tFamily\tGenus\tSpecies\n/' \
  > results/taxonomy.txt
taxonomy.txt
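The Unassigned padding can be checked on a single record. The sketch below is a compact variant of the same idea, not the exact command above: it loops over a rank list instead of spelling out each key. The input record is made up for illustration and is annotated only down to phylum.

```shell
# Hypothetical two-column record, classified only to phylum
row=$(printf 'OTU_1\tk__Bacteria;p__Firmicutes')

filled=$(printf '%s\n' "$row" | awk 'BEGIN{OFS=FS="\t"}{
  # Start every rank as Unassigned, then overwrite the ranks that are present
  n=split("k p c o f g s",ranks," ")
  for(j=1;j<=n;j++) a[ranks[j]]="Unassigned"
  m=split($2,x,";")
  for(i=1;i<=m;i++){split(x[i],b,"__"); a[b[1]]=b[2]}
  # Emit the ID followed by the seven ranks in fixed order
  out=$1
  for(j=1;j<=n;j++) out=out OFS a[ranks[j]]
  print out
}')

printf '%s\n' "$filled"
```

The sample yields eight tab-separated columns: OTU_1, Bacteria, Firmicutes, and then Unassigned for class through species.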
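Summarize abundance at the phylum, class, order, family, and genus levels by passing p, c, o, f, and g to the -rank parameter; these are the abbreviations of phylum, class, order, family, and genus. The full rank hierarchy is Kingdom, Phylum, Class, Order, Family, Genus, Species.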
mkdir -p results/tax
for i in p c o f g; do
  usearch -sintax_summary results/otus.sintax \
    -otutabin results/otutab_rare.txt -rank ${i} \
    -output results/tax/sum_${i}.txt
done
sed -i 's/(//g;s/)//g;s/\"//g;s/\#//g;s/\/Chloroplast//g' results/tax/sum_*.txt
sum_p.txt
sum_g.txt
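The final sed cleanup in the summary step removes parentheses, double quotes, "#", and the "/Chloroplast" suffix from the summary tables. As a rough sketch, the same expression can be tried on a couple of invented labels. These strings are hypothetical and only exercise each substitution; they are not copied from real usearch output, and GNU sed is assumed.

```shell
# Hypothetical taxon labels, invented only to exercise each substitution
sample=$(printf '%s\n' '"Cyanobacteria/Chloroplast"' '#(Unassigned)')

# Same expression as the cleanup step, reading from stdin instead of editing files in place
cleaned=$(printf '%s\n' "$sample" | sed 's/(//g;s/)//g;s/\"//g;s/\#//g;s/\/Chloroplast//g')

printf '%s\n' "$cleaned"
```

Running the expression on a pipe like this, without -i, is a low-risk way to preview the cleanup before modifying results/tax/sum_*.txt in place.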