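The R script keeps bacteria and archaea (or eukaryotes), removes chloroplast and mitochondrial sequences, reports the proportion removed, and writes a filtered, sorted OTU table.
Input: the OTU table result/raw/otutab.txt and the taxonomy annotation result/raw/otus.sintax.
Output: the filtered, sorted feature table result/otutab.txt, the contamination-proportion statistics result/raw/otutab_nonBac.txt, and the filtering details otus.sintax.discard.
For fungal ITS data, use the otutab_filter_nonFungi.R script instead, which keeps only fungi.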
otutab.txt
otus.sintax
Rscript script/otutab_filter_nonBac.R \
  --input results/matrix/otutab.txt \
  --taxonomy results/matrix/otus.sintax \
  --output results/otutab.txt \
  --stat results/matrix/otutab_nonBac.stat \
  --discard results/matrix/otus.sintax.discard
otutab_nonBac.stat
otus.sintax.discard
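To make the filtering logic concrete, here is a minimal shell sketch run on invented toy data. It is not the code of otutab_filter_nonBac.R; it only illustrates the idea described above. It assumes the sintax file is tab-separated with the final taxonomy string in column 4, and that organelle sequences carry the words "Chloroplast" or "Mitochondria" in that string. All file names under the temporary directory are hypothetical.

```shell
# Toy data, invented for illustration only.
tmp=$(mktemp -d)
printf '#OTU ID\tS1\tS2\nASV_1\t10\t20\nASV_2\t5\t5\nASV_3\t0\t10\n' > "$tmp/otutab.txt"
printf 'ASV_1\td:Bacteria(1.00),p:Proteobacteria(0.99)\t+\td:Bacteria,p:Proteobacteria\n' > "$tmp/otus.sintax"
printf 'ASV_2\td:Bacteria(1.00),c:Chloroplast(0.95)\t+\td:Bacteria,c:Chloroplast\n' >> "$tmp/otus.sintax"
printf 'ASV_3\td:Bacteria(1.00),f:Mitochondria(0.90)\t+\td:Bacteria,f:Mitochondria\n' >> "$tmp/otus.sintax"

# Pass 1 (sintax): flag OTUs that are not Bacteria/Archaea, or are organelles.
# Pass 2 (OTU table): print the header and the kept rows, and tally removed reads.
awk -F'\t' -v stat="$tmp/stat.txt" '
  NR==FNR { if ($4 !~ /d:(Bacteria|Archaea)/ || $4 ~ /Chloroplast|Mitochondria/) drop[$1]=1; next }
  FNR==1  { print; next }
  { n=0; for (i=2; i<=NF; i++) n+=$i
    total+=n
    if ($1 in drop) removed+=n; else print }
  END { printf("removed %d of %d reads (%.1f%%)\n", removed, total, 100*removed/total) > stat }
' "$tmp/otus.sintax" "$tmp/otutab.txt" > "$tmp/otutab.filtered.txt"

cat "$tmp/stat.txt"
```

On this toy input, ASV_2 and ASV_3 are dropped and only ASV_1 remains in the filtered table. The real script may apply additional rules, such as sorting or handling unassigned OTUs, that this sketch does not cover.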
cut -f 1 results/otutab.txt | tail -n+2 > results/otutab.id
usearch -fastx_getseqs results/raw/otus.fa \
  -labels results/otutab.id -fastaout results/otus.fa
otus.fa
>ASV_1
GTAGTCCACGCCGTAAACGGTGGGCGCTAGATGTGGGGACCTTCCACGGTTTCTGCGTCGCAGCTAACGCATTAAGCGCC
CCGCCTGGGGAGTACGGTCGCAAGACTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGCGGAGCATGTTGCTTA
ATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATCGCCGGAAAACTCGCAGAGATGCGGGGTCCTTTTGGGCCGG
TGACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTTCT
ATGTTGCCAGCACGCCCTTCGGGGTGGTGGGGACTCATAGGAGACTGCCGGGGTCAACTCGGA
>ASV_2
GTAGTCCACGCCCTAAACGATGTCAACTGGTTGTTGGGAGGGTTTCTTCTCAGTAACGTAGCTAACGCGTGAAGTTGACC
GCCTGGGGAGTACGGCCGCAAGGTTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGATGATGTGGTTTAAT
TCGATGCAACGCGAAAAACCTTACCTACCCTTGACATGTCTGGAATCCTGAAGAGATTTGGGAGTGCTCGAAAGAGAGCC
AGAACACAGGTGCTGCATGGCCGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGT
CATTAGTTGCTACGAAAGGGCACTCTAATGAGACTGCCGGTGACAAACCGGA
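After extracting sequences, it can be worth confirming that the FASTA file and the OTU table contain exactly the same IDs. The following is a small sketch of such a check on invented toy files; the file names under the temporary directory are hypothetical, and in a real run you would point the two commands at results/otutab.txt and results/otus.fa.

```shell
# Toy data, invented for illustration only.
tmp=$(mktemp -d)
printf '#OTU ID\tS1\tS2\nASV_1\t10\t20\nASV_2\t5\t5\n' > "$tmp/otutab.txt"
printf '>ASV_1\nGTAGTCCACG\nCCGTAAACGG\n>ASV_2\nGTAGTCCACG\n' > "$tmp/otus.fa"

# IDs from the table (first column, header skipped) and from the FASTA headers.
cut -f 1 "$tmp/otutab.txt" | tail -n +2 | sort > "$tmp/table.ids"
grep '^>' "$tmp/otus.fa" | sed 's/^>//' | sort > "$tmp/fasta.ids"

# diff exits with status 0 only when both sorted ID lists are identical.
if diff "$tmp/table.ids" "$tmp/fasta.ids" > /dev/null; then
  check="OK: $(wc -l < "$tmp/table.ids" | tr -d ' ') IDs match"
else
  check="MISMATCH between OTU table and FASTA"
fi
echo "$check"
```

The check assumes FASTA headers contain only the ID. If headers carry annotations after the ID, strip them before comparing.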
awk 'NR==FNR{a[$1]=$0}NR>FNR{print a[$1]}' \
  results/matrix/otus.sintax results/otutab.id \
  > results/otus.sintax
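The awk command above uses the common two-file lookup idiom: while reading the first file (NR==FNR) it stores each full line keyed by its first column, and while reading the second file it prints the stored line for each ID. The toy example below illustrates this with invented file names. As a variant, it adds a `($1 in a)` guard so that IDs missing from the first file are skipped rather than printed as empty lines.

```shell
# Toy data, invented for illustration only.
tmp=$(mktemp -d)
printf 'ASV_1\ttaxA\nASV_2\ttaxB\nASV_3\ttaxC\n' > "$tmp/all.sintax"
printf 'ASV_3\nASV_1\nASV_9\n' > "$tmp/keep.id"

# First file: build the lookup table. Second file: print matches in ID-list order.
subset=$(awk 'NR==FNR { a[$1]=$0; next } ($1 in a) { print a[$1] }' \
  "$tmp/all.sintax" "$tmp/keep.id")
echo "$subset"
```

The output follows the order of the ID list (ASV_3, then ASV_1), not the order of the annotation file, so the subset stays aligned with the filtered OTU table.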
usearch -otutab_stats results/otutab.txt \
  -output results/otutab.stat
otutab.stat
220951 Reads (221.0k)
18 Samples
1521 OTUs

27378 Counts
5170 Count =0 (18.9%)
4891 Count =1 (17.9%)
4017 Count >=10 (14.7%)

437 OTUs found in all samples (28.7%)
609 OTUs found in 90% of samples (40.0%)
1426 OTUs found in 50% of samples (93.8%)

Sample sizes: min 11028, lo 11646, med 12434, mean 12275.1, hi 12701, max 13786
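The "Sample sizes" line summarizes per-sample read totals. If you want the individual totals, for example to see which sample is the smallest, they can be computed directly from the OTU table by summing each column. The sketch below does this with awk on an invented toy table; it assumes a tab-separated table with a single header row and OTU IDs in the first column.

```shell
# Toy data, invented for illustration only.
tmp=$(mktemp -d)
printf '#OTU ID\tS1\tS2\nASV_1\t10\t20\nASV_2\t5\t5\nASV_3\t0\t10\n' > "$tmp/otutab.txt"

# Header row: remember sample names. Other rows: add each count to its column sum.
sizes=$(awk -F'\t' '
  NR==1 { for (i=2; i<=NF; i++) name[i]=$i; n=NF; next }
  { for (i=2; i<=NF; i++) sum[i]+=$i }
  END { for (i=2; i<=n; i++) printf "%s\t%d\n", name[i], sum[i] }
' "$tmp/otutab.txt")
echo "$sizes"
```

Piping the result through `sort -k2,2n` lists samples from smallest to largest.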
At this point, the following files are available for downstream analysis:
results/otutab.txt
results/otus.fa
results/otus.sintax