
Normalization by Equal-Depth Subsampling

Last published: 2022-11-04 11:40:49

Normalize by subsample

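Use the vegan package to subsample (rarefy) every sample to the same depth. The input is a feature table of read counts, results/otutab.txt.
The input file, sampling depth, and random seed can be specified. The outputs are the rarefied table results/otutab_rare.txt and the alpha diversity table results/alpha/vegan.txt.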

mkdir -p results/alpha

Rscript script/otutab_rare.R --input results/otutab.txt \
      --depth 10000 --seed 1 \
      --normalize results/otutab_rare.txt \
      --output results/alpha/vegan.txt
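
The contents of script/otutab_rare.R are not shown on this page, so the following is only a minimal awk sketch of the idea behind equal-depth subsampling, not the script's actual implementation. The toy table, the function name `rarefy`, and its argument order (file, depth, seed) are assumptions made for illustration. Reads are drawn without replacement, so each sample ends up with exactly the requested depth.

```shell
# Toy feature table (hypothetical data): rows are OTUs, columns are samples.
demo=$(mktemp)
printf '#OTUID\tS1\tS2\nASV_1\t5\t2\nASV_2\t3\t6\nASV_3\t2\t4\n' > "$demo"

# Subsample every sample to the same depth without replacement.
# Each read is kept with probability (reads still needed) / (reads not yet seen),
# which selects exactly `depth` reads when the sample has at least that many.
# In this sketch, a sample with fewer reads than `depth` is left unchanged.
rarefy() {
  awk -F'\t' -v OFS='\t' -v depth="$2" -v seed="$3" '
    BEGIN { srand(seed) }
    NR == 1 { print; next }                      # copy the header line
    {
      id[NR] = $1
      for (j = 2; j <= NF; j++) { cnt[NR, j] = $j + 0; total[j] += $j }
      nf = NF; nr = NR
    }
    END {
      for (j = 2; j <= nf; j++) {                # one sample (column) at a time
        need = depth; left = total[j]
        for (i = 2; i <= nr; i++) {
          kept = 0
          for (k = 0; k < cnt[i, j]; k++) {      # visit each read of this OTU
            if (rand() * left < need) { kept++; need-- }
            left--
          }
          out[i, j] = kept
        }
      }
      for (i = 2; i <= nr; i++) {
        line = id[i]
        for (j = 2; j <= nf; j++) line = line OFS out[i, j]
        print line
      }
    }' "$1"
}

rare=$(rarefy "$demo" 8 1)
printf '%s\n' "$rare"
```

awk implementations differ in their random number generators, so the same seed may give different individual counts on another system. Only the per-sample total (8 in this toy run) is guaranteed.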

otutab_rare.txt

#OTUID  KO1  KO2  KO3  KO4  KO5  KO6  OE1  OE2  OE3  OE4  OE5  OE6  WT1  WT2  WT3  WT4  WT5  WT6
ASV_1   330  536  214  322  277  323  418  518  430  317  410  320  632  660  489  519  411  406
ASV_2   584  334  658  598  727  503  210  152  360  379  285  217  361  413  220  311  399  341
ASV_3   191  139  282  230  320  255  189  135  322  319  258  214  259  507  298  332  320  226
ASV_4   458  105  140  193  343  144  178  171  117  173  173  115  291  357  157  151  306  232
ASV_8   167  156  155  107  51   77   117  165  458  257  176  170  176  214  167  279  141  149
ASV_6   258  190  240  240  380  301  77   82   100  127  109  125  141  154  132  145  125  129

vegan.txt

SampleID  richness  chao1             ACE               shannon           simpson     invsimpson
KO1       1210      1439.9336492891   1433.18771126219  5.88030042614566  0.98975464  97.6051597991676
KO2       1198      1376.43913043478  1399.75229527834  5.95381577180023  0.99172944  120.910796850516
KO3       1051      1298.8578680203   1318.90880535978  5.61028156370725  0.98841646  86.3293949863341
KO4       1050      1274.95135135135  1267.60312612405  5.62923484359834  0.98911074  91.8336048546917
KO5       968       1152.71641791045  1191.25655835447  5.42149950885837  0.98560856  69.4857498624182
KO6       1118      1315.3125         1328.64990957236  5.80703938504695  0.99112052  112.619207431066
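
To make the columns above more concrete, here is a small awk sketch that computes richness and the Shannon, Simpson, and inverse Simpson indices from one column of counts. It assumes the usual definitions (natural-log Shannon, Simpson as 1 - sum(p^2), inverse Simpson as 1 / sum(p^2)). The values in vegan.txt are consistent with that relationship: for KO1, 1 / (1 - 0.98975464) is approximately 97.6. The counts 1, 1, 2 are toy data, and chao1 and ACE are not covered here.

```shell
# Alpha diversity of one sample from a single column of counts (toy data: 1, 1, 2).
alpha=$(printf '1\n1\n2\n' | awk '
  $1 > 0 { n += $1; c[++k] = $1 }          # keep only OTUs that are present
  END {
    for (i = 1; i <= k; i++) {
      p = c[i] / n                         # relative abundance of OTU i
      H -= p * log(p)                      # Shannon index (natural log)
      D += p * p                           # sum of squared proportions
    }
    printf "richness=%d shannon=%.4f simpson=%.4f invsimpson=%.4f\n", k, H, 1 - D, 1 / D
  }')
echo "$alpha"
# prints: richness=3 shannon=1.0397 simpson=0.6250 invsimpson=2.6667
```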

Before rarefaction

usearch -otutab_stats results/otutab.txt \
    -output results/otutab.stat
	
  220951  Reads (221.0k)
        18  Samples
      1521  OTUs

     27378  Counts
      5170  Count  =0  (18.9%)
      4891  Count  =1  (17.9%)
      4017  Count >=10 (14.7%)

       437  OTUs found in all samples (28.7%)
       609  OTUs found in 90% of samples (40.0%)
      1426  OTUs found in 50% of samples (93.8%)

Sample sizes: min 11028, lo 11646, med 12434, mean 12275.1, hi 12701, max 13786

After rarefaction

usearch -otutab_stats results/otutab_rare.txt \
    -output results/otutab_rare.stat
	
    180000  Reads (180.0k)
        18  Samples
      1521  OTUs

     27378  Counts
      6222  Count  =0  (22.7%)
      5415  Count  =1  (19.8%)
      3345  Count >=10 (12.2%)

       360  OTUs found in all samples (23.7%)
       515  OTUs found in 90% of samples (33.9%)
      1386  OTUs found in 50% of samples (91.1%)

Sample sizes: min 10000, lo 10000, med 10000, mean 10000.0, hi 10000, max 10000
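
The totals reported above follow from the depth: 18 samples × 10000 reads = 180000 reads, and 18 samples × 1521 OTUs = 27378 counts. If usearch is not available, per-sample totals can also be checked with awk. The sketch below assumes a tab-separated table with a single header line. The function name `sample_sums` and the toy table are illustrative only.

```shell
# Print the read total of every sample column in a tab-separated feature table.
sample_sums() {
  awk -F'\t' '
    NR == 1 { for (j = 2; j <= NF; j++) name[j] = $j; nf = NF; next }
    { for (j = 2; j <= NF; j++) sum[j] += $j }
    END { for (j = 2; j <= nf; j++) print name[j] "\t" sum[j] }' "$1"
}

# Toy table (hypothetical data) standing in for results/otutab_rare.txt.
toy=$(mktemp)
printf '#OTUID\tS1\tS2\nASV_1\t4\t1\nASV_2\t3\t5\nASV_3\t1\t2\n' > "$toy"
sums=$(sample_sums "$toy")
echo "$sums"
```

Run as `sample_sums results/otutab_rare.txt`, this should list 10000 for every sample if the rarefied table matches the statistics shown above.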