metadata
WT1_1.fq.gz
zcat data/seq/WT1_1.fq.gz |less -S
@HISEQ:549:HLYNYBCXY:1:1101:6760:2239 1:N:0:CACTCAAT
ACGACTACAGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGATAACTAGCTGTCCGGGCACATGGTGCTTGGGTGGCGCAGCTAACGCATTAAGTTATCCGCCTGGGGAGTACGGTCGCAAGATTAAAACTCAAAGGAATTGACGGGGGCCTGCA>
+
DDDDDIIIIHIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHHIIIHIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIHIIIIIGHIIIIIIIIIIIHIIIIIIIIIIHIIIGHIIIIHIHHEHHHGHHIG>
@HISEQ:549:HLYNYBCXY:1:1101:15281:2155 1:N:0:CACTCAAT
ACGACTACAGAACAGGATTAGATACCCTGGTAGTCCACGCCCTAAACGATGTCAACTGGTTGTTGGGTCTTCACTGACTCAGTAACGAAGCTAACGCGTGAAGTTGACCGCCTGGGGAGTACGGCCGCAAGGTTGAAACTCAAAGGAATTGACGGGGACCCGCAC>
+
DDDDDICHIHIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIHIHHIIIGIIHIIIHIIIIIIHHIIIIIHHIIHIIIIIIHIIIIHIIGIIIIGHIIIIHIIIIIIHHHIIFHIIIIIGHIIIHII>
WT1_2.fq.gz
zcat data/seq/WT1_2.fq.gz |less -S
@HISEQ:549:HLYNYBCXY:1:1101:6760:2239 2:N:0:CACTCAAT
ACGTCATCCCCACCTTCCTCCGGCTTATCACCGGCGGTTTCCTTAGAGTGCCCAACTGAATGATGGCAACTAAGGACGAGGGTTGCGCTCGTTGCGGGACTTAACCCAACATCTCACGACACGAGCTGACGACAGCCATGCAGCACCTGTCACTGGTCCAGCCGA>
+
DDDDDIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHHIIIGIFHIHIIHHIIHHHIIHIIIIIIIIFIIIHHIIFHHHHIIIIIIIIIIIIIIIIIIIIHHCGDHIIIHIHHIHIHIGHHIGIIGHHHHCFEHHHHH->
@HISEQ:549:HLYNYBCXY:1:1101:15281:2155 2:N:0:CACTCAAT
ACGTCATCCCCACCTTCCTCCGGTTTGTCACCGGCAGTCTCATTAGAGTGCCCAACTAAATGTAGCAACTAATGACAAGGGTTGCGCTCGTTGCGGGACTTAACCCAACATCTCACGACACGAGCTGACGACAGCCATGCAGCACCTGTGTTACGGTTCTCTTTC>
+
DDDDDIIIIIIIIIIIIIIIIIIIIIIIIIIHHIIIIIIIIIIIIIIIIIIIIIIIIIHHGIIIIIIIIIIIGIIHIIHIIIIIIIIIIIIHIIIIHHHFHGCGHIIHIIIHIIIIIGIIIIIIIIHHIIIIIIII?GHH?@GEHFGFEHHHIHIH@@HHIIIEG>
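The two listings above show the same read IDs in the same order in both files, differing only in the `1:N:0:` / `2:N:0:` comment after the space. Pair merging relies on that ordering, so it can be worth checking before merging. The sketch below is one way to do it; the helper names `read_ids` and `paired_ids_match` are hypothetical (not vsearch commands), and it assumes plain 4-line FASTQ records.

```shell
# Hypothetical helper (not part of vsearch): print the read ID of every record
# in a gzipped FASTQ file, i.e. the first word of each 4-line record's header.
read_ids() {
    gzip -dc "$1" | awk 'NR % 4 == 1 { print $1 }'
}

# Succeeds only if both files list the same IDs in the same order.
# Comparing checksums avoids holding both ID lists in memory.
paired_ids_match() {
    [ "$(read_ids "$1" | cksum)" = "$(read_ids "$2" | cksum)" ]
}

# Example call with the sample shown above (paths assumed to exist):
# paired_ids_match data/seq/WT1_1.fq.gz data/seq/WT1_2.fq.gz && echo "pairs in sync"
```

The function returns a shell exit status, so it can be dropped into an `if` or chained with `&&` inside a per-sample loop.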
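# Merge a single sample:
# vsearch --fastq_mergepairs data/seq/WT1_1.fq.gz --reverse data/seq/WT1_2.fq.gz --fastqout results/WT1.merged.fq --relabel WT1.
# Merge every sample listed in the first column of data/metadata.txt (the header row is skipped).
# The loop runs in the background; let it finish before concatenating the results.
for i in `tail -n+2 data/metadata.txt|cut -f1`;do
  mkdir -p results/merged
  vsearch --fastq_mergepairs data/seq/${i}_1.fq.gz --reverse data/seq/${i}_2.fq.gz \
    --fastqout results/merged/${i}.merged.fq --relabel ${i}.
done &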
WT1.merged.fq
cat results/merged/WT1.merged.fq |less -S
@WT1.1
ACGACTACAGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGATAACTAGCTGTCCGGGCACATGGTGCTTGGGTGGCGCAGCTAACGCATTAAGTTATCCGCCTGGGGAGTACGGTCGCAAGATTAAAACTCAAAGGAATTGACGGGGGCCTGCA>
+
DDDDDIIIIHIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHHIIIHIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIHIIIIIGHIIIIIIIIIIIHIIIIIIIIIIHIIIGHIIIIHIHHEHHHGHHIG>
@WT1.2
ACGACTACAGAACAGGATTAGATACCCTGGTAGTCCACGCCCTAAACGATGTCAACTGGTTGTTGGGTCTTCACTGACTCAGTAACGAAGCTAACGCGTGAAGTTGACCGCCTGGGGAGTACGGCCGCAAGGTTGAAACTCAAAGGAATTGACGGGGACCCGCAC>
+
DDDDDICHIHIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIHIHHIIIGIIHIIIHIIIIIIHHIIIIIHHIIHIIIIIIHIIIIHIIGIIIIGHIIIIHIIIIIIHHHIIFHIIIIIGHIIIHII>
cat results/merged/*.merged.fq > results/merged/all.fq
all.fq
cat results/merged/all.fq |less -S
@KO1.1
ACGCTCGACAAACAGGATTAGATACCCTGGTAGTCCACGCCCTAAACGATGTGTGCTGGGCGTCGGGGGGCTTGCCCCTCGGTGCCGGAGCCAACGCGGTAAGCACACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTCAAAGGAATTGACGGGGGCCCGCAC>
+
@DDDDHIIIIIIHHIIIIGHIHICGHIIIIIIH<FHF?CHHIHHCGHHHHIFHCHE@G@EF?HHHHCHID/EEHCEHHII?EGDHI/DHHIFHHHHD<HDHGDHHEGGFCHHHCHH<EEE0DECHHIGIHHED@FCDFE0;DGHHI?GH/BDFHDHHH<?-8?8E>
@KO1.2
ACGCTCGACAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGGATGCTAGCCGTTGGCCGGTTTACCGGTCAGTGGCGCAGCTAACGCTTTAAGCATCCCGCCTGGGGAGTACGGTCGCAAGATTAAAACTCAAAGGAATTGACGGGGGCCCGCAC>
+
DDDDDIIIGIIIIIIIIIIIHIIIIIIIGIIIIIHIIIIIIIIGIGIHHIHHIIIIIIIIIIIIIIIIIIIHIIIIIHHIIIHHGIDHDHHIIIIIIIIIIIHIIIGHIHIHHHHIHHIIIHHDGHIHHHDCHHIIIGIIIIIHIGHIIHHHGHEHDHHIGIIII>
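Because every merged read was relabelled as `<sample>.<number>`, the pooled file can be tallied per sample. Note that the quality line of KO1.1 above begins with `@`, so counting lines that start with `@` would overcount; selecting every fourth line is safer. A minimal sketch, assuming 4-line FASTQ records and a hypothetical helper name `reads_per_sample`:

```shell
# Hypothetical helper: print "sample<TAB>read count" for a relabelled FASTQ file.
reads_per_sample() {
    awk 'NR % 4 == 1 {
             id = $1
             sub(/^@/, "", id)          # drop the FASTQ header marker
             sub(/\.[0-9]+$/, "", id)   # drop the running number added by --relabel
             n[id]++
         }
         END { for (s in n) print s "\t" n[s] }' "$1" | LC_ALL=C sort
}

# Example call (path taken from the step above, assumed to exist):
# reads_per_sample results/merged/all.fq
```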
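Trim the barcode and primers: the left end carries a 10 bp barcode plus the 19 bp V5 forward primer (29 bp in total), and the right end carries the 18 bp V7 reverse primer. Make sure you know your own experimental design and primer lengths; if the primers have already been removed, set both values to 0. Filtering about 270,000 sequences took 14 s.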
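To make the two lengths concrete, the effect of removing 29 bases on the left and 18 on the right can be mimicked on a single string. This is only an illustration of the arithmetic, not how vsearch is implemented; the helper `strip_ends` is hypothetical, and the toy read is assembled from the first 29 bases of KO1.1, a 12-base insert, and an 18-base stand-in for the reverse primer.

```shell
# Hypothetical helper: remove LEFT bases from the start and RIGHT bases from
# the end of a sequence. Usage: strip_ends SEQUENCE LEFT RIGHT
strip_ends() {
    printf '%s\n' "$1" |
        awk -v l="$2" -v r="$3" '{ print substr($0, l + 1, length($0) - l - r) }'
}

# Toy read (an assumption for illustration, not real output):
left="ACGCTCGACAAACAGGATTAGATACCCTG"   # 10 bp barcode + 19 bp forward primer = 29 bp
insert="GTAGTCCACGCC"                  # the part that should survive trimming
right="GGAAGGTGGGGATGACGT"             # 18 bp stand-in for the reverse primer

strip_ends "${left}${insert}${right}" 29 18
```

With these values the call prints only the 12-base insert, which is the behaviour the 29 and 18 settings are meant to produce on real reads.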
mkdir -p results/raw
vsearch --fastx_filter results/merged/all.fq \
  --fastq_stripleft 29 \
  --fastq_stripright 18 \
  --fastq_maxee_rate 0.01 \
  --fastaout results/raw/filtered.fa
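The `--fastq_maxee_rate 0.01` setting discards reads whose expected number of errors per base exceeds 0.01. The expected error count is the sum of the per-base error probabilities 10^(-Q/10), where Q is the Phred score. The sketch below recomputes that rate for one quality string so the threshold is easier to reason about. It assumes the Phred+33 encoding seen in the listings above, and `ee_rate` is a hypothetical helper, not a vsearch command.

```shell
# Hypothetical helper: expected errors per base for one Phred+33 quality string.
ee_rate() {
    printf '%s\n' "$1" | awk '
        BEGIN {
            # Lookup table from printable character to ASCII code.
            for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i
        }
        {
            ee = 0
            n = length($0)
            for (i = 1; i <= n; i++) {
                q = ord[substr($0, i, 1)] - 33   # Phred score of base i
                ee += 10 ^ (-q / 10)             # error probability of base i
            }
            printf "%.6f\n", ee / n
        }'
}

ee_rate "IIII"   # all Q40: well below the 0.01 cut-off
ee_rate "5555"   # all Q20: sits at the cut-off
```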
filtered.fa
cat results/raw/filtered.fa |less -S
>KO1.1
GTAGTCCACGCCCTAAACGATGTGTGCTGGGCGTCGGGGGGCTTGCCCCTCGGTGCCGGAGCCAACGCGGTAAGCACACC
GCCTGGGGAGTACGGCCGCAAGGTTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGCGGAGCATGTTGCTTAAT
TCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATCGCCGGAAAACTCGCAGAGATGCGGGGTCCTTTTGGGCCGGTG
ACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTTCTAT
GTTGCCAGCACGCCCTTCGGGGTGGTGGGGACTCATAGGAGACTGCCGGGGTCAACTCGG
>KO1.2
GTAGTCCACGCCGTAAACGATGGATGCTAGCCGTTGGCCGGTTTACCGGTCAGTGGCGCAGCTAACGCTTTAAGCATCCC
GCCTGGGGAGTACGGTCGCAAGATTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTCAAT
TCGACGCAACGCGAAGAACCTTACCAGCTCTTGACATGTCTCGTATGGGTTTCAGAGATGAGACCCTTCAGTTCGGCTGG
CGAGAACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTC
GCCTTTAGTTGCCATCATTTAGTTGGGCACTCTAAAGGGACTGCCGGTGATAAGCCGCGA
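The filtered FASTA is line-wrapped (80 bases per line in the listing above), so a sequence's length has to be summed over all of its lines. A small sketch for listing the trimmed length of each read, which can help confirm that the barcode and primers were removed as intended; the helper name `fasta_lengths` is hypothetical.

```shell
# Hypothetical helper: print "read ID<TAB>sequence length" for a (wrapped) FASTA file.
fasta_lengths() {
    awk '/^>/ {
             if (id != "") print id "\t" len   # emit the previous record
             id = substr($1, 2)                # header without the ">"
             len = 0
             next
         }
         { len += length($0) }                 # add up wrapped sequence lines
         END { if (id != "") print id "\t" len }' "$1"
}

# Example call (path taken from the step above, assumed to exist):
# fasta_lengths results/raw/filtered.fa | head
```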