Mapping RNA-seq Reads with STAR

Output files

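  • Log.out: the main log file, with detailed information about the run. It is most useful for troubleshooting and debugging.
  • Log.progress.out: reports job progress statistics, such as the number of reads processed and the percentage of mapped reads. It is updated every minute.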
  • Log.final.out: summary mapping statistics, written once the mapping job is complete. Example:
                                 Started job on |       Mar 29 21:35:30
                             Started mapping on |       Mar 29 21:35:51
                                    Finished on |       Mar 29 21:44:36
       Mapping speed, Million of reads per hour |       172.08

                          Number of input reads |       25095163
                      Average input read length |       199
                                    UNIQUE READS:
                   Uniquely mapped reads number |       21489980
                        Uniquely mapped reads % |       85.63%
                          Average mapped length |       198.84
                       Number of splices: Total |       4482974
            Number of splices: Annotated (sjdb) |       4414683
                       Number of splices: GT/AG |       4446804
                       Number of splices: GC/AG |       32461
                       Number of splices: AT/AC |       3709
               Number of splices: Non-canonical |       0
                      Mismatch rate per base, % |       0.26%
                         Deletion rate per base |       0.01%
                        Deletion average length |       1.82
                        Insertion rate per base |       0.01%
                       Insertion average length |       1.45
                             MULTI-MAPPING READS:
        Number of reads mapped to multiple loci |       3149053
             % of reads mapped to multiple loci |       12.55%
        Number of reads mapped to too many loci |       55833
             % of reads mapped to too many loci |       0.22%
                                  UNMAPPED READS:
  Number of reads unmapped: too many mismatches |       0
       % of reads unmapped: too many mismatches |       0.00%
            Number of reads unmapped: too short |       344611
                 % of reads unmapped: too short |       1.37%
                Number of reads unmapped: other |       55686
                     % of reads unmapped: other |       0.22%
                                  CHIMERIC READS:
                       Number of chimeric reads |       0
                            % of chimeric reads |       0.00%

Note that STAR counts a paired-end read as one read (unlike samtools flagstat/idxstats, which count each mate separately).
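As a sanity check on the statistics above, the following minimal Python sketch parses the `key | value` lines of a Log.final.out and recomputes the uniquely mapped percentage. The function names `parse_star_final_log` and `percent` are hypothetical helpers written for this illustration, not part of STAR, and the sample text is copied from the log shown above.

```python
def parse_star_final_log(text):
    """Map each 'key | value' line of a Log.final.out to its value string."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip blank lines and section headers such as "UNIQUE READS:"
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def percent(part, total):
    """Percentage of total, rounded to two decimals like the log."""
    return round(100.0 * part / total, 2)


# A subset of the log shown above (values copied verbatim).
sample = """\
                          Number of input reads |       25095163
                                    UNIQUE READS:
                   Uniquely mapped reads number |       21489980
                        Uniquely mapped reads % |       85.63%
        Number of reads mapped to multiple loci |       3149053
        Number of reads mapped to too many loci |       55833
  Number of reads unmapped: too many mismatches |       0
            Number of reads unmapped: too short |       344611
                Number of reads unmapped: other |       55686
"""

stats = parse_star_final_log(sample)
n_input = int(stats["Number of input reads"])
n_unique = int(stats["Uniquely mapped reads number"])
unique_pct = percent(n_unique, n_input)

# Every input read should fall into exactly one of these categories.
categories = [
    "Uniquely mapped reads number",
    "Number of reads mapped to multiple loci",
    "Number of reads mapped to too many loci",
    "Number of reads unmapped: too many mismatches",
    "Number of reads unmapped: too short",
    "Number of reads unmapped: other",
]
accounted = sum(int(stats[name]) for name in categories)

print(unique_pct, stats["Uniquely mapped reads %"], accounted == n_input)
```

In this log the six categories add up exactly to the number of input reads, which suggests that every percentage is reported relative to the input reads (for paired-end data, read pairs).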

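Each splice is counted in the numbers of splices, which corresponds to summing the counts in SJ.out.tab.

  • SJ.out.tab: high-confidence collapsed splice junctions in tab-delimited format. Note that STAR defines the junction start/end as intronic bases, whereas many other tools define them as exonic bases. The columns have the following meaning: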
  • column 1: chromosome
  • column 2: first base of the intron (1-based)
  • column 3: last base of the intron (1-based)
  • column 4: strand (0: undefined, 1: +, 2: -)
  • column 5: intron motif: 0: non-canonical; 1: GT/AG, 2: CT/AC, 3: GC/AG, 4: CT/GC, 5: AT/AC, 6: GT/AT
  • column 6: 0: unannotated, 1: annotated in the splice junctions database. Note that in 2-pass mode, junctions detected in the 1st pass are reported as annotated, in addition to annotated junctions from GTF.
  • column 7: number of uniquely mapping reads crossing the junction
  • column 8: number of multi-mapping reads crossing the junction
  • column 9: maximum spliced alignment overhang
chr1    10060   10106   2       2       0       0       1       41
chr1    10060   10178   2       2       0       0       1       46
chr1    10066   10106   2       2       0       0       1       41
chr1    10066   10178   2       2       0       0       1       46
chr1    10072   10106   2       2       0       0       1       41
chr1    10072   10178   2       2       0       0       1       46
chr1    10078   10106   2       2       0       0       1       41
chr1    10078   10178   2       2       0       0       1       46
chr1    10084   10106   2       2       0       0       1       41
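The column layout above can be sketched in Python as follows. This is a minimal illustration, not an official parser: the names `Junction`, `parse_sj_line` and `flanking_exon_bases` are hypothetical, and using "." for an undefined strand is a convention chosen here. Because STAR reports intronic coordinates, the flanking exonic bases used by many other tools are assumed to be start - 1 and end + 1.

```python
from collections import namedtuple

# Decoding tables taken from the column descriptions above.
STRAND = {0: ".", 1: "+", 2: "-"}  # "." for undefined is a choice made for this sketch
MOTIF = {0: "non-canonical", 1: "GT/AG", 2: "CT/AC", 3: "GC/AG",
         4: "CT/GC", 5: "AT/AC", 6: "GT/AT"}

Junction = namedtuple(
    "Junction",
    "chrom intron_start intron_end strand motif annotated "
    "unique_reads multi_reads max_overhang",
)


def parse_sj_line(line):
    """Parse one SJ.out.tab line (nine whitespace-separated columns)."""
    f = line.split()
    return Junction(
        chrom=f[0],
        intron_start=int(f[1]),   # first base of the intron, 1-based
        intron_end=int(f[2]),     # last base of the intron, 1-based
        strand=STRAND[int(f[3])],
        motif=MOTIF[int(f[4])],
        annotated=(f[5] == "1"),
        unique_reads=int(f[6]),
        multi_reads=int(f[7]),
        max_overhang=int(f[8]),
    )


def flanking_exon_bases(junction):
    """Last exonic base before the intron and first exonic base after it."""
    return junction.intron_start - 1, junction.intron_end + 1


# First two rows of the example above.
sample = (
    "chr1\t10060\t10106\t2\t2\t0\t0\t1\t41\n"
    "chr1\t10060\t10178\t2\t2\t0\t0\t1\t46\n"
)

junctions = [parse_sj_line(l) for l in sample.splitlines() if l.strip()]
first = junctions[0]
print(first.strand, first.motif, flanking_exon_bases(first))

# Sum of column 7 over all junctions; assumed here to be the quantity that the
# splice totals under UNIQUE READS in Log.final.out correspond to.
total_unique = sum(j.unique_reads for j in junctions)
print(total_unique)
```

The junctions in this example are supported only by multi-mapping reads (column 7 is 0, column 8 is 1), so they contribute nothing to the unique-read sum.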
