Building a pan-genome from microbial genomes downloaded from NCBI

Last published: 2025-04-26 12:51:16

The species Ligilactobacillus murinus (taxid 1622) has 1263 genomes in total.

Of these, 10 genomes have an Assembly level of complete or chromosome and are annotated by NCBI RefSeq. They can be downloaded with the following command:


ncbi-genome-download   \
 --assembly-levels complete,chromosome \
 --species-taxids 1622 bacteria

Note that --formats defaults to genbank and --section defaults to refseq, as the relevant part of the help text shows:

  -s {refseq,genbank}, --section {refseq,genbank}
                        NCBI section to download (default: refseq)
  -l ASSEMBLY_LEVELS, --assembly-levels ASSEMBLY_LEVELS
                        Assembly levels of genomes to download (default: all). A comma-separated list
                        of assembly levels is also possible. For example: "complete,chromosome".
                        Choose from: ['all', 'complete', 'chromosome', 'scaffold', 'contig']
  -F FILE_FORMATS, --formats FILE_FORMATS
                        Which formats to download (default: genbank).A comma-separated list of
                        formats is also possible. For example: "fasta,assembly-report". Choose from:
                        ['genbank', 'fasta', 'rm', 'features', 'gff', 'protein-fasta', 'genpept',
                        'wgs', 'cds-fasta', 'rna-fna', 'rna-fasta', 'assembly-report', 'assembly-
                        stats', 'translated-cds', 'all']
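
To make the filters visible in the command itself, the two defaults can be written out explicitly. The sketch below assumes ncbi-genome-download is on the PATH and uses its --dry-run flag, which only lists the matching assemblies without downloading anything. The variable name args is just an illustration.

```shell
# Same selection as the command above, with --section and --formats spelled out.
# --dry-run lists the matching assemblies and downloads nothing.
args="--section refseq --formats genbank --assembly-levels complete,chromosome --species-taxids 1622 --dry-run bacteria"

if command -v ncbi-genome-download >/dev/null 2>&1; then
    # Intentionally unquoted so the shell splits $args into separate arguments.
    ncbi-genome-download $args || echo "dry run failed (no network access?)" >&2
else
    echo "ncbi-genome-download not found; would run: ncbi-genome-download $args" >&2
fi
```

Dropping --dry-run from args turns this back into a real download.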

The downloaded files are laid out as follows:

.
└── refseq
    └── bacteria
        ├── GCF_000364205.2
        │   ├── GCF_000364205.2_ASM36420v2_genomic.gbff.gz
        │   └── MD5SUMS
        ├── GCF_003288115.1
        │   ├── GCF_003288115.1_ASM328811v1_genomic.gbff.gz
        │   └── MD5SUMS
        ├── GCF_003288135.1
        │   ├── GCF_003288135.1_ASM328813v1_genomic.gbff.gz
        │   └── MD5SUMS
        ├── GCF_010586905.1
        │   ├── GCF_010586905.1_ASM1058690v1_genomic.gbff.gz
        │   └── MD5SUMS
        ├── GCF_029369785.1
        │   ├── GCF_029369785.1_ASM2936978v1_genomic.gbff.gz
        │   └── MD5SUMS
        ├── GCF_030295135.1
        │   ├── GCF_030295135.1_ASM3029513v1_genomic.gbff.gz
        │   └── MD5SUMS
        ├── GCF_033405435.1
        │   ├── GCF_033405435.1_ASM3340543v1_genomic.gbff.gz
        │   └── MD5SUMS
        ├── GCF_035904565.1
        │   ├── GCF_035904565.1_ASM3590456v1_genomic.gbff.gz
        │   └── MD5SUMS
        ├── GCF_045161885.1
        │   ├── GCF_045161885.1_ASM4516188v1_genomic.gbff.gz
        │   └── MD5SUMS
        └── GCF_045164165.1
            ├── GCF_045164165.1_ASM4516416v1_genomic.gbff.gz
            └── MD5SUMS
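
Each assembly directory contains an MD5SUMS file next to the downloaded archive, so the files can be re-checked later, for example after copying them to another machine. The following is a minimal sketch. It assumes GNU coreutils md5sum (for --ignore-missing) and that MD5SUMS uses the usual md5sum format and may list files that were not downloaded. The function name verify_downloads is hypothetical.

```shell
# Re-check every assembly directory under a root folder against its MD5SUMS file.
# Usage: verify_downloads refseq/bacteria
verify_downloads() {
    vd_root="${1:-refseq/bacteria}"
    vd_status=0
    for vd_dir in "$vd_root"/*/; do
        # Skip the unexpanded pattern when the root has no subdirectories.
        [ -d "$vd_dir" ] || continue
        # --ignore-missing skips entries for files that were never downloaded;
        # --quiet suppresses the per-file OK lines.
        if (cd "$vd_dir" && md5sum --check --ignore-missing --quiet MD5SUMS); then
            echo "OK      ${vd_dir%/}"
        else
            echo "FAILED  ${vd_dir%/}"
            vd_status=1
        fi
    done
    return "$vd_status"
}
```

The function returns a non-zero status if any directory fails the check, so it can gate later steps in a script.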