Reading Kraken 2 database files: hash.k2d, taxo.k2d, opts.k2d

Last published: 2026-02-14 17:14:24

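build_db.cc builds the compact hash table (CHT) index that Kraken 2 needs from NCBI reference sequences (FASTA) and taxonomy information (nodes.dmp / names.dmp).

Input:

  • Reference sequences in FASTA format (implied by the ID-to-taxid mapping)
  • A seqid → taxid mapping file (e.g. seqid2taxid.map)
  • The NCBI taxonomy directory (containing nodes.dmp and names.dmp)

Output:

  • hash.k2d: the compact hash table (stores minimizer → LCA taxid)
  • opts.k2d: option metadata (k, l, masks, etc.)
  • taxo.k2d: the condensed internal taxonomy structure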
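// opts.k2d.cc: read opts.k2d and print the fields of kraken2::IndexOptions.
#include <fstream>
#include <iostream>
#include "../src/kraken2_data.h"
using kraken2::IndexOptions;

int main() {
    IndexOptions opts;

    // Open in binary mode: the file holds the raw bytes of the struct.
    std::ifstream ifs("/opt/workspace/kraken2/kraken2/data/small_db/opts.k2d", std::ios::binary);
    if (!ifs) {
        std::cerr << "Failed to open opts.k2d" << std::endl;
        return 1;
    }

    // Copy the bytes straight into the struct. This assumes the file was
    // written with the same IndexOptions layout as the header included above.
    ifs.read(reinterpret_cast<char*>(&opts), sizeof(opts));
    if (!ifs) {
        std::cerr << "Failed to read IndexOptions from file" << std::endl;
        return 1;
    }

    std::cout << "k = " << opts.k << "\n";
    std::cout << "l = " << opts.l << "\n";
    std::cout << "spaced_seed_mask = " << opts.spaced_seed_mask << "\n";
    std::cout << "toggle_mask = " << opts.toggle_mask << "\n";
    std::cout << "dna_db = " << opts.dna_db << "\n";  // printed as 1 or 0

    return 0;
}
// Build (the binary is named read_opts so it cannot be confused with a database file):
// g++ -std=c++11 -I../src -O2 opts.k2d.cc -o read_opts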