
nextflow cache

Last published: 2024-09-06 10:27:45

Comparing the hashes of two runs
When using -dump-hashes json, the task hashes can be more easily extracted into a diff. The log excerpt below shows the hash entries recorded for one task (saySecond) with this option enabled; the process and workflow definitions follow it.

Sep-24 20:44:26.798 [Actor Thread 4] INFO  nextflow.processor.TaskProcessor - [saySecond (1)] cache hash: 6d013ed42165601f566441641599822b; mode: LENIENT; entries: [
    {
        "hash": "52fc10538560a06bf4eefb5029a3a408",
        "type": "java.util.UUID",
        "value": "68e313b7-a96b-482f-aa5a-5d3c2971779c"
    },
    {
        "hash": "4f60e6294a34dfbe2dd400257e58d05e",
        "type": "java.lang.String",
        "value": "saySecond"
    },
    {
        "hash": "480d57b7cbbe5454ea270a39e044ed1e",
        "type": "java.lang.String",
        "value": "    \"\"\"\n    cat $db = > db2.json\n    \"\"\"\n"
    },
    {
        "hash": "112f589ee6fa1b07d3f510e3885ea446",
        "type": "java.lang.String",
        "value": "master:5000/stress:latest"
    },
    {
        "hash": "0d39a5ff3a5c828a386e57fe6d0f07cd",
        "type": "java.lang.String",
        "value": "db"
    },
    {
        "hash": "9598b73b3492f1a8034f97fd39eff09f",
        "type": "nextflow.util.ArrayBag",
        "value": "[FileHolder(sourceObj:/data/workspace/1/nf-hello/db/a.txt, storePath:/data/workspace/1/nf-hello/db/a.txt, stageName:a.txt)]"
    },
    {
        "hash": "4f9d4b0d22865056c37fb6d9c2a04a67",
        "type": "java.lang.String",
        "value": "$"
    },
    {
        "hash": "16fe7483905cce7a85670e43e4678877",
        "type": "java.lang.Boolean",
        "value": "true"
    }
]
process saySecond {
  scratch true
  stageInMode "copy"
  container "master:5000/stress:latest"

  cache 'lenient'
  input: 
    path db
  output:
    path("db2.json")
  script:
    """
    cat $db > db2.json
    """
}
workflow {
  ch_input = Channel.fromPath(["/data/workspace/1/nf-hello/db/a.txt","/data/workspace/1/nf-hello/db/b.txt"])
  saySecond(ch_input)
}
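
The hash entries that -dump-hashes json writes to the log can be compared between two runs to see which input changed. The following is a minimal Python sketch of that comparison, not an official tool. It assumes the log header has the form shown in the excerpt above ("[task name] cache hash: ...; mode: ...; entries: [") and that each entry is a JSON object with "hash", "type" and "value" keys. The function names parse_task_hashes and diff_entries are hypothetical.

```python
import json
import re

# Header pattern assumed from the log excerpt above:
#   ... - [saySecond (1)] cache hash: <hex>; mode: LENIENT; entries: [ ...
HEADER = re.compile(
    r"\[(?P<task>[^\]]+)\] cache hash: (?P<hash>[0-9a-f]+); "
    r"mode: (?P<mode>\w+); entries:\s*"
)


def parse_task_hashes(log_text):
    """Return {task name: {"hash", "mode", "entries"}} for each dumped task."""
    decoder = json.JSONDecoder()
    tasks = {}
    for match in HEADER.finditer(log_text):
        # raw_decode reads exactly one JSON value starting at the offset,
        # so the array may be on one line or pretty-printed over many.
        entries, _ = decoder.raw_decode(log_text, match.end())
        tasks[match.group("task")] = {
            "hash": match.group("hash"),
            "mode": match.group("mode"),
            "entries": entries,
        }
    return tasks


def diff_entries(run_a, run_b):
    """List (task, index, entry_a, entry_b) where the two runs disagree."""
    changes = []
    # Only tasks present in both runs can be compared entry by entry.
    for task in sorted(run_a.keys() & run_b.keys()):
        a = run_a[task]["entries"]
        b = run_b[task]["entries"]
        for i in range(max(len(a), len(b))):
            entry_a = a[i] if i < len(a) else None
            entry_b = b[i] if i < len(b) else None
            if entry_a != entry_b:
                changes.append((task, i, entry_a, entry_b))
    return changes
```

To use it, read the log files of the two runs into strings (for example hypothetical files run_1.log and run_2.log), pass each to parse_task_hashes, and hand both results to diff_entries. An entry that differs points at the input, script or directive value that invalidated the cache.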

directive cache

process noCacheThis {
  cache false

  script:
  <your command string here>
}

The cache directive accepts the following values:

  • false: Disable caching.
  • true (default): Enable caching. Input file metadata (name, size, last updated timestamp) are included in the cache keys.
  • 'deep': Enable caching. Input file content is included in the cache keys.
  • 'lenient': Enable caching. Minimal input file metadata (name and size only) are included in the cache keys.

    This strategy provides a workaround for incorrect cache invalidation observed on shared file systems due to inconsistent file timestamps.
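
The difference between the modes comes down to which file attributes feed the cache key. The sketch below illustrates that idea in Python. It follows the descriptions in the list above and is not Nextflow's actual hashing code: the function name file_cache_key, the use of MD5, and the field order are all assumptions made for illustration.

```python
import hashlib
import os


def file_cache_key(path, mode=True):
    """Illustrative cache key for one input file under a given cache mode.

    mode is False (no caching), True (name, size, timestamp),
    'lenient' (name and size only) or 'deep' (name and file content).
    """
    if mode is False:
        return None  # caching disabled: no key is computed
    if mode not in (True, "lenient", "deep"):
        raise ValueError(f"unknown cache mode: {mode!r}")

    digest = hashlib.md5()
    digest.update(os.path.abspath(path).encode() + b"\0")

    if mode == "deep":
        # Hash the content itself, read in chunks to bound memory use.
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
        return digest.hexdigest()

    stat = os.stat(path)
    digest.update(str(stat.st_size).encode() + b"\0")
    if mode is True:
        # The default mode also depends on the last-modified timestamp.
        digest.update(str(stat.st_mtime_ns).encode())
    return digest.hexdigest()
```

Under this model, a file whose timestamp changes but whose name and size stay the same keeps its 'lenient' key while its default key changes, which is the situation the shared file system workaround targets. The trade-off is that 'lenient' cannot detect an edit that leaves the size unchanged, whereas 'deep' can, at the cost of reading every input file.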


Mar-17 09:09:32.392 [Actor Thread 54] 
    INFO  nextflow.processor.TaskProcessor - 
    [6e/c88da8] Cached process > BOWTIE2TWINS (YH4_T1)
Mar-17 09:09:32.850 [Task submitter] 
    DEBUG nextflow.executor.GridTaskHandler - 
    [SLURM] submitted process BOWTIE2TWINS (DH4_T1) > 
    jobId: 579084; 
    workDir: /data/metagenomics/pml_nextflow/work/7e/184f076728f7fca360e15c152ce81e


Mar-17 09:09:32.850 [Task submitter] 
    INFO  nextflow.Session - 
    [7e/184f07] Submitted process > BOWTIE2TWINS (DH4_T1)

Mar-17 09:09:56.408 [Task monitor] 
    DEBUG n.processor.TaskPollingMonitor - 
    Task completed > TaskHandler[
        jobId: 579084; 
        id: 180; 
        name: BOWTIE2TWINS (DH4_T1); 
        status: COMPLETED; 
        exit: 0; 
        error: -; 
        workDir: /data/metagenomics/pml_nextflow/work/7e/184f076728f7fca360e15c152ce81e 
        started: 1679015376405; 
        exited: 2023-03-17T01:09:55.108408Z; 
    ]
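
When checking a resumed run, it helps to list which tasks were restored from the cache and which were submitted again. The following is a small Python sketch that scans log text for the "Cached process >" and "Submitted process >" messages seen in the excerpt above. The function name split_cached_and_submitted is hypothetical, and the pattern is inferred from this excerpt only, so other Nextflow versions or executors may format these lines differently.

```python
import re

# Matches lines such as:
#   [6e/c88da8] Cached process > BOWTIE2TWINS (YH4_T1)
#   [7e/184f07] Submitted process > BOWTIE2TWINS (DH4_T1)
STATUS_LINE = re.compile(
    r"\[(?P<tag>[0-9a-f]{2}/[0-9a-f]{6})\] "
    r"(?P<status>Cached|Submitted) process > (?P<name>.+)"
)


def split_cached_and_submitted(log_text):
    """Group (work dir tag, task name) pairs by Cached or Submitted status."""
    result = {"Cached": [], "Submitted": []}
    for match in STATUS_LINE.finditer(log_text):
        pair = (match.group("tag"), match.group("name").strip())
        result[match.group("status")].append(pair)
    return result
```

Tasks that appear under "Submitted" in a run started with -resume are the ones whose cache lookup did not succeed, so they are the first candidates for a hash comparison.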

Troubleshooting Nextflow resume

Non-deterministic channel inputs

Cache source code