nextflow cache
Last published: 2024-09-06 10:27:45
- GitHub issue: Add cloud cache plugin
- Nextflow cache class diagram
- Nextflow runtime updates: Talking tasks
Comparing the hashes of two runs
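When a pipeline is run with -dump-hashes json, the hash of every task, together with the hash of each entry that feeds into it, is written to the log as JSON. The hashes of two runs can then be extracted from the logs and compared in a diff. The dump for a single task looks like this: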
Sep-24 20:44:26.798 [Actor Thread 4] INFO nextflow.processor.TaskProcessor - [saySecond (1)] cache hash: 6d013ed42165601f566441641599822b; mode: LENIENT; entries: [
    {
        "hash": "52fc10538560a06bf4eefb5029a3a408",
        "type": "java.util.UUID",
        "value": "68e313b7-a96b-482f-aa5a-5d3c2971779c"
    },
    {
        "hash": "4f60e6294a34dfbe2dd400257e58d05e",
        "type": "java.lang.String",
        "value": "saySecond"
    },
    {
        "hash": "480d57b7cbbe5454ea270a39e044ed1e",
        "type": "java.lang.String",
        "value": " \"\"\"\n cat $db = > db2.json\n \"\"\"\n"
    },
    {
        "hash": "112f589ee6fa1b07d3f510e3885ea446",
        "type": "java.lang.String",
        "value": "master:5000/stress:latest"
    },
    {
        "hash": "0d39a5ff3a5c828a386e57fe6d0f07cd",
        "type": "java.lang.String",
        "value": "db"
    },
    {
        "hash": "9598b73b3492f1a8034f97fd39eff09f",
        "type": "nextflow.util.ArrayBag",
        "value": "[FileHolder(sourceObj:/data/workspace/1/nf-hello/db/a.txt, storePath:/data/workspace/1/nf-hello/db/a.txt, stageName:a.txt)]"
    },
    {
        "hash": "4f9d4b0d22865056c37fb6d9c2a04a67",
        "type": "java.lang.String",
        "value": "$"
    },
    {
        "hash": "16fe7483905cce7a85670e43e4678877",
        "type": "java.lang.Boolean",
        "value": "true"
    }
]
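The comparison of two runs can be sketched in Python. This is a minimal sketch, not an official Nextflow tool: the function names `parse_hash_dumps` and `diff_runs` are hypothetical, the log layout is assumed to match the dump shown above, and tasks are paired across runs by the name in square brackets, which assumes that name is stable between runs.

```python
import json
import re

# Header of a hash dump as it appears in the log excerpt above, e.g.
#   "... - [saySecond (1)] cache hash: 6d01...; mode: LENIENT; entries: ["
HEADER = re.compile(
    r"\[(?P<task>[^\]]+)\] cache hash: (?P<hash>[0-9a-f]+); "
    r"mode: (?P<mode>\w+); entries:\s*"
)


def parse_hash_dumps(log_text):
    """Return {task name: {"hash", "mode", "entries"}} for every dump in a log."""
    decoder = json.JSONDecoder()
    dumps = {}
    for match in HEADER.finditer(log_text):
        # raw_decode reads exactly one JSON value starting at the given index,
        # so single-line and pretty-printed arrays are handled the same way.
        entries, _ = decoder.raw_decode(log_text, match.end())
        dumps[match.group("task")] = {
            "hash": match.group("hash"),
            "mode": match.group("mode"),
            "entries": entries,
        }
    return dumps


def diff_runs(first_log, second_log):
    """List (task, entry index, first entry, second entry) for changed entries."""
    first = parse_hash_dumps(first_log)
    second = parse_hash_dumps(second_log)
    changes = []
    for task in sorted(first.keys() & second.keys()):
        if first[task]["hash"] == second[task]["hash"]:
            continue  # identical task hash: nothing to report
        # zip stops at the shorter list; entries are compared position by position.
        pairs = zip(first[task]["entries"], second[task]["entries"])
        for index, (a, b) in enumerate(pairs):
            if a["hash"] != b["hash"]:
                changes.append((task, index, a, b))
    return changes
```

To use it, write each run to its own log file (for example with `nextflow -log run_1.log run <pipeline> -dump-hashes json`, and the second run with `-resume` and `-log run_2.log`; the file names are placeholders), read both files, and pass their contents to `diff_runs`. The entries it returns are the ones that changed the task hash and therefore prevented a cache hit.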
The corresponding process and workflow definition:
process saySecond {
    scratch true
    stageInMode "copy"
    container "master:5000/stress:latest"
    cache 'lenient'

    input:
    path db

    output:
    path("db2.json")

    script:
    """
    cat $db > db2.json
    """
}

workflow {
    ch_input = Channel.fromPath(["/data/workspace/1/nf-hello/db/a.txt", "/data/workspace/1/nf-hello/db/b.txt"])
    saySecond(ch_input)
}
The cache directive
process noCacheThis {
    cache false

    script:
    <your command string here>
}
The cache directive accepts the following values:
- false: Disable caching.
- true (default): Enable caching. Input file metadata (name, size, and last-updated timestamp) is included in the cache keys.
- 'deep': Enable caching. Input file content is included in the cache keys.
- 'lenient': Enable caching. Minimal input file metadata (name and size only) is included in the cache keys. This strategy provides a workaround for the incorrect cache invalidation observed on shared file systems due to inconsistent file timestamps.
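The difference between these values can be illustrated with a small Python sketch. It is not Nextflow's actual hashing code: `file_cache_key` and `timestamp_drift_demo` are hypothetical helpers that only mirror which file attributes each setting takes into account, and SHA-256 is an arbitrary choice of digest.

```python
import hashlib
import os
import tempfile


def file_cache_key(path, mode=True):
    """Illustrative cache key for one input file (NOT Nextflow's real hashing).

    It only mirrors which file attributes each `cache` value takes into account.
    """
    if mode is False:
        return None  # caching disabled: no key at all
    if mode is not True and mode not in ("deep", "lenient"):
        raise ValueError(f"unsupported cache mode: {mode!r}")
    digest = hashlib.sha256()
    if mode == "deep":
        with open(path, "rb") as handle:
            digest.update(handle.read())  # file content only
        return digest.hexdigest()
    info = os.stat(path)
    digest.update(os.path.abspath(path).encode())  # name
    digest.update(str(info.st_size).encode())      # size
    if mode is True:
        digest.update(str(info.st_mtime_ns).encode())  # last-updated timestamp
    return digest.hexdigest()


def timestamp_drift_demo():
    """Report, per mode, whether the key changes when only the mtime moves."""
    modes = (True, "deep", "lenient")
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "a.txt")
        with open(sample, "w") as handle:
            handle.write("hello\n")
        before = {mode: file_cache_key(sample, mode) for mode in modes}
        info = os.stat(sample)
        # Shift the modification time by five seconds, content untouched.
        os.utime(sample, ns=(info.st_atime_ns, info.st_mtime_ns + 5 * 10**9))
        after = {mode: file_cache_key(sample, mode) for mode in modes}
    return {mode: before[mode] != after[mode] for mode in modes}
```

In this sketch, shifting only the modification time changes the key for `true` but leaves the `'deep'` and `'lenient'` keys unchanged, which is why `'lenient'` helps on shared file systems that report inconsistent timestamps.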
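Example log excerpt from a resumed run, in which one task is restored from the cache and another is submitted again:
Mar-17 09:09:32.392 [Actor Thread 54]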
INFO nextflow.processor.TaskProcessor -
[6e/c88da8] Cached process > BOWTIE2TWINS (YH4_T1)
Mar-17 09:09:32.850 [Task submitter]
DEBUG nextflow.executor.GridTaskHandler -
[SLURM] submitted process BOWTIE2TWINS (DH4_T1) >
jobId: 579084;
workDir: /data/metagenomics/pml_nextflow/work/7e/184f076728f7fca360e15c152ce81e
Mar-17 09:09:32.850 [Task submitter]
INFO nextflow.Session -
[7e/184f07] Submitted process > BOWTIE2TWINS (DH4_T1)
Mar-17 09:09:56.408 [Task monitor]
DEBUG n.processor.TaskPollingMonitor -
Task completed > TaskHandler[
jobId: 579084;
id: 180;
name: BOWTIE2TWINS (DH4_T1);
status: COMPLETED;
exit: 0;
error: -;
workDir: /data/metagenomics/pml_nextflow/work/7e/184f076728f7fca360e15c152ce81e
started: 1679015376405;
exited: 2023-03-17T01:09:55.108408Z;
]
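When troubleshooting a resumed run, it helps to list which tasks were restored from the cache and which were submitted again. The following Python sketch scans a log for the "Cached process" and "Submitted process" messages shown in the excerpt above; `resume_summary` is a hypothetical helper, and the message layout is assumed from that excerpt rather than from any documented log format.

```python
import re
from collections import defaultdict

# Matches messages such as
#   "[6e/c88da8] Cached process > BOWTIE2TWINS (YH4_T1)"
#   "[7e/184f07] Submitted process > BOWTIE2TWINS (DH4_T1)"
EVENT = re.compile(
    r"\[(?P<workdir>[0-9a-f]{2}/[0-9a-f]{6})\] "
    r"(?P<status>Cached|Submitted) process > (?P<name>.+)"
)


def resume_summary(log_text):
    """Group task names and short work-directory hashes by status."""
    summary = defaultdict(list)
    for match in EVENT.finditer(log_text):
        summary[match.group("status")].append(
            (match.group("name").strip(), match.group("workdir"))
        )
    return dict(summary)
```

Applied to the excerpt above, this reports BOWTIE2TWINS (YH4_T1) under "Cached" and BOWTIE2TWINS (DH4_T1) under "Submitted". The tasks listed as submitted are the ones whose hash entries are worth comparing between runs.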
Troubleshooting Nextflow resume