Running Nextflow on Slurm
Directives in a process
Add a configuration file named nextflow.config in the same directory as main.nf with the contents below, then execute nextflow run sample.nf -resume:
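executor {
    name = 'slurm'
    cpus = 36
    memory = '150 GB'
    // maximum number of jobs Nextflow keeps in the queue at one time
    queueSize = 10
}
process {
    cpus = 8
    memory = '80 GB'
    // passed to sbatch as-is: submit to the "low" partition
    clusterOptions = '-p low'
    // queueSize is an executor option, not a process directive, so it is not set here
}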
Check the job on Slurm:
scontrol show job 578447
JobId=578447 JobName=nf-SAMPLE
UserId=zyd(1001) GroupId=zyd(1001) MCS_label=N/A
Priority=1 Nice=0 Account=test QOS=normal
JobState=RUNNING Reason=None Dependency=(null)
Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:00:13 TimeLimit=UNLIMITED TimeMin=N/A
SubmitTime=2023-03-16T18:24:33 EligibleTime=2023-03-16T18:24:33
AccrueTime=2023-03-16T18:24:33
StartTime=2023-03-16T18:24:33 EndTime=Unknown Deadline=N/A
PreemptEligibleTime=2023-03-16T18:24:33 PreemptTime=None
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-03-16T18:24:33 Scheduler=Main
Partition=low AllocNode:Sid=node1:6453
ReqNodeList=(null) ExcNodeList=(null)
NodeList=node4
BatchHost=node4
NumNodes=1 NumCPUs=8 NumTasks=1 CPUs/Task=8 ReqB:S:C:T=0:0:*:*
TRES=cpu=8,mem=80G,node=1,billing=8
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=8 MinMemoryNode=80G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=.command.run
WorkDir=/data/metagenomics/pml_nextflow/work/a4/afdcae3a30f9777ff14a41e5b23254
StdErr=/data/metagenomics/pml_nextflow/work/a4/afdcae3a30f9777ff14a41e5b23254/.command.log
StdIn=/dev/null
StdOut=/data/metagenomics/pml_nextflow/work/a4/afdcae3a30f9777ff14a41e5b23254/.command.log
Power=
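The resources Slurm actually allocated are TRES=cpu=8,mem=80G, which are exactly the resources configured in the process scope. For how resources are configured in Slurm itself, see "Resource Management".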
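To avoid reading the whole scontrol dump each time, the TRES field can be filtered out. The following is a minimal sketch; the helper name job_tres is hypothetical, and it assumes the field is printed as a single space-free token, as in the output above.

```shell
# Print only the first TRES field found in `scontrol show job` output on stdin.
# job_tres is a hypothetical helper name, not part of Slurm.
job_tres() {
    grep -o 'TRES=[^ ]*' | head -n 1
}

# Usage on the cluster (job ID taken from the example above):
#   scontrol show job 578447 | job_tres
```

Reading from stdin keeps the helper usable on saved output as well as on a live cluster.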
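Since it is the process settings that take effect, what do the resources configured in the executor scope do?
If the process settings are commented out, the result is TRES=cpu=1,mem=1G, which is the Slurm default.
According to the Nextflow documentation, executor.cpus and executor.memory are used only by the local executor, so they have no effect on jobs submitted to Slurm.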
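Another way to see what Nextflow requested from Slurm is to look at the generated job script. For the slurm executor, Nextflow writes the scheduler options as #SBATCH lines near the top of .command.run in the task work directory. A minimal sketch, with sbatch_headers as a hypothetical helper name:

```shell
# List the #SBATCH lines of the .command.run script in a task work directory.
# $1 is the work directory, e.g. the WorkDir reported by `scontrol show job`.
sbatch_headers() {
    grep '^#SBATCH' "$1/.command.run"
}

# Usage (replace the argument with the WorkDir of the job you are inspecting):
#   sbatch_headers /path/to/work/dir
```

If the process settings are missing, the CPU and memory options would be expected to be absent from these lines, which is why Slurm falls back to its defaults.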
The Nextflow documentation shows that settings such as memory and cpus can also be given as process directives, directly inside the process definition.
process SAMPLE {
    // inside a process, directives are written without "="
    publishDir(
        path: { "test_data" },
        mode: 'copy',
        // publish every output file except versions.yml
        saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
    )
    cpus 8
    memory '80 GB'
    clusterOptions '-p low'
    input:
    tuple val(pair_id), path(reads)
    output:
    path "${pair_id}/*.gz"
    script:
    // println reads[0]
    """
    # make sure the output directory exists before seqkit writes into it
    mkdir -p ${pair_id}
    zcat ${reads[0]} | seqkit sample -n 1000 -o ${pair_id}/${pair_id}_1.fastq.gz
    zcat ${reads[1]} | seqkit sample -n 1000 -o ${pair_id}/${pair_id}_2.fastq.gz
    """
}
workflow {
    SAMPLE(["ZF230216-57", ["/xxx/ZF230216-57/ZF230216-57_1.fq.gz", "/xxx/ZF230216-57/ZF230216-57_2.fq.gz"]])
}
In the same way, we can add a label to the process:
process SAMPLE {
    publishDir(
        path: { "test_data" },
        mode: 'copy',
        saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
    )
    label 'process_low'
    input:
    tuple val(pair_id), path(reads)
    output:
    path "${pair_id}/*.gz"
    ....
}
The nextflow.config file then contains:
executor {
    name = 'slurm'
    cpus = 36
    memory = '150 GB'
    queueSize = 10
}
process {
    withLabel: 'process_low' {
        cpus = 8
        memory = '80 GB'
        clusterOptions = '-p low'
    }
}
scontrol show job 578450
JobId=578450 JobName=nf-SAMPLE
UserId=zyd(1001) GroupId=zyd(1001) MCS_label=N/A
Priority=1 Nice=0 Account=test QOS=normal
JobState=RUNNING Reason=None Dependency=(null)
Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:00:29 TimeLimit=365-00:00:00 TimeMin=N/A
SubmitTime=2023-03-16T19:53:01 EligibleTime=2023-03-16T19:53:01
AccrueTime=2023-03-16T19:53:01
StartTime=2023-03-16T19:53:01 EndTime=2024-03-15T19:53:01 Deadline=N/A
PreemptEligibleTime=2023-03-16T19:53:01 PreemptTime=None
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-03-16T19:53:01 Scheduler=Backfill
Partition=low AllocNode:Sid=node1:103542
ReqNodeList=(null) ExcNodeList=(null)
NodeList=node4
BatchHost=node4
NumNodes=1 NumCPUs=8 NumTasks=1 CPUs/Task=8 ReqB:S:C:T=0:0:*:*
TRES=cpu=8,mem=80G,node=1,billing=8
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=8 MinMemoryNode=80G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=.command.run
WorkDir=/data/metagenomics/pml_nextflow/work/d4/366249842debfb588868e690f85f12
StdErr=/data/metagenomics/pml_nextflow/work/d4/366249842debfb588868e690f85f12/.command.log
StdIn=/dev/null
StdOut=/data/metagenomics/pml_nextflow/work/d4/366249842debfb588868e690f85f12/.command.log
Power=
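To compare the label-based run with the earlier one, it can help to reduce each job to the few fields that matter. The sketch below is only an illustration: job_summary is a hypothetical helper name, and the field names NumCPUs, MinMemoryNode and Partition are taken from the scontrol output shown above.

```shell
# Print "cpus memory partition" for one job ID, parsed from `scontrol show job`.
# job_summary is a hypothetical helper name, not part of Slurm.
job_summary() {
    scontrol show job "$1" | tr ' ' '\n' | awk -F= '
        $1 == "NumCPUs"       { cpus = $2 }   # CPUs allocated to the job
        $1 == "MinMemoryNode" { mem  = $2 }   # memory requested per node
        $1 == "Partition"     { part = $2 }   # partition the job runs in
        END { print cpus, mem, part }'
}

# Usage on the cluster (job IDs taken from the examples above):
#   job_summary 578447
#   job_summary 578450
```

For the two jobs shown in this post, both calls should report the same values (8 CPUs, 80G, partition low), confirming that the withLabel settings are applied the same way as the plain process settings.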