ENCODE pipeline (under active development in CPU folder)
Differences from Juicer 1.5.6:
bwa mem -SP5M in paired-end mode; most recent BWA release, 0.7.17
nofrag default
no short end read alignment anymore
alignonly, mergeonly, deduponly stages added
simplification of chimeric_blacklist in terms of output stem
chimeric_blacklist now partitions reads into the following bams:
- alignable.bam with duplicates marked
- collisions.bam
(ideally, these two are the ENCODE products that one can download)
- collisions_low_mapq.bam
- unmapped.bam
- mapq0.bam
mitochondria no longer treated as a special case
collisions now include contigs as well (blacklist instead of whitelist)
The merged_nodups file still appears as part of the pipeline. To recover merged_nodups from an alignable.bam (for example, one produced by ENCODE), use the bamtotxt.sh script.
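
As a rough illustration of the bwa mem -SP5M alignment mode listed above, the sketch below only assembles and prints the command line; it does not run BWA. The function name build_bwa_cmd, the file names, and the thread count are placeholders chosen for this example and are not taken from the pipeline scripts. The flag descriptions in the comments follow the BWA 0.7.17 usage text.

```shell
#!/usr/bin/env bash
# Sketch only: build_bwa_cmd, the file names, and the thread count are
# hypothetical placeholders, not names or defaults from the pipeline.
#
# Flags combined in -SP5M (per the bwa mem usage text):
#   -S  skip mate rescue
#   -P  skip pairing
#   -5  for split alignments, take the alignment with the smallest
#       coordinate as primary
#   -M  mark shorter split hits as secondary
build_bwa_cmd() {
    local ref="$1" r1="$2" r2="$3" threads="${4:-1}"
    # Print the command instead of executing it, so it can be inspected first.
    printf 'bwa mem -SP5M -t %s %s %s %s\n' "$threads" "$ref" "$r1" "$r2"
}

build_bwa_cmd reference.fasta sample_R1.fastq.gz sample_R2.fastq.gz 8
```

Both mates are passed to a single bwa mem call, which is what "paired end mode" refers to here; -S and -P keep BWA from applying its usual insert-size-based pairing logic, which does not suit Hi-C read pairs.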