MiXCR v4.5.0
🚀 New features
Multi-chain clone assembly for single-cell data
MiXCR now calculates combined multi-chain clones for single-cell data: heavy-light for antibodies, and alpha-beta and gamma-delta for TCRs. Two new commands were introduced to enable this functionality:
- `groupClones`: calculates multi-chain clones from assembled clonotypes and writes the result in a binary format;
- `exportCloneGroups`: exports information about combined clonotypes.
All single-cell presets now automatically produce combined multi-chain output in both binary and textual formats. For the textual output, see the files matching the `*.clone.groups.tsv` pattern in the output folder.
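As a small illustration of picking up the textual multi-chain output after a run, the following Python sketch lists files matching the `*.clone.groups.tsv` pattern in an output folder and reads them as tab-separated tables. The helper names `find_clone_group_tables` and `read_table` are hypothetical, and no particular column names are assumed, since the exact column set depends on the preset and export options.

```python
import csv
from pathlib import Path


def find_clone_group_tables(output_dir):
    """Return sorted paths of files named *.clone.groups.tsv in output_dir."""
    return sorted(Path(output_dir).glob("*.clone.groups.tsv"))


def read_table(path):
    """Read a tab-separated file into (header, rows).

    Rows are dicts keyed by whatever header the file carries;
    no column names are assumed here.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])
        rows = [dict(zip(header, row)) for row in reader]
    return header, rows
```

A typical use would be to loop over `find_clone_group_tables("results/")` (the folder name is only an example) and inspect the header of each table before relying on specific columns.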
New characteristics in clonotype export
- Export biochemical properties of gene regions with the `-biochemicalProperty <geneFeature> <property>` or `-baseBiochemicalProperties <geneFeature>` export options. Available in export for alignments, clones and SHM tree nodes. Available properties: `Hydrophobicity`, `Charge`, `Polarity`, `Volume`, `Strength`, `MjEnergy`, `Kf1`, `Kf2`, `Kf3`, `Kf4`, `Kf5`, `Kf6`, `Kf7`, `Kf8`, `Kf9`, `Kf10`, `Rim`, `Surface`, `Turn`, `Alpha`, `Beta`, `Core`, `Disorder`, `N2Strength`, `N2Hydrophobicity`, `N2Volume`, `N2Surface`.
- Export isotype with `-isotype [<(primary|subclass|auto)>]`
- Export `-mutationRate [<gene_feature>]` in the `exportShmTreesWithNodes`, `exportClones` and `exportCloneGroups` commands: the number of mutations relative to the corresponding germline divided by the target sequence size. For `exportClones` and `exportCloneGroups`, CDR3 is not included in the calculation.
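The mutation rate described above is a simple ratio, which the following Python sketch makes concrete. It is not MiXCR's implementation: the function name, the per-region dictionaries and the region lengths are hypothetical, and the sketch assumes that excluding CDR3 removes both its mutations and its length from the calculation.

```python
def mutation_rate(mutations_by_region, length_by_region, excluded=("CDR3",)):
    """Mutations relative to germline divided by the covered sequence length.

    mutations_by_region: region name -> number of mutations vs. germline
    length_by_region:    region name -> region length in nucleotides
    excluded:            regions left out of both numerator and denominator
                         (assumption: this mirrors how CDR3 is excluded)
    """
    regions = [name for name in length_by_region if name not in excluded]
    total_length = sum(length_by_region[name] for name in regions)
    if total_length == 0:
        raise ValueError("no sequence left after excluding regions")
    total_mutations = sum(mutations_by_region.get(name, 0) for name in regions)
    return total_mutations / total_length


# Hypothetical example: 3 mutations over 100 nt outside CDR3.
example_rate = mutation_rate(
    {"FR1": 2, "CDR1": 1, "CDR3": 5},
    {"FR1": 75, "CDR1": 25, "CDR3": 40},
)
```

With these made-up numbers the five CDR3 mutations are ignored, so the rate is 3 divided by 100.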
Support for wider set of input formats
- Support for `cram` files as input for the `analyze` and `align` commands. Optionally, a reference to the genome can be specified with `--reference-for-cram`
- Fixed usage of BAM input for `analyze` and `align` when the file contains both paired and single reads
Algorithm enhancements
- The global consensus assembly algorithm, applied in `assemble` to collapse UMI/Cell groups into contigs, now has a much better empirical seed selection step for multi-consensus assembly scenarios. This significantly increases sensitivity during assembly of secondary consensuses from the same group of sequences.
- New constraint in the low-quality reads mapping procedure that prevents cross-cell read mapping.
📚 Preset updates
- Additional improvement of clone filters in the `10x-sc-xcr-vdj` preset.
- Tag pattern upgrade for `cellecta-human-rna-xcr-umi-drivermap-air`. Now UMI includes a part of the C-gene primer to increase diversity, and R2 is also used for payload.
- Assembling feature fix for the `irepertoire-human-rna-xcr-repseq-plus` preset. It is now `{CDR2Begin:FR4End}`.
- New preset for BD full-length protocol with enhanced beads V2 featuring B384 whitelists: `bd-sc-xcr-rhapsody-full-length-enhanced-bead-v2`.
- New preset for Takara Bio SMART-Seq Mouse TCR (with UMIs): `takara-mouse-rna-tcr-umi-smarseq`.
- Presets for new Cellecta kits: `cellecta-human-dna-xcr-umi-drivermap-air`, `cellecta-human-rna-xcr-full-length-umi-drivermap-air`, `cellecta-mouse-rna-xcr-umi-drivermap-air`.
- Presets for iRepertoire RepSeq+ kits with UMI: `irepertoire-mouse-rna-xcr-repseq-plus-umi-pe`, `irepertoire-human-rna-xcr-repseq-plus-umi-se`, `irepertoire-human-rna-xcr-repseq-plus-umi-pe`.
- `isotype` field added to `exportClones` for presets supporting isotype identification.
- Split by C-gene enabled in the `thermofisher-human-rna-igh-oncomine-lr` and `cellecta-human-rna-xcr-umi-drivermap-air` presets to facilitate isotype separation.
- Default consensus assembly parameters `maxNormalizedAlignmentPenalty` and `altSeedPenaltyTolerance` are adjusted to increase sensitivity.
- The `--split-by-sample` option is now set to `true` by default for all `align` presets, as well as all presets that inherit from them. This new default behavior applies unless it is directly overridden in the preset or with the `--dont-split-by-sample` mix-in.
- `exportAlignments` now reports UMI and/or Cell barcodes by default for presets with barcodes.
🛠️ Minor improvements & fixes
- Fixed a possible crash with the `--dry-run` option in `analyze`
- Corrected and more informative help message for deprecated presets; the previous message incorrectly suggested using `--assemble-contigs-by` instead of `--assemble-clonotypes-by`.
- When split-by-tags is enabled, `exportClones` and `exportShmTreesWithNodes` now output the read count as the sum of reads for the given tags selection; a more complicated formula was used in previous versions
- `exportAlignments` now includes the `topChains` column by default.
- `exportClones` now reports `topChains` for single-cell presets.
- Fixed calculation of `geneFamilyName` for genes like `IGHA*00` (without a number before the `*` symbol)
- Better formatting in the `listPresets` command. Added grouping by vendor, labels and optional filtering
- Validation of input types in `align` or `analyze` against the given tag pattern
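The read-count change for split-by-tags export noted in the list above amounts to summing reads over every tag combination that falls into the same selection. The Python sketch below illustrates that idea only; it is not MiXCR's code, and the function name, the record layout and the tag names `CELL` and `UMI` are chosen here purely for illustration.

```python
from collections import defaultdict


def read_counts_by_tags(tag_records, split_by):
    """Sum read counts per combination of the tags used for splitting.

    tag_records: iterable of (tags, reads), where tags maps a tag name
                 to its value and reads is the read count of that record
    split_by:    tag names that define one output row
    """
    totals = defaultdict(int)
    for tags, reads in tag_records:
        key = tuple(tags[name] for name in split_by)
        totals[key] += reads
    return dict(totals)


# Hypothetical records for one clone observed in two cells.
records = [
    ({"CELL": "AAAC", "UMI": "TTG"}, 3),
    ({"CELL": "AAAC", "UMI": "GCA"}, 2),
    ({"CELL": "CCGT", "UMI": "TTG"}, 4),
]
per_cell = read_counts_by_tags(records, ["CELL"])
```

Splitting by `CELL` collapses the two UMIs of the first cell into a single row whose read count is their sum.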