Issue 645: MaxQuant mod parser will fail when first two letters are shared between 2 different mods

Status:	open
Assigned To:	Matt Chambers
Type:	Defect
Area:	Skyline
Priority:	3
Milestone:	4.3

Opened:	2019-04-25 09:51 by Matt Chambers
Changed:	2019-06-21 08:45 by Matt Chambers
Resolved:
Resolution:

Closed:

2019-04-25 09:51

Matt Chambers

Title		»	MaxQuant mod parser will fail when first two letters are shared between 2 different mods
Assigned To	Guest	»	matt.chambers42@gmail.com
Notify		»	Brendan MacLean;Nick Shulman
Type		»	Defect
Area		»	Skyline
Priority		»	3
Milestone		»	4.3

For example:
EFISQLCLQEKIR 13 1 Trimethyl (K),Trioxidation (C) _EFISQLC(tr)LQEK(tr)IR_

MaxQuantReader will match both mods to Trimethyl because it comes first in the list of mod names. In this case we could fix it by adding AA specificity to the lookup, but if two mods share both the first two letters AND the specificity (Sulfation and Sulfo share Y, Cysteinyl and "Cysteinyl - carbamidomethylation" share C, etc.), there is no way to match mods to positions from the two-letter abbreviation alone. The only way I can see to do it properly is to look at the extra probabilities column for each non-terminal mod. For example:
The "Trimethyl (K) probabilities" column has:
EFISQLCLQEK(1)IR

And the "Trioxidation (C) probabilities" column has:
EFISQLC(1)LQEKIR

What a mess!
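
A rough sketch of the probabilities-column approach described above, in Python for illustration only (it is not the MaxQuantReader code). The function names `parse_annotated` and `resolve_mods` are hypothetical, and the sketch assumes the annotation format shown in the example: one parenthesised value after the modified residue, underscores around the modified sequence, and no terminal mods.

```python
import re

# One residue letter, optionally followed by a parenthesised annotation,
# e.g. "C(tr)" in the modified sequence or "K(1)" in a probabilities column.
_TOKEN = re.compile(r"([A-Z])(?:\(([^)]*)\))?")


def parse_annotated(sequence):
    """Return {1-based residue position: annotation text} for annotated residues."""
    annotations = {}
    position = 0
    for _residue, note in _TOKEN.findall(sequence.strip("_")):
        position += 1
        if note:
            annotations[position] = note
    return annotations


def resolve_mods(modified_sequence, probability_columns):
    """Assign a full mod name to each modified position.

    probability_columns maps a mod name (e.g. "Trimethyl (K)") to the value
    of its "<mod name> probabilities" column for the same row (assumed layout).
    """
    sites = parse_annotated(modified_sequence)
    best = {}  # position -> (mod name, probability)
    for mod_name, column in probability_columns.items():
        for position, text in parse_annotated(column).items():
            if position not in sites:
                continue  # candidate site that was not the reported one
            probability = float(text)
            if position not in best or probability > best[position][1]:
                best[position] = (mod_name, probability)
    return {position: name for position, (name, _) in best.items()}


mods = resolve_mods(
    "_EFISQLC(tr)LQEK(tr)IR_",
    {
        "Trimethyl (K)": "EFISQLCLQEK(1)IR",
        "Trioxidation (C)": "EFISQLC(1)LQEKIR",
    },
)
assert mods == {7: "Trioxidation (C)", 11: "Trimethyl (K)"}
```

With this lookup the two-letter abbreviation is only used to find which positions are modified; the mod identity comes from whichever probabilities column reports the highest value at that position. Ties and terminal mods are not handled here.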

2019-06-21 08:45

Matt Chambers

Milestone

4.3