Issue 502: Improve UnimodCompiler to preserve more than one decimal place of mass for modifications that need more to disambiguate

Opened:2017-03-06 by Brendan MacLean
Changed:2018-10-16 by Brendan MacLean
Resolved:2018-10-15 by Nick Shulman
Closed:2018-10-16 by Brendan MacLean
2017-03-06 Brendan MacLean
Title»Improve UnimodCompiler to preserve more than one decimal place of mass for modifications that need more to disambiguate
It turns out that quite a few modifications in Unimod collide with each other when their masses are rounded to 1 decimal place, so they need at least 2 decimal places to be distinguished:

K[42.0] - Acetyl (K), Guanidinyl (K), Propyl (K), Trimethyl (K)
C[58.0] - Carboxymethyl (C), Delta:H(6)C(3)O(1) (C)
D[22.0] - Cation:Na (DE), Cation:Mg[II] (DE)
E[22.0] - Cation:Na (DE), Cation:Mg[II] (DE)
C[-18.0] - Dehydrated (N-term C), Cys->Oxoalanine (C)
K[145.1] - DiLeu4plex (K), DiLeu4plex115 (K), DiLeu4plex117 (K), DiLeu4plex118 (K)
Y[145.1] - DiLeu4plex (Y), DiLeu4plex115 (Y), DiLeu4plex117 (Y), DiLeu4plex118 (Y)
K[111.0] - ICPL:13C(6) (K), Nmethylmaleimide (K)
K[144.1] - iTRAQ4plex (K), iTRAQ4plex114 (K), iTRAQ4plex115 (K)
Y[144.1] - iTRAQ4plex (Y), iTRAQ4plex114 (Y), iTRAQ4plex115 (Y)
K[304.2] - iTRAQ8plex (K), iTRAQ8plex:13C(6)15N(2) (K)
Y[304.2] - iTRAQ8plex (Y), iTRAQ8plex:13C(6)15N(2) (Y)
M[-48.0] - Met->Hsl (C-term M), Dethiomethyl (M)
K[140.1] - mTRAQ (K), Xlink:DMP-de (K)
S[80.0] - Phospho (ST), Sulfo (STY)
T[80.0] - Phospho (ST), Sulfo (STY)
Y[80.0] - Phospho (Y), Sulfo (STY)
K[4.0] - ICPL:2H(4) (K), mTRAQ:13C(3)15N(1) (K), IMID:2H(4) (K), Label:2H(4) (K), Succinyl:13C(4) (K), Succinyl:2H(4) (K)
Y[4.0] - mTRAQ:13C(3)15N(1) (Y), Label:2H(4) (Y)
K[8.0] - mTRAQ:13C(6)15N(2) (K), Dimethyl:2H(6)13C(2) (K)
K[154.1] - 4-ONE (K), Xlink:DMP-s (K)
K[170.0] - AccQTag (K), Cresylphosphate (K), Menadione (K)
C[42.0] - Acetyl (C), Amidino (C)
K[132.1] - benzylguanidine (K), Propiophenone (K)
K[339.2] - BHAc (K), NHS-LC-Biotin (K)
C[414.2] - Bodipy (C), PEO-Iodoacetyl-LC-Biotin (C)
K[70.0] - Butyryl (K), Crotonaldehyde (K), PyruvicAcidIminyl (K)
K[57.0] - Carbamidomethyl (K), Gly (K)
S[57.0] - Carbamidomethyl (S), Gly (S)
T[57.0] - Carbamidomethyl (T), Gly (T)
K[44.0] - Carboxy (K), Ethanolyl (K)
G[16.0] - Carboxy->Thiocarboxy (Protein C-term G), Oxidation (C-term G)
K[58.0] - Carboxymethyl (K), Delta:H(6)C(3)O(1) (K)
D[61.9] - Cation:Cu[I] (DE), Cation:Zn[II] (DE)
E[61.9] - Cation:Cu[I] (DE), Cation:Zn[II] (DE)
D[6.0] - Cation:Li (DE), SulfanilicAcid:13C(6) (D)
E[6.0] - Cation:Li (DE), SulfanilicAcid:13C(6) (E)
K[244.1] - CLIP_TRAQ_4 (K), Xlink:EGS (K)
C[70.0] - Crotonaldehyde (C), PyruvicAcidIminyl (Protein N-term C)
N[3.0] - Deamidated:18O(1) (NQ), Delta:H(1)O(-1)18O(1) (N)
R[54.0] - Delta:H(2)C(3)O(1) (R), MG-H1 (R)
K[28.0] - Delta:H(4)C(2) (K), Dimethyl (K), Ethyl (K), Formyl (K)
K[56.0] - Delta:H(4)C(3)O(1) (K), Propionyl (K)
S[136.0] - Diethylphosphate (S), DTT_ST (S)
T[136.0] - Diethylphosphate (T), DTT_ST (T)
R[28.0] - Dimethyl (R), Label:13C(6)15N(4)+Methyl:2H(3)13C(1) (R)
K[34.1] - Dimethyl:2H(4)13C(2) (K), Label:13C(6)+Dimethyl (K)
C[32.0] - Dioxidation (C), Sulfide (C)
K[87.0] - DTBP (K), glycidamide (K)
C[120.0] - DTT_C (C), PS_Hapten (C)
C[456.1] - EGCG1 (C), FMNC (C)
K[108.0] - Ethylphosphate (K), HydroxymethylOP (K)
S[108.0] - Ethylphosphate (S), O-Dimethylphosphate (S)
T[108.0] - Ethylphosphate (T), O-Dimethylphosphate (T)
Y[108.0] - Ethylphosphate (Y), O-Dimethylphosphate (Y)
K[178.0] - Galactosyl (K), Gluconoylation (K)
K[127.1] - GIST-Quat (K), SMA (K)
K[162.1] - Hex (K), O-pinacolylmethylphosphonate (K)
S[162.1] - Hex (S), O-pinacolylmethylphosphonate (S)
T[162.1] - Hex (T), O-pinacolylmethylphosphonate (T)
Y[162.1] - Hex (Y), O-pinacolylmethylphosphonate (Y)
C[86.0] - HMVK (C), Malonyl (C)
K[156.1] - HNE (K), Xlink:DSS (K)
C[138.1] - HNE-Delta:H(2)O (C), ICDID (C)
K[59.0] - Hydroxytrimethyl (K), Methyl+Acetyl:2H(3) (K)
M[20.0] - Label:13C(1)2H(3)+Oxidation (M), Label:13C(4)+Oxidation (M)
K[120.1] - Label:13C(4)15N(2)+GlyGly (K), Label:13C(6)+GlyGly (K)
K[122.1] - Label:13C(6)15N(2)+GlyGly (K), Xlink:DMP (K)
K[46.0] - Label:2H(4)+Acetyl (K), Methylthio (K)
K[16.0] - Methyl:2H(2) (K), Oxidation (K)
C[80.0] - Phospho (C), Sulfo (C)
K[3.0] - Acetyl:2H(3) (K), GIST-Quat:2H(3) (K), Propionyl:13C(3) (K)
S[4.0] - AEC-MAEC:2H(4) (S), mTRAQ:13C(3)15N(1) (S)
T[4.0] - AEC-MAEC:2H(4) (T), mTRAQ:13C(3)15N(1) (T)
C[2.0] - Carboxymethyl:13C(2) (C), IGBP:13C(2) (C)
K[6.0] - Dimethyl:2H(6) (K), GIST-Quat:2H(6) (K), SPITC:13C(6) (K)
C[6.0] - DTT_C:2H(6) (C), ICAT-H:13C(6) (C), ICDID:2H(6) (C)
C[5.0] - EQAT:2H(5) (C), SecNEM:2H(5) (C)
M[4.0] - Label:13C(1)2H(3) (M), Label:13C(4) (M)
C[3.0] - Propionamide:2H(3) (C), QAT:2H(3) (C)
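
A minimal sketch of how a collision list like the one above could be produced, assuming the modifications are available as (name, residue, mass) tuples. The function name `find_collisions` is hypothetical, and the masses are typed in by hand for illustration rather than read from the Unimod file.

```python
from collections import defaultdict

# Illustrative (name, residue, monoisotopic delta mass) entries.
# Assumption: hand-entered values, not loaded from Unimod.
MODS = [
    ("Acetyl", "K", 42.010565),
    ("Guanidinyl", "K", 42.021798),
    ("Propyl", "K", 42.046950),
    ("Trimethyl", "K", 42.046950),
    ("Phospho", "S", 79.966331),
    ("Sulfo", "S", 79.956815),
]


def find_collisions(mods, decimals):
    """Group modifications by residue and mass rounded to `decimals` places.

    Returns only the keys shared by more than one modification.
    """
    groups = defaultdict(list)
    for name, residue, mass in mods:
        key = f"{residue}[{mass:.{decimals}f}]"
        groups[key].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


for key, names in find_collisions(MODS, 1).items():
    print(f"{key} - {', '.join(names)}")
# K[42.0] - Acetyl, Guanidinyl, Propyl, Trimethyl
# S[80.0] - Phospho, Sulfo
```

Calling `find_collisions(MODS, 2)` on the same data leaves only Propyl and Trimethyl sharing `K[42.05]`.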

Some still collide at 2 decimal places and need 3, or cannot be separated at all:

K[145.13] - DiLeu4plex (K), DiLeu4plex117 (K)
Y[145.13] - DiLeu4plex (Y), DiLeu4plex117 (Y)
K[144.10] - iTRAQ4plex (K), iTRAQ4plex115 (K)
Y[144.10] - iTRAQ4plex (Y), iTRAQ4plex115 (Y)
M[-48.00] - Met->Hsl (C-term M), Dethiomethyl (M)
K[140.09] - mTRAQ (K), Xlink:DMP-de (K)
K[4.03] - ICPL:2H(4) (K), IMID:2H(4) (K), Label:2H(4) (K), Succinyl:2H(4) (K)
K[4.01] - mTRAQ:13C(3)15N(1) (K), Succinyl:13C(4) (K)
K[339.16] - BHAc (K), NHS-LC-Biotin (K)
K[70.04] - Butyryl (K), Crotonaldehyde (K)
K[57.02] - Carbamidomethyl (K), Gly (K)
S[57.02] - Carbamidomethyl (S), Gly (S)
T[57.02] - Carbamidomethyl (T), Gly (T)
N[2.99] - Deamidated:18O(1) (NQ), Delta:H(1)O(-1)18O(1) (N)
R[54.01] - Delta:H(2)C(3)O(1) (R), MG-H1 (R)
K[28.03] - Delta:H(4)C(2) (K), Dimethyl (K), Ethyl (K)
K[56.03] - Delta:H(4)C(3)O(1) (K), Propionyl (K)
C[120.02] - DTT_C (C), PS_Hapten (C)
S[108.00] - Ethylphosphate (S), O-Dimethylphosphate (S)
T[108.00] - Ethylphosphate (T), O-Dimethylphosphate (T)
Y[108.00] - Ethylphosphate (Y), O-Dimethylphosphate (Y)
K[178.05] - Galactosyl (K), Gluconoylation (K)
K[59.05] - Hydroxytrimethyl (K), Methyl+Acetyl:2H(3) (K)
K[42.05] - Propyl (K), Trimethyl (K)
K[3.02] - Acetyl:2H(3) (K), GIST-Quat:2H(3) (K)
C[2.01] - Carboxymethyl:13C(2) (C), IGBP:13C(2) (C)
K[6.04] - Dimethyl:2H(6) (K), GIST-Quat:2H(6) (K)
C[6.04] - DTT_C:2H(6) (C), ICDID:2H(6) (C)
C[5.03] - EQAT:2H(5) (C), SecNEM:2H(5) (C)
C[3.02] - Propionamide:2H(3) (C), QAT:2H(3) (C)

Anything that still collides at 3 decimal places appears to do so because the delta masses are actually identical:

M[-48.003] - Met->Hsl (C-term M), Dethiomethyl (M)
K[140.095] - mTRAQ (K), Xlink:DMP-de (K)
K[4.025] - ICPL:2H(4) (K), IMID:2H(4) (K), Label:2H(4) (K), Succinyl:2H(4) (K)
K[339.162] - BHAc (K), NHS-LC-Biotin (K)
K[70.042] - Butyryl (K), Crotonaldehyde (K)
K[57.021] - Carbamidomethyl (K), Gly (K)
S[57.021] - Carbamidomethyl (S), Gly (S)
T[57.021] - Carbamidomethyl (T), Gly (T)
N[2.988] - Deamidated:18O(1) (NQ), Delta:H(1)O(-1)18O(1) (N)
R[54.011] - Delta:H(2)C(3)O(1) (R), MG-H1 (R)
K[28.031] - Delta:H(4)C(2) (K), Dimethyl (K), Ethyl (K)
K[56.026] - Delta:H(4)C(3)O(1) (K), Propionyl (K)
S[107.998] - Ethylphosphate (S), O-Dimethylphosphate (S)
T[107.998] - Ethylphosphate (T), O-Dimethylphosphate (T)
Y[107.998] - Ethylphosphate (Y), O-Dimethylphosphate (Y)
K[178.048] - Galactosyl (K), Gluconoylation (K)
K[42.047] - Propyl (K), Trimethyl (K)
K[3.019] - Acetyl:2H(3) (K), GIST-Quat:2H(3) (K)
C[2.007] - Carboxymethyl:13C(2) (C), IGBP:13C(2) (C)
K[6.038] - Dimethyl:2H(6) (K), GIST-Quat:2H(6) (K)
C[6.038] - DTT_C:2H(6) (C), ICDID:2H(6) (C)
C[5.031] - EQAT:2H(5) (C), SecNEM:2H(5) (C)
C[3.019] - Propionamide:2H(3) (C), QAT:2H(3) (C)
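
One way to turn this observation into a per-modification digit count is sketched below. It assumes that modifications with exactly equal masses on the same residue are treated as one entry, since no amount of precision separates them. The name `digits_needed` and the `max_decimals` cap are hypothetical, and the masses are hand-entered illustrative values (the two label shifts are computed from isotope mass differences).

```python
# Illustrative (name, residue, delta mass) entries. Assumption: hand-entered.
MODS = [
    ("Acetyl", "K", 42.010565),
    ("Guanidinyl", "K", 42.021798),
    ("Propyl", "K", 42.046950),
    ("Trimethyl", "K", 42.046950),
    ("mTRAQ:13C(3)15N(1)", "K", 4.007099),
    ("Succinyl:13C(4)", "K", 4.013419),
]


def digits_needed(mods, max_decimals=6):
    """Return {(name, residue): digits}, the fewest decimal places at which a
    modification no longer shares a rounded mass with a modification of a
    different mass on the same residue."""
    result = {}
    for name, residue, mass in mods:
        for d in range(1, max_decimals + 1):
            key = f"{mass:.{d}f}"
            clash = any(
                r == residue and f"{m:.{d}f}" == key and m != mass
                for _, r, m in mods
            )
            if not clash:
                break
        # If every precision clashed, d is left at max_decimals.
        result[(name, residue)] = d
    return result


for (name, residue), digits in digits_needed(MODS).items():
    print(f"{name} ({residue}): {digits}")
# Acetyl (K): 2
# Guanidinyl (K): 2
# Propyl (K): 2
# Trimethyl (K): 2
# mTRAQ:13C(3)15N(1) (K): 3
# Succinyl:13C(4) (K): 3
```

Propyl and Trimethyl both come out at 2 digits and still share the key `K[42.05]`. That pair stays ambiguous at any precision, which matches the identical-mass list above.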

We need to improve the Unimod compiler to perform this analysis and to encode in UnimodData.cs the number of decimal places to use in the LightModifiedSequence lookup keys produced by Skyline. It should also generate UnimodData.cpp for BlibBuild, containing the amino acid residue, the full-precision mass, and the number of digits required to disambiguate, so that the peptideSeqMod column in RefSpectra uses the same lookup format that Skyline will use.
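
As a rough illustration of the intended lookup, the sketch below formats a modified sequence using a table of (residue, full-precision mass, digits) rows. Everything here is an assumption made for illustration: the table contents, the names `PRECISION_TABLE`, `digits_for` and `modified_sequence`, the matching tolerance, and the bracket syntax (copied from the lists in this issue). It is not the actual Skyline or BlibBuild code.

```python
# Hypothetical generated table: (residue, full-precision mass, digits needed).
PRECISION_TABLE = [
    ("K", 42.010565, 2),  # Acetyl
    ("K", 42.046950, 2),  # Propyl / Trimethyl (identical mass)
    ("S", 79.966331, 2),  # Phospho
    ("S", 79.956815, 2),  # Sulfo
]
DEFAULT_DIGITS = 1  # assumption: one decimal place when no entry matches
TOLERANCE = 1e-5    # assumption: how closely a mass must match a table row


def digits_for(residue, mass):
    """Look up how many decimal places to write for this residue and mass."""
    for table_residue, table_mass, digits in PRECISION_TABLE:
        if table_residue == residue and abs(table_mass - mass) < TOLERANCE:
            return digits
    return DEFAULT_DIGITS


def modified_sequence(sequence, mods):
    """Build a lookup key; `mods` maps 0-based residue index to delta mass."""
    parts = []
    for index, residue in enumerate(sequence):
        parts.append(residue)
        if index in mods:
            digits = digits_for(residue, mods[index])
            parts.append(f"[{mods[index]:.{digits}f}]")
    return "".join(parts)


print(modified_sequence("PEPSK", {3: 79.966331, 4: 42.010565}))
# PEPS[79.97]K[42.01]
print(modified_sequence("MC", {0: 15.994915}))
# M[16.0]C
```

Both programs produce identical keys only if they share the same table and the same rounding rule, which is why the sketch keeps the full-precision mass and the digit count together in a single row.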

2017-03-06 Brendan MacLean
2017-09-21 Brendan MacLean
Assigned»Nick Shulman
This has turned out to be much more difficult than I expected. We lost around 2 months of Kaipo's time to it. Nick Shulman is now working on it.

2018-10-15 Nick Shulman
resolve as Fixed
Assigned To: Nick Shulman»Brendan MacLean
This got fixed in Skyline 4.1.

2018-10-16 Brendan MacLean
Assigned To: Brendan MacLean»Guest