Issue 502: Improve UnimodCompiler to preserve more than one decimal place of modification mass for modifications that require more to disambiguate

Status: closed
Assigned To: Guest
Type: Todo
Area: Skyline
Priority: 2
Milestone: 4.1
Opened: 2017-03-06 16:43 by Brendan MacLean
Changed: 2018-10-16 14:24 by Brendan MacLean
Resolved: 2018-10-15 15:04 by Nick Shulman
Resolution: Fixed
Closed: 2018-10-16 14:24 by Brendan MacLean
2017-03-06 16:43 Brendan MacLean
Title»Improve UnimodCompiler to preserve more than one decimal place of modification mass for modifications that require more to disambiguate
Assigned To»kaipot@u.washington.edu
Type»Todo
Area»Skyline
Priority»2
Milestone»3.7
It turns out that quite a few modifications in Unimod produce the same residue and mass key at 1 decimal place, and can only be distinguished from each other using 2 decimal places:

K[42.0] - Acetyl (K), Guanidinyl (K), Propyl (K), Trimethyl (K)
C[58.0] - Carboxymethyl (C), Delta:H(6)C(3)O(1) (C)
D[22.0] - Cation:Na (DE), Cation:Mg[II] (DE)
E[22.0] - Cation:Na (DE), Cation:Mg[II] (DE)
C[-18.0] - Dehydrated (N-term C), Cys->Oxoalanine (C)
K[145.1] - DiLeu4plex (K), DiLeu4plex115 (K), DiLeu4plex117 (K), DiLeu4plex118 (K)
Y[145.1] - DiLeu4plex (Y), DiLeu4plex115 (Y), DiLeu4plex117 (Y), DiLeu4plex118 (Y)
K[111.0] - ICPL:13C(6) (K), Nmethylmaleimide (K)
K[144.1] - iTRAQ4plex (K), iTRAQ4plex114 (K), iTRAQ4plex115 (K)
Y[144.1] - iTRAQ4plex (Y), iTRAQ4plex114 (Y), iTRAQ4plex115 (Y)
K[304.2] - iTRAQ8plex (K), iTRAQ8plex:13C(6)15N(2) (K)
Y[304.2] - iTRAQ8plex (Y), iTRAQ8plex:13C(6)15N(2) (Y)
M[-48.0] - Met->Hsl (C-term M), Dethiomethyl (M)
K[140.1] - mTRAQ (K), Xlink:DMP-de (K)
S[80.0] - Phospho (ST), Sulfo (STY)
T[80.0] - Phospho (ST), Sulfo (STY)
Y[80.0] - Phospho (Y), Sulfo (STY)
K[4.0] - ICPL:2H(4) (K), mTRAQ:13C(3)15N(1) (K), IMID:2H(4) (K), Label:2H(4) (K), Succinyl:13C(4) (K), Succinyl:2H(4) (K)
Y[4.0] - mTRAQ:13C(3)15N(1) (Y), Label:2H(4) (Y)
K[8.0] - mTRAQ:13C(6)15N(2) (K), Dimethyl:2H(6)13C(2) (K)
K[154.1] - 4-ONE (K), Xlink:DMP-s (K)
K[170.0] - AccQTag (K), Cresylphosphate (K), Menadione (K)
C[42.0] - Acetyl (C), Amidino (C)
K[132.1] - benzylguanidine (K), Propiophenone (K)
K[339.2] - BHAc (K), NHS-LC-Biotin (K)
C[414.2] - Bodipy (C), PEO-Iodoacetyl-LC-Biotin (C)
K[70.0] - Butyryl (K), Crotonaldehyde (K), PyruvicAcidIminyl (K)
K[57.0] - Carbamidomethyl (K), Gly (K)
S[57.0] - Carbamidomethyl (S), Gly (S)
T[57.0] - Carbamidomethyl (T), Gly (T)
K[44.0] - Carboxy (K), Ethanolyl (K)
G[16.0] - Carboxy->Thiocarboxy (Protein C-term G), Oxidation (C-term G)
K[58.0] - Carboxymethyl (K), Delta:H(6)C(3)O(1) (K)
D[61.9] - Cation:Cu[I] (DE), Cation:Zn[II] (DE)
E[61.9] - Cation:Cu[I] (DE), Cation:Zn[II] (DE)
D[6.0] - Cation:Li (DE), SulfanilicAcid:13C(6) (D)
E[6.0] - Cation:Li (DE), SulfanilicAcid:13C(6) (E)
K[244.1] - CLIP_TRAQ_4 (K), Xlink:EGS (K)
C[70.0] - Crotonaldehyde (C), PyruvicAcidIminyl (Protein N-term C)
N[3.0] - Deamidated:18O(1) (NQ), Delta:H(1)O(-1)18O(1) (N)
R[54.0] - Delta:H(2)C(3)O(1) (R), MG-H1 (R)
K[28.0] - Delta:H(4)C(2) (K), Dimethyl (K), Ethyl (K), Formyl (K)
K[56.0] - Delta:H(4)C(3)O(1) (K), Propionyl (K)
S[136.0] - Diethylphosphate (S), DTT_ST (S)
T[136.0] - Diethylphosphate (T), DTT_ST (T)
R[28.0] - Dimethyl (R), Label:13C(6)15N(4)+Methyl:2H(3)13C(1) (R)
K[34.1] - Dimethyl:2H(4)13C(2) (K), Label:13C(6)+Dimethyl (K)
C[32.0] - Dioxidation (C), Sulfide (C)
K[87.0] - DTBP (K), glycidamide (K)
C[120.0] - DTT_C (C), PS_Hapten (C)
C[456.1] - EGCG1 (C), FMNC (C)
K[108.0] - Ethylphosphate (K), HydroxymethylOP (K)
S[108.0] - Ethylphosphate (S), O-Dimethylphosphate (S)
T[108.0] - Ethylphosphate (T), O-Dimethylphosphate (T)
Y[108.0] - Ethylphosphate (Y), O-Dimethylphosphate (Y)
K[178.0] - Galactosyl (K), Gluconoylation (K)
K[127.1] - GIST-Quat (K), SMA (K)
K[162.1] - Hex (K), O-pinacolylmethylphosphonate (K)
S[162.1] - Hex (S), O-pinacolylmethylphosphonate (S)
T[162.1] - Hex (T), O-pinacolylmethylphosphonate (T)
Y[162.1] - Hex (Y), O-pinacolylmethylphosphonate (Y)
C[86.0] - HMVK (C), Malonyl (C)
K[156.1] - HNE (K), Xlink:DSS (K)
C[138.1] - HNE-Delta:H(2)O (C), ICDID (C)
K[59.0] - Hydroxytrimethyl (K), Methyl+Acetyl:2H(3) (K)
M[20.0] - Label:13C(1)2H(3)+Oxidation (M), Label:13C(4)+Oxidation (M)
K[120.1] - Label:13C(4)15N(2)+GlyGly (K), Label:13C(6)+GlyGly (K)
K[122.1] - Label:13C(6)15N(2)+GlyGly (K), Xlink:DMP (K)
K[46.0] - Label:2H(4)+Acetyl (K), Methylthio (K)
K[16.0] - Methyl:2H(2) (K), Oxidation (K)
C[80.0] - Phospho (C), Sulfo (C)
K[3.0] - Acetyl:2H(3) (K), GIST-Quat:2H(3) (K), Propionyl:13C(3) (K)
S[4.0] - AEC-MAEC:2H(4) (S), mTRAQ:13C(3)15N(1) (S)
T[4.0] - AEC-MAEC:2H(4) (T), mTRAQ:13C(3)15N(1) (T)
C[2.0] - Carboxymethyl:13C(2) (C), IGBP:13C(2) (C)
K[6.0] - Dimethyl:2H(6) (K), GIST-Quat:2H(6) (K), SPITC:13C(6) (K)
C[6.0] - DTT_C:2H(6) (C), ICAT-H:13C(6) (C), ICDID:2H(6) (C)
C[5.0] - EQAT:2H(5) (C), SecNEM:2H(5) (C)
M[4.0] - Label:13C(1)2H(3) (M), Label:13C(4) (M)
C[3.0] - Propionamide:2H(3) (C), QAT:2H(3) (C)
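
A minimal sketch of how a collision list like the one above could be produced. It is not the UnimodCompiler code: the function name `ambiguous_keys` and the small `MODS` table are hypothetical, and the masses are a hand-picked subset of Unimod monoisotopic delta masses used only for illustration.

```python
from collections import defaultdict

# Illustrative subset only: (name, residue, monoisotopic delta mass).
MODS = [
    ("Acetyl", "K", 42.010565),
    ("Guanidinyl", "K", 42.021798),
    ("Propyl", "K", 42.046950),
    ("Trimethyl", "K", 42.046950),
    ("Phospho", "S", 79.966331),
    ("Sulfo", "S", 79.956815),
    ("Oxidation", "M", 15.994915),
]


def ambiguous_keys(mods, digits):
    """Group modifications by residue and mass rounded to `digits` places.

    Returns only the keys shared by more than one modification.
    """
    groups = defaultdict(list)
    for name, residue, mass in mods:
        # Same shape as the keys listed in this issue, e.g. "K[42.0]".
        key = f"{residue}[{mass:.{digits}f}]"
        groups[key].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    for digits in (1, 2, 3):
        print(digits, ambiguous_keys(MODS, digits))
```

With this subset, four K modifications share `K[42.0]` at 1 decimal place, and only Propyl and Trimethyl still share a key at 2 and 3 decimal places.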

Some still collide at 2 decimal places and need 3:

K[145.13] - DiLeu4plex (K), DiLeu4plex117 (K)
Y[145.13] - DiLeu4plex (Y), DiLeu4plex117 (Y)
K[144.10] - iTRAQ4plex (K), iTRAQ4plex115 (K)
Y[144.10] - iTRAQ4plex (Y), iTRAQ4plex115 (Y)
M[-48.00] - Met->Hsl (C-term M), Dethiomethyl (M)
K[140.09] - mTRAQ (K), Xlink:DMP-de (K)
K[4.03] - ICPL:2H(4) (K), IMID:2H(4) (K), Label:2H(4) (K), Succinyl:2H(4) (K)
K[4.01] - mTRAQ:13C(3)15N(1) (K), Succinyl:13C(4) (K)
K[339.16] - BHAc (K), NHS-LC-Biotin (K)
K[70.04] - Butyryl (K), Crotonaldehyde (K)
K[57.02] - Carbamidomethyl (K), Gly (K)
S[57.02] - Carbamidomethyl (S), Gly (S)
T[57.02] - Carbamidomethyl (T), Gly (T)
N[2.99] - Deamidated:18O(1) (NQ), Delta:H(1)O(-1)18O(1) (N)
R[54.01] - Delta:H(2)C(3)O(1) (R), MG-H1 (R)
K[28.03] - Delta:H(4)C(2) (K), Dimethyl (K), Ethyl (K)
K[56.03] - Delta:H(4)C(3)O(1) (K), Propionyl (K)
C[120.02] - DTT_C (C), PS_Hapten (C)
S[108.00] - Ethylphosphate (S), O-Dimethylphosphate (S)
T[108.00] - Ethylphosphate (T), O-Dimethylphosphate (T)
Y[108.00] - Ethylphosphate (Y), O-Dimethylphosphate (Y)
K[178.05] - Galactosyl (K), Gluconoylation (K)
K[59.05] - Hydroxytrimethyl (K), Methyl+Acetyl:2H(3) (K)
K[42.05] - Propyl (K), Trimethyl (K)
K[3.02] - Acetyl:2H(3) (K), GIST-Quat:2H(3) (K)
C[2.01] - Carboxymethyl:13C(2) (C), IGBP:13C(2) (C)
K[6.04] - Dimethyl:2H(6) (K), GIST-Quat:2H(6) (K)
C[6.04] - DTT_C:2H(6) (C), ICDID:2H(6) (C)
C[5.03] - EQAT:2H(5) (C), SecNEM:2H(5) (C)
C[3.02] - Propionamide:2H(3) (C), QAT:2H(3) (C)

Anything that is still not distinguishable at 3 decimal places seems to have delta masses that are actually identical:

M[-48.003] - Met->Hsl (C-term M), Dethiomethyl (M)
K[140.095] - mTRAQ (K), Xlink:DMP-de (K)
K[4.025] - ICPL:2H(4) (K), IMID:2H(4) (K), Label:2H(4) (K), Succinyl:2H(4) (K)
K[339.162] - BHAc (K), NHS-LC-Biotin (K)
K[70.042] - Butyryl (K), Crotonaldehyde (K)
K[57.021] - Carbamidomethyl (K), Gly (K)
S[57.021] - Carbamidomethyl (S), Gly (S)
T[57.021] - Carbamidomethyl (T), Gly (T)
N[2.988] - Deamidated:18O(1) (NQ), Delta:H(1)O(-1)18O(1) (N)
R[54.011] - Delta:H(2)C(3)O(1) (R), MG-H1 (R)
K[28.031] - Delta:H(4)C(2) (K), Dimethyl (K), Ethyl (K)
K[56.026] - Delta:H(4)C(3)O(1) (K), Propionyl (K)
S[107.998] - Ethylphosphate (S), O-Dimethylphosphate (S)
T[107.998] - Ethylphosphate (T), O-Dimethylphosphate (T)
Y[107.998] - Ethylphosphate (Y), O-Dimethylphosphate (Y)
K[178.048] - Galactosyl (K), Gluconoylation (K)
K[42.047] - Propyl (K), Trimethyl (K)
K[3.019] - Acetyl:2H(3) (K), GIST-Quat:2H(3) (K)
C[2.007] - Carboxymethyl:13C(2) (C), IGBP:13C(2) (C)
K[6.038] - Dimethyl:2H(6) (K), GIST-Quat:2H(6) (K)
C[6.038] - DTT_C:2H(6) (C), ICDID:2H(6) (C)
C[5.031] - EQAT:2H(5) (C), SecNEM:2H(5) (C)
C[3.019] - Propionamide:2H(3) (C), QAT:2H(3) (C)
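
The analysis above amounts to finding, for each modification, the smallest number of decimal places at which its key no longer matches any modification on the same residue with a different mass. A hedged sketch of that idea follows. The name `digits_required`, the `max_digits` cap, and the `MODS` subset are assumptions made for illustration, and modifications with exactly equal masses are treated as not distinguishable by mass at all.

```python
# Illustrative subset only: (name, residue, monoisotopic delta mass).
MODS = [
    ("Acetyl", "K", 42.010565),
    ("Propyl", "K", 42.046950),
    ("Trimethyl", "K", 42.046950),
    ("iTRAQ4plex", "K", 144.102063),
    ("iTRAQ4plex114", "K", 144.105918),
    ("iTRAQ4plex115", "K", 144.099599),
    ("Oxidation", "M", 15.994915),
]


def digits_required(mods, max_digits=6):
    """Return {(name, residue): digits needed to disambiguate by mass}.

    Modifications with the same residue and an identical mass are not
    counted as rivals, since no precision can tell them apart.
    """
    result = {}
    for name, residue, mass in mods:
        rivals = [m for _, r, m in mods if r == residue and m != mass]
        digits = 1
        while digits < max_digits:
            mine = f"{mass:.{digits}f}"
            if all(f"{m:.{digits}f}" != mine for m in rivals):
                break  # no rival shares the rounded mass at this precision
            digits += 1
        result[(name, residue)] = digits
    return result


if __name__ == "__main__":
    for key, digits in digits_required(MODS).items():
        print(key, digits)
```

In this subset, iTRAQ4plex and iTRAQ4plex115 on K need 3 decimal places, iTRAQ4plex114 needs 2, and Oxidation on M has no rival and stays at 1.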

We need to improve the Unimod compiler to do this analysis, and then encode into UnimodData.cs the number of decimal places to use in the LightModifiedSequence lookup keys produced by Skyline. It should also generate UnimodData.cpp for BlibBuild with the amino acid residue, the full precision mass, and the number of digits required to disambiguate, so that the peptideSeqMod column in RefSpectra uses the same lookup format as Skyline will use.
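
As a rough illustration of the proposed data shape (residue, full precision mass, number of digits), the sketch below formats lookup keys in the bracketed style used in this issue. `ModEntry`, `lookup_key`, and `modified_sequence` are hypothetical names, not the generated UnimodData.cs or UnimodData.cpp contents, and the exact key format used by Skyline and BlibBuild is an assumption here.

```python
from typing import NamedTuple


class ModEntry(NamedTuple):
    residue: str  # amino acid residue, e.g. "K"
    mass: float   # full precision monoisotopic delta mass
    digits: int   # decimal places required to disambiguate


def lookup_key(entry):
    """Format one entry as residue[mass], keeping only `digits` places."""
    return f"{entry.residue}[{entry.mass:.{entry.digits}f}]"


def modified_sequence(sequence, mods_by_position):
    """Build a modified sequence string from 0-based position -> ModEntry."""
    parts = []
    for index, amino_acid in enumerate(sequence):
        entry = mods_by_position.get(index)
        if entry is None:
            parts.append(amino_acid)
            continue
        if entry.residue != amino_acid:
            raise ValueError(f"{entry.residue} does not match {amino_acid} at {index}")
        parts.append(lookup_key(entry))
    return "".join(parts)


if __name__ == "__main__":
    acetyl_k = ModEntry("K", 42.010565, 2)
    print(modified_sequence("PEPTIDEK", {7: acetyl_k}))
```

Storing the full precision mass next to the digit count means both tools can round the same way at lookup time instead of each hard-coding a precision.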

2017-03-06 16:44 Brendan MacLean
Notify»laura declerck

2017-09-21 14:08 Brendan MacLean
Assigned To: kaipot@u.washington.edu » Nick Shulman
Milestone: 3.7 » 4.1
This has turned out to be much more difficult than I expected. We lost around 2 months of Kaipo's time to it. Nick Shulman is now working on it.

2018-10-15 15:04 Nick Shulman
resolve as Fixed
Status: open » resolved
Assigned To: Nick Shulman » Brendan MacLean
This got fixed in Skyline 4.1.

2018-10-16 14:24 Brendan MacLean
close
Status: resolved » closed
Assigned To: Brendan MacLean » Guest