Title | | » | Improve UnimodCompiler to preserve more than one decimal place of modification mass for modifications that require more to disambiguate |
Assigned To | | » | kaipot@u.washington.edu |
Type | | » | Todo |
Area | | » | Skyline |
Priority | | » | 2 |
Milestone | | » | 3.7 |
It turns out that quite a few modifications in Unimod collide on the same residue when their masses are rounded to 1 decimal place, and can only be distinguished from each other using at least 2 decimal places:
K[42.0] - Acetyl (K), Guanidinyl (K), Propyl (K), Trimethyl (K)
C[58.0] - Carboxymethyl (C), Delta:H(6)C(3)O(1) (C)
D[22.0] - Cation:Na (DE), Cation:Mg[II] (DE)
E[22.0] - Cation:Na (DE), Cation:Mg[II] (DE)
C[-18.0] - Dehydrated (N-term C), Cys->Oxoalanine (C)
K[145.1] - DiLeu4plex (K), DiLeu4plex115 (K), DiLeu4plex117 (K), DiLeu4plex118 (K)
Y[145.1] - DiLeu4plex (Y), DiLeu4plex115 (Y), DiLeu4plex117 (Y), DiLeu4plex118 (Y)
K[111.0] - ICPL:13C(6) (K), Nmethylmaleimide (K)
K[144.1] - iTRAQ4plex (K), iTRAQ4plex114 (K), iTRAQ4plex115 (K)
Y[144.1] - iTRAQ4plex (Y), iTRAQ4plex114 (Y), iTRAQ4plex115 (Y)
K[304.2] - iTRAQ8plex (K), iTRAQ8plex:13C(6)15N(2) (K)
Y[304.2] - iTRAQ8plex (Y), iTRAQ8plex:13C(6)15N(2) (Y)
M[-48.0] - Met->Hsl (C-term M), Dethiomethyl (M)
K[140.1] - mTRAQ (K), Xlink:DMP-de (K)
S[80.0] - Phospho (ST), Sulfo (STY)
T[80.0] - Phospho (ST), Sulfo (STY)
Y[80.0] - Phospho (Y), Sulfo (STY)
K[4.0] - ICPL:2H(4) (K), mTRAQ:13C(3)15N(1) (K), IMID:2H(4) (K), Label:2H(4) (K), Succinyl:13C(4) (K), Succinyl:2H(4) (K)
Y[4.0] - mTRAQ:13C(3)15N(1) (Y), Label:2H(4) (Y)
K[8.0] - mTRAQ:13C(6)15N(2) (K), Dimethyl:2H(6)13C(2) (K)
K[154.1] - 4-ONE (K), Xlink:DMP-s (K)
K[170.0] - AccQTag (K), Cresylphosphate (K), Menadione (K)
C[42.0] - Acetyl (C), Amidino (C)
K[132.1] - benzylguanidine (K), Propiophenone (K)
K[339.2] - BHAc (K), NHS-LC-Biotin (K)
C[414.2] - Bodipy (C), PEO-Iodoacetyl-LC-Biotin (C)
K[70.0] - Butyryl (K), Crotonaldehyde (K), PyruvicAcidIminyl (K)
K[57.0] - Carbamidomethyl (K), Gly (K)
S[57.0] - Carbamidomethyl (S), Gly (S)
T[57.0] - Carbamidomethyl (T), Gly (T)
K[44.0] - Carboxy (K), Ethanolyl (K)
G[16.0] - Carboxy->Thiocarboxy (Protein C-term G), Oxidation (C-term G)
K[58.0] - Carboxymethyl (K), Delta:H(6)C(3)O(1) (K)
D[61.9] - Cation:Cu[I] (DE), Cation:Zn[II] (DE)
E[61.9] - Cation:Cu[I] (DE), Cation:Zn[II] (DE)
D[6.0] - Cation:Li (DE), SulfanilicAcid:13C(6) (D)
E[6.0] - Cation:Li (DE), SulfanilicAcid:13C(6) (E)
K[244.1] - CLIP_TRAQ_4 (K), Xlink:EGS (K)
C[70.0] - Crotonaldehyde (C), PyruvicAcidIminyl (Protein N-term C)
N[3.0] - Deamidated:18O(1) (NQ), Delta:H(1)O(-1)18O(1) (N)
R[54.0] - Delta:H(2)C(3)O(1) (R), MG-H1 (R)
K[28.0] - Delta:H(4)C(2) (K), Dimethyl (K), Ethyl (K), Formyl (K)
K[56.0] - Delta:H(4)C(3)O(1) (K), Propionyl (K)
S[136.0] - Diethylphosphate (S), DTT_ST (S)
T[136.0] - Diethylphosphate (T), DTT_ST (T)
R[28.0] - Dimethyl (R), Label:13C(6)15N(4)+Methyl:2H(3)13C(1) (R)
K[34.1] - Dimethyl:2H(4)13C(2) (K), Label:13C(6)+Dimethyl (K)
C[32.0] - Dioxidation (C), Sulfide (C)
K[87.0] - DTBP (K), glycidamide (K)
C[120.0] - DTT_C (C), PS_Hapten (C)
C[456.1] - EGCG1 (C), FMNC (C)
K[108.0] - Ethylphosphate (K), HydroxymethylOP (K)
S[108.0] - Ethylphosphate (S), O-Dimethylphosphate (S)
T[108.0] - Ethylphosphate (T), O-Dimethylphosphate (T)
Y[108.0] - Ethylphosphate (Y), O-Dimethylphosphate (Y)
K[178.0] - Galactosyl (K), Gluconoylation (K)
K[127.1] - GIST-Quat (K), SMA (K)
K[162.1] - Hex (K), O-pinacolylmethylphosphonate (K)
S[162.1] - Hex (S), O-pinacolylmethylphosphonate (S)
T[162.1] - Hex (T), O-pinacolylmethylphosphonate (T)
Y[162.1] - Hex (Y), O-pinacolylmethylphosphonate (Y)
C[86.0] - HMVK (C), Malonyl (C)
K[156.1] - HNE (K), Xlink:DSS (K)
C[138.1] - HNE-Delta:H(2)O (C), ICDID (C)
K[59.0] - Hydroxytrimethyl (K), Methyl+Acetyl:2H(3) (K)
M[20.0] - Label:13C(1)2H(3)+Oxidation (M), Label:13C(4)+Oxidation (M)
K[120.1] - Label:13C(4)15N(2)+GlyGly (K), Label:13C(6)+GlyGly (K)
K[122.1] - Label:13C(6)15N(2)+GlyGly (K), Xlink:DMP (K)
K[46.0] - Label:2H(4)+Acetyl (K), Methylthio (K)
K[16.0] - Methyl:2H(2) (K), Oxidation (K)
C[80.0] - Phospho (C), Sulfo (C)
K[3.0] - Acetyl:2H(3) (K), GIST-Quat:2H(3) (K), Propionyl:13C(3) (K)
S[4.0] - AEC-MAEC:2H(4) (S), mTRAQ:13C(3)15N(1) (S)
T[4.0] - AEC-MAEC:2H(4) (T), mTRAQ:13C(3)15N(1) (T)
C[2.0] - Carboxymethyl:13C(2) (C), IGBP:13C(2) (C)
K[6.0] - Dimethyl:2H(6) (K), GIST-Quat:2H(6) (K), SPITC:13C(6) (K)
C[6.0] - DTT_C:2H(6) (C), ICAT-H:13C(6) (C), ICDID:2H(6) (C)
C[5.0] - EQAT:2H(5) (C), SecNEM:2H(5) (C)
M[4.0] - Label:13C(1)2H(3) (M), Label:13C(4) (M)
C[3.0] - Propionamide:2H(3) (C), QAT:2H(3) (C)
Some still collide at 2 decimal places and need at least 3:
K[145.13] - DiLeu4plex (K), DiLeu4plex117 (K)
Y[145.13] - DiLeu4plex (Y), DiLeu4plex117 (Y)
K[144.10] - iTRAQ4plex (K), iTRAQ4plex115 (K)
Y[144.10] - iTRAQ4plex (Y), iTRAQ4plex115 (Y)
M[-48.00] - Met->Hsl (C-term M), Dethiomethyl (M)
K[140.09] - mTRAQ (K), Xlink:DMP-de (K)
K[4.03] - ICPL:2H(4) (K), IMID:2H(4) (K), Label:2H(4) (K), Succinyl:2H(4) (K)
K[4.01] - mTRAQ:13C(3)15N(1) (K), Succinyl:13C(4) (K)
K[339.16] - BHAc (K), NHS-LC-Biotin (K)
K[70.04] - Butyryl (K), Crotonaldehyde (K)
K[57.02] - Carbamidomethyl (K), Gly (K)
S[57.02] - Carbamidomethyl (S), Gly (S)
T[57.02] - Carbamidomethyl (T), Gly (T)
N[2.99] - Deamidated:18O(1) (NQ), Delta:H(1)O(-1)18O(1) (N)
R[54.01] - Delta:H(2)C(3)O(1) (R), MG-H1 (R)
K[28.03] - Delta:H(4)C(2) (K), Dimethyl (K), Ethyl (K)
K[56.03] - Delta:H(4)C(3)O(1) (K), Propionyl (K)
C[120.02] - DTT_C (C), PS_Hapten (C)
S[108.00] - Ethylphosphate (S), O-Dimethylphosphate (S)
T[108.00] - Ethylphosphate (T), O-Dimethylphosphate (T)
Y[108.00] - Ethylphosphate (Y), O-Dimethylphosphate (Y)
K[178.05] - Galactosyl (K), Gluconoylation (K)
K[59.05] - Hydroxytrimethyl (K), Methyl+Acetyl:2H(3) (K)
K[42.05] - Propyl (K), Trimethyl (K)
K[3.02] - Acetyl:2H(3) (K), GIST-Quat:2H(3) (K)
C[2.01] - Carboxymethyl:13C(2) (C), IGBP:13C(2) (C)
K[6.04] - Dimethyl:2H(6) (K), GIST-Quat:2H(6) (K)
C[6.04] - DTT_C:2H(6) (C), ICDID:2H(6) (C)
C[5.03] - EQAT:2H(5) (C), SecNEM:2H(5) (C)
C[3.02] - Propionamide:2H(3) (C), QAT:2H(3) (C)
Anything that is still not distinguishable at 3 decimal places appears to have a truly identical delta mass:
M[-48.003] - Met->Hsl (C-term M), Dethiomethyl (M)
K[140.095] - mTRAQ (K), Xlink:DMP-de (K)
K[4.025] - ICPL:2H(4) (K), IMID:2H(4) (K), Label:2H(4) (K), Succinyl:2H(4) (K)
K[339.162] - BHAc (K), NHS-LC-Biotin (K)
K[70.042] - Butyryl (K), Crotonaldehyde (K)
K[57.021] - Carbamidomethyl (K), Gly (K)
S[57.021] - Carbamidomethyl (S), Gly (S)
T[57.021] - Carbamidomethyl (T), Gly (T)
N[2.988] - Deamidated:18O(1) (NQ), Delta:H(1)O(-1)18O(1) (N)
R[54.011] - Delta:H(2)C(3)O(1) (R), MG-H1 (R)
K[28.031] - Delta:H(4)C(2) (K), Dimethyl (K), Ethyl (K)
K[56.026] - Delta:H(4)C(3)O(1) (K), Propionyl (K)
S[107.998] - Ethylphosphate (S), O-Dimethylphosphate (S)
T[107.998] - Ethylphosphate (T), O-Dimethylphosphate (T)
Y[107.998] - Ethylphosphate (Y), O-Dimethylphosphate (Y)
K[178.048] - Galactosyl (K), Gluconoylation (K)
K[42.047] - Propyl (K), Trimethyl (K)
K[3.019] - Acetyl:2H(3) (K), GIST-Quat:2H(3) (K)
C[2.007] - Carboxymethyl:13C(2) (C), IGBP:13C(2) (C)
K[6.038] - Dimethyl:2H(6) (K), GIST-Quat:2H(6) (K)
C[6.038] - DTT_C:2H(6) (C), ICDID:2H(6) (C)
C[5.031] - EQAT:2H(5) (C), SecNEM:2H(5) (C)
C[3.019] - Propionamide:2H(3) (C), QAT:2H(3) (C)
We need to improve UnimodCompiler to do this analysis and then encode, in UnimodData.cs, the number of decimal places to use in the LightModifiedSequence lookup keys produced by Skyline. It should also generate UnimodData.cpp for BlibBuild with the amino acid residue, the full precision mass, and the number of digits required to disambiguate, so that the peptideSeqMod column in RefSpectra uses the same lookup format as Skyline will use.
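A rough sketch of the analysis, in Python rather than the C# of UnimodCompiler, just to pin down the idea. Everything here is an assumption for illustration: the names `format_key`, `digits_to_disambiguate` and `MAX_DIGITS`, the cap of 3 decimal places, and the handful of masses, which were typed in by hand and should be replaced with values read from Unimod. For each modification it finds the smallest number of decimal places at which the set of modifications sharing its key is the same as at the maximum precision, so modifications with identical delta masses do not force extra digits.

```python
from collections import defaultdict

MAX_DIGITS = 3  # assumed cap; collisions beyond this are treated as identical masses


def format_key(residue, mass, digits):
    """Build a lookup key in the style used in the lists above, e.g. K[42.01]."""
    return f"{residue}[{mass:.{digits}f}]"


def digits_to_disambiguate(mods, max_digits=MAX_DIGITS):
    """mods is a list of (name, residue, mass) tuples.

    Returns {(name, residue): digits}, where digits is the smallest number of
    decimal places at which the modification collides only with the
    modifications it still collides with at max_digits.
    """
    def group_by_key(digits):
        groups = defaultdict(set)
        for name, residue, mass in mods:
            groups[format_key(residue, mass, digits)].add(name)
        return groups

    finest = group_by_key(max_digits)
    result = {}
    for digits in range(1, max_digits + 1):
        current = group_by_key(digits)
        for name, residue, mass in mods:
            if (name, residue) in result:
                continue  # already resolved at a lower precision
            same_now = current[format_key(residue, mass, digits)]
            same_at_max = finest[format_key(residue, mass, max_digits)]
            if same_now == same_at_max:
                result[(name, residue)] = digits
    return result


# Hand-entered sample masses (illustrative only; the real input is Unimod).
MODS = [
    ("Acetyl", "K", 42.010565),
    ("Guanidinyl", "K", 42.021798),
    ("Propyl", "K", 42.046950),
    ("Trimethyl", "K", 42.046950),
    ("Carbamidomethyl", "K", 57.021464),
    ("Gly", "K", 57.021464),
    ("Phospho", "S", 79.966331),
    ("Sulfo", "S", 79.956815),
]

needed = digits_to_disambiguate(MODS)
for name, residue, mass in MODS:
    # One row per modification: residue, full precision mass, digits, key.
    digits = needed[(name, residue)]
    print(residue, mass, digits, format_key(residue, mass, digits), name)
```

Each printed row carries the three values proposed for UnimodData.cpp (residue, full precision mass, digits) plus the resulting key. Note that Carbamidomethyl (K) and Gly (K) stay at 1 decimal place in this sketch, since no amount of extra precision separates them.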