Unknown modifications Wilfred Tang  2014-12-06
When building a library (Settings menu, select Peptide Settings, go to Library tab, click on Build button), it appears that Skyline does not like "unknown" modifications (modifications that cannot be mapped to Unimod(?) based on delta mass). See attached screen capture. I am using Skyline-daily (64-bit).

Why does Skyline need to know the modification name? Shouldn't the delta mass alone be sufficient information?

Is there a way to use "unknown" modifications in Skyline? We commonly deal with data having lots of glycan modifications - there are a wide variety of glycan modifications, and most are not in Unimod.

Brendan MacLean responded:  2014-12-07
Hi Wilfred,
It really depends on whether you want to target peptides with these modifications in Skyline. If you do not need to target these peptides, then it doesn't really matter that Skyline does not recognize them. If you do want to target them, then you will have to define these modifications yourself, manually, using the Peptide Settings > Modifications > Structural Modifications > Edit List > Add button. In the Edit Structural Modification form, you can choose whether you want to give the modification a chemical formula or just specify the mass of the modification. (Note: In this case, be sure to check the "Variable" check box.)

I suppose, as you point out, it would not be difficult for Skyline to offer to automatically create such modifications for you with names like what Skyline shows in the form you have supplied (e.g. N[1038.4]).

We don't just do this automatically because these types of modifications lose information from a more completely defined modification. For example, without a chemical formula, we can't accurately predict isotope distributions for extraction from MS1 scans. Neither can we know whether the modification might be prone to loss like phosphorylation.

So, we definitely prefer to have chemical formulas for all modifications, but for your case, I can see that it might be cumbersome for you to supply the necessary modifications.

At least the good news is that you only need to define these modifications once, and then you can pass them around either by saving settings using the Settings > Save menu item or just by saving them to a document and passing that document around.

Hope this helps. Sorry for the extra work, but it shouldn't be all that bad.

Thanks for posting this to the Skyline support board.

Wilfred Tang responded:  2014-12-10
Hi Brendan,

Thank you for the thorough explanation. Understanding the context is very helpful.

(1) I would certainly encourage Skyline to add the capability to automatically create such modifications (i.e., on the basis of mass only). This would be a very general capability for all "unknown" modifications, such as glycans in our case.

(2) For glycans, there is a straightforward way to get the chemical formulas, though it would take a bit more work. While there is a huge diversity in glycans (thousands of different ones), they are made up of relatively few building blocks (e.g., Hex, short for Hexose, or HexNAc, short for N-Acetylhexosamine). Having Skyline understand a half dozen building blocks would allow it to derive exact chemical formulas for a rich variety of glycans (e.g., HexNAc(4)Hex(5)NeuAc(1)).
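
To illustrate the building-block idea, here is a minimal Python sketch (not anything Skyline does today) that sums residue compositions to get a formula and monoisotopic mass for a composition string like the one above. The table of building blocks, the function names, and the accepted string syntax are all assumptions made for this example. The compositions are the standard anhydro residues (free sugar minus H2O).

```python
import re
from collections import Counter

# Residue (anhydro) compositions: free monosaccharide minus one H2O.
# This table is illustrative and covers only a handful of building blocks.
RESIDUES = {
    "Hex":    {"C": 6,  "H": 10, "O": 5},
    "HexNAc": {"C": 8,  "H": 13, "N": 1, "O": 5},
    "dHex":   {"C": 6,  "H": 10, "O": 4},
    "NeuAc":  {"C": 11, "H": 17, "N": 1, "O": 8},
    "NeuGc":  {"C": 11, "H": 17, "N": 1, "O": 9},
}

# Monoisotopic masses of the elements used above.
MONO_MASS = {"C": 12.0, "H": 1.00782503207,
             "N": 14.0030740048, "O": 15.99491461956}

def glycan_composition(text):
    """Return element counts for a string such as 'HexNAc(4)Hex(5)NeuAc(1)'."""
    if not re.fullmatch(r"(?:[A-Za-z]+\(\d+\))+", text):
        raise ValueError(f"cannot parse composition: {text!r}")
    totals = Counter()
    for name, count in re.findall(r"([A-Za-z]+)\((\d+)\)", text):
        if name not in RESIDUES:
            raise ValueError(f"unknown building block: {name}")
        for element, n in RESIDUES[name].items():
            totals[element] += n * int(count)
    return totals

def formula(totals):
    """Format element counts as a formula string (C, H, then N, O)."""
    return "".join(f"{el}{totals[el] if totals[el] != 1 else ''}"
                   for el in "CHNO" if totals[el])

def monoisotopic_mass(totals):
    """Sum monoisotopic element masses for the given counts."""
    return sum(MONO_MASS[el] * n for el, n in totals.items())

comp = glycan_composition("HexNAc(4)Hex(5)NeuAc(1)")
print(formula(comp))                      # C73H119N5O53
print(round(monoisotopic_mass(comp), 3))  # 1913.677
```

The resulting formula is the delta added to the peptide, which is the form the Edit Structural Modification dialog asks for when a chemical formula is supplied instead of a bare mass.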

joshuasmith responded:  2020-04-22
Hi Brendan,

I know this thread is pretty old, so maybe this issue has been addressed or implemented in Skyline subsequently, but I am not aware of it. This seemed to be the most relevant thread for me to describe my twist on the issue.

Like Wilfred, I am generating a lot of data involving variable modifications that may not be in/mappable to Unimod. In my case, they are not glycans, but instead small molecule adducts to cysteine or other specific amino acid residues (i.e., known nucleophilic hotspot sites of electrophilic adduction in specific peptide[s]). In some cases, I am doing PRM workflows with adducts that have known formulae, so even if I am targeting many adducts, like you mentioned, it is relatively straightforward to create a template Skyline file with my manually-curated variable mods, and just reuse that template file for each PRM batch/project. That has worked well for us.

However, we are also working on DIA workflows, where the goal is to agnostically identify adducts that are not known a priori, and/or where the number of variable modifications identified in a sample or batch may be several hundred to a thousand or more putative unique adducts (unique in terms of delta m/z). We have tried two approaches to merge this workflow with Skyline: 1) in-house scripts that do the adduct extraction from mzML files, followed by generation of a transition list for targeted analysis of the DIA data in Skyline; 2) open search of multiple mzML files with MSFragger, followed by library generation for import into Skyline and DIA searching of individual samples.

Option 1 has worked up through the step of automatically generating a Skyline-readable transition list with single entries for each unique variable mod to a specific residue in a specific peptide. I can then import this transition list into Skyline, after which I would be able to do "targeted" analysis of the DIA data within Skyline. Option 2 has worked through conducting the MSFragger open searches and library generation, followed by import of the library into Skyline.

However, in both cases, I run into an issue similar to Wilfred's: the peptides with variable mods that are already in my Skyline modification list are imported perfectly, but other modifications, being novel, are not present in the modifications list, are listed as errors, and are not imported as modified peptides. As you suggested, a solution here could be to manually enter each modification prior to importing the transition list or library, but with either approach, the number of putative modifications can be so large that this creates an incredible workflow bottleneck, especially when trying to deploy the method across multiple sample sets, projects, etc., as each may have its own unique modification profile. I would also prefer the ability to automatically import the variable mods due to concerns over potential introduction of errors with that much manual entry.

Specific to my MSFragger workflow option (and maybe this is my ignorance with learning MSFragger), I do not currently know how to generate a list of unique mods that are present in my MSFragger library - I am just trying to import a library file (containing the modified peptides to search files against). So even if I wanted to manually enter all the modifications MSFragger found, I do not have a (human-readable?) list to work off of.
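
For what it's worth, the kind of list I have in mind could probably be pulled together with a short script once the modified sequences are exported as text. Below is a minimal Python sketch under some loud assumptions: the sequences use bracketed masses after the modified residue (e.g. C[+57.0215], or N[1038.4] as in the Skyline form), terminal modifications are ignored, and the function names and the rounding to one decimal are hypothetical choices of mine, not anything MSFragger or Skyline provides.

```python
import re
from collections import defaultdict

# A residue letter followed by a bracketed mass, e.g. C[+57.0215] or N[1038.4].
MOD = re.compile(r"([A-Z])\[([+-]?\d+(?:\.\d+)?)\]")

def unique_mods(modified_sequences, decimals=1):
    """Map (residue, rounded mass) to the sorted bare peptides carrying it."""
    found = defaultdict(set)
    for seq in modified_sequences:
        bare = re.sub(r"\[[^\]]*\]", "", seq)  # strip all bracketed masses
        for residue, mass in MOD.findall(seq):
            found[(residue, round(float(mass), decimals))].add(bare)
    return {key: sorted(peptides) for key, peptides in found.items()}

def mod_name(residue, mass):
    """Build a placeholder name in the residue[mass] style, e.g. C[+305.1]."""
    return f"{residue}[{mass:+.1f}]"

sequences = [
    "PEPTC[+57.0215]IDE",
    "GC[+57.02]LK",
    "AC[+305.0682]DK",
    "AN[+1038.4]VTK",
]
for (residue, mass), peptides in sorted(unique_mods(sequences).items()):
    print(mod_name(residue, mass), ", ".join(peptides))
```

Rounding before de-duplication collapses the same adduct reported with slightly different masses into one entry. The right tolerance depends on instrument mass accuracy, so one decimal is only a placeholder.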

I definitely understand the concerns over chemical identity of incompletely defined modifications that you've described, but we are trying to do "first-pass" discovery of potential adducts here, which we would then follow up with targeted PRM analysis, confirmation with standards, etc. I also see the hiccups that are created in MS1 isotope ratio predictions without defined modification formulae, but we are doing DIA, so we are not as crucially reliant on MS1 scans anyway.

Long story short, I sign onto Wilfred's request #1 in his reply above: the functionality for automatic importing of "incomplete definitions" of modifications (with review functionality, preferably). I'm hoping this has been made possible since this thread started in 2014. But if this has not been implemented, will not be implemented soon, or isn't possible, what about batch import of modifications into the modification list, potentially through a command-line route such as SkylineRunner? That would at least partially mitigate our current workflow bottleneck with this type of analysis.

Sorry for the long post! Love Skyline and thanks!
Josh Smith